Go Back  Xtreme Visual Basic Talk > Legacy Visual Basic (VB 4/5/6) > Knowledge Base > Tutors' Corner > Using ADO with Excel files


Reply
 
Thread Tools Display Modes
  #1  
Old 03-30-2005, 07:55 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Using ADO with Excel files


While Excel is not a proper database, it is possible to connect to Excel files using ADO. This can be a convenient tool for importing Excel data into your database, or in situations where your first thought might be to use Excel automation.

However, Excel is not a true database, and Excel used as a data source has some quirks that may bite you if you don’t watch out – and they range from annoying (wrongly formatted data) to pretty bad (memory leaks). In this tutorial I will explain some of the most common issues that are particular to using ADO with Excel.

This tutorial covers:I assume that you already know the basics of ADO and SQL (if not, there’s an ADO tutorial in Tutor's Corner). I will ignore many of the finer points of ADO (cursor types, lock types etc) just to keep things simple and brief.

Further reading:
KB 257819: How To Use ADO with Excel Data from Visual Basic or VBA
KB 278973: ExcelADO demonstrates how to use ADO to read and write data in Excel workbooks
KB 303814: ExcelADOX demonstrates how to use ADOX to modify Excel workbooks

Last edited by Timbo; 06-05-2006 at 08:06 AM.
Reply With Quote
  #2  
Old 03-30-2005, 07:56 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Why use ADO with Excel

Using ADO can be safer, faster and simpler than automation – and unlike automation, it will work even if Excel is not installed on the computer. ADO is faster than automation, because it runs in-process, as opposed to out-of-process automation. ADO is also preferable for web applications since it is much more stable than server-side automation of Excel.

You can also use ADO from Excel VBA or in combination with Excel automation, to get data from other data sources and put it in Excel. You can use this for nice tricks such as worksheet functions that can read from closed Excel files, filtering data based on user-defined criteria, etc.
Reply With Quote
  #3  
Old 03-30-2005, 07:57 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Connecting to an Excel file

You can connect to an Excel file using either the OLE DB provider for Jet or ODBC. Personally, I have always used Jet.
The connection string looks like this (broken across several lines for ease of reading).
Code:
Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=Z:\MyFiles\Data source.xls;
Extended Properties="Excel 8.0;HDR=Yes"
The first part is fixed. The second part is the full path to your Excel file. The third part (extended properties) provides additional details about the file. Note the extra quotation marks around the Extended Properties part of the connection string.

Excel 8.0 is the Excel file version. This is not the same as the version of Excel that’s installed on the computer. Excel 8.0 applies to all workbooks created in any version of Excel from Excel97 onwards. (Should you find a workbook created in Excel 95, change that to 7.0.) To repeat, do not change the 8.0 to 10.0 just because you’re using Excel 2002.

HDR=Yes tells whether the first row in the Excel table has column headers (field names). Set this to Yes if the first row should be used as field names, and to No if the first row should be included as data. The fields will then be named F1, F2 etc.

Here’s a simple wrapper function for connecting to Excel:
Code:
Private Function GetExcelConnection(ByVal Path As String, _ Optional ByVal Headers As Boolean = True) As Connection Dim strConn As String Dim objConn As ADODB.Connection Set objConn = New ADODB.Connection strConn = "Provider=Microsoft.Jet.OLEDB.4.0;" & _ "Data Source=" & Path & ";" & _ "Extended Properties=""Excel 8.0;HDR=" & _ IIf(Headers, "Yes", "No") & """" objConn.Open strConn Set GetExcelConnection = objConn End Function
You will need version 4.0 of the Jet Provider to connect to an Excel file. If only version 3.51 is installed, you will get an error: “Couldn’t find installable ISAM”.

Excel files are not real databases, so they do not support multiple users. You cannot open multiple connections to the same file. And as long as your connection is open, you will not be able to open the file in Excel.

Warning: You can open an ADO connection to a file that is already open in Excel. In fact, you can even connect to the workbook where the code resides. However, doing so causes a memory leak in Excel. Microsoft recommends creating a temporary copy of the Excel file and connecting to that copy instead. Read more:
KB 319998: Memory leak occurs when you query an open Excel worksheet by using ADO.

Protected files
It is not possible to connect to Excel files that are password-protected; the Password parameter of the Open method will not help. This includes both files that require a password to be opened, and files where it's the workbook structure that's password-protected.

Last edited by herilane; 01-19-2006 at 03:37 PM.
Reply With Quote
  #4  
Old 03-30-2005, 07:58 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Reading data from Excel

The simplest and most common thing you will likely want to do is to read data from Excel files. This can be useful in many situations as an alternative to automation – if Excel is not installed on the machine, if you want to read data from a closed workbook without opening it (in a worksheet function, for example), or if you simply want to be able to use SQL to filter your data.

In databases, data is organized in tables. In Excel workbooks, data are organized in worksheets and ranges. ADO considers both of these to be tables, and you can read both of them. To differentiate sheets from ranges, sheet names have to be followed by a $ sign. Ranges can be specified either by name or by address. You can also refer to worksheet-level named ranges (as opposed to workbook-level names).
Code:
'To read a sheet: strRequest = "SELECT * FROM Sheet1$" 'To refer to a range by its address: strRequest = "SELECT * FROM Sheet1$A1:D10" 'To refer to a single-cell range, pretend it's a multi-cell range 'and specify both the top-left and bottom-right cells: strRequest = "SELECT * FROM Sheet1$A1:A1" 'To read a named range: strRequest = "SELECT * FROM MyDataRange" 'To read a worksheet-level named range strRequest = "SELECT * FROM Sheet1$MyData" 'and then just open a recordset objRS.Open strRequest, objConn
All of this will work on visible as well as hidden and xlVeryHidden sheets.

Note that if you specify a range address, and that range extends outside the used range of the sheet (i.e. the range that would be returned if you pulled in the whole sheet), you will not get blank records for the empty unused rows. For example, if there are data in the first 15 rows in Sheet2, with headers in the first row, then [Sheet2$] would return 14 records and [Sheet2$A1:B100] will also return the same 14 records.
Reply With Quote
  #5  
Old 03-30-2005, 07:58 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Some pitfalls around data types

Excel is not a database, so it does not provide ADO with information about the data types for each column / field. When you open an Excel table, Jet will scan the first 8 rows of the table and attempt to guess at the data types.

This works fine as long as the data are not mixed, i.e. as long as each column only has data of one type. If the column mixes data types, the majority decides, and the minority are killed off. For example, if the data mixes strings and numbers, and there are more strings than numbers, then the strings will be returned properly, but the numbers will be replaced with null values.

You can override this behaviour, so that all data are read as text, by setting another extended property: IMEX=1. That tells the driver to use Import mode, which in turn forces it to follow the registry setting ImportMixedTypes=Text.

Read more about this problem and the suggested solution:
KB 257819: How to use ADO with Excel data (section titled “A Caution about Mixed Data Types)
KB 194124: Excel Values Returned as NULL Using DAO OpenRecordset

You may also have trouble if you have string data of varied length. If the first 8 rows contain short strings (< 255 characters), the driver will identify the field as a text field. If there are longer strings later on, after the first 8 rows, those will not fit into the space allocated. This can be fixed by changing the TypeGuessRows registry setting. Read more here:
KB 189897: Data Truncated to 255 Characters with Excel ODBC Driver
Reply With Quote
  #6  
Old 03-30-2005, 07:59 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Getting a list of available tables (worksheets and ranges)

What if you don’t know in advance which tables (worksheets) the workbook will contain? For example, you might want to get all the worksheets, but not know how many there are. No problem – you can get a list of tables using ADO.

You can take advantage of the naming convention for worksheets to separate worksheets from named ranges. If the table name ends with $, it's a worksheet, otherwise it's a named range.
Code:
Set objConn = GetExcelConnection("c:\temp\data source.xls") Set objRS = objConn.OpenSchema(adSchemaTables) Do While Not objRS.EOF strTable = objRS.Fields("table_name").Value If Right$(strTable, 1) = "$" Then strWorksheetList = strWorksheetList & vbCrLf & strTable Else strRangeList = strRangeList & vbCrLf & strTable End If objRS.MoveNext Loop MsgBox "Worksheets:" & strWorksheetList & vbCrLf & vbCrLf & "Ranges:" & strRangeList
The contents of the list appears to vary depending on whether the workbook is open (not particularly logically):
  • For a closed workbook, this list will not include hidden sheets. However it will include named ranges in hidden sheets, and hidden names.
  • For a workbook that is already open in Excel (on the same computer), the list will include hidden sheets as well.
Reply With Quote
  #7  
Old 03-30-2005, 07:59 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Adding data

The usual SQL commands like UPDATE and INSERT INTO work nicely with Excel tables. Obviously the Excel file should not be open elsewhere if you try to do this, or you will get some rather bizarre behaviour, in addition to the memory leak mentioned above. (The changes will be applied to a read-only copy which gets opened in a hidden instance of Excel. Definitely not something I’d recommend.)

For INSERT INTO, the table should be set up beforehand – the column headers should be in place before you try to add data to the table. The data type for each field will be read from existing data in the table (see above under “reading data from Excel”). If there is no data (only column headers), then all fields will be treated as text.

More information:
KB 295646: How To Transfer Data from ADO Data Source to Excel with ADO.
Reply With Quote
  #8  
Old 03-30-2005, 08:00 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Deleting data

This is where Excel really stops working like a database. Effectively you cannot delete records from an Excel file with ADO.

A simple “DELETE * FROM” will simply raise an error. With some creativity (such as using DROP TABLE) you can get to a point where it looks almost like you’ve deleted the records, but at best you can clear the cells’ contents. The cells will still be in use, so if you add records to the sheet, they will be added below the blank lines in the worksheet.

My recommendation is to not use ADO to delete data from an Excel workbook.
Reply With Quote
  #9  
Old 03-30-2005, 08:00 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Creating new worksheets and workbooks

You can add worksheets to existing Excel files using the SELECT * INTO command.
The following will create a new sheet named Sheet2 in the file named data source.xls, and copy the contents of Sheet1 into that new sheet:
Code:
strRequest = "SELECT * INTO [C:\temp\data source.xls].[Sheet2] FROM [Sheet1$]" objConn.Execute strRequest
Important: note how the name of the existing worksheet is followed by a $ sign, but the name of the new sheet is not. If the sheet already exists, this will raise an error.

With slight modifications, the same approach works for creating new workbooks as well. We just need to add a bit that tells ADO that the new database should be an Excel workbook:
Code:
strRequest = "SELECT * INTO [Excel 8.0;DATABASE=c:\temp\data copy.xls].[Sheet1] FROM [Sheet1$]"
Since this file creation method completely bypasses Excel itself, the newly-created file may not look like the files you create manually in Excel. In particular this will ignore the “Sheets in new workbook” and “Standard font size” settings in Excel.

More information:
KB 295646: How To Transfer Data from ADO Data Source to Excel with ADO.
Reply With Quote
  #10  
Old 03-30-2005, 08:00 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Replacing data

Instead of creating new sheets every time, you may have an Excel report that you want to refresh regularly by replacing old data with new data. This would allow you to refer to the data in formulas in other sheets etc. Most of the time this can be done more easily from the Excel side (pulling instead of pushing), but ADO may be faster because you don’t have to automate Excel.

Because we cannot delete records in Excel, you won’t be able to refresh the table by deleting the old records and then inserting new ones in their place. The best approach in that case seems to be to use two recordsets. Open the source and the target, loop through both recordsets and transfer the contents of the source to the target, field by field. If your result set may change in size, watch out for recordsets of different sizes – remember, you cannot make the initial target recordset larger than the current used range in the sheet, so you may have to switch from replacing to adding when you reach the end of the “available space”.
Reply With Quote
  #11  
Old 03-30-2005, 08:01 AM
herilane's Avatar
herilane herilane is offline
Unashamed geek

Retired Moderator
* Expert *
 
Join Date: Jul 2003
Location: London, England
Posts: 8,988
Default Transferring a recordset into Excel

Until now, all I’ve talked about is how your application can manipulate Excel files from the outside. Excel itself has one very useful method for working with recordsets: Range.CopyFromRecordset. You can use ADO to get data from an external source, and then put the entire recordset in your worksheet with a single command, with no tedious record-by-record, field-by-field looping. Field names have to be copied one-by-one, though.

Here’s a simple example of a VBA procedure that puts the recordset and (optionally) the field names in a worksheet range:
Code:
Public Sub PutRecordsetInSheet(ByVal RS As Recordset, ByVal TopLeft As Range, _ Optional ByVal Headers As Boolean = True) 'Lists field names in the top row if Headers=True, and pastes the records 'below them. Dim objField As ADODB.Field Dim i As Integer If Headers Then For Each objField In RS.Fields i = i + 1 TopLeft.Cells(1, i).Value = objField.Name Next objField TopLeft.Cells(2, 1).CopyFromRecordset RS Else TopLeft.Cells(1, 1).CopyFromRecordset RS End If End Sub
You can read more about this in MSDN:
KB 246335: How To Transfer Data from an ADO Recordset to Excel with Automation
Reply With Quote
  #12  
Old 06-16-2006, 04:32 AM
Timbo's Avatar
Timbo Timbo is offline
Green-Eyed

Super Moderator
* Guru *
 
Join Date: May 2001
Location: Bangkok, Thailand
Posts: 10,261
Default Managing mixed data types with ADO.

The following information was originally published here: http://help.lockergnome.com/office/S...ict224962.html

Since the author gave his general permission for it to be reproduced at will, it has been appended to this article because it contains related un-documented information.
Quote:
Originally Posted by onedaywhen
The relevant registry keys (for Jet 4.0) are in:

Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/

The ImportMixedTypes registry key is always read (whether it is
honored is discussed later). You can test this by changing the key to
ImportMixedTypes=OneDayWhen and trying to use the ISAM: you get the
error, 'Invalid setting in Excel key of the Engines section of the
Windows Registry.' The only valid values are:

ImportMixedTypes=Text
ImportMixedTypes=Majority Type

Data type is determined column by column. 'Majority Type' means a
certain number of rows (more on this later) in each column are scanned
and the data types are counted. Both a cell's value and format are
used to determine data type. The majority data type (i.e. the one with
the most rows) decides the overall data type for the entire column.
There's a bias in favor os numeric in the event of a tie. Rows from
any minority data types found that can't be cast as the majority data
type will be returned with a null value.

For ImportMixedTypes=Text, the data type for the whole column will be:

Jet (MS Access UI): 'Text' data type
DDL: VARCHAR(255)
ADO: adWChar ('a null-terminated Unicode character string')

Note that this is distinct from:

Jet (MS Access UI): 'Memo' data type
DDL: N/A
ADO: adLongVarWChar ('a long null-terminated Unicode string value')

ImportMixedTypes=Text will curtail text at 255 characters as 'Memo' is
cast as 'Text'. For a column to be recognized as 'Memo', majority type
must be detected, meaning the majority of rows detected must contain
256 or more characters.

But how many rows are scanned for each column before is decided that
mixed and/or what the majority type is? There is a second registry
Key, TypeGuessRows. This can be a value from 0-16 (decimal). A value
from 1 to 16 inclusive is the number of rows to scan. A value of zero
means all rows will be scanned.

There is one final twist. A setting of IMEX=1 in the connection
string's extended property determines whether the ImportMixedTypes
value is honored. IMEX refers to IMport EXport mode. There are three
possible values. IMEX=0 and IMEX=2 result in ImportMixedTypes being
ignored and the default value of 'Majority Types' is used. IMEX=1 is
the only way to ensure ImportMixedTypes=Text is honored. The resulting
connection string might look like this:

Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=C:\ db.xls;
Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'

Finally, although it is mentioned in MSDN articles that MAXSCANROWS
can be used in the extended properties of the connection string to
override the TypeGuessRows registry keys, this seems to be a fallacy.
Using MAXSCANROWS=0 in this way never does anything under any
circumstances. Put another way, is has just the same effect as putting
ONEDAYWHEN=0 in the extended properties, being none (not even an
error!). The same applied to ImportMixedTypes i.e. can't be used in
the connection string to override the registry setting.

In summary, use TypeGuessRows to get Jet to detect whether a 'mixed
types' situation exists or use it to 'trick' Jet into detecting a
certaint data type as being the majority type. In the event of a
'mixed types' situation being detected, use ImportMixedTypes to tell
Jet to either use the majority type or coerce all values as 'Text'
(max 255 characters).
Thanks to 'onedaywhen' for distributing this information.
__________________
"He's not the Messiah. He's a very naughty boy!" - Brian's mum

Can't find the answer? >> Try something new!
Become a Professional
Reply With Quote
Reply


Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump

Advertisement:





Free Publications
The ASP.NET 2.0 Anthology
101 Essential Tips, Tricks & Hacks - Free 156 Page Preview. Learn the most practical features and best approaches for ASP.NET.
subscribe
Programmers Heaven C# School Book -Free 338 Page eBook
The Programmers Heaven C# School book covers the .NET framework and the C# language.
subscribe
Build Your Own ASP.NET 3.5 Web Site Using C# & VB, 3rd Edition - Free 219 Page Preview!
This comprehensive step-by-step guide will help get your database-driven ASP.NET web site up and running in no time..
subscribe
 
 
-->