Xtreme Visual Basic Talk

GavinO 11-26-2003 07:35 PM

File I/O
File I/O in VB: Text Files

Visual Basic affords us an intrinsic system for working with files and the file system that allows us to perform most operations available in Windows Explorer or the DOS command line.

Opening a File

The first thing that you may want to do with a file is to open it. This operation is performed with the Open command:
Open <filename> For <mode> As <filenumber>
<filename> can be any valid path to the file: either an absolute path such as "c:\mydir\myfile.txt", or a path relative to the current directory such as "myfile.txt".

<mode> specifies the method by which you want to work with this file. There are 5 file modes:

-Input: This mode allows you to read text item by item from a text file. This mode is useful exclusively for plain text (human-readable) files.
-Output: This mode allows you to write text item by item to a text file. The resulting file is readable in the Input mode. The file is erased when opened in this mode.
-Append: This mode is the same as Output, but the file is not erased. Rather, writing starts at the end of the file. Useful for log files, among other things.
-Binary: Binary access reads however many bytes are necessary to fill the provided variable. It can read and write at any arbitrary byte, which is useful for reading some data files, such as image files. You specify your position in the file in bytes.
-Random: Is similar to Binary, but only allows reading and writing to record boundaries. The size of the record is specified in the open statement, or defaults to 128 bytes if not specified. You specify your position in the file by record number.

<filenumber> is an identifying number for the file. This is passed to subsequent statements to tell them which file to perform their operation on. This also means that you can have many files open at once. When dealing with a lot of files, it may not be easy to know what filenumbers are in use. To help with this, we have the FreeFile() function. By storing the return of this function, we get the lowest unused file number. Here is an example of opening a file using a call to FreeFile():
Dim FF As Integer
FF=FreeFile()
Open "myfile.txt" For Input As FF

Reading & Writing

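Text files are accessed with the Input # and Print # statements. The # sign before the file number is required in these statements (as well as in Write # and Line Input #); it is optional only in statements such as Open, Close, Get, Put, and Seek. Note that inside a form, Print without the # prints to the form rather than to a file. The Input # and Print # statements have this syntax:
Input #<filenumber>,<variable>
Print #<filenumber>,<expression>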
In both statements, <filenumber> is the number of the open file. If there is no open file with the number passed, an error will result.

Input reads from the file into the variable passed as <variable>. Strings are read separated by commas or grouped by quotes. If you read a numeric data type (integer, single, etc) VB will start reading from the end of white space (newlines, tabs, and spaces) until it finds a non-digit. Thus, if there are characters instead of digits, 0 will be read. This also means that more than 1 number can be on a line. User defined types and classes cannot be read from text files, as there is no way for VB to figure out how they are read.

Print writes the result of <expression> on a new line in the file. An expression is a literal ("word", 99, 1.2, etc), a variable, or a function (2+2, string1 & string2, etc). The following are all valid expressions:

str$(myinteger) & mystring
"this is a literal"

Sometimes we want to read a particular number of characters from a file into a string, regardless of how they are broken among lines. This is useful for reading an entire file at once, among other things. The Input function is used to do this:
Input(<number>,<filenumber>)
<number> specifies how many characters to get, and <filenumber> is the number that you assigned to the file when you opened it. This is a function, so you assign its return to a variable.
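
As a minimal sketch of the "entire file at once" case: the helper below reads a whole text file into one string. ReadWholeFile is a made-up name for illustration, and the sketch assumes a plain single-byte (ANSI) text file, where the byte count from LOF() matches the character count.

```vb
' Hypothetical helper (not part of VB): returns the full contents of a text file.
Public Function ReadWholeFile(ByVal sPath As String) As String
    Dim FF As Integer
    FF = FreeFile()
    Open sPath For Input As #FF
    ' LOF returns the file length in bytes; for plain ANSI text
    ' that is also the number of characters to ask Input for.
    ReadWholeFile = Input(LOF(FF), FF)
    Close #FF
End Function
```

You would call it like any other function, for example MyString=ReadWholeFile("myfile.txt").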

Another statement for writing to text files is Write:
Write #<filenumber>,<output>
This statement works in much the same way as Print: you can pass one or more things as <output>. The difference is that certain formatting changes are made when the data is written to the file. These changes can be summarized as follows:

-Individual pieces of data are separated by commas
-Strings are enclosed in quotation marks
-Dates and booleans are converted to a format readable in any locale (other languages and data formats)

These are some example Write statements, and what they write to a file:
Write #FF,99,100,75,"foo","bar",23.4 '99,100,75,"foo","bar",23.4
Write #FF,True,MyString,45.3 '#TRUE#,"contents of MyString",45.3
Write #FF,True,12.4,False '#TRUE#,12.4,#FALSE#
So, the principal difference between Print and Write is that Print is best suited for creating a report, or some other document that someone would read with an editor or print out. Write was made for saving data and reading it back into the computer, not producing a document. Of course, you can use Print to save data and read it back, but depending on the data, that can be more complicated. If you had 3 strings that made up an entry in an address book, for instance, it is easier to put those three strings on a single line in your file using Write:
Write #FF, String1, String2, String3
than using Print:
Print #FF, String1; ","; String2; ","; String3
In that example, if one of your strings had a comma in it, then the line written with Print would not read back in correctly, since Input would see more than 3 strings in the line. The line written using Write would read back correctly because the strings have double quotes around them. Strings that themselves contain a double quote character are a different matter: the VB documentation for Write # warns against writing strings with embedded quotation marks for use with Input #, because Input # will parse such a string as two separate strings.
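
To make the comma case concrete, here is a small sketch that writes one entry with Write # and reads it back with Input #. The file name and the field values are invented for the example.

```vb
' Sketch: save one address-book entry with Write # and read it back with Input #.
' "entry.txt" and the three field values are made up for illustration.
Public Sub RoundTripEntry()
    Dim FF As Integer
    Dim sName As String, sStreet As String, sCity As String

    FF = FreeFile()
    Open "entry.txt" For Output As #FF
    Write #FF, "Smith, John", "12 Main St", "Springfield"
    Close #FF
    ' entry.txt now holds one line:  "Smith, John","12 Main St","Springfield"

    FF = FreeFile()
    Open "entry.txt" For Input As #FF
    Input #FF, sName, sStreet, sCity   ' one variable per field
    Close #FF

    ' The comma inside the quotes did not split the first field
    Debug.Print sName
End Sub
```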

It should be mentioned that Print to a file behaves just as it does when printing to a form or a picturebox: a comma spaces the printed data into columns, and a trailing semicolon suppresses the CrLf, so that further prints continue on the same line.

Closing the File

While VB will close all files when your program ends, it is a good idea to close a file when you are done with it (you can always open it again later). One reason for this is that file numbers are global. You can pass a file number as an argument to other modules and classes, and they can operate on that file as though that module had opened it. Another reason is that should your program close in an uncontrolled manner (sudden process death, hard reboot, etc) damage may occur to the file. The Close statement works as follows:
Close <filenumber>
If <filenumber> isn't the number of an open file, no error will be thrown. Thus, there is never a danger of getting an error while closing a file.
If you don't specify a filenumber, then all open files are closed. You can also specify multiple filenumbers to close multiple files, but not necessarily all.

Special thanks to Guru passel for technical assistance with this tutorial

GavinO 11-26-2003 07:46 PM

File I/O in VB: Binary & Random Files

Binary and Random files are what you might consider 'real' files. They store data in the way that 'big' applications do, at the byte level. If you look at one in a text viewer, it is a garbled mess. To have a chance at making sense of them, you at least need a hex viewer, if not the program intended to read them.

What are they?

Binary files are composed of raw bytes representing the data written to them. There is no extra information stored about what a particular piece of data is; just the data. Something written as an integer might be read as a pair of bytes later, or the first byte of the integer may be read with the preceding byte to form a new integer. This is an efficient way of storing uncompressed data, as it assumes that the program reading the file will sort out what is what and get the appropriate meaning of each byte.

Random files store data in a compact way, like Binary files do, but with more structure. They are organized into units called 'records'. A record itself has no intrinsic structure, but has a size (in bytes) that is specified by you when you open the file. If you do not specify a size, it will default to 128 bytes. When you write certain kinds of data to a Random file, a bit of information about that data is also stored. Variable-length strings, for example, have their length stored with the string so that the appropriate number of characters will be read back. Provisions are made so that user-defined types are dealt with properly. Since you specify the record length, trying to write or read more data in a record than the record holds will give you a "Bad record length" error. If the data you put in the record is shorter than the record size, the rest of the bytes in the record are left unmodified. This is not an error, but those bytes have no meaning to the data you've written, so they are "garbage". Since the garbage (remaining) bytes in a record are not written, if the last record written to the disk is "short", the file will be slightly shorter than the record length * number of records would indicate.
Normally, a User Defined Type (UDT) with a fixed size (strings and arrays, if part of the UDT, are defined with fixed lengths and dimensions) is used with a Random file. That way all the records are the same size and every record is filled. Use the Len function on a variable of the UDT when specifying the record length in the Open statement; Len returns the size of the UDT as it is written to the file, whereas LenB returns its size in memory, which can be larger.
To set the record length, add a Len = term to your Open statement:
Open <filename> For Random As <filenumber> Len = <recordlength>

Reading & Writing

For both Binary and Random files, reading and writing are done with the Get and Put statements, respectively:
Get <filenumber>,<address>,<variable>
Put <filenumber>,<address>,<expression>
In both types of files, VB keeps track of the position of the next byte or record in sequence, so if you want to read or write sequentially you can omit the <address> argument (the commas stay in place, as in Get <filenumber>,,<variable>). If you specify an address, the position will be adjusted to read and write from there. Get must read into a variable; it can't read and return a value to be evaluated as part of an expression. Put can write the result of most any operation or variable.
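
A rough sketch of Get and Put on a Random file with a fixed-size UDT follows. The Contact type, its fields, and the file name are all invented for the example; the Type declaration has to sit in the declarations section of the module.

```vb
' Hypothetical record layout; fixed-length strings keep every record the same size.
Private Type Contact
    FullName As String * 30
    Phone As String * 15
    Age As Integer
End Type

Public Sub RandomFileDemo()
    Dim FF As Integer
    Dim c As Contact

    FF = FreeFile()
    ' Len(c) is the size of one record as it is written to the file
    Open "contacts.dat" For Random As #FF Len = Len(c)

    c.FullName = "Alice"     ' padded with spaces to 30 characters
    c.Phone = "555-0100"
    c.Age = 30
    Put #FF, 3, c            ' write record number 3 directly

    c.Age = 0                ' clear the field so the read below is visible
    Get #FF, 3, c            ' read record number 3 back
    Debug.Print RTrim$(c.FullName); c.Age

    Close #FF
End Sub
```

Because every record has the same length, record 3 can be written or read without touching records 1 and 2.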

Sometimes it is helpful to know where in the file you are, or to move to a particular place without reading or writing anything. This is accomplished with the Seek() function and the Seek statement:
Seek(<filenumber>)
Seek <filenumber>,<address>
These addresses work in the same manner as addresses above, in that they count either bytes or records from the beginning of the file, starting with 1.
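
Here is a short sketch of both forms of Seek on a Binary file. It assumes that "MyFile.txt" exists and is at least 10 bytes long; the function name is made up.

```vb
' Sketch: returns the file position after reading the byte at position 10.
' Assumes the file exists and holds at least 10 bytes.
Public Function SeekDemo(ByVal sPath As String) As Long
    Dim FF As Integer
    Dim b As Byte

    FF = FreeFile()
    Open sPath For Binary As #FF
    Debug.Print Seek(FF)      ' a newly opened file starts at position 1
    Seek #FF, 10              ' move to byte 10 without reading anything
    Get #FF, , b              ' reads byte 10 and advances the position
    SeekDemo = Seek(FF)       ' position of the next byte to be read
    Close #FF
End Function
```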

Special thanks to Guru passel for technical assistance with this tutorial

GavinO 12-04-2003 11:14 AM

File I/O in VB: File System

VB will let us do a lot with the file system without touching the dreaded FileSystemObject.

Messing With Directories

One of the things you may need to do is deal with directories. While these are in some models actually files, VB uses some special commands to deal with them. Here's the syntax:
MkDir <dirname>
RmDir <dirname>
ChDir <dirname>
The first two make and remove a directory, respectively. <dirname> is a path to the directory to be dealt with. If you want to remove a directory, it must be empty (Making it empty is explained later). ChDir is only loosely related to the other two. This statement changes the current directory, that is, the one used to calculate virtual paths:
ChDir "c:\"
Open "MyFile.txt" As 2 'This opens c:\myfile.txt
ChDir "c:\mydir"
Open "MyFile.txt" As 5 'while this opens c:\mydir\myfile.txt
One directory that you might want to switch to often is the directory that your program was run from. In the IDE, this is the directory that the project file resides in. For a compiled program, it is the directory that the EXE is in (unless specially configured otherwise). To get a path to this directory, we can use one of the properties of the App object, .Path:
ChDir App.Path
The value of this property does not include a last backslash ('\'). This means that to use App.Path in concatenation with a constant or a string with a filename in it, we need to insert the backslash ourselves:
Open App.Path & "\" & sFileName As 1
Open App.Path & "\MyFile.txt" As 2
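
One caveat: when the program sits in the root of a drive, App.Path does end in a backslash (for example "C:\"), and blindly adding another one gives a doubled separator. A small helper along these lines avoids that. JoinPath is a hypothetical name, not something VB provides.

```vb
' Hypothetical helper: joins a folder and a file name with exactly one backslash.
Public Function JoinPath(ByVal sFolder As String, ByVal sFile As String) As String
    If Right$(sFolder, 1) = "\" Then
        JoinPath = sFolder & sFile          ' folder already ends in a backslash
    Else
        JoinPath = sFolder & "\" & sFile    ' insert the missing backslash
    End If
End Function
```

It can then stand in for the manual concatenation, as in Open JoinPath(App.Path, "MyFile.txt") For Input As 1.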
If you want to know what directory is the current one, you can use the CurDir function. Since there is a current directory on each drive, you can specify which drive to return the current directory for. If no drive letter is specified, then the current drive is assumed.
CurDir [<driveletter>]
If there are multiple characters passed as <driveletter>, only the first one is used. Now that the topic of a current drive has been raised, you may want to change that as well. This is done with the ChDrive statement:
ChDrive <driveletter>
Once again, if multiple characters are passed as <driveletter>, only the first is used.

Messing With Files

It is frequently necessary to delete a file programmatically. This might be a temporary file, a savegame that is being cleared, or any number of other things. VB allows us to delete a file using the Kill statement:
Kill <path>
The path follows the same guidelines as the Open command, that is, it can be an absolute path, virtual path, etc.

Sometimes we don't need to delete a file, but just give it a different name. This is accomplished with the aptly named Name statement. You pass the original path and the new path as arguments:
'Syntax
Name <oldpath> As <newpath>
'Examples
Name "C:\MyFile.txt" As "C:\ThatFile.txt" 'renames the file to ThatFile.txt
Name "C:\ADir\ThisFile.txt" As "C:\ThatDir\ThisFile.txt" 'moves the file without renaming
Changing the extension of a file with the Name statement does nothing to change the actual format, that is, a JPEG file will still be encoded as a JPEG even if you change the extension to .BMP with Name. If you specify a new path that is different than the old path (the directory is different), then the file will be moved to the new location, and if the name is different, renamed. While a file is open, it cannot be moved or renamed.

To copy a file, you use the FileCopy statement. This works much like the Name statement, except that there is a more conventional syntax, and you end up with two files at the end. You cannot copy an open file:
FileCopy <sourcepath>,<targetpath>
If you've used Windows for any period of time (or most any operating system), you'll have noticed that there are certain attributes that files have that you don't see when you read or write the file. These include the last modified date, read-only status, and other things. To retrieve the date on which a file was last modified, we use the FileDateTime function:
FileDateTime(<filepath>)
This function returns a value of type Date. There are a number of ways of manipulating Date data, but they are beyond the scope of this tutorial. If you assign a Date to the caption of most controls, it will be formatted according to the user's settings on that particular system.
The other attributes of files can be retrieved with the GetAttr function. The return of this function is an integer that can be bitmasked to retrieve a particular piece of data:
GetAttr(<filepath>)
VB defines the bitmasks that are used to retrieve the various file attributes with constants:

vbNormal    0  The file is normal
vbReadOnly  1  The file is read-only
vbHidden    2  The file is hidden (user can't usually see it)
vbSystem    4  The file is a system file. That makes it important
vbDirectory 16 The file is actually a directory/folder
vbArchive   32 The file has changed since the last time it was backed up

Bitmasking is a simple process. We just use the AND logical operator in conjunction with the mask that we want to check for:
IsReadOnly=GetAttr("MyFile.txt") And vbReadOnly
IsArchive=GetAttr("MyFile.txt") And vbArchive
Setting these attributes is accomplished with the SetAttr statement. You pass the file path and the integer representing the state that you want to set the attributes to:
SetAttr <filepath>,<attributes>
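To set many attributes at once, combine the constants with Or (adding them also works, as long as each constant appears only once). To change just one attribute without affecting the others (this is preferred), read the current attributes with GetAttr, then use Or to set a bit or And Not to clear it. Plain addition and subtraction are unsafe here: subtracting vbHidden from a file that is not hidden, for instance, produces a wrong attribute value.
'Make the file Read-Only and Hidden
SetAttr "MyFile.txt",vbReadOnly Or vbHidden
'Make the file no longer hidden
SetAttr "ThisFile.txt",GetAttr("ThisFile.txt") And Not vbHidden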
The last bit of information that VB can give you about a file is its length. You could, of course, open the file and use the LOF() function, but the FileLen function provides a much cleaner, more efficient method of doing this:
FileLen(<filepath>)
If the file is open, then FileLen will return the length of the file before it was opened, so if you have since appended data to the end, this amount will not be indicated.

Finding Files

A common question on the forum is how to find a file based on certain criteria. A common answer is to shell the DOS 'dir' command and pipe the output to a file. There is a better way, with the Dir$() function:
Dir$(<filepath>[,<attributes>])
This function returns the name of the first file that matches the given filepath, or an empty string if nothing matches. To retrieve subsequent files, call Dir$() again with no arguments. The <filepath> can include wildcards (either * or ?) and any kind of path. The attributes can be used to filter the files as well. These attributes are slightly different from those used with the *Attr commands:

vbNormal    0  Normal
vbHidden    2  Hidden files
vbSystem    4  System files
vbVolume    8  Volume label; other flags are ignored if this is present
vbDirectory 16 Find directories (as well as files)

When you ask for directories, you will usually also get the '.' and '..' entries (the current directory and the parent directory). Note that the list of files and directories returned from the Dir$() function is not sorted. This makes it desirable to store the returns in an array so that you might sort the files yourself before presenting them to the user. This also gives you the opportunity to break the extension off of the file names.
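
A minimal sketch of collecting the results of Dir$() into an array, ready for sorting, might look like this. ListFiles is a made-up helper name; it looks for files only and does not pass an attributes argument.

```vb
' Hypothetical helper: fills sNames() with every file name matching sPattern
' and returns how many were found. The array is left untouched if none match.
Public Function ListFiles(ByVal sPattern As String, ByRef sNames() As String) As Long
    Dim sFound As String
    Dim nCount As Long

    sFound = Dir$(sPattern)              ' first match, or "" if there is none
    Do While Len(sFound) > 0
        ReDim Preserve sNames(0 To nCount)
        sNames(nCount) = sFound          ' name only, without the directory
        nCount = nCount + 1
        sFound = Dir$()                  ' next match for the same pattern
    Loop
    ListFiles = nCount
End Function
```

A call such as nFiles=ListFiles(App.Path & "\*.txt", MyNames) would then leave the matching names in MyNames(0) to MyNames(nFiles-1).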

Edit by Iceplug: See also: A quick note on App.Path

GavinO 02-23-2004 05:17 PM

File I/O in VB: Fast Access

While the techniques for reading files shown thus far are fairly efficient for reading data from a file once, they quickly begin to show a speed disadvantage once you start accessing parts of a file multiple times. To remedy this, we can read the entire file into an array or a string, and manipulate it further once it's in memory.

Creating & Filling the Buffer

The place in memory where we put the contents of the file is called the buffer. The two principal ways to do this are with an array or with a string. Which you select depends on what you want to do. If you're reading a text file and want to read it in the form of words or lines, you'd read it into a string and Split it. If you want to access the file as a series of integers (actually, any integer type, such as Byte, Integer, and Long), you want an array. In either case, you need to declare a data structure that can be resized to match the size of the file. This demonstrates how to do this with both a dynamic array and a variable-length string (use one or the other; Len(<type>) stands for the size in bytes of one element):
Dim MyArray() As <type>
Dim MyString As String
Dim FF As Integer
FF=FreeFile()
Open "MyFile.txt" For Binary As FF
'For an array, do this (the array is zero-based, so the upper bound is one less than the element count):
ReDim MyArray(LOF(FF)\Len(<type>)-1)
Get FF,,MyArray
'For a string, do this:
MyString=Space(LOF(FF))
Get FF,,MyString
Close FF
Note that if you're using a Byte array (as is usually done), each element is one byte, so you can leave the Len() part out of the ReDim line.

Accessing the Buffer

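You use a buffer much as you would a file. The difference is that accessing a random location in memory is faster than accessing a random location in a file on disk. To access the nth byte of the file through the buffer, just index into the array. Keep in mind that the array is zero-based by default while file positions start at 1, so the nth byte is at index n-1:
MyByte=MyArray(n-1)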
As for using a string, you usually want to split the file using some delimiter. This delimiter is most commonly a newline (vbNewLine or vbCrLf), space (' '), comma (','), or pipe ('|'). VB provides the friendly Split() function to break up the string:
'The array should be dynamic
Dim MyArray() As String
Dim MyString As String
'Imagine that we read this from a file
MyString="This,list,is,comma delineated"
'Break the string by commas
MyArray=Split(MyString,",")
The result of the above code is:

MyArray(0)="This"
MyArray(1)="list"
MyArray(2)="is"
MyArray(3)="comma delineated"

Once you have the string broken into an array, you can easily index into it, as seen above. After a bit of use, it becomes easy to see how this 'speed reading' technique can help your programs gain speed in certain situations.

Closing Notes

On a modern computer, there is a certain tradeoff point where buffering a file no longer provides a speed advantage. Once the buffer reaches around 10-20% of system memory, Windows will begin to use hard disk space instead of memory, defeating the purpose of the buffer operation. In testing, however, the optimal buffer size turned out to be much lower: around 64K (65536 bytes), whether due to memory architecture or how VB works. Testing was performed on a system running Windows 2000 with 512 MB of memory.
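
For files too large to buffer whole, one way to apply the 64K figure is to walk through the file one chunk at a time. The sketch below counts how often a given byte value occurs; the task and the function name are invented purely to give the loop something to do.

```vb
' Sketch: process a file in 64K chunks instead of loading it whole.
' CountByteInFile is a hypothetical example task, not a VB function.
Public Function CountByteInFile(ByVal sPath As String, ByVal bTarget As Byte) As Long
    Const CHUNK As Long = 65536
    Dim FF As Integer
    Dim Buffer() As Byte
    Dim nLeft As Long, nRead As Long, i As Long

    FF = FreeFile()
    Open sPath For Binary Access Read As #FF
    nLeft = LOF(FF)                          ' bytes still to be read
    Do While nLeft > 0
        If nLeft < CHUNK Then nRead = nLeft Else nRead = CHUNK
        ReDim Buffer(0 To nRead - 1)         ' size the buffer to this chunk
        Get #FF, , Buffer                    ' fills the whole array from the file
        For i = 0 To nRead - 1
            If Buffer(i) = bTarget Then CountByteInFile = CountByteInFile + 1
        Next i
        nLeft = nLeft - nRead
    Loop
    Close #FF
End Function
```

The last chunk is usually shorter than the rest, which is why the buffer is resized on every pass rather than once up front.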

Special thanks to Loquin for his assistance

Gruff 02-25-2008 05:31 PM

Reading Lines from the end of a text file.

A typical method to get the last few lines of a text file is to read the file serially.
You can either read the entire thing into memory, which is wasteful of resources, or read in each line one at a time, which can waste time.

A different approach is to use the Seek statement to jump near the end of the file and read just the lines you desire.
To do this you need to open the file as binary.

The process outline is:

1) Determine the File size.
2) Estimate roughly the number of bytes you wish to read.
3) Open the file as binary.
4) Move the file pointer to the position of: <FileSize> - <number of bytes to read>
5) Read <number of bytes to read> into a buffer of characters
6) Create an array of strings by splitting the Buffer using the vbCrLf character pair as a delimiter
7) Return the desired number of lines of text from the end of the array of strings


This snippet of code loads a user defined block of text from the end of the file. Then parses that into lines of text. (For my needs that was about 1K.)

Private Sub GetLastFewLines(ByRef LST As ListBox, ByVal sFileName As String, _
                            ByVal BufferSize As Long, ByVal LineCount As Long)
    '*** Task: Fills a listbox with text from the end of a file by using
    '    a user defined filename, buffer size, and text line count. ***
    ' (BufferSize is a Long so that BufferSize * 2 cannot overflow an Integer.)
    Dim FF As Integer
    Dim FSize As Long
    Dim sBuffer As String
    Dim sLines() As String
    Dim i As Long

    FSize = FileLen(sFileName)
    FF = FreeFile()
    Open sFileName For Binary As #FF

    ' Set the Buffer size
    If FSize <= (BufferSize * 2) Then
        ' File size is not much bigger than the block so create a buffer as large as the entire file.
        sBuffer = Space$(FSize)
        ' The file pointer is at the start of the file by default
    Else
        ' Create a Buffer of the user defined size
        sBuffer = Space$(BufferSize)
        ' Move the file pointer to the beginning of a block of text at the end of the File.
        ' Positions start at 1, so the last BufferSize bytes begin at FSize - BufferSize + 1.
        Seek #FF, FSize - BufferSize + 1
    End If

    ' Read the chunk of text into the buffer
    Get #FF, , sBuffer
    Close #FF

    ' Nip off last empty lines if they exist
    Do While Right$(sBuffer, 2) = vbCrLf
        sBuffer = Left$(sBuffer, Len(sBuffer) - 2)
    Loop

    ' Create Array of strings (the first one may be a partial line)
    sLines = Split(sBuffer, vbCrLf)

    ' Determine amount of lines to list
    If UBound(sLines) < LineCount Then LineCount = UBound(sLines) + 1

    LST.Clear

    ' Fill List
    For i = UBound(sLines) - (LineCount - 1) To UBound(sLines)
        LST.AddItem sLines(i)
    Next i
End Sub


Gruff 04-17-2013 01:18 PM

Reading Full Lines from a Text File

I thought it might be a good idea to add a subsection on reading full lines of text from a text file serially.

You open a text file for input as you would in GavinO's excellent basic File IO tutorial and use the 'Line Input' statement in a loop.

Open FileName For Input As <FileNumber>
Do While Not EOF(<FileNumber>)
    Line Input #<FileNumber>, <String Variable>
    ' Process <String Variable>
Loop
Close <FileNumber>

The entire line of text up to but not including the NewLine characters is returned.
If you want the NewLine characters in the string variable you will have to add them yourself.

Dim sHold As String
Open FileName For Input As <FileNumber>
Do While Not EOF(<FileNumber>)
    Line Input #<FileNumber>, sHold
    TextBox.Text = TextBox.Text & sHold & vbCrLf
Loop
Close <FileNumber>
