out-of-memory with just 136Mb txt
Xtreme Visual Basic Talk

#1  StorM_GmA, 06-18-2013, 05:17 AM

Ok, the idea is simple: I want to edit the contents of some .txt files but some of them are using vbLf for new-lines. So the first thing I have to do is to replace vbLf with vbCrLf. Leaving aside the unnecessary code, I do this:

Code:
Open App.Path & "/" & txtPath.Text & ".txt" For Binary As #1 'I open the file as binary
Dim strBuff As String: strBuff = Space(LOF(1))               'I create a string variable and put as many spaces as the size of the .txt
For m = 1 To Len(strBuff)                                    'I go through the string char-by-char
    DoEvents
    If (Mid(strBuff, m, 1) = vbLf) Then                      'finding vbLf
        strBuff = Left(strBuff, m - 1) & vbCrLf & Mid(strBuff, m + 1, Len(strBuff)) 'and trying to replace them. Here is the problem
        m = m + 1
    End If
Next m
Every time I try to do anything with the strBuff variable, I get an Out Of Memory error.

I have tried the Replace() function:
Code:
strBuff = replace(strBuff, vbLf, vbCrLf)
I have tried using a separate variable for storing the processed data:
Code:
strTemp = replace(strBuff, vbLf, vbCrLf)
strBuff = strTemp
I have tried a different approach to the replacement, taking the text to the left and right of each vbLf (dropping the vbLf itself) and putting vbCrLf between them:
Code:
strBuff = Left(strBuff, m - 1) & vbCrLf & Mid(strBuff, m + 1, Len(strBuff))

or

strTemp = Left(strBuff, m - 1) & vbCrLf & Mid(strBuff, m + 1, Len(strBuff))
strBuff = strTemp
Everything gives me an Out Of Memory error. Any ideas why this is happening?

#2  Flyguy, 06-18-2013, 09:22 AM

Why do you need to replace the vbLF with vbCrLF before processing the data?

Could you not modify the data line by line?

#3  StorM_GmA, 06-18-2013, 09:47 AM

No, because if the .txt uses vbLf, there is no "line". My .txt files contain data like this:

Code:
soijfs
asfoasidv
asdosdj
23rwer9
sd-s0dfaa
And I want to store each line in an array, like this:

Code:
lst(0) = soijfs
lst(1) = asfoasidv
lst(2) = asdosdj
lst(3) = 23rwer9
lst(4) = sd-s0dfaa
But if the .txt file uses vbLf, when I try to execute:
Code:
Line Input #readFromFile, lst(m)
It stores all the data in the first array element, like:
Code:
lst(0) = soijfsasfoasidvasdosdj23rwer9sd-s0dfaa
lst(1) = empty
lst(2) = empty
lst(3) = empty
lst(4) = empty
because apparently, VB needs to find vbCrLf to move to the next line.

#4  Gruff, 06-18-2013, 10:30 AM

This should work as long as you have enough memory to hold the entire file.

Code:
Dim sLines() As String
Dim AllText As String
Dim FF As Integer

FF = FreeFile()
Open "MyFile.txt" For Binary As #FF
AllText = Space$(LOF(FF))
Get #FF, , AllText
Close #FF

' Create a zero-based array of strings.
sLines = Split(AllText, vbLf)

' Clear AllText
AllText = ""

' Show first line of text.
MsgBox sLines(0)

' Show last line of text.
MsgBox sLines(UBound(sLines))

Last edited by Gruff; 06-18-2013 at 10:46 AM.

#5  StorM_GmA, 06-18-2013, 12:08 PM

Thank you very much. From a quick test, it seemed to work (although it took 5-10 minutes to split a 136MB .txt into 14,344,391 array elements with an i7 @ 4.5GHz and 32GB RAM). There are still 2 problems:

1) I can't control split() to add some DoEvents so the program appears as Not Responding during the process. I would also like to add a progress bar since the process takes so long.

2) When I stop the program and re-run it, I get Out Of Memory error in the line where I store the entire .txt to a string:

Code:
Dim strBuff As String: strBuff = Space(LOF(1))
Get #1, , strBuff    '<---- HERE
If I close Visual Studio and open it again, I don't get the Out Of Memory error. Any ideas why that is happening?

#6  Gruff, 06-18-2013, 12:55 PM

No idea. Perhaps 20-year-old VB6 has memory limitations when using Get for binary reads.

You are not going to have much luck using DoEvents while reading the file.
Your main thread is fully occupied reading the file.

You do not say what you are doing with the file data once you read it.

Are you in control of populating the file?
Are you processing the line and outputting it to another file?
Are you trying to store the data in a control on a form?
Are you just searching for particular lines?
Does your file grow and grow continuously?

What is the purpose of your program?

Depending on the above questions there may be better solutions.



As far as locking up goes with your current code, you could read your file in chunks and process each chunk before you load the next one.

This may take a bit longer than loading the entire file but should alleviate lockup and out-of-memory errors. You could possibly have a working progress bar in this scenario.

Calculate your chunk size and set your progress bar min and max based on how many chunks are in your total file size. After reading a chunk adjust your progress bar index.
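
A minimal sketch of that chunk and progress-bar arithmetic, not tested against the 136MB file. It assumes the code lives in a form that has a ProgressBar control named pbRead; pbRead, ReadInChunks and ProcessChunk are hypothetical names that do not appear in this thread, and the 1 MB chunk size is an arbitrary choice.

Code:
Private Sub ReadInChunks(ByVal sPath As String)
    Const CHUNK_SIZE As Long = 1048576     ' 1 MB per read, arbitrary
    Dim FF As Integer
    Dim lSize As Long, lChunks As Long, lRemain As Long
    Dim i As Long
    Dim sChunk As String

    FF = FreeFile()
    Open sPath For Binary Access Read As #FF
    lSize = LOF(FF)
    If lSize = 0 Then
        Close #FF
        Exit Sub
    End If

    ' Total number of chunks, rounded up so a trailing partial chunk is counted.
    lChunks = (lSize + CHUNK_SIZE - 1) \ CHUNK_SIZE
    pbRead.Min = 0
    pbRead.Max = lChunks
    lRemain = lSize

    For i = 1 To lChunks
        ' The last chunk is usually shorter than CHUNK_SIZE.
        If lRemain < CHUNK_SIZE Then
            sChunk = Space$(lRemain)
        Else
            sChunk = Space$(CHUNK_SIZE)
        End If
        Get #FF, , sChunk                  ' reads Len(sChunk) characters
        lRemain = lRemain - Len(sChunk)

        ProcessChunk sChunk                ' hypothetical: split and handle the lines
        pbRead.Value = i                   ' one progress step per chunk
        DoEvents                           ' keep the form responsive between chunks
    Next i

    Close #FF
End Sub
Because DoEvents runs once per chunk rather than once per character, the form can repaint without the per-character overhead of the loop in the first post.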

#7  Flyguy, 06-18-2013, 01:34 PM

Somewhere in the code library is a VB6 class which is capable of reading DOS and Unix files, line by line. Can't find it at the moment.

Otherwise I will upload it tomorrow.

#8  StorM_GmA, 06-18-2013, 01:41 PM

Ok. I am collecting .txt files containing lists of names of various things, like all the names from Shakespeare's books, names of cities, towns, drinks etc. From the .txt files I have collected so far, I have noticed some things that I want to change:

1) remove comments (most of the time it's the source of the file and they all start with # symbol)
2) remove records that contain white spaces (for example Sex on the Beach drink should be removed)
3) remove white spaces from the beginning or the end of a record
4) sort records
5) remove duplicates
6) save the records into a new .txt file

So the idea is:
  1. open .txt
  2. store each line of .txt into an array
  3. do all the edits mentioned above to the array
  4. save array into a new .txt

All my problems are in step 2.
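
For reference, a minimal sketch of edits 1, 2, 3 and 5 applied to a single record once step 2 works. CleanRecord, AddIfNew and the late-bound Scripting.Dictionary are illustrative assumptions, not code from my program; sorting (edit 4) is left out because VB6 has no built-in array sort, and I have not checked how a Dictionary behaves with 14 million keys.

Code:
' Returns True, with the trimmed text in sOut, when the record should be kept.
Private Function CleanRecord(ByVal sIn As String, ByRef sOut As String) As Boolean
    sOut = Trim$(sIn)                             ' edit 3: Trim$ strips spaces only, not tabs
    If Len(sOut) = 0 Then Exit Function           ' skip blank lines
    If Left$(sOut, 1) = "#" Then Exit Function    ' edit 1: comment line
    If InStr(sOut, " ") > 0 Then Exit Function    ' edit 2: white space inside the record
    If InStr(sOut, vbTab) > 0 Then Exit Function
    CleanRecord = True
End Function

' Edit 5: a dictionary keeps only the first occurrence of each record.
' dict is created once by the caller: Set dict = CreateObject("Scripting.Dictionary")
Private Sub AddIfNew(ByVal dict As Object, ByVal sLine As String)
    Dim sRec As String
    If CleanRecord(sLine, sRec) Then
        If Not dict.Exists(sRec) Then dict.Add sRec, Empty
    End If
End Sub
Trimming before the white-space test matters: a record with only leading or trailing spaces is kept, while one like "Sex on the Beach" is dropped.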

My program works perfectly with almost all my .txt files, no matter whether they use vbCrLf or vbLf. All the problems arise when I try to edit a 136MB file which uses vbLf. I created a 150MB .txt with random data but with vbCrLf as the line separator, and my program worked perfectly.

vbLf is my arch enemy!

I use Vim to open large .txt files and it has a find-and-replace function. I used it to replace Lf (\n) with CrLf (\r\n) and it gave me an Out Of Memory error too!!!

#9  Gruff, 06-18-2013, 02:00 PM

If Flyguy can find the library he is talking about you should be able to read items line by line just as you would with a file containing vbCrLf line ends.

Otherwise the chunk method I mentioned earlier would definitely work.
One option would be to read in a chunk, process the lines of text, output the cleaned up chunk to your new file, read in a chunk, ...

Basically once you read a chunk you split it by vbLf. Process all but the last line.
Read the next chunk and append it to the last line. This handles partial lines in the last line in the chunk.

Code:
Dim Hold As String
Dim Chunk As String
Dim sLines() As String
Dim i As Long

Do
    ' Read Chunk here (and count it in ChunkCount)
    Hold = Hold & Chunk
    sLines = Split(Hold, vbLf)
    For i = 0 To UBound(sLines) - 1
        ' Process lines
    Next i
    ' Keep the (possibly partial) last line for the next pass
    Hold = sLines(UBound(sLines))
Loop Until ChunkCount = ChunkTotal

' Process last line (Hold) here

BTW you might want to consider writing your final output to a true database.

That way you could slice, dice, sort, search your end data with ease using simple Queries.

Last edited by Gruff; 06-18-2013 at 02:08 PM.

#10  StorM_GmA, 06-18-2013, 07:07 PM

I hope Flyguy finds the code! The output should be in .txt because it is going to be fed to another program. Funny thing is that the other program runs on Linux, so vbLf would be perfect!

At some point I considered the chunk method because I found a function in FSO that can read a specific amount of data from a file. The first problem I faced when I tried it: if I read chunks of, let's say, 1000 chars and the .txt uses vbCrLf (which means Chr(13) & Chr(10)), what guarantees that the 1000th char isn't Chr(13), so that I accidentally chop a vbCrLf in two? The length of the chunks was hard-coded so I couldn't increase/decrease it or put if-statements in it.

This is so frustrating I'm considering learning a new programming language just to make it happen! I'm using VB just because it's easy to draw the form...

#11  Flyguy, 06-19-2013, 01:58 AM

Code:
'---------------------------------------------------------------------------------------
' Module    : clsFile
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (pDst As Any, pSrc As Any, ByVal ByteLen As Long)

Private m_bIsUnix As Boolean
Private m_lBufferSize As Long
Private m_iID As Integer
Private m_bFileOpen As Boolean

Private m_lFileSize As Long
Private m_lBytesRead As Long
Private m_bEOF As Boolean

Private m_lBlockPointer As Long
Private m_lNofBlocks As Long
Private m_lBlock As Long

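Public Property Get Progress() As Double
  ' Guard against division by zero (empty file, or file size not known yet)
  If m_lFileSize > 0 Then
    Progress = 100# * m_lBytesRead / m_lFileSize
    If Progress > 100 Then Progress = 100
  End If
End Property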

'---------------------------------------------------------------------------------------
' Procedure : EndOfFile
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Public Property Get EndOfFile() As Boolean
  If m_bFileOpen Then
    If m_bIsUnix Then
      EndOfFile = m_bEOF
    Else
      EndOfFile = EOF(m_iID)
    End If
  End If
End Property

'---------------------------------------------------------------------------------------
' Procedure : CloseTextStream
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Public Sub CloseTextStream()
  If m_bFileOpen Then
    Close m_iID
    m_bFileOpen = False
  End If
End Sub

'---------------------------------------------------------------------------------------
' Procedure : OpenTextStream
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Public Function OpenTextStream(sFilename As String, Optional bUnix As Boolean) As Boolean
  On Error GoTo errHandler
  
  If Not m_bFileOpen Then
    
    m_bIsUnix = bUnix
    
    If Len(Dir$(sFilename)) > 0 Then
      If bUnix Then
        OpenTextStream = OpenUnixFile(sFilename)
      Else
        OpenTextStream = OpenDosFile(sFilename)
      End If
    End If
  
    If Not OpenTextStream Then
      Close #m_iID
      m_bFileOpen = False
    End If
  
  End If
  
  Exit Function

errHandler:
End Function

'---------------------------------------------------------------------------------------
' Procedure : ReadLine
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Public Function ReadLine() As String
  If m_bFileOpen Then
    If m_bIsUnix Then
      ReadLine = ReadUnixLine
      m_lBytesRead = m_lBytesRead + Len(ReadLine) + 1
    Else
      ReadLine = ReadDosLine
      m_lBytesRead = m_lBytesRead + Len(ReadLine) + 2
    End If
  End If
End Function

'---------------------------------------------------------------------------------------
' Procedure : OpenDosFile
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Private Function OpenDosFile(sFilename As String) As Boolean
  On Error GoTo errHandler
  
  m_iID = FreeFile
  Open sFilename For Input As m_iID
  m_lFileSize = LOF(m_iID)   ' Progress needs the file size in DOS mode too
  m_bFileOpen = True
  OpenDosFile = m_bFileOpen
  Exit Function

errHandler:
End Function

'---------------------------------------------------------------------------------------
' Procedure : OpenUnixFile
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Private Function OpenUnixFile(sFilename As String) As Boolean
  On Error GoTo errHandler
  
  m_iID = FreeFile
  Open sFilename For Binary As m_iID
  
  m_lFileSize = LOF(m_iID)
  m_lNofBlocks = m_lFileSize \ m_lBufferSize
  m_lBlock = 0
  m_lBlockPointer = -1
  
  m_bFileOpen = True
  OpenUnixFile = True
  Exit Function

errHandler:
End Function

'---------------------------------------------------------------------------------------
' Procedure : ReadDosLine
' DateTime  : 18-11-2005
' Author    : Will Barden
' Purpose   :
'---------------------------------------------------------------------------------------
Private Function ReadDosLine() As String
  Dim sLine As String
  
  If Not EOF(m_iID) Then
    Line Input #m_iID, sLine
    ReadDosLine = sLine
  End If
End Function

Private Function ReadUnixLine() As String
  Dim bFound As Boolean
  Dim lUbound As Long, lStart As Long, lLen As Long
  Static bBuffer() As Byte
  
  ' Get a new block of data
  If m_lBlockPointer = -1 Then
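    ' Stop when past the last block, or when the last block would be empty
    ' (file size is an exact multiple of the buffer size, or the file is empty)
    If m_lBlock > m_lNofBlocks Or _
       (m_lBlock = m_lNofBlocks And (m_lFileSize Mod m_lBufferSize) = 0) Then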
      m_bEOF = True
      Exit Function
    End If
    bBuffer = ReadBlock(m_lBlock)
    m_lBlock = m_lBlock + 1
  End If
  
  ' Start of the new string
  lStart = m_lBlockPointer + 1
  
  lUbound = UBound(bBuffer)
  Do Until m_lBlockPointer = lUbound Or bFound
    m_lBlockPointer = m_lBlockPointer + 1
    bFound = (bBuffer(m_lBlockPointer) = 10)
  Loop
  
  If bFound Then
    ' End of line found, build string 
    lLen = m_lBlockPointer - lStart
    If lLen > 0 Then ReadUnixLine = BytesToString(bBuffer, lStart, lLen)
  Else
    ' No EOL, 1st part from current buffer, 2nd part from second buffer
    m_lBlockPointer = -1
    lLen = lUbound - lStart + 1
    If lLen > 0 Then
      ReadUnixLine = BytesToString(bBuffer, lStart, lLen) & ReadUnixLine()
    Else
      ReadUnixLine = ReadUnixLine()
    End If
  End If
  
End Function

Private Function ReadBlock(lBlock As Long) As Byte()
  Dim lLen As Long
  Dim bBuffer() As Byte
  
  If lBlock = m_lNofBlocks Then
    lLen = m_lFileSize Mod m_lBufferSize
    ReDim bBuffer(lLen - 1)
  Else
    ReDim bBuffer(m_lBufferSize - 1)
  End If
  
  Get #m_iID, , bBuffer
  
  ReadBlock = bBuffer
End Function


'---------------------------------------------------------------------------------------
' Procedure : BytesToString
' DateTime  : 24/7/02
' Author    : Will Barden
' Purpose   : converts a part of a byte array to a string
'---------------------------------------------------------------------------------------
Private Function BytesToString(ByRef bArr() As Byte, ByVal StartIndex As Long, ByVal Length As Long) As String
  BytesToString = Space$(Length)
  CopyMemory ByVal BytesToString, bArr(StartIndex), Length
End Function

Private Sub Class_Initialize()
  m_lBufferSize = 102400
End Sub

Private Sub Class_Terminate()
  ' Just to be sure
  If m_bFileOpen Then Close m_iID
End Sub
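
A minimal usage sketch for the class above, using only its public members (OpenTextStream, ReadLine, EndOfFile, Progress, CloseTextStream). It assumes the class module is named clsFile, as in its header comment, and that the code runs inside a form; ReadNames and the caption update are illustrative only.

Code:
Private Sub ReadNames(ByVal sPath As String)
    Dim oFile As clsFile
    Dim sLine As String
    Dim lLines As Long

    If Len(Dir$(sPath)) = 0 Then Exit Sub          ' file not found
    If FileLen(sPath) = 0 Then Exit Sub            ' nothing to read

    Set oFile = New clsFile
    If oFile.OpenTextStream(sPath, True) Then      ' True = Unix (vbLf) line ends
        Do Until oFile.EndOfFile
            sLine = oFile.ReadLine
            ' In Unix mode the call that detects end-of-file can return an
            ' empty string, so empty lines are skipped here.
            If Len(sLine) > 0 Then
                lLines = lLines + 1
                ' ... store or process sLine here ...
            End If
            If (lLines Mod 50000) = 0 Then
                Me.Caption = Format$(oFile.Progress, "0") & "% read"
                DoEvents                           ' let the form repaint now and then
            End If
        Loop
        oFile.CloseTextStream
    End If
    Set oFile = Nothing
End Sub
In Unix mode end-of-file is only flagged by the read that runs out of data, so EndOfFile is tested at the top of the loop and every returned line is processed; testing it right after ReadLine would drop a last line that has no trailing vbLf.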

#12  loquin, 04-07-2014, 11:30 AM

When you're processing a text file in chunks (assuming that the length of any line is less than the chunk size), the final array element resulting from the Split (i.e. the last line of the first chunk) could contain
  • the entire line
  • an empty line
  • The entire line plus the vbCR character
  • Just the vbCR character

The first line in the subsequent chunk (and subsequent array) could be
  • the remainder of the prior line (the most likely occurrence)
  • an empty string, meaning that the vbNewLine was the first two characters in the chunk, and that the preceding text line was, in fact, a complete line
  • vbLf
In all three of these cases, if you have saved the prior partial line (in all cases except the first chunk) and you concatenate the prior chunk's last line to the subsequent chunk's first line, you have a complete line to work with. On the first chunk, it doesn't matter if you concatenate the prior line, because it would be, by default, an empty string anyway.

What you do is to add a variable (call it sTemp for this description) to hold the final array element of the chunk. Of course, initially, it will be an empty string which is what you want. Inside your chunk loop, read the chunk, concatenate the new string variable to the beginning of the chunk (sChunk = sTemp & sChunk,) then split the chunk on vbNewLine.

After the chunk is read, split and processed (except for the final array element,) assign that last array element to the temp variable, then set the last array element equal to an empty string. (You now JOIN the array into a single string variable using vbLF as your delimiter (since it's going to a Linux box for further processing,) and save the chunk out to your target output file.)

The next chunk is read, the [usually] partial trailing line from the prior chunk is inserted at the beginning of the next chunk (which will reconstruct the broken line, if there was one.) Now, Split on vbNewLine and repeat.

On the final chunk, after (and outside) the chunk loop, the chunk size is reduced to fit the remaining characters left in the source file, and the last line in the resulting array is processed rather than being saved in sTemp.

At the beginning, I like to calculate the number of chunks to process and the remainder chunk size by an integer division of filesize by chunksize, and Modulus of filesize by chunksize, respectively. That way, you can use a for/next loop for processing all but the final (partial) chunk. Obviously, if the remainder chunksize is 0, sTemp contains the entire last line of the file, so only it needs to be saved to the end of the target file.
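
A minimal, untested sketch of the steps above. It departs from the description in one respect: it splits on vbLf and strips a trailing vbCr from each line, instead of splitting on vbNewLine, so the same loop handles both kinds of file and a vbCrLf cut in two at a chunk boundary. NormalizeLineEnds, JoinLines and the 1 MB chunk size are assumptions made for illustration.

Code:
Private Sub NormalizeLineEnds(ByVal sSrc As String, ByVal sDst As String)
    Const CHUNK As Long = 1048576          ' 1 MB per read, arbitrary
    Dim fIn As Integer, fOut As Integer
    Dim lFull As Long, lRest As Long, n As Long
    Dim sChunk As String, sTemp As String, sOut As String
    Dim sLines() As String

    If Len(Dir$(sDst)) > 0 Then Kill sDst  ' Binary mode does not truncate an old file
    fIn = FreeFile()
    Open sSrc For Binary Access Read As #fIn
    fOut = FreeFile()
    Open sDst For Binary Access Write As #fOut

    lFull = LOF(fIn) \ CHUNK               ' number of full chunks
    lRest = LOF(fIn) Mod CHUNK             ' size of the final partial chunk

    For n = 1 To lFull
        sChunk = Space$(CHUNK)
        Get #fIn, , sChunk
        sLines = Split(sTemp & sChunk, vbLf)
        sTemp = sLines(UBound(sLines))     ' carry the (usually partial) last line
        If UBound(sLines) > 0 Then
            ReDim Preserve sLines(UBound(sLines) - 1)
            sOut = JoinLines(sLines) & vbLf
            Put #fOut, , sOut              ' Put needs a variable, not an expression
        End If
    Next n

    ' Final partial chunk: nothing is carried over any more.
    If lRest > 0 Then
        sChunk = Space$(lRest)
        Get #fIn, , sChunk
        sTemp = sTemp & sChunk
    End If
    If Len(sTemp) > 0 Then
        sLines = Split(sTemp, vbLf)
        sOut = JoinLines(sLines)
        Put #fOut, , sOut
    End If

    Close #fOut
    Close #fIn
End Sub

' Drops a trailing vbCr from each line, then joins the lines with vbLf.
Private Function JoinLines(sLines() As String) As String
    Dim i As Long
    For i = LBound(sLines) To UBound(sLines)
        If Right$(sLines(i), 1) = vbCr Then
            sLines(i) = Left$(sLines(i), Len(sLines(i)) - 1)
        End If
    Next i
    JoinLines = Join(sLines, vbLf)
End Function
Any per-line editing would go where JoinLines walks the array, before the lines are joined and written.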
Last edited by loquin; 04-07-2014 at 11:52 AM. Reason: clarification