Find/Extract string from a text file
#1
06-07-2011, 11:24 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

I've got an application that passes commands to a terminal window and saves all the output to a text file.


Here is one of the commands my application passes to the terminal
MyProcess.StandardInput.WriteLine("host " + device)

The output of which is
"HostA has address Y.Y.Y.Y"


This, along with a whole bunch of other text, is saved to a text file. My question is: how do I find the string "HostA has address Y.Y.Y.Y" in that text file, extract the IP address, and assign it to a string variable?

Thanks
#2
06-08-2011, 06:42 AM
DrPunk
Senior Contributor
* Expert *
Join Date: Apr 2003
Location: Never where I want to be
Posts: 1,403

A lot kinda depends on the size of the file, i.e. how you'd read the file in the first place to then allow you to search it.

Also, bear in mind that reading the file might get in the way of writing to it. It might be worth making a copy of the file first, so that you read the copy rather than the file that is being written to, and then deleting your copy when you're done with it.

But once you've read it and have the lines of the file (you might read the file all in one go and then split it by carriage return line feeds to get each line, or you might read the file line by line and check each line as you read it) then you could see if the string Contains "has address" and work from there.

Going on the assumption that the line contains only ("HostA has address Y.Y.Y.Y") then you could have...

Code:
' Assumes you have the line of the file in a variable called fileLine

' Instead of writing "has address" lots of times, use a variable (it could be a constant,
' but you could also put this into a function and pass in what you want to check)
Dim toCheck As String = "has address"

' A variable to store the position of the check in the string, if we find it, to extract
' the information with
Dim start As Integer

' Variables to store the information in
Dim host As String
Dim address As String

' Check to see if the string contains what we are looking for (in assumed variable fileLine)
' To avoid case problems, just upper everything.
If fileLine.ToUpper.Contains(toCheck.ToUpper) Then

    ' Find in the string where it is
    start = fileLine.ToUpper.IndexOf(toCheck.ToUpper)

    ' Get the host name from the line. It's at the start of the line, up to the position of
    ' the check minus 1 (to allow for the space between the host name and "has address")
    host = fileLine.Substring(0, start - 1)

    ' Get the address from the line. It starts after the check string
    ' (start + toCheck.Length) plus one (to allow for the space between "has address"
    ' and the address) to the end of the line.
    address = fileLine.Substring(start + toCheck.Length + 1)
End If
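
A rough sketch of the copy-then-read idea mentioned earlier in this post, assuming the file is small enough to read in one go. The names FindAddressLine and logPath are made up for illustration and are not part of the original application.

```vbnet
Imports System
Imports System.IO

Module CopyThenReadSketch
    ' Hypothetical helper: copies the log to a temporary file, scans the copy,
    ' and returns the first line containing "has address" (or Nothing).
    Function FindAddressLine(ByVal logPath As String) As String
        Dim copyPath As String = Path.GetTempFileName()
        Try
            ' Overwrite the empty temp file with a snapshot of the log.
            File.Copy(logPath, copyPath, True)
            For Each fileLine As String In File.ReadAllLines(copyPath)
                If fileLine.IndexOf("has address", StringComparison.OrdinalIgnoreCase) >= 0 Then
                    Return fileLine
                End If
            Next
            Return Nothing
        Finally
            ' Delete the copy whether or not a match was found.
            File.Delete(copyPath)
        End Try
    End Function
End Module
```

The copy can still fail if the writing process holds an exclusive lock on the file, so treat this as a starting point rather than a guarantee.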
__________________
There are no computers in heaven!
#3
06-08-2011, 09:43 AM
AtmaWeapon
Fabulous Florist
Forum Leader
* Guru *
Join Date: Feb 2004
Location: Austin, TX
Posts: 9,500

This is a good place to use the oft-overused regular expression. I'm not going to say it's easier than DrPunk's suggestion, but depending on your level of familiarity it can be more elegant.

Regular expressions are a programming language used to describe patterns in strings. They're really useful when you're looking for a string that matches a very specific pattern and you want to extract certain pieces of it.

When writing a regular expression, it helps to describe the pattern in pieces, then convert those pieces into the regular expression (regex) language. Let's start with a full description of the string:
  1. A hostname, which probably consists of alphanumeric characters only.
  2. The string "has address" with whitespace on either side.
  3. An IP address, which consists of four sets of up to three digits separated by periods.
Now let's convert those descriptions into regex:
  1. \w+
  2. \s*has address\s*
  3. \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
"\w" is a shortcut for "word characters". This includes a-z, A-Z, 0-9, and the _ character. That's probably a good match for hostnames; if you need dots you can adjust. "\s" is a shortcut for whitespace characters, including spaces, tabs, and sometimes line breaks. "\d" means digit; for our purposes that's 0-9. "\." is a literal period; a bare "." would match any character at all, so it has to be escaped. "+" means "at least one of the previous character class". "*" means "zero or more of the previous character class." "{1,3}" means "one, two, or three of the previous character class." Put it all together and you get:

"At least one word character followed by the string "has address" with optional whitespace on either side followed by four groups of 1-3 digits."

You can actually compress #3 using some clever regex constructs, but we'll ignore that for simplicity.

That's an adequate description of your pattern. Now there's one thing left. Parentheses in regex perform grouping; if you surround something with parentheses you can later extract the string that matched the pattern inside them. Since you want the IP, you'd need to use this pattern (note the dots are escaped as "\." so they only match a literal period):
Code:
\w+\s*has address\s*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
Now, to use it. I'm going to assume you already have a technique to provide each line of the file. DrPunk discussed some techniques but I suggest either a loop using System.IO.StreamReader.ReadLine() or the System.IO.File.ReadAllLines() method, depending on the file's size. If you're unclear on that, it's a topic for another thread. Here's what a function that gets the IP from a line if it matches the pattern looks like (note you probably have to put "Imports System.Text.RegularExpressions" at the top of your file):
Code:
Private Const Pattern As String = "\w+\s*has address\s*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
Private _regex As New Regex(Pattern)

Function GetIpOrNull(ByVal line As String) As String
    Dim match As Match = _regex.Match(line)
    If match.Success Then
        Dim ip As String = match.Groups(1).Value
        Return ip
    Else
        Return Nothing
    End If
End Function
The Match() function creates a Match object that represents the first match, if any, of the regex. Its Success property indicates if there was a match. If the line you give this function matches the pattern, the IP is extracted from the match's Groups property (there is always a group 0 that is the entire match; that's why I used 1 for the first group.) If the match was not successful, the function returns Nothing to indicate the line had no IP address to extract.
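
If the hostname is wanted as well, groups can be given names instead of numbers. The following is a sketch under a couple of assumptions: hostnames may contain dots and hyphens (hence [\w.-]+), and the names NamedGroupSketch and TryGetHostAndIp are invented for this example.

```vbnet
Imports System.Text.RegularExpressions

Module NamedGroupSketch
    ' [\w.-]+ widens the hostname part to allow dots and hyphens (an assumption
    ' about what the hostnames look like).
    Private Const Pattern As String = _
        "(?<host>[\w.-]+)\s+has address\s+(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
    Private ReadOnly _regex As New Regex(Pattern, RegexOptions.IgnoreCase)

    ' Returns True and fills host/ip when the line matches; otherwise False.
    Function TryGetHostAndIp(ByVal line As String, ByRef host As String, ByRef ip As String) As Boolean
        Dim m As Match = _regex.Match(line)
        If Not m.Success Then
            Return False
        End If
        host = m.Groups("host").Value
        ip = m.Groups("ip").Value
        Return True
    End Function
End Module
```

Named groups read a little better than Groups(1) and Groups(2) once there is more than one thing to pull out of a line.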

Is it better than using Substring()? Not really, but it's not worse either. Sometimes it's more expressive about what you're looking for. It's an alternative.
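
On the earlier remark that the IP part of the pattern can be compressed: one way, shown here only as a sketch, is to write the first group of digits once and then repeat "a dot followed by digits" three times. "(?: )" is a non-capturing group, and the module and function names are made up for the example.

```vbnet
Imports System.Text.RegularExpressions

Module CompressedIpSketch
    ' One group of 1-3 digits, then exactly three repetitions of ".digits".
    Private Const IpPattern As String = "\d{1,3}(?:\.\d{1,3}){3}"

    ' Returns the first IP-shaped substring, or Nothing if there is none.
    Function FindIp(ByVal text As String) As String
        Dim m As Match = Regex.Match(text, IpPattern)
        If m.Success Then
            Return m.Value
        End If
        Return Nothing
    End Function
End Module
```

Like the longer form, this only checks the shape of the address; it does not verify that each number is 255 or less.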
__________________
.NET Resources
My FAQ threads | Tutor's Corner | Code Library
I would bet money 2/3 of .NET questions are already answered in one of these three places.
#4
06-08-2011, 11:27 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

Thanks for the replies

I've decided to go for the option I understand best which was provided by DrPunk...Here is my code:

Code:
Public Class Form1

    Dim toCheck As String = "has address"
    Dim start As Integer
    Dim host As String
    Dim address As String


    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim str As String
        str = "yahoo.com has address 87.87.127.118 and there are no other associated addresses"

        If str.ToUpper.Contains(toCheck.ToUpper) Then
            start = str.ToUpper.IndexOf(toCheck.ToUpper)
            host = str.Substring(0, start - 1)
            address = str.Substring(start + toCheck.Length + 1)
            MsgBox(address)
        End If
    End Sub

End Class
It works great, but it displays everything after the IP address as well: "87.87.127.118 and there are no other associated addresses"
#5
06-09-2011, 03:07 AM
DrPunk
Senior Contributor
* Expert *
Join Date: Apr 2003
Location: Never where I want to be
Posts: 1,403

Yeah, my solution assumed there was nothing after the address (the substring without a length parameter gets everything to the end of the string).

It's difficult to suggest a solution without knowing exactly what else there might be in that string. Your new addition to the string suggests to me that there might even be more than one IP Address you need to get out of the string (i.e. if there ARE other associated addresses).
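
If more than one address can turn up in the same string, one option (a sketch only, borrowing the regular expression idea from earlier in the thread rather than Substring) is to collect every IP-shaped piece of text. FindAllIps is a made-up name.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module AllAddressesSketch
    Private Const IpPattern As String = "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

    ' Returns every IP-shaped substring in the text, in order of appearance.
    Function FindAllIps(ByVal text As String) As List(Of String)
        Dim found As New List(Of String)
        For Each m As Match In Regex.Matches(text, IpPattern)
            found.Add(m.Value)
        Next
        Return found
    End Function
End Module
```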

The simplest solution to what you've given would be to look for the first occurrence of a space character in the extracted address (which will find you the space after the address) and take the substring from the start up to, but not including, that space.
#6
06-09-2011, 04:21 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

Thanks DrPunk but my knowledge on VB/programming is quite limited
I'm most familiar with using constants for the method...eg Substring(3, 3).
#7
06-09-2011, 05:05 AM
DrPunk
Senior Contributor
* Expert *
Join Date: Apr 2003
Location: Never where I want to be
Posts: 1,403

Well it's not much different to what we've already done.

Once you've read the address, look for a space in it. If we find one, then take the substring of the address up to where we found the space.

E.g.

Code:
' A variable to store the position of the first space, if there is one
Dim firstSpace As Integer

' This is already part of the code and it's after this line that we're interested in.
address = str.Substring(start + toCheck.Length + 1)

' After reading the address, look for a space
firstSpace = address.IndexOf(" ")

' If firstSpace comes back as -1 then it means it didn't find one in which case we don't
' need to do anything
If firstSpace > -1 Then
    address = address.Substring(0, firstSpace)
End If
That should remove everything after the IP address.
#8
06-09-2011, 05:14 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

That worked brilliantly!!!...I'm still going to have to read through it thoroughly and muck around with it to understand it a bit better

Code:
Public Class Form1

    Dim toCheck As String = "has address"
    Dim start As Integer
    Dim host As String
    Dim address As String
    Dim firstSpace As Integer


    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim str As String
        str = "yahoo.com has address 87.87.127.118 and there are no other associated addresses"

        If str.ToUpper.Contains(toCheck.ToUpper) Then
            start = str.ToUpper.IndexOf(toCheck.ToUpper)
            host = str.Substring(0, start - 1)
            address = str.Substring(start + toCheck.Length + 1)
            firstSpace = address.IndexOf(" ")
            If firstSpace > -1 Then
                address = address.Substring(0, firstSpace)
            End If
            MsgBox(address)
        End If
    End Sub

End Class
#9
06-09-2011, 09:42 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

Ok looks like I've still got a minor problem...str is no longer a single line string. It's multi-line as I've read text from a text file into it. So MsgBox(address) now shows:

192.168.0.10
user1@jumpserver:~$
#10
06-09-2011, 10:20 AM
AtmaWeapon
Fabulous Florist
Forum Leader
* Guru *
Join Date: Feb 2004
Location: Austin, TX
Posts: 9,500

With every complication you add the regex becomes more and more appropriate.

It already supports having text after the address and ignores that text. The whitespace character class can include line breaks with minor tweaks. But if you insist on using the more primitive methods, you're starting to approach the neighborhood where a small parser becomes easier than a tunnel of substrings. We're talking maybe 1 line of change to the code I posted, likely it'd just take a new parameter to the Regex constructor.
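
To make that concrete, here is a sketch that runs the pattern over a whole multi-line string instead of a single line. As far as I can tell, .NET's "\s" already matches carriage returns and line feeds, so the pattern may need no tweak at all for this case; the module and function names are invented for the example.

```vbnet
Imports System.Text.RegularExpressions

Module MultiLineSketch
    Private Const Pattern As String = _
        "\w+\s*has address\s*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

    ' Searches text that may contain many lines and returns the first IP found.
    Function GetIpFromText(ByVal allText As String) As String
        Dim m As Match = Regex.Match(allText, Pattern, RegexOptions.IgnoreCase)
        If m.Success Then
            Return m.Groups(1).Value
        End If
        Return Nothing
    End Function
End Module
```

Because the capture group stops at the last digit, a trailing line such as a shell prompt is simply left out of the result.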

If you really don't want to use Regex, I think we're in a feedback loop and these waste time. I think you can help us out a lot if you'd post a real example of what the file looks like, preferably attached so we know the forum doesn't eat any formatting. That'd help us catch all the little gotchas that exist in the file. Otherwise there's no telling how many iterations we'll have to walk through. Of course, if you control how the file is created the ideal would be to make it stick to a definite format. Too many more little gotchas and it starts to become easier to write a tokenizing parser than figure out how to write the maze of Substring() calls you'll require.
#11
06-09-2011, 11:01 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

Hi AtmaWeapon,

I haven't been coding for long and haven't come across Regex, which is why I was hesitant to use it...but I am willing to try it

Here is the copy of the text file...sorry for taking up your time...hope this speeds things up.
Attached Files
File Type: txt test.txt (692 Bytes, 6 views)
#12
06-09-2011, 02:03 PM
AtmaWeapon
Fabulous Florist
Forum Leader
* Guru *
Join Date: Feb 2004
Location: Austin, TX
Posts: 9,500

I say the best thing to do is to read the file line by line in this situation. When you said there were multiple lines, I was expecting something like this could happen:
Code:
yahoo.com has address
1.1.1.1
and there are...
Where you don't know where the line breaks could occur. It looks like if a line break occurs, it has to be at the end of one of the interesting lines. If that's true, it'd be best to separate the file into lines so that "end of string" is synonymous with "end of where the IP is located."

Depending on how you're loading your file, this could be easy. If you're loading the file all at once using something like File.ReadAllText(), switch to File.ReadAllLines() to get a string array where each element is a line. If you're using a StreamReader to read in the file contents, you're already getting it one line at a time. That'll have to be something figured out separately.
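
For reference, the two loading styles look roughly like this. It is a sketch with made-up names (LoadAllLines, LoadMatchingLines, filePath); the second version keeps only the lines that contain the text of interest.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module LineLoadingSketch
    ' Option 1: read everything at once; fine for small files.
    Function LoadAllLines(ByVal filePath As String) As String()
        Return File.ReadAllLines(filePath)
    End Function

    ' Option 2: read one line at a time and keep only the interesting lines.
    Function LoadMatchingLines(ByVal filePath As String, ByVal target As String) As List(Of String)
        Dim matches As New List(Of String)
        Using reader As New StreamReader(filePath)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                If line.IndexOf(target, StringComparison.OrdinalIgnoreCase) >= 0 Then
                    matches.Add(line)
                End If
                line = reader.ReadLine()
            End While
        End Using
        Return matches
    End Function
End Module
```

The Using block closes the reader even if something goes wrong partway through the file.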

Let's say you've got all the file's lines read and you have a function that processes each line. Here's how we'd separate the pieces without regex:
Code:
* Take $line as a parameter
* Suppose $target is "has address"
* If $line contains $target (case-insensitive):
    * The hostname is all of the string from the beginning to the index of
            $target.
    * The IP address is all of the string past the end of $target.
This translates to code well. Don't bother with .ToUpper() every single time you want to make a comparison; you can hijack the IndexOf() method that takes a StringComparison to get the same thing. You need the index of that string anyway. For example:
Code:
' The text that separates the hostname from the IP address.
Private Const Target As String = "has address"

Function GetHostOrNull(ByVal line As String) As String
    Dim targetIndex As Integer = line.IndexOf(Target, StringComparison.OrdinalIgnoreCase)
    If targetIndex <> -1 Then
        ' Everything before the target is the hostname; Trim() drops the separating space.
        Dim host As String = line.Substring(0, targetIndex).Trim()
        Return host
    End If

    Return Nothing
End Function

Function GetIPOrNull(ByVal line As String) As String
    Dim targetIndex As Integer = line.IndexOf(Target, StringComparison.OrdinalIgnoreCase)
    If targetIndex <> -1 Then
        ' Everything after the target is the IP address; Trim() drops the separating space.
        Dim ipIndex As Integer = targetIndex + Target.Length
        Dim ip As String = line.Substring(ipIndex).Trim()
        Return ip
    End If

    Return Nothing
End Function
Homework: Figure out how to combine the two functions. Fun!
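
One possible answer to the homework, offered as a sketch rather than the intended solution: find the target once and hand back both pieces through ByRef parameters. TryParseLine is a made-up name, and the line is assumed to hold nothing after the IP address.

```vbnet
Imports System

Module CombinedSketch
    Private Const Target As String = "has address"

    ' Splits "host has address ip" into its two halves.
    ' Returns False when the line does not contain the target text.
    Function TryParseLine(ByVal line As String, ByRef host As String, ByRef ip As String) As Boolean
        Dim targetIndex As Integer = line.IndexOf(Target, StringComparison.OrdinalIgnoreCase)
        If targetIndex = -1 Then
            Return False
        End If
        ' Trim() removes the spaces on either side of the target text.
        host = line.Substring(0, targetIndex).Trim()
        ip = line.Substring(targetIndex + Target.Length).Trim()
        Return True
    End Function
End Module
```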
#13
06-13-2011, 10:57 AM
zatarboy
Regular
Join Date: Oct 2004
Posts: 63

Ok it took a while but I managed to get my head around RegEx
This isn't all my code...just the bit that reads the test.txt file and finds the IP address...Thanks to all of you for your time and patience

Code:
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.IO

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim fName As String = "C:\Users\first.last\Documents\test.txt"
        Dim testTxt As New StreamReader(fName)
        Dim allRead As String = testTxt.ReadToEnd()
        testTxt.Close()

        Dim m As Match = Regex.Match(allRead, _
                 "\w+\s*has address\s*(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})", _
                 RegexOptions.IgnoreCase)

        If (m.Success) Then
            Dim key As String = m.Groups(1).Value
            MessageBox.Show(key)
        End If

    End Sub
End Class