Search through a text file

chris58950

New member
Joined
Jul 2, 2003
Messages
2
I need a way to search through a log (text) file for a specific word. Should I load the file into a string variable and then use the InStr function to search for the word? The file can be very large at times (2-30 MB). Is there a limit on how much data can fit into a string? What would be the best way to do this? Thanks for your help.
 
You could either

1) Open the file (with a StreamReader) and use the StreamReader's ReadToEnd() method to put it into a String. Then use the String's IndexOf() function to find the substring.

or

2) Open the file (StreamReader again) and read each line individually, using the StreamReader.ReadLine() method inside a loop. Each time you read a line, search that line for the substring (using IndexOf, as in the first point). If it's there, you can stop the loop; if it's not, you keep looping. If you get to the end of the file, then you know the substring is not part of the file.
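
A minimal sketch of option 2, assuming VB 2005 or later (for Using and IsNot) and a file in an encoding that StreamReader detects by default. The module and function names (LineSearchSketch, FindLineContaining) are made up for illustration:

```vbnet
Imports System
Imports System.IO

Module LineSearchSketch
    ' Hypothetical helper: returns the 1-based number of the first line
    ' that contains word, or -1 if no line contains it.
    Function FindLineContaining(ByVal path As String, ByVal word As String) As Integer
        Dim lineNumber As Integer = 0
        ' StreamReader(path) defaults to UTF-8 with byte-order-mark detection.
        Using reader As New StreamReader(path)
            Dim line As String = reader.ReadLine()
            ' ReadLine returns Nothing once the end of the file is reached.
            While line IsNot Nothing
                lineNumber += 1
                ' IndexOf returns -1 when the substring is absent.
                If line.IndexOf(word) >= 0 Then Return lineNumber
                line = reader.ReadLine()
            End While
        End Using
        Return -1
    End Function
End Module
```

Only one line is held in memory at a time, so this stays small even on a 30 MB log, provided the file actually contains line breaks.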
 
I'd read in far more than one line at a time if the file had even a chance of exceeding a few hundred kilobytes, which you said it will. In other words, I wouldn't use ReadToEnd() or ReadLine(), since they are slower than frozen molasses. Additionally, using ReadToEnd() (or ReadLine() on a file with few or no carriage returns/line feeds) would cause a huge memory allocation, spiking a very small application up to 150MB+ if a 30MB file were loaded.

The following is an alternative that searches an entire 60MB file in under a second (in my tests):

Code:
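' Requires Imports System.IO (plus System.Windows.Forms for MessageBox).
Dim bufferSize As Integer = 102400
' Encoding.Unicode means UTF-16; pass the encoding the log is actually written in.
Dim file As StreamReader = New StreamReader(New FileStream("D:\Desktop\Log.txt", _
    FileMode.Open, FileAccess.Read, FileShare.Read), _
    System.Text.Encoding.Unicode, False, bufferSize)
Dim wordToFind As String = "Pencil"
' Two chunks wide: the previous chunk sits in the first half and the new one in
' the second, so a match that straddles a chunk boundary is still found.
Dim buffer(bufferSize * 2 - 1) As Char, counter As Integer

While file.Peek >= 0
    counter += 1
    ' ReadBlock fills the whole chunk unless the end of the file is reached.
    Dim charsRead As Integer = file.ReadBlock(buffer, bufferSize, bufferSize)

    ' Only include the characters actually read, not stale data from the last pass.
    Dim content As String = New String(buffer, 0, bufferSize + charsRead)

    ' Start just before the new chunk so boundary-straddling matches are caught.
    Dim location As Integer = content.IndexOf(wordToFind, bufferSize - wordToFind.Length)

    ' IndexOf returns -1 when the text is not found.
    If location >= 0 Then
        MessageBox.Show("Text found at: " _
            & ((counter - 1) * bufferSize + (location - bufferSize)).ToString(), _
            Application.ProductName)
        Exit While
    End If

    ' Slide the chunk just searched into the first half for the next pass.
    Array.Copy(buffer, bufferSize, buffer, 0, bufferSize)
End While

file.Close()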
 
Search Through a Text File

Hi Derek,

I modified your code slightly for my needs, but I'm having a problem. When the code gets to the If statement that checks whether location > 0, the location variable has a value of &HFFFFFFFFFFF. I'm not sure how this can happen if the data type of this variable is Integer. I thought it would throw an exception (unless this is a reference type). In any case, it doesn't work. Please help. Thank you.

----------------------

If file.Exists(strLogPath & strFileName) Then
    Try

        Dim file As StreamReader = New StreamReader(New FileStream(strLogPath & strFileName, _
            FileMode.Open, FileAccess.Read, FileShare.Read), _
            System.Text.Encoding.Unicode, False, bufferSize)
        Dim wordToFind As String = "FAIL"
        Dim buffer(bufferSize * 2) As Char, counter As Integer

        While file.Peek >= 0
            counter += 1
            file.Read(buffer, bufferSize, bufferSize)

            Dim content As String = New String(buffer)

            Dim location As Integer = content.IndexOf(wordToFind, bufferSize - wordToFind.Length)

            If location > 0 Then
                MessageBox.Show("Text found at: " _
                    & ((counter - 1) * bufferSize + (location - bufferSize)).ToString(), _
                    Application.ProductName)
                Exit While
                Return "Errors Encountered"
                Exit While
            ElseIf location = 0 Then
                Return "No Errors Found"
            End If

            Array.Copy(buffer, bufferSize, buffer, 0, bufferSize)
        End While

        file.Close()

    Catch ex As Exception
        MsgBox("Could not search through log file. This is what went wrong: " & ex.GetBaseException.Message)
    End Try

End If
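
For reference, an Integer shown as &HFFFFFFFF is -1 displayed in hexadecimal, and -1 is what String.IndexOf returns when the substring is not found. That would account for the value reported above, assuming the debugger is showing the variable in hex. The sketch below illustrates the return values only; the module and function names (IndexOfSketch, ContainsWord) are made up for illustration:

```vbnet
Imports System

Module IndexOfSketch
    ' Hypothetical helper: True when word occurs anywhere in content.
    Function ContainsWord(ByVal content As String, ByVal word As String) As Boolean
        ' IndexOf returns the zero-based position of the match, or -1 if there is none.
        Return content.IndexOf(word) >= 0
    End Function

    Sub Main()
        Dim location As Integer = "PASS PASS PASS".IndexOf("FAIL")
        Console.WriteLine(location)                ' prints -1 (not found)
        Console.WriteLine(location.ToString("X"))  ' prints FFFFFFFF
        Console.WriteLine("x FAIL".IndexOf("FAIL")) ' prints 2
    End Sub
End Module
```

So a "not found" branch should test for a negative result (location < 0 or location = -1) rather than location = 0, which would mean a match at the very first character.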
 