Binary File Reading Code Optimization
Hi guys,
I am having a problem with my binary file reading, and wonder if anybody knows a better way to achieve what I am getting at. I am trying to read in a binary database file record by record. Each record is split into fields, and each record contains different data types (which are known at runtime). I have to cast each data field to an appropriate .NET type, and perform a calculation on each one. So far, so good, except the performance is not what I had hoped.
In the database there are around 50 million records, each 32 bytes, so roughly 1.6 GB in total. I need to complete the full read in less than 45 seconds, which works out to about 36 MB/s, and so far I cannot get it to run in less than 150 seconds.
I am reading the fields like this (binary reader is already assigned):
Code:
Public Function Read() As Boolean
    If Me.cursor >= Me.recordcount Then
        Return False
    End If
    Try
        ' Instantiate the custom structure that holds the byte array for this record.
        Me.currentRecord = New FoxproDataRecord(Me.recordsize)
        ' buffer is a System.Collections.Queue holding the next 100 records.
        If Me.buffer.Count = 0 Then
            Me.RefillBuffer(Me.buffer)
        End If
        ' Assign the byte array inside the custom structure by pulling the
        ' next byte array from the queue.
        Me.currentRecord.data = CType(Me.buffer.Dequeue(), Byte())
        ' Increment the record counter.
        Me.cursor += 1
        Return True
    Catch ex As Exception
        Throw New System.Data.DataException("File is not accessible.")
    End Try
End Function
So the idea is that a custom structure points to the current record, and a queue holds the next 100 records; the queue is incrementally dequeued and then refilled. This is the code that refills the queue:
Code:
Public Sub RefillBuffer(ByRef buffer As Queue)
    For i As Integer = 0 To 99
        ' Add a record to the queue if any records remain.
        If Me.currentfillpointer < Me.recordcount Then
            buffer.Enqueue(Me.dbfReader.ReadBytes(Me.recordsize))
            Me.currentfillpointer += 1
        Else
            Exit For
        End If
    Next
End Sub
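One variation on the refill idea, shown here only as an untested sketch using the same `dbfReader`, `recordsize`, `recordcount`, and `currentfillpointer` members as above, would be to issue a single large `ReadBytes` call per chunk and slice it into per-record arrays, so the reader is hit once per 100 records instead of once per record:

Code:
' Sketch only: refill the queue with one large read instead of 100 small ones.
Public Sub RefillBufferChunked(ByRef buffer As Queue)
    Dim remaining As Integer = Me.recordcount - Me.currentfillpointer
    Dim count As Integer = Math.Min(100, remaining)
    If count <= 0 Then Return

    ' One ReadBytes call for the whole chunk.
    Dim chunk As Byte() = Me.dbfReader.ReadBytes(count * Me.recordsize)

    ' Slice the chunk into per-record byte arrays.
    For i As Integer = 0 To count - 1
        Dim record(Me.recordsize - 1) As Byte
        System.Array.Copy(chunk, i * Me.recordsize, record, 0, Me.recordsize)
        buffer.Enqueue(record)
        Me.currentfillpointer += 1
    Next
End Sub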
And finally this is the code for the custom structure that holds the data for individual records:
Code:
Public Structure FoxproDataRecord
    Public data As Byte()
    Private length As Integer

    ' Constructor: takes the record length in bytes.
    Public Sub New(ByVal dataLength As Integer)
        length = dataLength
        ' VB array bounds are inclusive, so subtract 1 to get dataLength bytes.
        data = New Byte(dataLength - 1) {}
    End Sub
End Structure
The actual data casts are running reasonably quickly, but the data reading is just not fast enough. Does anyone have any ideas on how I can speed this up?
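For reference, this is roughly how the reader is opened; a minimal sketch, assuming the file path is a placeholder and the 1 MB buffer size is a guess to be tuned, of giving the underlying FileStream a large buffer so the small per-record reads mostly hit memory instead of disk:

Code:
' Sketch: open the reader over a FileStream with a large buffer.
' "data.dbf" and the 1 MB buffer size are placeholders, not my real values.
Dim fs As New System.IO.FileStream("data.dbf", IO.FileMode.Open, _
    IO.FileAccess.Read, IO.FileShare.Read, 1 << 20)
Dim dbfReader As New System.IO.BinaryReader(fs)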
Thanks,
Adam