Speeding up serialization?

Arokh

[Posting frenzy he he]

I've been using serialization for some time now, and
it is perfect for easily storing data.

But an issue has come up:
Currently I have ~3000 entries (~7.5 KB each) with file information packed into an array,
which is deserialized when the program opens and serialized when it closes.

Deserializing takes almost 8 seconds, which is already too long for my purposes, and the number of entries will continue to grow.

Here is the class I use for storing file information:
Code:
    <Serializable()> Class FileClass
        Public FilePath As New FilePathClass
        Public ID(1) As Long
        Public Data As New DataClass

        <System.Serializable()> Class DataClass
            Public DataContents As New Dictionary(Of String, Object)

            Default Public Property Data(ByVal Path As String) As Object
                Get
                End Get
                Set(ByVal Value As Object)
                End Set
            End Property
        End Class
    End Class
Most of the data is held in DataContents in DataClass.

I'm using binary serialization since the XML one doesn't work (the Dictionary object isn't supported, if I understood the error correctly).

So is there some way to speed it up,
or is serialization not meant to store larger amounts of data?
If not, what alternative ways to store it are there?

My Serialization Methods:
Code:
    Private Shared Function DeSerializeObj(ByVal FilePath As String) As Object
        Dim SettingLoader As New Runtime.Serialization.Formatters.Binary.BinaryFormatter

        ' Using closes the stream on every exit path, including exceptions.
        Using SettingFileStream As IO.Stream = IO.File.OpenRead(FilePath)
            Try
                Return SettingLoader.Deserialize(SettingFileStream)
            Catch ex As Exception
                MsgBox("Couldn't load " & FilePath & ". Specific settings have been reset.")
                Return Nothing
            End Try
        End Using
    End Function

    Private Shared Sub SerializeObj(ByVal Obj As Object, ByVal FilePath As String)
        ' Create the target directory if it does not exist yet.
        Dim ParentPath As String = My.Computer.FileSystem.GetParentPath(FilePath)
        If Not My.Computer.FileSystem.DirectoryExists(ParentPath) Then
            My.Computer.FileSystem.CreateDirectory(ParentPath)
        End If

        Dim SettingSaver As New Runtime.Serialization.Formatters.Binary.BinaryFormatter

        ' Using closes the stream even if Serialize throws.
        Using SettingFileStream As IO.Stream = IO.File.Create(FilePath)
            SettingSaver.Serialize(SettingFileStream, Obj)
        End Using
    End Sub
 
Serialization is slow because it is so easy to use

The reason serialization is so easy to use is that the .NET serialization process embeds additional information with the underlying useful data which tells the deserializer what it is deserializing. This will include information about the types of object being deserialized, and may involve the data being encoded. All of this adds overhead.

To reduce this additional overhead, you can write your own bespoke code for reading and writing data. This will involve explicitly writing each object field to the Stream when you save, and reading each field from the Stream when you load.
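
As a rough sketch of what such bespoke code could look like for a Dictionary(Of String, Object): the module name ManualDictionaryIO, the two tag constants, and the assumption that every value is either a String or another nested Dictionary(Of String, Object) are mine, not taken from the original class. Any other value type throws, so this would need extending for real data.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module ManualDictionaryIO
    ' Hypothetical type tags, written before each value so the reader knows what follows.
    Private Const TagString As Byte = 0
    Private Const TagDictionary As Byte = 1

    ' Writes the entry count, then for each entry: key, tag byte, value.
    Public Sub WriteDictionary(ByVal writer As BinaryWriter, ByVal dict As Dictionary(Of String, Object))
        writer.Write(dict.Count)
        For Each entry As KeyValuePair(Of String, Object) In dict
            writer.Write(entry.Key)
            If TypeOf entry.Value Is String Then
                writer.Write(TagString)
                writer.Write(DirectCast(entry.Value, String))
            ElseIf TypeOf entry.Value Is Dictionary(Of String, Object) Then
                ' Nested dictionaries are written recursively.
                writer.Write(TagDictionary)
                WriteDictionary(writer, DirectCast(entry.Value, Dictionary(Of String, Object)))
            Else
                ' Assumption: only strings and nested dictionaries are stored.
                Throw New NotSupportedException("Unsupported value type for key " & entry.Key)
            End If
        Next
    End Sub

    ' Reads back exactly what WriteDictionary wrote, in the same order.
    Public Function ReadDictionary(ByVal reader As BinaryReader) As Dictionary(Of String, Object)
        Dim count As Integer = reader.ReadInt32()
        Dim dict As New Dictionary(Of String, Object)(count)
        For i As Integer = 1 To count
            Dim key As String = reader.ReadString()
            Dim tag As Byte = reader.ReadByte()
            Select Case tag
                Case TagString
                    dict.Add(key, reader.ReadString())
                Case TagDictionary
                    dict.Add(key, ReadDictionary(reader))
                Case Else
                    Throw New InvalidDataException("Unknown type tag " & tag.ToString())
            End Select
        Next
        Return dict
    End Function
End Module
```

To use it, wrap a FileStream in a BinaryWriter (or BinaryReader) inside a Using block and pass it in. Because only the keys, a one-byte tag, and the raw values are written, there is no per-object type metadata, which is where the saving would come from.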
 
How long is it taking to serialize the data? Is this a similar amount of time compared to deserializing the data?

In your data class, what are you storing in the dictionary? Using Object can incur overhead in terms of run-time casting etc.
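
If it helps with answering the timing question, something like the following could measure both directions separately. This is only a sketch: the module and function names are made up, it assumes the .NET Framework (2.0 or later for System.Diagnostics.Stopwatch, and a runtime where BinaryFormatter is still available), and a single run is only a rough indication.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Binary

Module SerializationTiming
    ' Serializes data to filePath, reads it back, and returns the elapsed
    ' milliseconds as a two-element array: {serialize, deserialize}.
    Public Function MeasureRoundTrip(ByVal data As Object, ByVal filePath As String) As Long()
        Dim formatter As New BinaryFormatter()
        Dim timer As Stopwatch = Stopwatch.StartNew()

        Using outStream As Stream = File.Create(filePath)
            formatter.Serialize(outStream, data)
        End Using
        Dim serializeMs As Long = timer.ElapsedMilliseconds

        ' Restart the stopwatch so the second figure covers deserialization only.
        timer.Reset()
        timer.Start()
        Using inStream As Stream = File.OpenRead(filePath)
            formatter.Deserialize(inStream)
        End Using
        Dim deserializeMs As Long = timer.ElapsedMilliseconds

        Return New Long() {serializeMs, deserializeMs}
    End Function
End Module
```

Calling MeasureRoundTrip with the real array and a scratch file path would show whether loading is really that much slower than saving, or whether both are slow.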
 
I tried serialization in one of my projects too, but it was too slow, so I ended up looping through all the items and saving them a little like this:
Code:
    Public Sub SaveRecording(ByVal FilePath As String)

        ' Ensure that the target does not exist.
        System.IO.File.Delete(FilePath)

        ' Declare those variables!
        Dim sFileName As String = FilePath
        Dim fs As New System.IO.FileStream(sFileName, _
            System.IO.FileMode.Create, System.IO.FileAccess.Write, System.IO.FileShare.ReadWrite)
        Dim Compresser As New ICSharpCode.SharpZipLib.BZip2.BZip2OutputStream(fs)

        ' Write in some text
        AddText(Compresser, "BLAA BLAA BLAA")
        AddText(Compresser, Environment.NewLine)

        ' Write something else
        Dim TmpBytes As List(Of Byte) = New List(Of Byte)
        TmpBytes.Add(32)
        TmpBytes.Add(13)
        Compresser.Write(TmpBytes.ToArray, 0, TmpBytes.Count)

        ' Close everything
        Compresser.Close()
        fs.Close()

    End Sub


    Private Sub AddText(ByVal fs As ICSharpCode.SharpZipLib.BZip2.BZip2OutputStream, ByVal value As String)
        Dim info As Byte() = New System.Text.UTF8Encoding(True).GetBytes(value)
        fs.Write(info, 0, info.Length)
    End Sub

The data I needed to save was sometimes more than 100 MB uncompressed, so compressing greatly reduced the time it took to save the file.

By looping through the items yourself you can easily track the saving progress. Deserialization also seems to crash if you change your application's name or the name of some variable.
 
Sorry for replying so late.

The dictionary DataContents stores mostly Strings,
but I use it more like a "treeview".
The dictionary can contain other dictionaries,
providing something like a folder structure.
Which is why it takes so long, I guess.

I guess the only choice I have is to change the way data is stored in DataContents
to speed things up. I just hoped (and still hope) for an easier way.


How long is it taking to serialize the data?
Mhh, it also takes some time but not as long as deserializing.
 