Retriving data from an .CSV file.

mike55

Well-known member
Joined
Mar 26, 2004
Messages
726
Location
Ireland
Hi,

Have a .CSV file, and I am basically doing a select all on the file to retrieve all the data. The file has a total of twelve rows, however when I do a Rows.count on the dataset I get a total of 23. The .CSV file is accessed using Microsoft Excel.

The code that I am using to access the .CSV file is a follows:
Code:
  Read the data stored in the .CSV file into a dataset.
    Public Function readFromCSV(ByVal filetable As String, ByRef data As DataSet) As Boolean
        readFromCSV = True
        Dim ds As New DataSet
        Dim fileLocation As String
        Dim sql_select As String
        Dim obj_oledb_con As System.Data.Odbc.OdbcConnection
        Dim obj_oledb_da As System.Data.Odbc.OdbcDataAdapter
        fileLocation = getUploadDownloadDetails()     Get the location where the file is stored.
        conn_excel_str = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + fileLocation Connection string.
        data.Dispose()
        data.Clear()

        Try
            obj_oledb_con = New System.Data.Odbc.OdbcConnection(conn_excel_str)  Creat a new connection.
            obj_oledb_con.Open()    Open the connection.
            sql_select = "select * from [" + filetable + "]"
            obj_oledb_da = New System.Data.Odbc.OdbcDataAdapter(sql_select, obj_oledb_con) Execute the statement.
            obj_oledb_da.Fill(ds)   Fill the dataset with the data returned.
           obj_oledb_con.Close()    Close the connection.
           data = ds
        Catch e As Exception
            readFromCSV = False
        End Try
    End Function

Any suggestions on why such a problem is occuring of what am I doing wrong.

Mike55
 
HJB417 said:
post the content of the csv file please.

Here is the file, I have checked if the dataset has any pre-existing data data, but it doesnt so I have been reduced to using a stop key: i.e. if the first column has a -1, then the prevous row was the last row.

Mike55
 

Attachments

The output window shows my results. Take notice of the connection string.
Code:
using System;
using System.Diagnostics;
using System.Data;
using HB.Data;
using HB.Data.Helper;

namespace ConsoleApplication13
{
	class Class1
	{
		static void Main(string[] args)
		{
			Debug.WriteLine(new string(=, 20));
			Debug.WriteLine("Begin My Code");
			Debug.WriteLine(new string(=, 20));
			string tableName;
			DbConnectionInfo connInfo = HB.Utility.CsvFileHelper.CreateDbConnectionInfo(@"C:\TrialDownload2.csv", true, out tableName);
			Debug.WriteLine("Connection String: " + connInfo.ConnectionString);
			GenericDbCommandHelper cmd = connInfo.CreateCommandHelper(string.Format("SELECT * FROM [{0}]", tableName));
			DataSet ds = cmd.ExecuteQuery();
			Debug.WriteLine("DataSet.Tables.Count: " + ds.Tables.Count);
			Debug.WriteLine("DataSet.Tables[0].Rows.Count: " + ds.Tables[0].Rows.Count);
			Debug.WriteLine("DataSet.Tables[0].Columns.Count: " + ds.Tables[0].Columns.Count);
		}
	}
}

Output Window said:
DefaultDomain: Loaded c:\winnt\microsoft.net\framework\v1.1.4322\mscorlib.dll, No symbols loaded.
ConsoleApplication13: Loaded C:\Documents and Settings\hasani\My Documents\Visual Studio Projects\ConsoleApplication13\bin\Debug\ConsoleApplication13.exe, Symbols loaded.
ConsoleApplication13.exe: Loaded c:\winnt\assembly\gac\system\1.0.5000.0__b77a5c561934e089\system.dll, No symbols loaded.
ConsoleApplication13.exe: Loaded c:\documents and settings\hasani\my documents\visual studio projects\consoleapplication13\bin\debug\hb.dll, Symbols loaded.
ConsoleApplication13.exe: Loaded c:\winnt\assembly\gac\system.data\1.0.5000.0__b77a5c561934e089\system.data.dll, No symbols loaded.
ConsoleApplication13.exe: Loaded c:\winnt\assembly\gac\system.xml\1.0.5000.0__b77a5c561934e089\system.xml.dll, No symbols loaded.
====================
Begin My Code
====================
ConsoleApplication13.exe: Loaded RegexAssembly2_0, No symbols loaded.
ConsoleApplication13.exe: Loaded RegexAssembly2_0.dll, No symbols loaded.
Connection String: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;Extended Properties="text;HDR=YES;FMT=Delimited"
ExecuteQuery: SELECT * FROM [TrialDownload2.csv]
DataSet.Tables.Count: 1
DataSet.Tables[0].Rows.Count: 11
DataSet.Tables[0].Columns.Count: 24
The program [4864] ConsoleApplication13.exe has exited with code 0 (0x0).
 
Hi,

Have tried the connection string that you used, however I am getting the following error:
Code:
	Message	"ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified"	String

Here is my connection string:
Code:
"Provider={Microsoft Text Driver (*.txt; *.csv)};Data Source=C:\Suretxtlog\Templates\FFA61F0A-6.csv;Extended Properties=text;HDR=YES;FMT=Delimited"

Is there any specific import statements that I need, consider that it is a web project that I am working with?

Mike55
 
oh, try this connection string

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Suretxtlog\Templates\;Extended Properties="text;HDR=YES;FMT=Delimited"
 
Ok, have tried the connection string that you sent, I am getting the following error message: Message "Could not find installable ISAM."
 
make sure the extended properties is in quotes. If you look at my connection string -> Extended Properties="text;HDR=YES;FMT=Delimited"
 
Ok, with your help the oledb connection is up and running, there is no problem in selecting the data from the .CSV file. However, the original problem still exists, when I do a dataset.tables(0).rows.count I am told that there are a total of 24 rows even though I only have 12 rows in the .csv file, There are however 24 columns in use in the file, by any chance is the system mistaking columns for rows. If I do a dataset.tables(0).columns.count I also get 24, which is correct.

Here is the oledb code:
Code:
Public Sub fetchData(ByVal fileLocation As String, ByVal fileName As String, ByRef data As DataSet)
        Dim myConnection As System.Data.OleDb.OleDbConnection
        Dim myCommand As System.Data.OleDb.OleDbCommand
        Dim myAdapter As System.Data.OleDb.OleDbDataAdapter
        Dim myDs As New DataSet
        Dim commandstring As String
        Dim sqlSelect As String
        Try
            commandstring = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Suretxtlog\Templates\;Extended Properties="  " " text;HDR=YES;FMT=Delimited"
            commandstring = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileLocation + ";Extended Properties=""text;HDR=Yes;IMEX=1"""
            commandstring = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fileLocation + ";Extended Properties=""text;HDR=YES;FMT=Delimited"""

            myConnection = New System.Data.OleDb.OleDbConnection(commandstring)
            myConnection.Open()
            sqlSelect = "select * from [" + fileName + "]"
            myAdapter = New System.Data.OleDb.OleDbDataAdapter(sqlSelect, myConnection)
            myAdapter.Fill(myDs)
            data = myDs
        Catch ex As Exception
        Finally
            myConnection.Close()
        End Try
    End Sub
 
Ok, I have solved the problem with the number of rows, apparently, when I get the row count, I am getting double the exact number of row, so I simple divide by 2. The other problem relates to the first column of the .CSV file.

Since the file is .csv, then the datatype is of type text, am I correct is assuming this?
The first column in my .CSV file is called Custom_ID, if I put a integer into that column, then that value is picked up without any problem. However, if I put text into that column i.e. abc, I cannot seem to pick-up the value? any suggestions.

Mike55.
 
Ok, i think I know what my problem is, unfortunately I havent a bulls clue on how to solve it. I select all my data from the .csv file directly into a dataset via a oledb adapter. The problem arises when you have data of two different type in a particular column: an int and a string. From what I have figured out, the minority data type in a particular column is not picked up, now whether this problem is due to my .CSV file, my select statement, my adapter, or my dataset, I dont know but I have a feeling it may be the dataset. Any suggests or has anyone ever come across such a problem before.
 
Ive had that problem.

I managed to fix it by manually editing the .CSV file, and enclosing the integer values with single quotes, in the affected column. This forces the column to be a string column, and the OLE adapter therefore recognises all the appropriate values.

As a side issue - the problem was actually created by opening the CSV file in Excel and saving it again - this stripped the quotes from the numbers in the first place... bah!

B.
 
penfold69 said:
Ive had that problem.

I managed to fix it by manually editing the .CSV file, and enclosing the integer values with single quotes, in the affected column. This forces the column to be a string column, and the OLE adapter therefore recognises all the appropriate values.

As a side issue - the problem was actually created by opening the CSV file in Excel and saving it again - this stripped the quotes from the numbers in the first place... bah!

B.

Yea, I managed to solve the problem by using a StreamReader. I then read each line in the .CSV file one at a time validating each line as I go. Might be a bit slower and a lot more code, but at least I am guaranteed that it will work.

Here is the code to read each line at a time:
Code:
dim file as system.io.file
dim reader as system.io.streamreader
dim line as string
reader = file.opentext("[I]particular file[/I]")
while reader.peek <> -1
line = reader.readline
end while

Mike55
 
"And they said that working with .csv files would be simple"

Ok, firstly I tried to select all the data in the .CSV file into a dataset, however MS Excel started to screw around with the data and so I was missing a lot of data. I also found that I had to put in a stop value inorder to determine what row was my last row.

Next I changed my code so that I read in the .CSV file one line at a time, that solved the problem of loosing data. The problem I still have is related to the stop key. Ok, I am looping through the .CSV file one row at a time. I then pass the row select into a string variable which is then passed to a method that counts the number of commas, where each comma indicates the end of a cell in the row. The problem for the last line is when I pass it in, it comes in correctly, however as I am looping through the line the column heads from the .CSV file append themselves to the line.

Any suggestions on why this is happening or is it bloody MS Excel to blame again??

Mike55
 
Back
Top