String Parsing using known delimiters? [C++]

Shaitan00

The age-old issue of string parsing comes up again...
I have a text file that contains lines that are SUPPOSED to follow a set format, specifically:
string, string, long string int string double int

The delimiters are therefore:
Comma (,) for the first two fields
Spaces for all other fields

Strings like this would be valid:
Jon, Jack, 100 CPN 5 KTE 1.00 10
Jon,   Jack,   100   CPN 5 KTE 1.00 10 // notice the extra spaces

Whereas lines like these would be considered invalid:
Jon Jack 100 CPN 5 KTE 1.00 10 // missing the commas
Jon, Jack, 100 CPN 5 KTE 1.00 // missing the last field "10"
Jon, Jack, 100CPN 5 KTE 1.00 10 // missing space between "100" and "CPN"

The goal is to EXTRACT each field and store it, and if possible to detect when a line is INVALID (does not follow the format).
I have a class with the following data members:
Code:
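#include <string>
using std::string;

class Record	// renamed from "A": a data member may not share its class's name
{
private:
	// Record
	string A;
	string B;
	long C;
	string D;
	string E;
	string F;
	double G;
	int H;

public:
	Record(const string& sLine);	// constructor
};

Record::Record(const string& sLine)
{
	// somehow parse the string here and determine if it is valid //
}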


So, how can I parse the string (sLine) and extract each piece into its component (A, B, C, D, E, F, G, H)?
I was thinking of using the old method of simply doing substring searches, but I find that very error-prone and long-winded. Is there a better way to accomplish this?

Anything anyone would recommend?
Any help would be much appreciated...
Thanks,
 
In this sort of case I tend to scan the string char by char, looking for delimiters and extracting substrings. If anything unexpected turns up (extra delimiters, invalid number format, early string termination), I would consider the input invalid.
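A minimal sketch of that char-by-char scan, using only the standard library (no Boost). The names ParsedRecord, scanTokens and parseRecord are made up for illustration and are not from the original post. It assumes the fifth field is an int, as the stated line format says, and that a comma must follow the first two fields directly, with no space before it.

```cpp
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record type; field names mirror the class in the question.
struct ParsedRecord {
    std::string a, b;
    long c;
    std::string d;
    int e;          // assumption: int, per the stated line format
    std::string f;
    double g;
    int h;
};

static bool isSpace(char ch) {
    return std::isspace(static_cast<unsigned char>(ch)) != 0;
}

// Walk the line one character at a time and collect the raw tokens.
// Tokens 0 and 1 must be followed directly by a comma; a comma anywhere
// else, or a token count other than eight, makes the line invalid.
static bool scanTokens(const std::string& line, std::vector<std::string>& out) {
    std::size_t i = 0;
    const std::size_t n = line.size();
    for (;;) {
        while (i < n && isSpace(line[i])) ++i;      // skip runs of spaces
        if (i == n) break;                          // end of line
        std::string tok;
        while (i < n && line[i] != ',' && !isSpace(line[i])) tok += line[i++];
        const bool atComma = (i < n && line[i] == ',');
        const bool wantComma = (out.size() < 2);
        if (tok.empty() || atComma != wantComma) return false;
        if (atComma) ++i;                           // consume the comma
        out.push_back(tok);
    }
    return out.size() == 8;
}

// Strict numeric conversions: the whole token must be consumed.
static bool toLong(const std::string& s, long& v) {
    char* end = 0;
    errno = 0;
    v = std::strtol(s.c_str(), &end, 10);
    return errno == 0 && end != s.c_str() && *end == '\0';
}

static bool toInt(const std::string& s, int& v) {
    long tmp = 0;
    if (!toLong(s, tmp) || tmp < INT_MIN || tmp > INT_MAX) return false;
    v = static_cast<int>(tmp);
    return true;
}

static bool toDouble(const std::string& s, double& v) {
    char* end = 0;
    errno = 0;
    v = std::strtod(s.c_str(), &end);
    return errno == 0 && end != s.c_str() && *end == '\0';
}

// Returns true and fills `r` only if the whole line follows the format.
bool parseRecord(const std::string& line, ParsedRecord& r) {
    std::vector<std::string> t;
    if (!scanTokens(line, t)) return false;
    ParsedRecord tmp;
    if (!toLong(t[2], tmp.c) || !toInt(t[4], tmp.e) ||
        !toDouble(t[6], tmp.g) || !toInt(t[7], tmp.h)) return false;
    tmp.a = t[0]; tmp.b = t[1]; tmp.d = t[3]; tmp.f = t[5];
    r = tmp;
    return true;
}
```

A constructor like the one in the question could call parseRecord(sLine, rec) and keep a validity flag or throw when it returns false. Note that strtod is lenient about what counts as a number (it accepts exponents, for example), so tighten toDouble if the file format is stricter than that.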

Regular expressions might also be an option.
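For comparison, a rough sketch of the regex route. It assumes a compiler that ships the standard <regex> header (C++11 or later), which is not what the original poster has available, so treat it as an illustration rather than a drop-in answer. The function name matchRecord and the exact pattern are my own choices, not from the thread.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns true and fills `fields` with the eight captured pieces (as text)
// if the whole line matches the format; returns false otherwise.
bool matchRecord(const std::string& line, std::vector<std::string>& fields) {
    // Adjacent raw string literals are concatenated into one pattern.
    static const std::regex pattern(
        R"(\s*([^,\s]+),\s*([^,\s]+),)"                   // A, B,
        R"(\s*([+-]?\d+)\s+([^,\s]+))"                    // C D
        R"(\s+([+-]?\d+)\s+([^,\s]+))"                    // E F
        R"(\s+([+-]?\d+(?:\.\d+)?)\s+([+-]?\d+)\s*)");    // G H

    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false;

    fields.clear();
    for (std::size_t i = 1; i < m.size(); ++i) {   // m[0] is the whole match
        fields.push_back(m[i].str());
    }
    return true;
}
```

The captures come back as strings, so the numeric fields still need converting (strtol, strtod or similar) before they are stored.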
 
marble_eater: RegEx would rock... but I can't use Boost or any third-party libraries; it must be something included with VS 2008. Does VS 2008 have a built-in regex?
 
As far as I know, this would only be the case if your app is managed (DotNet).
 
Well, I am using C++ in VS 2008. I assume this is automatically DotNet, no?
But when I try to #include <regex>, it doesn't find it...

Any clues?
 
Just because you are using VS 2008 doesn't mean that the app is DotNet. If you are writing a "C++/CLI" app, then it will be DotNet. If you aren't sure, a simple way to check is to see whether you can access anything in the System namespace. I don't have much experience with C++/CLI, but from what I gather, assemblies are referenced via a #using directive rather than an #include.
Code:
#include <iostream>
#using <mscorlib.dll>
#using <System.dll>	// Regex lives in System.dll; there is no System.Text.dll

// ...

System::Text::RegularExpressions::Regex ^x =
    gcnew System::Text::RegularExpressions::Regex("pattern");
The above should compile only if your app is managed (I haven't tested it either way).
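Another way to check, sketched here without touching any .NET types: Visual C++ predefines the macro __cplusplus_cli only when the /clr option is on, so a plain preprocessor test can report which kind of build you have. The function name buildKind is made up for this example.

```cpp
#include <string>

// Reports how this translation unit was compiled.
// __cplusplus_cli is predefined by Visual C++ only when /clr is in effect,
// so a native build (or any other compiler) takes the #else branch.
std::string buildKind() {
#ifdef __cplusplus_cli
    return "managed";
#else
    return "native";
#endif
}
```

Unlike the Regex snippet, this compiles in both native and managed projects, so it answers the question without producing a build error.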
 
Hmm, guess it isn't a DotNet application then...
When I tested your code I get the following error:
Error 1 fatal error C1190: managed targeted code requires a /clr option
 