How to write a UTF8 Unicode file with Byte Order Marks in C/C++

EDN Admin · Oct 20, 2006

What is the correct way to write the UTF-8 byte order mark (BOM) to a UTF-8 file with WriteFile?
I tried this...
TCHAR* sbuffer = TEXT("Some text in a utf8 file");
char * smarker = (char *) malloc(4);
smarker[0] = 0xEF;
smarker[1] = 0xBB;
smarker[2] = 0xBF;
smarker[3] = 0x00;
WriteFile(hFile, smarker, 3, &dwBytesWritten, NULL); // write the bom
free(smarker);
WriteFile(hFile, sbuffer, (_tcslen(sbuffer) + 1) * sizeof(TCHAR), &dwBytesWritten, NULL); // write the data + null
When the file is opened in Notepad, the BOM does not display (good), but each character has a space showing after it, i.e.:
"S o m e  t e x t ...."
My original reason for doing this: if I don't write the BOM, the file can be read fine with Scripting.FileSystemObject, but when I try to read it with the .NET StreamReader, the framework tries to detect the Unicode encoding from the BOM. Without a BOM, a valid Unicode file gets read into a string buffer as a garbled "x 0 x 0 x 0" ASCII string.
My only objective here is to write a standard UTF-8 encoded Unicode file from a TCHAR string. There must be an easy way?
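
A possible explanation and a rough sketch, not a definitive answer: in a Unicode build, TCHAR is a 2-byte UTF-16 code unit, so the second WriteFile call puts UTF-16 data after a UTF-8 BOM, and the zero high bytes show up as gaps between characters. The text has to be converted to UTF-8 bytes before it is written, and the byte count should not include the terminating null. The portable C below shows that conversion by hand and writes through stdio. The helper names utf16_to_utf8 and write_utf8_with_bom, the fixed 1024-byte buffer, and the use of uint16_t in place of TCHAR are assumptions made for illustration, not part of the original code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: encode UTF-16 code units as UTF-8.
   Returns 1 on success and stores the byte count in *out_len;
   returns 0 if the output buffer is too small.
   Unpaired surrogates are replaced with U+FFFD. */
static int utf16_to_utf8(const uint16_t *in, size_t in_len,
                         unsigned char *out, size_t out_cap, size_t *out_len)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in_len &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            /* Combine a high/low surrogate pair into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* lone surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need > out_cap)
            return 0;

        if (need == 1) {
            out[n++] = (unsigned char)cp;
        } else if (need == 2) {
            out[n++] = (unsigned char)(0xC0 | (cp >> 6));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[n++] = (unsigned char)(0xE0 | (cp >> 12));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[n++] = (unsigned char)(0xF0 | (cp >> 18));
            out[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    *out_len = n;
    return 1;
}

/* Hypothetical helper: write the 3-byte UTF-8 BOM, then the text as UTF-8.
   len is the number of UTF-16 code units, excluding any terminating null.
   Returns 1 on success, 0 on failure. */
static int write_utf8_with_bom(FILE *fp, const uint16_t *text, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    unsigned char buf[1024]; /* assumed large enough for this sketch */
    size_t n = 0;

    if (!utf16_to_utf8(text, len, buf, sizeof buf, &n))
        return 0;
    if (fwrite(bom, 1, sizeof bom, fp) != sizeof bom)
        return 0;
    return fwrite(buf, 1, n, fp) == n;
}
```

On Windows the hand-written encoder should not be necessary: WideCharToMultiByte with CP_UTF8 performs the same conversion, and the resulting byte buffer and byte count can be passed to WriteFile right after the three BOM bytes.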
 
