Garbled Unicode text using wcin standard wide-character console input stream

thoatson · Jun 30, 2018

I have a small test program that prompts for a lexicon (translation dictionary) and for a word (lemma) to look up in it. The lemma is in a non-Roman script, so I am using wide-character (Unicode) strings and streams. The problem is that the lemma comes back garbled.

Here's the code:

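#include <iostream>
#include <string>
using namespace std;

int main() {
    wstring lexicon;
    wstring lemma;

    wcout << L"Lexicon to search (twot, bdb, louwnida, cgednt): ";
    wcin >> lexicon;
    wcout << L"Lemma to find: ";
    wcin >> lemma;
    // The lemma is echoed back here; this is the line that prints garbled text.
    wcout << L"Displaying information for " + lemma + L"\n";
}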

Under Project->Properties, I have:

...
Character Set - Use Unicode Character Set
Preprocessor Definitions - WIN32;_DEBUG;_CONSOLE;
Treat WChar_t as Built in Type - Yes (/Zc:wchar_t)

However, when the code runs, I get the following output:

Lexicon to search (twot, bdb, louwnida, cgednt): cgednt
Lemma to find: οὐρανός
Displaying information for ???α???
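One way to narrow down where the corruption happens is to dump the numeric value of each wchar_t in the string right after it is read. The sketch below uses a hypothetical helper, code_units (not part of the original program), that formats each unit as U+XXXX. If the dump for the lemma already shows U+003F (a question mark) where Greek letters were typed, the damage most likely happened on input through wcin rather than on output.

```cpp
#include <cstdio>
#include <string>

// Hypothetical diagnostic helper: render every wchar_t unit of `text`
// as "U+XXXX", separated by single spaces.
std::string code_units(const std::wstring& text) {
    std::string out;
    char buf[24];
    for (wchar_t ch : text) {
        // Widen to unsigned long so the same format works for 16-bit
        // and 32-bit wchar_t.
        std::snprintf(buf, sizeof buf, "U+%04lX",
                      static_cast<unsigned long>(ch));
        if (!out.empty()) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

The result is a plain narrow string, so it can be inspected in the debugger or written to a log file without going through the wide console streams that are under suspicion.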

In the debugger's Watch window, the string looks the same (garbled), so the problem does not appear to be limited to how the console displays it. Any idea why the word is getting garbled?
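A commonly suggested approach for Visual C++ console programs is to switch the C runtime's standard streams into UTF-16 text mode with _setmode before any wide I/O takes place. The Character Set project setting defines UNICODE and _UNICODE, but it does not appear to change how wcin and wcout translate console text. By default that translation goes through a narrow code page, which would be consistent with most of the Greek letters turning into '?' while α survives. The sketch below is a minimal illustration under those assumptions: enable_wide_console is a hypothetical helper name, and the non-Windows branch is a placeholder that does nothing.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode
#endif

// Hypothetical helper (not from the original program): put stdin and
// stdout into UTF-16 text mode so wcin/wcout exchange wide characters
// with the Windows console directly. Returns false if a switch failed.
bool enable_wide_console() {
#ifdef _WIN32
    const int in_mode  = _setmode(_fileno(stdin),  _O_U16TEXT);
    const int out_mode = _setmode(_fileno(stdout), _O_U16TEXT);
    return in_mode != -1 && out_mode != -1;  // _setmode returns -1 on error
#else
    // Other platforms are out of scope for this sketch.
    return true;
#endif
}
```

The helper would be called once at the top of main, before the first wcout or wcin call. After the switch, only wide functions (wcout, wprintf and so on) should be used on those streams, since narrow output to a stream in Unicode mode is not supported and can trigger an assertion. The console font also needs glyphs for polytonic Greek for characters such as ὐ to display correctly.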
