Garbled Unicode text using wcin standard wide-character console input stream

  • Thread starter: thoatson

I have a small test program that takes as input a word to be looked up in a lexicon (translation dictionary), as well as the lexicon to use. The word (i.e. lemma) is in non-Roman script, so I have used Unicode. The problem is that the lemma doesn't seem to be handled correctly.

Here's the code:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    wstring lexicon;
    wstring lemma;

    wcout << L"Lexicon to search (twot, bdb, louwnida, cgednt): ";
    wcin >> lexicon;
    wcout << L"Lemma to find: ";
    wcin >> lemma;
    wcout << L"Displaying information for " << lemma << L"\n";
}


Under Project->Properties, I have:

...
Character Set - Use Unicode Character Set
Preprocessor Definitions - WIN32;_DEBUG;_CONSOLE;
Treat WChar_t as Built in Type - Yes (/Zc:wchar_t)

However, when the code runs, I get the following output:

Lexicon to search (twot, bdb, louwnida, cgednt): cgednt
Lemma to find: οὐρανός
Displaying information for ???α???

In the debugger's Watch window, the contents of lemma look the same, so the string is already garbled before it is printed. Any idea why the word is getting garbled?
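
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the console may be handing wcin bytes in its legacy code page, with every character that code page cannot represent replaced by ? before the text is widened. The pattern ???α??? fits code page 437, which contains α but none of ο, ὐ, ρ, ν, ό or ς. The code below assumes that code page. The names simulate_lossy_console and enable_utf16_console are hypothetical helpers, and the set lists only the Greek letters of code page 437, not a full mapping table. The second helper shows a commonly used remedy built on the Microsoft CRT calls _setmode and _fileno with the _O_U16TEXT mode; on other platforms it does nothing.

```cpp
#include <cstdio>
#include <set>
#include <string>
#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode
#endif

// Assumption: the console uses legacy code page 437. These are the only
// Greek letters that code page contains (illustrative, not a full table).
static const std::set<wchar_t> kCp437Greek = {
    L'\u0393', L'\u0398', L'\u03A3', L'\u03A6', L'\u03A9',  // Γ Θ Σ Φ Ω
    L'\u03B1', L'\u03B4', L'\u03B5', L'\u03C0', L'\u03C3',  // α δ ε π σ
    L'\u03C4', L'\u03C6'                                    // τ φ
};

// Hypothetical helper: mimic a lossy trip through such a code page.
// ASCII and the listed Greek letters survive; everything else becomes '?'.
std::wstring simulate_lossy_console(const std::wstring& text)
{
    std::wstring out;
    for (wchar_t ch : text) {
        const bool representable = ch < 0x80 || kCp437Greek.count(ch) > 0;
        out += representable ? ch : L'?';
    }
    return out;
}

// Hypothetical helper: switch stdin and stdout to UTF-16 text mode so that
// wcin and wcout exchange wide characters with the console directly.
// Returns false if a mode switch fails or the platform has no such mode.
bool enable_utf16_console()
{
#ifdef _WIN32
    return _setmode(_fileno(stdin), _O_U16TEXT) != -1 &&
           _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    return false;  // nothing to switch outside the Microsoft CRT
#endif
}
```

If this is the cause, calling enable_utf16_console() at the start of main, before the first use of wcin or wcout, should let the lemma arrive intact. Two caveats: once a stream is in _O_U16TEXT mode, only wide-character I/O should be used on it, and the console font still needs glyphs for polytonic Greek for the output to display correctly.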
