Friday, July 20, 2007
How do I convert a wchar_t to a char?
How do I convert a wchar_t to a char? How do I convert a char to a wchar_t?
It depends! Welcome to the wonderful world of string conversion.
You are probably asking this because you have a std::wstring or wchar_t* and need to pass it to a function which takes a std::string or char*. You just need to know the name of the conversion function to use, right? The problem is that the way to do the conversion varies depending on the context of your code.
Before asking how to convert a wchar_t to a char, there are other questions you should be asking first, such as "What encoding is my wchar_t using?" and, more importantly, "What encoding does my function expect that char parameter to be in?"
Distinguish Storage From Encoding
A char or a wchar_t is just storage space whose size is defined by the compiler. For example, Microsoft Visual C++ defines a char as 1 byte of storage and a wchar_t as 2 bytes. GCC on Linux defines wchar_t as 4 bytes by default.
What matters is the encoding used for the data contained within those bytes. Is it ASCII, or an 8-bit extension of it? If so, which code page is being used? Or maybe it's Unicode? If so, which Unicode encoding is being used?
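If you are not sure what your own compiler does, you can simply ask it. The sketch below is only an illustration; the helper name wchar_bits is mine, not something from a library.

```cpp
#include <climits>
#include <cstddef>

// Number of bits of storage in one wchar_t on the current compiler/platform.
// Hypothetical helper for illustration: typically 16 with Microsoft
// Visual C++ and 32 with GCC on Linux.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// sizeof(char) is 1 by definition in C++; wchar_t is whatever the
// compiler chose, so never hard-code its size.
static_assert(sizeof(char) == 1, "char is always one byte");
```

Printing wchar_bits() at startup is a cheap way to find out which of the cases above you are in. Note that it tells you the storage size only, not the encoding.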
The encoding tells you how to interpret the data, and thus how you'll need to convert it.
Encoding Within The Wide String
You need to know the encoding of the wide string, i.e. the way to interpret the data stored in each wchar_t. This is typically UTF-16 or UCS-2 when a wchar_t is 2 bytes, and UTF-32 when a wchar_t is 4 bytes.
For example, let's say you had an array of wchar_t on Windows and the first wchar_t's value in hex was 2D25. This is probably UTF-16 encoding, and represents the Georgian small letter 'hoe' (http://www.unicode.org/charts/PDF/U2D00.pdf).
But who knows? Although very unlikely, it's possible that the string was read in from a source which stuffed two single-byte characters into each wchar_t, in which case the value 2D25 represents '-' (2D) and '%' (25).
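To make the two readings of 2D25 concrete, here is a small sketch. Both function names are made up for this example, and the "high byte first" packing order is an assumption on my part; a real source could just as well pack the bytes the other way around.

```cpp
#include <string>

// Reading 1: the 16-bit value is a single UTF-16 code unit, so it is the
// code point itself (only true for values outside the surrogate range
// D800-DFFF).
constexpr char32_t as_utf16_code_point(char16_t unit) {
    return static_cast<char32_t>(unit);
}

// Reading 2: two 8-bit characters packed into one 16-bit value,
// assuming the high byte comes first.
inline std::string as_two_packed_chars(char16_t unit) {
    std::string out;
    out += static_cast<char>((unit >> 8) & 0xFF);
    out += static_cast<char>(unit & 0xFF);
    return out;
}
```

With the value from the example, as_utf16_code_point(0x2D25) gives U+2D25, while as_two_packed_chars(0x2D25) gives the two-character string "-%". Same bits, completely different text.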
The point is, in order to know what encoding your source string is in, you must understand the context of the code. How did you obtain this string? If it was from a Windows file function, then probably it is UTF-16 encoded. If it was from a 3rd-party library, then consult the 3rd party documentation just to be sure.
Encoding Within The Narrow String
Next, you need to know the encoding that the function expects the char* parameter to be in. Again, it's all about context. Consult the function documentation. Old Unix-style functions such as unlink and rmdir generally expect the char* string to be encoded in the character set of the current locale of the OS. Other functions from 3rd-party libraries might expect the char* to be a UTF-8 encoded string, etc.
_Be careful!_ It's easy to confuse US-ASCII (and the 8-bit code pages that extend it) with UTF-8 because the first 128 values (hex 00 to 7F) represent the same symbols. For example, value 6B represents 'k' in both US-ASCII and UTF-8. It's only once you get into higher values that the encodings get out of sync. This is why developers often think they used the right encoding, until their product ships internationally, and some important executives freak out because the é, ç and ä in their names are garbled.
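Here is a sketch of that trap. I am using ISO-8859-1 (Latin-1) as the example 8-bit code page, which is my assumption; your locale may use a different one. The two encoder names are invented for the illustration and only handle small code points.

```cpp
#include <string>

// Encode a code point below U+0100 as ISO-8859-1 (Latin-1): always one byte.
inline std::string encode_latin1(char32_t cp) {
    return std::string(1, static_cast<char>(cp & 0xFF));
}

// Encode a code point below U+0800 as UTF-8: one byte up to U+007F,
// two bytes above that.
inline std::string encode_utf8_small(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For 'k' (U+006B) both functions return the single byte 6B, so a test suite full of English text passes either way. For 'é' (U+00E9), Latin-1 yields the single byte E9 while UTF-8 yields the two bytes C3 A9, and a reader that guesses wrong shows garbage.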
Time To Convert!
Once you know the platform, source encoding and destination encoding, you are ready to convert. There are numerous conversion utilities on the web, so start searching! Googling "Convert UTF16 to UTF8 on Windows", for example, can yield better results than "convert wchar_t to char".
I did 5 minutes of googling just now and was able to find a few links to get you started:
Converting from UTF16 to UTF8 and vice versa on Windows:
http://www.codeproject.com/useritems/UtfConverter.asp
Converting from UTF16 to a given Windows code page:
http://msdn2.microsoft.com/en-us/library/ms776420.aspx
Converting to a multibyte string using the current locale on Unix:
http://www.scit.wlv.ac.uk/cgi-bin/mansec?3C+wcstombs
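If you would rather see what such a conversion looks like than chase links, below is a minimal portable sketch of a wide-string-to-UTF-8 conversion. It is not the code behind any of the links above. It assumes the wide string holds UTF-16 when wchar_t is 2 bytes and UTF-32 when it is 4 bytes, and it substitutes U+FFFD for anything malformed; the names append_utf8 and wide_to_utf8 are mine.

```cpp
#include <cstddef>
#include <string>

// Append one Unicode code point to a UTF-8 string (1 to 4 bytes).
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a wide string to UTF-8.
// Assumption: the input is UTF-16 (2-byte wchar_t) or UTF-32 (4-byte wchar_t).
// Surrogate pairs are combined; unpaired surrogates and values above
// U+10FFFF are replaced by U+FFFD.
inline std::string wide_to_utf8(const std::wstring& src) {
    const char32_t replacement = 0xFFFD;
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        char32_t cp = static_cast<char32_t>(src[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // Lead surrogate: combine with the trail surrogate that must follow.
            char32_t trail =
                (i + 1 < src.size()) ? static_cast<char32_t>(src[i + 1]) : 0;
            if (trail >= 0xDC00 && trail <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (trail - 0xDC00);
                ++i;  // the trail has been consumed
            } else {
                cp = replacement;
            }
        } else if ((cp >= 0xDC00 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            cp = replacement;  // stray trail surrogate or value outside Unicode
        }
        append_utf8(out, cp);
    }
    return out;
}
```

For example, a wide string holding the single value 2D25 comes out as the three bytes E2 B4 A5. In production code you would normally call the platform's own conversion function instead, but the sketch shows why the source and destination encodings have to be known before you start.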
Lossless and Lossy Conversions
_WARNING:_ If you are converting from one Unicode encoding to another (for example, UTF-16 to UTF-8), then this will be a lossless conversion, provided the input is well-formed. You can convert back and forth as many times as you need without losing any information.
If on the other hand you convert a Unicode string to a narrow string in a particular code page, this is a lossy conversion. Characters that the code page cannot represent are replaced or dropped, and you will not be able to convert back without knowledge of the code page or locale used for the initial conversion.
For example, if I send a char* ascii string over the wire to another machine, the receiving end will not be able to convert it back to a wchar_t* unicode string without knowing what locale my machine was in when I built the ascii string in the first place.
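A sketch of why the receiving end is stuck: the same byte decodes to different characters depending on the code page. I picked ISO-8859-1 and ISO-8859-5 purely as examples, the function names are invented, and the ISO-8859-5 decoder below only covers the ranges needed for the illustration.

```cpp
// Decode one byte as ISO-8859-1 (Latin-1): the byte value is the code point.
constexpr char32_t decode_latin1(unsigned char b) {
    return static_cast<char32_t>(b);
}

// Decode one byte as ISO-8859-5 (Cyrillic). Partial table for illustration:
// 0x00-0xA0 are the same as Latin-1, 0xB0-0xEF map to U+0410-U+044F.
// Every other byte is reported as U+FFFD here.
constexpr char32_t decode_iso8859_5(unsigned char b) {
    return b <= 0xA0 ? static_cast<char32_t>(b)
         : (b >= 0xB0 && b <= 0xEF) ? static_cast<char32_t>(0x0410 + (b - 0xB0))
         : static_cast<char32_t>(0xFFFD);
}
```

The byte E9 decodes to U+00E9 (é) as Latin-1 but to U+0449 (щ) as ISO-8859-5, while 6B is 'k' in both. Without being told which code page the sender used, the receiver can only guess.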
If there's one principle to remember when working with string conversion, it's that a good programmer is aware of the context in which he's working at all times. Take the extra minute to understand the source and destination encodings, the platform and the locale, and you will be rewarded with a lower bug count once your localized product hits international markets.
Labels: char, string conversion, wchar_t
Comments:
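You can use a short function to convert a wchar_t array into a char array. Keep in mind that any character outside the ASCII range (0 - 127) is replaced by "?".
#include <cstddef>
// Returns the number of chars written to dest, not counting the terminator.
size_t to_narrow(const wchar_t * src, char * dest, size_t dest_len){
  size_t i = 0; // read index into src
  size_t j = 0; // write index into dest
  if (dest_len == 0)
    return 0; // no room even for the terminator
  while (src[i] != L'\0' && j < (dest_len - 1)){
    unsigned long code = static_cast<unsigned long>(src[i]);
    if (code < 128)
      dest[j] = char(code);
    else{
      dest[j] = '?';
      if (code >= 0xD800 && code <= 0xDBFF && src[i + 1] != L'\0')
        // lead surrogate, skip the next code unit, which is the trail
        i++;
    }
    i++;
    j++;
  }
  dest[j] = '\0';
  return j;
}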