








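  In addition, Windows NT was developed using Unicode, and the whole system is based on Unicode. If you call an API function and pass it an ANSI string (ANSI here means the ASCII character set and the compatible character sets derived from it, such as GB2312), the system must first convert the string to Unicode and then pass the Unicode string to the operating system. If the function is expected to return an ANSI string, the system first converts the Unicode string to an ANSI string and then returns the result to your application. These string conversions cost time and memory. If you develop your application in Unicode, it can run more efficiently.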


Character     A      N      和
ANSI code     41H    4eH    cdbaH
Unicode code  0041H  004eH  548cH


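  Wide-character support is actually part of the ANSI C standard, so that a single character can be represented by more than one byte. Wide characters and Unicode are not exactly the same thing; Unicode is just one encoding that wide characters can use.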



typedef unsigned short wchar_t;




const wchar_t *str1=L"Hello";




size_t __cdecl wcslen(const wchar_t*);

0x0048 0x0065 0x006c 0x006c 0x006f

48 00 65 00 6c 00 6c 00 6f 00
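
The two dumps above appear to show the same string twice: first as 16-bit code units, then as the bytes a little-endian machine stores in memory. A minimal sketch of that mapping in portable C++ follows. It uses `char16_t` instead of `wchar_t`, because `wchar_t` is 16 bits wide on Windows but not on every platform, and the helper name `toUtf16LeBytes` is invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of any library): lay out 16-bit code units
// as little-endian bytes, the order an x86 machine stores them in memory.
std::vector<std::uint8_t> toUtf16LeBytes(const std::u16string& text)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));        // low byte first
        bytes.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF)); // then high byte
    }
    return bytes;
}
```

Calling `toUtf16LeBytes(u"Hello")` yields the ten bytes listed above; the terminating null character is not included.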





#ifdef  _UNICODE
typedef wchar_t     TCHAR;
#define __T(x)      L##x
#else
typedef char            TCHAR;
#define __T(x)      x
#endif
#define _T(x)       __T(x)

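  As you can see, these macros expand to ANSI or Unicode characters depending on whether "_UNICODE" is defined. The macros defined in the tchar.h header fall into two categories: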


        _UNICODE not defined (ANSI characters)    _UNICODE defined (Unicode characters)
TCHAR   char                                      wchar_t
_T(x)   x                                         L##x

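"##" is preprocessor syntax from the ANSI C standard, known as the token-pasting operator; here it pastes the preceding L onto the macro argument. In other words, when _UNICODE is defined, _T("Hello") expands to L"Hello".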



         _UNICODE not defined (ANSI characters)    _UNICODE defined (Unicode characters)
_tcschr  strchr  wcschr
_tcscmp  strcmp  wcscmp
_tcslen  strlen  wcslen
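
The two tables can be tied together with a small portable sketch. It does not use tchar.h itself: `DEMO_UNICODE`, `demo_tchar`, `DEMO_T` and `demo_tcslen` are hypothetical stand-ins for `_UNICODE`, `TCHAR`, `_T` and `_tcslen`, written only to show how one source line can compile against either character type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the tchar.h names; define DEMO_UNICODE on the
// compiler command line to switch every mapping to the wide version.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_T_IMPL(x) L##x          // paste the L prefix onto the literal
#define demo_tcslen    std::wcslen
#else
typedef char demo_tchar;
#define DEMO_T_IMPL(x) x             // leave the literal unchanged
#define demo_tcslen    std::strlen
#endif
#define DEMO_T(x) DEMO_T_IMPL(x)

// The same source line works for both character types.
std::size_t greetingLength()
{
    const demo_tchar* greeting = DEMO_T("Hello");
    return demo_tcslen(greeting);
}
```

`greetingLength()` returns 5 in both configurations; only the element type of `greeting` and the function that `demo_tcslen` names change.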

4. Unicode Programming with the Win32 API

The Win32 API defines some character data types of its own. These types are defined in the winnt.h header file. For example:

typedef char CHAR;
typedef unsigned short WCHAR;  // wc,   16-bit UNICODE character
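In winnt.h, the Win32 API also defines macros for characters and string constants so that code can be written generically for ANSI and Unicode. Again, only a few of the most common are listed:

#ifdef  UNICODE
typedef WCHAR TCHAR, *PTCHAR;
typedef LPWSTR LPTCH, PTCH;
typedef LPWSTR PTSTR, LPTSTR;
typedef LPCWSTR LPCTSTR;
#define __TEXT(quote) L##quote     // r_winnt
#else       /* UNICODE */               // r_winnt
typedef char TCHAR, *PTCHAR;
typedef LPSTR LPTCH, PTCH;
typedef LPSTR PTSTR, LPTSTR;
typedef LPCSTR LPCTSTR;
#define __TEXT(quote) quote        // r_winnt
#endif     /* UNICODE */                // r_winnt

As the header excerpt shows, winnt.h compiles conditionally depending on whether UNICODE (no underscore) is defined.

  The Win32 API also defines its own set of string functions, which expand to ANSI or Unicode string functions depending on whether "UNICODE" is defined; lstrlen is one example. The API string functions and the C/C++ runtime string functions do the same job, so where you have the choice, prefer the runtime functions; there is little need to spend effort learning the API variants as well.

  You may never have noticed that the Win32 API really comes in two versions: one accepts MBCS strings and the other accepts Unicode strings. For example, there is no API function actually named SetWindowText(); instead there are SetWindowTextA() and SetWindowTextW(). The A suffix marks the MBCS version and the W suffix marks the Unicode version. These functions are declared in winuser.h; the SetWindowText() part of that header looks like this: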
#ifdef UNICODE
#define SetWindowText  SetWindowTextW
#else
#define SetWindowText  SetWindowTextA
#endif   // !UNICODE

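  Attentive readers may have noticed the difference between UNICODE and _UNICODE. The former has no underscore and is used by the Windows headers; the latter has a leading underscore and is used by the C runtime headers. In other words, macros from the C runtime headers expand to their Unicode or ANSI forms depending on whether _UNICODE (with underscore) is defined, while macros from the Windows headers do so depending on whether UNICODE (no underscore) is defined.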
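
The A/W pairing, and the conversion cost of calling the narrow version, can be imitated in portable C++. Everything named here (`DemoSetTitleA`, `DemoSetTitleW`, `DemoSetTitle`, `DEMO_UNICODE`, `g_title`) is hypothetical and only mimics the SetWindowTextA/SetWindowTextW pattern. The widening loop assumes plain 7-bit ASCII input; the real system performs a code-page-aware conversion instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the text a window keeps internally (always wide).
static std::wstring g_title;

// "W" version: stores the wide string directly, no conversion needed.
void DemoSetTitleW(const wchar_t* text)
{
    g_title = text;
}

// "A" version: widens the input first, then forwards to the W version.
// Assumption: the input is 7-bit ASCII, so each byte maps to one wide char.
void DemoSetTitleA(const char* text)
{
    std::wstring wide;
    for (const char* p = text; *p != '\0'; ++p)
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    DemoSetTitleW(wide.c_str());
}

// The generic name is only a macro that picks one of the two real functions.
#ifdef DEMO_UNICODE
#define DemoSetTitle DemoSetTitleW
#else
#define DemoSetTitle DemoSetTitleA
#endif
```

Without `DEMO_UNICODE`, a call such as `DemoSetTitle("Hi")` goes through the extra copy in `DemoSetTitleA`; with it defined, `DemoSetTitle(L"Hi")` reaches the wide version directly.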


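  VC++ 6.0 supports Unicode programming, but ANSI is the default, so developers only need to change their coding habits slightly to write applications that support Unicode.
  Unicode programming with VC++ 6.0 mainly involves the following tasks: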





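  When UNICODE and _UNICODE are not defined, all functions and types default to their ANSI versions; once UNICODE and _UNICODE are defined, all MFC classes and Windows APIs switch to their wide-character versions.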


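  Because MFC applications have a dedicated entry point for Unicode, the entry point must be set; otherwise link errors will occur.
  To set the entry point, open the [Project] -> [Settings…] dialog and, on the Link page under the Output category, enter wWinMainCRTStartup in the Entry Point field.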



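  Microsoft provides some generic data types and macros that work with both ANSI and Unicode. The most commonly used are _T, TCHAR, LPTSTR and LPCTSTR.
  Incidentally, LPCTSTR and const TCHAR* are exactly equivalent. The L stands for a long pointer, a holdover kept for compatibility with 16-bit systems such as Windows 3.1; in Win32 and other 32-bit operating systems, long pointers, near pointers and the far modifier exist only for compatibility and have no practical meaning. P (pointer) indicates a pointer; C (const) indicates a constant; T (as in the _T macro) indicates ANSI/Unicode compatibility; STR (string) indicates that the variable is a string. Put together, LPCTSTR is a pointer to a constant string whose character type changes with the macro definitions. For example: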

const TCHAR* szText=_T("Hello!");
TCHAR szText[]=_T("I Love You");
LPCTSTR lpszText=_T("大家好!");   // "Hello, everyone!"





Generic ANSI/Unicode functions beginning with _tcs, such as _tcscpy (C runtime library);
Generic ANSI/Unicode functions beginning with lstr, such as lstrcpy (Windows functions);



void CUnicodeDlg::OnButton1()
{
    const TCHAR* str1=_T("ANSI和UNICODE编码试验");   // "ANSI and UNICODE encoding test"
    m_disp=str1;
    UpdateData(FALSE);
}









Unicode, MBCS and Generic text mappings

By Chris Maunder


In order to allow your programs to be used in international markets it is worth making your application Unicode or MBCS aware. The Unicode character set, as used by Windows, is a "wide character" set (2 bytes per character) designed to cover the characters of every language, including technical symbols and special publishing characters. A multibyte character set (MBCS) uses either 1 or 2 bytes per character and is used for character sets that contain large numbers of different characters (e.g. Asian language character sets).

Which character set you use depends on the language and the operating system. Unicode requires more space than MBCS since each character is 2 bytes. It is also faster than MBCS and is used by Windows NT as standard, so non-Unicode strings passed to and from the operating system must be translated, incurring overhead. However, Unicode is not supported on Win95 and so MBCS may be a better choice in this situation. Note that if you wish to develop applications in the Windows CE environment then all applications must be compiled in Unicode.

Using MBCS or Unicode

The best way to use Unicode or MBCS - or indeed even ASCII - in your programs is to use the generic text mapping macros provided by Visual C++. That way you can simply use a single define to swap between Unicode, MBCS and ASCII without having to do any recoding.

To use MBCS or Unicode you need only define either _MBCS or _UNICODE in your project. For Unicode you will also need to specify the entry point symbol in your Project settings as wWinMainCRTStartup. Please note that if both _MBCS and _UNICODE are defined then the result will be unpredictable.

Generic Text mappings and portable functions

The generic text mappings replace the standard char or LPSTR types with generic TCHAR or LPTSTR macros. These macros will map to different types and functions depending on whether you have compiled with Unicode or MBCS (or neither) defined. The simplest way to use the TCHAR type is to use the CString class - it is extremely flexible and does most of the work for you.

In conjunction with the generic character type, there is a set of generic string manipulation functions prefixed by _tcs. For instance, instead of using the strrev function in your code, you should use the _tcsrev function which will map to the correct function depending on which character set you have compiled for. The table below demonstrates:

#define              Compiled Version            Example
_UNICODE             Unicode (wide-character)    _tcsrev maps to _wcsrev
_MBCS                Multibyte-character         _tcsrev maps to _mbsrev
None (the default)   SBCS (ASCII)                _tcsrev maps to strrev

Each str* function has a corresponding _tcs* function that should be used instead. See the TCHAR.H file for all the mappings and macros that are available. Just look up the online help for the string function in question in order to find the equivalent portable function.

Note: Do not use the str* family of functions with Unicode strings, since Unicode strings are likely to contain embedded null bytes.
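
The effect of those embedded null bytes can be seen without any Windows headers. The sketch below hard-codes the little-endian UTF-16 bytes of "Hello" and compares what `strlen` reports with a count of 16-bit units. The array and function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Bytes of "Hello" in little-endian UTF-16, followed by a 2-byte terminator.
static const char kHelloUtf16Le[] = {
    0x48, 0x00, 0x65, 0x00, 0x6c, 0x00, 0x6c, 0x00, 0x6f, 0x00, 0x00, 0x00
};

// strlen stops at the first zero byte, which is the high byte of 'H'.
std::size_t narrowLengthOfWideHello()
{
    return std::strlen(kHelloUtf16Le);
}

// Counting 16-bit units until a zero unit gives the real character count.
std::size_t wideLengthOfWideHello()
{
    std::size_t count = 0;
    while (kHelloUtf16Le[2 * count] != 0 || kHelloUtf16Le[2 * count + 1] != 0)
        ++count;
    return count;
}
```

Here `narrowLengthOfWideHello()` returns 1 while `wideLengthOfWideHello()` returns 5, which is why a str* function silently truncates a Unicode string.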

The next important point is that each literal string should be enclosed by the TEXT() (or _T()) macro. This macro prepends a "L" in front of literal strings if the project is being compiled in Unicode, or does nothing if MBCS or ASCII is being used. For instance, the string _T("Hello") will be interpreted as "Hello" in MBCS or ASCII, and L"Hello" in Unicode. If you are working in Unicode and do not use the _T() macro, you may get compiler warnings.

Note that you can use ASCII and Unicode within the same program, but not within the same string.

All MFC functions except for database class member functions are Unicode aware. This is because many database drivers themselves do not handle Unicode, and so there was no point in writing Unicode aware MFC classes to wrap these drivers.

Converting between Generic types and ASCII

ATL provides a set of very useful macros for converting between different character formats. The basic form of these macros is X2Y(), where X is the source format and Y is the destination format. Possible conversion formats are shown in the following table.

String Type         Abbreviation
ASCII (LPSTR)       A
Wide (LPWSTR)       W
OLE (LPOLESTR)      OLE
Generic (LPTSTR)    T
Const               C
Thus, A2W converts an LPSTR to an LPWSTR, OLE2T converts an LPOLESTR to an LPTSTR, and so on.

There are also const forms (denoted by a C) that convert to a const string. For instance, A2CT converts from LPSTR to LPCTSTR.

When using the string conversion macros you need to include the USES_CONVERSION macro at the beginning of your function:

void foo(LPSTR lpsz)
{
    USES_CONVERSION;
    ...
    LPTSTR szGeneric = A2T(lpsz);
    // Do something with szGeneric
    ...
}

Two caveats on using the conversion macros:

  1. Never use the conversion macros inside a tight loop. This will cause a lot of memory to be allocated each time the conversion is performed, and will result in slow code. Better to perform the conversion outside the loop and pass the converted value into the loop.
  2. Never return the result of the macros directly from a function, unless the return value implies making a copy of the data before returning. For instance, if you have a function that returns an LPTSTR, then do not do the following:
    LPTSTR BadReturn(LPSTR lpsz)
    {
        USES_CONVERSION;
        // do something
        return A2T(lpsz);
    }

    Instead, you should return the value as a CString, which would imply a copy of the string would be made before the function returns:

    CString GoodReturn(LPSTR lpsz)
    {
        USES_CONVERSION;
        // do something
        return A2T(lpsz);
    }

Tips and Traps

The TRACE statement

The TRACE macros have a few cousins - namely the TRACE0, TRACE1, TRACE2 and TRACE3 macros. These macros allow you to specify a format string (as in the normal TRACE macro), and either 0,1,2 or 3 parameters, without the need to enclose your literal format string in the _T() macro. For instance,

TRACE(_T("This is trace statement number %d\n"), 1);

can be written

TRACE1("This is trace statement number %d\n", 1);

Viewing Unicode strings in the debugger

If you are using Unicode in your application and wish to view Unicode strings in the debugger, then you will need to go to Tools | Options | Debug and click on "Display Unicode Strings".

The Length of strings

Be careful when performing operations that depend on the size or length of a string. For instance, CString::GetLength returns the number of characters in a string, NOT the size in bytes. If you were to write the string to a CArchive object, then you would need to multiply the length of the string by the size of each character in the string to get the number of bytes to write:

CString str = _T("Hello, World");
archive.Write( str, str.GetLength( ) * sizeof( TCHAR ) ); 
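
The same length-versus-size arithmetic can be checked outside MFC. This is a sketch using standard C++ strings rather than CString, and `byteSize` is a hypothetical helper name. `std::u16string` stands in for a Unicode-build string because its elements are 2 bytes wide on every platform.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: bytes occupied by the characters of a string
// (terminator not counted), for any character type.
template <typename CharT>
std::size_t byteSize(const std::basic_string<CharT>& text)
{
    return text.size() * sizeof(CharT);
}
```

For the 12-character string "Hello, World", `byteSize(std::string("Hello, World"))` is 12 while `byteSize(std::u16string(u"Hello, World"))` is 24, so passing the character count where a byte count is expected writes only half of a wide string.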

Reading and Writing ASCII text files

If you are using Unicode or MBCS then you need to be careful when writing ASCII files. The safest and easiest way to write text files is to use the CStdioFile class provided with MFC. Just use the CString class and the ReadString and WriteString member functions and nothing should go wrong. However, if you need to use the CFile class and its associated Read and Write functions, then if you use the following code:

CFile file(...);
CString str = _T("This is some text");
file.Write( str, (str.GetLength()+1) * sizeof( TCHAR ) ); 

instead of

CStdioFile file(...);
CString str = _T("This is some text");
file.WriteString( str );
then the results will be significantly different. The two lines of text below are from a file created using the first and second code snippets respectively:

[Image not available: the two lines of text, as viewed using WordPad]

Not all structures use the generic text mappings

For instance, the CHARFORMAT structure, if the RichEditControl version is less than 2.0, uses a char[] for the szFaceName field, instead of a TCHAR as would be expected. You must be careful not to blindly change "..." to _T("...") without first checking. In this case, you would probably need to convert from TCHAR to char before copying any data to the szFaceName field.

Copying text to the Clipboard

This is one area where you may need to use ASCII and Unicode in the same program, since the CF_TEXT format for the clipboard uses ASCII only. NT systems have the option of the CF_UNICODETEXT if you wish to use Unicode on the clipboard.

Installing the Unicode MFC libraries

The Unicode versions of the MFC libraries are not copied to your hard drive unless you select them during a Custom installation. They are not copied during other types of installation. If you attempt to build or run an MFC Unicode application without the MFC Unicode files, you may get errors.

(From the online docs) To copy the files to your hard drive, rerun Setup, choose Custom installation, clear all other components except "Microsoft Foundation Class Libraries," click the Details button, and select both "Static Library for Unicode" and "Shared Library for Unicode."


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

