UnicodeDecodeText Function  
 
INT WINAPI UnicodeDecodeText(
  UINT nCodePage,  
  LPCSTR lpUtf8Text,  
  INT nLength,  
  LPTSTR lpString,  
  INT nMaxLength  
);

The UnicodeDecodeText function decodes UTF-8 encoded text and returns the contents in a string buffer.

Parameters

nCodePage
A value that specifies the code page used when decoding the text. This parameter can be set to the value of any code page that is available in the operating system. It is recommended that you use CP_ACP, which specifies the system default ANSI code page. This parameter is only used when decoding UTF-8 encoded text to a multi-byte string.
lpUtf8Text
A null-terminated string that contains the UTF-8 encoded text to be decoded. This parameter must always specify a pointer to an 8-bit character string, regardless of whether the project is configured to use the Unicode or Multi-Byte character set. This parameter cannot be NULL.
nLength
The number of bytes in the UTF-8 encoded string to be decoded. If this parameter is -1, the length of the UTF-8 encoded text is determined by counting the number of bytes up to the terminating null character.
lpString
A string buffer that will contain the decoded text, terminated with a null character, when the function returns. The string buffer must be large enough to contain all of the decoded text, and cannot be a NULL pointer.
nMaxLength
The maximum number of characters that can be copied into the string buffer. The contents of the lpString buffer will always be null-terminated, and the maximum size must be large enough to include the null character. This value must be greater than zero.
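As a rough guide to sizing lpString for the Unicode version, the sketch below counts how many UTF-16 code units a well-formed, null-terminated UTF-8 string decodes to, plus one for the terminating null. The helper name utf16_units_needed is hypothetical and is not part of the library. It also assumes that nMaxLength is measured in UTF-16 code units when UnicodeDecodeTextW is called, which this page does not state explicitly, and it does not apply to the multi-byte version, where the size depends on the code page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the library. Returns the number of
   UTF-16 code units, including the terminating null, needed to hold the
   decoded form of a well-formed, null-terminated UTF-8 string. */
static size_t utf16_units_needed(const char *utf8_text)
{
    const unsigned char *s = (const unsigned char *)utf8_text;
    size_t units = 0;

    assert(utf8_text != NULL);

    for (; *s != '\0'; s++) {
        /* Every byte that is not a continuation byte (10xxxxxx)
           starts a new character. */
        if ((*s & 0xC0) != 0x80)
            units++;
        /* A four-byte sequence encodes a code point above U+FFFF,
           which takes a surrogate pair (two code units) in UTF-16. */
        if (*s >= 0xF0)
            units++;
    }
    return units + 1; /* room for the terminating null */
}
```

Because each UTF-8 byte contributes at most one UTF-16 code unit, a buffer of strlen(lpUtf8Text) + 1 code units is always enough for well-formed input if an exact count is not needed.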

Return Value

If the UTF-8 encoded text is successfully decoded, the return value is the number of characters copied to the string buffer. If the text cannot be decoded, or the string buffer is not large enough to store all of the decoded text, the function will return zero. To get extended error information, call GetLastError.

Remarks

There are two versions of this function: UnicodeDecodeTextA, which returns a localized multi-byte string, and UnicodeDecodeTextW, which converts the UTF-8 encoded text to UTF-16 text. Your project configuration typically determines which version of this function is used by default.

If the value of the nLength parameter is larger than the number of bytes in the UTF-8 encoded string, the function will not read beyond the terminating null character. If the length of the lpUtf8Text string is unknown, specify a length of -1 and the function will decode the entire contents of the string up to the terminating null character.
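The length rules described here can be sketched in portable C. The helper effective_utf8_length is hypothetical and only models the documented behavior; it is not the library's code. As an assumption, it treats any negative length the same way as -1, although only -1 is documented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the library. Returns the number of
   bytes that would be examined for a given nLength value: the scan stops
   at the terminating null or after `length` bytes, whichever comes first.
   A negative length (assumed equivalent to -1) scans to the null. */
static size_t effective_utf8_length(const char *utf8_text, int length)
{
    size_t count = 0;

    assert(utf8_text != NULL);

    while ((length < 0 || count < (size_t)length) && utf8_text[count] != '\0')
        count++;

    return count;
}
```

For the three-byte string "abc", a length of -1 or 10 gives 3, and a length of 2 gives 2. The count is in bytes, so a character such as "é" encoded as two UTF-8 bytes contributes 2.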

When calling UnicodeDecodeTextA to convert UTF-8 encoded text to a localized multi-byte string, it is recommended that you specify CP_ACP (zero) as the code page value unless you know the text contains Unicode characters that cannot be represented using the default ANSI code page. Using CP_ACP will ensure the UTF-8 text is decoded using the current locale and language settings.

When calling UnicodeDecodeTextW to convert UTF-8 encoded text to UTF-16 text, the code page parameter is ignored and should always be a value of zero.

This function performs a strict check on the UTF-8 encoded text and will fail if the encoding is malformed. When converting to a multi-byte string, it will also fail if the text cannot be represented using the specified code page. It will not simply replace invalid character sequences with a default character.
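To illustrate what a strict check typically rejects, the following portable sketch validates a byte sequence against the well-formedness rules of the Unicode Standard: no stray continuation bytes, no truncated sequences, no overlong forms, no surrogates, and nothing above U+10FFFF. The function name is_well_formed_utf8 is hypothetical, and this is not the library's implementation; the exact checks performed by UnicodeDecodeText are not listed on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the library. Returns 1 if the first
   `length` bytes of `text` are well-formed UTF-8, otherwise 0. */
static int is_well_formed_utf8(const char *text, size_t length)
{
    const unsigned char *s = (const unsigned char *)text;
    size_t i = 0;

    assert(text != NULL || length == 0);

    while (i < length) {
        unsigned char c = s[i];
        size_t extra;           /* continuation bytes expected */
        size_t k;
        unsigned long cp;       /* decoded code point */
        unsigned long min;      /* smallest code point allowed for this length */

        if (c < 0x80) { i++; continue; }                 /* ASCII */
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;          /* stray continuation or invalid lead byte */

        if (length - i <= extra)
            return 0;           /* sequence is truncated */

        for (k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;       /* expected a continuation byte */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }

        if (cp < min) return 0;                          /* overlong form */
        if (cp > 0x10FFFF) return 0;                     /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;      /* surrogate */

        i += extra + 1;
    }
    return 1;
}
```

A decoder that applies checks of this kind reports an error for input such as the overlong sequence C0 80, rather than substituting a replacement character.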

Requirements

Minimum Desktop Platform: Windows 7 Service Pack 1
Minimum Server Platform: Windows Server 2008 R2 Service Pack 1
Header File: cstools11.h
Import Library: csncdv11.lib
Unicode: Implemented as Unicode and ANSI versions

See Also

IsUnicodeText, UnicodeEncodeText