UnicodeEncodeText Function

INT WINAPI UnicodeEncodeText(
    UINT nCodePage,
    LPCTSTR lpString,
    INT nLength,
    LPSTR lpUtf8Text,
    INT nMaxLength
);

The UnicodeEncodeText function encodes a string and returns UTF-8 encoded text.

Parameters

nCodePage

A value that specifies the code page used when encoding the text. This parameter can be set to any code page that is available in the operating system. It is recommended that you use CP_ACP, which specifies the system default ANSI code page. This parameter is only used when encoding a multi-byte string.

lpString

A null-terminated string that contains the text to be encoded. This parameter cannot be NULL.

nLength

The number of characters in the string to be encoded. If this parameter is -1, the length of the string is determined by counting the number of characters up to the terminating null character.

lpUtf8Text

A pointer to a character buffer that will contain the UTF-8 encoded text when the function returns. This parameter must always specify a pointer to an 8-bit character string, regardless of whether the project is configured to use the Unicode or Multi-Byte character set. This parameter cannot be NULL.

nMaxLength

The maximum number of bytes that can be copied into the UTF-8 text buffer. The contents of the lpUtf8Text buffer will always be null-terminated, and the maximum size must be large enough to include the null byte. This value must be greater than zero.
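One way to choose a safe nMaxLength for UTF-16 input is to compute a worst-case bound from the input length. The following is a minimal sketch of that calculation, not part of the library: the helper name utf8_buffer_size is hypothetical, the bound covers only the UTF-16 to UTF-8 conversion, and it assumes that normalization does not lengthen the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library function).
 * Returns a worst-case size, in bytes, for the UTF-8 form of a UTF-16
 * string with cchUtf16 code units, including the terminating null byte.
 * One code unit outside the surrogate range needs at most 3 bytes, and a
 * surrogate pair (two code units) needs 4 bytes, so 3 bytes per code unit
 * is a safe upper bound for the conversion itself. */
static size_t utf8_buffer_size(size_t cchUtf16)
{
    /* Guard against overflow in the multiplication below. */
    assert(cchUtf16 <= (SIZE_MAX - 1) / 3);
    return cchUtf16 * 3 + 1;
}
```

For example, a UTF-16 string of 5 characters would be given a 16-byte buffer, and that size would be passed as nMaxLength.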

Return Value

If the string is successfully encoded, the return value is the number of characters copied to the output buffer. If the text cannot be encoded, or the output buffer is not large enough to store all of the encoded text, the function will return zero. To get extended error information, call GetLastError.

Remarks

There are two versions of this function: UnicodeEncodeTextA, which converts a multi-byte string to UTF-8 encoded text, and UnicodeEncodeTextW, which converts a UTF-16 string to UTF-8 encoded text. Your project's character set configuration typically determines which version is used by default.

If the value of the nLength parameter is larger than the number of characters in lpString, the function will not check beyond the terminating null character. If the length of lpString is unknown, specify a length of -1 and the function will encode the entire contents of the string up to the terminating null character.
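The length rule described above can be modeled for an 8-bit input string as shown below. This is an illustrative sketch of the documented behavior only, not the library's implementation. The helper name effective_length is hypothetical, and the treatment of negative values other than -1 is not documented, so the sketch rejects them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function).
 * Models how many characters of lpString would be encoded:
 *   nLength == -1       -> everything up to the terminating null
 *   nLength >  length   -> clamped at the terminating null
 *   otherwise           -> exactly nLength characters */
static size_t effective_length(const char *lpString, int nLength)
{
    assert(lpString != NULL);   /* the documentation says this cannot be NULL */
    assert(nLength >= -1);      /* other negative values are unspecified */

    size_t actual = strlen(lpString);
    if (nLength == -1)
        return actual;
    return ((size_t)nLength < actual) ? (size_t)nLength : actual;
}
```

With the string "hello", a length of -1 or 100 selects all 5 characters, while a length of 3 selects only "hel".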

When calling UnicodeEncodeTextA to convert a multi-byte string to UTF-8 encoding, it is recommended that you specify CP_ACP (zero) as the code page value unless you know the string contains characters from a different code page. Using CP_ACP will ensure the string is encoded using the current locale and language settings.

This function performs a strict check on the multi-byte input string and will fail if it contains a malformed multi-byte sequence or characters that cannot be converted to UTF-8 using the specified code page. It will not simply replace invalid character sequences with a default character.

When calling UnicodeEncodeTextW to convert UTF-16 text to UTF-8 encoded text, the code page parameter is ignored and should always be zero. The text will be normalized prior to being converted to UTF-8 using canonical composition, where decomposed characters are combined to create their canonical precomposed equivalent.
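Canonical composition can change the size of the encoded output, which matters when sizing the buffer or interpreting the return value. The sketch below illustrates this with standard UTF-8 encoding lengths: the letter "e" followed by a combining acute accent (U+0065 U+0301) occupies 3 bytes, while the precomposed character U+00E9 occupies 2. The helper names are hypothetical and the code does not call or reproduce the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library function).
 * Returns the number of bytes needed to encode one Unicode scalar value
 * as UTF-8, or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_length(uint32_t cp)
{
    if (cp < 0x80)
        return 1;
    if (cp < 0x800)
        return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;               /* surrogates are not scalar values */
    if (cp < 0x10000)
        return 3;
    if (cp <= 0x10FFFF)
        return 4;
    return 0;
}

/* Compares the decomposed and precomposed forms of "e with acute". */
static void composition_demo(void)
{
    size_t decomposed = utf8_length(0x0065) + utf8_length(0x0301); /* 1 + 2 */
    size_t composed   = utf8_length(0x00E9);                       /* 2 */

    assert(decomposed == 3);
    assert(composed == 2);
}
```

Because the documented behavior is to compose before encoding, input that contains decomposed sequences may produce fewer bytes than a direct conversion of the same code units would.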

Requirements

Minimum Desktop Platform: Windows 7 Service Pack 1
Minimum Server Platform: Windows Server 2008 R2 Service Pack 1
Header File: cstools11.h
Import Library: csncdv11.lib
Unicode: Implemented as Unicode and ANSI versions
