CodePage Property

Gets and sets the code page used when converting text to and from Unicode.

Syntax

object.CodePage [= value ]

Remarks

The CodePage property is an integer value that specifies how text is encoded. Any valid code page identifier may be specified. Some common values are:

Value  Description

0 (CP_ACP)  Text sent and received using a string should be converted using the ANSI code page for the current locale.

1 (CP_OEMCP)  Text sent and received using a string should be converted using the system default OEM code page. The OEM code page typically contains characters that are used by console applications and is based on character sets commonly used by MS-DOS. You should not use this code page unless you know the server is sending text that includes OEM characters.

1252  Text sent and received using a string should be converted using the Windows ANSI code page for western European languages. This code page is commonly used by legacy Windows applications for English and some other western languages. Although this code page is similar to the ISO 8859-1 character encoding, it is not identical.

28591  Text sent and received using a string should be converted using the ISO 8859-1 code page for western European languages. This code page is commonly referred to as Latin-1 and is similar to the Windows 1252 code page.

65000 (CP_UTF7)  Text sent and received using a string should be converted using UTF-7 encoding. If this code page is specified, data written to the socket will be encoded as UTF-7 and all data received from the server will be converted from UTF-7. Using this code page is not recommended unless you know that the remote host is sending UTF-7 encoded text.

65001 (CP_UTF8)  Text sent and received using a string should be converted using UTF-8 encoding. If this code page is specified, data written to the socket will be encoded as UTF-8 and all data received from the server will be converted from UTF-8 to UTF-16 Unicode. Because UTF-8 is backwards compatible with the ASCII character set, it is safe to use this encoding option when sending and receiving ASCII text.

A complete list of available code page identifiers can be found in Microsoft's documentation for the Win32 API.
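Two of the points made in the table can be checked outside the control. The following Python sketch is illustrative only and does not use the control's API. It assumes Python's "cp1252" and "latin-1" codecs as stand-ins for code pages 1252 and 28591, and the command string is an arbitrary example.

```python
# Byte 0x80 is the euro sign in Windows-1252, but a C1 control
# character in ISO 8859-1, so the two code pages are not identical.
raw = bytes([0x80])
as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

# Bytes in the range 0xA0-0xFF decode to the same characters in both.
shared = bytes([0xE9]).decode("cp1252") == bytes([0xE9]).decode("latin-1")

# ASCII-only text produces identical bytes under UTF-8 and ASCII,
# which is why UTF-8 is safe for plain ASCII text.
ascii_text = "RETR readme.txt"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

print(repr(as_cp1252), repr(as_latin1), shared, same_bytes)
```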

All data exchanged with an FTP server is sent and received as 8-bit bytes, typically referred to as "octets" in networking terminology. However, the internal string type used by ActiveX controls is Unicode, with each character represented using 16 bits. When you send and receive data using the String data type, it is automatically converted to and from a stream of bytes.

By default, strings are converted to an array of bytes using UTF-8 encoding, mapping the 16-bit Unicode characters to 8-bit bytes. Similarly, when reading data into a string buffer, the stream of bytes received from the remote host is converted to Unicode before it is returned to your application.
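The conversion between 16-bit characters and 8-bit bytes described above can be illustrated with a short Python sketch. It is not the control's implementation. It only shows what a UTF-8 conversion does to a sample string, and the sample text is an assumption chosen for illustration.

```python
text = "café"

# In memory, a UTF-16 string holds one 16-bit unit per character here.
utf16_units = len(text.encode("utf-16-le")) // 2

# On the wire, UTF-8 uses one byte for each ASCII character and
# two bytes for the accented letter.
wire_bytes = text.encode("utf-8")

# Reading the same bytes back with the same encoding restores the text.
round_trip = wire_bytes.decode("utf-8")

print(utf16_units, len(wire_bytes), round_trip == text)
```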

If the text you receive appears to be corrupted, or characters are being replaced with question marks or other symbols, it is likely the file on the server uses a different character encoding. Most applications use UTF-8 encoding to represent non-ASCII characters; however, some text files may use a localized character set rather than Unicode. Setting this property before calling the GetText and PutText methods changes how that text is converted to and from Unicode.
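The symptom described above can be reproduced with a minimal Python sketch. It assumes a hypothetical file that was saved on the server using Windows-1252 and is then read as UTF-8. Python marks undecodable bytes with the U+FFFD replacement character, and other environments may show question marks instead.

```python
# Hypothetical file contents, stored on the server as Windows-1252.
file_bytes = "Müller, 5 €".encode("cp1252")

# Decoding with the wrong encoding: the bytes for "ü" and "€" are not
# valid UTF-8, so each is replaced with U+FFFD.
wrong = file_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file actually uses restores the text.
right = file_bytes.decode("cp1252")

print(wrong)
print(right)
```

In terms of this property, the second decode corresponds to selecting code page 1252 before the text is retrieved.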

Strings are only guaranteed to be safe when sending and receiving text. Using a string data type is not recommended when uploading or downloading binary data. If possible, you should always use a byte array when using the GetData and PutData methods.
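To see why strings are unsafe for binary data, consider this Python sketch. The byte values are an arbitrary example, not taken from any real file, and the decode step stands in for whatever text conversion is applied when bytes are read into a string.

```python
# Arbitrary binary payload containing bytes that are not valid UTF-8.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0xFE, 0x00, 0x80])

# Treating the payload as text replaces every invalid byte.
as_text = payload.decode("utf-8", errors="replace")

# Converting the text back to bytes does not restore the original data.
back = as_text.encode("utf-8")
corrupted = back != payload

print(len(payload), len(back), corrupted)
```

A byte array avoids the problem because no character conversion is performed.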

This property value directly corresponds to Windows code page identifiers, and will accept any valid code page in addition to the values listed above. Setting this property to an invalid code page will result in an error.
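As a rough analogy for how a numeric identifier selects an encoding and how an invalid one is rejected, the sketch below maps a few Windows code page identifiers to Python codec names. The helper name `codec_for_code_page` and the lookup table are hypothetical and are not part of the control.

```python
import codecs

# Identifiers whose Python codec name does not follow the "cpNNNN" pattern.
SPECIAL = {28591: "latin-1", 65000: "utf-7", 65001: "utf-8"}

def codec_for_code_page(code_page: int) -> str:
    """Return a Python codec name for a Windows code page identifier."""
    name = SPECIAL.get(code_page, f"cp{code_page}")
    try:
        codecs.lookup(name)  # raises LookupError if no such codec exists
    except LookupError:
        raise ValueError(f"invalid code page: {code_page}") from None
    return name

print(codec_for_code_page(1252))
print(codec_for_code_page(65001))
```

Python only ships codecs for a subset of Windows code pages, so this helper rejects some identifiers that Windows itself would accept.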

Although strings in Visual Basic are internally managed as Unicode, the default common controls used in Visual Basic 6.0 do not support Unicode. Those controls, such as buttons, text boxes and labels, will automatically convert the Unicode text to ANSI using the current code page. This means that text in the end-user's native language (depending on system settings) may display correctly, although text in other languages using different character sets may not. Also note that the VB6 IDE is not Unicode aware and may display corrupted string values or invalid characters, such as with tooltip values when debugging.

For Unicode support in Visual Basic 6.0, it's recommended that you use third-party controls. An alternative that some developers have used is the Microsoft Forms 2.0 Object Library (FM20.DLL) that is part of Microsoft Office. It includes a collection of controls that support Unicode; however, they are not redistributable, and Microsoft has stated that their use with VB6 is unsupported.

Data Type

Integer (Int32)
