EncodingStatic


Provides static methods used to retrieve existing encodings and convert between encodings.


Public:

Properties:

Name                     Description
ASCII (get)              Gets an encoding for the ASCII (7-bit) character set.
BigEndianUnicode (get)   Gets an encoding for the UTF-16 format that uses the big endian byte order.
Default (get)            Gets an encoding for the operating system's current ANSI code page.
Unicode (get)            Gets an encoding for the UTF-16 format that uses the little endian byte order.
UTF32 (get)              Gets an encoding for the UTF-32 format that uses the little endian byte order.
UTF7 (get)               Gets an encoding for the UTF-7 format.
UTF8 (get)               Gets an encoding for the UTF-8 format.

Methods:

Name           Description
Convert        Converts a set of bytes from one encoding to another encoding.
GetEncoding    Returns the encoding associated with the specified code page identifier or name. Optional parameters specify an error handler for characters that cannot be encoded and byte sequences that cannot be decoded.
GetEncodings   Returns a list of minimal information about each encoding.

Remarks

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.

Note that Encoding is intended to operate on Unicode characters, not on arbitrary binary data such as byte arrays. If your application must represent arbitrary binary data as text, it should use a binary-to-text scheme such as Base64, which is implemented by methods such as Convert.ToBase64CharArray.

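VBCorLib provides the following implementations of the Encoding class to support current Unicode encodings and other encodings:

ASCIIEncoding encodes Unicode characters as single 7-bit ASCII characters.
UTF7Encoding encodes Unicode characters using the UTF-7 format.
UTF8Encoding encodes Unicode characters using the UTF-8 format.
UnicodeEncoding encodes Unicode characters using the UTF-16 format.
UTF32Encoding encodes Unicode characters using the UTF-32 format.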

The Encoding class is primarily intended to convert between different encodings and Unicode. Often one of the derived Unicode classes is the correct choice for your application.

Use the GetEncoding method to obtain other encodings, and the GetEncodings method to get a list of all available encodings.
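The two lookups can be sketched as follows. This is a minimal illustration, not a definitive usage: it assumes that GetEncoding accepts either a name or a numeric code page (as the method summary states), that Encoding exposes .NET-style WebName and CodePage properties, and that each item returned by GetEncodings exposes CodePage and Name. Those member names are assumptions that this page does not confirm.

```vb
Public Sub Main()
    Dim ByName     As Encoding
    Dim ByCodePage As Encoding
    Dim Info       As Variant
    
    ' Look up the same encoding by name and by code page identifier.
    ' 65001 is the Windows code page identifier for UTF-8.
    Set ByName = Encoding.GetEncoding("utf-8")
    Set ByCodePage = Encoding.GetEncoding(65001)
    
    ' ASSUMPTION: WebName and CodePage mirror the .NET Encoding members.
    Console.WriteLine "By name:      " & ByName.WebName & " (" & ByName.CodePage & ")"
    Console.WriteLine "By code page: " & ByCodePage.WebName & " (" & ByCodePage.CodePage & ")"
    
    ' ASSUMPTION: each item returned by GetEncodings exposes CodePage and Name.
    For Each Info In Encoding.GetEncodings
        Console.WriteLine Info.CodePage & vbTab & Info.Name
    Next
    Console.ReadKey
End Sub
```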

If the data to be converted is available only in sequential blocks (such as data read from a stream) or if the amount of data is so large that it needs to be divided into smaller blocks, your application should use the Decoder or the Encoder provided by the GetDecoder method or the GetEncoder method, respectively, of a derived class.
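The following sketch shows why a stateful Decoder matters for block-wise input: a two-byte UTF-8 sequence is deliberately split across two blocks. It assumes that Decoder.GetChars follows the .NET-style argument order (Bytes, ByteIndex, ByteCount, Chars, CharIndex) and returns the number of characters written. That signature is an assumption and should be checked against the Decoder documentation.

```vb
Public Sub Main()
    Dim Utf8Decoder As Decoder
    Dim Block1()    As Byte
    Dim Block2()    As Byte
    Dim Chars()     As Integer
    Dim CharCount   As Long
    
    ' "A" (41) followed by U+00E9 (C3 A9 in UTF-8), with the two-byte
    ' sequence split across the block boundary.
    ReDim Block1(0 To 1)
    Block1(0) = &H41
    Block1(1) = &HC3
    ReDim Block2(0 To 0)
    Block2(0) = &HA9
    
    ReDim Chars(0 To 1)
    Set Utf8Decoder = Encoding.UTF8.GetDecoder
    
    ' ASSUMPTION: GetChars(Bytes, ByteIndex, ByteCount, Chars, CharIndex) returns
    ' the number of characters produced. The decoder is expected to keep the
    ' incomplete lead byte C3 from the first block until the second call.
    CharCount = Utf8Decoder.GetChars(Block1, 0, 2, Chars, 0)
    CharCount = CharCount + Utf8Decoder.GetChars(Block2, 0, 1, Chars, CharCount)
    
    Console.WriteLine "Characters decoded: " & CharCount
    Console.ReadKey
End Sub
```

Calling Encoding.UTF8.GetChars on each block separately would treat the split bytes as two invalid sequences, which is the case the Decoder is meant to avoid.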

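The UTF-16 and the UTF-32 encoders can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (U+0041) is serialized as follows (in hexadecimal):

UTF-16 big endian byte order: 00 41
UTF-16 little endian byte order: 41 00
UTF-32 big endian byte order: 00 00 00 41
UTF-32 little endian byte order: 41 00 00 00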
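The byte orders can be inspected directly by encoding the letter A with the encodings exposed by this class. The sketch below uses GetBytes in the same way as the conversion example on this page. ToHex is a hypothetical helper written for this illustration and is not part of VBCorLib.

```vb
Public Sub Main()
    Dim Bytes() As Byte
    
    Bytes = Encoding.Unicode.GetBytes("A")
    Console.WriteLine "UTF-16 little endian: " & ToHex(Bytes)
    
    Bytes = Encoding.BigEndianUnicode.GetBytes("A")
    Console.WriteLine "UTF-16 big endian:    " & ToHex(Bytes)
    
    Bytes = Encoding.UTF32.GetBytes("A")
    Console.WriteLine "UTF-32 little endian: " & ToHex(Bytes)
    Console.ReadKey
End Sub

' Hypothetical helper: formats a byte array as space-separated hex pairs.
Private Function ToHex(ByRef Bytes() As Byte) As String
    Dim i      As Long
    Dim Result As String
    
    For i = LBound(Bytes) To UBound(Bytes)
        Result = Result & Right$("0" & Hex$(Bytes(i)), 2) & " "
    Next
    ToHex = RTrim$(Result)
End Function

' Expected output, given the byte orders named in the property descriptions:
'
'    UTF-16 little endian: 41 00
'    UTF-16 big endian:    00 41
'    UTF-32 little endian: 41 00 00 00
```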

The GetPreamble method retrieves an array of bytes that includes the byte order mark (BOM). If this byte array is prefixed to an encoded stream, it helps the decoder to identify the encoding format used.
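As a small sketch, the preamble of an encoding can be printed in hexadecimal to see which byte order mark it would contribute. It assumes GetPreamble takes no arguments and returns a Byte array, as the description above suggests.

```vb
Public Sub Main()
    Dim Preamble() As Byte
    Dim HexText    As String
    Dim i          As Long
    
    ' ASSUMPTION: GetPreamble takes no arguments and returns a Byte array.
    Preamble = Encoding.BigEndianUnicode.GetPreamble
    
    ' Format each preamble byte as a two-digit hex pair.
    For i = LBound(Preamble) To UBound(Preamble)
        HexText = HexText & Right$("0" & Hex$(Preamble(i)), 2) & " "
    Next
    
    ' In UTF-16 big endian, the byte order mark U+FEFF serializes as FE FF.
    Console.WriteLine "Preamble: " & RTrim$(HexText)
    Console.ReadKey
End Sub
```

An application writing a stream would typically emit these bytes once, before the first encoded block.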

For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.

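Note that the encoding classes allow errors to:

Silently change to a "?" character.
Use a "best fit" character.
Change to an application-specific behavior through use of the EncoderFallback and DecoderFallback classes with the U+FFFD Unicode replacement character.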

It is recommended that applications throw exceptions on all data stream errors. An application can either use a "throwonerror" flag when applicable or use the EncoderExceptionFallback and DecoderExceptionFallback classes. Best fit fallback is often not recommended because it can cause data loss or confusion and is slower than simple character replacement. For ANSI encodings, best fit behavior is the default.
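The recommendation above could look roughly like the following sketch, which requests an ASCII encoding that raises an error instead of substituting characters. The way the exception fallbacks are obtained here (EncoderFallback.ExceptionFallback and DecoderFallback.ExceptionFallback) is borrowed from the .NET design and is an assumption. This page confirms only that GetEncoding accepts optional error handlers.

```vb
Public Sub Main()
    Dim StrictAscii As Encoding
    Dim Bytes()     As Byte
    
    ' ASSUMPTION: the ExceptionFallback accessors below follow the .NET design
    ' and are not confirmed by this page. 20127 is the code page for US-ASCII.
    Set StrictAscii = Encoding.GetEncoding(20127, _
                                           EncoderFallback.ExceptionFallback, _
                                           DecoderFallback.ExceptionFallback)
    
    On Error GoTo EncodingFailed
    
    ' U+03A0 (Pi) has no ASCII representation, so with an exception
    ' fallback this call is expected to raise an error.
    Bytes = StrictAscii.GetBytes(ChrW$(&H3A0))
    Console.WriteLine "Encoded without error."
    Console.ReadKey
    Exit Sub
    
EncodingFailed:
    Console.WriteLine "Encoding failed: " & Err.Description
    Console.ReadKey
End Sub
```

Without the fallback arguments, the same call would be expected to produce a "?" in place of the unsupported character, as the conversion example on this page shows.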

Examples

The following example converts a string from one encoding to another.

Public Sub Main()
    Dim UnicodeString   As String
    Dim AsciiEncoding   As Encoding
    Dim UnicodeEncoding As Encoding
    Dim AsciiBytes()    As Byte
    Dim UnicodeBytes()  As Byte
    Dim AsciiChars()    As Integer
    Dim AsciiString     As String
    
    Set Console.OutputEncoding = Encoding.UTF8
    UnicodeString = t("This string contains the unicode character Pi (\u03a0)")
    
    ' Create two different encodings.
    Set AsciiEncoding = Encoding.ASCII
    Set UnicodeEncoding = Encoding.Unicode
    
    ' Convert the string into a byte array.
    UnicodeBytes = UnicodeEncoding.GetBytes(UnicodeString)
    
    ' Perform the conversion from one encoding to the other.
    AsciiBytes = Encoding.Convert(UnicodeEncoding, AsciiEncoding, UnicodeBytes)
    
    ' Convert the new Byte() into a Char() and then into a string.
    AsciiChars = AsciiEncoding.GetChars(AsciiBytes)
    AsciiString = NewString(AsciiChars)
    
    ' Display the strings created before and after the conversion.
    Console.WriteLine "Original string: " & UnicodeString
    Console.WriteLine "Ascii converted string: " & AsciiString
    Console.ReadKey
End Sub

' This example code produces the following output.
'
'    Original string: This string contains the unicode character Pi (Π)
'    Ascii converted string: This string contains the unicode character Pi (?)

See Also

Project CorLib Overview

Class EncodingStatic Overview

Encoding

ASCIIEncoding

UTF7Encoding

UTF8Encoding

UTF32Encoding

UnicodeEncoding