UTF32Encoding


Represents a UTF-32 encoding of Unicode characters.


Implements:

Encoding 
IObject 

Public:

Properties:

Name    Description
 BodyName (get) Gets the name of the current encoding that can be used with mail agent body tags.
 CodePage (get) Gets the code page identifier of the current encoding.
 DecoderFallback (get) Gets the current DecoderFallback instance used by the encoding.
 DecoderFallback (set) Sets the DecoderFallback to be used by this encoding instance.
 EncoderFallback (get) Gets the current EncoderFallback instance used by the encoding.
 EncoderFallback (set) Sets the EncoderFallback to be used by this encoding instance.
 EncodingName (get) Gets the human-readable description of the current encoding.
 HeaderName (get) Gets a name for the current encoding that can be used with mail agent header tags.
 IsBrowserDisplay (get) Gets whether browsers can use this encoding to display text.
 IsBrowserSave (get) Gets whether browsers can use this encoding to save data.
 IsMailNewsDisplay (get) Gets whether mail and news clients can use this encoding to display content.
 IsMailNewsSave (get) Gets whether mail and news clients can use this encoding to save data.
 IsReadOnly (get) Gets a value indicating whether the current encoding is read-only.
 IsSingleByte (get) Gets whether the current encoding uses single-byte code points.
 WebName (get) Gets the encoding name registered with the Internet Assigned Numbers Authority (IANA).
 WindowsCodePage (get) Gets the Windows operating system code page for this encoding.

Methods:

Name    Description
 Clone Creates a clone of the current Encoding instance.
 Equals Returns a Boolean indicating whether the value and this object instance are the same instance.
 GetByteCount Returns the number of bytes that would be produced by encoding the set of characters with this encoding.
 GetBytes Encodes a set of characters into an array of bytes.
 GetBytesEx Encodes a set of characters into an array of bytes, returning the number of bytes produced.
 GetCharCount Returns the number of characters that would be produced by decoding a byte array.
 GetChars Decodes a set of bytes into a set of characters.
 GetCharsEx Decodes a set of bytes into the supplied Integer array.
 GetDecoder Obtains a decoder that converts a UTF-32 encoded sequence of bytes into a sequence of Unicode characters.
 GetEncoder Obtains an encoder that converts a sequence of Unicode characters into a UTF-32 encoded sequence of bytes.
 GetHashCode Returns a pseudo-unique number identifying this instance.
 GetMaxByteCount Returns the maximum number of bytes that can be created from a specific number of characters.
 GetMaxCharCount Returns the maximum number of characters that can be decoded from the number of bytes specified.
 GetPreamble Returns a Unicode byte order mark encoded in UTF-32 format, if the UTF32Encoding object is configured to supply one.
 GetString Decodes a range of bytes into a String.
 ToString Returns a string representation of this object instance.

Remarks

Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.

The UTF-32 encoding represents each code point as a 32-bit integer.

The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.

Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
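The count-then-convert pattern described above can be sketched as a simple round trip. This is a minimal illustration rather than a definitive usage: the routine name RoundTrip and the sample text are made up for the example, and the calls use the same argument shapes as the example later in this topic (GetBytesEx is used for the encoding step because it writes into a buffer sized with GetByteCount). GetCharCount is assumed to accept the byte array directly.

```vb
Public Sub RoundTrip()
    Dim Enc     As UTF32Encoding
    Dim Source  As String
    Dim Bytes() As Byte
    
    ' Little-endian byte order, no byte order mark.
    Set Enc = NewUTF32Encoding(False, False)
    Source = "Hello"
    
    ' Size the buffer with GetByteCount, then encode into it.
    ReDim Bytes(0 To Enc.GetByteCount(Source) - 1)
    Enc.GetBytesEx Source, 0, Len(Source), Bytes, 0
    
    ' Ask how many characters the bytes decode to, then decode them.
    Console.WriteLine "Bytes produced: {0}", UBound(Bytes) + 1
    Console.WriteLine "Characters to decode: {0}", Enc.GetCharCount(Bytes)
    Console.WriteLine "Round trip: {0}", Enc.GetString(Bytes)
End Sub
```

Because UTF-32 uses four bytes per code point, the byte count for text without surrogate pairs should be four times the character count.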

UTF32Encoding corresponds to the Windows code pages 12000 (little endian byte order) and 12001 (big endian byte order).

The encoder can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (code point U+0041) is serialized as follows (in hexadecimal):

Big endian byte order: 00 00 00 41
Little endian byte order: 41 00 00 00

It is generally more efficient to store Unicode characters using the native byte order. For example, it is better to use the little endian byte order on little endian platforms, such as Intel computers.
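The effect of the byte order setting can be inspected by encoding the same character with two encoders and printing the resulting bytes. The sketch below is illustrative only: ShowByteOrder and PrintEncoded are hypothetical helper names, and the two-argument form of NewUTF32Encoding (big-endian flag, byte order mark flag) is taken from the example later in this topic.

```vb
Public Sub ShowByteOrder()
    PrintEncoded "Big endian:   ", NewUTF32Encoding(True, False)
    PrintEncoded "Little endian:", NewUTF32Encoding(False, False)
End Sub

' Hypothetical helper: encodes "A" with the supplied encoder and
' prints each byte as a two-digit hexadecimal value.
Private Sub PrintEncoded(ByVal Label As String, ByVal Enc As UTF32Encoding)
    Dim Source  As String
    Dim Bytes() As Byte
    Dim Hexed   As String
    Dim i       As Long
    
    Source = "A"
    
    ' Size the buffer with GetByteCount, then encode into it.
    ReDim Bytes(0 To Enc.GetByteCount(Source) - 1)
    Enc.GetBytesEx Source, 0, Len(Source), Bytes, 0
    
    ' Pad each byte to two hex digits and separate the values with spaces.
    For i = 0 To UBound(Bytes)
        Hexed = Hexed & Right$("0" & Hex$(Bytes(i)), 2) & " "
    Next
    
    Console.WriteLine Label & " " & RTrim$(Hexed)
End Sub
```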

The GetPreamble method retrieves an array of bytes that can include the byte order mark (BOM). If this byte array is prefixed to an encoded stream, it helps the decoder to identify the encoding format used.
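As a rough sketch of how the preamble might be examined, the routine below requests it from an encoder created with the byte order mark enabled and prints the bytes in hexadecimal. ShowPreamble is a hypothetical name, and the code assumes GetPreamble returns a Byte array that can be assigned directly. For little endian UTF-32, the Unicode byte order mark is the byte sequence FF FE 00 00.

```vb
Public Sub ShowPreamble()
    Dim Enc     As UTF32Encoding
    Dim Bom()   As Byte
    Dim Hexed   As String
    Dim i       As Long
    
    ' Little-endian byte order, with the byte order mark enabled.
    Set Enc = NewUTF32Encoding(False, True)
    Bom = Enc.GetPreamble
    
    ' Format each preamble byte as a two-digit hexadecimal value.
    For i = LBound(Bom) To UBound(Bom)
        Hexed = Hexed & Right$("0" & Hex$(Bom(i)), 2) & " "
    Next
    
    Console.WriteLine "Preamble: " & RTrim$(Hexed)
End Sub
```

A writer would typically emit these bytes once, at the start of the stream, before any encoded text.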

For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.

Public Sub Main()
    Dim U32LE       As UTF32Encoding
    Dim U32withED   As UTF32Encoding
    Dim U32noED     As UTF32Encoding
    Dim MyStr       As String
    Dim MyBytes()   As Byte
    
    Set Console.OutputEncoding = Encoding.UTF8
    
    ' Create an instance of UTF32Encoding using little-endian byte order.
    ' This will be used for encoding.
    Set U32LE = NewUTF32Encoding(False, True)
    
    ' Create two instances of UTF32Encoding using big-endian byte order: one with error detection and one without.
    ' These will be used for decoding.
    Set U32withED = NewUTF32Encoding(True, True, True)
    Set U32noED = NewUTF32Encoding(True, True, False)
    
    ' Create byte arrays from the same string containing the following characters:
    '    Latin Small Letter Z (U+007A)
    '    Latin Small Letter A (U+0061)
    '    Combining Breve (U+0306)
    '    Latin Small Letter AE With Acute (U+01FD)
    '    Greek Small Letter Beta (U+03B2)
    '    a high-surrogate value (U+D8FF)
    '    a low-surrogate value (U+DCFF)
    MyStr = t("za\u0306\u01FD\u03B2\uD8FF\uDCFF")
    
    ' Encode the string using little-endian byte order.
    ReDim MyBytes(0 To U32LE.GetByteCount(MyStr) - 1)
    U32LE.GetBytesEx MyStr, 0, Len(MyStr), MyBytes, 0
    
    ' Decode the byte array with error detection.
    Console.WriteLine "Decoding with error detection:"
    PrintDecodedString MyBytes, U32withED
    
    ' Decode the byte array without error detection.
    Console.WriteLine "Decoding without error detection:"
    PrintDecodedString MyBytes, U32noED
    
    Console.ReadKey
End Sub

Private Sub PrintDecodedString(ByRef Bytes() As Byte, ByVal Enc As Encoding)
    On Error GoTo Catch
    
    Console.WriteLine "  Decoded string: {0}", Enc.GetString(Bytes)
    
    GoTo Finally
    
Catch:
    Dim Ex As Exception
    Catch Ex, Err
    Console.WriteLine Ex.ToString
Finally:
    Console.WriteLine
End Sub

See Also

Project CorLib Overview

Class UTF32Encoding Overview

ASCIIEncoding

UTF7Encoding

UTF8Encoding

UnicodeEncoding