UTF8Encoding: GetDecoder

GetDecoder

Obtains a decoder that converts a UTF-8 encoded sequence of bytes into a sequence of Unicode characters.



 Public Function GetDecoder ( ) As Decoder

Return Values

Decoder - A Decoder that converts a UTF-8 encoded sequence of bytes into a sequence of Unicode characters.

Remarks

The Decoder.GetChars method converts sequential blocks of bytes into sequential blocks of characters, in a manner similar to the GetCharsEx method of this class. However, a Decoder maintains state information between calls so it can correctly decode byte sequences that span blocks. The Decoder also preserves trailing bytes at the end of data blocks and uses the trailing bytes in the next decoding operation. Therefore, GetDecoder and GetEncoder are useful for network transmission and file operations, because those operations often deal with blocks of data instead of a complete data stream.

If error detection is enabled (that is, the ThrowOnInvalidCharacters parameter of the constructor is set to True), it is also enabled in the Decoder returned by this method. If an invalid sequence is then encountered, the state of the decoder is undefined and processing must stop.

Examples

The following example demonstrates how to use the GetDecoder method to obtain a UTF-8 decoder. The decoder converts a sequence of bytes into a sequence of characters.

Public Sub Main()
    Dim Chars()             As Integer
    Dim Bytes()             As Byte
    Dim UTF8Decoder         As Decoder
    Dim CharCount           As Long
    Dim CharsDecodedCount   As Long
    Dim c                   As Variant

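    ' Write console output as UTF-8 so the non-ASCII characters can be displayed.
    Set Console.OutputEncoding = Encoding.UTF8
    ' Three UTF-8 sequences: 99 = "c", 204 128 = U+0300 (combining grave accent), 234 130 160 = U+A0A0.
    Bytes = NewBytes(99, 204, 128, 234, 130, 160)
    Set UTF8Decoder = Encoding.UTF8.GetDecoder
    
    ' Size the character buffer first, then decode the bytes into it.
    CharCount = UTF8Decoder.GetCharCount(Bytes, 0, CorArray.Length(Bytes))
    ReDim Chars(0 To CharCount - 1)
    CharsDecodedCount = UTF8Decoder.GetChars(Bytes, 0, CorArray.Length(Bytes), Chars, 0)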
    
    Console.WriteLine "{0} characters used to decode bytes.", CharsDecodedCount
    
    Console.WriteValue "Decoded chars: "
    For Each c In Chars
        Console.WriteValue "[{0:$}]", c
    Next
    
    Console.WriteLine
    Console.ReadKey
End Sub

' This code produces the following output.
'
'    3 characters used to decode bytes.
'    Decoded chars: [c][`][ꂠ]
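
The following example is a minimal sketch of the block-spanning behavior described in Remarks, not a definitive implementation. It uses only the calls that appear in the example above and feeds the same six bytes to the decoder in two blocks, so that the two-byte sequence 204, 128 is split across the block boundary. The block sizes and the variable names FirstCount and SecondCount are illustrative assumptions.

Public Sub Main()
    Dim Chars()         As Integer
    Dim Bytes()         As Byte
    Dim UTF8Decoder     As Decoder
    Dim FirstCount      As Long
    Dim SecondCount     As Long

    Bytes = NewBytes(99, 204, 128, 234, 130, 160)
    Set UTF8Decoder = Encoding.UTF8.GetDecoder

    ' The six bytes decode to three characters in total (see the example above).
    ReDim Chars(0 To 2)

    ' First block: bytes 0 to 1. The lead byte 204 has no continuation byte yet,
    ' so the decoder is expected to hold it back instead of emitting a character.
    FirstCount = UTF8Decoder.GetChars(Bytes, 0, 2, Chars, 0)

    ' Second block: bytes 2 to 5. The held-back byte is completed by byte 128.
    SecondCount = UTF8Decoder.GetChars(Bytes, 2, CorArray.Length(Bytes) - 2, Chars, FirstCount)

    Console.WriteLine "{0} characters decoded from the first block.", FirstCount
    Console.WriteLine "{0} characters decoded from the second block.", SecondCount
    Console.ReadKey
End Sub

If the decoder carries state between calls as described in Remarks, the first call should report one character and the second call two, because the trailing byte of the first block is reused when the second block is decoded.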

See Also

Project CorLib Overview

Class UTF8Encoding Overview

Decoder

GetChars

GetCharsEx

GetString

GetCharCount

GetEncoder