UTF8Encoding: GetEncoder

GetEncoder

Obtains an encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.



Public Function GetEncoder() As Encoder

Return Values

Encoder - An Encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.

Remarks

The Encoder.GetBytes method converts sequential blocks of characters into sequential blocks of bytes, in a manner similar to the GetBytesEx method. Unlike GetBytesEx, however, an Encoder maintains state information between calls, so it can correctly encode character sequences that span blocks. The Encoder preserves any trailing characters at the end of a data block and uses them in the next encoding operation. For example, a data block might end with an unmatched high surrogate whose matching low surrogate is the first character of the next data block. GetDecoder and GetEncoder are therefore useful for network transmission and file operations, which often deal with blocks of data rather than a complete data stream.
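
The split-surrogate case described above can be sketched as follows. This is a minimal illustration rather than an official sample: it uses only the calls that appear in the Examples section of this page (NewChars, Encoding.UTF8.GetEncoder, Encoder.GetBytes and the Console methods), and it assumes that the last argument of GetBytes is the flush flag, as in the example there. The character U+1F600 and the two-block split are chosen purely for illustration.

```vb
Public Sub Main()
    Dim Chars()         As Integer
    Dim Bytes(0 To 3)   As Byte
    Dim UTF8Encoder     As Encoder
    Dim FirstCount      As Long
    Dim SecondCount     As Long
    Dim b               As Variant

    ' U+1F600 as a UTF-16 surrogate pair: high &HD83D, low &HDE00.
    Chars = NewChars(&HD83D, &HDE00)
    Set UTF8Encoder = Encoding.UTF8.GetEncoder

    ' Block 1 ends with the unmatched high surrogate. The flush flag
    ' (assumed to be the last argument) is False, so the encoder is
    ' expected to keep the surrogate for the next call.
    FirstCount = UTF8Encoder.GetBytes(Chars, 0, 1, Bytes, 0, False)

    ' Block 2 starts with the matching low surrogate. The flush flag
    ' is True because this is the final block.
    SecondCount = UTF8Encoder.GetBytes(Chars, 1, 1, Bytes, FirstCount, True)

    Console.WriteLine "{0} bytes written by the first call.", FirstCount
    Console.WriteLine "{0} bytes written by the second call.", SecondCount

    Console.WriteValue "Encoded bytes: "
    For Each b In Bytes
        Console.WriteValue "[{0}]", b
    Next

    Console.WriteLine
    Console.ReadKey
End Sub
```

The UTF-8 encoding of U+1F600 is the four bytes 240, 159, 152, 128. If the encoder carries the high surrogate across calls as described, the first call should report no bytes written and the second call should report four.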

If error detection is enabled (that is, the ThrowOnInvalidCharacters parameter of the constructor was set to True), it is also enabled in the Encoder returned by this method. If an invalid sequence is then encountered, the state of the encoder is undefined and processing must stop.

Examples

The following example uses the GetEncoder method to obtain an encoder, then converts a sequence of characters into a UTF-8 encoded sequence of bytes. Only the last three characters of the five-character array are encoded.

Public Sub Main()
    Dim Chars()             As Integer
    Dim Bytes()             As Byte
    Dim UTF8Encoder         As Encoder
    Dim ByteCount           As Long
    Dim BytesEncodedCount   As Long
    Dim b                   As Variant
    
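    ' Five UTF-16 characters: "a", "b", "c", U+0300 and U+A0A0.
    Chars = NewChars("a", "b", "c", &H300, &HA0A0)
    Set UTF8Encoder = Encoding.UTF8.GetEncoder
    
    ' Size the buffer first, then encode the three characters that
    ' start at index 2 ("c", U+0300, U+A0A0) into it.
    ByteCount = UTF8Encoder.GetByteCount(Chars, 2, 3, True)
    ReDim Bytes(0 To ByteCount - 1)
    BytesEncodedCount = UTF8Encoder.GetBytes(Chars, 2, 3, Bytes, 0, True)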
    
    Console.WriteLine "{0} bytes used to encode characters.", BytesEncodedCount
    
    Console.WriteValue "Encoded bytes: "
    For Each b In Bytes
        Console.WriteValue "[{0}]", b
    Next
    
    Console.WriteLine
    Console.ReadKey
End Sub

' This code produces the following output.
'
'    6 bytes used to encode characters.
'    Encoded bytes: [99][204][128][234][130][160]

See Also

Project CorLib Overview

Class UTF8Encoding Overview

Encoder

GetBytesEx

GetByteCount

GetDecoder