EncodingStatic: Unicode (get)

Unicode

Gets an encoding for the UTF-16 format using the little endian byte order.



Public Property Get Unicode() As UnicodeEncoding

Return Values

UnicodeEncoding - An encoding for the UTF-16 format using the little endian byte order.

Remarks

The UnicodeEncoding object returned by this property may not behave the way your application needs. It uses replacement fallback: each string that it cannot encode and each byte that it cannot decode is replaced with a question mark ("?") character, so invalid data does not raise an error. To detect invalid data instead, call the NewUnicodeEncoding(Boolean, Boolean, Boolean) constructor to create a little endian UnicodeEncoding object that raises an EncoderFallbackException or a DecoderFallbackException, as the following example illustrates.

Public Sub Main()
    Dim Bytes() As Byte
    Dim Enc     As UnicodeEncoding
    Dim Value   As String
    
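    ' The bytes 01 D8 form a lone high surrogate (U+D801), which is invalid UTF-16.
    Bytes = NewBytes(&H20, &H0, &H1, &HD8, &H68, &H0, &HA7, &H0)
    ' Arguments: little endian, emit a byte order mark, raise an error on invalid bytes.
    Set Enc = NewUnicodeEncoding(False, True, True)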
    
    On Error GoTo Catch
    Value = Enc.GetString(Bytes)
    Debug.Print CorString.Format("'{0}'", Value)
    Exit Sub
    
Catch:
    Dim Ex As DecoderFallbackException
    
    Catch Ex, Err
    Debug.Print CorString.Format("Unable to decode {0} at index {1}", ShowBytes(Ex.BytesUnknown), Ex.Index)
End Sub

Private Function ShowBytes(ByRef Bytes() As Byte) As String
    Dim Byt As Variant
    Dim ReturnString As New StringBuilder
    
    For Each Byt In Bytes
        ReturnString.AppendFormat "0x{0:X2} ", Byt
    Next
    
    ShowBytes = Trim$(ReturnString.ToString)
End Function

' The example displays the following output:
'        Unable to decode 0x01 0xD8 at index 4
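
For contrast, the sketch below decodes the same bytes with the shared instance returned by this property. It is a minimal illustration that assumes the same CorLib helpers used in the example above (NewBytes, CorString.Format) and relies on the replacement fallback described in these remarks. The exact replacement character printed is not shown because it has not been verified here.

```vb
Public Sub Main()
    Dim Bytes() As Byte
    Dim Value   As String

    ' Same input as the example above: 01 D8 is a lone high surrogate.
    Bytes = NewBytes(&H20, &H0, &H1, &HD8, &H68, &H0, &HA7, &H0)

    ' Encoding.Unicode uses replacement fallback, so no error handler
    ' is expected to be needed for the invalid byte pair.
    Value = Encoding.Unicode.GetString(Bytes)
    Debug.Print CorString.Format("'{0}'", Value)
End Sub
```

If the input must be rejected rather than repaired, the NewUnicodeEncoding approach shown above is the one to use.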

Read Only.

Examples

The following example encodes the same character array with five encodings. For each encoding it prints the exact number of bytes required, the maximum byte count for an array of that length, and the resulting bytes.

Public Sub Main()
    Dim Chars() As Integer
    Dim U7      As Encoding
    Dim U8      As Encoding
    Dim U16LE   As Encoding
    Dim U16BE   As Encoding
    Dim U32     As Encoding
    
    ' The characters to encode:
    '    Latin Small Letter Z (U+007A)
    '    Latin Small Letter A (U+0061)
    '    Combining Breve (U+0306)
    '    Latin Small Letter AE With Acute (U+01FD)
    '    Greek Small Letter Beta (U+03B2)
    '    a high-surrogate value (U+D8FF)
    '    a low-surrogate value (U+DCFF)
    Chars = NewChars("z", "a", ChrW$(&H306), ChrW$(&H1FD), ChrW$(&H3B2), ChrW$(&HD8FF), ChrW$(&HDCFF))
    
    Set U7 = Encoding.UTF7
    Set U8 = Encoding.UTF8
    Set U16LE = Encoding.Unicode
    Set U16BE = Encoding.BigEndianUnicode
    Set U32 = Encoding.UTF32
        
    PrintCountsAndBytes Chars, U7
    PrintCountsAndBytes Chars, U8
    PrintCountsAndBytes Chars, U16LE
    PrintCountsAndBytes Chars, U16BE
    PrintCountsAndBytes Chars, U32
End Sub

Private Sub PrintCountsAndBytes(ByRef Chars() As Integer, ByVal Enc As Encoding)
    Dim IBC     As Long
    Dim IMBC    As Long
    Dim Bytes() As Byte
    
    Debug.Print CorString.Format("{0,-30} :", Enc.ToString);
    
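    ' Exact number of bytes needed to encode these characters.
    IBC = Enc.GetByteCount(Chars)
    Debug.Print CorString.Format(" {0,-3}", IBC);
        
    ' Upper bound on the byte count for any array of this length.
    IMBC = Enc.GetMaxByteCount(CorArray.Length(Chars))
    Debug.Print CorString.Format(" {0,-3} :", IMBC);
    
    Bytes = Enc.GetBytes(Chars)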
    
    PrintHexBytes Bytes
End Sub

Private Sub PrintHexBytes(ByRef Bytes() As Byte)
    Dim i As Long
    
    If CorArray.IsNullOrEmpty(Bytes) Then
        Debug.Print "<none>"
    Else
        For i = 0 To UBound(Bytes)
            Debug.Print CorString.Format("{0:X2} ", Bytes(i));
        Next
        
        Debug.Print
    End If
End Sub

' This code produces the following output.
'
'    CorLib.UTF7Encoding            : 18  23  :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
'    CorLib.UTF8Encoding            : 12  24  :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
'    CorLib.UnicodeEncoding         : 14  16  :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
'    CorLib.UnicodeEncoding         : 14  16  :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
'    CorLib.UTF32Encoding           : 24  32  :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00
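
The last two characters in the input, U+D8FF and U+DCFF, form a surrogate pair, which is why the UTF8Encoding and UTF32Encoding rows end with a single four-byte sequence instead of two separate ones. The sketch below works through the standard UTF-16 surrogate arithmetic in plain VB6. The helper name CodePointFromSurrogates is hypothetical and is not part of CorLib.

```vb
' Combines a UTF-16 high and low surrogate into one Unicode code point.
' Hypothetical helper for illustration only; not a CorLib function.
Private Function CodePointFromSurrogates(ByVal High As Long, ByVal Low As Long) As Long
    ' The trailing & forces Long literals; &HD800 alone would be a negative Integer.
    CodePointFromSurrogates = (High - &HD800&) * &H400& + (Low - &HDC00&) + &H10000
End Function

Public Sub Main()
    Dim CodePoint As Long

    CodePoint = CodePointFromSurrogates(&HD8FF&, &HDCFF&)
    Debug.Print Hex$(CodePoint)    ' prints 4FCFF
End Sub
```

Written in little endian order as a 32-bit value, U+4FCFF gives the bytes FF FC 04 00 shown at the end of the UTF32Encoding row.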

See Also

Project CorLib Overview

Class EncodingStatic Overview