Unicode

This is a near-universal converter for various Unicode encodings.

Supported formats

Encodings

The Raw format simply prints the string as literal characters. Note that most valid code points are either unassigned or have no glyph in the current font, so such characters may be missing or displayed as a square box.

This converter is only for Unicode text. Use the binary converter for arbitrary byte sequences.
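
As a rough illustration of the difference between text and arbitrary bytes, the Python sketch below checks whether a byte sequence is well-formed UTF-8. The helper name is_utf8_text is hypothetical and the choice of UTF-8 is an assumption made for the example; neither is taken from the converter itself.

```python
def is_utf8_text(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


print(is_utf8_text("é".encode("utf-8")))  # True
print(is_utf8_text(b"\xff\xfe\x00"))      # False: the byte 0xFF never occurs in UTF-8
```

Byte sequences that fail a check like this are the kind of input better suited to the binary converter.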

Bases

Numbers can be represented in binary, octal, decimal or hexadecimal. When encoding bytes, the binary, octal and hexadecimal representations are padded to a fixed length (8, 3 and 2 digits per byte, respectively). The decimal representation of bytes, and all representations of code points, are space-separated.
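
The padding and separator rules can be sketched in Python as follows. The function names format_bytes and format_code_points are hypothetical, and two details are assumptions rather than documented behaviour: that the fixed-length byte forms are concatenated without separators, and that hexadecimal digits are lowercase.

```python
def format_bytes(data: bytes, base: str) -> str:
    """Format bytes: fixed-width digits for binary, octal and hex; spaces for decimal."""
    if base == "binary":
        return "".join(f"{b:08b}" for b in data)  # 8 digits per byte
    if base == "octal":
        return "".join(f"{b:03o}" for b in data)  # 3 digits per byte
    if base == "hexadecimal":
        return "".join(f"{b:02x}" for b in data)  # 2 digits per byte
    if base == "decimal":
        return " ".join(str(b) for b in data)     # variable width, so space-separated
    raise ValueError(f"unknown base: {base}")


def format_code_points(text: str, base: str) -> str:
    """Format code points: always space-separated, with no padding."""
    spec = {"binary": "b", "octal": "o", "decimal": "d", "hexadecimal": "x"}[base]
    return " ".join(format(ord(ch), spec) for ch in text)


data = "é".encode("utf-8")                       # the two bytes 0xC3 0xA9
print(format_bytes(data, "binary"))              # 1100001110101001
print(format_bytes(data, "octal"))               # 303251
print(format_bytes(data, "hexadecimal"))         # c3a9
print(format_bytes(data, "decimal"))             # 195 169
print(format_code_points("é€", "hexadecimal"))   # e9 20ac
```

Fixed-width output can be split back into bytes unambiguously, which is why only the variable-width forms need a separator.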

Base64 is a special encoding that represents byte sequences as alphanumeric characters (plus the characters / and +, with = used as padding at the end), using four characters for every three bytes.
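
For illustration, here is a minimal Python sketch using the standard library's base64 module. It is not the converter's own implementation, and encoding the text as UTF-8 first is an assumption made for the example.

```python
import base64

# Three input bytes map to exactly four output characters.
three_bytes = base64.b64encode(b"Man").decode("ascii")
print(three_bytes)  # TWFu

# Two input bytes still produce four characters; "=" pads the final group.
data = "é".encode("utf-8")  # the two bytes 0xC3 0xA9
padded = base64.b64encode(data).decode("ascii")
print(padded)  # w6k=

# Decoding restores the original bytes.
restored = base64.b64decode(padded)
print(restored == data)  # True
```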

The PGP word list encodes bytes as a sequence of words, and is useful for conveying data over an audio channel.

Note that both Base64 and PGP words encode a byte stream and cannot be used with code points, which are integers rather than bytes.
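
The distinction between code points and bytes can be seen in a short Python sketch. The choice of UTF-8 and UTF-16 here is only an assumption for the example.

```python
text = "€"  # U+20AC EURO SIGN

# A code point is a single integer and may be far larger than 255.
code_points = [ord(ch) for ch in text]
print(code_points)  # [8364]

# An encoding turns that integer into one or more bytes (each 0 to 255).
utf8_bytes = list(text.encode("utf-8"))
utf16_bytes = list(text.encode("utf-16-be"))
print(utf8_bytes)   # [226, 130, 172]
print(utf16_bytes)  # [32, 172]
```

Byte-oriented formats such as Base64 operate on lists like utf8_bytes, so the text has to be encoded to bytes before they can be applied.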