Huffman code
This is a Huffman coding compression tool. The input text is profiled for symbol frequencies to generate an optimal variable-length prefix-free code, and is then compressed into bytes using that code.
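As a rough illustration of the profiling step, the sketch below builds a prefix-free code from character frequencies using the standard heap-based Huffman construction. The function name `build_codes` and the tuple-based tree layout are assumptions made for this example, not the tool's actual internals.

```python
import heapq
from collections import Counter


def build_codes(text):
    """Return {symbol: bitstring} for a Huffman code over the characters of text."""
    freq = Counter(text)
    # Heap entries are (weight, tiebreak, tree). A tree is either a single
    # symbol (leaf) or a (left, right) tuple (internal node). The unique
    # tiebreak keeps Python from ever comparing two trees directly.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees under a new internal node.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, tiebreak, (t1, t2)))
        tiebreak += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            # A one-symbol alphabet still needs a non-empty code.
            codes[tree] = prefix or "0"

    if heap:
        walk(heap[0][2], "")
    return codes
```

For example, `build_codes("aaaabbc")` gives the most frequent symbol, `a`, a one-bit code and gives `b` and `c` two-bit codes, so the seven characters compress to ten bits.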
The compression statistics include the description of the Huffman tree itself.
The tree here is represented as a preorder traversal, with the internal node symbol defaulting to ~ (thus ~a~bc stands for encoding a, b, c as 0, 10, and 11 respectively). This encoding is 2N-1 characters for an alphabet of size N, as there are always N-1 internal nodes in the tree.
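The preorder description above can be read back into a code table with a short recursive walk. The following is a minimal sketch, assuming a tree is held as nested `(left, right)` tuples with single-character leaves; the names `encode_tree` and `decode_tree` are hypothetical and chosen only for this example.

```python
def encode_tree(tree, internal="~"):
    """Serialize a tree (symbol or (left, right) tuple) in preorder."""
    if isinstance(tree, tuple):
        left, right = tree
        return internal + encode_tree(left, internal) + encode_tree(right, internal)
    return tree


def decode_tree(description, internal="~"):
    """Parse a preorder description into {symbol: bitstring}."""
    codes = {}
    pos = 0

    def walk(prefix):
        nonlocal pos
        ch = description[pos]
        pos += 1
        if ch == internal:
            # Internal node: left child extends the code with 0, right with 1.
            walk(prefix + "0")
            walk(prefix + "1")
        else:
            codes[ch] = prefix

    walk("")
    return codes
```

With these definitions, `decode_tree("~a~bc")` yields `{"a": "0", "b": "10", "c": "11"}`, and the description is 5 characters long for N = 3 symbols, matching 2N-1. Note that in this sketch a leaf cannot be the same character as the internal node marker, so a different marker would be needed if ~ occurs in the input.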
The first four bytes of the output are a length header, which ensures that the overhang bits padding out the final byte are discarded while decoding.
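To show why a length header is needed, the sketch below packs a bit string into bytes and uses a four-byte header to trim the padding on the way back. The header's exact contents and byte order are not specified above, so this example assumes it stores the number of valid bits as a big-endian unsigned 32-bit integer; the tool itself may count something else (for instance, decoded symbols) or use a different byte order. The names `pack_bits` and `unpack_bits` are hypothetical.

```python
import struct


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes behind a 4-byte header."""
    # Pad with zero bits up to the next byte boundary.
    padded = bits + "0" * (-len(bits) % 8)
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    # Assumption: header is the valid bit count, big-endian unsigned 32-bit.
    return struct.pack(">I", len(bits)) + body


def unpack_bits(data):
    """Recover the original bit string, dropping the padding bits."""
    (nbits,) = struct.unpack(">I", data[:4])
    bits = "".join(f"{byte:08b}" for byte in data[4:])
    return bits[:nbits]
```

Here `pack_bits("01011")` produces five bytes: the header `00 00 00 05` followed by `0x58`, and `unpack_bits` returns the original five bits rather than all eight. Without the header, the three trailing zero bits could be misread as extra symbols.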