Module De

RFC1951/DEFLATE codec.

DEFLATE is a compression format specified by the IETF in RFC 1951. This module provides a non-blocking streaming codec to encode and decode it. It works efficiently payload by payload, without blocking on IO.

The module provides an LZ77 compression algorithm, but the client may supply their own as long as it uses the shared queue provided in this module.

LZ77 compression and Huffman compression.

LZ77 compresses by searching for repeated patterns and emitting a `Copy code (see Queue.cmd), which says to copy a previous pattern. For instance, the following list of Queue.cmds is equivalent to the following string:

  let cmds = [ `Literal 'a'; `Copy (1, 3) ]
  let results = "aaaa"

The goal of LZ77 is to produce such a list from a string, which a format such as DEFLATE can then use to compress its input.
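The equivalence above can be checked with a minimal sketch that expands a list of commands back into a string. The cmd type and the expand function below are local stand-ins written for illustration, not part of de, and the reading of `Copy (off, len) as "copy len bytes from off bytes back" is inferred from the example above.

```ocaml
(* Local stand-in for the command type (assumption: the real one is
   exposed through Queue.cmd). *)
type cmd = [ `Literal of char | `Copy of int * int ]

(* [expand cmds] rebuilds the string described by [cmds].
   [`Copy (off, len)] appends [len] bytes, each one read [off] bytes
   behind the current end of the output. Copying byte by byte lets a
   copy overlap the bytes it is producing, which is how [`Copy (1, 3)]
   repeats a single ['a'] three times. *)
let expand (cmds : cmd list) : string =
  let buf = Buffer.create 16 in
  let apply = function
    | `Literal chr -> Buffer.add_char buf chr
    | `Copy (off, len) ->
      if off < 1 || off > Buffer.length buf then
        invalid_arg "expand: bad offset";
      for _i = 1 to len do
        Buffer.add_char buf (Buffer.nth buf (Buffer.length buf - off))
      done
  in
  List.iter apply cmds;
  Buffer.contents buf
```

With this sketch, expand [ `Literal 'a'; `Copy (1, 3) ] gives back "aaaa", matching results above.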

Def performs Huffman compression. From a list of Queue.cmd, it can calculate frequencies (see literals and distances) and generate a smaller representation of the given alphabet. Such compression is available via the Def.kind.Fixed or the Def.kind.Dynamic block. The Def.kind.Flat block copies literals into the output as they are.

NOTE: It's illegal to emit a `Copy Queue.cmd and try to serialize it with a Def.kind.Flat block. Queue.eob is required for Def.kind.Fixed and Def.kind.Dynamic (to delimit the end of the block) and ignored by the Def.kind.Flat block.
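The rule in the note can be expressed as a small check run before choosing a flat block. This is only a sketch: the cmd type is a local stand-in, `End is assumed here to model Queue.eob, and fits_flat is a hypothetical helper that de does not provide.

```ocaml
(* Local stand-in for the command type; [`End] is assumed to play the
   role of Queue.eob. *)
type cmd = [ `Literal of char | `Copy of int * int | `End ]

(* [fits_flat cmds] is [true] when [cmds] contains no [`Copy], that is,
   when every command can be stored verbatim. [`End] is accepted
   because a flat block ignores it. *)
let fits_flat (cmds : cmd list) : bool =
  List.for_all (function `Copy _ -> false | `Literal _ | `End -> true) cmds
```

A list for which fits_flat returns false has to go through a Def.kind.Fixed or Def.kind.Dynamic block instead.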

Prelude.

de wants to be self-contained. Because of this constraint, it provides convenience values to be used by other libraries (like zl). The client should not use these values even if they are available. Other libraries like Bigstringaf serve the same purpose in a much better way.

The type for bigstring.

type optint = Optint.t

The type for optimal integers.

val bigstring_empty : bigstring

An empty bigstring.

val bigstring_create : int -> bigstring

bigstring_create len returns an uninitialized bigstring of length len.

val bigstring_length : bigstring -> int

bigstring_length t is the length of the bigstring t, in bytes.

val io_buffer_size : int

Window.

type window

The type for windows.

val make_window : bits:int -> window

make_window allocates a new buffer which represents a window. It is used by the decoder and the LZ77 compressor to keep track of older inputs, in order to:

  • process a copy from a distance, in the decoder.
  • generate a copy, in the compression algorithm.

val window_bits : window -> int
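To illustrate how a window lets a decoder process a copy from a distance, here is a minimal sketch of a sliding window as a ring buffer of 1 lsl bits bytes. It is not de's implementation: the record type and the make, push and copy functions are hypothetical names, and the upper bound of 15 bits is an assumption taken from the maximum DEFLATE distance of 32768.

```ocaml
(* Hypothetical sliding window: a ring buffer of [1 lsl bits] bytes.
   [pos] counts every byte ever pushed; [mask] wraps it into [buf]. *)
type window = { buf : Bytes.t; mask : int; mutable pos : int }

let make ~bits =
  (* Assumption: DEFLATE distances never exceed 32768 = 1 lsl 15. *)
  if bits < 1 || bits > 15 then invalid_arg "make: bits must be in 1..15";
  let size = 1 lsl bits in
  { buf = Bytes.make size '\000'; mask = size - 1; pos = 0 }

(* Record one output byte, overwriting the oldest one once full. *)
let push w chr =
  Bytes.set w.buf (w.pos land w.mask) chr;
  w.pos <- w.pos + 1

(* [copy w ~off ~len] replays [len] bytes starting [off] bytes back.
   Each byte is pushed as soon as it is read, so a copy may overlap
   the bytes it is producing. The replayed bytes are returned. *)
let copy w ~off ~len =
  if off < 1 || off > w.pos || off > w.mask + 1 then
    invalid_arg "copy: bad offset";
  let out = Buffer.create len in
  for _i = 1 to len do
    let chr = Bytes.get w.buf ((w.pos - off) land w.mask) in
    push w chr;
    Buffer.add_char out chr
  done;
  Buffer.contents out
```

A larger bits value lets copies reach further back at the cost of a larger buffer; in this sketch the buffer is allocated once by make and then reused.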

DEFLATE Decoder.

Decoder of the RFC1951 DEFLATE codec. de provides an Inf.decoder to decode DEFLATE input and inflate it.

module Inf : sig ... end

Queue.

The DEFLATE encoder needs a compressed input, which is transmitted through a shared queue filled by the compression algorithm. Queue sits between Def and a compression algorithm like Lz77. It provides a compact representation of the commands (see Queue.cmd) emitted by the compression algorithm.

The Def encoder interprets Queue.cmd as fast as it can. The shared queue can become a bottleneck for the whole compression process, since it limits how many bytes the encoder can produce. We recommend making the queue as large as the output buffer.

module Queue : sig ... end

Frequencies.

A DYNAMIC DEFLATE block needs the frequencies of the codes emitted by the compression algorithm. literals and distances exist to keep track of these frequencies during the compression process.

type literals = private int array

The type of frequencies of literals (including lengths).

type distances = private int array

The type of frequencies of distances.

val make_literals : unit -> literals

make_literals allocates a new literals where the frequencies of symbols (except End Of Block) are set to 0.

val make_distances : unit -> distances

make_distances allocates a new distances where the frequencies of distance symbols are set to 0.

val succ_literal : literals -> char -> unit

succ_literal literals chr increases the frequency of chr.

val succ_length : literals -> int -> unit

succ_length literals l increases the frequency of the length code for l. According to the DEFLATE codec, l must be greater than 2 and less than 259. Otherwise, it raises Invalid_argument.

val succ_distance : distances -> int -> unit

succ_distance distances d increases the frequency of the distance code for d. According to the DEFLATE codec, d must be greater than 0 and less than 32769. Otherwise, it raises Invalid_argument.
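The frequency accounting described above can be sketched with plain arrays. The base tables come from RFC 1951 (length symbols 257 to 285, distance symbols 0 to 29). Everything else is an assumption made for illustration: the cmd type is a local stand-in, code_of and count are hypothetical helpers, and the array layout and the End Of Block count of 1 may differ from what de stores internally.

```ocaml
(* Smallest length covered by each length symbol 257..285 (RFC 1951). *)
let base_lengths =
  [| 3; 4; 5; 6; 7; 8; 9; 10; 11; 13; 15; 17; 19; 23; 27; 31; 35; 43; 51; 59;
     67; 83; 99; 115; 131; 163; 195; 227; 258 |]

(* Smallest distance covered by each distance symbol 0..29 (RFC 1951). *)
let base_distances =
  [| 1; 2; 3; 4; 5; 7; 9; 13; 17; 25; 33; 49; 65; 97; 129; 193; 257; 385;
     513; 769; 1025; 1537; 2049; 3073; 4097; 6145; 8193; 12289; 16385; 24577 |]

(* Index of the last entry of [bases] that is <= [v]. *)
let code_of bases v =
  let rec go i =
    if i + 1 < Array.length bases && bases.(i + 1) <= v then go (i + 1) else i
  in
  go 0

(* Local stand-in for the command type. *)
type cmd = [ `Literal of char | `Copy of int * int ]

(* [count cmds] returns the (literal/length, distance) frequency arrays. *)
let count (cmds : cmd list) =
  let literals = Array.make 286 0 and distances = Array.make 30 0 in
  (* Assumption: End Of Block (symbol 256) occurs once per block. *)
  literals.(256) <- 1;
  let bump arr i = arr.(i) <- arr.(i) + 1 in
  List.iter
    (function
      | `Literal chr -> bump literals (Char.code chr)
      | `Copy (off, len) ->
        if len < 3 || len > 258 then invalid_arg "count: bad length";
        if off < 1 || off > 32768 then invalid_arg "count: bad distance";
        bump literals (257 + code_of base_lengths len);
        bump distances (code_of base_distances off))
    cmds;
  (literals, distances)
```

Several lengths or distances share one symbol (for instance lengths 11 and 12 both map to symbol 265), which is why frequencies are kept per code rather than per raw value.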

DEFLATE Encoder.

module Def : sig ... end

LZ77 compression algorithm.

The distribution provides an LZ77 compression algorithm which can be used with Def. However, the client should be aware that other algorithms exist. This algorithm is used by zl to implement the ZLIB layer.

module Lz77 : sig ... end

Higher API.

de provides a useful but complex API. This sub-module provides an easier way to compress/uncompress with the DEFLATE codec. Even though the client can still tune some details, we recommend using Inf and Def if you want precise control over memory consumption.

module Higher : sig ... end

val unsafe_set_cursor : Inf.decoder -> int -> unit
module Lookup : sig ... end
module T : sig ... end