Module Stdlib.Uchar
Unicode characters.
The type for Unicode characters.
A value of this type represents a Unicode scalar value, that is, an integer in the ranges 0x0000...0xD7FF or 0xE000...0x10FFFF.
min is U+0000.
max is U+10FFFF.
bom is U+FEFF, the byte order mark (BOM) character.
rep is U+FFFD, the replacement character.
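As a quick illustration of the four constants above, the following sketch prints each one as a code point. The list name `constants` is only for this example and is not part of the module.

```ocaml
(* The predefined constants, paired with their names for display. *)
let constants =
  [ "min", Uchar.min; "max", Uchar.max; "bom", Uchar.bom; "rep", Uchar.rep ]

let () =
  List.iter
    (fun (name, u) -> Printf.printf "%s = U+%04X\n" name (Uchar.to_int u))
    constants
(* Prints:
   min = U+0000
   max = U+10FFFF
   bom = U+FEFF
   rep = U+FFFD *)
```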
succ u is the scalar value after u in the set of Unicode scalar values. Raises Invalid_argument if u is max.
pred u is the scalar value before u in the set of Unicode scalar values. Raises Invalid_argument if u is min.
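Because succ and pred operate on the set of scalar values rather than on plain integers, they step over the surrogate range 0xD800...0xDFFF. The sketch below shows this at the boundary. The helper `succ_opt` is hypothetical (not part of the module) and simply avoids the Invalid_argument exception that succ raises on Uchar.max.

```ocaml
(* U+D7FF is the last scalar value before the surrogate gap. *)
let last_before_gap = Uchar.of_int 0xD7FF

(* succ jumps over the surrogates and lands on U+E000. *)
let first_after_gap = Uchar.succ last_before_gap

(* pred performs the inverse jump. *)
let back_again = Uchar.pred first_after_gap

(* Hypothetical helper: a total variant of succ that returns None at max. *)
let succ_opt u =
  if Uchar.equal u Uchar.max then None else Some (Uchar.succ u)
```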
is_valid n is true if and only if n is a Unicode scalar value (i.e. in the ranges 0x0000...0xD7FF or 0xE000...0x10FFFF).
of_int i is i as a Unicode character. Raises Invalid_argument if i does not satisfy is_valid.
to_int u is u as an integer.
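A common pattern is to guard of_int with is_valid when the integer comes from untrusted input. The following is a minimal sketch of that pattern; `uchar_of_int_opt` is a hypothetical helper name, not a function of the module.

```ocaml
(* Hypothetical helper: convert an integer to a Unicode character only
   when it is a scalar value, instead of letting of_int raise. *)
let uchar_of_int_opt i =
  if Uchar.is_valid i then Some (Uchar.of_int i) else None

(* to_int recovers the integer from a successfully converted character. *)
let roundtrip i = Option.map Uchar.to_int (uchar_of_int_opt i)
```

With this helper, surrogates such as 0xD800, negative integers, and values above 0x10FFFF all map to None.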
is_char u is true if and only if u is a Latin-1 OCaml character.
of_char c is c as a Unicode character.
to_char u is u as an OCaml Latin-1 character. Raises Invalid_argument if u does not satisfy is_char.
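The three char conversions can be combined into a safe narrowing function. This is a sketch only; `to_char_opt` is a hypothetical name introduced for the example.

```ocaml
(* Hypothetical helper: narrow a Unicode character to an OCaml char when
   it fits in Latin-1 (U+0000 to U+00FF), and return None otherwise. *)
let to_char_opt u =
  if Uchar.is_char u then Some (Uchar.to_char u) else None

(* of_char always succeeds, since every OCaml char is a Latin-1 character. *)
let a = Uchar.of_char 'a'
```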
equal u u' is u = u'.
compare u u' is Stdlib.compare u u'.
hash u associates a non-negative integer to u.
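Since the module exposes a type t together with compare, equal and hash, it should be usable directly as the argument of the standard Map.Make and Hashtbl.Make functors. The sketch below assumes exactly that; the module names `Umap` and `Utbl` and the helper `bump` are illustrative only.

```ocaml
(* Uchar provides t and compare, which is what Map.Make expects. *)
module Umap = Map.Make (Uchar)

(* Uchar provides t, equal and hash, which is what Hashtbl.Make expects. *)
module Utbl = Hashtbl.Make (Uchar)

(* A small map from characters to descriptions. *)
let names =
  Umap.empty
  |> Umap.add Uchar.bom "byte order mark"
  |> Umap.add Uchar.rep "replacement character"

(* A table counting how often each character has been seen. *)
let counts : int Utbl.t = Utbl.create 16

(* Hypothetical helper: increment the count stored for u. *)
let bump u =
  let n = Option.value (Utbl.find_opt counts u) ~default:0 in
  Utbl.replace counts u (n + 1)
```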
UTF codecs tools
The type for UTF decode results. Values of this type represent the result of a Unicode Transformation Format decoding attempt.
utf_decode_is_valid d is true if and only if d holds a valid decode.
utf_decode_uchar d is the Unicode character decoded by d if utf_decode_is_valid d is true and Uchar.rep otherwise.
utf_decode_length d is the number of elements from the source that were consumed by the decode d. This is always strictly positive and less than or equal to 4. The kind of source elements depends on the actual decoder; for the decoders of the standard library this function always returns a length in bytes.
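The three accessors above are typically used together in a decoding loop. The sketch below assumes an OCaml version that provides String.get_utf_8_uchar (a function of the String module, not of this one) and uses it as the decoder; `decode_utf_8` is a hypothetical helper written for this example.

```ocaml
(* Decode a UTF-8 string into a list of Unicode characters.
   Returns the characters together with the number of invalid decodes.
   Invalid byte sequences contribute Uchar.rep to the result. *)
let decode_utf_8 s =
  let rec loop i acc errors =
    if i >= String.length s then (List.rev acc, errors)
    else
      let d = String.get_utf_8_uchar s i in
      let errors =
        if Uchar.utf_decode_is_valid d then errors else errors + 1
      in
      (* utf_decode_length is at least 1, so the loop always advances. *)
      loop (i + Uchar.utf_decode_length d)
        (Uchar.utf_decode_uchar d :: acc)
        errors
  in
  loop 0 [] 0
```

Advancing by utf_decode_length even on invalid decodes lets the loop resynchronise after malformed input instead of stopping at the first error.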
utf_decode n u is a valid UTF decode for u that consumed n elements from the source for decoding. n must be positive and less than or equal to 4 (this is not checked by the module).
utf_decode_invalid n is an invalid UTF decode that consumed n elements from the source before the error was detected. n must be positive and less than or equal to 4 (this is not checked by the module). The resulting decode has rep as the decoded Unicode character.
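These two constructors are meant for code that implements its own decoder. As a minimal sketch, the hypothetical function `get_ascii_uchar` below decodes a 7-bit ASCII source one byte at a time and reports any byte above 0x7F as an invalid decode. It is an illustration, not a decoder from the standard library.

```ocaml
(* Hypothetical decoder: treat the byte at index i of s as 7-bit ASCII.
   Each decode, valid or not, consumes exactly one source element. *)
let get_ascii_uchar s i =
  let b = Char.code s.[i] in
  if b < 0x80 then Uchar.utf_decode 1 (Uchar.of_int b)
  else Uchar.utf_decode_invalid 1

let ok = get_ascii_uchar "A" 0
let bad = get_ascii_uchar "\xE9" 0
```

Here `ok` is a valid decode of U+0041, while `bad` is invalid and reports Uchar.rep as its character.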
utf_8_byte_length u is the number of bytes needed to encode u in UTF-8.
utf_16_byte_length u is the number of bytes needed to encode u in UTF-16.
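To make the two length functions concrete, the sketch below prints the encoded sizes of a few code points chosen to cover each UTF-8 length. The helper `show` and the sample list are illustrative only.

```ocaml
(* Print how many bytes a code point occupies in UTF-8 and in UTF-16. *)
let show cp =
  let u = Uchar.of_int cp in
  Printf.printf "U+%04X: %d byte(s) in UTF-8, %d byte(s) in UTF-16\n" cp
    (Uchar.utf_8_byte_length u)
    (Uchar.utf_16_byte_length u)

(* One sample per UTF-8 length: ASCII, Latin-1, BMP, supplementary plane. *)
let () = List.iter show [ 0x41; 0xE9; 0x20AC; 0x1F42B ]
```

Note that utf_16_byte_length counts bytes, not 16-bit code units, so a character that needs a surrogate pair reports 4.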