Module Misc.Utf8_lexeme
val normalize :
string ->
(Misc.Utf8_lexeme.t, Misc.Utf8_lexeme.t) Stdlib.Result.t
Normalize the given UTF-8 encoded string. Invalid UTF-8 sequences result in an error and are replaced by U+FFFD. Identifier characters are put in NFC normalized form. Other Unicode characters are left unchanged.
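Both branches of the result carry a string, so callers can choose whether to accept the repaired text. The following is a minimal sketch, assuming the program is linked against compiler-libs.common (where Misc lives) and that Misc.Utf8_lexeme.t is an alias for string; describe is a hypothetical helper name, not part of the module.

```ocaml
(* Assumption: linked against compiler-libs.common, and
   Misc.Utf8_lexeme.t is an alias for string. *)
let describe (s : string) : string =
  match Misc.Utf8_lexeme.normalize s with
  | Ok normalized -> "valid: " ^ normalized
  | Error repaired -> "invalid UTF-8, repaired: " ^ repaired

let () =
  (* ASCII input is already in NFC, so it comes back unchanged. *)
  print_endline (describe "foo");
  (* The byte 0xFF never occurs in valid UTF-8, so this takes the Error branch. *)
  print_endline (describe "a\xffb")
```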
val capitalize :
string ->
(Misc.Utf8_lexeme.t, Misc.Utf8_lexeme.t) Stdlib.Result.t
Like normalize, but if the string starts with a lowercase identifier character, it is replaced by the corresponding uppercase character. Subsequent characters are not changed.
val uncapitalize :
string ->
(Misc.Utf8_lexeme.t, Misc.Utf8_lexeme.t) Stdlib.Result.t
Like normalize, but if the string starts with an uppercase identifier character, it is replaced by the corresponding lowercase character. Subsequent characters are not changed.
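As an illustration of capitalize and uncapitalize, here is a short sketch under the same assumptions as elsewhere on this page (linking against compiler-libs.common, and Misc.Utf8_lexeme.t being an alias for string). The helper names module_name_of and value_name_of are hypothetical, and turning an Error into None is one possible policy, not the library's.

```ocaml
(* Assumption: linked against compiler-libs.common. *)

(* Hypothetical helper: capitalize the first character, rejecting invalid UTF-8. *)
let module_name_of (base : string) : string option =
  Result.to_option (Misc.Utf8_lexeme.capitalize base)

(* Hypothetical helper: the inverse direction, via uncapitalize. *)
let value_name_of (name : string) : string option =
  Result.to_option (Misc.Utf8_lexeme.uncapitalize name)

let () =
  match module_name_of "foo", value_name_of "Bar" with
  | Some m, Some v -> Printf.printf "%s %s\n" m v
  | _ -> prerr_endline "invalid UTF-8 in input"
```

Only the first character is affected, so "fooBar" capitalizes to "FooBar" rather than "FOOBAR".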
val is_capitalized : Misc.Utf8_lexeme.t -> bool
Returns true if the given normalized string starts with an uppercase identifier character, false otherwise. May return wrong results if the string is not normalized.
val is_valid_identifier : Misc.Utf8_lexeme.t -> bool
Check whether the given normalized string is a valid OCaml identifier:
- all characters are identifier characters
- it does not start with a digit or a single quote
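Because the check expects a normalized string, raw input would typically go through normalize first. A minimal sketch of that pipeline, assuming linking against compiler-libs.common; check_raw is a hypothetical helper, and treating invalid UTF-8 as "not an identifier" is a choice made here for illustration.

```ocaml
(* Assumption: linked against compiler-libs.common. *)
let check_raw (raw : string) : bool =
  match Misc.Utf8_lexeme.normalize raw with
  | Ok id -> Misc.Utf8_lexeme.is_valid_identifier id
  | Error _ -> false  (* invalid UTF-8 is rejected outright in this sketch *)

let () =
  List.iter
    (fun s -> Printf.printf "%S -> %b\n" s (check_raw s))
    [ "foo_bar"; "1abc"; "'a"; "a b" ]
```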
val is_lowercase : Misc.Utf8_lexeme.t -> bool
Returns true if the given normalized string contains only lowercase identifier characters, false otherwise. May return wrong results if the string is not normalized.
type validation_result =
| Valid
| Invalid_character of Stdlib.Uchar.t (* Character not allowed *)
| Invalid_beginning of Stdlib.Uchar.t (* Character not allowed as first char *)
val validate_identifier :
?with_dot:bool ->
Misc.Utf8_lexeme.t ->
Misc.Utf8_lexeme.validation_result
Like is_valid_identifier, but returns a more detailed error code. Dots can be allowed (via with_dot) to extend support to path-like identifiers.
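The detailed result lends itself to building diagnostics. The sketch below assumes linking against compiler-libs.common; explain is a hypothetical helper and the message wording is illustrative only, not what the compiler prints.

```ocaml
(* Assumption: linked against compiler-libs.common. *)
let explain ?with_dot (id : Misc.Utf8_lexeme.t) : string =
  match Misc.Utf8_lexeme.validate_identifier ?with_dot id with
  | Misc.Utf8_lexeme.Valid -> "valid"
  | Misc.Utf8_lexeme.Invalid_character u ->
      Printf.sprintf "character U+%04X is not allowed" (Uchar.to_int u)
  | Misc.Utf8_lexeme.Invalid_beginning u ->
      Printf.sprintf "character U+%04X cannot start an identifier"
        (Uchar.to_int u)

let () =
  print_endline (explain "foo");
  print_endline (explain "1abc");
  print_endline (explain "a-b");
  (* Path-like input; with_dot asks the validator to accept dots. *)
  print_endline (explain ~with_dot:true "Foo.bar")
```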
val starts_like_a_valid_identifier : Misc.Utf8_lexeme.t -> bool
Checks whether the given normalized string starts with an identifier character other than a digit or a single quote. Subsequent characters are not checked.