Module Gen

Generators

Values of type 'a Gen.t represent a possibly infinite sequence of values of type 'a. One can only iterate once on the sequence, as it is consumed by iteration/deconstruction/access. None is returned when the generator is exhausted.

The submodule Restart provides utilities to work with restartable generators, that is, functions unit -> 'a Gen.t that make it possible to build as many generators from the same source as needed.

Global type declarations

type 'a t = unit -> 'a option

A generator may be called several times, yielding the next value each time. It returns None when no elements remain.
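As a minimal illustration of this protocol, the sketch below builds a generator by hand from a list and calls it until it is exhausted. The type is redeclared locally so the snippet stands alone, and gen_of_list is a hypothetical helper, not a function of this module.

```ocaml
(* Local stand-in for the documented type; not the library itself. *)
type 'a gen = unit -> 'a option

(* Hypothetical helper: a generator that walks a list exactly once. *)
let gen_of_list (l : 'a list) : 'a gen =
  let rest = ref l in
  fun () ->
    match !rest with
    | [] -> None
    | x :: tl -> rest := tl; Some x

let g = gen_of_list [1; 2]
let first = g ()   (* Some 1 *)
let second = g ()  (* Some 2 *)
let third = g ()   (* None: the generator is exhausted *)
let fourth = g ()  (* still None; g cannot be rewound *)
```

Each call mutates hidden state, which is why a transient generator can be traversed only once.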

type 'a gen = 'a t
module type S = Gen_intf.S

Transient generators

val get : 'a t -> 'a option

Get the next value.

val next : 'a t -> 'a option

Synonym for get.

val get_exn : 'a t -> 'a

Get the next value, or fail if the generator is exhausted.

val junk : 'a t -> unit

Discard the next value.

val repeatedly : (unit -> 'a) -> 'a t

Call the same function an infinite number of times (useful for instance if the function is a random generator).

Operations on transient generators

include S with type 'a t := 'a gen
val empty : 'a gen

Empty generator, with no elements

val singleton : 'a -> 'a gen

One-element generator

val return : 'a -> 'a gen

Alias to singleton

  • since 0.3
val repeat : 'a -> 'a gen

Repeat the same element endlessly

val iterate : 'a -> ('a -> 'a) -> 'a gen

iterate x f is [x; f x; f (f x); f (f (f x)); ...]

val unfold : ('b -> ('a * 'b) option) -> 'b -> 'a gen

Dual of fold, with a deconstructing operation. It keeps on unfolding the 'b value into a new 'b, and a 'a which is yielded, until None is returned.
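The unfolding process can be sketched as follows. This is an illustrative re-implementation written against the signature above, not the library's own code; to_list here is a local helper for collecting a finite generator.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

(* Sketch of unfold: thread a state of type 'b, yield an 'a at each step. *)
let unfold (f : 'b -> ('a * 'b) option) (seed : 'b) : 'a gen =
  let state = ref (Some seed) in
  fun () ->
    match !state with
    | None -> None
    | Some s ->
      (match f s with
       | None -> state := None; None
       | Some (x, s') -> state := Some s'; Some x)

(* Local helper: collect a finite generator, in order. *)
let rec to_list (g : 'a gen) : 'a list =
  match g () with
  | None -> []
  | Some x -> x :: to_list g

(* Count down from 3: yield n, continue from n - 1, stop below 1. *)
let countdown =
  to_list (unfold (fun n -> if n < 1 then None else Some (n, n - 1)) 3)
```

Here countdown evaluates to [3; 2; 1].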

val init : ?limit:int -> (int -> 'a) -> 'a gen

Calls the function, starting from 0, on increasing indices. If limit is provided and is a positive int, iteration will stop at the limit (excluded). For instance init ~limit:4 id will yield 0, 1, 2, and 3.

Basic combinators

Note: these combinators, when applied to generators (as opposed to restartable generators), consume their argument. Some consume it lazily, others eagerly, but in either case, once f gen has been called (with f a combinator), gen should no longer be used.
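A small sketch of why the argument should be left alone afterwards. The of_list and map functions below are local stand-ins written for this example (a lazy map in the spirit of the one documented further down), not the library's implementation.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

(* Lazy map sketch: nothing is pulled from g until the result is called. *)
let map f (g : 'a gen) : 'b gen =
  fun () -> match g () with None -> None | Some x -> Some (f x)

let g = of_list [1; 2; 3]
let doubled = map (fun x -> x * 2) g
let a = doubled ()  (* Some 2 *)
let stolen = g ()   (* Some 2: this element is now lost to doubled *)
let b = doubled ()  (* Some 6, not Some 4 *)
```

Because doubled and g share the same underlying state, reading from g directly silently removes elements from the mapped result.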

val is_empty : _ gen -> bool

Check whether the gen is empty. Pops an element, if any.

val fold : ('b -> 'a -> 'b) -> 'b -> 'a gen -> 'b

Fold on the generator, tail-recursively. Consumes the generator.

val reduce : ('a -> 'a -> 'a) -> 'a gen -> 'a

Fold on non-empty sequences. Consumes the generator.

val scan : ('b -> 'a -> 'b) -> 'b -> 'a gen -> 'b gen

Like fold, but keeping successive values of the accumulator. Consumes the generator.

val unfold_scan : ('b -> 'a -> 'b * 'c) -> 'b -> 'a gen -> 'c gen

A mix of unfold and scan. The current state is combined with the current element to produce a new state, and an output value of type 'c.

  • since 0.2.2
val iter : ('a -> unit) -> 'a gen -> unit

Iterate on the gen, consuming it.

val iteri : (int -> 'a -> unit) -> 'a gen -> unit

Iterate on elements with their index in the gen, from 0, consuming it.

val length : _ gen -> int

Length of a gen (linear time), consuming it

val map : ('a -> 'b) -> 'a gen -> 'b gen

Lazy map. No iteration is performed now, the function will be called when the result is traversed.

val mapi : (int -> 'a -> 'b) -> 'a gen -> 'b gen

Lazy map with indexing starting from 0. No iteration is performed now, the function will be called when the result is traversed.

  • since 0.5
val fold_map : ('b -> 'a -> 'b) -> 'b -> 'a gen -> 'b gen

Lazy fold and map. No iteration is performed now, the function will be called when the result is traversed. The result is an iterator over the successive states of the fold.

  • since 0.2.4
val append : 'a gen -> 'a gen -> 'a gen

Append the two gens; the result contains the elements of the first, then the elements of the second gen.

val flatten : 'a Gen_intf.gen gen -> 'a gen

Flatten the generator of generators

val flat_map : ('a -> 'b Gen_intf.gen) -> 'a gen -> 'b gen

Monadic bind; each element is transformed to a sub-gen which is then iterated on, before the next element is processed, and so on.

val mem : ?eq:('a -> 'a -> bool) -> 'a -> 'a gen -> bool

Is the given element a member of the gen?

val take : int -> 'a gen -> 'a gen

Take at most n elements

val drop : int -> 'a gen -> 'a gen

Drop n elements

val nth : int -> 'a gen -> 'a

n-th element, or Not_found

  • raises Not_found

    if the generator contains fewer than n elements

val take_nth : int -> 'a gen -> 'a gen

take_nth n g returns every element of g whose index is a multiple of n. For instance take_nth 2 (1--10) |> to_list will return 1;3;5;7;9
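The index rule can be sketched as below, with indices counted from 0. This is an illustrative re-implementation; of_list and to_list are local helpers, and rejecting n < 1 with Invalid_argument is an assumption of this sketch rather than documented behaviour.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

(* Sketch: keep the elements at indices 0, n, 2n, ... and skip the rest. *)
let take_nth n (g : 'a gen) : 'a gen =
  (* Assumption of this sketch: non-positive n is rejected. *)
  let () = if n < 1 then invalid_arg "take_nth" in
  let i = ref 0 in
  let rec next () =
    match g () with
    | None -> None
    | Some x ->
      let keep = !i mod n = 0 in
      incr i;
      if keep then Some x else next ()
  in
  next

let picked = to_list (take_nth 2 (of_list [1; 2; 3; 4; 5; 6; 7; 8; 9; 10]))
```

With the input 1 to 10, picked is [1; 3; 5; 7; 9], matching the example in the description.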

val filter : ('a -> bool) -> 'a gen -> 'a gen

Filter out elements that do not satisfy the predicate.

val take_while : ('a -> bool) -> 'a gen -> 'a gen

Take elements while they satisfy the predicate. The initial generator itself is not to be used anymore after this.

val fold_while : ('a -> 'b -> 'a * [ `Stop | `Continue ]) -> 'a -> 'b gen -> 'a

Fold elements until ('a, `Stop) is indicated by the accumulator.

  • since 0.2.4
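To illustrate the early exit of fold_while, here is a sketch written against its signature. It is not the library's code; in particular, leaving the unread elements in the input generator is a property of this sketch, not a documented guarantee.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

(* Sketch: fold until the step function answers `Stop or the input ends. *)
let fold_while f acc (g : 'b gen) =
  let rec go acc =
    match g () with
    | None -> acc
    | Some x ->
      (match f acc x with
       | acc', `Stop -> acc'
       | acc', `Continue -> go acc')
  in
  go acc

(* Sum elements, stopping as soon as the running total reaches 10. *)
let g = of_list [4; 5; 6; 7]
let total =
  fold_while
    (fun acc x ->
       let acc = acc + x in
       (acc, if acc >= 10 then `Stop else `Continue))
    0 g
let leftover = g ()
```

The fold stops after reading 6, so total is 15, and in this sketch the element 7 is still available from g.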
val drop_while : ('a -> bool) -> 'a gen -> 'a gen

Drop elements while they satisfy the predicate. The initial generator itself should not be used anymore, only the result of drop_while.

val filter_map : ('a -> 'b option) -> 'a gen -> 'b gen

Map some elements to 'b, dropping the others

val zip_index : 'a gen -> (int * 'a) gen

Zip elements with their index in the gen

val unzip : ('a * 'b) gen -> 'a gen * 'b gen

Unzip into two sequences, splitting each pair

val partition : ('a -> bool) -> 'a gen -> 'a gen * 'a gen

partition p l returns the elements that satisfy p, and the elements that do not satisfy p

val for_all : ('a -> bool) -> 'a gen -> bool

Is the predicate true for all elements?

val exists : ('a -> bool) -> 'a gen -> bool

Is the predicate true for at least one element?

val min : ?lt:('a -> 'a -> bool) -> 'a gen -> 'a

Minimum element, according to the given comparison function.

val max : ?lt:('a -> 'a -> bool) -> 'a gen -> 'a

Maximum element, see min

val eq : ?eq:('a -> 'a -> bool) -> 'a gen -> 'a gen -> bool

Equality of generators.

val lexico : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen -> int

Lexicographic comparison of generators. If a generator is a prefix of the other one, it is considered smaller.

val compare : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen -> int

Synonym for lexico

val find : ('a -> bool) -> 'a gen -> 'a option

find p e returns the first element of e to satisfy p, or None.

val sum : int gen -> int

Sum of all elements

Multiple iterators

val map2 : ('a -> 'b -> 'c) -> 'a gen -> 'b gen -> 'c gen

Map on the two sequences. Stops once one of them is exhausted.

val iter2 : ('a -> 'b -> unit) -> 'a gen -> 'b gen -> unit

Iterate on the two sequences. Stops once one of them is exhausted.

val fold2 : ('acc -> 'a -> 'b -> 'acc) -> 'acc -> 'a gen -> 'b gen -> 'acc

Fold the common prefix of the two iterators

val for_all2 : ('a -> 'b -> bool) -> 'a gen -> 'b gen -> bool

Succeeds if all pairs of elements satisfy the predicate. Ignores elements of an iterator if the other runs dry.

val exists2 : ('a -> 'b -> bool) -> 'a gen -> 'b gen -> bool

Succeeds if some pair of elements satisfy the predicate. Ignores elements of an iterator if the other runs dry.

val zip_with : ('a -> 'b -> 'c) -> 'a gen -> 'b gen -> 'c gen

Combine common part of the gens (stops when one is exhausted)

val zip : 'a gen -> 'b gen -> ('a * 'b) gen

Zip together the common part of the gens

Complex combinators

val merge : 'a Gen_intf.gen gen -> 'a gen

Pick elements fairly in each sub-generator. The merge of gens e1, e2, ... picks elements from e1, e2, e3, then again from e1, e2, and so on. Once a generator is empty, it is skipped; when they are all empty, and none remains in the input, their merge is also empty. For instance, merge [1;3;5] [2;4;6] will be, in some order, 1;2;3;4;5;6.

val intersection : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen -> 'a gen

Intersection of two sorted sequences. Only elements that occur in both inputs appear in the output

val sorted_merge : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen -> 'a gen

Merge two sorted sequences into a sorted sequence

val sorted_merge_n : ?cmp:('a -> 'a -> int) -> 'a gen list -> 'a gen

Sorted merge of multiple sorted sequences

val tee : ?n:int -> 'a gen -> 'a Gen_intf.gen list

Duplicate the gen into n generators (default 2). The generators share the same underlying instance of the gen, so the optimal case is when they are consumed evenly.

val round_robin : ?n:int -> 'a gen -> 'a Gen_intf.gen list

Split the gen into n generators in a fair way. Elements with index = k mod n will go to the k-th gen. n defaults to 2.

val interleave : 'a gen -> 'a gen -> 'a gen

interleave a b yields an element of a, then an element of b, and so on. When a generator is exhausted, this behaves like the other generator.
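The alternation, including the fallback once one side runs dry, might be sketched like this. It is an illustrative re-implementation with local of_list and to_list helpers, not the library's code.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

(* Sketch: alternate between a and b; if the preferred side is empty,
   fall back to the other one. *)
let interleave (a : 'a gen) (b : 'a gen) : 'a gen =
  let from_a = ref true in
  fun () ->
    let first, second = if !from_a then (a, b) else (b, a) in
    from_a := not !from_a;
    match first () with
    | Some _ as r -> r
    | None -> second ()

let mixed = to_list (interleave (of_list [1; 3; 5]) (of_list [2; 4; 6; 8; 10]))
```

Here mixed is [1; 2; 3; 4; 5; 6; 8; 10]: once the first generator is exhausted, the remaining elements of the second are yielded in order.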

val intersperse : 'a -> 'a gen -> 'a gen

Put the separator element between all elements of the given gen

val product : 'a gen -> 'b gen -> ('a * 'b) gen

Cartesian product, in no predictable order. Works even if some of the arguments are infinite.

val group : ?eq:('a -> 'a -> bool) -> 'a gen -> 'a list gen

Group equal consecutive elements together.

val uniq : ?eq:('a -> 'a -> bool) -> 'a gen -> 'a gen

Remove consecutive duplicate elements. Basically this is like fun e -> map List.hd (group e).

val sort : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen

Sort according to the given comparison function. The gen must be finite.

val sort_uniq : ?cmp:('a -> 'a -> int) -> 'a gen -> 'a gen

Sort and remove duplicates. The gen must be finite.

val chunks : int -> 'a gen -> 'a array gen

chunks n e returns a generator of arrays of length n, composed of successive elements of e. The last array may be smaller than n
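A sketch of the chunking behaviour, including the shorter final array. This is an illustrative re-implementation with local helpers; rejecting n < 1 is an assumption of the sketch.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

(* Sketch: each call pulls up to n elements and packs them into an array. *)
let chunks n (g : 'a gen) : 'a array gen =
  (* Assumption of this sketch: non-positive n is rejected. *)
  let () = if n < 1 then invalid_arg "chunks" in
  fun () ->
    let rec fill k acc =
      if k = 0 then acc
      else
        match g () with
        | None -> acc
        | Some x -> fill (k - 1) (x :: acc)
    in
    match fill n [] with
    | [] -> None
    | l -> Some (Array.of_list (List.rev l))

let pieces = to_list (chunks 2 (of_list [1; 2; 3; 4; 5]))
```

With five input elements and n = 2, pieces is [[|1; 2|]; [|3; 4|]; [|5|]].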

val permutations : 'a gen -> 'a list gen

Permutations of the gen.

  • since 0.2.2
val permutations_heap : 'a gen -> 'a array gen

Permutations of the gen, using Heap's algorithm.

  • since 0.2.3
val combinations : int -> 'a gen -> 'a list gen

Combinations of given length. The ordering of the elements within each combination is unspecified. Example (ignoring ordering): combinations 2 (1--3) |> to_list = [[1;2]; [1;3]; [2;3]]

  • since 0.2.2
val power_set : 'a gen -> 'a list gen

All subsets of the gen (in no particular order). The ordering of the elements within each subset is unspecified.

  • since 0.2.2

Basic conversion functions

val of_list : 'a list -> 'a gen

Enumerate elements of the list

val to_list : 'a gen -> 'a list

Non-tail-recursive transformation to a list, in the same order

val to_rev_list : 'a gen -> 'a list

Tail-recursive conversion to a list, in reverse order (more efficient)

val to_array : 'a gen -> 'a array

Convert the gen to an array (not very efficient)

val of_array : ?start:int -> ?len:int -> 'a array -> 'a gen

Iterate on (a slice of) the given array

val of_string : ?start:int -> ?len:int -> string -> char gen

Iterate on bytes of the string

val to_string : char gen -> string

Convert into a string

val to_buffer : Buffer.t -> char gen -> unit

Consumes the iterator and writes to the buffer

val rand_int : int -> int gen

Random ints in the given range.

val int_range : ?step:int -> int -> int -> int gen

int_range ~step a b generates integers between a and b, included, with steps of length step (1 if omitted). a is assumed to be smaller than b, otherwise the result will be empty.

  • parameter step

    step between two numbers; must not be zero, but it can be negative for decreasing ranges (since 0.5).
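One way to read the description of int_range, including negative steps, is sketched below. This is an assumption-laden re-implementation for illustration only: raising Invalid_argument on a zero step and the exact stopping test for decreasing ranges are choices of this sketch, not statements about the library.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

(* Sketch: walk from a towards b (inclusive) by increments of step. *)
let int_range ?(step = 1) a b : int gen =
  (* Assumption of this sketch: a zero step is rejected with an exception. *)
  let () = if step = 0 then invalid_arg "int_range" in
  let cur = ref a in
  fun () ->
    let x = !cur in
    let past_end = if step > 0 then x > b else x < b in
    if past_end then None
    else begin
      cur := x + step;
      Some x
    end

let up = to_list (int_range 1 4)
let by_two = to_list (int_range ~step:2 1 6)
let down = to_list (int_range ~step:(-2) 5 0)
let nothing = to_list (int_range 3 1)
```

Under these assumptions, up is [1; 2; 3; 4], by_two is [1; 3; 5], down is [5; 3; 1], and nothing is the empty list.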

val lines : char gen -> string gen

Group together chars belonging to the same line

  • since 0.3
val unlines : string gen -> char gen

Explode lines into their chars, adding a '\n' after each one

  • since 0.3
module Infix : sig ... end
val (--) : int -> int -> int gen

Synonym for int_range ~step:1

val (>>=) : 'a gen -> ('a -> 'b Gen_intf.gen) -> 'b gen

Monadic bind operator

val (>>|) : 'a gen -> ('a -> 'b) -> 'b gen

Infix map operator

  • since 0.2.3
val (>|=) : 'a gen -> ('a -> 'b) -> 'b gen

Infix map operator

  • since 0.2.3
val pp : ?start:string -> ?stop:string -> ?sep:string -> ?horizontal:bool -> (Format.formatter -> 'a -> unit) -> Format.formatter -> 'a gen -> unit

Pretty print the content of the generator on a formatter.

val of_seq : 'a Seq.t -> 'a gen
  • since 1.0
val to_iter : 'a gen -> 'a Gen_intf.iter
  • since 1.0

Restartable generators

A restartable generator is a function that produces copies of the same generator, on demand. It has the type unit -> 'a gen and it is assumed that every generator returned by the function behaves the same (that is, that it traverses the same sequence of elements).

module Restart : sig ... end
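The idea of a restartable generator can be sketched with plain functions. The restart type below is a local stand-in for unit -> 'a gen, and restartable_of_list is a hypothetical helper for this example, not a function of the Restart module.

```ocaml
(* Local stand-ins for the documented types. *)
type 'a gen = unit -> 'a option
type 'a restart = unit -> 'a gen

(* Hypothetical helper: each call to the result creates a fresh cursor. *)
let restartable_of_list (l : 'a list) : 'a restart =
  fun () ->
    let rest = ref l in
    fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

let src = restartable_of_list [1; 2; 3]
let pass1 = to_list (src ())  (* first traversal *)
let pass2 = to_list (src ())  (* second, independent traversal *)
```

Both pass1 and pass2 are [1; 2; 3], because every call to src () starts again from the beginning of the same source.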

Utils

val persistent : 'a t -> 'a Restart.t

Store content of the transient generator in memory, to be able to iterate on it several times later. If possible, consider using combinators from Restart directly instead.

val persistent_lazy : ?caching:bool -> ?max_chunk_size:int -> 'a t -> 'a Restart.t

Same as persistent, but consumes the generator on demand (by chunks). This makes it possible to build a restartable generator out of an ephemeral one without paying a big cost upfront (or even consuming it fully). Optional parameters: see GenMList.of_gen_lazy.

  • since 0.2.2
val persistent_to_seq : 'a t -> 'a Seq.t

Same as persistent, but returns a standard Seq.

  • since 1.0
val persistent_lazy_to_seq : ?caching:bool -> ?max_chunk_size:int -> 'a t -> 'a Seq.t

Same as persistent_lazy, but returns a standard Seq.

  • since 1.0
val peek : 'a t -> ('a * 'a option) t

peek g transforms the generator g into a generator of (x, Some next) if x was followed by next in g, or (x, None) if x was the last element of g

  • since 0.4
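The pairing performed by peek can be sketched with a one-element lookahead. This is an illustrative re-implementation with local helpers; unlike the library function might, this sketch reads the first element of g eagerly when peek is applied.

```ocaml
(* Local stand-in for the documented type. *)
type 'a gen = unit -> 'a option

let of_list l : 'a gen =
  let rest = ref l in
  fun () -> match !rest with [] -> None | x :: tl -> rest := tl; Some x

let rec to_list (g : 'a gen) : 'a list =
  match g () with None -> [] | Some x -> x :: to_list g

(* Sketch: stay one element ahead so each item is paired with its successor. *)
let peek (g : 'a gen) : ('a * 'a option) gen =
  let ahead = ref (g ()) in
  fun () ->
    match !ahead with
    | None -> None
    | Some x ->
      let next = g () in
      ahead := next;
      Some (x, next)

let pairs = to_list (peek (of_list [1; 2; 3]))
```

For the input 1, 2, 3, pairs is [(1, Some 2); (2, Some 3); (3, None)].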
val peek_n : int -> 'a t -> ('a * 'a array) t

peek_n n g iterates on g, returning along with each element the array of the (at most) n elements that follow it immediately

  • since 0.4
val start : 'a Restart.t -> 'a t

Create a new transient generator. start gen is the same as gen () but is included for readability.

Basic IO

Very basic interface to manipulate files as sequences of chunks/lines.

  • since 0.2.3
module IO : sig ... end