jon.recoil.org

Module Datalog.Cursor

A cursor represents a query on the database. Cursors provide iter and fold functions to iterate over the matching facts.

Warning: Cursors are mutable data structures that are temporarily bound to a database and modified internally by iteration functions such as iter or fold. Reusing a cursor while it is being iterated over is unspecified behavior.

Binding order

The order in which variables are provided to Cursor.create and Cursor.create_with_parameters is called the binding order. For parameterized queries, the parameters appear before the variables in the binding order.

This order corresponds to the nesting order of the loop nest that will be used to evaluate the query, so it can have dramatic performance impact and need to be chosen carefully.

In order for the engine to be able to evaluate the query, variables must appear in the same order in at least one of the atoms constituting the query; for instance, it is not possible to iterate over [edge [x; y]] in the [y; x] binding order: we are requesting that y be bound before x, but in order to find the set of possible values for y from edge we need to first know the possible values for x. These cases will raise an error.

It is, however, possible to iterate on the query [edge [x; y]; edge [y; x]] using the [[y; x]] binding order: we can use the [edge [y; x]] instance to bind the variables for [y] and [x] in this order, then check if [(x, y)] is in the [edge] relation for each [(y, x)] pair that we find. In this case, the occurrences of [y] and [x] in [edge [y; x]] are said to be {b binding}, while their occurrences in [edge [x; y]] are {b non-binding}. More precisely, an occurrence of a variable [x] in a positive atom is binding if all the previous arguments of the atom appear before [x] in the binding order; all other occurrences are non-binding. In order to evaluate a query, we require that all variables have at least one binding occurrence in the query. Occurrences from negated atoms (from [not]) are never binding.

type ('p, 'v) with_parameters

Parameterized cursors take an additional argument of type 'p Constant.hlist that is provided when evaluating the cursor.

Cursors yielding values of type 'v Constant.hlist.

Cursor.create vars f creates a low-level Cursor.t from a high-level query, expressed as a conjunction of atoms.

Warning: The order of the variables in vars is crucial as it dictates the iteration order of the loop nest that will be used to evaluate the query. See the documentation of the Cursor module.

Note: If you need to perform queries that depend on the value of a variable outside the database, consider using parameterized cursors (see Cursor.create_with_parameters). Reusing a parameterized cursor is more efficient than creating new cursors for each value of the parameters.

Example

The following code creates cursors to iterate on the marked nodes (marked_cursor), and on all the edge pairs in the graphs (edge_cursor), respectively.

let marked_cursor =
  Cursor.create ["X"] (fun [x] -> [marked [x]])

let edge_cursor =
  Cursor.create ["src"; "dst"] (fun [src; dst] -> [edge [src; dst]])

Create a parameterized cursor.

This is a more general version of Cursor.create that also takes an additional list of parameters. The

Example

The following code creates two parameterized cursors for iterating over the successors or predecessors of a parametric node, which will be provided when evaluating the query.

Notice that in the successor_cursor, the parameters appears before the variable in the edge relation, while in the predecessor_cursor, it appears after the variable.

This means that the successor_cursor can be iterated efficiently, because it follows the structure of the relation: internally, the edge relation is represented as a map from nodes to their successors, and so evaluating the successor_cursor will result in a simple map lookup.

On the other hand, evaluating predecessor_cursor requires iterating over all the (non-terminal) nodes to check whether it contains p in its successor map.

let successor_cursor =
  Cursor.create_with_parameters ~parameters:["P"] ["X"] (fun [p] [x] ->
      [edge [p; x]])

let predecessor_cursor =
  Cursor.create_with_parameters ~parameters:["P"] ["X"] (fun [p] [x] ->
      [edge [x; p]])

fold cursor db ~init ~f accumulates f over all the variable bindings that match the query associated with cursor in db.

Warning: cursor must not be used from inside f.

Example

The following code computes the list of reversed edges (edges from target to source).

let reverse_edges =
  Cursor.fold edge_cursor db ~init:[] ~f:(fun [src; dst] acc ->
      (dst, src) :: acc)

iter cursor db ~f applies f to all the variable bindings that match the query associated with cursor in db.

Warning: cursor must not be used from inside f.

Example

The following code prints all the marked nodes.

let () =
  Format.eprintf "@[<v 2>Marked nodes:@ ";
  Cursor.iter marked_cursor db ~f:(fun [n] ->
      Format.eprintf "- %a@ " Node.print n);
  Format.eprintf "@]@."

fold_with_parameters cursor params db ~init ~f accumulates the function f over all the variable bindings that match the query in db. The values of the parameters are taken from the params list.

Warning: cursor must not be used from inside f.

Example

The following code accumulates the successors of node n2 in a list.

let successors =
  Cursor.fold_with_parameters successor_cursor [n2] db ~init:[]
    ~f:(fun [n] acc -> n :: acc)

iter_with_parameters cursor params db ~f applies f to all the variable bindings that match the query in db, where the parameter values are taken from params.

Warning: cursor must not be used from inside f.

Example

The following code prints the predecessors of node n2.

let () =
  Format.eprintf "@[<v 2>Predecessors of %a:@ " Node.print n2;
  Cursor.iter_with_parameters predecessor_cursor [n2] db ~f:(fun [n] ->
      Format.eprintf "- %a@ " Node.print n);
  Format.eprintf "@]@."