jon.recoil.org

Module Stdlib.Gc

Memory management control and statistics; finalised values.

type stat = {
  1. minor_words : float;
    (*

    Number of words allocated in the minor heap since the program was started.

    *)
  2. promoted_words : float;
    (*

    Number of words allocated in the minor heap that survived a minor collection and were moved to the major heap since the program was started.

    *)
  3. major_words : float;
    (*

    Number of words allocated in the major heap, including the promoted words, since the program was started.

    *)
  4. minor_collections : int;
    (*

    Number of minor collections since the program was started.

    *)
  5. major_collections : int;
    (*

    Number of major collection cycles completed since the program was started.

    *)
  6. heap_words : int;
    (*

    Total size of the major heap, in words.

    *)
  7. heap_chunks : int;
    (*

    Number of contiguous pieces of memory that make up the major heap. This metric is currently not available in OCaml 5: the field value is always 0.

    *)
  8. live_words : int;
    (*

    Number of words of live data in the major heap, including the header words.

    Note that "live" words refers to every word in the major heap that isn't currently known to be collectable, which includes words that have become unreachable by the program after the start of the previous GC cycle. It is typically much simpler and more predictable to call Gc.full_major (or Gc.compact) and then compute GC stats, as then "live" words has the simple meaning of "reachable by the program". One caveat is that a single call to Gc.full_major will not reclaim values that have a finaliser from Gc.finalise (this does not apply to Gc.finalise_last). If this caveat matters, simply call Gc.full_major twice instead of once.

    *)
  9. live_blocks : int;
    (*

    Number of live blocks in the major heap.

    See live_words for a caveat about what "live" means.

    *)
  10. free_words : int;
    (*

    Number of words in the free list.

    *)
  11. free_blocks : int;
    (*

    Number of blocks in the free list. This metric is currently not available in OCaml 5: the field value is always 0.

    *)
  12. largest_free : int;
    (*

    Size (in words) of the largest block in the free list. This metric is currently not available in OCaml 5: the field value is always 0.

    *)
  13. fragments : int;
    (*

    Number of wasted words due to fragmentation. These are one-word free blocks placed between two live blocks. They are not available for allocation.

    *)
  14. compactions : int;
    (*

    Number of heap compactions since the program was started.

    *)
  15. top_heap_words : int;
    (*

    Maximum size reached by the major heap, in words.

    *)
  16. stack_size : int;
    (*

    Current size of the stack, in words. This metric is currently not available in OCaml 5: the field value is always 0.

    • since 3.12
    *)
  17. forced_major_collections : int;
    (*

    Number of forced full major collections completed since the program was started.

    • since 4.12
    *)
}

The memory management counters are returned in a stat record. These counters give values for the whole program.

The total amount of memory allocated by the program since it was started is (in words) minor_words + major_words - promoted_words. Multiply by the word size (4 on a 32-bit machine, 8 on a 64-bit machine) to get the number of bytes.
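
The formula above can be sketched as follows. This is a minimal illustration, not part of the Gc module: the names bytes_per_word, total_allocated_bytes and allocated_so_far are hypothetical helpers, and the word size is derived from Sys.word_size.

```ocaml
(* Hypothetical helpers illustrating the formula
   minor_words + major_words - promoted_words. *)

(* Word size in bytes: 4 on a 32-bit machine, 8 on a 64-bit machine. *)
let bytes_per_word = Sys.word_size / 8

(* Promoted words are counted in both minor_words and major_words,
   so they are subtracted once. *)
let total_allocated_bytes ~minor_words ~promoted_words ~major_words : float =
  (minor_words +. major_words -. promoted_words) *. float_of_int bytes_per_word

(* Apply the formula to the counters of the running program.
   quick_stat is used because only the allocation counters are needed. *)
let allocated_so_far () : float =
  let s = Gc.quick_stat () in
  total_allocated_bytes
    ~minor_words:s.Gc.minor_words
    ~promoted_words:s.Gc.promoted_words
    ~major_words:s.Gc.major_words
```

For example, 100 minor words, 30 major words and 10 promoted words give 120 allocated words, that is 960 bytes on a 64-bit machine.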

type control = {
  1. minor_heap_size : int;
    (*

    The size (in words) of the minor heap. Changing this parameter will trigger a minor collection. The total size of the minor heap used by this program is the sum of the heap sizes of the active domains. Default: 1M.

    *)
  2. major_heap_increment : int;
    (*

    How much to add to the major heap when increasing it. If this number is less than or equal to 1000, it is a percentage of the current heap size (i.e. setting it to 100 will double the heap size at each increase). If it is more than 1000, it is a fixed number of words that will be added to the heap. Default: 15.

    In runtime5, the "current heap size" metric does not include those allocations of more than 128 words.

    *)
  3. space_overhead : int;
    (*

    The major GC speed is computed from this parameter. This is the memory that will be "wasted" because the GC does not immediately collect unreachable blocks. It is expressed as a percentage of the memory used for live data. The GC will work more (use more CPU time and collect blocks more eagerly) if space_overhead is smaller. On runtime 4 this doesn't account correctly for bigarrays; you may find the GC works much harder than necessary to satisfy this parameter. Runtime 4 default: 100. Runtime 5 default: 80.

    *)
  4. verbose : int;
    (*

    This value controls the GC messages on standard error output. It is a sum of some of the following flags, to print messages on the corresponding events:

    • 0x00001 Main events of each major GC cycle
    • 0x00002 Minor collection events
    • 0x00004 Per-slice events
    • 0x00008 Heap compaction
    • 0x00010 GC policy computations
    • 0x00020 Address space reservation changes
    • 0x00040 Major domain events (such as creation and termination)
    • 0x00080 Stop-the-world events
    • 0x00100 Minor heap events (such as creation and resizing)
    • 0x00200 Major heap events (such as creation and teardown)
    • 0x00400 Resizing of GC tables
    • 0x00800 Allocation and resizing of stacks
    • 0x01000 Output GC statistics at program exit
    • 0x02000 Change of GC parameters
    • 0x04000 Calling of finalization functions
    • 0x08000 Bytecode executable and shared library search at start-up
    • 0x10000 GC debugging messages
    • 0x20000 Changes to the major GC mark stack
    • 0x10000000 Do not include timestamp and domain ID in log messages

    For runtime 4, the flags are as follows (although the messages produced may not fit these descriptions very well):

    • 0x0001 Start and end of major GC cycle.
    • 0x0002 Minor collection and major GC slice.
    • 0x0004 Growing and shrinking of the heap.
    • 0x0008 Resizing of stacks and memory manager tables.
    • 0x0010 Heap compaction.
    • 0x0020 Change of GC parameters.
    • 0x0040 Computation of major GC slice size.
    • 0x0080 Calling of finalisation functions.
    • 0x0100 Bytecode executable and shared library search at start-up.
    • 0x0200 Computation of compaction-triggering condition.
    • 0x0400 Output GC statistics at program exit.
    • 0x0800 GC debugging messages.
    • 0x1000 Include domain ID in log messages.
    • 0x2000 Include timestamp in log messages.

    Default: 0.
    *)
  5. max_overhead : int;
    (*

    Heap compaction is triggered when the estimated amount of "wasted" memory is more than max_overhead percent of the amount of live data. If max_overhead is set to 0, heap compaction is triggered at the end of each major GC cycle (this setting is intended for testing purposes only). If max_overhead >= 1000000, compaction is never triggered. On runtime4, if compaction is permanently disabled, it is strongly suggested to set allocation_policy to 2. Default: 500.

    *)
  6. stack_limit : int;
    (*

    The maximum size of the fiber stacks (in words). Default: 1024k.

    *)
  7. allocation_policy : int;
    (*

    The policy used for allocating in the major heap.

    This option is ignored when using runtime5.

    Prior to runtime5, possible values were 0, 1 and 2.

    • 0 was the next-fit policy
    • 1 was the first-fit policy (since OCaml 3.11)
    • 2 was the best-fit policy (since OCaml 4.10)

    More details for runtime4: -------------------------------------

    Possible values are 0, 1 and 2.

    • 0 is the next-fit policy, which is usually fast but can result in fragmentation, increasing memory consumption.
    • 1 is the first-fit policy, which avoids fragmentation but has corner cases (in certain realistic workloads) where it is noticeably slower.
    • 2 is the best-fit policy, which is fast and avoids fragmentation. In our experiments it is faster and uses less memory than both next-fit and first-fit. (since OCaml 4.10)

    The default is best-fit.

    On one example that was known to be bad for next-fit and first-fit, next-fit takes 28s using 855Mio of memory, first-fit takes 47s using 566Mio of memory, best-fit takes 27s using 545Mio of memory.

    Note: If you change to next-fit, you may need to reduce the space_overhead setting, for example using 80 instead of the default 120 which is tuned for best-fit. Otherwise, your program will need more memory.

    Note: changing the allocation policy at run-time forces a heap compaction, which is a lengthy operation unless the heap is small (e.g. at the start of the program).

    Default: 2.

    This metric is currently not available in OCaml 5: the field value is always 0.

    ----------------------------------------------------------------

    • since 3.11
    *)
  8. window_size : int;
    (*

    The size of the window used by the major GC for smoothing out variations in its workload. This is an integer between 1 and 50. Default: 1. This metric is currently not available in OCaml 5: the field value is always 0.

    • since 4.03
    *)
  9. custom_major_ratio : int;
    (*

    Target ratio of floating garbage to major heap size for out-of-heap memory held by custom values located in the major heap. The GC speed is adjusted to try to use this much memory for dead values that are not yet collected. Expressed as a percentage of major heap size. The default value keeps the out-of-heap floating garbage about the same size as the in-heap overhead. Note: this only applies to values allocated with caml_alloc_custom_mem (e.g. bigarrays). Default: 44.

    • since 4.08
    *)
  10. custom_minor_ratio : int;
    (*

    Bound on floating garbage for out-of-heap memory held by custom values in the minor heap. A minor GC is triggered when this much memory is held by custom values located in the minor heap. Expressed as a percentage of minor heap size. Note: this only applies to values allocated with caml_alloc_custom_mem (e.g. bigarrays).

    The main reason to limit the size of memory held in the minor heap is to avoid long minor GC pauses. Since custom values are typically faster to GC than normal values (they cannot hold pointers so need no scanning), a large amount of data can be held by the minor heap in custom blocks without significantly affecting GC time. So, by default, this value is above 100%.

    Default: 400.

    • since 4.08
    *)
  11. custom_minor_max_size : int;
    (*

    For runtime4: Maximum amount of out-of-heap memory for each custom value allocated in the minor heap. When a custom value is allocated on the minor heap and holds more than this many bytes, only this value is counted against custom_minor_ratio and the rest is directly counted against custom_major_ratio. Note: this only applies to values allocated with caml_alloc_custom_mem (e.g. bigarrays). Default: 8192 bytes.

    For runtime5: Maximum amount of out-of-heap memory for each custom value allocated in the minor heap. Custom values that hold more than this many bytes are allocated on the major heap. Note: this only applies to values allocated with caml_alloc_custom_mem (e.g. bigarrays). Numbers <=100 are interpreted as percentages of the size that would immediately trigger minor GC (minor heap size times custom_minor_ratio). Default: 10 %.

    • since 4.08
    *)
}

The GC parameters are given as a control record. Note that these parameters can also be initialised by setting the OCAMLRUNPARAM environment variable. See the documentation of ocamlrun.
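
Since the verbose field is a sum of flags, a mask can be built by combining individual flag values. The sketch below assumes the runtime 5 flag values listed for verbose; the constant names and enable_gc_logging are hypothetical and not part of the Gc module.

```ocaml
(* Flag values copied from the runtime 5 list for the verbose field.
   The names are hypothetical; the Gc module does not define them. *)
let major_cycle_events = 0x00001       (* main events of each major GC cycle *)
let minor_collection_events = 0x00002  (* minor collection events *)
let stats_at_exit = 0x01000            (* output GC statistics at program exit *)

(* The flags are distinct bits, so a bitwise "or" is the same as their sum. *)
let verbose_mask =
  major_cycle_events lor minor_collection_events lor stats_at_exit

(* Install the mask, leaving every other GC parameter unchanged.
   Calling this makes the runtime print messages on standard error. *)
let enable_gc_logging () =
  Gc.set { (Gc.get ()) with Gc.verbose = verbose_mask }
```

On runtime 4 the flag values differ, so the constants would need to be taken from the runtime 4 list instead.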

val stat : unit -> Stdlib.Gc.stat @@ portable

Return the current values of the memory management counters in a stat record that represents the program's total memory stats. This function causes a full major collection.

val quick_stat : unit -> Stdlib.Gc.stat @@ portable

Same as stat except that live_words, live_blocks, free_words, free_blocks, largest_free, and fragments are set to 0. Due to per-domain buffers it may only represent the state of the program's total memory usage since the last minor collection or major cycle. This function is much faster than stat because it does not need to trigger a full major collection.

val counters : unit -> float * float * float @@ portable

Return (minor_words, promoted_words, major_words) for the current domain or potentially previous domains. This function is as fast as quick_stat.
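
As an illustration, counters can be sampled before and after a computation to estimate how much it allocated. This is a sketch only: allocation_delta is a hypothetical helper, and the figures cover the current domain as described above, so work done by other domains is not included.

```ocaml
(* Hypothetical helper: run [f] and return its result together with the
   change in (minor_words, promoted_words, major_words) observed around it. *)
let allocation_delta (f : unit -> 'a) : 'a * (float * float * float) =
  let (minor0, promoted0, major0) = Gc.counters () in
  let result = f () in
  let (minor1, promoted1, major1) = Gc.counters () in
  (result, (minor1 -. minor0, promoted1 -. promoted0, major1 -. major0))

(* Usage: measure the allocations made while building a small list. *)
let () =
  let (xs, (minor, promoted, major)) =
    allocation_delta (fun () -> List.init 1000 (fun i -> i))
  in
  Printf.printf "%d elements: minor=%.0f promoted=%.0f major=%.0f words\n"
    (List.length xs) minor promoted major
```

The measurement itself allocates a few words (the tuples returned by counters), so small deltas should be read as approximate.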

val minor_words : unit -> float @@ portable

Number of words allocated in the minor heap by this domain or potentially previous domains. This number is accurate in byte-code programs, but only an approximation in programs compiled to native code.

In native code this function does not allocate.

  • since 4.04
val get : unit -> Stdlib.Gc.control @@ portable

Return the current values of the GC parameters in a control record.

  • alert unsynchronized_access GC parameters are a mutable global state.
val set : Stdlib.Gc.control -> unit @@ portable

set r changes the GC parameters according to the control record r. The normal usage is: Gc.set { (Gc.get()) with Gc.verbose = 0x00d }

  • alert unsynchronized_access GC parameters are a mutable global state.
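
Because the GC parameters are global mutable state, a change is often scoped to one computation and undone afterwards. The following sketch shows one way to do that with get and set; with_space_overhead is a hypothetical helper, and it assumes no other thread or domain changes the parameters while f runs.

```ocaml
(* Hypothetical helper: run [f] with a temporary space_overhead and
   restore the previous GC parameters afterwards, even if [f] raises. *)
let with_space_overhead (percent : int) (f : unit -> 'a) : 'a =
  let saved = Gc.get () in
  Gc.set { saved with Gc.space_overhead = percent };
  Fun.protect f ~finally:(fun () -> Gc.set saved)

(* Usage: let the major GC work less eagerly during a batch computation. *)
let sum_of_squares n =
  with_space_overhead 200 (fun () ->
    List.fold_left (fun acc x -> acc + (x * x)) 0 (List.init n (fun i -> i)))
```

Fun.protect is available since OCaml 4.08; on older compilers the restore step would have to be written with an explicit exception handler.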
val minor : unit -> unit @@ portable

Trigger a minor collection.

val major_slice : int -> int @@ portable

major_slice n does a minor collection and a slice of major collection. n is the size of the slice: the GC will do enough work to free (on average) n words of memory. If n = 0, the GC will try to do enough work to ensure that the next automatic slice has no work to do. This function returns an unspecified integer (currently: 0).

val major : unit -> unit @@ portable

Do a minor collection and finish the current major collection cycle.

val full_major : unit -> unit @@ portable

Do a minor collection, finish the current major collection cycle, and perform a complete new cycle. This will collect all currently unreachable blocks.
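
The documentation of live_words suggests calling full_major before reading statistics, and calling it twice when values with Gc.finalise finalisers are involved. A minimal sketch of that advice follows; reachable_words is a hypothetical helper name.

```ocaml
(* Hypothetical helper: number of words in the major heap that are
   reachable by the program, following the advice given for live_words. *)
let reachable_words () : int =
  (* Two full major collections, so that values with Gc.finalise
     finalisers can also be reclaimed. *)
  Gc.full_major ();
  Gc.full_major ();
  (Gc.stat ()).Gc.live_words

let () =
  Printf.printf "reachable: %d words (%d bytes)\n"
    (reachable_words ())
    (reachable_words () * (Sys.word_size / 8))
```

This is a costly measurement, since each call performs several full collections; it is better suited to tests and diagnostics than to hot paths.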

val compact : unit -> unit @@ portable

Perform a full major collection and compact the heap. Note that heap compaction is a lengthy operation.

val print_stat : Stdlib.out_channel -> unit @@ portable

Print the current values of the memory management counters (in human-readable form) of the total program into the channel argument.

val allocated_bytes : unit -> float @@ portable

Return the number of bytes allocated by this domain and potentially a previous domain. It is returned as a float to avoid overflow problems with int on 32-bit machines.

val get_minor_free : unit -> int @@ portable

Return the current size of the free space inside the minor heap of this domain.

  • since 4.03
val finalise : ('a -> unit) -> 'a -> unit

finalise f v registers f as a finalisation function for v. v must be heap-allocated. f will be called with v as argument at some point between the first time v becomes unreachable (including through weak pointers) and the time v is collected by the GC. Several functions can be registered for the same value, or even several instances of the same function. Each instance will be called once (or never, if the program terminates before v becomes unreachable).

The GC will call the finalisation functions in the order of deallocation. When several values become unreachable at the same time (i.e. during the same GC cycle), the finalisation functions will be called in the reverse order of the corresponding calls to finalise. If finalise is called in the same order as the values are allocated, that means each value is finalised before the values it depends upon. Of course, this becomes false if additional dependencies are introduced by assignments.

Finalisers are run by the domain which registered them, unless that domain has already terminated in which case they may be run by some other domain. Note that termination of the initial domain ends the OCaml process, so finalisers registered by the initial domain will only be run by that domain.

In the presence of multiple OCaml threads it should be assumed that any particular finaliser may be executed in any of the threads.

Anything reachable from the closure of finalisation functions is considered reachable, so the following code will not work as expected:

  • let v = ... in Gc.finalise (fun _ -> ...v...) v

Instead you should make sure that v is not in the closure of the finalisation function by writing:

  • let f = fun x -> ... let v = ... in Gc.finalise f v

The f function can use all features of OCaml, including assignments that make the value reachable again. It can also loop forever (in this case, the other finalisation functions will not be called during the execution of f, unless it calls finalise_release). It can call finalise on v or other values to register other functions or even itself. It can raise an exception; in this case the exception will interrupt whatever the program was doing when the function was called.

finalise will raise Invalid_argument if v is not guaranteed to be heap-allocated. Some examples of values that are not heap-allocated are integers, constant constructors, booleans, the empty array, the empty list, the unit value. The exact list of what is heap-allocated or not is implementation-dependent. Some constant values can be heap-allocated but never deallocated during the lifetime of the program, for example a list of integer constants; this is also implementation-dependent. Note that values of type float are sometimes allocated and sometimes not, so finalising them is unsafe, and finalise will also raise Invalid_argument for them. Values of type 'a Lazy.t (for any 'a) are like float in this respect, except that the compiler sometimes optimizes them in a way that prevents finalise from detecting them. In this case, it will not raise Invalid_argument, but you should still avoid calling finalise on lazy values.

The results of calling String.make, Bytes.make, Bytes.create, Array.make, and Stdlib.ref are guaranteed to be heap-allocated and non-constant except when the length argument is 0.
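
The rules above can be put together in a small sketch. All names here (finalised, on_collect, register_buffer, finalise_rejects_int) are hypothetical. The finaliser is defined separately so that the finalised value is not captured in its closure, and the buffer has a non-zero length so that it is heap-allocated. When the finaliser actually runs depends on the GC, so the count printed at the end is not guaranteed.

```ocaml
(* Number of buffers finalised so far. The finaliser only touches this
   counter, never the finalised value, so it cannot keep the value alive. *)
let finalised = ref 0

(* Defined on its own, so the buffer is not in the closure. *)
let on_collect (_ : Bytes.t) = incr finalised

(* Bytes.create with a non-zero length returns a heap-allocated value. *)
let register_buffer () =
  let buf = Bytes.create 64 in
  Gc.finalise on_collect buf

(* Integers are not heap-allocated: finalise rejects them. *)
let finalise_rejects_int () : bool =
  match Gc.finalise (fun (_ : int) -> ()) 42 with
  | () -> false
  | exception Invalid_argument _ -> true

let () =
  register_buffer ();
  (* The buffer is unreachable here; a full major collection gives the
     GC an opportunity to call the finaliser. *)
  Gc.full_major ();
  Printf.printf "finalised buffers: %d, int rejected: %b\n"
    !finalised (finalise_rejects_int ())
```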

val finalise_last : (unit -> unit) -> 'a -> unit

Same as finalise except the value is not given as argument, so you cannot use it in the computation of the finalisation function. The benefit is that the function is called after the value is unreachable for the last time instead of the first time. So, contrary to finalise, the value will never be reachable again or used again. In particular, every weak pointer and ephemeron that contained this value as key or data is unset before running the finalisation function. Moreover, the finalisation functions attached with finalise are always called before the finalisation functions attached with finalise_last.

As for finalise, the finaliser is run by the domain which registered it, unless that domain has already terminated in which case it may be run by some other domain.

  • since 4.04
val finalise_release : unit -> unit @@ portable

A finalisation function may call finalise_release to tell the GC that it can launch the next finalisation function without waiting for the current one to return.

type alarm

An alarm is a piece of data that calls a user function at the end of a major GC cycle. The following functions are provided to create and delete alarms.

val create_alarm : (unit -> unit) -> Stdlib.Gc.alarm

create_alarm f will arrange for f to be called at the end of major GC cycles, not caused by f itself, starting with the current cycle or the next one. f will run on the same domain that created the alarm, until the domain exits or delete_alarm is called. A value of type alarm is returned that you can use to call delete_alarm.

It is not guaranteed that the Gc alarm runs at the end of every major GC cycle, but it is guaranteed that it will run eventually.

As an example, here is a crude way to interrupt a function if the memory consumption of the program exceeds a given limit in MB, suitable for use in the toplevel:

let run_with_memory_limit (limit : int) (f : unit -> 'a) : 'a =
  let limit_memory () =
    let mem = Gc.(quick_stat ()).heap_words in
    if mem / (1024 * 1024) > limit / (Sys.word_size / 8) then
      raise Out_of_memory
  in
  let alarm = Gc.create_alarm limit_memory in
  Fun.protect f ~finally:(fun () -> Gc.delete_alarm alarm ; Gc.compact ())
val delete_alarm : Stdlib.Gc.alarm -> unit @@ portable

delete_alarm a will stop the calls to the function associated to a. Calling delete_alarm a again has no effect.
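
A smaller illustration of pairing create_alarm with delete_alarm is to count alarm invocations while a function runs. count_alarm_calls is a hypothetical helper; since an alarm is not guaranteed to run at the end of every major cycle, the count is a lower bound on the number of cycles, not an exact figure.

```ocaml
(* Hypothetical helper: run [f] and return its result together with the
   number of times the alarm fired while it was installed. *)
let count_alarm_calls (f : unit -> 'a) : 'a * int =
  let calls = ref 0 in
  let alarm = Gc.create_alarm (fun () -> incr calls) in
  (* delete_alarm runs even if [f] raises, so the alarm never outlives
     the call. *)
  let result = Fun.protect f ~finally:(fun () -> Gc.delete_alarm alarm) in
  (result, !calls)

let () =
  let (total, calls) =
    count_alarm_calls (fun () ->
      Gc.full_major ();
      List.fold_left (+) 0 (List.init 100 (fun i -> i)))
  in
  Printf.printf "result=%d, alarm calls=%d\n" total calls
```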

val eventlog_pause : unit -> unit @@ portable
  • deprecated Use Runtime_events.pause instead.
val eventlog_resume : unit -> unit @@ portable
  • deprecated Use Runtime_events.resume instead.
module Safe : sig ... end

Submodule containing non-backwards-compatible functions which enforce thread safety via modes.

module Memprof : sig ... end

Memprof is a profiling engine which randomly samples allocated memory words. Every allocated word has a probability of being sampled equal to a configurable sampling rate. Once a block is sampled, it becomes tracked. A tracked block triggers a user-defined callback as soon as it is allocated, promoted or deallocated.

module Tweak : sig ... end

GC Tweaks are unstable and undocumented configurable GC parameters, primarily intended for use by GC developers.