Operators — API Reference

Autogenerated reference for every exported mutation / crossover / selection operator — including the LLM mutation operator and its prompt-enrichment sections.

Types

Arborist.ExpansionMutation — Type

ExpansionMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random leaf node (Symbol or Number in rvalue position) and wraps it in a randomly chosen function call from the function set. Returns the genome unchanged if no suitable wrapping function exists. See SubtreeMutation for the max_depth / max_size semantics.

Arborist.HoistMutation — Type

HoistMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random non-leaf subtree and replaces it with one of its own child subtrees, reducing program complexity by one level of nesting. Falls back to SubtreeMutation if no non-leaf subtree exists.

Because hoisting only shrinks a tree, the caps are rarely triggered here, but they are accepted for symmetry with the other operators and to keep a cap uniformly enforced across a breeding pass.

This is a standard bloat-reduction operator.

Arborist.PointMutation — Type

PointMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random sub-expression in the genome's body and applies a local modification (e.g., changing a variable, literal, or operator). See SubtreeMutation for the max_depth / max_size semantics.

Arborist.SubtreeMutation — Type

SubtreeMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that replaces a randomly selected statement in the genome's body with a new randomly generated statement.

Optional max_depth / max_size keyword arguments cap the depth (longest root-to-leaf path, via tree_depth) and total node count (via complexity) of the produced offspring. If the offspring exceeds either cap, the unchanged parent is returned instead: a Koza-style reject-and-retry guard that prevents runaway bloat. Both caps default to nothing (no limit), so upgrading preserves existing behavior.
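
The cap check described above can be pictured with a minimal sketch on plain Julia Expr trees. The helpers sketch_depth, sketch_size, and guard_offspring are hypothetical stand-ins written for this illustration; they are not Arborist's tree_depth or complexity, and the library may count nodes differently (here every entry of args, including the function symbol, counts as one node).

```julia
# Hypothetical helpers for illustration only; not Arborist's tree_depth / complexity.
# In this sketch a leaf has depth 1 and size 1.
sketch_depth(x) = 1
sketch_depth(e::Expr) = 1 + maximum(sketch_depth, e.args; init=0)

sketch_size(x) = 1
sketch_size(e::Expr) = 1 + sum(sketch_size, e.args; init=0)

# Return the offspring only if it respects both caps; otherwise return the parent.
function guard_offspring(parent, child; max_depth=nothing, max_size=nothing)
    max_depth !== nothing && sketch_depth(child) > max_depth && return parent
    max_size !== nothing && sketch_size(child) > max_size && return parent
    return child
end

parent = :(x + 1)
child = :(sin(x * (y + 2)) + 1)
```

With no caps set, guard_offspring(parent, child) returns the child; with max_depth=3 the deeper child is rejected and the parent comes back unchanged.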

Arborist.SubtreeCrossover — Type

SubtreeCrossover(; max_depth=nothing, max_size=nothing)

Crossover operator that performs subtree crossover between two genomes. Selects compatible subtrees (matching types) and swaps them between parents.

Optional max_depth / max_size keyword arguments cap each offspring. When an offspring violates either cap, the corresponding parent is returned in its place (Koza-style reject-and-retry bloat control). Both caps default to nothing (no limit), so upgrading preserves existing behavior.

Arborist.EpsilonLexicaseSelection — Type

EpsilonLexicaseSelection(; epsilon=0.0) <: AbstractSelectionStrategy

Epsilon-lexicase (La Cava, Spector, Danai 2016) — a relaxed form of LexicaseSelection for continuous targets. At each case, individuals are kept if their loss on that case is within epsilon of the best candidate's loss.

epsilon > 0.0: fixed scalar tolerance applied to every case.
epsilon == 0.0 (default): auto-epsilon — per-case tolerance computed as the median absolute deviation (MAD) of the case's population fitness distribution, following La Cava et al.

MAD-based auto-epsilon is the standard in the symbolic-regression lexicase literature and avoids the manual-tuning pitfall that plagues fixed-epsilon variants.
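
As a rough illustration of the filtering rule and the MAD-based tolerance, the following standalone sketch works on a plain loss matrix. The names mad_sketch and epsilon_lexicase_sketch are hypothetical, and the sketch assumes the MAD is taken over the whole population for each case; Arborist's internals may differ.

```julia
using Random, Statistics

# Median absolute deviation of one case's losses across the population.
mad_sketch(v) = median(abs.(v .- median(v)))

# losses[i, c] is the loss of individual i on case c (lower is better).
function epsilon_lexicase_sketch(losses::AbstractMatrix, rng::AbstractRNG; epsilon=0.0)
    candidates = collect(axes(losses, 1))
    for c in shuffle(rng, collect(axes(losses, 2)))
        # Fixed tolerance when epsilon > 0, otherwise the per-case MAD.
        tol = epsilon > 0 ? epsilon : mad_sketch(losses[:, c])
        best = minimum(losses[candidates, c])
        candidates = [i for i in candidates if losses[i, c] <= best + tol]
        length(candidates) == 1 && return candidates[1]
    end
    # Several candidates survived every case: pick one uniformly.
    return rand(rng, candidates)
end
```

The function returns the row index of the selected individual.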

Arborist.FitnessProportionateSelection — Type

FitnessProportionateSelection <: AbstractSelectionStrategy

Roulette-wheel selection. Under the library's minimization convention, individual i is selected with probability proportional to 1 / (eps + f[i] - f_min), so lower-fitness individuals get higher selection weight. Edge cases:

All fitnesses equal → uniform sampling over the population.
Any Inf fitness → zero probability of selection (finite individuals absorb the whole distribution). All-Inf → uniform fallback.

Proportionate selection is the textbook baseline but is rarely optimal in practice (premature convergence on multi-modal fitness). Included for reproducibility of pre-2005 GA/GP literature.
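
The weighting rule and its edge cases can be sketched in a few lines. This is an illustration only: proportionate_probs_sketch is a hypothetical name, and the smoothing constant eps_ is an assumed value, since this reference does not state the one Arborist uses.

```julia
# Selection probabilities under minimization: weight 1 / (eps_ + f[i] - f_min).
# eps_ is an assumed smoothing constant, not a documented Arborist default.
function proportionate_probs_sketch(f::AbstractVector{<:Real}; eps_=1e-6)
    finite = isfinite.(f)
    # No finite fitness at all: uniform fallback.
    any(finite) || return fill(1 / length(f), length(f))
    f_min = minimum(f[finite])
    # Non-finite fitnesses get zero weight; equal fitnesses all get 1 / eps_.
    w = [finite[i] ? 1 / (eps_ + f[i] - f_min) : 0.0 for i in eachindex(f)]
    return w ./ sum(w)
end
```

Note that with a very small eps_ the best individual dominates the distribution, which is one way the premature-convergence behavior mentioned above shows up.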

Arborist.LexicaseSelection — Type

LexicaseSelection <: AbstractSelectionStrategy

Lexicase selection (Spector 2012; Helmuth, Spector, Matheson 2014):

All individuals start as candidates.
Shuffle test cases into a random order.
For each case in order, narrow candidates to those with the best (lowest) loss on that case. If only one remains, select it.
If a full pass exhausts cases with multiple candidates remaining, pick uniformly at random.

Lexicase preserves individuals that excel on some cases even if their aggregate fitness is poor, making it effective on multi-modal / deceptive fitness landscapes where averaged objectives mask specialist solutions.

Requires the evaluator to implement evaluate_cases. The solve loop materializes the per-individual per-case matrix automatically when this strategy is passed to GeneticProgramming(selection=...).
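
The four steps above can be sketched on a plain per-individual, per-case loss matrix. The function name lexicase_sketch and the matrix layout are assumptions made for this illustration, not Arborist's internal representation.

```julia
using Random

# losses[i, c] is the loss of individual i on case c (lower is better).
function lexicase_sketch(losses::AbstractMatrix, rng::AbstractRNG)
    candidates = collect(axes(losses, 1))                 # everyone starts as a candidate
    for c in shuffle(rng, collect(axes(losses, 2)))       # random case order
        best = minimum(losses[candidates, c])
        candidates = [i for i in candidates if losses[i, c] == best]
        length(candidates) == 1 && return candidates[1]
    end
    return rand(rng, candidates)                          # ties after all cases: uniform pick
end

# Individuals 1 and 2 are specialists; individual 3 has the best total loss (2.0)
# but is never the best on any single case.
losses = [0.0 5.0;
          5.0 0.0;
          1.0 1.0]
```

On this matrix the sketch only ever returns 1 or 2, which illustrates how specialists survive even when a generalist has the better aggregate score.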

Arborist.RankSelection — Type

RankSelection(; selection_pressure::Float64=1.5) <: AbstractSelectionStrategy

Linear-rank selection (Baker 1985), adapted for Arborist's minimization convention. Individuals are sorted ascending by fitness (best first); rank r ∈ 1..N gets weight

w(r) = s - 2 * (s - 1) * (r - 1) / (N - 1)

so at s == 1.0 selection is uniform, and at s == 2.0 the best individual (rank 1) gets weight 2 (twice the mean weight) while the worst (rank N) gets weight 0. Values outside [1.0, 2.0] are rejected.

Less sensitive to fitness scaling than FitnessProportionateSelection and often a better default when fitnesses span many orders of magnitude. The sign is flipped from the textbook maximization form so that — under the minimization convention — rank 1 is the best and gets the highest weight.

Fields

selection_pressure::Float64: typical values in [1.1, 2.0] (default 1.5).
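
To make the weight formula concrete, here is a small standalone evaluation of w(r). The function rank_weights_sketch is a hypothetical helper, and its handling of N == 1 is an assumption added to avoid dividing by zero.

```julia
# w(r) = s - 2 * (s - 1) * (r - 1) / (N - 1), for rank r = 1 (best) .. N (worst).
function rank_weights_sketch(N::Integer, s::Real)
    1.0 <= s <= 2.0 || throw(ArgumentError("selection pressure must be in [1.0, 2.0]"))
    N == 1 && return [1.0]   # assumed special case: a single individual
    return [s - 2 * (s - 1) * (r - 1) / (N - 1) for r in 1:N]
end
```

For N = 5 and s = 1.5 this gives weights 1.5, 1.25, 1.0, 0.75, 0.5. The weights always sum to N, the best rank receives s, and the worst receives 2 - s.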

Arborist.TournamentSelection — Type

TournamentSelection <: AbstractSelectionStrategy

Selection strategy that picks tournament_size individuals at random and returns the one with the best (lowest) fitness.

Fields

tournament_size::Int: number of individuals competing in each tournament
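
A minimal sketch of one tournament on a plain fitness vector follows. The name tournament_sketch is hypothetical, and drawing entrants with replacement is an assumption; this reference does not say whether Arborist samples with or without replacement.

```julia
using Random

# Returns the index of the tournament winner (lowest fitness among the entrants).
function tournament_sketch(fitness::AbstractVector{<:Real}, tournament_size::Integer,
                           rng::AbstractRNG)
    # Assumption: entrants are drawn with replacement.
    entrants = rand(rng, eachindex(fitness), tournament_size)
    return entrants[argmin(fitness[entrants])]
end
```

Larger tournaments raise selection pressure: with tournament_size = 1 the choice is uniform, while a very large tournament almost always returns the population's best individual.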

Arborist.TruncationSelection — Type

TruncationSelection(; ratio::Float64=0.5) <: AbstractSelectionStrategy

Pick uniformly at random from the top ceil(ratio * N) individuals (sorted ascending by fitness — best first). ratio=1.0 degenerates to uniform selection over the whole population; ratio=1/N selects only the best.

Aggressive but predictable. Commonly used inside evolution-strategy variants where intense exploitation is desired.

Fields

ratio::Float64: fraction of the population to keep (default 0.5).
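
The selection rule can be sketched as follows. truncation_sketch is a hypothetical name, and the clamp that keeps at least one individual is a defensive assumption of this sketch rather than documented behavior.

```julia
using Random

# Returns the index of one individual drawn uniformly from the best ceil(ratio * N).
function truncation_sketch(fitness::AbstractVector{<:Real}, ratio::Real, rng::AbstractRNG)
    n = length(fitness)
    n_keep = clamp(ceil(Int, ratio * n), 1, n)      # assumed guard: keep between 1 and N
    elite = partialsortperm(fitness, 1:n_keep)      # indices of the n_keep lowest fitnesses
    return rand(rng, elite)
end
```

With fitness = [4.0, 1.0, 3.0, 2.0] and ratio = 0.5, only indices 2 and 4 can be returned.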

Arborist.AddConnectionMutation — Type

AddConnectionMutation <: AbstractMutationOperator

Structural mutation: add a new enabled connection between two existing nodes. Tries max_attempts random (from, to) pairs before giving up. Input/bias nodes are never destinations; output nodes are never sources.

Arborist.AddNodeMutation — Type

AddNodeMutation <: AbstractMutationOperator

Structural mutation: disable a random enabled connection and insert a new hidden node on the split. The new node picks an activation uniformly from hidden_activations. The incoming edge to the new node has weight 1.0; the outgoing edge inherits the original connection's weight.
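
The split can be illustrated on a toy connection list. ToyConn and split_connection_sketch are hypothetical names invented for this sketch; Arborist's GraphGenome layout, innovation bookkeeping, and activation choice are not shown in this reference and are left out here.

```julia
# Toy connection record for illustration; not Arborist's GraphGenome representation.
struct ToyConn
    from::Int
    to::Int
    weight::Float64
    enabled::Bool
end

# Split connection idx: disable it, then route from -> new_node -> to.
function split_connection_sketch(conns::Vector{ToyConn}, idx::Int, new_node::Int)
    old = conns[idx]
    out = copy(conns)
    out[idx] = ToyConn(old.from, old.to, old.weight, false)   # original edge disabled
    push!(out, ToyConn(old.from, new_node, 1.0, true))        # incoming edge: weight 1.0
    push!(out, ToyConn(new_node, old.to, old.weight, true))   # outgoing edge: original weight
    return out
end
```

The 1.0 / original-weight choice means that, if the new node's activation were the identity, the signal along the path would initially match the one carried by the disabled connection.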

Arborist.NEATCrossover — Type

NEATCrossover <: AbstractCrossoverOperator

NEAT-style innovation-aligned crossover for GraphGenome (Stanley & Miikkulainen, 2002). Matching genes are inherited randomly from either parent; disjoint/excess genes come from the fitter parent. Produces two children — the second swaps parent order so that the less-fit parent's disjoint/excess genes are also preserved in the population.

Arborist.NEATDefaultMutation — Type

NEATDefaultMutation <: AbstractMutationOperator

Composite operator that reproduces the canonical NEAT mutation branching (Stanley & Miikkulainen, 2002). A single mutate call picks one branch by the cumulative probabilities derived from the positive weights below.

Defaults: 0.80 weight perturb / 0.10 weight replace / 0.05 add connection / 0.03 add node / 0.02 toggle — matching the original bare mutate(g, rng) implementation.

Each sub-operator's parameters are exposed via keyword arguments for tuning (e.g. perturb_sigma, max_attempts); see the per-operator docstrings.
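
Picking one branch from cumulative probabilities can be sketched as below. The function pick_branch_sketch and the branch labels are hypothetical; only the five default weights come from the description above.

```julia
# Pick a branch index by walking the cumulative sum of positive weights.
# u is a uniform draw in [0, 1), for example rand(rng).
function pick_branch_sketch(weights::AbstractVector{<:Real}, u::Real)
    threshold = u * sum(weights)      # weights need not sum to exactly 1
    acc = 0.0
    for (i, w) in enumerate(weights)
        acc += w
        threshold < acc && return i
    end
    return lastindex(weights)         # guard against floating-point round-off
end

branches = ["perturb", "replace", "add_connection", "add_node", "toggle"]
weights = [0.80, 0.10, 0.05, 0.03, 0.02]
```

A call such as branches[pick_branch_sketch(weights, rand())] then yields "perturb" about 80% of the time.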

Arborist.ToggleConnectionMutation — Type

ToggleConnectionMutation <: AbstractMutationOperator

Flip the enabled flag on a random connection.

Arborist.WeightPerturbMutation — Type

WeightPerturbMutation <: AbstractMutationOperator

Per-connection Gaussian weight perturbation. Each enabled connection has a perturb_prob chance of being perturbed by Normal(0, perturb_sigma). Defaults match NEAT: perturb_prob=0.9, perturb_sigma=0.3.
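
A standalone sketch of the per-connection rule, using parallel weight and enabled vectors instead of a real genome. perturb_weights_sketch is a hypothetical name; the keyword defaults mirror the values quoted above.

```julia
using Random

# Each enabled connection is perturbed with probability perturb_prob.
function perturb_weights_sketch(weights::Vector{Float64}, enabled::Vector{Bool},
                                rng::AbstractRNG; perturb_prob=0.9, perturb_sigma=0.3)
    out = copy(weights)               # leave the input untouched
    for i in eachindex(out)
        if enabled[i] && rand(rng) < perturb_prob
            out[i] += perturb_sigma * randn(rng)   # one draw from Normal(0, perturb_sigma)
        end
    end
    return out
end
```

Disabled connections keep their weights exactly, so re-enabling one later restores its previous value.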

Arborist.WeightReplaceMutation — Type

WeightReplaceMutation <: AbstractMutationOperator

Full redraw of one random connection weight from Normal(0, replace_sigma). Default matches NEAT: replace_sigma=2.0.

Arborist.LLMCallStats — Type

LLMCallStats

Mutable accumulator for LLM mutation call outcomes and token usage. Updated internally by mutate(::LLMMutationOperator, ...) — no external instrumentation needed.

Token fields (input_tokens, output_tokens) are extracted from the API response usage object when available (Anthropic and OpenAI both provide this). Character-count fields (input_chars, output_chars) are always populated and can be used as a ~4 chars/token estimate when the API doesn't return exact counts (e.g. some Ollama versions).
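
The fallback from exact token counts to a character-based estimate might look like the sketch below. estimate_tokens is a hypothetical helper, and using nothing to mean "the API did not report usage" is an assumption; this reference does not say how LLMCallStats represents missing counts.

```julia
# Prefer the exact count when the API reported one; otherwise estimate from characters.
estimate_tokens(tokens::Integer, chars::Integer) = tokens
estimate_tokens(::Nothing, chars::Integer) = cld(chars, 4)   # ~4 chars/token heuristic
```

For example, estimate_tokens(nothing, 10) rounds up to 3 tokens, while estimate_tokens(120, 999) returns the reported 120.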

Arborist.LLMMutationOperator — Type

LLMMutationOperator <: AbstractMutationOperator

Mutation operator that uses an LLM to generate semantically meaningful program variations. Implements the FunSearch/AlphaEvolve pattern: serialize the genome to source, prompt the LLM to improve it, parse and validate the response.

Falls back to the provided fallback_op on any failure (API error, timeout, parse failure, type-consistency failure). The evolutionary loop is never interrupted by LLM failures.

Fields

endpoint::String: API endpoint URL
model::String: model identifier
api_key_env::String: name of environment variable holding the API key (empty string means no key needed, e.g. for local Ollama)
system_prompt::String: system prompt for the LLM
temperature::Float64: sampling temperature
max_tokens::Int: maximum response tokens
timeout_seconds::Float64: HTTP request timeout
fallback_op::AbstractMutationOperator: operator to use on any failure
sections::Vector{AbstractPromptSection}: prompt enrichment sections (default: empty — no enrichment, identical to pre-enrichment behavior)
context::Union{MutationContext, Nothing}: populated by the solve loop each generation; nothing until the first generation runs
stats::LLMCallStats: accumulated call outcomes and token usage
debug_log::Union{IO, Nothing}: when set, writes the full user message, raw LLM response, and outcome for each call. Set to open("log.jsonl", "w") or stdout for debugging; nothing (default) disables logging.

Arborist.LLMMutationOperator — Method

LLMMutationOperator(; kwargs...) -> LLMMutationOperator

Construct an LLMMutationOperator with keyword arguments and sensible defaults.

Supports Anthropic API (default), OpenAI API, and local Ollama without code changes — only constructor arguments differ.

Arborist.AbstractPromptSection — Type

AbstractPromptSection

Base type for prompt enrichment sections. Each subtype implements render(section, context) -> String that produces a text block (or "" to skip) given the current MutationContext.

Users compose a Vector{AbstractPromptSection} on the LLMMutationOperator to control what context the LLM sees.

Arborist.ElitesSection — Type

ElitesSection(k=3)

Shows the top-K programs from the current population with their fitnesses. This is the core FunSearch/AlphaEvolve pattern: giving the LLM scored examples of what works well so it can learn from them.

Arborist.FitnessSection — Type

FitnessSection()

Shows the parent genome's fitness and rank, plus population best and mean. Gives the LLM a sense of how good the current program is and what the target looks like.

Arborist.GenerationSection — Type

GenerationSection()

Shows the current generation number and a phase label (early/mid/late). Allows the LLM to calibrate mutation aggressiveness: explore early, exploit late.

Arborist.MutationContext — Type

MutationContext

Mutable container for population state that the solve loop fills in each generation. LLM mutation operators read this to build enriched prompts.

Population-level fields (generation, max_generations, fitnesses, genomes_serialized) are set once per generation by _update_llm_contexts!. Per-call fields (parent_fitness, parent_rank) are set inside the breeding loop by _set_parent_context!, after tournament selection but before mutate.

Functions

Arborist.mutate — Method

mutate(op::ExpansionMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random leaf node in rvalue position and wrap it in a function call. Returns the genome unchanged if no suitable wrapping function exists.

Arborist.mutate — Method

mutate(op::HoistMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random non-leaf subtree (an Expr with at least one Expr child) and replace it with one of its Expr children. Falls back to SubtreeMutation if no non-leaf subtree exists.
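
The hoisting step can be sketched on plain Julia Expr trees. All names here (hoistable_paths, node_at, replace_at, hoist_sketch) are hypothetical, the sketch ignores any type-consistency checks, and it returns the input unchanged where the real operator falls back to SubtreeMutation.

```julia
using Random

# Collect paths (argument-index vectors) to every Expr with at least one Expr child.
function hoistable_paths(e, path=Int[], acc=Vector{Vector{Int}}())
    e isa Expr || return acc
    any(a -> a isa Expr, e.args) && push!(acc, copy(path))
    for (i, a) in enumerate(e.args)
        hoistable_paths(a, [path; i], acc)
    end
    return acc
end

node_at(e, path) = isempty(path) ? e : node_at(e.args[path[1]], path[2:end])

# Return a copy of e with the node at path replaced by new.
function replace_at(e, path, new)
    isempty(path) && return new
    out = copy(e)
    out.args[path[1]] = replace_at(e.args[path[1]], path[2:end], new)
    return out
end

function hoist_sketch(e::Expr, rng::AbstractRNG)
    paths = hoistable_paths(e)
    isempty(paths) && return e        # no non-leaf subtree: nothing to hoist in this sketch
    path = rand(rng, paths)
    target = node_at(e, path)
    child = rand(rng, [a for a in target.args if a isa Expr])
    return replace_at(e, path, child)
end
```

Applied to :(sin(x + 1)), the only hoistable node is the root, so the result is :(x + 1).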

Arborist.mutate — Method

mutate(op::PointMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random sub-expression in the genome body and apply a local modification using the existing mutate! function from codegen.jl.

Arborist.mutate — Method

mutate(op::SubtreeMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Replace a random statement in the genome body with a new randomly generated statement.

Arborist.crossover — Method

crossover(op::SubtreeCrossover, g1::ExprGenome, g2::ExprGenome, rng::AbstractRNG) -> Tuple{ExprGenome, ExprGenome}

Produce two offspring by performing subtree crossover on the body expressions of two parent genomes. Falls back to copies of the parents if no compatible subtree pair is found.

Arborist.neat_defaults — Method

neat_defaults() -> NamedTuple

Return (mutation_ops, crossover_ops) populated with NEATDefaultMutation() and NEATCrossover() for drop-in use with GeneticProgramming, NSGAII, or IslandModel when evolving GraphGenome:

ops = neat_defaults()
algorithm = GeneticProgramming(; pop_size=150, generations=100,
                                mutation_ops=ops.mutation_ops,
                                crossover_ops=ops.crossover_ops)

Arborist.mutate — Method

mutate(op::LLMMutationOperator, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Serialize the genome, send it to an LLM for semantic mutation, parse and validate the response. Falls back to op.fallback_op on any failure.

Failures never propagate to the framework: each one is logged with @warn and never rethrown. The evolutionary loop is robust to a 100% LLM failure rate.

Arborist.mutate — Method

mutate(op::LLMMutationOperator, g::GraphGenome, rng::AbstractRNG) -> GraphGenome

LLM-driven semantic mutation of a NEAT-style GraphGenome. Serializes the genome via the node-and-connection text format, queries the LLM, and parses the response back with deserialize(GraphGenome, ...; reassign_innovations=true). Reassigning innovations ensures that LLM-generated innovation IDs never collide with the parent pool's innovation history; content-aware alignment is a later phase.

Falls back to op.fallback_op on any failure. The fallback operator must dispatch on GraphGenome — typically a NEAT operator from neat_defaults() or NEATDefaultMutation(). If the default SubtreeMutation() fallback is left in place the fallback will itself raise MethodError; callers using LLM on GraphGenome should explicitly set fallback_op=NEATDefaultMutation().
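
In practice the recommendation above amounts to one keyword argument. The fragment below is a configuration sketch: it assumes Arborist is installed, that both names are exported, and that every other constructor keyword can be left at its default.

```julia
using Arborist

# GraphGenome runs need a fallback operator that dispatches on GraphGenome.
llm_op = LLMMutationOperator(; fallback_op=NEATDefaultMutation())
```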

Arborist.render — Method

render(section::AbstractPromptSection, ctx::MutationContext) -> String

Render this section into a text block for the LLM prompt. Return "" if the context does not contain enough information (graceful degradation).

Arborist.render_enrichment — Method

render_enrichment(sections, ctx) -> String

Concatenate the non-empty renders of all sections. Returns "" if every section produces empty output or the sections list is empty.
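
The render / render_enrichment contract can be mimicked with toy types. Everything prefixed Toy or toy_ is hypothetical, and the blank-line separator between blocks is an assumption; the real section types and MutationContext carry more state than shown here.

```julia
# Toy stand-ins for illustration; not Arborist's AbstractPromptSection / MutationContext.
abstract type ToySection end

struct ToyContext
    generation::Union{Int, Nothing}
end

struct ToyGeneration <: ToySection end
struct ToyAlwaysEmpty <: ToySection end

# Each section returns a text block, or "" when the context lacks what it needs.
toy_render(::ToyGeneration, ctx::ToyContext) =
    ctx.generation === nothing ? "" : "Generation: $(ctx.generation)"
toy_render(::ToyAlwaysEmpty, ::ToyContext) = ""

# Join the non-empty renders; the separator is an assumed choice.
function toy_render_enrichment(sections, ctx)
    blocks = String[toy_render(s, ctx) for s in sections]
    return join(filter(!isempty, blocks), "\n\n")
end
```

A section that returns "" simply drops out of the prompt, so an unpopulated context degrades to the unenriched prompt instead of raising an error.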

Constants

Arborist._http_post — Constant

_http_post

Module-level hook for HTTP POST calls. Default implementation uses Downloads.jl. Tests can replace this with a mock function via _http_post[] = mock_fn.
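
The hook[] = mock_fn pattern can be demonstrated with a standalone Ref. The names toy_http_post, default_post, send_request, and with_mock_post are hypothetical, and the (url, body) argument list is an assumption; the actual signature expected by _http_post is not given in this reference.

```julia
# Toy hook mirroring the Ref-based pattern; not Arborist's _http_post.
default_post(url, body) = error("network disabled in this sketch")
const toy_http_post = Ref{Function}(default_post)

# Code under test always calls through the hook.
send_request(url, body) = toy_http_post[](url, body)

# Swap in a mock, run f, and restore the original hook even if f throws.
function with_mock_post(f, mock)
    original = toy_http_post[]
    toy_http_post[] = mock
    try
        return f()
    finally
        toy_http_post[] = original
    end
end
```

Restoring the hook in a finally block keeps one test's mock from leaking into the next.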