Operators — API Reference

Autogenerated reference for every exported mutation / crossover / selection operator — including the LLM mutation operator and its prompt-enrichment sections.

Types

Arborist.ExpansionMutation - Type
ExpansionMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random leaf node (Symbol or Number in rvalue position) and wraps it in a randomly chosen function call from the function set. Returns the genome unchanged if no suitable wrapping function exists. See SubtreeMutation for the max_depth / max_size semantics.

Arborist.HoistMutation - Type
HoistMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random non-leaf subtree and replaces it with one of its own child subtrees, reducing program complexity by one level of nesting. Falls back to SubtreeMutation if no non-leaf subtree exists.

Because hoisting only shrinks a tree, the caps are rarely triggered here, but they are accepted for symmetry with the other operators and to keep a cap uniformly enforced across a breeding pass.

This is a standard bloat-reduction operator.

Arborist.PointMutation - Type
PointMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that selects a random sub-expression in the genome's body and applies a local modification (e.g., changing a variable, literal, or operator). See SubtreeMutation for the max_depth / max_size semantics.

Arborist.SubtreeMutation - Type
SubtreeMutation(; max_depth=nothing, max_size=nothing)

Mutation operator that replaces a randomly selected statement in the genome's body with a new randomly generated statement.

The optional max_depth / max_size keyword arguments cap the depth (longest root-to-leaf path, via tree_depth) and total node count (via complexity) of the produced offspring. If either cap is exceeded, the offspring is rejected and the unchanged parent is returned, a Koza-style guard against runaway bloat. Both caps default to nothing (no limit), so existing code behaves the same after upgrading.
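
The cap check can be pictured with a small standalone sketch on plain Julia Expr trees. The helpers sketch_depth and sketch_size and the function apply_caps are hypothetical stand-ins written for this illustration; Arborist's own tree_depth and complexity may count nodes differently.

```julia
# Hypothetical stand-ins for tree_depth / complexity (illustration only).
sketch_depth(x) = 0                                   # leaves (Symbol, Number) have depth 0
sketch_depth(ex::Expr) = 1 + maximum(sketch_depth, ex.args; init=0)
sketch_size(x) = 1                                    # each leaf counts as one node
sketch_size(ex::Expr) = 1 + sum(sketch_size, ex.args; init=0)

# Return the offspring unless it violates a cap; `nothing` means "no limit".
function apply_caps(original, offspring; max_depth=nothing, max_size=nothing)
    too_deep = max_depth !== nothing && sketch_depth(offspring) > max_depth
    too_big = max_size !== nothing && sketch_size(offspring) > max_size
    return (too_deep || too_big) ? original : offspring
end

original = :(x + 1)
offspring = :(x + sin(y))
apply_caps(original, offspring; max_depth=1)   # offspring has depth 2 here, so the original comes back
apply_caps(original, offspring)                # no caps set, so the offspring is kept
```

Returning the parent rather than resampling keeps the operator's cost bounded at one attempt per call.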

Arborist.SubtreeCrossover - Type
SubtreeCrossover(; max_depth=nothing, max_size=nothing)

Crossover operator that performs subtree crossover between two genomes. Selects compatible subtrees (matching types) and swaps them between parents.

The optional max_depth / max_size keyword arguments cap each offspring. When an offspring violates either cap, the corresponding parent is returned in its place (Koza-style bloat control). Both caps default to nothing (no limit), so existing code behaves the same after upgrading.

Arborist.EpsilonLexicaseSelection - Type
EpsilonLexicaseSelection(; epsilon=0.0) <: AbstractSelectionStrategy

Epsilon-lexicase (La Cava, Spector, Danai 2016) — a relaxed form of LexicaseSelection for continuous targets. At each case, individuals are kept if their loss on that case is within epsilon of the best candidate's loss.

  • epsilon > 0.0: fixed scalar tolerance applied to every case.
  • epsilon == 0.0 (default): auto-epsilon. The per-case tolerance is the median absolute deviation (MAD) of the population's losses on that case, following La Cava et al.

MAD-based auto-epsilon is the standard in the symbolic-regression lexicase literature and avoids the manual-tuning pitfall that plagues fixed-epsilon variants.
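
A minimal sketch of the per-case filtering step may help. The names sketch_mad and epsilon_survivors are hypothetical, the losses matrix layout (rows are individuals, columns are cases) is assumed, and the MAD is taken over the whole population rather than the surviving candidates, which is one reading of the description above.

```julia
using Statistics

# Median absolute deviation of a vector of per-case losses.
sketch_mad(v) = median(abs.(v .- median(v)))

# losses[i, j] = loss of individual i on case j (layout assumed for this sketch).
function epsilon_survivors(losses::AbstractMatrix, candidates, case::Int; epsilon=0.0)
    tol = epsilon > 0 ? epsilon : sketch_mad(losses[:, case])  # 0.0 selects auto-epsilon
    best = minimum(losses[c, case] for c in candidates)
    return [c for c in candidates if losses[c, case] <= best + tol]
end

losses = reshape([0.0, 0.1, 1.0, 5.0], 4, 1)
epsilon_survivors(losses, 1:4, 1)                 # MAD tolerance keeps individuals 1 and 2
epsilon_survivors(losses, 1:4, 1; epsilon=0.05)   # tight fixed tolerance keeps only individual 1
```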

Arborist.FitnessProportionateSelection - Type
FitnessProportionateSelection <: AbstractSelectionStrategy

Roulette-wheel selection. Under the library's minimization convention, individual i is selected with probability proportional to 1 / (eps + f[i] - f_min), so lower-fitness individuals get higher selection weight. Edge cases:

  • All fitnesses equal → uniform sampling over the population.
  • Any Inf fitness → zero probability of selection (finite individuals absorb the whole distribution). All-Inf → uniform fallback.
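
The weighting and its edge cases can be sketched as follows. The function roulette_probs is hypothetical, and eps_val stands in for the library's eps constant, whose actual value is not given here.

```julia
# `eps_val` is an assumed small positive constant passed in explicitly.
function roulette_probs(f::AbstractVector{<:Real}, eps_val::Real)
    n = length(f)
    finite = isfinite.(f)
    any(finite) || return fill(1 / n, n)          # all-Inf: uniform fallback
    f_min = minimum(f[finite])
    # Inf individuals get weight 0; finite ones get 1 / (eps + f[i] - f_min).
    w = [isfinite(x) ? 1 / (eps_val + x - f_min) : 0.0 for x in f]
    return w ./ sum(w)
end

roulette_probs([1.0, 2.0, 3.0], 1.0)   # decreasing probabilities, best (lowest) first
roulette_probs([2.0, 2.0], 1e-9)       # equal fitnesses give equal probabilities
roulette_probs([1.0, Inf], 1e-9)       # the Inf individual gets probability 0
```

Note that a very small eps_val concentrates almost all probability on the best individual, so the constant effectively controls selection pressure in this sketch.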

Proportionate selection is the textbook baseline but is rarely optimal in practice (premature convergence on multi-modal fitness). Included for reproducibility of pre-2005 GA/GP literature.

Arborist.LexicaseSelection - Type
LexicaseSelection <: AbstractSelectionStrategy

Lexicase selection (Spector 2012; Helmuth, Spector, Matheson 2014):

  1. All individuals start as candidates.
  2. Shuffle test cases into a random order.
  3. For each case in order, narrow candidates to those with the best (lowest) loss on that case. If only one remains, select it.
  4. If a full pass exhausts cases with multiple candidates remaining, pick uniformly at random.
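
The four steps above can be sketched in a few lines. The function lexicase_pick is a hypothetical standalone illustration, and the losses matrix layout (rows are individuals, columns are cases) is an assumption; it is not Arborist's implementation.

```julia
using Random

# losses[i, j] = loss of individual i on case j (layout assumed for this sketch).
function lexicase_pick(losses::AbstractMatrix, rng::AbstractRNG)
    candidates = collect(1:size(losses, 1))              # step 1: everyone starts as a candidate
    for case in shuffle(rng, 1:size(losses, 2))          # step 2: random case order
        best = minimum(losses[c, case] for c in candidates)
        candidates = [c for c in candidates if losses[c, case] == best]  # step 3: keep the best
        length(candidates) == 1 && return candidates[1]
    end
    return rand(rng, candidates)                         # step 4: break remaining ties at random
end

rng = MersenneTwister(1)
losses = [0.0 1.0;    # individual 1 is best on case 1
          1.0 0.0;    # individual 2 is best on case 2
          1.0 1.0]    # individual 3 is best on neither case
lexicase_pick(losses, rng)   # returns 1 or 2 depending on case order, never 3
```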

Lexicase preserves individuals that excel on some cases even if their aggregate fitness is poor, making it effective on modal / deceptive fitness landscapes where averaged objectives mask specialist solutions.

Requires the evaluator to implement evaluate_cases. The solve loop materializes the per-individual per-case matrix automatically when this strategy is passed to GeneticProgramming(selection=...).

Arborist.RankSelection - Type
RankSelection(; selection_pressure::Float64=1.5) <: AbstractSelectionStrategy

Linear-rank selection (Baker 1985), adapted for Arborist's minimization convention. Individuals are sorted ascending by fitness (best first); rank r ∈ 1..N gets weight

w(r) = s - 2 * (s - 1) * (r - 1) / (N - 1)

so at s == 1.0 selection is uniform, and at s == 2.0 the best individual (rank 1) gets weight 2 (twice the average weight of 1) while the worst (rank N) gets weight 0. Values outside [1.0, 2.0] are rejected.

Less sensitive to fitness scaling than FitnessProportionateSelection and often a better default when fitnesses span many orders of magnitude. The sign is flipped from the textbook maximization form so that — under the minimization convention — rank 1 is the best and gets the highest weight.
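
Evaluating the weight formula for a small population makes the effect of selection_pressure concrete. The helper rank_weight below is a direct transcription written for this illustration, not an Arborist function, and it assumes N >= 2 so that N - 1 is nonzero.

```julia
# Direct transcription of w(r) = s - 2 * (s - 1) * (r - 1) / (N - 1).
rank_weight(r, N, s) = s - 2 * (s - 1) * (r - 1) / (N - 1)

w_default = [rank_weight(r, 5, 1.5) for r in 1:5]   # [1.5, 1.25, 1.0, 0.75, 0.5]
w_max     = [rank_weight(r, 5, 2.0) for r in 1:5]   # [2.0, 1.5, 1.0, 0.5, 0.0]
w_uniform = [rank_weight(r, 5, 1.0) for r in 1:5]   # [1.0, 1.0, 1.0, 1.0, 1.0]

# The weights sum to N for any s, so dividing by N gives selection probabilities.
probs = w_default ./ 5
```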

Fields

  • selection_pressure::Float64: typical values in [1.1, 2.0] (default 1.5).

Arborist.TournamentSelection - Type
TournamentSelection <: AbstractSelectionStrategy

Selection strategy that picks tournament_size individuals at random and returns the one with the best (lowest) fitness.
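
As a rough standalone sketch, a tournament reduces to sampling indices and taking the one with the lowest fitness. The function tournament_pick is hypothetical, and drawing contestants with replacement is an assumption, since the sampling scheme is not stated here.

```julia
using Random

# Contestants are drawn with replacement in this sketch (an assumption).
function tournament_pick(fitness::AbstractVector, tournament_size::Int, rng::AbstractRNG)
    contestants = rand(rng, eachindex(fitness), tournament_size)
    return contestants[argmin(fitness[contestants])]   # lowest fitness wins
end

rng = MersenneTwister(42)
winner = tournament_pick([3.0, 1.0, 2.0, 4.0], 2, rng)   # index of the better of two random picks
```

Larger tournament_size values raise selection pressure, because the best individual is more likely to appear among the contestants.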

Fields

  • tournament_size::Int: number of individuals competing in each tournament

Arborist.TruncationSelection - Type
TruncationSelection(; ratio::Float64=0.5) <: AbstractSelectionStrategy

Pick uniformly at random from the top ceil(ratio * N) individuals (sorted ascending by fitness — best first). ratio=1.0 degenerates to uniform selection over the whole population; ratio=1/N selects only the best.

Aggressive but predictable. Commonly used inside evolution-strategy variants where intense exploitation is desired.
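
A minimal sketch of the rule, assuming fitness is a plain vector where lower is better; truncation_pick is a hypothetical helper, not the library's implementation.

```julia
using Random

function truncation_pick(fitness::AbstractVector, ratio::Float64, rng::AbstractRNG)
    keep = ceil(Int, ratio * length(fitness))   # size of the surviving top slice
    top = partialsortperm(fitness, 1:keep)      # indices of the `keep` lowest fitnesses
    return rand(rng, top)                       # uniform pick among the survivors
end

rng = MersenneTwister(7)
fitness = [3.0, 1.0, 2.0, 4.0]
truncation_pick(fitness, 0.5, rng)    # returns 2 or 3, the two best individuals
truncation_pick(fitness, 0.25, rng)   # keep == 1, so this returns 2, the single best
```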

Fields

  • ratio::Float64: fraction of the population to keep (default 0.5).

Arborist.AddConnectionMutation - Type
AddConnectionMutation <: AbstractMutationOperator

Structural mutation: add a new enabled connection between two existing nodes. Tries max_attempts random (from, to) pairs before giving up. Input/bias nodes are never destinations; output nodes are never sources.

Arborist.AddNodeMutation - Type
AddNodeMutation <: AbstractMutationOperator

Structural mutation: disable a random enabled connection and insert a new hidden node on the split. The new node picks an activation uniformly from hidden_activations. The incoming edge to the new node has weight 1.0; the outgoing edge inherits the original connection's weight.
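
The split can be illustrated on a toy connection list. The Conn struct and split_connection function are hypothetical stand-ins for this sketch; they omit innovation numbers and activations, and do not reflect GraphGenome's actual layout.

```julia
# Toy connection record (hypothetical; not Arborist's GraphGenome representation).
struct Conn
    from::Int
    to::Int
    weight::Float64
    enabled::Bool
end

# Disable connection `idx` and route it through `new_node`.
function split_connection(conns::Vector{Conn}, idx::Int, new_node::Int)
    old = conns[idx]
    out = copy(conns)
    out[idx] = Conn(old.from, old.to, old.weight, false)     # original edge is disabled
    push!(out, Conn(old.from, new_node, 1.0, true))          # incoming edge gets weight 1.0
    push!(out, Conn(new_node, old.to, old.weight, true))     # outgoing edge inherits the weight
    return out
end

conns = [Conn(1, 2, 0.7, true)]
split_connection(conns, 1, 3)   # three records: the disabled original plus 1->3 and 3->2
```

With weight 1.0 on the incoming edge and the original weight on the outgoing edge, the signal scaling along the path is unchanged apart from the new node's activation.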

Arborist.NEATCrossover - Type
NEATCrossover <: AbstractCrossoverOperator

NEAT-style innovation-aligned crossover for GraphGenome (Stanley & Miikkulainen, 2002). Matching genes are inherited randomly from either parent; disjoint/excess genes come from the fitter parent. Produces two children — the second swaps parent order so that the less-fit parent's disjoint/excess genes are also preserved in the population.
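
The inheritance rule can be sketched with genes reduced to a map from innovation number to weight. The function inherit_genes and the dictionary representation are assumptions made for illustration only.

```julia
using Random

# `primary` plays the role of the fitter parent: it contributes all of its
# disjoint/excess genes, while matching genes come from either parent at random.
function inherit_genes(primary::AbstractDict{Int,Float64},
                       secondary::AbstractDict{Int,Float64},
                       rng::AbstractRNG)
    child = Dict{Int,Float64}()
    for (innovation, weight) in primary
        if haskey(secondary, innovation)
            child[innovation] = rand(rng, Bool) ? weight : secondary[innovation]
        else
            child[innovation] = weight
        end
    end
    return child
end

rng = MersenneTwister(3)
fitter = Dict(1 => 0.5, 2 => -1.0, 4 => 2.0)
other = Dict(1 => 0.1, 3 => 0.9)
child1 = inherit_genes(fitter, other, rng)   # has innovations 1, 2, 4
child2 = inherit_genes(other, fitter, rng)   # swapped order: has innovations 1, 3
```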

Arborist.NEATDefaultMutation - Type
NEATDefaultMutation <: AbstractMutationOperator

Composite operator that reproduces the canonical NEAT mutation branching (Stanley & Miikkulainen, 2002). A single mutate call picks one branch by the cumulative probabilities derived from the positive weights below.

Defaults: 0.80 weight perturb / 0.10 weight replace / 0.05 add connection / 0.03 add node / 0.02 toggle — matching the original bare mutate(g, rng) implementation.
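
Picking a branch by cumulative probability can be sketched as below. The branch labels and the function pick_branch are hypothetical; only the five default weights come from the text above.

```julia
using Random

branch_names = ["perturb", "replace", "add_connection", "add_node", "toggle"]
branch_weights = [0.80, 0.10, 0.05, 0.03, 0.02]

# Map a uniform draw u in [0, 1) to a branch index via the cumulative distribution.
function pick_branch(weights::Vector{Float64}, u::Float64)
    cum = cumsum(weights ./ sum(weights))                      # normalize, then accumulate
    return min(searchsortedfirst(cum, u), length(weights))     # guard against rounding at the top
end

branch_names[pick_branch(branch_weights, 0.50)]    # "perturb"
branch_names[pick_branch(branch_weights, 0.85)]    # "replace"
branch_names[pick_branch(branch_weights, 0.999)]   # "toggle"
branch_names[pick_branch(branch_weights, rand(MersenneTwister(0)))]   # a random branch
```

Normalizing inside pick_branch means the weights only need to be positive, not to sum to one.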

Each sub-operator's parameters are exposed via keyword arguments for tuning (e.g. perturb_sigma, max_attempts); see the per-operator docstrings.

Arborist.WeightPerturbMutation - Type
WeightPerturbMutation <: AbstractMutationOperator

Per-connection Gaussian weight perturbation. Each enabled connection is perturbed with probability perturb_prob by adding noise drawn from Normal(0, perturb_sigma). Defaults match NEAT: perturb_prob=0.9, perturb_sigma=0.3.
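
A minimal sketch of the per-connection rule, operating on a plain vector of enabled-connection weights. The function perturb_weights is hypothetical, and treating the perturbation as additive noise is an assumption of this sketch.

```julia
using Random

function perturb_weights(weights::AbstractVector{Float64}, rng::AbstractRNG;
                         perturb_prob=0.9, perturb_sigma=0.3)
    # Each weight independently receives Normal(0, perturb_sigma) noise with probability perturb_prob.
    return [rand(rng) < perturb_prob ? w + perturb_sigma * randn(rng) : w for w in weights]
end

rng = MersenneTwister(11)
perturb_weights([0.5, -1.2, 2.0], rng)                     # most weights shift slightly
perturb_weights([0.5, -1.2, 2.0], rng; perturb_prob=0.0)   # nothing is perturbed
```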

Arborist.WeightReplaceMutation - Type
WeightReplaceMutation <: AbstractMutationOperator

Full redraw of one random connection weight from Normal(0, replace_sigma). Default matches NEAT: replace_sigma=2.0.

Arborist.LLMCallStats - Type
LLMCallStats

Mutable accumulator for LLM mutation call outcomes and token usage. Updated internally by mutate(::LLMMutationOperator, ...) — no external instrumentation needed.

Token fields (input_tokens, output_tokens) are extracted from the API response usage object when available (Anthropic and OpenAI both provide this). Character-count fields (input_chars, output_chars) are always populated and can be used as a ~4 chars/token estimate when the API doesn't return exact counts (e.g. some Ollama versions).

Arborist.LLMMutationOperator - Type
LLMMutationOperator <: AbstractMutationOperator

Mutation operator that uses an LLM to generate semantically meaningful program variations. Implements the FunSearch/AlphaEvolve pattern: serialize the genome to source, prompt the LLM to improve it, parse and validate the response.

Falls back to the provided fallback_op on any failure (API error, timeout, parse failure, type-consistency failure). The evolutionary loop is never interrupted by LLM failures.

Fields

  • endpoint::String: API endpoint URL
  • model::String: model identifier
  • api_key_env::String: name of environment variable holding the API key (empty string means no key needed, e.g. for local Ollama)
  • system_prompt::String: system prompt for the LLM
  • temperature::Float64: sampling temperature
  • max_tokens::Int: maximum response tokens
  • timeout_seconds::Float64: HTTP request timeout
  • fallback_op::AbstractMutationOperator: operator to use on any failure
  • sections::Vector{AbstractPromptSection}: prompt enrichment sections (default: empty — no enrichment, identical to pre-enrichment behavior)
  • context::Union{MutationContext, Nothing}: populated by the solve loop each generation; nothing until the first generation runs
  • stats::LLMCallStats: accumulated call outcomes and token usage
  • debug_log::Union{IO, Nothing}: when set, writes the full user message, raw LLM response, and outcome for each call. Set to open("log.jsonl", "w") or stdout for debugging; nothing (default) disables logging.

Arborist.LLMMutationOperator - Method
LLMMutationOperator(; kwargs...) -> LLMMutationOperator

Construct an LLMMutationOperator with keyword arguments and sensible defaults.

Supports Anthropic API (default), OpenAI API, and local Ollama without code changes — only constructor arguments differ.

Arborist.AbstractPromptSection - Type
AbstractPromptSection

Base type for prompt enrichment sections. Each subtype implements render(section, context) -> String that produces a text block (or "" to skip) given the current MutationContext.

Users compose a Vector{AbstractPromptSection} on the LLMMutationOperator to control what context the LLM sees.

Arborist.ElitesSection - Type
ElitesSection(k=3)

Shows the top-K programs from the current population with their fitnesses. This is the core FunSearch/AlphaEvolve pattern: giving the LLM scored examples of what works well so it can learn from them.

Arborist.FitnessSection - Type
FitnessSection()

Shows the parent genome's fitness and rank, plus population best and mean. Gives the LLM a sense of how good the current program is and what the target looks like.

Arborist.GenerationSection - Type
GenerationSection()

Shows the current generation number and a phase label (early/mid/late). Allows the LLM to calibrate mutation aggressiveness: explore early, exploit late.

Arborist.MutationContext - Type
MutationContext

Mutable container for population state that the solve loop fills in each generation. LLM mutation operators read this to build enriched prompts.

Population-level fields (generation, max_generations, fitnesses, genomes_serialized) are set once per generation by _update_llm_contexts!. Per-call fields (parent_fitness, parent_rank) are set inside the breeding loop by _set_parent_context!, after tournament selection but before mutate.


Functions

Arborist.mutate - Method
mutate(op::ExpansionMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random leaf node in rvalue position and wrap it in a function call. Returns the genome unchanged if no suitable wrapping function exists.

Arborist.mutate - Method
mutate(op::HoistMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random non-leaf subtree (an Expr with at least one Expr child) and replace it with one of its Expr children. Falls back to SubtreeMutation if no non-leaf subtree exists.

Arborist.mutate - Method
mutate(op::PointMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Select a random sub-expression in the genome body and apply a local modification using the existing mutate! function from codegen.jl.

Arborist.mutate - Method
mutate(op::SubtreeMutation, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Replace a random statement in the genome body with a new randomly generated statement.

Arborist.crossover - Method
crossover(op::SubtreeCrossover, g1::ExprGenome, g2::ExprGenome, rng::AbstractRNG) -> Tuple{ExprGenome, ExprGenome}

Produce two offspring by performing subtree crossover on the body expressions of two parent genomes. Falls back to copies of the parents if no compatible subtree pair is found.

Arborist.neat_defaults - Method
neat_defaults() -> NamedTuple

Return (mutation_ops, crossover_ops) populated with NEATDefaultMutation() and NEATCrossover() for drop-in use with GeneticProgramming, NSGAII, or IslandModel when evolving GraphGenome:

ops = neat_defaults()
algorithm = GeneticProgramming(; pop_size=150, generations=100,
                                mutation_ops=ops.mutation_ops,
                                crossover_ops=ops.crossover_ops)

Arborist.mutate - Method
mutate(op::LLMMutationOperator, g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Serialize the genome, send it to an LLM for semantic mutation, parse and validate the response. Falls back to op.fallback_op on any failure.

Failures never propagate to the caller: each one is logged with @warn and never rethrown, so the evolutionary loop keeps running even at a 100% LLM failure rate.

Arborist.mutate - Method
mutate(op::LLMMutationOperator, g::GraphGenome, rng::AbstractRNG) -> GraphGenome

LLM-driven semantic mutation of a NEAT-style GraphGenome. Serializes the genome via the node-and-connection text format, queries the LLM, and parses the response back with deserialize(GraphGenome, ...; reassign_innovations=true). Reassigning innovation IDs ensures that LLM-generated IDs never collide with the parent pool's innovation history; content-aware alignment is left to a later phase.

Falls back to op.fallback_op on any failure. The fallback operator must dispatch on GraphGenome, typically a NEAT operator from neat_defaults() or NEATDefaultMutation(). If the default SubtreeMutation() fallback is left in place, the fallback itself raises a MethodError, so callers using LLM mutation on GraphGenome should explicitly set fallback_op=NEATDefaultMutation().

Arborist.render - Method
render(section::AbstractPromptSection, ctx::MutationContext) -> String

Render this section into a text block for the LLM prompt. Return "" if the context does not contain enough information (graceful degradation).

Arborist.render_enrichment - Method
render_enrichment(sections, ctx) -> String

Concatenate the non-empty renders of all sections. Returns "" if every section produces empty output or the sections list is empty.
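
The render / render_enrichment contract can be mimicked with stand-in types. Everything below is hypothetical: SketchSection, SketchContext, the phase thresholds (thirds of the run), the output wording, and the blank-line separator are assumptions chosen for illustration, not Arborist's actual types or formatting.

```julia
abstract type SketchSection end

# Stand-in for the population state a section may read.
struct SketchContext
    generation::Int
    max_generations::Int
end

struct PhaseSection <: SketchSection end
struct SilentSection <: SketchSection end

# A section returns "" when the context does not carry enough information.
function sketch_render(::PhaseSection, ctx::SketchContext)
    ctx.max_generations <= 0 && return ""
    frac = ctx.generation / ctx.max_generations
    phase = frac < 1 / 3 ? "early" : frac < 2 / 3 ? "mid" : "late"
    return "Generation $(ctx.generation) of $(ctx.max_generations) ($phase phase)"
end
sketch_render(::SilentSection, ::SketchContext) = ""

# Concatenate only the non-empty blocks; an empty list yields "".
function sketch_enrichment(sections, ctx::SketchContext)
    blocks = filter(!isempty, String[sketch_render(s, ctx) for s in sections])
    return join(blocks, "\n\n")
end

ctx = SketchContext(2, 10)
sketch_enrichment([PhaseSection(), SilentSection()], ctx)   # "Generation 2 of 10 (early phase)"
sketch_enrichment(SketchSection[], ctx)                     # ""
```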


Constants

Arborist._http_post - Constant
_http_post

Module-level hook for HTTP POST calls. Default implementation uses Downloads.jl. Tests can replace this with a mock function via _http_post[] = mock_fn.
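
The Ref-based hook pattern can be sketched in isolation. The names http_hook and send_request, the (url, headers, body) argument list, and the mock URL are all assumptions for this illustration; the real _http_post signature is not documented here.

```julia
# A module-level Ref holding the function used for HTTP POST (hypothetical signature).
http_hook = Ref{Function}((url, headers, body) -> error("network disabled in this sketch"))

# Callers always go through the hook, so swapping it redirects every request.
send_request(url, body) = http_hook[](url, ["Content-Type" => "application/json"], body)

# In a test, install a mock that records the call and returns a canned response.
seen_urls = String[]
http_hook[] = (url, headers, body) -> (push!(seen_urls, url); "{\"ok\": true}")

response = send_request("http://localhost/mock", "{}")   # served by the mock, no network access
```

Restoring the original function after the test keeps the mock from leaking into other tests.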
