Genome Types — API Reference

Autogenerated reference for every exported genome-related symbol (AbstractGenome, the concrete genomes, the GPProblem carrier, the per-genome evaluators and primitives, the all-time-best HallOfFame container, and the Graphviz to_dot visualization helper).

Types

Arborist.AbstractCrossoverOperator — Type

AbstractCrossoverOperator

Base type for crossover operators.

Concrete subtypes define crossover(op, g1::AbstractGenome, g2::AbstractGenome, rng::AbstractRNG) -> Tuple. They may also override operator_name(op) -> Symbol to set the key used in RunLog's per-operator tallies.

Arborist.AbstractEvaluator — Type

AbstractEvaluator

Base type for fitness evaluators.

Any concrete subtype E <: AbstractEvaluator must implement:

evaluate(e::E, f::Function) -> Float64 (lower is better)
input_signature(e::E) -> Dict{Symbol, DataType}
output_signature(e::E) -> Dict{Symbol, DataType}

Optionally, evaluators that can decompose fitness into independent per-case losses (e.g. per-row MSE, per-sample squared error) may implement evaluate_cases(g::AbstractGenome, e::E) -> Vector{Float64}. This is required for lexicase selection; evaluators that cannot meaningfully decompose (e.g. AntEvaluator, EpisodicEvaluator) should leave it unimplemented — lexicase will then raise a clear MethodError.
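The required methods above can be sketched with a toy evaluator. This is only an illustration under stated assumptions: the abstract type is redeclared locally so the snippet runs without Arborist, and SquareTargetEvaluator with its field xs is a hypothetical name, not part of the package.

```julia
# Stand-in for Arborist.AbstractEvaluator so the sketch runs on its own.
abstract type AbstractEvaluator end

# Hypothetical evaluator: scores a candidate function against x^2 on fixed samples.
struct SquareTargetEvaluator <: AbstractEvaluator
    xs::Vector{Float64}
end

# Mean squared error; lower is better, 0.0 means a perfect match on every sample.
function evaluate(e::SquareTargetEvaluator, f::Function)
    total = 0.0
    for x in e.xs
        total += (f(x) - x^2)^2
    end
    return total / length(e.xs)
end

# Signatures describe the named inputs and outputs of the evolved function.
input_signature(::SquareTargetEvaluator) = Dict{Symbol, DataType}(:x => Float64)
output_signature(::SquareTargetEvaluator) = Dict{Symbol, DataType}(:y => Float64)

e = SquareTargetEvaluator([-1.0, 0.0, 2.0])
perfect = evaluate(e, x -> x * x)   # candidate that reproduces the target
worse = evaluate(e, x -> x)         # poorer candidate, so a larger loss
```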

Arborist.AbstractEvolutionResult — Type

AbstractEvolutionResult

Base type for results returned by solve.

Arborist.AbstractEvolutionaryAlgorithm — Type

AbstractEvolutionaryAlgorithm

Base type for evolutionary algorithm configurations.

Arborist.AbstractGenome — Type

AbstractGenome

Base type for all genome representations in Arborist.

A concrete subtype G <: AbstractGenome participates in evolution by providing the operations the solve path needs. At minimum:

mutate(op, g::G, rng::AbstractRNG) -> G for each mutation operator that dispatches on G (or a direct mutate(g::G, rng) method for genome types that use direct dispatch, e.g. AntGenome, GraphGenome).
crossover(op, g1::G, g2::G, rng::AbstractRNG) -> Tuple{G, G} (or a direct crossover(g1, g2, rng) method for direct-dispatch genomes).
distance(g1::G, g2::G) -> Float64 — used by ThresholdSpeciation.
complexity(g::G) -> Real — used by bloat penalty and ParsimonyEvaluator.
serialize(g::G) -> String — used by the LLM operator and logging.

Population initialization is genome-specific. Each genome type defines its own construction path invoked from the matching solve method; the signature is not fixed — ExprGenome uses a GenState, TreeGenome takes an OperatorEnum and feature count, AntGenome takes a primitive set, and GraphGenome takes input/output counts. See the per-genome solve(::GPProblem{G,E}, ::GeneticProgramming) methods.

deserialize(::Type{G}, s::String, ctx...) is required only when using the LLM mutation operator on G; its extra arguments depend on G (e.g. GenState for ExprGenome, (OperatorEnum, n_features) for TreeGenome). LLMMutationOperator currently dispatches on ExprGenome only.
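As a rough illustration of the genome contract, the following sketch implements the direct-dispatch forms of the listed operations for a toy bit-string genome. BitGenome is hypothetical, and AbstractGenome is redeclared locally so the snippet is self-contained; a real genome type would subtype the package's own abstract type.

```julia
using Random

# Stand-in for Arborist.AbstractGenome so the sketch runs on its own.
abstract type AbstractGenome end

# Hypothetical genome: a fixed-length bit string.
struct BitGenome <: AbstractGenome
    bits::Vector{Bool}
end

# Direct-dispatch mutation: flip one randomly chosen bit in a copy.
function mutate(g::BitGenome, rng::AbstractRNG)
    bits = copy(g.bits)
    i = rand(rng, 1:length(bits))
    bits[i] = !bits[i]
    return BitGenome(bits)
end

# Direct-dispatch one-point crossover returning two offspring.
function crossover(g1::BitGenome, g2::BitGenome, rng::AbstractRNG)
    cut = rand(rng, 1:(length(g1.bits) - 1))
    child1 = BitGenome(vcat(g1.bits[1:cut], g2.bits[(cut + 1):end]))
    child2 = BitGenome(vcat(g2.bits[1:cut], g1.bits[(cut + 1):end]))
    return (child1, child2)
end

# Hamming distance, node count, and a plain-text form.
distance(g1::BitGenome, g2::BitGenome) = Float64(count(g1.bits .!= g2.bits))
complexity(g::BitGenome) = length(g.bits)
serialize(g::BitGenome) = join(Int.(g.bits))

rng = MersenneTwister(42)
parent = BitGenome([true, false, true, false])
child = mutate(parent, rng)
```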

Arborist.AbstractMutationOperator — Type

AbstractMutationOperator

Base type for mutation operators.

Concrete subtypes define mutate(op, g::AbstractGenome, rng::AbstractRNG) -> AbstractGenome. Concrete subtypes may also override operator_name(op) -> Symbol to expose a friendly key for RunLog's per-operator tallies. The default derives the name from the struct type.

Arborist.AbstractSelectionStrategy — Type

AbstractSelectionStrategy

Base type for parent selection strategies (e.g., tournament selection, lexicase selection).

Concrete subtypes must implement:

select_parent(s::S, selection_fitnesses::Vector{Float64}, case_fitnesses, rng) returns the integer index of the selected parent in genomes / selection_fitnesses. Case-based strategies (lexicase) read case_fitnesses; scalar-fitness strategies (tournament) ignore it.
needs_cases(s::S) -> Bool (default false). Strategies that return true cause the solve loop to materialize a per-case fitness vector for every individual each generation via evaluate_cases. Returning true requires the evaluator to implement evaluate_cases; otherwise the solve loop raises MethodError the first time it tries.
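A scalar-fitness strategy that satisfies the two methods above might look like the sketch below. TournamentSketch and its field k are hypothetical names, and the abstract type is redeclared locally; the package's own tournament strategy may differ in its details.

```julia
using Random

# Stand-in for Arborist.AbstractSelectionStrategy so the sketch runs on its own.
abstract type AbstractSelectionStrategy end

# Hypothetical tournament strategy with tournament size k.
struct TournamentSketch <: AbstractSelectionStrategy
    k::Int
end

# Scalar strategies do not need per-case fitnesses.
needs_cases(::AbstractSelectionStrategy) = false

# Draw k random individuals and return the index of the one with the lowest fitness.
function select_parent(s::TournamentSketch, selection_fitnesses::Vector{Float64},
                       case_fitnesses, rng::AbstractRNG)
    best = rand(rng, 1:length(selection_fitnesses))
    for _ in 2:s.k
        challenger = rand(rng, 1:length(selection_fitnesses))
        if selection_fitnesses[challenger] < selection_fitnesses[best]
            best = challenger
        end
    end
    return best   # case_fitnesses is ignored by this strategy
end

rng = MersenneTwister(7)
idx = select_parent(TournamentSketch(3), [0.9, 0.1, 0.5], nothing, rng)
```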

Arborist.AbstractSpeciation — Type

AbstractSpeciation

Base type for speciation strategies.

Arborist.AbstractTopology — Type

AbstractTopology

Base type for island migration topologies. Concrete subtypes define migration_targets(t, i, n_islands, rng) returning destination island indices.
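For illustration, a ring topology can be written against the migration_targets signature above. RingSketch is a hypothetical name and the abstract type is redeclared locally; the sketch assumes islands are numbered 1 through n_islands.

```julia
using Random

# Stand-in for Arborist.AbstractTopology so the sketch runs on its own.
abstract type AbstractTopology end

# Hypothetical ring: island i sends migrants to island i + 1, wrapping around.
struct RingSketch <: AbstractTopology end

function migration_targets(::RingSketch, i::Int, n_islands::Int, rng::AbstractRNG)
    return [mod1(i + 1, n_islands)]   # rng is unused by a deterministic ring
end

rng = MersenneTwister(0)
targets_first = migration_targets(RingSketch(), 1, 4, rng)
targets_last = migration_targets(RingSketch(), 4, 4, rng)
```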

Arborist.ExprGenome — Type

ExprGenome <: AbstractGenome

Genome representation based on Julia Expr trees. Wraps the existing codegen.jl / evolution.jl infrastructure.

Fields

body::Vector{Expr}: body statements (not yet wrapped in a function harness)
state::GenState: type context carrying variable types, function set, etc.

Known limitations

serialize / deserialize round-trip is ~80% reliable. repr()-style Float32(literal) forms produced by the Julia printer fail type-checking on round-trip. The LLM operator falls back silently to a classical operator, but checkpoint/resume or cross-process migration can lose a fraction of individuals.
@eval grows Julia's method table monotonically across generations. Long runs (thousands of generations × hundreds of individuals) accumulate tens of thousands of methods, slowing dispatch. Use TreeGenome for long runs where applicable.

Arborist.GPProblem — Type

GPProblem{G<:AbstractGenome, E<:AbstractEvaluator}

Problem specification for genetic programming. Combines an evaluator (which defines the fitness landscape) with a genome type and configuration.

Fields

evaluator::E: the fitness evaluator
genome_type::Type{G}: the genome type to evolve
function_set::FunctionSet: available functions for code generation
num_temps::Int: number of temporary variables per genome
seed::Union{Int, Nothing}: random seed for reproducibility (nothing for no seeding)

Arborist.GPProblem — Method

GPProblem(evaluator, ::Type{G}; function_set, num_temps, seed) -> GPProblem

Construct a GPProblem with keyword arguments and sensible defaults.

Arborist.AntEvaluator — Type

AntEvaluator <: AbstractEvaluator

Fitness evaluator for AntGenome. Compiles and executes the evolved program with an AntSimulator. Returns the number of uneaten food pellets as fitness (lower is better, 0 = perfect).

Arborist.AntGenome — Type

AntGenome <: AbstractGenome

A genome for evolving programs that control an agent via side-effectful primitives. Unlike ExprGenome, AntGenome does not require a typed input/output signature — the program operates on implicit agent state via a module-level simulator reference.

Suitable for the Santa Fe Ant Trail and similar control problems.

Fields

program::Expr: a :block expression of nested primitive calls and control flow
primitives::Vector{Symbol}: action primitives (consume moves)
conditions::Vector{Symbol}: condition primitives (sensors)
max_depth::Int: maximum program depth

Known limitations

Not thread-safe. The simulator uses a module-level Ref for state. GeneticProgramming(; parallel=true) with AntGenome raises a runtime error. Use parallel=false or refactor to a thread-local-state pattern (as demonstrated in examples/bin_packing.jl and examples/sorting.jl).

Arborist.AntSimulator — Type

AntSimulator

Mutable state for ant simulation on a toroidal grid.

Arborist.ConnectionGene — Type

ConnectionGene

A directed connection between two nodes.

Arborist.EpisodicEvaluator — Type

EpisodicEvaluator{FInit,FDyn,FRew,FDone,FObs,FDec} <: AbstractEvaluator

Evaluates a GraphGenome as a closed-loop policy on an episodic environment defined by declarative callables. The network is treated as obs -> action, and the evaluator drives the loop:

total = 0.0   # reward accumulated over all episodes
for ep in 1:n_episodes
    rng   = MersenneTwister(episode_seed_base + ep)
    state = initial_state(rng)
    for step in 1:max_steps
        obs          = observe(state)
        net_output   = forward_network(state, obs)
        action       = decode_action(net_output)
        next_state   = dynamics(state, action)
        total       += reward(state, action, next_state)
        state        = next_state
        done(state) && break
    end
end

Fitness is -mean_reward_per_episode (framework convention is lower-is-better, so episodic tasks that want to maximise reward are negated). On cycle detection in allow_recurrent=false mode, returns Inf.
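The loop and the sign convention can be exercised on a toy environment. The sketch below is not the package's implementation: rollout_fitness is a hypothetical helper, a plain function stands in for the network's forward pass, and the environment (an integer position that should be driven to 0) is invented for the example.

```julia
using Random

# Hypothetical helper mirroring the documented loop; `policy` maps obs -> network output.
function rollout_fitness(policy; initial_state, dynamics, reward, done, observe,
                         decode_action, max_steps=1000, n_episodes=1,
                         episode_seed_base=0)
    total = 0.0
    for ep in 1:n_episodes
        rng = MersenneTwister(episode_seed_base + ep)   # reproducible per episode
        state = initial_state(rng)
        for step in 1:max_steps
            action = decode_action(policy(observe(state)))
            next_state = dynamics(state, action)
            total += reward(state, action, next_state)
            state = next_state
            done(state) && break
        end
    end
    return -total / n_episodes   # negate so that lower is better
end

# Toy policy: push the position toward zero.
policy = obs -> [-obs[1]]

fitness = rollout_fitness(policy;
    initial_state = rng -> rand(rng, 1:5),
    dynamics = (s, a) -> s + a,
    reward = (s, a, s2) -> -abs(s2),
    done = s -> s == 0,
    observe = s -> [Float64(s)],
    decode_action = out -> round(Int, sign(out[1])),
    max_steps = 10, n_episodes = 3)
```

Because each episode seeds its own MersenneTwister from episode_seed_base + ep, repeated calls with the same arguments produce the same fitness.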

Fields

n_inputs::Int / n_outputs::Int — dimensions the network expects, must match length(observe(state)) and length(net_output).
initial_state::FInit — rng -> state. Must be reproducible from rng.
dynamics::FDyn — (state, action) -> next_state.
reward::FRew — (state, action, next_state) -> Float64.
done::FDone — state -> Bool. Stops the episode early when true.
observe::FObs — state -> Vector{Float64} of length n_inputs.
decode_action::FDec — Vector{Float64} of length n_outputs → action.
max_steps::Int — per-episode step cap.
n_episodes::Int — rollouts averaged per evaluate_genome call.
episode_seed_base::Int — rng = MersenneTwister(base + ep_index).
activation_fns::Dict{Symbol,Function} — defaults to ACTIVATION_FNS.
allow_recurrent::Bool — defaults true (episodic tasks usually want persistent hidden-node state across timesteps).
relaxation_passes::Int — recurrent-mode sweeps per step; default 1.

Design

The shape is declarative / pure-functional by default (see memory/episodicevaluatordesign.md). For environments with heavy reusable state (physics-engine handle, loaded dataset), a future StatefulEpisodicEvaluator subtype can offer the reset!/step! idiom; it is intentionally not built yet.

Known limitations

Not parallel-safe for stateful environments. The declarative API is structurally thread-safe when every callable is pure, but a dynamics closure that captures mutable state will race under GeneticProgramming(; parallel=true). Use parallel=false for stateful environments until StatefulEpisodicEvaluator lands.

Arborist.EpisodicEvaluator — Method

EpisodicEvaluator(n_inputs, n_outputs, initial_state, dynamics, reward, done,
                  observe, decode_action; max_steps, n_episodes, ...)

Outer constructor. Keyword-argument defaults:

max_steps = 1000
n_episodes = 1
episode_seed_base = 0
activation_fns = ACTIVATION_FNS
allow_recurrent = true
relaxation_passes = 1

Arborist.GraphEvaluator — Type

GraphEvaluator <: AbstractEvaluator

Evaluates a GraphGenome by building the neural network from the genome topology, running it on input data, and computing MSE against target outputs.

Fields

input_data::Matrix{Float64}: n_inputs × n_samples. In recurrent mode, samples are treated as a time sequence and node activations persist across samples.
output_data::Matrix{Float64}: n_outputs × n_samples.
activation_fns::Dict{Symbol, Function}: activation function lookup.
allow_recurrent::Bool: when true, cycles in the genome are allowed and evaluation uses a relaxation loop with state that persists across samples. Default false — cycles return Inf, state resets per sample.
relaxation_passes::Int: number of activation sweeps per sample when allow_recurrent=true. Default 1. Higher values let information propagate further through the network within a single sample.

Arborist.GraphEvaluator — Method

GraphEvaluator(input_data, output_data;
               activation_fns=ACTIVATION_FNS,
               allow_recurrent=false,
               relaxation_passes=1)

Construct a GraphEvaluator. Defaults match the original feedforward behavior: cycles return Inf, per-sample state reset, single forward pass. Pass allow_recurrent=true for sequence/memory tasks where node activations should persist across samples (and cycles are legal).

Arborist.GraphGenome — Type

GraphGenome <: AbstractGenome

A genome representing a neural network topology, following the NEAT encoding (Stanley & Miikkulainen, 2002). Supports structural mutation (add node, add connection) and weight mutation, plus crossover aligned by innovation number.

Fields

nodes::Dict{Int, NodeGene}: node genes keyed by node ID
connections::Dict{Int, ConnectionGene}: connection genes keyed by innovation number
n_inputs::Int: number of input nodes (not counting bias)
n_outputs::Int: number of output nodes
fitness::Float64: cached fitness value

Known limitations

Distributed NEAT innovation matching is disjoint-range, not content-aware. Under IslandModel(distributed=true), each worker gets a unique innovation ID range via init_innovation_range!((island_id - 1) * INNOVATION_STRIDE) so IDs don't collide. The cost: structurally identical mutations on different workers receive different IDs and are treated as disjoint by NEAT crossover rather than aligned. Per-generation cross-worker innovation dedup is not implemented.

Arborist.GraphGenomeContext — Type

GraphGenomeContext

Per-island state carrier for GraphGenome under IslandModel. Parallels GenState (ExprGenome) and TreeGenomeContext (TreeGenome): all three carry .rng so that island-loop sites reading state.rng work uniformly, and all three are the second element of the tuple returned from _initialize_population.

The extra n_inputs / n_outputs fields are kept available for future use (e.g. cross-island initialization) but aren't currently consulted — migrant GraphGenomes carry their own n_inputs / n_outputs.

Arborist.NodeGene — Type

NodeGene

A single node in a neural network topology genome.

Arborist.ADFGenome — Type

ADFGenome{T} <: AbstractGenome

Genome with a main expression tree and N = length(adfs) Automatically Defined Function trees. Each ADF is a Node{T} that may reference the ARG slots ARG0..ARG{arity-1} (encoded as features above the user's n_features).

The main tree may invoke any ADF as a binary operator at slot n_base_binary + i. ADFs themselves may reference features [1, n_features] and ARG slots; nested ADF-from-ADF calls are not currently supported (ADF body is generated without ADF placeholders).

Fields

main::Node{T}: main expression tree.
adfs::Vector{Node{T}}: ADF body trees, one per ADF.
arity::Int: shared arity of every ADF (default 2).
operators::OperatorEnum: the augmented operator enum (base operators + N ADF placeholders). Use base_operators(g) to recover the user's original operator set.
n_features::Int: number of real input features. ARG slots occupy [n_features+1, n_features+arity].
n_adfs::Int: convenience — length(adfs).

Arborist.TreeFitnessEvaluator — Type

TreeFitnessEvaluator{T} <: AbstractEvaluator

Fitness evaluator for TreeGenome. Evaluates the expression tree directly over a data matrix without @eval. Dramatically faster than TableFitnessEvaluator for large datasets.

Fields

X::Matrix{T}: input data, n_features × n_samples
y::Vector{T}: target output, length n_samples
operators::OperatorEnum: operator configuration

Arborist.TreeGenome — Type

TreeGenome{T} <: AbstractGenome

A genome backed by a DynamicExpressions.jl expression tree. Supports fast vectorized evaluation over datasets without @eval. Appropriate for pure function approximation problems.

Type parameter T is the numeric type of the expression (Float32 is recommended for most GP applications).

Fields

tree::Node{T}: the expression tree
operators::OperatorEnum: operator configuration for evaluation
n_features::Int: number of input features

Arborist.TreeGenomeContext — Type

TreeGenomeContext{T}

Per-island state carrier for TreeGenome under IslandModel. Parallels GenState for ExprGenome: both carry .rng so that island-loop sites reading state.rng work uniformly, and both are the second element of the tuple returned from _initialize_population.

The extra operators and n_features fields let from_migrant reconstruct a TreeGenome{T} by invoking deserialize with the destination island's operator enum.

Arborist.MigrantGenome — Type

MigrantGenome

Serializable carrier for genome data during island migration. Contains only data that survives cross-process transfer — GenState, compiled functions, and RNG state are reconstructed locally.

Fields

data::Any: genome-specific payload (Vector{Expr} for ExprGenome, Expr for AntGenome, etc.)
fitness::Float64: fitness on the source island
genome_type::Symbol: identifies the genome type for reconstruction

Arborist.GPResult — Type

GPResult{G<:AbstractGenome} <: AbstractEvolutionResult

Result returned by solve. Contains the best genome found, fitness history, and metadata about the evolutionary run.

Fields

best_genome::G: the genome with the best fitness found during the run
best_fitness::Float64: fitness of the best genome (lower is better)
population::Vector{G}: final population sorted by fitness
fitness_history::Vector{Float64}: best fitness per generation
mean_history::Vector{Float64}: mean finite fitness per generation
generations_run::Int: number of generations completed
wall_time::Float64: elapsed wall-clock time in seconds
converged::Bool: whether the run met the convergence criterion
hall_of_fame::Union{Nothing, HallOfFame{G}}: top-K archive across all generations when solve(...; hall_of_fame_size=K) was called with K > 0; nothing otherwise.

Arborist.HallOfFame — Type

HallOfFame{G<:AbstractGenome}

Bounded top-K archive of the best genomes a solve() has ever seen, across all generations. Maintained in ascending fitness order (best first). Opt-in via the hall_of_fame_size::Int kwarg on solve() — size == 0 is the default and produces nothing in GPResult.hall_of_fame.

Fields

capacity::Int: maximum number of distinct entries retained
genomes::Vector{G}: genome list, best-first
fitnesses::Vector{Float64}: matching fitness list

Dedup

push!(hof, g, f) treats two fitnesses as duplicates when they are within 1e-12 of each other. This cheap filter catches structurally equivalent solutions (identical fitness) without the cost of walking genomes for structural equality. It will occasionally merge two semantically distinct genomes that happen to produce exactly the same fitness; this is acceptable for a Hall of Fame, which is best-effort rather than canonical.
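A bounded, best-first archive with fitness-based dedup can be sketched as follows. TopKSketch is a hypothetical stand-in, not the package's HallOfFame; in particular the inclusive tolerance comparison (<=) and the tol keyword are assumptions.

```julia
# Hypothetical stand-in for a bounded top-K archive, kept in ascending fitness order.
struct TopKSketch{G}
    capacity::Int
    genomes::Vector{G}
    fitnesses::Vector{Float64}
end

TopKSketch{G}(capacity::Int) where {G} = TopKSketch{G}(capacity, G[], Float64[])

function Base.push!(hof::TopKSketch{G}, g::G, f::Float64; tol=1e-12) where {G}
    # Dedup: skip the entry if an archived fitness lies within tol of f.
    any(x -> abs(x - f) <= tol, hof.fitnesses) && return hof
    # Insert at the position that keeps the list sorted, best first.
    i = searchsortedfirst(hof.fitnesses, f)
    insert!(hof.genomes, i, g)
    insert!(hof.fitnesses, i, f)
    # Drop the worst entry once capacity is exceeded.
    if length(hof.fitnesses) > hof.capacity
        pop!(hof.genomes)
        pop!(hof.fitnesses)
    end
    return hof
end

hof = TopKSketch{String}(2)
push!(hof, "a", 3.0)
push!(hof, "b", 1.0)
push!(hof, "c", 1.0 + 1e-13)   # treated as a duplicate of "b"
push!(hof, "d", 2.0)           # evicts "a", the worst entry
```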

Arborist.HallOfFame — Method

HallOfFame{G}(capacity::Int) -> HallOfFame{G}
HallOfFame{G}(; capacity::Int=20) -> HallOfFame{G}

Functions

Arborist.evaluate_cases — Function

evaluate_cases(g::AbstractGenome, e::AbstractEvaluator) -> Vector{Float64}

Per-case loss vector (lower = better) for evaluators that can decompose their fitness into independent cases (per-row, per-sample). Used by lexicase selection.

No default implementation: evaluators that can support lexicase must opt in explicitly. If not implemented, calling it raises MethodError.

Arborist.needs_cases — Method

needs_cases(s::AbstractSelectionStrategy) -> Bool

Return true if the strategy requires per-case fitnesses (evaluate_cases-derived). Default: false.

Arborist.operator_name — Method

operator_name(op) -> Symbol

Stable name for an operator, used as the key in GenerationLog.operator_attempted / operator_success. Default: the concrete type's nameof.

Arborist.select_parent — Function

select_parent(s::AbstractSelectionStrategy, selection_fitnesses, case_fitnesses, rng) -> Int

Select a parent index. selection_fitnesses::Vector{Float64} is the sharing-adjusted scalar fitness used by classical strategies (lower is better). case_fitnesses::Union{Nothing, Vector{Vector{Float64}}} is the per-individual per-case loss matrix used by lexicase strategies (same convention: lower is better; nothing when needs_cases(s) == false).
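To show how a case-based strategy might consume case_fitnesses, here is a minimal lexicase sketch. It assumes case_fitnesses[i][c] is the loss of individual i on case c; lexicase_index is a hypothetical helper, and the package's lexicase strategy may add refinements such as epsilon thresholds.

```julia
using Random

# Hypothetical helper: plain lexicase selection over per-individual case losses.
function lexicase_index(case_fitnesses::Vector{Vector{Float64}}, rng::AbstractRNG)
    candidates = collect(1:length(case_fitnesses))
    n_cases = length(case_fitnesses[1])
    # Visit the cases in random order, keeping only the best candidates on each.
    for c in randperm(rng, n_cases)
        best = minimum(case_fitnesses[i][c] for i in candidates)
        filter!(i -> case_fitnesses[i][c] == best, candidates)
        length(candidates) == 1 && break
    end
    # Break any remaining tie uniformly at random.
    return rand(rng, candidates)
end

# Individual 1 is at least as good as the others on every case.
cases = [[0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]
chosen = lexicase_index(cases, MersenneTwister(0))
```

In this example individual 1 is the only survivor whichever case is visited first, so the result does not depend on the seed.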

Arborist.tree_depth — Function

tree_depth(g::AbstractGenome) -> Int

Longest root-to-leaf path through the genome's expression tree. Defined for tree-structured genomes (ExprGenome, TreeGenome, AntGenome, ADFGenome) and used by mutation/crossover operators that enforce a max_depth cap.

Graph-structured genomes (e.g. GraphGenome) do not define this — MethodError on those is intentional; depth is not a meaningful bound for a recurrent graph.

Leaf convention: a bare leaf (symbol / number / feature node) has depth 1; each additional level of nesting increases depth by 1.
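The leaf convention can be illustrated on plain Julia Expr values. depth_sketch is a hypothetical helper; it assumes the function name of a call is not counted as a child, which may differ from how the package walks its trees.

```julia
# Hypothetical helper: any non-Expr value (symbol, number) is a leaf of depth 1.
depth_sketch(x) = 1

function depth_sketch(ex::Expr)
    # For calls, skip the function name in args[1]; otherwise use every argument.
    children = ex.head == :call ? ex.args[2:end] : ex.args
    isempty(children) && return 1
    return 1 + maximum(depth_sketch(c) for c in children)
end

leaf_depth = depth_sketch(:x)                      # bare leaf
sum_depth = depth_sketch(:(x + 1))                 # one level of nesting
nested_depth = depth_sketch(:(sin(x + 1) * y))     # three levels of nesting
```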

Arborist.complexity — Method

complexity(g::ExprGenome) -> Float64

Total node count across all body statements, measured via unravel.

Arborist.crossover — Method

crossover(g1::ExprGenome, g2::ExprGenome, rng::AbstractRNG) -> Tuple{ExprGenome, ExprGenome}

Produce two offspring via subtree crossover.

Arborist.deserialize — Method

deserialize(::Type{ExprGenome}, s::String, state::GenState) -> Union{ExprGenome, Nothing}

Parse a string of Julia statements into an ExprGenome. Returns nothing if zero valid statements survive parsing and type-checking.

Accepts assignments, while loops, if/if-else statements, for loops, blocks, break, continue, and standalone function calls. Multi-line control flow is supported by parsing the entire string as a block.

Statements that fail parsing or type-checking are skipped (partial recovery) rather than rejecting the whole genome.

Does not eval anything; parse only.

Arborist.deserialize — Method

deserialize(::Type{ExprGenome}, s::String; state::Union{GenState, Nothing}=nothing) -> Union{ExprGenome, Nothing}

Backward-compatible keyword-argument version. Delegates to the positional version when state is provided; returns nothing when it is not.

Arborist.distance — Method

distance(g1::ExprGenome, g2::ExprGenome) -> Float64

Structural compatibility distance. Counts Expr nodes appearing in one program but not the other after type-normalizing.

Arborist.evaluate_cases — Method

evaluate_cases(g::ExprGenome, e::TableFitnessEvaluator) -> Vector{Float64}

Compile the genome and return per-row squared error via the TableFitnessEvaluator case evaluator. On compilation failure, every row is reported as Inf.

Arborist.evaluate_genome — Method

evaluate_genome(g::ExprGenome, evaluator::AbstractEvaluator) -> Float64

Compile and evaluate an ExprGenome against the given evaluator. Returns Inf on any compilation or evaluation failure.

Arborist.initialize — Method

initialize(::Type{ExprGenome}, problem::GPProblem) -> ExprGenome

Create a random ExprGenome using the problem's function set and evaluator signatures.

Arborist.mutate — Method

mutate(g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Produce a mutated copy of the genome by applying a random point mutation to a randomly selected sub-expression.

Arborist.serialize — Method

serialize(g::ExprGenome) -> String

Convert an ExprGenome body to a human-readable Julia source string suitable for inclusion in an LLM prompt. Each statement is printed on its own line using Julia's standard pretty-printer.

Arborist.tree_depth — Method

tree_depth(g::ExprGenome) -> Int

Maximum depth over every statement in g.body. Empty bodies return 0.

Arborist.evaluate_genome — Method

evaluate_genome(g::AntGenome, e::AntEvaluator) -> Float64

Compile and evaluate an AntGenome against the ant trail evaluator.

Arborist.gp_ant_food_ahead — Method

gp_ant_food_ahead(::Bool) -> Bool

Sensor primitive: return true if the cell directly ahead of the ant contains food, false otherwise. Does not consume a move or change the ant's pose. The Bool argument is ignored (placeholder for the evolved program's type scheme).

Arborist.gp_ant_left — Method

gp_ant_left(::Bool) -> Bool

Rotate the ant 90° counter-clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.

Arborist.gp_ant_move — Method

gp_ant_move(::Bool) -> Bool

Advance the ant one cell in its current direction, consuming a move. Eats the food pellet in the destination cell if present. The Bool argument is a placeholder for the evolved program's type scheme and is ignored. Returns true if the move happened, false if the simulator is absent or the ant has exhausted its move budget.

Arborist.gp_ant_right — Method

gp_ant_right(::Bool) -> Bool

Rotate the ant 90° clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.

Arborist.tree_depth — Method

tree_depth(g::AntGenome) -> Int

Longest root-to-leaf path through the ant program's Expr tree. Note: AntGenome also carries a max_depth field which is the construction ceiling used by _random_ant_program — it limits how deeply a fresh random program is generated but does not bound later mutation output. Use the mutation operator's max_depth kwarg for a post-mutation cap.

CommonSolve.solve — Method

solve(problem::GPProblem{AntGenome}, algorithm::GeneticProgramming; ...) -> GPResult

Run GP evolution with AntGenome for side-effectful program synthesis.

Warning

AntGenome uses a module-level simulator reference (_ant_sim_ref) that is not thread-safe. The parallel field on algorithm must be false. For parallel side-effectful evaluation, use thread-local state as demonstrated in the bin packing example.

Arborist.deserialize — Method

deserialize(::Type{GraphGenome}, s::AbstractString, n_inputs, n_outputs;
            reassign_innovations=false) -> Union{GraphGenome, Nothing}

Parse the text emitted by serialize(::GraphGenome) back into a GraphGenome. The format is line-oriented:

N <id> <type> <activation> — node line
C <in>-><out> w=<weight> en=<true|false> i=<innovation> — connection line

Lines not starting with N or C are skipped (tolerates LLM commentary, code fences, etc). Returns nothing when a malformed line is encountered, when a connection references an undefined node, or when n_inputs / n_outputs disagree with the decoded node set.

Preserves node IDs and innovation numbers verbatim — required for content-aware distributed migration and NEAT crossover alignment. Pass reassign_innovations=true to issue a fresh innovation ID to every connection via _next_innovation!(); the LLM mutation path uses this to prevent LLM-generated IDs from colliding with the parent pool's history.
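A parser for the line-oriented format might be sketched as below. This is an assumption-laden illustration, not the package's deserialize: parse_graph_lines and ConnSketch are hypothetical, the node type and activation names in the sample text are invented, and the n_inputs / n_outputs consistency check is omitted.

```julia
# Hypothetical record for one parsed connection line.
struct ConnSketch
    src::Int
    dst::Int
    weight::Float64
    enabled::Bool
end

# Hypothetical parser: returns nothing on malformed N/C lines or dangling connections.
function parse_graph_lines(s::AbstractString)
    nodes = Dict{Int, Tuple{Symbol, Symbol}}()   # id => (type, activation)
    conns = Dict{Int, ConnSketch}()              # innovation => connection
    for raw in split(s, '\n')
        line = strip(raw)
        if startswith(line, "N ")
            m = match(r"^N (\d+) (\w+) (\w+)$", line)
            m === nothing && return nothing
            nodes[parse(Int, m[1])] = (Symbol(m[2]), Symbol(m[3]))
        elseif startswith(line, "C ")
            m = match(r"^C (\d+)->(\d+) w=(\S+) en=(true|false) i=(\d+)$", line)
            m === nothing && return nothing
            w = tryparse(Float64, m[3])
            w === nothing && return nothing
            conns[parse(Int, m[5])] =
                ConnSketch(parse(Int, m[1]), parse(Int, m[2]), w, m[4] == "true")
        end
        # Any other line (commentary, code fences) is skipped.
    end
    # Every connection must reference nodes that were declared.
    for c in values(conns)
        (haskey(nodes, c.src) && haskey(nodes, c.dst)) || return nothing
    end
    return (nodes = nodes, connections = conns)
end

text = """
Here is the mutated genome:
N 1 input identity
N 2 output sigmoid
C 1->2 w=0.5 en=true i=7
"""
parsed = parse_graph_lines(text)
```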

Arborist.evaluate_cases — Method

evaluate_cases(g::GraphGenome, e::GraphEvaluator) -> Vector{Float64}

Return per-sample mean squared error (averaged across outputs) as a Vector{Float64} of length size(e.input_data, 2). Any sample that raises or produces a non-finite squared error is reported as Inf. Used by lexicase selection.

Feedforward mode only: recurrent evaluators have persistent state across samples (samples form a time sequence) so per-sample cases are not independent. Calling this on a recurrent evaluator raises ArgumentError.

Arborist.evaluate_genome — Method

evaluate_genome(g::GraphGenome, e::EpisodicEvaluator) -> Float64

Run e.n_episodes closed-loop rollouts of g as a policy on the environment described by e, return -mean_reward_per_episode.

Arborist.evaluate_genome — Method

evaluate_genome(g::GraphGenome, e::GraphEvaluator) -> Float64

Evaluate a GraphGenome by propagating inputs through the network. Returns mean squared error against target outputs.

Feedforward mode (e.allow_recurrent=false, default): topologically sorts the network; returns Inf on cycle. Each sample is independent — node activations reset between samples.
Recurrent mode (e.allow_recurrent=true): cycles are allowed. Node activations persist across samples (samples are treated as a time sequence). Each sample runs e.relaxation_passes activation sweeps over all non-input nodes in sorted-id order, reading from the previous pass's values for inputs from cyclic edges.

Arborist.init_innovation_range! — Method

init_innovation_range!(offset::Int)

Set the module-local innovation counter to offset. Used by the distributed island model to give each worker a disjoint range of innovation IDs so that NEAT crossover on migrants does not align structurally unrelated genes under the same innovation number.

Callers in distributed mode typically use offsets like (island_id - 1) * 10^9 — disjoint as long as no single worker allocates more than 10^9 structural mutations in a run. The sequential island model does not need this: all islands share the same process-global counter, which already ensures uniqueness.
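The offset arithmetic can be checked with a small stand-alone counter. All names below are hypothetical, and the sketch assumes the counter is incremented before an ID is handed out, which the text above does not specify.

```julia
# Hypothetical stride and module-local counter, mirroring the description above.
const INNOVATION_STRIDE_SKETCH = 10^9
const innovation_counter_sketch = Ref(0)

init_range_sketch!(offset::Int) = (innovation_counter_sketch[] = offset)
next_innovation_sketch!() = (innovation_counter_sketch[] += 1)

# Offset for a worker, using the (island_id - 1) * stride rule.
worker_offset(island_id::Int) = (island_id - 1) * INNOVATION_STRIDE_SKETCH

# Simulate two workers, one after the other, in a single process.
init_range_sketch!(worker_offset(1))
first_on_island1 = next_innovation_sketch!()

init_range_sketch!(worker_offset(3))
first_on_island3 = next_innovation_sketch!()
```

Under these assumptions island 1 allocates IDs starting at 1 and island 3 at 2000000001, so the ranges stay disjoint as long as no worker allocates more than 10^9 IDs.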

Arborist.initialize — Method

initialize(::Type{GraphGenome}, n_inputs, n_outputs, rng) -> GraphGenome

Create a minimal fully-connected network: all inputs connected to all outputs with random weights, no hidden nodes. Includes a bias node.

Arborist.reset_innovation_counter! — Method

reset_innovation_counter!()

Reset the global innovation counter to 0. Must be called at the start of each solve() call for GraphGenome problems.

CommonSolve.solve — Method

solve(problem::GPProblem{GraphGenome}, algorithm::GeneticProgramming; ...) -> GPResult

Run NEAT-style evolution with GraphGenome. Handles initialization, mutation, crossover with innovation-aligned genes, and speciation.

Accepts any AbstractEvaluator that implements evaluate_genome(::GraphGenome, e) and whose input_signature(e) / output_signature(e) lengths match the intended network dimensions — GraphEvaluator for table-based tasks, EpisodicEvaluator for closed-loop control tasks.

Arborist.augmented_operators — Method

augmented_operators(base::OperatorEnum, n_adfs::Int) -> OperatorEnum

Build the operator enum used by an ADFGenome's trees: the user's binary operators followed by n_adfs placeholder binary operators (one per ADF). ADF body trees and the main tree share this enum. The placeholders are never actually invoked — expand_adfs rewrites them before evaluation.

Arborist.base_operators — Method

base_operators(g::ADFGenome) -> OperatorEnum

Recover the user's original operator enum (without the N ADF placeholder slots).

Arborist.crossover — Method

crossover(::SubtreeCrossover, g1::ADFGenome, g2::ADFGenome, rng) -> Tuple

Same-index subtree crossover: pick uniformly among (main, adf1, ..., adfN) and swap subtrees within the chosen tree pair. Requires both genomes to share n_features and n_adfs.

Arborist.evaluate_adf — Method

evaluate_adf(g::ADFGenome{T}, X::Matrix{T}, y::Vector{T}) -> Float64

Expand ADF calls and compute MSE against y over X. Returns Inf on expansion or evaluation failure (typical: ARG references with no enclosing ADF context, i.e. ARG slots leaked into the main tree's expanded form).

Arborist.expand_adfs — Method

expand_adfs(g::ADFGenome{T}) -> Node{T}

Produce a fully-expanded copy of g.main where every ADF call has been replaced by the corresponding ADF body with ARG references substituted for the call's argument subtrees. The result uses only the base operators (no placeholders) and references only real features [1, n_features]. Suitable for direct eval_tree_array evaluation against the user's base_operators.
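The substitution step can be illustrated on plain Julia Expr trees rather than Node{T}. This is only an analogy under stated assumptions: expand_sketch and substitute_sketch are hypothetical, ADF calls are represented as ordinary calls to a named symbol, and ARG slots are the symbols ARG0, ARG1, and so on.

```julia
# Replace ARG symbols in an ADF body with the bound argument subtrees.
substitute_sketch(x, bindings) = x
substitute_sketch(s::Symbol, bindings) = get(bindings, s, s)
substitute_sketch(ex::Expr, bindings) =
    Expr(ex.head, (substitute_sketch(a, bindings) for a in ex.args)...)

# Inline every ADF call in a tree; leaves are returned unchanged.
expand_sketch(x, adfs) = x
function expand_sketch(ex::Expr, adfs::Dict{Symbol, Expr})
    # Expand the arguments first so nested ADF calls in the main tree are handled.
    args = [expand_sketch(a, adfs) for a in ex.args]
    if ex.head == :call && haskey(adfs, args[1])
        bindings = Dict(Symbol("ARG", i - 1) => arg
                        for (i, arg) in enumerate(args[2:end]))
        return substitute_sketch(adfs[args[1]], bindings)
    end
    return Expr(ex.head, args...)
end

adfs = Dict(:adf1 => :(ARG0 * ARG0 + ARG1))
main = :(adf1(x1 + 1, x2))
expanded = expand_sketch(main, adfs)
```

The expanded tree contains no reference to adf1 or to any ARG symbol, only base operators and real features.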

Arborist.initialize_adf — Method

initialize_adf(::Type{ADFGenome{T}}, base_ops, n_features, n_adfs;
               arity=2, max_depth=4, rng) -> ADFGenome{T}

Construct a random ADFGenome with n_adfs ADFs. Main tree uses base operators plus ADF placeholders; ADF bodies use only base operators (nested ADF calls are not generated). ADF bodies may reference ARG slots in addition to the user's features.

Arborist.mutate — Method

mutate(::SubtreeMutation, g::ADFGenome, rng) -> ADFGenome

Pick uniformly among (main, adf1, ..., adfN) and apply subtree mutation to that tree. ADF bodies use the base operator set and may reference ARG slots; main tree uses augmented operators and references only real features.

Arborist.tree_depth — Method

tree_depth(g::ADFGenome) -> Int

Maximum of count_depth across the main tree and every ADF body. This is the depth of the deepest tree stored in the genome, measured before expansion; it does not account for expansion-driven inlining, which can compose depths up to depth(main) + depth(any_adf) - 1 in the fully expanded form.

Arborist.SymbolicRegressionEvaluator — Method

SymbolicRegressionEvaluator(f; domain, points=20, operators=_default_operators(Float32), noise=0.0)

Convenience constructor for symbolic regression problems. Generates a TreeFitnessEvaluator from a Julia function and domain specification.

Arguments

f: Target function (univariate: accepts Float32, multivariate: accepts Vector{Float32})
domain: Tuple{T,T} for univariate, Vector{Tuple{T,T}} for multivariate
points: Sample points per dimension (default: 20)
operators: OperatorEnum (default: +, -, *, / with sin, cos, exp, abs)
noise: Gaussian noise standard deviation to add to targets (default: 0.0)
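To make the arguments concrete, the following sketch builds a univariate dataset from a target function, a domain, a point count, and a noise level. It is not the constructor's actual code: sample_univariate is a hypothetical helper, and both the evenly spaced sampling and the features × samples matrix layout are assumptions that this reference does not confirm.

```julia
using Random

# Hypothetical helper: sample f over `domain` and optionally add Gaussian noise.
function sample_univariate(f, domain::Tuple{T,T}; points::Int=20, noise::Real=0.0,
                           rng::AbstractRNG=Random.default_rng()) where {T<:AbstractFloat}
    lo, hi = domain
    xs = collect(range(lo, hi; length=points))   # evenly spaced points (assumption)
    X = reshape(xs, 1, :)                        # 1 feature × `points` samples (assumption)
    y = T[f(x) + T(noise) * randn(rng, T) for x in xs]
    return X, y
end

X, y = sample_univariate(x -> x^2 + x, (-1.0f0, 1.0f0); points=20)
```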

Arborist.deserialize — Method

deserialize(::Type{TreeGenome{T}}, s, operators, n_features) -> Union{TreeGenome{T}, Nothing}

Parse a string representation of an expression tree back into a TreeGenome{T}. Accepts both the infix form emitted by serialize / DynamicExpressions' string_tree (e.g. x1 + 1.0, sin((x1 + 1.0) * x2)) and the prefix call form used by older code paths (e.g. +(x1, 1.0), sin(*(x1, x2))). Both forms are accepted because Meta.parse normalizes them to the same Expr(:call, ...) structure that _expr_to_node walks.

Returns nothing for unparseable input, unrecognized operators, or out-of-range feature indices. The caller is responsible for any fallback behavior.
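The normalization that makes both input forms equivalent can be checked with Meta.parse alone. The helper same_structure below is hypothetical and exists only for this illustration.

```julia
# Two strings denote the same tree if Meta.parse yields equal Expr values.
same_structure(a::AbstractString, b::AbstractString) = Meta.parse(a) == Meta.parse(b)

infix  = Meta.parse("x1 + 1.0")
prefix = Meta.parse("+(x1, 1.0)")

# Both parse to a :call expression whose args are the operator and its operands.
call_args = infix.args
```

Because the two spellings produce identical Expr values, a single tree walker can serve both without a format flag.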

Arborist.erc_uniform — Method

erc_uniform(lo::T, hi::T) -> Function

Return a callable (rng::AbstractRNG) -> T that samples uniformly from [lo, hi]. Pass the result as GeneticProgramming(; constant_sampler=...) to wire Koza-style Ephemeral Random Constants into TreeGenome creation and mutation.

alg = GeneticProgramming(; constant_sampler = erc_uniform(-5.0f0, 5.0f0))
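A sampler with the documented shape, (rng::AbstractRNG) -> T, can be written as a closure. This is a sketch of the idea rather than the library's definition; uniform_sampler is a hypothetical name.

```julia
using Random

# Return a closure that draws one value of type T between lo and hi.
function uniform_sampler(lo::T, hi::T) where {T<:AbstractFloat}
    return rng -> lo + (hi - lo) * rand(rng, T)
end

sampler = uniform_sampler(-5.0f0, 5.0f0)
rng = MersenneTwister(42)
draws = [sampler(rng) for _ in 1:100]
```

Passing the RNG at call time rather than capturing it keeps the sampler reproducible under whatever RNG the caller supplies.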

Arborist.evaluate — Method

evaluate(e::TreeFitnessEvaluator{T}, g::TreeGenome{T}) -> Float64

Evaluate a TreeGenome against the data matrix. Returns mean squared error. Returns Inf if evaluation throws or produces NaN/Inf values.
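The failure handling described above amounts to a guarded MSE. The sketch below assumes a generic predict callable in place of a genome; safe_mse is a hypothetical helper, not the evaluator's real code.

```julia
# Mean squared error that maps exceptions and non-finite predictions to Inf.
function safe_mse(predict, X::AbstractMatrix, y::AbstractVector)
    try
        pred = predict(X)
        all(isfinite, pred) || return Inf
        return Float64(sum(abs2, pred .- y) / length(y))
    catch
        return Inf
    end
end

X = Float32[1 2 3]          # 1 feature × 3 samples
y = Float32[2, 3, 5]

ok_loss     = safe_mse(X -> vec(X[1, :]) .+ 1, X, y)
nan_loss    = safe_mse(X -> fill(NaN32, 3), X, y)
thrown_loss = safe_mse(X -> error("evaluation failed"), X, y)
```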

Arborist.evaluate_cases — Method

evaluate_cases(g::TreeGenome{T}, e::TreeFitnessEvaluator{T}) -> Vector{Float64}

Return per-sample squared error as a Vector{Float64} of length length(e.y). Samples whose prediction is non-finite (NaN/Inf after vectorised evaluation) are reported as Inf. If evaluation fails outright, every entry is Inf. Used by lexicase selection.
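The per-case contract can be sketched on raw prediction and target vectors. per_case_sq_error is a hypothetical helper that assumes predictions have already been computed.

```julia
# Squared error per sample; non-finite predictions become Inf.
function per_case_sq_error(pred::AbstractVector, y::AbstractVector)
    return Float64[isfinite(p) ? Float64((p - t)^2) : Inf for (p, t) in zip(pred, y)]
end

# A whole-evaluation failure is reported as Inf for every case.
failed_cases(y::AbstractVector) = fill(Inf, length(y))

cases = per_case_sq_error(Float32[1, NaN, 3], Float32[1, 2, 5])
```

Keeping one entry per sample, rather than a single aggregate, is what lets lexicase selection filter candidates case by case.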

Arborist.from_migrant — Method

from_migrant(m::MigrantGenome, ctx::TreeGenomeContext{T}) -> TreeGenome{T}

Reconstruct a TreeGenome{T} from a MigrantGenome by wrapping the transported Node{T} with the destination island's operators and n_features from ctx.

Arborist.optimize_constants! — Method

optimize_constants!(g::TreeGenome, e::TreeFitnessEvaluator; kwargs...) -> Float64

Apply BFGS with central finite-difference gradients to the constants of g.tree against e's dataset. Mutates g.tree in place; returns the post-optimization MSE loss.

Returns the pre-optimization loss unchanged if the tree has zero constants or if the initial evaluation produces Inf / NaN. Never makes the tree worse: if BFGS diverges or line search fails, constants are restored and the original loss is returned.

Keyword arguments

max_iter::Int = 50
tol::Float64 = 1e-8
fd_step::Float64 = 1e-3
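Two ingredients of the description above can be sketched in isolation: a central finite-difference gradient with step fd_step, and the guard that keeps constants unchanged unless the loss improves. Both functions are hypothetical illustrations over a plain constants vector; the BFGS update itself is omitted.

```julia
# Central finite-difference gradient of `loss` at the constants vector `c`.
function central_gradient(loss, c::AbstractVector{<:Real}; fd_step::Float64=1e-3)
    g = zeros(Float64, length(c))
    for i in 1:length(c)
        cp = Float64.(c)
        cm = Float64.(c)
        cp[i] += fd_step
        cm[i] -= fd_step
        g[i] = (loss(cp) - loss(cm)) / (2 * fd_step)
    end
    return g
end

# Accept `candidate` only if it gives a finite, strictly lower loss.
function guarded_update!(consts::Vector{Float64}, loss, candidate::Vector{Float64})
    before = loss(consts)
    after = loss(candidate)
    if isfinite(after) && after < before
        consts .= candidate
        return after
    end
    return before   # constants left untouched
end

loss(c) = (c[1] - 3)^2 + 2 * c[2]^2
grad = central_gradient(loss, [1.0, 1.0])
```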

Arborist.to_migrant — Method

to_migrant(g::TreeGenome{T}, fitness::Float64) -> MigrantGenome

Pack a TreeGenome's Node{T} into a MigrantGenome for cross-island (and cross-process) migration. The Node{T} is carried directly rather than going through the string-based serialize/deserialize path: direct transport avoids any parse ambiguity, preserves exact bit patterns of Float32 constants, and is independent of DynamicExpressions' string_tree output format. Julia's Distributed serializer handles Node{T} natively.

The destination island's OperatorEnum is reattached in from_migrant. Op indices stored in Node{T} are stable across islands because every island holds the same OperatorEnum built from the problem.

Arborist.tree_depth — Method

tree_depth(g::TreeGenome) -> Int

Longest root-to-leaf path through the genome's expression tree, computed via DynamicExpressions.count_depth. A bare-leaf tree has depth 1.

CommonSolve.solve — Method

solve(problem::GPProblem{TreeGenome{T}}, algorithm::GeneticProgramming; ...) -> GPResult

Run genetic programming evolution with TreeGenome. Uses DynamicExpressions.jl for fast vectorized evaluation without @eval.

Arborist.from_migrant — Method

from_migrant(m::MigrantGenome, state::GenState) -> ExprGenome

Reconstruct an ExprGenome from a MigrantGenome using the local GenState.

Arborist.from_migrant — Method

from_migrant(m::MigrantGenome, ctx::GraphGenomeContext) -> GraphGenome

Context-dispatched form used by IslandModel (_inject_migrants_local! calls from_migrant(m, island.state) uniformly across genome types). GraphGenome migrants carry their own n_inputs / n_outputs, so the context is consulted only for dispatch.

Arborist.from_migrant — Method

from_migrant(m::MigrantGenome, ::Type{GraphGenome}) -> GraphGenome

Reconstruct a GraphGenome from a MigrantGenome.

Arborist.from_migrant — Method

from_migrant(m::MigrantGenome, primitives::Vector{Symbol},
             conditions::Vector{Symbol}, max_depth::Int) -> AntGenome

Reconstruct an AntGenome from a MigrantGenome.

Arborist.to_migrant — Method

to_migrant(g::AntGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from an AntGenome for cross-process migration.

Arborist.to_migrant — Method

to_migrant(g::ExprGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from an ExprGenome for cross-process migration.

Arborist.to_migrant — Method

to_migrant(g::GraphGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from a GraphGenome for cross-process migration.

Arborist.fitnesses — Method

fitnesses(hof::HallOfFame) -> Vector{Float64}

Return the archive's fitness list in order (best first). Equivalent to reading hof.fitnesses directly; provided as a function so that callers remain stable if the internal representation changes.

Base.push! — Method

push!(hof::HallOfFame{G}, genome::G, fitness::Real)

Insert (genome, fitness) into the hall if it qualifies. Non-finite fitnesses are rejected. Duplicates (fitness within 1e-12 of an existing entry) are rejected. When the archive is at capacity, the worst-fitness entry is evicted only if the candidate is strictly better.
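The insertion rules can be sketched with a small stand-in container. MiniHall and try_push! are hypothetical names; the sketch assumes lower fitness is better and that entries are kept sorted best first, and it returns the container, which the reference above does not specify.

```julia
# Hypothetical bounded archive, sorted by ascending fitness (best first).
struct MiniHall{G}
    capacity::Int
    genomes::Vector{G}
    fitnesses::Vector{Float64}
end
MiniHall{G}(capacity::Int) where {G} = MiniHall{G}(capacity, G[], Float64[])

function try_push!(h::MiniHall{G}, genome::G, fitness::Real) where {G}
    f = Float64(fitness)
    isfinite(f) || return h                                   # reject NaN / Inf
    any(x -> abs(x - f) <= 1e-12, h.fitnesses) && return h    # reject duplicates
    if length(h.fitnesses) >= h.capacity
        f < h.fitnesses[end] || return h                      # must beat the worst
        pop!(h.genomes)
        pop!(h.fitnesses)
    end
    i = searchsortedfirst(h.fitnesses, f)
    insert!(h.genomes, i, genome)
    insert!(h.fitnesses, i, f)
    return h
end

hall = MiniHall{String}(2)
for (g, f) in [("a", 3.0), ("b", 1.0), ("c", NaN), ("d", 1.0), ("e", 2.0), ("f", 5.0)]
    try_push!(hall, g, f)
end
```

In this run "c" is rejected as non-finite, "d" as a duplicate of "b", "e" evicts "a", and "f" is rejected because it does not beat the current worst entry.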

Arborist.to_dot — Function

to_dot(g) -> String
to_dot(io::IO, g)

Produce a Graphviz DOT document describing the genome g. Supported inputs are TreeGenome, ExprGenome, ADFGenome, AntGenome, and GraphGenome.

For tree-structured genomes the output is a directed acyclic graph with ellipse-shaped nodes labeled by operator / constant / variable. For GraphGenome the output is a left-to-right network diagram with distinct node shapes by role (input, output, bias, hidden) and edges labeled by connection weight, with disabled connections shown dashed and gray.

The one-argument form returns the document as a String. The two-argument form writes to io and returns io for chaining. Tree-genome methods do not throw on NaN / Inf constants; they are rendered literally.
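The two call forms and the literal rendering of constants can be sketched with a minimal DOT emitter for plain Expr trees. expr_to_dot is a hypothetical function; the node naming and attributes are illustrative and may differ from what to_dot actually emits.

```julia
# Write a DOT digraph for a call-expression tree to `io`; return `io` for chaining.
function expr_to_dot(io::IO, ex)
    println(io, "digraph G {")
    println(io, "  node [shape=ellipse];")
    counter = Ref(0)
    function walk(node)
        id = (counter[] += 1)
        label = node isa Expr ? string(node.args[1]) : string(node)
        println(io, "  n$id [label=\"$label\"];")
        if node isa Expr
            for child in node.args[2:end]
                cid = walk(child)
                println(io, "  n$id -> n$cid;")
            end
        end
        return id
    end
    walk(ex)
    println(io, "}")
    return io
end

# String-returning form built on the IO form.
expr_to_dot(ex) = sprint(expr_to_dot, ex)

dot = expr_to_dot(:(x1 + NaN))
```

Defining the String form through sprint keeps a single code path, so both forms always produce the same document.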

Constants

Arborist.ACTIVATION_FNS — Constant

ACTIVATION_FNS

Dictionary mapping activation Symbol names to their unary Function implementations, used by GraphEvaluator when propagating values through a GraphGenome. The built-in set is:

:sigmoid — NEAT-style steepened logistic 1 / (1 + exp(-4.9·x)).
:tanh — hyperbolic tangent.
:relu — rectified linear, max(0, x).
:identity — x (pass-through).
:gauss — Gaussian bump exp(-x²). Common in CPPN / HyperNEAT work.
:sin — plain sin(x). Substrate or network is expected to supply any frequency scaling.
:abs — absolute value |x|.
:step — Heaviside step, 1.0 for x > 0, else 0.0.

New activations can be added by assigning into this dict before solving; each NodeGene stores the activation as a Symbol and looks the function up here at evaluation time.

Note: the NEAT mutation operators (AddNodeMutation, NEATDefaultMutation) only draw from :sigmoid, :tanh, :relu by default when adding a new hidden node. To make CPPN activations available to those operators, pass hidden_activations=[:sigmoid, :tanh, :gauss, :sin, :abs] (or similar) at construction.
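The Symbol-keyed lookup can be sketched with an ordinary Dict. ACTIVATIONS_SKETCH and apply_activation are hypothetical stand-ins that mirror the built-in list above, and :softsign is an invented extra entry used only to show registration.

```julia
# Stand-in for the activation table: Symbol name => unary function.
const ACTIVATIONS_SKETCH = Dict{Symbol,Function}(
    :sigmoid  => x -> 1 / (1 + exp(-4.9 * x)),
    :tanh     => tanh,
    :relu     => x -> max(zero(x), x),
    :identity => identity,
    :gauss    => x -> exp(-x^2),
    :sin      => sin,
    :abs      => abs,
    :step     => x -> x > 0 ? 1.0 : 0.0,
)

# Register a custom activation by assigning into the dict.
ACTIVATIONS_SKETCH[:softsign] = x -> x / (1 + abs(x))

# Look the function up by name at evaluation time.
apply_activation(name::Symbol, x) = ACTIVATIONS_SKETCH[name](x)
```

Storing only the Symbol on each node keeps genomes cheap to copy and serialize, at the cost of requiring the same table to be populated wherever the genome is evaluated.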

Arborist.DEFAULT_TREE_GP_SYSTEM_PROMPT — Constant

Default system prompt for TreeGenome LLM mutation (prefix notation).