Genome Types — API Reference

Autogenerated reference for every exported genome-related symbol (AbstractGenome, the concrete genomes, the GPProblem carrier, the per-genome evaluators and primitives, the all-time-best HallOfFame container, and the Graphviz to_dot visualization helper).

Types

Arborist.AbstractCrossoverOperatorType
AbstractCrossoverOperator

Base type for crossover operators.

Concrete subtypes define crossover(op, g1::AbstractGenome, g2::AbstractGenome, rng::AbstractRNG) -> Tuple. They may also override operator_name(op) -> Symbol to set the key used in RunLog's per-operator tallies.

source
Arborist.AbstractEvaluatorType
AbstractEvaluator

Base type for fitness evaluators.

Any concrete subtype E <: AbstractEvaluator must implement:

  • evaluate(e::E, f::Function) -> Float64 (lower is better)
  • input_signature(e::E) -> Dict{Symbol, DataType}
  • output_signature(e::E) -> Dict{Symbol, DataType}

Optionally, evaluators that can decompose fitness into independent per-case losses (e.g. per-row MSE, per-sample squared error) may implement evaluate_cases(g::AbstractGenome, e::E) -> Vector{Float64}. This is required for lexicase selection; evaluators that cannot meaningfully decompose (e.g. AntEvaluator, EpisodicEvaluator) should leave it unimplemented — lexicase will then raise a clear MethodError.
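
The three required methods can be illustrated with a small stand-alone sketch. Everything here is hypothetical: the local AbstractEvaluator is a stand-in so the snippet runs without Arborist, ToyTableEvaluator is an invented name, and mapping failures to Inf is an assumption borrowed from the other evaluators described on this page.

```julia
# Stand-in for Arborist.AbstractEvaluator so the sketch runs on its own.
abstract type AbstractEvaluator end

# Hypothetical evaluator: mean squared error of f against a fixed table.
struct ToyTableEvaluator <: AbstractEvaluator
    xs::Vector{Float64}
    ys::Vector{Float64}
end

# Scalar fitness, lower is better. Exceptions and non-finite losses map to Inf
# (an assumption, not confirmed Arborist behavior).
function evaluate(e::ToyTableEvaluator, f::Function)
    try
        total = 0.0
        for (x, y) in zip(e.xs, e.ys)
            total += (f(x) - y)^2
        end
        loss = total / length(e.xs)
        return isfinite(loss) ? loss : Inf
    catch
        return Inf
    end
end

input_signature(::ToyTableEvaluator) = Dict{Symbol, DataType}(:x => Float64)
output_signature(::ToyTableEvaluator) = Dict{Symbol, DataType}(:y => Float64)

e = ToyTableEvaluator([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
perfect = evaluate(e, x -> x^2)   # candidate matches the table exactly
worse   = evaluate(e, x -> x)     # candidate misses the last row
```

A real evaluator would subtype Arborist's own AbstractEvaluator and extend Arborist's functions rather than defining new ones.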

source
Arborist.AbstractGenomeType
AbstractGenome

Base type for all genome representations in Arborist.

A concrete subtype G <: AbstractGenome participates in evolution by providing the operations the solve path needs. At minimum:

  • mutate(op, g::G, rng::AbstractRNG) -> G for each mutation operator that dispatches on G (or a direct mutate(g::G, rng) method for genome types that use direct dispatch, e.g. AntGenome, GraphGenome).
  • crossover(op, g1::G, g2::G, rng::AbstractRNG) -> Tuple{G, G} (or a direct crossover(g1, g2, rng) method for direct-dispatch genomes).
  • distance(g1::G, g2::G) -> Float64 — used by ThresholdSpeciation.
  • complexity(g::G) -> Real — used by bloat penalty and ParsimonyEvaluator.
  • serialize(g::G) -> String — used by the LLM operator and logging.

Population initialization is genome-specific. Each genome type defines its own construction path invoked from the matching solve method; the signature is not fixed — ExprGenome uses a GenState, TreeGenome takes an OperatorEnum and feature count, AntGenome takes a primitive set, and GraphGenome takes input/output counts. See the per-genome solve(::GPProblem{G,E}, ::GeneticProgramming) methods.

deserialize(::Type{G}, s::String, ctx...) is required only when using the LLM mutation operator on G; its extra arguments depend on G (e.g. GenState for ExprGenome, (OperatorEnum, n_features) for TreeGenome). LLMMutationOperator currently dispatches on ExprGenome only.

source
Arborist.AbstractMutationOperatorType
AbstractMutationOperator

Base type for mutation operators.

Concrete subtypes define mutate(op, g::AbstractGenome, rng::AbstractRNG) -> AbstractGenome. Concrete subtypes may also override operator_name(op) -> Symbol to expose a friendly key for RunLog's per-operator tallies. The default derives the name from the struct type.

source
Arborist.AbstractSelectionStrategyType
AbstractSelectionStrategy

Base type for parent selection strategies (e.g., tournament selection, lexicase selection).

Concrete subtypes must implement:

  • select_parent(s::S, selection_fitnesses::Vector{Float64}, case_fitnesses, rng), returning the integer index of the selected parent into selection_fitnesses (and the matching genome vector). Case-based strategies (lexicase) read case_fitnesses; scalar-fitness strategies (tournament) ignore it.
  • needs_cases(s::S) -> Bool (default false). Strategies that return true cause the solve loop to materialize a per-case fitness vector for every individual each generation via evaluate_cases. Returning true requires the evaluator to implement evaluate_cases; otherwise the solve loop raises MethodError the first time it tries.
source
Arborist.AbstractTopologyType
AbstractTopology

Base type for island migration topologies. Concrete subtypes define migration_targets(t, i, n_islands, rng) returning destination island indices.

source
Arborist.ExprGenomeType
ExprGenome <: AbstractGenome

Genome representation based on Julia Expr trees. Wraps the existing codegen.jl / evolution.jl infrastructure.

Fields

  • body::Vector{Expr}: body statements (not yet wrapped in a function harness)
  • state::GenState: type context carrying variable types, function set, etc.

Known limitations

  • serialize / deserialize round-trip is ~80% reliable. repr()-style Float32(literal) forms produced by the Julia printer fail type-checking on round-trip. The LLM operator falls back silently to a classical operator, but checkpoint/resume or cross-process migration can lose a fraction of individuals.
  • @eval grows Julia's method table monotonically across generations. Long runs (thousands of generations × hundreds of individuals) accumulate tens of thousands of methods, slowing dispatch. Use TreeGenome for long runs where applicable.
source
Arborist.GPProblemType
GPProblem{G<:AbstractGenome, E<:AbstractEvaluator}

Problem specification for genetic programming. Combines an evaluator (which defines the fitness landscape) with a genome type and configuration.

Fields

  • evaluator::E: the fitness evaluator
  • genome_type::Type{G}: the genome type to evolve
  • function_set::FunctionSet: available functions for code generation
  • num_temps::Int: number of temporary variables per genome
  • seed::Union{Int, Nothing}: random seed for reproducibility (nothing for no seeding)
source
Arborist.GPProblemMethod
GPProblem(evaluator, ::Type{G}; function_set, num_temps, seed) -> GPProblem

Construct a GPProblem with keyword arguments and sensible defaults.

source
Arborist.AntEvaluatorType
AntEvaluator <: AbstractEvaluator

Fitness evaluator for AntGenome. Compiles and executes the evolved program with an AntSimulator. Returns the number of uneaten food pellets as fitness (lower is better, 0 = perfect).

source
Arborist.AntGenomeType
AntGenome <: AbstractGenome

A genome for evolving programs that control an agent via side-effectful primitives. Unlike ExprGenome, AntGenome does not require a typed input/output signature — the program operates on implicit agent state via a module-level simulator reference.

Suitable for the Santa Fe Ant Trail and similar control problems.

Fields

  • program::Expr: a :block expression of nested primitive calls and control flow
  • primitives::Vector{Symbol}: action primitives (consume moves)
  • conditions::Vector{Symbol}: condition primitives (sensors)
  • max_depth::Int: maximum program depth

Known limitations

  • Not thread-safe. The simulator uses a module-level Ref for state. GeneticProgramming(; parallel=true) with AntGenome raises a runtime error. Use parallel=false or refactor to a thread-local-state pattern (as demonstrated in examples/bin_packing.jl and examples/sorting.jl).
source
Arborist.EpisodicEvaluatorType
EpisodicEvaluator{FInit,FDyn,FRew,FDone,FObs,FDec} <: AbstractEvaluator

Evaluates a GraphGenome as a closed-loop policy on an episodic environment defined by declarative callables. The network is treated as obs -> action, and the evaluator drives the loop:

for ep in 1:n_episodes
    rng   = MersenneTwister(episode_seed_base + ep)
    state = initial_state(rng)
    for step in 1:max_steps
        obs          = observe(state)
        net_output   = forward_network(state, obs)
        action       = decode_action(net_output)
        next_state   = dynamics(state, action)
        total       += reward(state, action, next_state)
        state        = next_state
        done(state) && break
    end
end

Fitness is -mean_reward_per_episode (framework convention is lower-is-better, so episodic tasks that want to maximise reward are negated). On cycle detection in allow_recurrent=false mode, returns Inf.
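
The loop above can be made runnable with a toy environment. This is a minimal sketch, not the evaluator's implementation: rollout_fitness and every toy_* name are hypothetical, and the network is replaced by a plain callable from observation to output vector.

```julia
using Random

# Hypothetical driver mirroring the pseudocode; `policy` stands in for the network.
function rollout_fitness(policy; initial_state, dynamics, reward, done,
                         observe, decode_action,
                         max_steps::Int = 1000, n_episodes::Int = 1,
                         episode_seed_base::Int = 0)
    total = 0.0
    for ep in 1:n_episodes
        rng = MersenneTwister(episode_seed_base + ep)   # reproducible per episode
        state = initial_state(rng)
        for _ in 1:max_steps
            obs = observe(state)
            action = decode_action(policy(obs))
            next_state = dynamics(state, action)
            total += reward(state, action, next_state)
            state = next_state
            done(state) && break
        end
    end
    # Negate so that a higher mean reward gives a lower (better) fitness.
    return -total / n_episodes
end

# Toy 1-D environment: drive a point toward the origin.
toy_initial_state(rng) = 2.0 * rand(rng) - 1.0
toy_dynamics(s, a) = s + a
toy_reward(s, a, s2) = -abs(s2)
toy_done(s) = abs(s) < 1e-3
toy_observe(s) = [s]
toy_decode(out) = out[1]
toy_policy(obs) = [-0.5 * obs[1]]   # halves the distance each step

run_toy() = rollout_fitness(toy_policy;
    initial_state = toy_initial_state, dynamics = toy_dynamics,
    reward = toy_reward, done = toy_done,
    observe = toy_observe, decode_action = toy_decode,
    max_steps = 50, n_episodes = 3)

fitness = run_toy()
```

Because each episode seeds its own MersenneTwister from episode_seed_base + ep, repeated calls with the same policy return the same fitness.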

Fields

  • n_inputs::Int / n_outputs::Int: dimensions the network expects; must match length(observe(state)) and length(net_output).
  • initial_state::FInit: rng -> state. Must be reproducible from rng.
  • dynamics::FDyn: (state, action) -> next_state.
  • reward::FRew: (state, action, next_state) -> Float64.
  • done::FDone: state -> Bool. Stops the episode early when true.
  • observe::FObs: state -> Vector{Float64} of length n_inputs.
  • decode_action::FDec: Vector{Float64} of length n_outputs -> action.
  • max_steps::Int: per-episode step cap.
  • n_episodes::Int: rollouts averaged per evaluate_genome call.
  • episode_seed_base::Int: each episode uses rng = MersenneTwister(base + ep_index).
  • activation_fns::Dict{Symbol,Function}: defaults to ACTIVATION_FNS.
  • allow_recurrent::Bool: defaults to true (episodic tasks usually want persistent hidden-node state across timesteps).
  • relaxation_passes::Int: recurrent-mode sweeps per step; default 1.

Design

The shape is declarative / pure-functional by default (see memory/episodicevaluatordesign.md). For environments with heavy reusable state (physics-engine handle, loaded dataset), a future StatefulEpisodicEvaluator subtype can offer the reset!/step! idiom; it is intentionally not built yet.

Known limitations

  • Not parallel-safe for stateful environments. The declarative API is structurally thread-safe when every callable is pure, but a dynamics closure that captures mutable state will race under GeneticProgramming(; parallel=true). Use parallel=false for stateful environments until StatefulEpisodicEvaluator lands.
source
Arborist.EpisodicEvaluatorMethod
EpisodicEvaluator(n_inputs, n_outputs, initial_state, dynamics, reward, done,
                  observe, decode_action; max_steps, n_episodes, ...)

Outer constructor. Keyword-argument defaults:

  • max_steps = 1000
  • n_episodes = 1
  • episode_seed_base = 0
  • activation_fns = ACTIVATION_FNS
  • allow_recurrent = true
  • relaxation_passes = 1
source
Arborist.GraphEvaluatorType
GraphEvaluator <: AbstractEvaluator

Evaluates a GraphGenome by building the neural network from the genome topology, running it on input data, and computing MSE against target outputs.

Fields

  • input_data::Matrix{Float64}: n_inputs × n_samples. In recurrent mode, samples are treated as a time sequence and node activations persist across samples.
  • output_data::Matrix{Float64}: n_outputs × n_samples.
  • activation_fns::Dict{Symbol, Function}: activation function lookup.
  • allow_recurrent::Bool: when true, cycles in the genome are allowed and evaluation uses a relaxation loop with state that persists across samples. Default false — cycles return Inf, state resets per sample.
  • relaxation_passes::Int: number of activation sweeps per sample when allow_recurrent=true. Default 1. Higher values let information propagate further through the network within a single sample.
source
Arborist.GraphEvaluatorMethod
GraphEvaluator(input_data, output_data;
               activation_fns=ACTIVATION_FNS,
               allow_recurrent=false,
               relaxation_passes=1)

Construct a GraphEvaluator. Defaults match the original feedforward behavior: cycles return Inf, per-sample state reset, single forward pass. Pass allow_recurrent=true for sequence/memory tasks where node activations should persist across samples (and cycles are legal).

source
Arborist.GraphGenomeType
GraphGenome <: AbstractGenome

A genome representing a neural network topology, following the NEAT encoding (Stanley & Miikkulainen, 2002). Supports structural mutation (add node, add connection) and weight mutation, plus crossover aligned by innovation number.

Fields

  • nodes::Dict{Int, NodeGene}: node genes keyed by node ID
  • connections::Dict{Int, ConnectionGene}: connection genes keyed by innovation number
  • n_inputs::Int: number of input nodes (not counting bias)
  • n_outputs::Int: number of output nodes
  • fitness::Float64: cached fitness value

Known limitations

  • Distributed NEAT innovation matching is disjoint-range, not content-aware. Under IslandModel(distributed=true), each worker gets a unique innovation ID range via init_innovation_range!((island_id - 1) * INNOVATION_STRIDE) so IDs don't collide. The cost: structurally identical mutations on different workers receive different IDs and are treated as disjoint by NEAT crossover rather than aligned. Per-generation cross-worker innovation dedup is not implemented.
source
Arborist.GraphGenomeContextType
GraphGenomeContext

Per-island state carrier for GraphGenome under IslandModel. Parallels GenState (ExprGenome) and TreeGenomeContext (TreeGenome): all three carry .rng so that island-loop sites reading state.rng work uniformly, and all three are the second element of the tuple returned from _initialize_population.

The extra n_inputs / n_outputs fields are kept available for future use (e.g. cross-island initialization) but aren't currently consulted — migrant GraphGenomes carry their own n_inputs / n_outputs.

source
Arborist.ADFGenomeType
ADFGenome{T} <: AbstractGenome

Genome with a main expression tree and N = length(adfs) Automatically Defined Function trees. Each ADF is a Node{T} that may reference the ARG slots ARG0..ARG{arity-1} (encoded as features above the user's n_features).

The main tree may invoke any ADF as a binary operator at slot n_base_binary + i. ADFs themselves may reference features [1, n_features] and ARG slots; nested ADF-from-ADF calls are not currently supported (ADF body is generated without ADF placeholders).

Fields

  • main::Node{T}: main expression tree.
  • adfs::Vector{Node{T}}: ADF body trees, one per ADF.
  • arity::Int: shared arity of every ADF (default 2).
  • operators::OperatorEnum: the augmented operator enum (base operators + N ADF placeholders). Use base_operators(g) to recover the user's original operator set.
  • n_features::Int: number of real input features. ARG slots occupy [n_features+1, n_features+arity].
  • n_adfs::Int: convenience — length(adfs).
source
Arborist.TreeFitnessEvaluatorType
TreeFitnessEvaluator{T} <: AbstractEvaluator

Fitness evaluator for TreeGenome. Evaluates the expression tree directly over a data matrix without @eval. Dramatically faster than TableFitnessEvaluator for large datasets.

Fields

  • X::Matrix{T}: input data, n_features × n_samples
  • y::Vector{T}: target output, length n_samples
  • operators::OperatorEnum: operator configuration
source
Arborist.TreeGenomeType
TreeGenome{T} <: AbstractGenome

A genome backed by a DynamicExpressions.jl expression tree. Supports fast vectorized evaluation over datasets without @eval. Appropriate for pure function approximation problems.

Type parameter T is the numeric type of the expression (Float32 is recommended for most GP applications).

Fields

  • tree::Node{T}: the expression tree
  • operators::OperatorEnum: operator configuration for evaluation
  • n_features::Int: number of input features
source
Arborist.TreeGenomeContextType
TreeGenomeContext{T}

Per-island state carrier for TreeGenome under IslandModel. Parallels GenState for ExprGenome: both carry .rng so that island-loop sites reading state.rng work uniformly, and both are the second element of the tuple returned from _initialize_population.

The extra operators and n_features fields let from_migrant reconstruct a TreeGenome{T} by invoking deserialize with the destination island's operator enum.

source
Arborist.MigrantGenomeType
MigrantGenome

Serializable carrier for genome data during island migration. Contains only data that survives cross-process transfer — GenState, compiled functions, and RNG state are reconstructed locally.

Fields

  • data::Any: genome-specific payload (Vector{Expr} for ExprGenome, Expr for AntGenome, etc.)
  • fitness::Float64: fitness on the source island
  • genome_type::Symbol: identifies the genome type for reconstruction
source
Arborist.GPResultType
GPResult{G<:AbstractGenome} <: AbstractEvolutionResult

Result returned by solve. Contains the best genome found, fitness history, and metadata about the evolutionary run.

Fields

  • best_genome::G: the genome with the best fitness found during the run
  • best_fitness::Float64: fitness of the best genome (lower is better)
  • population::Vector{G}: final population sorted by fitness
  • fitness_history::Vector{Float64}: best fitness per generation
  • mean_history::Vector{Float64}: mean finite fitness per generation
  • generations_run::Int: number of generations completed
  • wall_time::Float64: elapsed wall-clock time in seconds
  • converged::Bool: whether the run met the convergence criterion
  • hall_of_fame::Union{Nothing, HallOfFame{G}}: top-K archive across all generations when solve(... ; hall_of_fame_size=K) was passed with K > 0. nothing otherwise.
source
Arborist.HallOfFameType
HallOfFame{G<:AbstractGenome}

Bounded top-K archive of the best genomes a solve() call has seen across all generations. Maintained in ascending fitness order (best first). Opt-in via the hall_of_fame_size::Int kwarg on solve(); size == 0 is the default and produces nothing in GPResult.hall_of_fame.

Fields

  • capacity::Int: maximum number of distinct entries retained
  • genomes::Vector{G}: genome list, best-first
  • fitnesses::Vector{Float64}: matching fitness list

Dedup

push!(hof, g, f) treats two fitnesses as duplicates when they are within 1e-12 of each other. This cheap filter catches structurally equivalent solutions (identical fitness) without the cost of walking genomes for structural equality. It will occasionally merge two semantically distinct genomes that happen to produce exactly the same fitness; this is acceptable for a Hall of Fame, which is best-effort rather than canonical.
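
The archive behavior described above can be sketched without Arborist. ToyHallOfFame and offer! are hypothetical names (the real entry point is push!), the tolerance comparison uses <= as an assumption, and skipping non-finite fitnesses is an extra guard added for this sketch only.

```julia
# Hypothetical stand-in for HallOfFame{G}; field names follow the list above.
struct ToyHallOfFame{G}
    capacity::Int
    genomes::Vector{G}
    fitnesses::Vector{Float64}
end
ToyHallOfFame{G}(capacity::Int) where {G} = ToyHallOfFame{G}(capacity, G[], Float64[])

function offer!(hof::ToyHallOfFame{G}, g::G, f::Float64; tol::Float64 = 1e-12) where {G}
    isfinite(f) || return hof                        # sketch-only guard
    # Fitness-based dedup: skip anything within tol of an archived fitness.
    any(x -> abs(x - f) <= tol, hof.fitnesses) && return hof
    # Insert in ascending fitness order (best first).
    idx = searchsortedfirst(hof.fitnesses, f)
    insert!(hof.genomes, idx, g)
    insert!(hof.fitnesses, idx, f)
    # Drop the worst entry when over capacity.
    if length(hof.fitnesses) > hof.capacity
        pop!(hof.genomes)
        pop!(hof.fitnesses)
    end
    return hof
end

hof = ToyHallOfFame{String}(2)
offer!(hof, "a", 3.0)
offer!(hof, "b", 1.0)
offer!(hof, "c", 1.0)   # same fitness as "b", treated as a duplicate
offer!(hof, "d", 2.0)   # pushes "a" out of the two-slot archive
```

After these calls the archive holds "b" and "d", in that order.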

source
Arborist.HallOfFameMethod
HallOfFame{G}(capacity::Int) -> HallOfFame{G}
HallOfFame{G}(; capacity::Int=20) -> HallOfFame{G}
source

Functions

Arborist.evaluate_casesFunction
evaluate_cases(g::AbstractGenome, e::AbstractEvaluator) -> Vector{Float64}

Per-case loss vector (lower = better) for evaluators that can decompose their fitness into independent cases (per-row, per-sample). Used by lexicase selection.

No default implementation: evaluators that can support lexicase must opt in explicitly. If not implemented, calling it raises MethodError.

source
Arborist.needs_casesMethod
needs_cases(s::AbstractSelectionStrategy) -> Bool

Return true if the strategy requires per-case fitnesses (evaluate_cases-derived). Default: false.

source
Arborist.operator_nameMethod
operator_name(op) -> Symbol

Stable name for an operator, used as the key in GenerationLog.operator_attempted / operator_success. Default: the concrete type's nameof.

source
Arborist.select_parentFunction
select_parent(s::AbstractSelectionStrategy, selection_fitnesses, case_fitnesses, rng) -> Int

Select a parent index. selection_fitnesses::Vector{Float64} is the sharing-adjusted scalar fitness used by classical strategies (lower is better). case_fitnesses::Union{Nothing, Vector{Vector{Float64}}} is the per-individual per-case loss matrix used by lexicase strategies (same convention: lower is better; nothing when needs_cases(s) == false).
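
For illustration, a lexicase strategy could consume case_fitnesses roughly as follows. This is a sketch of the standard lexicase procedure, not Arborist's implementation; lexicase_select is a hypothetical helper that takes only the per-case losses and an RNG.

```julia
using Random

# Hypothetical helper: case_fitnesses[i][c] is the loss of individual i on
# case c (lower is better). Returns the index of the selected parent.
function lexicase_select(case_fitnesses::Vector{Vector{Float64}}, rng::AbstractRNG)
    candidates = collect(eachindex(case_fitnesses))
    n_cases = length(first(case_fitnesses))
    for c in shuffle(rng, 1:n_cases)
        best = minimum(case_fitnesses[i][c] for i in candidates)
        # Keep only the candidates that match the best loss on this case.
        filter!(i -> case_fitnesses[i][c] == best, candidates)
        length(candidates) == 1 && break
    end
    # Break any remaining tie uniformly at random.
    return rand(rng, candidates)
end

rng = MersenneTwister(1)
cases = [[1.0, 2.0], [0.0, 0.0], [0.5, 3.0]]
winner = lexicase_select(cases, rng)   # individual 2 is best on every case
```

A scalar strategy such as tournament selection would read only selection_fitnesses and never touch the per-case data.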

source
Arborist.tree_depthFunction
tree_depth(g::AbstractGenome) -> Int

Longest root-to-leaf path through the genome's expression tree. Defined for tree-structured genomes (ExprGenome, TreeGenome, AntGenome, ADFGenome) and used by mutation/crossover operators that enforce a max_depth cap.

Graph-structured genomes (e.g. GraphGenome) do not define this — MethodError on those is intentional; depth is not a meaningful bound for a recurrent graph.

Leaf convention: a bare leaf (symbol / number / feature node) has depth 1; each additional level of nesting increases depth by 1.
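
The leaf convention can be sketched on plain Julia Expr trees. expr_depth is a hypothetical helper, and treating the operator symbol of a call as one more leaf child is a simplification of this sketch; Arborist's own traversal may differ.

```julia
# Leaves (symbols, numbers, anything that is not an Expr) have depth 1.
expr_depth(x) = 1

# Each level of nesting adds 1 to the depth of the deepest child.
function expr_depth(ex::Expr)
    isempty(ex.args) && return 1
    return 1 + maximum(expr_depth(a) for a in ex.args)
end

d_leaf   = expr_depth(:x)                # bare symbol
d_call   = expr_depth(:(x + 1))          # one call over leaves
d_nested = expr_depth(:(sin(x + 1)))     # call wrapping another call
```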

source
Arborist.complexityMethod
complexity(g::ExprGenome) -> Float64

Total node count across all body statements, measured via unravel.

source
Arborist.crossoverMethod
crossover(g1::ExprGenome, g2::ExprGenome, rng::AbstractRNG) -> Tuple{ExprGenome, ExprGenome}

Produce two offspring via subtree crossover.

source
Arborist.deserializeMethod
deserialize(::Type{ExprGenome}, s::String, state::GenState) -> Union{ExprGenome, Nothing}

Parse a string of Julia statements into an ExprGenome. Returns nothing if zero valid statements survive parsing and type-checking.

Accepts assignments, while loops, if/if-else statements, for loops, blocks, break, continue, and standalone function calls. Multi-line control flow is supported by parsing the entire string as a block.

Statements that fail parsing or type-checking are skipped (partial recovery) rather than rejecting the whole genome.

Does not eval anything; parse only.

source
Arborist.deserializeMethod
deserialize(::Type{ExprGenome}, s::String; state::Union{GenState, Nothing}=nothing) -> Union{ExprGenome, Nothing}

Backward-compatible keyword-argument version. Delegates to the positional version when state is provided; returns nothing when it is not.

source
Arborist.distanceMethod
distance(g1::ExprGenome, g2::ExprGenome) -> Float64

Structural compatibility distance. Counts Expr nodes appearing in one program but not the other after type-normalizing.

source
Arborist.evaluate_casesMethod
evaluate_cases(g::ExprGenome, e::TableFitnessEvaluator) -> Vector{Float64}

Compile the genome and return per-row squared error via the TableFitnessEvaluator case evaluator. All rows Inf on compilation failure.

source
Arborist.evaluate_genomeMethod
evaluate_genome(g::ExprGenome, evaluator::AbstractEvaluator) -> Float64

Compile and evaluate an ExprGenome against the given evaluator. Returns Inf on any compilation or evaluation failure.

source
Arborist.initializeMethod
initialize(::Type{ExprGenome}, problem::GPProblem) -> ExprGenome

Create a random ExprGenome using the problem's function set and evaluator signatures.

source
Arborist.mutateMethod
mutate(g::ExprGenome, rng::AbstractRNG) -> ExprGenome

Produce a mutated copy of the genome by applying a random point mutation to a randomly selected sub-expression.

source
Arborist.serializeMethod
serialize(g::ExprGenome) -> String

Convert an ExprGenome body to a human-readable Julia source string suitable for inclusion in an LLM prompt. Each statement is printed on its own line using Julia's standard pretty-printer.

source
Arborist.tree_depthMethod
tree_depth(g::ExprGenome) -> Int

Maximum depth over every statement in g.body. Empty bodies return 0.

source
Arborist.evaluate_genomeMethod
evaluate_genome(g::AntGenome, e::AntEvaluator) -> Float64

Compile and evaluate an AntGenome against the ant trail evaluator.

source
Arborist.gp_ant_food_aheadMethod
gp_ant_food_ahead(::Bool) -> Bool

Sensor primitive: return true if the cell directly ahead of the ant contains food, false otherwise. Does not consume a move or change the ant's pose. The Bool argument is ignored (placeholder for the evolved program's type scheme).

source
Arborist.gp_ant_leftMethod
gp_ant_left(::Bool) -> Bool

Rotate the ant 90° counter-clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.

source
Arborist.gp_ant_moveMethod
gp_ant_move(::Bool) -> Bool

Advance the ant one cell in its current direction, consuming a move. Eats the food pellet in the destination cell if present. The Bool argument is a placeholder for the evolved program's type scheme and is ignored. Returns true if the move happened, false if the simulator is absent or the ant has exhausted its move budget.

source
Arborist.gp_ant_rightMethod
gp_ant_right(::Bool) -> Bool

Rotate the ant 90° clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.

source
Arborist.tree_depthMethod
tree_depth(g::AntGenome) -> Int

Longest root-to-leaf path through the ant program's Expr tree. Note: AntGenome also carries a max_depth field which is the construction ceiling used by _random_ant_program — it limits how deeply a fresh random program is generated but does not bound later mutation output. Use the mutation operator's max_depth kwarg for a post-mutation cap.

source
CommonSolve.solveMethod
solve(problem::GPProblem{AntGenome}, algorithm::GeneticProgramming; ...) -> GPResult

Run GP evolution with AntGenome for side-effectful program synthesis.

Warning

AntGenome uses a module-level simulator reference (_ant_sim_ref) that is not thread-safe. The parallel field on algorithm must be false. For parallel side-effectful evaluation, use thread-local state as demonstrated in the bin packing example.

source
Arborist.deserializeMethod
deserialize(::Type{GraphGenome}, s::AbstractString, n_inputs, n_outputs;
            reassign_innovations=false) -> Union{GraphGenome, Nothing}

Parse the text emitted by serialize(::GraphGenome) back into a GraphGenome. The format is line-oriented:

  • N <id> <type> <activation> — node line
  • C <in>-><out> w=<weight> en=<true|false> i=<innovation> — connection line

Lines not starting with N or C are skipped (this tolerates LLM commentary, code fences, etc.). Returns nothing when a malformed line is encountered, when a connection references an undefined node, or when n_inputs / n_outputs disagree with the decoded node set.

Preserves node IDs and innovation numbers verbatim — required for content-aware distributed migration and NEAT crossover alignment. Pass reassign_innovations=true to issue a fresh innovation ID to every connection via _next_innovation!(); the LLM mutation path uses this to prevent LLM-generated IDs from colliding with the parent pool's history.
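
A parser for this line format might look like the sketch below. parse_graph_text is hypothetical and returns plain named tuples rather than a GraphGenome; the exact whitespace and token rules encoded in the regular expressions are assumptions based only on the two line shapes listed above, and the n_inputs / n_outputs check is omitted.

```julia
# Hypothetical parser: returns (nodes, connections) or nothing on bad input.
function parse_graph_text(s::AbstractString)
    node_re = r"^N\s+(\d+)\s+(\S+)\s+(\S+)$"
    conn_re = r"^C\s+(\d+)->(\d+)\s+w=(\S+)\s+en=(true|false)\s+i=(\d+)$"
    nodes = Dict{Int, Tuple{String, String}}()   # id => (type, activation)
    conns = NamedTuple[]
    for raw in split(s, '\n')
        line = strip(raw)
        if startswith(line, "N ")
            m = match(node_re, line)
            m === nothing && return nothing      # malformed node line
            nodes[parse(Int, m[1])] = (String(m[2]), String(m[3]))
        elseif startswith(line, "C ")
            m = match(conn_re, line)
            m === nothing && return nothing      # malformed connection line
            w = tryparse(Float64, m[3])
            w === nothing && return nothing
            push!(conns, (from = parse(Int, m[1]), to = parse(Int, m[2]),
                          weight = w, enabled = m[4] == "true",
                          innovation = parse(Int, m[5])))
        end
        # Any other line (commentary, code fences) is skipped.
    end
    # Reject connections that reference undefined nodes.
    all(c -> haskey(nodes, c.from) && haskey(nodes, c.to), conns) || return nothing
    return (nodes = nodes, connections = conns)
end

text = "Here is the genome:\nN 1 input identity\nN 2 output tanh\nC 1->2 w=0.5 en=true i=7\n"
parsed = parse_graph_text(text)
```

Innovation numbers are kept exactly as written, which corresponds to the default reassign_innovations=false behavior.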

source
Arborist.evaluate_casesMethod
evaluate_cases(g::GraphGenome, e::GraphEvaluator) -> Vector{Float64}

Return per-sample mean squared error (averaged across outputs) as a Vector{Float64} of length size(e.input_data, 2). Any sample that raises or produces a non-finite squared error is reported as Inf. Used by lexicase selection.

Feedforward mode only: recurrent evaluators have persistent state across samples (samples form a time sequence) so per-sample cases are not independent. Calling this on a recurrent evaluator raises ArgumentError.

source
Arborist.evaluate_genomeMethod
evaluate_genome(g::GraphGenome, e::EpisodicEvaluator) -> Float64

Run e.n_episodes closed-loop rollouts of g as a policy on the environment described by e, return -mean_reward_per_episode.

source
Arborist.evaluate_genomeMethod
evaluate_genome(g::GraphGenome, e::GraphEvaluator) -> Float64

Evaluate a GraphGenome by propagating inputs through the network. Returns mean squared error against target outputs.

  • Feedforward mode (e.allow_recurrent=false, default): topologically sorts the network; returns Inf on cycle. Each sample is independent — node activations reset between samples.
  • Recurrent mode (e.allow_recurrent=true): cycles are allowed. Node activations persist across samples (samples are treated as a time sequence). Each sample runs e.relaxation_passes activation sweeps over all non-input nodes in sorted-id order, reading from the previous pass's values for inputs from cyclic edges.
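
The feedforward branch depends on a topological sort that can report cycles. A minimal sketch using Kahn's algorithm follows; topo_order is a hypothetical helper and the source does not say which sorting algorithm Arborist uses.

```julia
# Returns node ids in topological order, or nothing if the graph has a cycle.
function topo_order(nodes::Vector{Int}, edges::Vector{Tuple{Int, Int}})
    indegree = Dict(n => 0 for n in nodes)
    succ = Dict(n => Int[] for n in nodes)
    for (a, b) in edges
        push!(succ[a], b)
        indegree[b] += 1
    end
    ready = sort([n for n in nodes if indegree[n] == 0])
    order = Int[]
    while !isempty(ready)
        n = popfirst!(ready)
        push!(order, n)
        for m in succ[n]
            indegree[m] -= 1
            indegree[m] == 0 && push!(ready, m)
        end
    end
    # Nodes left unvisited sit on or behind a cycle.
    return length(order) == length(nodes) ? order : nothing
end

acyclic = topo_order([1, 2, 3], [(1, 3), (3, 2)])
cyclic  = topo_order([1, 2, 3], [(1, 2), (2, 3), (3, 2)])
```

In feedforward mode, a nothing result here would correspond to the evaluator returning Inf for the genome.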
source
Arborist.init_innovation_range!Method
init_innovation_range!(offset::Int)

Set the module-local innovation counter to offset. Used by the distributed island model to give each worker a disjoint range of innovation IDs so that NEAT crossover on migrants does not align structurally unrelated genes under the same innovation number.

Callers in distributed mode typically use offsets like (island_id - 1) * 10^9 — disjoint as long as no single worker allocates more than 10^9 structural mutations in a run. The sequential island model does not need this: all islands share the same process-global counter, which already ensures uniqueness.
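
The disjoint-range arithmetic can be checked with a short sketch. The stride of 10^9 comes from the text above; innovation_offset and the assumption that a worker hands out IDs starting at offset + 1 are hypothetical.

```julia
stride = 10^9

# Offset handed to init_innovation_range! for a given island (1-based).
innovation_offset(island_id::Int) = (island_id - 1) * stride

# ID range each of four workers could allocate from without overlap.
ranges = [(innovation_offset(i) + 1):(innovation_offset(i) + stride) for i in 1:4]
```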

source
Arborist.initializeMethod
initialize(::Type{GraphGenome}, n_inputs, n_outputs, rng) -> GraphGenome

Create a minimal fully-connected network: all inputs connected to all outputs with random weights, no hidden nodes. Includes a bias node.

source
CommonSolve.solveMethod
solve(problem::GPProblem{GraphGenome}, algorithm::GeneticProgramming; ...) -> GPResult

Run NEAT-style evolution with GraphGenome. Handles initialization, mutation, crossover with innovation-aligned genes, and speciation.

Accepts any AbstractEvaluator that implements evaluate_genome(::GraphGenome, e) and whose input_signature(e) / output_signature(e) lengths match the intended network dimensions — GraphEvaluator for table-based tasks, EpisodicEvaluator for closed-loop control tasks.

source
Arborist.augmented_operatorsMethod
augmented_operators(base::OperatorEnum, n_adfs::Int) -> OperatorEnum

Build the operator enum used by an ADFGenome's trees: the user's binary operators followed by n_adfs placeholder binary operators (one per ADF). ADF body trees and the main tree share this enum. The placeholders are never actually invoked — expand_adfs rewrites them before evaluation.

source
Arborist.base_operatorsMethod
base_operators(g::ADFGenome) -> OperatorEnum

Recover the user's original operator enum (without the N ADF placeholder slots).

source
Arborist.crossoverMethod
crossover(::SubtreeCrossover, g1::ADFGenome, g2::ADFGenome, rng) -> Tuple

Same-index subtree crossover: pick uniformly among (main, adf1, ..., adfN) and swap subtrees within the chosen tree pair. Requires both genomes to share n_features and n_adfs.

source
Arborist.evaluate_adfMethod
evaluate_adf(g::ADFGenome{T}, X::Matrix{T}, y::Vector{T}) -> Float64

Expand ADF calls and compute MSE against y over X. Returns Inf on expansion or evaluation failure (typical: ARG references with no enclosing ADF context, i.e. ARG slots leaked into the main tree's expanded form).

source
Arborist.expand_adfsMethod
expand_adfs(g::ADFGenome{T}) -> Node{T}

Produce a fully-expanded copy of g.main where every ADF call has been replaced by the corresponding ADF body with ARG references substituted for the call's argument subtrees. The result uses only the base operators (no placeholders) and references only real features [1, n_features]. Suitable for direct eval_tree_array evaluation against the user's base_operators.
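
The substitution step can be illustrated on Julia Expr trees. This is only an analogy: ADFGenome stores Node{T} trees, and substitute_args, expand, and the adf1 / ARG0 / ARG1 symbols below are hypothetical stand-ins.

```julia
# Replace ARG symbols in an ADF body with the call's argument subtrees.
substitute_args(x, bindings) = x
substitute_args(s::Symbol, bindings) = get(bindings, s, s)
substitute_args(ex::Expr, bindings) =
    Expr(ex.head, (substitute_args(a, bindings) for a in ex.args)...)

# Inline every call whose name is a key of `adfs`.
expand(x, adfs) = x
function expand(ex::Expr, adfs::Dict{Symbol, Expr})
    args = Any[expand(a, adfs) for a in ex.args]   # expand arguments first
    if ex.head == :call && haskey(adfs, args[1])
        # Bind ARG0, ARG1, ... to the call's argument subtrees.
        bindings = Dict{Symbol, Any}(Symbol("ARG", i - 1) => a
                                     for (i, a) in enumerate(args[2:end]))
        return substitute_args(adfs[args[1]], bindings)
    end
    return Expr(ex.head, args...)
end

adfs = Dict(:adf1 => :(ARG0 * ARG0 + ARG1))
expanded = expand(:(adf1(x1 + 1.0, x2)), adfs)
```

ADF bodies are inlined as-is and not expanded again, matching the restriction that nested ADF-from-ADF calls are not supported.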

source
Arborist.initialize_adfMethod
initialize(::Type{ADFGenome{T}}, base_ops, n_features, n_adfs;
           arity=2, max_depth=4, rng) -> ADFGenome{T}

Construct a random ADFGenome with n_adfs ADFs. Main tree uses base operators plus ADF placeholders; ADF bodies use only base operators (nested ADF calls are not generated). ADF bodies may reference ARG slots in addition to the user's features.

source
Arborist.mutateMethod
mutate(::SubtreeMutation, g::ADFGenome, rng) -> ADFGenome

Pick uniformly among (main, adf1, ..., adfN) and apply subtree mutation to that tree. ADF bodies use the base operator set and may reference ARG slots; main tree uses augmented operators and references only real features.

source
Arborist.tree_depthMethod
tree_depth(g::ADFGenome) -> Int

Maximum of count_depth across the main tree and every ADF body, i.e. the depth of the deepest tree as stored in the genome. It does not account for expansion-driven inlining, which can compose depths up to depth(main) + depth(any_adf) - 1 in the fully expanded form.

source
Arborist.SymbolicRegressionEvaluatorMethod
SymbolicRegressionEvaluator(f; domain, points=20, operators=_default_operators(Float32), noise=0.0)

Convenience constructor for symbolic regression problems. Generates a TreeFitnessEvaluator from a Julia function and domain specification.

Arguments

  • f: Target function (univariate: accepts Float32, multivariate: accepts Vector{Float32})
  • domain: Tuple{T,T} for univariate, Vector{Tuple{T,T}} for multivariate
  • points: Sample points per dimension (default: 20)
  • operators: OperatorEnum (default: +, -, *, / with sin, cos, exp, abs)
  • noise: Gaussian noise standard deviation to add to targets (default: 0.0)
source
Arborist.deserializeMethod
deserialize(::Type{TreeGenome{T}}, s, operators, n_features) -> Union{TreeGenome{T}, Nothing}

Parse a string representation of an expression tree back into a TreeGenome{T}. Accepts both the infix form emitted by serialize / DynamicExpressions' string_tree (e.g. x1 + 1.0, sin((x1 + 1.0) * x2)) and the prefix s-expression form used by older code paths (e.g. +(x1, 1.0), sin(*(x1, x2))). Both forms are accepted because Meta.parse normalizes them to the same Expr(:call, ...) structure that _expr_to_node walks.

Returns nothing for unparseable input, unrecognized operators, or out-of-range feature indices. The caller is responsible for any fallback behavior.
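
The normalization claim can be checked with Base Julia alone. The snippet below is a sketch that uses only `Meta.parse`; it does not call Arborist's deserialize or its internal _expr_to_node.

```julia
# Infix form, as a string_tree-style serializer might emit it.
infix = Meta.parse("sin((x1 + 1.0) * x2)")

# Prefix s-expression-like form of the same tree.
prefix = Meta.parse("sin(*(+(x1, 1.0), x2))")

# Both parse to the same nested Expr(:call, ...) structure.
same_structure = infix == prefix
root_operator = infix.args[1]   # the operator sits in the first slot of a :call
```

A walker over this structure only needs to handle three cases: a `:call` expression (operator plus operands), a `Symbol` such as `x1` (feature reference), and a numeric literal (constant).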

source
Arborist.erc_uniformMethod
erc_uniform(lo::T, hi::T) -> Function

Return a callable (rng::AbstractRNG) -> T that samples uniformly from [lo, hi]. Pass the result as GeneticProgramming(; constant_sampler=...) to wire Koza-style Ephemeral Random Constants into TreeGenome creation and mutation.

alg = GeneticProgramming(; constant_sampler = erc_uniform(-5.0f0, 5.0f0))
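
For readers wanting to supply a custom sampler, the closure shape is simple. The following is a minimal sketch of a sampler with the same (rng) -> T calling convention; `erc_uniform_sketch` is a hypothetical name, and because `rand` draws from [0, 1) the upper bound is effectively exclusive in this version.

```julia
using Random

# Hypothetical stand-in: returns a closure that maps an RNG to a constant of type T.
function erc_uniform_sketch(lo::T, hi::T) where {T<:AbstractFloat}
    lo <= hi || throw(ArgumentError("lo must not exceed hi"))
    return rng -> lo + (hi - lo) * rand(rng, T)
end

sampler = erc_uniform_sketch(-5.0f0, 5.0f0)
c = sampler(MersenneTwister(42))   # one ephemeral constant
```

Any callable with this convention, for example one drawing from a normal distribution or from a fixed set of constants, should fit the same slot, assuming the constant_sampler keyword only requires a callable taking an RNG.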
source
Arborist.evaluateMethod
evaluate(e::TreeFitnessEvaluator{T}, g::TreeGenome{T}) -> Float64

Evaluate a TreeGenome against the data matrix. Returns mean squared error. Returns Inf if evaluation throws or produces NaN/Inf values.
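
The failure handling described above can be sketched independently of the tree machinery. In the snippet below, `predict` is a hypothetical stand-in for evaluating the tree on the data matrix, and `mse_or_inf` is an illustrative name rather than an Arborist function.

```julia
# Mean squared error with an Inf fallback for exceptions and non-finite outputs.
function mse_or_inf(predict, X, y)
    try
        yhat = predict(X)
        all(isfinite, yhat) || return Inf
        return Float64(sum(abs2, yhat .- y) / length(y))
    catch
        return Inf   # any evaluation error counts as worst-possible fitness
    end
end

X = Float32[1 2 3]
y = Float32[1, 2, 5]
loss = mse_or_inf(M -> vec(M), X, y)
```

Mapping every failure to Inf keeps the fitness ordering total, so a broken individual is simply ranked last instead of aborting the run.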

source
Arborist.evaluate_casesMethod
evaluate_cases(g::TreeGenome{T}, e::TreeFitnessEvaluator{T}) -> Vector{Float64}

Return per-sample squared error as a Vector{Float64} of length length(e.y). Non-finite samples (NaN/Inf after vectorised evaluation) are reported as Inf. If evaluation fails outright, every sample is reported as Inf. Used by lexicase selection.
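
A per-case variant of the same idea might look like the sketch below. As before, `predict` is a hypothetical stand-in for tree evaluation and `per_case_errors` is an illustrative name; only the Inf conventions are taken from the description above.

```julia
# Per-sample squared error; non-finite predictions and failures become Inf.
function per_case_errors(predict, X, y::AbstractVector)
    n = length(y)
    yhat = try
        predict(X)
    catch
        return fill(Inf, n)   # evaluation failed: every case is Inf
    end
    return Float64[isfinite(p) ? abs2(Float64(p) - Float64(t)) : Inf
                   for (p, t) in zip(yhat, y)]
end

X = Float32[1 2 3]
y = Float32[1, 4, 9]
cases = per_case_errors(M -> vec(M) .^ 2, X, y)
```

Lexicase selection filters candidates one case at a time, which is why it needs a vector like this instead of a single aggregated loss.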

source
Arborist.from_migrantMethod
from_migrant(m::MigrantGenome, ctx::TreeGenomeContext{T}) -> TreeGenome{T}

Reconstruct a TreeGenome{T} from a MigrantGenome by wrapping the transported Node{T} with the destination island's operators and n_features from ctx.

source
Arborist.optimize_constants!Method
optimize_constants!(g::TreeGenome, e::TreeFitnessEvaluator; kwargs...) -> Float64

Apply BFGS with central finite-difference gradients to the constants of g.tree against e's dataset. Mutates g.tree in place; returns the post-optimization MSE loss.

Returns the pre-optimization loss unchanged if the tree has zero constants or if the initial evaluation produces Inf / NaN. Never makes the tree worse: if BFGS diverges or line search fails, constants are restored and the original loss is returned.

Keyword arguments

  • max_iter::Int = 50
  • tol::Float64 = 1e-8
  • fd_step::Float64 = 1e-3
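
The two ingredients named above, central finite-difference gradients and the "never worse" guard, can be sketched in isolation. The code below is not BFGS: it substitutes plain gradient descent with a hypothetical step size `lr` to keep the example short, and the names `central_gradient` and `guarded_optimize!` are illustrative only.

```julia
# Central finite-difference gradient of f at theta.
function central_gradient(f, theta::Vector{Float64}; fd_step::Float64=1e-3)
    g = similar(theta)
    for i in eachindex(theta)
        old = theta[i]
        theta[i] = old + fd_step; fplus = f(theta)
        theta[i] = old - fd_step; fminus = f(theta)
        theta[i] = old                      # restore the perturbed entry
        g[i] = (fplus - fminus) / (2 * fd_step)
    end
    return g
end

# Gradient-descent stand-in for BFGS, with restore-on-failure semantics.
function guarded_optimize!(f, theta::Vector{Float64};
                           max_iter::Int=50, tol::Float64=1e-8,
                           fd_step::Float64=1e-3, lr::Float64=0.1)
    initial = f(theta)
    (isempty(theta) || !isfinite(initial)) && return initial
    backup = copy(theta)
    loss = initial
    for _ in 1:max_iter
        g = central_gradient(f, theta; fd_step=fd_step)
        sqrt(sum(abs2, g)) < tol && break
        theta .-= lr .* g
        loss = f(theta)
        isfinite(loss) || break
    end
    if !isfinite(loss) || loss > initial
        copyto!(theta, backup)              # never return a worse tree
        return initial
    end
    return loss
end

quad(t) = (t[1] - 3.0)^2
theta = [0.0]
final_loss = guarded_optimize!(quad, theta)
```

With a deliberately oversized step (for example lr=2.0) the iteration diverges, the guard restores the original constants, and the initial loss is returned.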
source
Arborist.to_migrantMethod
to_migrant(g::TreeGenome{T}, fitness::Float64) -> MigrantGenome

Pack a TreeGenome's Node{T} into a MigrantGenome for cross-island (and cross-process) migration. The Node{T} is carried directly rather than going through the string-based serialize/deserialize path: direct transport avoids any parse ambiguity, preserves exact bit patterns of Float32 constants, and is independent of DynamicExpressions' string_tree output format. Julia's Distributed serializer handles Node{T} natively.

The destination island's OperatorEnum is reattached in from_migrant. Op indices stored in Node{T} are stable across islands because every island holds the same OperatorEnum built from the problem.

source
Arborist.tree_depthMethod
tree_depth(g::TreeGenome) -> Int

Longest root-to-leaf path through the genome's expression tree, computed via DynamicExpressions.count_depth. A bare-leaf tree has depth 1.

source
CommonSolve.solveMethod
solve(problem::GPProblem{TreeGenome{T}}, algorithm::GeneticProgramming; ...) -> GPResult

Run genetic programming evolution with TreeGenome. Uses DynamicExpressions.jl for fast vectorized evaluation without @eval.

source
Arborist.from_migrantMethod
from_migrant(m::MigrantGenome, state::GenState) -> ExprGenome

Reconstruct an ExprGenome from a MigrantGenome using the local GenState.

source
Arborist.from_migrantMethod
from_migrant(m::MigrantGenome, ctx::GraphGenomeContext) -> GraphGenome

Context-dispatched form used by IslandModel (_inject_migrants_local! calls from_migrant(m, island.state) uniformly across genome types). GraphGenome migrants carry their own n_inputs / n_outputs, so the context is consulted only for dispatch.

source
Arborist.from_migrantMethod
from_migrant(m::MigrantGenome, ::Type{GraphGenome}) -> GraphGenome

Reconstruct a GraphGenome from a MigrantGenome.

source
Arborist.from_migrantMethod
from_migrant(m::MigrantGenome, primitives::Vector{Symbol},
             conditions::Vector{Symbol}, max_depth::Int) -> AntGenome

Reconstruct an AntGenome from a MigrantGenome.

source
Arborist.to_migrantMethod
to_migrant(g::AntGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from an AntGenome for cross-process migration.

source
Arborist.to_migrantMethod
to_migrant(g::ExprGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from an ExprGenome for cross-process migration.

source
Arborist.to_migrantMethod
to_migrant(g::GraphGenome, fitness::Float64) -> MigrantGenome

Extract serializable data from a GraphGenome for cross-process migration.

source
Arborist.fitnessesMethod
fitnesses(hof::HallOfFame) -> Vector{Float64}

Return the archive's fitness list in order (best first). Alias for hof.fitnesses — kept as a function for API stability if the internal representation changes.

source
Base.push!Method
push!(hof::HallOfFame{G}, genome::G, fitness::Real)

Insert (genome, fitness) into the hall if it qualifies. Non-finite fitnesses are rejected. Duplicates (fitness within 1e-12) are rejected. When the archive is at capacity, the worst-fitness entry is evicted if the candidate is strictly better.
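
The insertion rules above can be sketched with a small stand-in container. `ToyHall` and `toy_push!` are hypothetical names, and the sketch assumes lower fitness is better and that entries are kept sorted best first; it is not the actual HallOfFame implementation.

```julia
struct ToyHall{G}
    capacity::Int
    genomes::Vector{G}
    fitnesses::Vector{Float64}   # kept sorted ascending (best first)
end
ToyHall{G}(capacity::Int) where {G} = ToyHall{G}(capacity, G[], Float64[])

function toy_push!(h::ToyHall{G}, genome::G, fitness::Real) where {G}
    f = Float64(fitness)
    h.capacity > 0 || return h
    isfinite(f) || return h                                  # reject NaN / Inf
    any(x -> abs(x - f) <= 1e-12, h.fitnesses) && return h   # reject duplicates
    if length(h.fitnesses) >= h.capacity
        f < h.fitnesses[end] || return h                     # must beat the worst
        pop!(h.genomes); pop!(h.fitnesses)                   # evict the worst entry
    end
    i = searchsortedfirst(h.fitnesses, f)
    insert!(h.genomes, i, genome)
    insert!(h.fitnesses, i, f)
    return h
end

hall = ToyHall{String}(2)
for (g, f) in [("a", 3.0), ("b", 1.0), ("c", NaN), ("d", 1.0), ("e", 2.0), ("f", 5.0)]
    toy_push!(hall, g, f)
end
```

In this run "c" is rejected as non-finite, "d" as a duplicate of "b", "e" evicts "a", and "f" is rejected because it does not beat the current worst entry.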

source
Arborist.to_dotFunction
to_dot(g) -> String
to_dot(io::IO, g)

Produce a Graphviz DOT document describing the genome g. Supported inputs are TreeGenome, ExprGenome, ADFGenome, AntGenome, and GraphGenome.

For tree-structured genomes the output is a directed acyclic graph with ellipse-shaped nodes labeled by operator / constant / variable. For GraphGenome the output is a left-to-right network diagram with distinct node shapes by role (input, output, bias, hidden) and edges labeled by connection weight, with disabled connections shown dashed and gray.

The one-argument form returns the document as a String. The two-argument form writes to io and returns io for chaining. Tree-genome methods do not throw on NaN / Inf constants; such constants are rendered literally.
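
To give a feel for the output shape, here is a minimal DOT emitter for plain Julia `Expr` trees. It is a sketch only: `expr_to_dot` is a hypothetical name, the exact node ids and attributes Arborist emits are not documented here, and labels are not escaped.

```julia
# Two-argument form: write DOT to `io` and return `io`.
function expr_to_dot(io::IO, ex)
    println(io, "digraph G {")
    println(io, "  node [shape=ellipse];")
    counter = Ref(0)
    function visit(node)
        id = (counter[] += 1)
        label = node isa Expr ? string(node.args[1]) : string(node)
        println(io, "  n$id [label=\"$label\"];")
        if node isa Expr
            for child in node.args[2:end]
                cid = visit(child)
                println(io, "  n$id -> n$cid;")
            end
        end
        return id
    end
    visit(ex)
    println(io, "}")
    return io
end

# One-argument form: return the document as a String.
expr_to_dot(ex) = String(take!(expr_to_dot(IOBuffer(), ex)))

dot = expr_to_dot(Meta.parse("sin(x1) + NaN"))
```

The resulting string can be written to a file and rendered with the Graphviz command-line tools, for example `dot -Tpng tree.dot -o tree.png`.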

source

Constants

Arborist.ACTIVATION_FNSConstant
ACTIVATION_FNS

Dictionary mapping activation Symbol names to their unary Function implementations, used by GraphEvaluator when propagating values through a GraphGenome. The built-in set is:

  • :sigmoid — NEAT-style steepened logistic 1 / (1 + exp(-4.9·x)).
  • :tanh — hyperbolic tangent.
  • :relu — rectified linear, max(0, x).
  • :identity — x (pass-through).
  • :gauss — Gaussian bump exp(-x²). Common in CPPN / HyperNEAT work.
  • :sin — plain sin(x). Substrate or network is expected to supply any frequency scaling.
  • :abs — absolute value |x|.
  • :step — Heaviside step, 1.0 for x > 0, else 0.0.

New activations can be added by assigning into this dict before solving; each NodeGene stores the activation as a Symbol and looks the function up here at evaluation time.
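
The lookup-by-Symbol pattern can be sketched with a standalone dictionary. `TOY_ACTIVATIONS`, `activate`, and the added `:softplus` entry are hypothetical; the formulas for the built-in names follow the list above, but this is not the package's own table.

```julia
const TOY_ACTIVATIONS = Dict{Symbol,Function}(
    :sigmoid  => x -> 1 / (1 + exp(-4.9 * x)),   # NEAT-style steepened logistic
    :tanh     => tanh,
    :relu     => x -> max(zero(x), x),
    :identity => identity,
    :gauss    => x -> exp(-x^2),
    :sin      => sin,
    :abs      => abs,
    :step     => x -> x > 0 ? 1.0 : 0.0,
)

# Registering a custom activation is a plain dictionary assignment.
TOY_ACTIVATIONS[:softplus] = x -> log1p(exp(x))

# Nodes store only the Symbol; the function is looked up at evaluation time.
activate(name::Symbol, x) = TOY_ACTIVATIONS[name](x)
```

Because the lookup happens at evaluation time, an entry must be registered before any genome that references its Symbol is evaluated, otherwise the lookup raises a KeyError in this sketch.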

Note: the NEAT mutation operators (AddNodeMutation, NEATDefaultMutation) only draw from :sigmoid, :tanh, :relu by default when adding a new hidden node. To make CPPN activations available to those operators, pass hidden_activations=[:sigmoid, :tanh, :gauss, :sin, :abs] (or similar) at construction.

source