Genome Types — API Reference
Autogenerated reference for every exported genome-related symbol (AbstractGenome, the concrete genomes, the GPProblem carrier, the per-genome evaluators and primitives, the all-time-best HallOfFame container, and the Graphviz to_dot visualization helper).
Types
Arborist.AbstractCrossoverOperator — Type
AbstractCrossoverOperator
Base type for crossover operators.
Concrete subtypes define crossover(op, g1::AbstractGenome, g2::AbstractGenome, rng::AbstractRNG) -> Tuple. May also override operator_name(op) -> Symbol for RunLog tallies.
Arborist.AbstractEvaluator — Type
AbstractEvaluator
Base type for fitness evaluators.
Any concrete subtype E <: AbstractEvaluator must implement:
- evaluate(e::E, f::Function) -> Float64 (lower is better)
- input_signature(e::E) -> Dict{Symbol, DataType}
- output_signature(e::E) -> Dict{Symbol, DataType}
Optionally, evaluators that can decompose fitness into independent per-case losses (e.g. per-row MSE, per-sample squared error) may implement evaluate_cases(g::AbstractGenome, e::E) -> Vector{Float64}. This is required for lexicase selection; evaluators that cannot meaningfully decompose (e.g. AntEvaluator, EpisodicEvaluator) should leave it unimplemented — lexicase will then raise a clear MethodError.
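The relationship between the scalar fitness and the per-case losses can be sketched without Arborist. Everything below is a hypothetical stand-in (SketchEvaluator, TableSketch, and case_losses are illustration-only names, and the per-case function takes a plain function instead of a genome to stay self-contained); it only mirrors the shape of the required methods.

```julia
# Hypothetical stand-ins; these are not Arborist's own definitions.
abstract type SketchEvaluator end

struct TableSketch <: SketchEvaluator
    xs::Vector{Float64}
    ys::Vector{Float64}
end

# Per-case losses: one squared error per row.
case_losses(e::TableSketch, f::Function) = [(f(x) - y)^2 for (x, y) in zip(e.xs, e.ys)]

# Scalar fitness: mean of the per-case losses (lower is better).
evaluate(e::TableSketch, f::Function) = sum(case_losses(e, f)) / length(e.xs)

input_signature(::TableSketch) = Dict{Symbol, DataType}(:x => Float64)
output_signature(::TableSketch) = Dict{Symbol, DataType}(:y => Float64)

e = TableSketch([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
perfect = evaluate(e, x -> x^2)     # exact fit, so the mean loss is 0.0
linear = case_losses(e, x -> x)     # only the last row is wrong: [0.0, 0.0, 4.0]
```

An evaluator whose fitness is not a sum or mean over independent rows has no natural case_losses analogue, which is why the per-case method is opt-in.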
Arborist.AbstractEvolutionResult — Type
AbstractEvolutionResult
Base type for results returned by solve.
Arborist.AbstractEvolutionaryAlgorithm — Type
AbstractEvolutionaryAlgorithm
Base type for evolutionary algorithm configurations.
Arborist.AbstractGenome — Type
AbstractGenome
Base type for all genome representations in Arborist.
A concrete subtype G <: AbstractGenome participates in evolution by providing the operations the solve path needs. At minimum:
- mutate(op, g::G, rng::AbstractRNG) -> G for each mutation operator that dispatches on G (or a direct mutate(g::G, rng) method for genome types that use direct dispatch, e.g. AntGenome, GraphGenome).
- crossover(op, g1::G, g2::G, rng::AbstractRNG) -> Tuple{G, G} (or a direct crossover(g1, g2, rng) method for direct-dispatch genomes).
- distance(g1::G, g2::G) -> Float64: used by ThresholdSpeciation.
- complexity(g::G) -> Real: used by bloat penalty and ParsimonyEvaluator.
- serialize(g::G) -> String: used by the LLM operator and logging.
Population initialization is genome-specific. Each genome type defines its own construction path invoked from the matching solve method; the signature is not fixed — ExprGenome uses a GenState, TreeGenome takes an OperatorEnum and feature count, AntGenome takes a primitive set, and GraphGenome takes input/output counts. See the per-genome solve(::GPProblem{G,E}, ::GeneticProgramming) methods.
deserialize(::Type{G}, s::String, ctx...) is required only when using the LLM mutation operator on G; its extra arguments depend on G (e.g. GenState for ExprGenome, (OperatorEnum, n_features) for TreeGenome). LLMMutationOperator currently dispatches on ExprGenome only.
Arborist.AbstractMutationOperator — Type
AbstractMutationOperator
Base type for mutation operators.
Concrete subtypes define mutate(op, g::AbstractGenome, rng::AbstractRNG) -> AbstractGenome. Concrete subtypes may also override operator_name(op) -> Symbol to expose a friendly key for RunLog's per-operator tallies. The default derives the name from the struct type.
Arborist.AbstractSelectionStrategy — Type
AbstractSelectionStrategy
Base type for parent selection strategies (e.g., tournament selection, lexicase selection).
Concrete subtypes must implement:
- select_parent(s::S, selection_fitnesses::Vector{Float64}, case_fitnesses, rng) returning the integer index of the selected parent in genomes / selection_fitnesses. Case-based strategies (lexicase) use the matrix; scalar-fitness strategies (tournament) ignore it.
- needs_cases(s::S) -> Bool (default false). Strategies that return true cause the solve loop to materialize a per-case fitness vector for every individual each generation via evaluate_cases. Returning true requires the evaluator to implement evaluate_cases; otherwise the solve loop raises MethodError the first time it tries.
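A scalar-fitness strategy that satisfies this contract might look like the following minimal sketch. TournamentSketch is a hypothetical type defined here for illustration and does not subtype the real AbstractSelectionStrategy; the argument order follows the signature above.

```julia
using Random

# Hypothetical strategy type, not part of Arborist.
struct TournamentSketch
    k::Int   # tournament size
end

# Scalar strategy: no per-case fitnesses are needed.
needs_cases(::TournamentSketch) = false

function select_parent(s::TournamentSketch, selection_fitnesses::Vector{Float64},
                       case_fitnesses, rng::AbstractRNG)
    # Draw k entrants with replacement and keep the one with the lowest fitness.
    best = rand(rng, eachindex(selection_fitnesses))
    for _ in 2:s.k
        challenger = rand(rng, eachindex(selection_fitnesses))
        if selection_fitnesses[challenger] < selection_fitnesses[best]
            best = challenger
        end
    end
    return best
end

fits = [3.0, 0.5, 2.0, 9.0]
idx = select_parent(TournamentSketch(3), fits, nothing, MersenneTwister(1))
```

The case_fitnesses argument is accepted but never read, matching the note that tournament-style strategies ignore it.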
Arborist.AbstractSpeciation — Type
AbstractSpeciation
Base type for speciation strategies.
Arborist.AbstractTopology — Type
AbstractTopology
Base type for island migration topologies. Concrete subtypes define migration_targets(t, i, n_islands, rng) returning destination island indices.
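As an illustration of the migration_targets contract, a ring topology could be sketched as below. RingSketch is a hypothetical type that does not subtype the real AbstractTopology, and 1-based island indices are an assumption.

```julia
using Random

# Hypothetical topology type, not part of Arborist.
struct RingSketch end

# Each island sends migrants to its next neighbour, wrapping around at the end.
# The rng argument is part of the signature but unused by a deterministic ring.
migration_targets(::RingSketch, i::Int, n_islands::Int, rng::AbstractRNG) =
    [mod1(i + 1, n_islands)]

targets = [migration_targets(RingSketch(), i, 4, MersenneTwister(0)) for i in 1:4]
```

A randomized topology would use rng to pick destinations, which is presumably why it appears in the signature.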
Arborist.ExprGenome — Type
ExprGenome <: AbstractGenome
Genome representation based on Julia Expr trees. Wraps the existing codegen.jl / evolution.jl infrastructure.
Fields
- body::Vector{Expr}: body statements (not yet wrapped in a function harness)
- state::GenState: type context carrying variable types, function set, etc.
Known limitations
- serialize/deserialize round-trip is ~80% reliable. repr()-style Float32(literal) forms produced by the Julia printer fail type-checking on round-trip. The LLM operator falls back silently to a classical operator, but checkpoint/resume or cross-process migration can lose a fraction of individuals.
- @eval grows Julia's method table monotonically across generations. Long runs (thousands of generations × hundreds of individuals) accumulate tens of thousands of methods, slowing dispatch. Use TreeGenome for long runs where applicable.
Arborist.GPProblem — Type
GPProblem{G<:AbstractGenome, E<:AbstractEvaluator}
Problem specification for genetic programming. Combines an evaluator (which defines the fitness landscape) with a genome type and configuration.
Fields
- evaluator::E: the fitness evaluator
- genome_type::Type{G}: the genome type to evolve
- function_set::FunctionSet: available functions for code generation
- num_temps::Int: number of temporary variables per genome
- seed::Union{Int, Nothing}: random seed for reproducibility (nothing for no seeding)
Arborist.GPProblem — Method
GPProblem(evaluator, ::Type{G}; function_set, num_temps, seed) -> GPProblem
Construct a GPProblem with keyword arguments and sensible defaults.
Arborist.AntEvaluator — Type
AntEvaluator <: AbstractEvaluator
Fitness evaluator for AntGenome. Compiles and executes the evolved program with an AntSimulator. Returns the number of uneaten food pellets as fitness (lower is better, 0 = perfect).
Arborist.AntGenome — Type
AntGenome <: AbstractGenome
A genome for evolving programs that control an agent via side-effectful primitives. Unlike ExprGenome, AntGenome does not require a typed input/output signature: the program operates on implicit agent state via a module-level simulator reference.
Suitable for the Santa Fe Ant Trail and similar control problems.
Fields
- program::Expr: a :block expression of nested primitive calls and control flow
- primitives::Vector{Symbol}: action primitives (consume moves)
- conditions::Vector{Symbol}: condition primitives (sensors)
- max_depth::Int: maximum program depth
Known limitations
- Not thread-safe. The simulator uses a module-level Ref for state. GeneticProgramming(; parallel=true) with AntGenome raises a runtime error. Use parallel=false or refactor to a thread-local-state pattern (as demonstrated in examples/bin_packing.jl and examples/sorting.jl).
Arborist.AntSimulator — Type
AntSimulator
Mutable state for ant simulation on a toroidal grid.
Arborist.ConnectionGene — Type
ConnectionGene
A directed connection between two nodes.
Arborist.EpisodicEvaluator — Type
EpisodicEvaluator{FInit,FDyn,FRew,FDone,FObs,FDec} <: AbstractEvaluator
Evaluates a GraphGenome as a closed-loop policy on an episodic environment defined by declarative callables. The network is treated as obs -> action, and the evaluator drives the loop:
    for ep in 1:n_episodes
        rng = MersenneTwister(episode_seed_base + ep)
        state = initial_state(rng)
        for step in 1:max_steps
            obs = observe(state)
            net_output = forward_network(state, obs)
            action = decode_action(net_output)
            next_state = dynamics(state, action)
            total += reward(state, action, next_state)
            state = next_state
            done(state) && break
        end
    end
Fitness is -mean_reward_per_episode (framework convention is lower-is-better, so episodic tasks that want to maximise reward are negated). On cycle detection in allow_recurrent=false mode, returns Inf.
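The loop above can be made concrete with a toy environment. The sketch below is self-contained and does not use Arborist: the environment callables, the fixed policy standing in for the network forward pass, and the names episodic_fitness and is_done are all assumptions made for illustration.

```julia
using Random

# Toy environment (hypothetical): a point on a line that should reach 0.
initial_state(rng) = 2.0 * rand(rng) - 1.0           # start somewhere in [-1, 1)
observe(state) = [state]
decode_action(out) = clamp(out[1], -0.1, 0.1)        # bounded step size
dynamics(state, action) = state + action
reward(state, action, next_state) = -abs(next_state) # closer to 0 is better
is_done(state) = abs(state) < 1e-3

# Stand-in for the network forward pass: a fixed proportional controller.
policy(obs) = [-obs[1]]

function episodic_fitness(; n_episodes = 3, max_steps = 50, episode_seed_base = 0)
    total = 0.0
    for ep in 1:n_episodes
        rng = MersenneTwister(episode_seed_base + ep)  # one reproducible RNG per episode
        state = initial_state(rng)
        for _ in 1:max_steps
            action = decode_action(policy(observe(state)))
            next_state = dynamics(state, action)
            total += reward(state, action, next_state)
            state = next_state
            is_done(state) && break
        end
    end
    return -total / n_episodes    # negated mean reward per episode: lower is better
end

fitness = episodic_fitness()
```

Because each episode seeds its own RNG from episode_seed_base + ep, two calls with the same arguments return the same fitness, which is the reproducibility property the initial_state field asks for.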
Fields
- n_inputs::Int / n_outputs::Int: dimensions the network expects; must match length(observe(state)) and length(net_output).
- initial_state::FInit: rng -> state. Must be reproducible from rng.
- dynamics::FDyn: (state, action) -> next_state.
- reward::FRew: (state, action, next_state) -> Float64.
- done::FDone: state -> Bool. Stops the episode early when true.
- observe::FObs: state -> Vector{Float64} of length n_inputs.
- decode_action::FDec: Vector{Float64} of length n_outputs -> action.
- max_steps::Int: per-episode step cap.
- n_episodes::Int: rollouts averaged per evaluate_genome call.
- episode_seed_base::Int: rng = MersenneTwister(base + ep_index).
- activation_fns::Dict{Symbol,Function}: defaults to ACTIVATION_FNS.
- allow_recurrent::Bool: defaults to true (episodic tasks usually want persistent hidden-node state across timesteps).
- relaxation_passes::Int: recurrent-mode sweeps per step; default 1.
Design
The shape is declarative / pure-functional by default (see memory/episodicevaluatordesign.md). For environments with heavy reusable state (physics-engine handle, loaded dataset), a future StatefulEpisodicEvaluator subtype can offer the reset!/step! idiom; it is intentionally not built yet.
Known limitations
- Not parallel-safe for stateful environments. The declarative API is structurally thread-safe when every callable is pure, but a dynamics closure that captures mutable state will race under GeneticProgramming(; parallel=true). Use parallel=false for stateful environments until StatefulEpisodicEvaluator lands.
Arborist.EpisodicEvaluator — Method
EpisodicEvaluator(n_inputs, n_outputs, initial_state, dynamics, reward, done,
                  observe, decode_action; max_steps, n_episodes, ...)
Outer constructor. Keyword-argument defaults:
- max_steps = 1000
- n_episodes = 1
- episode_seed_base = 0
- activation_fns = ACTIVATION_FNS
- allow_recurrent = true
- relaxation_passes = 1
Arborist.GraphEvaluator — Type
GraphEvaluator <: AbstractEvaluator
Evaluates a GraphGenome by building the neural network from the genome topology, running it on input data, and computing MSE against target outputs.
Fields
- input_data::Matrix{Float64}: n_inputs × n_samples. In recurrent mode, samples are treated as a time sequence and node activations persist across samples.
- output_data::Matrix{Float64}: n_outputs × n_samples.
- activation_fns::Dict{Symbol, Function}: activation function lookup.
- allow_recurrent::Bool: when true, cycles in the genome are allowed and evaluation uses a relaxation loop with state that persists across samples. Default false: cycles return Inf, state resets per sample.
- relaxation_passes::Int: number of activation sweeps per sample when allow_recurrent=true. Default 1. Higher values let information propagate further through the network within a single sample.
Arborist.GraphEvaluator — Method
GraphEvaluator(input_data, output_data;
               activation_fns=ACTIVATION_FNS,
               allow_recurrent=false,
               relaxation_passes=1)
Construct a GraphEvaluator. Defaults match the original feedforward behavior: cycles return Inf, per-sample state reset, single forward pass. Pass allow_recurrent=true for sequence/memory tasks where node activations should persist across samples (and cycles are legal).
Arborist.GraphGenome — Type
GraphGenome <: AbstractGenome
A genome representing a neural network topology, following the NEAT encoding (Stanley & Miikkulainen, 2002). Supports structural mutation (add node, add connection) and weight mutation, plus crossover aligned by innovation number.
Fields
- nodes::Dict{Int, NodeGene}: node genes keyed by node ID
- connections::Dict{Int, ConnectionGene}: connection genes keyed by innovation number
- n_inputs::Int: number of input nodes (not counting bias)
- n_outputs::Int: number of output nodes
- fitness::Float64: cached fitness value
Known limitations
- Distributed NEAT innovation matching is disjoint-range, not content-aware. Under IslandModel(distributed=true), each worker gets a unique innovation ID range via init_innovation_range!((island_id - 1) * INNOVATION_STRIDE) so IDs don't collide. The cost: structurally identical mutations on different workers receive different IDs and are treated as disjoint by NEAT crossover rather than aligned. Per-generation cross-worker innovation dedup is not implemented.
Arborist.GraphGenomeContext — Type
GraphGenomeContext
Per-island state carrier for GraphGenome under IslandModel. Parallels GenState (ExprGenome) and TreeGenomeContext (TreeGenome): all three carry .rng so that island-loop sites reading state.rng work uniformly, and all three are the second element of the tuple returned from _initialize_population.
The extra n_inputs / n_outputs fields are kept available for future use (e.g. cross-island initialization) but aren't currently consulted — migrant GraphGenomes carry their own n_inputs / n_outputs.
Arborist.NodeGene — Type
NodeGene
A single node in a neural network topology genome.
Arborist.ADFGenome — Type
ADFGenome{T} <: AbstractGenome
Genome with a main expression tree and N = length(adfs) Automatically Defined Function trees. Each ADF is a Node{T} that may reference the ARG slots ARG0..ARG{arity-1} (encoded as features above the user's n_features).
The main tree may invoke any ADF as a binary operator at slot n_base_binary + i. ADFs themselves may reference features [1, n_features] and ARG slots; nested ADF-from-ADF calls are not currently supported (ADF body is generated without ADF placeholders).
Fields
- main::Node{T}: main expression tree.
- adfs::Vector{Node{T}}: ADF body trees, one per ADF.
- arity::Int: shared arity of every ADF (default 2).
- operators::OperatorEnum: the augmented operator enum (base operators + N ADF placeholders). Use base_operators(g) to recover the user's original operator enum.
- n_features::Int: number of real input features. ARG slots occupy [n_features+1, n_features+arity].
- n_adfs::Int: convenience field, equal to length(adfs).
Arborist.TreeFitnessEvaluator — Type
TreeFitnessEvaluator{T} <: AbstractEvaluator
Fitness evaluator for TreeGenome. Evaluates the expression tree directly over a data matrix without @eval. Dramatically faster than TableFitnessEvaluator for large datasets.
Fields
- X::Matrix{T}: input data, n_features × n_samples
- y::Vector{T}: target output, length n_samples
- operators::OperatorEnum: operator configuration
Arborist.TreeGenome — Type
TreeGenome{T} <: AbstractGenome
A genome backed by a DynamicExpressions.jl expression tree. Supports fast vectorized evaluation over datasets without @eval. Appropriate for pure function approximation problems.
Type parameter T is the numeric type of the expression (Float32 is recommended for most GP applications).
Fields
- tree::Node{T}: the expression tree
- operators::OperatorEnum: operator configuration for evaluation
- n_features::Int: number of input features
Arborist.TreeGenomeContext — Type
TreeGenomeContext{T}
Per-island state carrier for TreeGenome under IslandModel. Parallels GenState for ExprGenome: both carry .rng so that island-loop sites reading state.rng work uniformly, and both are the second element of the tuple returned from _initialize_population.
The extra operators and n_features fields let from_migrant reconstruct a TreeGenome{T} by invoking deserialize with the destination island's operator enum.
Arborist.MigrantGenome — Type
MigrantGenome
Serializable carrier for genome data during island migration. Contains only data that survives cross-process transfer; GenState, compiled functions, and RNG state are reconstructed locally.
Fields
- data::Any: genome-specific payload (Vector{Expr} for ExprGenome, Expr for AntGenome, etc.)
- fitness::Float64: fitness on the source island
- genome_type::Symbol: identifies the genome type for reconstruction
Arborist.GPResult — Type
GPResult{G<:AbstractGenome} <: AbstractEvolutionResult
Result returned by solve. Contains the best genome found, fitness history, and metadata about the evolutionary run.
Fields
- best_genome::G: the genome with the best fitness found during the run
- best_fitness::Float64: fitness of the best genome (lower is better)
- population::Vector{G}: final population sorted by fitness
- fitness_history::Vector{Float64}: best fitness per generation
- mean_history::Vector{Float64}: mean finite fitness per generation
- generations_run::Int: number of generations completed
- wall_time::Float64: elapsed wall-clock time in seconds
- converged::Bool: whether the run met the convergence criterion
- hall_of_fame::Union{Nothing, HallOfFame{G}}: top-K archive across all generations when solve(... ; hall_of_fame_size=K) was passed with K > 0. nothing otherwise.
Arborist.HallOfFame — Type
HallOfFame{G<:AbstractGenome}
Bounded top-K archive of the best genomes a solve() has ever seen, across all generations. Maintained in ascending fitness order (best first). Opt-in via the hall_of_fame_size::Int kwarg on solve(); size == 0 is the default and produces nothing in GPResult.hall_of_fame.
Fields
- capacity::Int: maximum number of distinct entries retained
- genomes::Vector{G}: genome list, best-first
- fitnesses::Vector{Float64}: matching fitness list
Dedup
push!(hof, g, f) treats two fitnesses as duplicates when they are within 1e-12 of each other. This cheap filter catches structurally equivalent solutions (identical fitness) without the cost of walking genomes for structural equality. It will occasionally merge two semantically distinct genomes that happen to produce exactly the same fitness, which is acceptable for a Hall-of-Fame because it is best-effort rather than canonical.
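The bounded, fitness-deduplicated insert described above can be sketched as follows. HofSketch and offer! are hypothetical names used to avoid implying anything about the real push! method; only the 1e-12 tolerance, the ascending order, and the capacity bound come from the text.

```julia
# Hypothetical container mirroring the documented fields.
struct HofSketch{G}
    capacity::Int
    genomes::Vector{G}
    fitnesses::Vector{Float64}
end
HofSketch{G}(capacity::Int) where {G} = HofSketch{G}(capacity, G[], Float64[])

function offer!(hof::HofSketch{G}, g::G, f::Float64; tol = 1e-12) where {G}
    # Fitness-only dedup: skip anything within tol of an existing entry.
    any(x -> abs(x - f) <= tol, hof.fitnesses) && return hof
    pos = searchsortedfirst(hof.fitnesses, f)   # keep ascending (best-first) order
    insert!(hof.genomes, pos, g)
    insert!(hof.fitnesses, pos, f)
    if length(hof.fitnesses) > hof.capacity     # over capacity: drop the worst entry
        pop!(hof.genomes)
        pop!(hof.fitnesses)
    end
    return hof
end

hof = HofSketch{String}(2)
for (g, f) in [("a", 3.0), ("b", 1.0), ("c", 1.0), ("d", 2.0)]
    offer!(hof, g, f)
end
```

Here "c" is dropped as a fitness duplicate of "b", and "a" is evicted once "d" pushes the archive past its capacity of 2.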
Arborist.HallOfFame — Method
HallOfFame{G}(capacity::Int) -> HallOfFame{G}
HallOfFame{G}(; capacity::Int=20) -> HallOfFame{G}
Functions
Arborist.evaluate_cases — Function
evaluate_cases(g::AbstractGenome, e::AbstractEvaluator) -> Vector{Float64}
Per-case loss vector (lower = better) for evaluators that can decompose their fitness into independent cases (per-row, per-sample). Used by lexicase selection.
No default implementation: evaluators that can support lexicase must opt in explicitly. If not implemented, calling it raises MethodError.
Arborist.needs_cases — Method
needs_cases(s::AbstractSelectionStrategy) -> Bool
Return true if the strategy requires per-case fitnesses (evaluate_cases-derived). Default: false.
Arborist.operator_name — Method
operator_name(op) -> Symbol
Stable name for an operator, used as the key in GenerationLog.operator_attempted / operator_success. Default: the concrete type's nameof.
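A default of this kind can be expressed in one line with nameof(typeof(op)). The sketch below defines its own operator_name and two hypothetical operator types; it shows the fallback plus a friendly-name override, and is not the library's actual definition.

```julia
# Hypothetical operator types, not part of Arborist.
struct PointMutationSketch end
struct FancyCrossoverSketch end

# Fallback: derive the key from the concrete type's name.
operator_name(op) = nameof(typeof(op))

# Override: expose a shorter, friendlier key for one operator.
operator_name(::FancyCrossoverSketch) = :fancy

# Tally attempts per operator, keyed by name.
attempted = Dict{Symbol, Int}()
for op in (PointMutationSketch(), FancyCrossoverSketch(), PointMutationSketch())
    key = operator_name(op)
    attempted[key] = get(attempted, key, 0) + 1
end
```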
Arborist.select_parent — Function
select_parent(s::AbstractSelectionStrategy, selection_fitnesses, case_fitnesses, rng) -> Int
Select a parent index. selection_fitnesses::Vector{Float64} is the sharing-adjusted scalar fitness used by classical strategies (lower is better). case_fitnesses::Union{Nothing, Vector{Vector{Float64}}} is the per-individual per-case loss matrix used by lexicase strategies (same convention: lower is better; nothing when needs_cases(s) == false).
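For the case-based path, standard lexicase selection over a Vector{Vector{Float64}} can be sketched as below. The function name lexicase_select is hypothetical and this is the textbook procedure, not necessarily Arborist's implementation (which may add epsilon tolerances or other refinements).

```julia
using Random

function lexicase_select(case_fitnesses::Vector{Vector{Float64}}, rng::AbstractRNG)
    candidates = collect(eachindex(case_fitnesses))
    n_cases = length(first(case_fitnesses))
    # Visit cases in random order, keeping only the candidates that are best on each.
    for c in shuffle(rng, 1:n_cases)
        best = minimum(case_fitnesses[i][c] for i in candidates)
        filter!(i -> case_fitnesses[i][c] == best, candidates)
        length(candidates) == 1 && break
    end
    return rand(rng, candidates)   # break remaining ties uniformly
end

# Individual 1 has the lowest loss on every case, so it wins for any case order.
cases = [[0.0, 0.0], [1.0, 2.0], [3.0, 0.5]]
winner = lexicase_select(cases, MersenneTwister(42))
```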
Arborist.tree_depth — Function
tree_depth(g::AbstractGenome) -> Int
Longest root-to-leaf path through the genome's expression tree. Defined for tree-structured genomes (ExprGenome, TreeGenome, AntGenome, ADFGenome) and used by mutation/crossover operators that enforce a max_depth cap.
Graph-structured genomes (e.g. GraphGenome) do not define this — MethodError on those is intentional; depth is not a meaningful bound for a recurrent graph.
Leaf convention: a bare leaf (symbol / number / feature node) has depth 1; each additional level of nesting increases depth by 1.
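The leaf convention can be illustrated on plain Julia Expr values. This sketch assumes every element of ex.args (including the callee symbol of a :call) is treated as a child; the library's own traversal may differ in such details. expr_depth and body_depth are hypothetical names.

```julia
# Anything that is not an Expr (symbol, number, ...) is a bare leaf of depth 1.
expr_depth(x) = 1
# An Expr adds one level on top of its deepest argument.
expr_depth(ex::Expr) = 1 + maximum(expr_depth, ex.args; init = 0)

# Depth of a statement list: the maximum over statements, 0 when empty.
body_depth(body::Vector) = isempty(body) ? 0 : maximum(expr_depth, body)

d_leaf = expr_depth(:x)                       # bare symbol
d_sum = expr_depth(:(x + 1))                  # one call over leaves
d_nested = expr_depth(:(sin(x + 1) * y))      # call -> call -> call -> leaf
```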
Arborist.complexity — Method
complexity(g::ExprGenome) -> Float64
Total node count across all body statements, measured via unravel.
Arborist.crossover — Method
crossover(g1::ExprGenome, g2::ExprGenome, rng::AbstractRNG) -> Tuple{ExprGenome, ExprGenome}
Produce two offspring via subtree crossover.
Arborist.deserialize — Method
deserialize(::Type{ExprGenome}, s::String, state::GenState) -> Union{ExprGenome, Nothing}
Parse a string of Julia statements into an ExprGenome. Returns nothing if zero valid statements survive parsing and type-checking.
Accepts assignments, while loops, if/if-else statements, for loops, blocks, break, continue, and standalone function calls. Multi-line control flow is supported by parsing the entire string as a block.
Statements that fail parsing or type-checking are skipped (partial recovery) rather than rejecting the whole genome.
Does not eval anything; parse only.
Arborist.deserialize — Method
deserialize(::Type{ExprGenome}, s::String; state::Union{GenState, Nothing}=nothing) -> Union{ExprGenome, Nothing}
Backward-compatible keyword-argument version. Delegates to the positional version when state is provided; returns nothing when it is not.
Arborist.distance — Method
distance(g1::ExprGenome, g2::ExprGenome) -> Float64
Structural compatibility distance. Counts Expr nodes appearing in one program but not the other after type-normalizing.
Arborist.evaluate_cases — Method
evaluate_cases(g::ExprGenome, e::TableFitnessEvaluator) -> Vector{Float64}
Compile the genome and return per-row squared error via the TableFitnessEvaluator case evaluator. All rows are Inf on compilation failure.
Arborist.evaluate_genome — Method
evaluate_genome(g::ExprGenome, evaluator::AbstractEvaluator) -> Float64
Compile and evaluate an ExprGenome against the given evaluator. Returns Inf on any compilation or evaluation failure.
Arborist.initialize — Method
initialize(::Type{ExprGenome}, problem::GPProblem) -> ExprGenome
Create a random ExprGenome using the problem's function set and evaluator signatures.
Arborist.mutate — Method
mutate(g::ExprGenome, rng::AbstractRNG) -> ExprGenome
Produce a mutated copy of the genome by applying a random point mutation to a randomly selected sub-expression.
Arborist.serialize — Method
serialize(g::ExprGenome) -> String
Convert an ExprGenome body to a human-readable Julia source string suitable for inclusion in an LLM prompt. Each statement is printed on its own line using Julia's standard pretty-printer.
Arborist.tree_depth — Method
tree_depth(g::ExprGenome) -> Int
Maximum depth over every statement in g.body. Empty bodies return 0.
Arborist.evaluate_genome — Method
evaluate_genome(g::AntGenome, e::AntEvaluator) -> Float64
Compile and evaluate an AntGenome against the ant trail evaluator.
Arborist.gp_ant_food_ahead — Method
gp_ant_food_ahead(::Bool) -> Bool
Sensor primitive: return true if the cell directly ahead of the ant contains food, false otherwise. Does not consume a move or change the ant's pose. The Bool argument is ignored (placeholder for the evolved program's type scheme).
Arborist.gp_ant_left — Method
gp_ant_left(::Bool) -> Bool
Rotate the ant 90° counter-clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.
Arborist.gp_ant_move — Method
gp_ant_move(::Bool) -> Bool
Advance the ant one cell in its current direction, consuming a move. Eats the food pellet in the destination cell if present. The Bool argument is a placeholder for the evolved program's type scheme and is ignored. Returns true if the move happened, false if the simulator is absent or the ant has exhausted its move budget.
Arborist.gp_ant_right — Method
gp_ant_right(::Bool) -> Bool
Rotate the ant 90° clockwise, consuming a move. The Bool argument is ignored (placeholder for the evolved program's type scheme). Returns true if the turn happened, false if the simulator is absent or the ant is out of moves.
Arborist.tree_depth — Method
tree_depth(g::AntGenome) -> Int
Longest root-to-leaf path through the ant program's Expr tree. Note: AntGenome also carries a max_depth field, which is the construction ceiling used by _random_ant_program; it limits how deeply a fresh random program is generated but does not bound later mutation output. Use the mutation operator's max_depth kwarg for a post-mutation cap.
CommonSolve.solve — Method
solve(problem::GPProblem{AntGenome}, algorithm::GeneticProgramming; ...) -> GPResult
Run GP evolution with AntGenome for side-effectful program synthesis.
Arborist.deserialize — Method
deserialize(::Type{GraphGenome}, s::AbstractString, n_inputs, n_outputs;
            reassign_innovations=false) -> Union{GraphGenome, Nothing}
Parse the text emitted by serialize(::GraphGenome) back into a GraphGenome. The format is line-oriented:
- N <id> <type> <activation> (node line)
- C <in>-><out> w=<weight> en=<true|false> i=<innovation> (connection line)
Lines not starting with N or C are skipped (tolerates LLM commentary, code fences, etc). Returns nothing when a malformed line is encountered, when a connection references an undefined node, or when n_inputs / n_outputs disagree with the decoded node set.
Preserves node IDs and innovation numbers verbatim — required for content-aware distributed migration and NEAT crossover alignment. Pass reassign_innovations=true to issue a fresh innovation ID to every connection via _next_innovation!(); the LLM mutation path uses this to prevent LLM-generated IDs from colliding with the parent pool's history.
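A tolerant parser for this line format could be sketched as follows. It is an illustration only: parse_graph_lines is a hypothetical name, it returns a plain NamedTuple instead of a GraphGenome, the node type and activation names in the example are invented, and it omits the n_inputs / n_outputs consistency check and innovation reassignment.

```julia
function parse_graph_lines(s::AbstractString)
    nodes = Dict{Int, Tuple{String, String}}()   # id => (type, activation)
    conns = NamedTuple[]
    for raw in split(s, '\n')
        line = strip(raw)
        if startswith(line, "N ")
            m = match(r"^N (\d+) (\S+) (\S+)$", line)
            m === nothing && return nothing      # malformed node line
            nodes[parse(Int, m[1])] = (String(m[2]), String(m[3]))
        elseif startswith(line, "C ")
            m = match(r"^C (\d+)->(\d+) w=(\S+) en=(true|false) i=(\d+)$", line)
            m === nothing && return nothing      # malformed connection line
            w = tryparse(Float64, m[3])
            w === nothing && return nothing
            push!(conns, (src = parse(Int, m[1]), dst = parse(Int, m[2]), weight = w,
                          enabled = m[4] == "true", innovation = parse(Int, m[5])))
        end
        # Any other line (commentary, code fences) is skipped.
    end
    # Reject connections that reference undefined nodes.
    all(c -> haskey(nodes, c.src) && haskey(nodes, c.dst), conns) || return nothing
    return (nodes = nodes, connections = conns)
end

text = """
Here is the mutated network:
N 1 input identity
N 2 output tanh
C 1->2 w=0.5 en=true i=7
"""
parsed = parse_graph_lines(text)
```

Returning nothing on the first malformed N or C line, rather than skipping it, matches the all-or-nothing behavior described above for GraphGenome.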
Arborist.evaluate_cases — Method
evaluate_cases(g::GraphGenome, e::GraphEvaluator) -> Vector{Float64}
Return per-sample mean squared error (averaged across outputs) as a Vector{Float64} of length size(e.input_data, 2). Any sample that raises or produces a non-finite squared error is reported as Inf. Used by lexicase selection.
Feedforward mode only: recurrent evaluators have persistent state across samples (samples form a time sequence) so per-sample cases are not independent. Calling this on a recurrent evaluator raises ArgumentError.
Arborist.evaluate_genome — Method
evaluate_genome(g::GraphGenome, e::EpisodicEvaluator) -> Float64
Run e.n_episodes closed-loop rollouts of g as a policy on the environment described by e, return -mean_reward_per_episode.
Arborist.evaluate_genome — Method
evaluate_genome(g::GraphGenome, e::GraphEvaluator) -> Float64
Evaluate a GraphGenome by propagating inputs through the network. Returns mean squared error against target outputs.
- Feedforward mode (e.allow_recurrent=false, default): topologically sorts the network; returns Inf on cycle. Each sample is independent: node activations reset between samples.
- Recurrent mode (e.allow_recurrent=true): cycles are allowed. Node activations persist across samples (samples are treated as a time sequence). Each sample runs e.relaxation_passes activation sweeps over all non-input nodes in sorted-id order, reading from the previous pass's values for inputs from cyclic edges.
Arborist.init_innovation_range! — Method
init_innovation_range!(offset::Int)
Set the module-local innovation counter to offset. Used by the distributed island model to give each worker a disjoint range of innovation IDs so that NEAT crossover on migrants does not align structurally unrelated genes under the same innovation number.
Callers in distributed mode typically use offsets like (island_id - 1) * 10^9 — disjoint as long as no single worker allocates more than 10^9 structural mutations in a run. The sequential island model does not need this: all islands share the same process-global counter, which already ensures uniqueness.
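The offset arithmetic can be checked directly. The sketch below assumes IDs are issued as offset + 1, offset + 2, and so on (the exact first ID is not stated here); STRIDE_SKETCH, innovation_offset, and innovation_range are illustration-only names.

```julia
const STRIDE_SKETCH = 10^9   # assumed stride, matching the 10^9 figure above

# Offset passed to the counter for a given 1-based island id.
innovation_offset(island_id::Int) = (island_id - 1) * STRIDE_SKETCH

# IDs a worker can issue before running into the next worker's range.
innovation_range(island_id::Int) =
    (innovation_offset(island_id) + 1):(innovation_offset(island_id) + STRIDE_SKETCH)

r1 = innovation_range(1)
r2 = innovation_range(2)
overlap = intersect(r1, r2)   # empty when the ranges are disjoint
```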
Arborist.initialize — Method
initialize(::Type{GraphGenome}, n_inputs, n_outputs, rng) -> GraphGenome
Create a minimal fully-connected network: all inputs connected to all outputs with random weights, no hidden nodes. Includes a bias node.
Arborist.reset_innovation_counter! — Method
reset_innovation_counter!()
Reset the global innovation counter to 0. Must be called at the start of each solve() call for GraphGenome problems.
CommonSolve.solve — Method
solve(problem::GPProblem{GraphGenome}, algorithm::GeneticProgramming; ...) -> GPResult
Run NEAT-style evolution with GraphGenome. Handles initialization, mutation, crossover with innovation-aligned genes, and speciation.
Accepts any AbstractEvaluator that implements evaluate_genome(::GraphGenome, e) and whose input_signature(e) / output_signature(e) lengths match the intended network dimensions — GraphEvaluator for table-based tasks, EpisodicEvaluator for closed-loop control tasks.
Arborist.augmented_operators — Method
augmented_operators(base::OperatorEnum, n_adfs::Int) -> OperatorEnum
Build the operator enum used by an ADFGenome's trees: the user's binary operators followed by n_adfs placeholder binary operators (one per ADF). ADF body trees and the main tree share this enum. The placeholders are never actually invoked; expand_adfs rewrites them before evaluation.
Arborist.base_operators — Method
base_operators(g::ADFGenome) -> OperatorEnum
Recover the user's original operator enum (without the N ADF placeholder slots).
Arborist.crossover — Method
crossover(::SubtreeCrossover, g1::ADFGenome, g2::ADFGenome, rng) -> Tuple
Same-index subtree crossover: pick uniformly among (main, adf1, ..., adfN) and swap subtrees within the chosen tree pair. Requires both genomes to share n_features and n_adfs.
Arborist.evaluate_adf — Method
evaluate_adf(g::ADFGenome{T}, X::Matrix{T}, y::Vector{T}) -> Float64
Expand ADF calls and compute MSE against y over X. Returns Inf on expansion or evaluation failure (typical: ARG references with no enclosing ADF context, i.e. ARG slots leaked into the main tree's expanded form).
Arborist.expand_adfs — Method
expand_adfs(g::ADFGenome{T}) -> Node{T}
Produce a fully-expanded copy of g.main where every ADF call has been replaced by the corresponding ADF body with ARG references substituted for the call's argument subtrees. The result uses only the base operators (no placeholders) and references only real features [1, n_features]. Suitable for direct eval_tree_array evaluation against the user's base_operators.
Arborist.initialize_adf — Method
initialize(::Type{ADFGenome{T}}, base_ops, n_features, n_adfs;
           arity=2, max_depth=4, rng) -> ADFGenome{T}
Construct a random ADFGenome with n_adfs ADFs. Main tree uses base operators plus ADF placeholders; ADF bodies use only base operators (nested ADF calls are not generated). ADF bodies may reference ARG slots in addition to the user's features.
Arborist.mutate — Method
mutate(::SubtreeMutation, g::ADFGenome, rng) -> ADFGenome
Pick uniformly among (main, adf1, ..., adfN) and apply subtree mutation to that tree. ADF bodies use the base operator set and may reference ARG slots; main tree uses augmented operators and references only real features.
Arborist.tree_depth — Method
tree_depth(g::ADFGenome) -> Int
Maximum of count_depth across the main tree and every ADF body. This captures the worst-case depth a caller might evaluate post-expansion; it does not account for expansion-driven inlining, which can compose depths up to depth(main) + depth(any_adf) - 1 in the fully expanded form.
Arborist.SymbolicRegressionEvaluator — Method
SymbolicRegressionEvaluator(f; domain, points=20, operators=_default_operators(Float32), noise=0.0)
Convenience constructor for symbolic regression problems. Generates a TreeFitnessEvaluator from a Julia function and domain specification.
Arguments
- f: Target function (univariate: accepts Float32, multivariate: accepts Vector{Float32})
- domain: Tuple{T,T} for univariate, Vector{Tuple{T,T}} for multivariate
- points: Sample points per dimension (default: 20)
- operators: OperatorEnum (default: +, -, *, / with sin, cos, exp, abs)
- noise: Gaussian noise standard deviation to add to targets (default: 0.0)
Arborist.deserialize — Method
deserialize(::Type{TreeGenome{T}}, s, operators, n_features) -> Union{TreeGenome{T}, Nothing}
Parse a string representation of an expression tree back into a TreeGenome{T}. Accepts both the infix form emitted by serialize / DynamicExpressions' string_tree (e.g. x1 + 1.0, sin((x1 + 1.0) * x2)) and the prefix s-expression form used by older code paths (e.g. +(x1, 1.0), sin(*(x1, x2))). Both forms are accepted because Meta.parse normalizes them to the same Expr(:call, ...) structure that _expr_to_node walks.
Returns nothing for unparseable input, unrecognized operators, or out-of-range feature indices. The caller is responsible for any fallback behavior.
Arborist.erc_uniform — Method
erc_uniform(lo::T, hi::T) -> Function
Return a callable (rng::AbstractRNG) -> T that samples uniformly from [lo, hi]. Pass the result as GeneticProgramming(; constant_sampler=...) to wire Koza-style Ephemeral Random Constants into TreeGenome creation and mutation.
alg = GeneticProgramming(; constant_sampler = erc_uniform(-5.0f0, 5.0f0))
Arborist.evaluate — Method
evaluate(e::TreeFitnessEvaluator{T}, g::TreeGenome{T}) -> Float64
Evaluate a TreeGenome against the data matrix. Returns mean squared error. Returns Inf if evaluation throws or produces NaN/Inf values.
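The guarded-MSE contract can be sketched without any tree machinery. Here toy_mse is a hypothetical helper, predict stands in for evaluating a genome, and the features-by-samples layout of X is an assumption.

```julia
# Hypothetical guarded MSE: Inf on exceptions or non-finite predictions.
function toy_mse(predict, X::AbstractMatrix, y::AbstractVector)
    preds = try
        predict(X)
    catch
        return Inf                      # evaluation threw
    end
    all(isfinite, preds) || return Inf  # NaN or Inf anywhere in the output
    return Float64(sum(abs2, preds .- y) / length(y))
end

X = Float32[1 2 3]          # one feature, three samples (assumed layout)
y = Float32[2, 4, 6]

perfect  = toy_mse(X -> 2 .* X[1, :], X, y)
thrown   = toy_mse(X -> error("boom"), X, y)
diverged = toy_mse(X -> X[1, :] ./ 0.0f0, X, y)
```

Mapping every failure mode to Inf keeps such genomes comparable under a lower-is-better ordering while guaranteeing they never win.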
Arborist.evaluate_cases — Method
evaluate_cases(g::TreeGenome{T}, e::TreeFitnessEvaluator{T}) -> Vector{Float64}
Return per-sample squared error as a Vector{Float64} of length length(e.y). Non-finite samples (NaN/Inf after vectorised evaluation) are reported as Inf. If evaluation fails outright, every sample is reported as Inf. Used by lexicase selection.
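A per-case decomposition of this kind might look like the sketch below. toy_cases is a hypothetical helper operating on already-computed predictions; it is not the Arborist method.

```julia
# Hypothetical per-sample squared error with non-finite cases mapped to Inf.
function toy_cases(preds::AbstractVector, y::AbstractVector)
    errs = Float64.(abs2.(preds .- y))
    return [isfinite(e) ? e : Inf for e in errs]
end

preds = Float32[1, NaN, 3]
y     = Float32[1, 2, 5]

cases = toy_cases(preds, y)
```

Only the second case is penalized here, so lexicase selection can still reward the genome on the cases it handles well.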
Arborist.from_migrant — Method
from_migrant(m::MigrantGenome, ctx::TreeGenomeContext{T}) -> TreeGenome{T}
Reconstruct a TreeGenome{T} from a MigrantGenome by wrapping the transported Node{T} with the destination island's operators and n_features from ctx.
Arborist.optimize_constants! — Method
optimize_constants!(g::TreeGenome, e::TreeFitnessEvaluator; kwargs...) -> Float64
Apply BFGS with central finite-difference gradients to the constants of g.tree against e's dataset. Mutates g.tree in place; returns the post-optimization MSE loss.
Returns the pre-optimization loss unchanged if the tree has zero constants or if the initial evaluation produces Inf / NaN. Never makes the tree worse: if BFGS diverges or line search fails, constants are restored and the original loss is returned.
Keyword arguments
max_iter::Int = 50
tol::Float64 = 1e-8
fd_step::Float64 = 1e-3
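The two ideas here, central finite-difference gradients and the "never makes the tree worse" guard, can be sketched in a few lines. To stay dependency-free, plain gradient descent stands in for BFGS; central_fd_gradient, toy_optimize!, and the lr step size are hypothetical and not part of Arborist.

```julia
# Central finite differences: (loss(c + h) - loss(c - h)) / 2h per coordinate.
function central_fd_gradient(loss, c::Vector{Float64}; fd_step::Float64 = 1e-3)
    g = similar(c)
    for i in eachindex(c)
        orig = c[i]
        c[i] = orig + fd_step
        up = loss(c)
        c[i] = orig - fd_step
        down = loss(c)
        c[i] = orig                       # restore before the next coordinate
        g[i] = (up - down) / (2 * fd_step)
    end
    return g
end

# Gradient descent used as a simple stand-in for BFGS.
function toy_optimize!(loss, c::Vector{Float64};
                       max_iter::Int = 50, tol::Float64 = 1e-8,
                       fd_step::Float64 = 1e-3, lr::Float64 = 0.1)
    start = copy(c)
    start_loss = loss(c)
    isfinite(start_loss) || return start_loss   # nothing sensible to optimize
    for _ in 1:max_iter
        g = central_fd_gradient(loss, c; fd_step = fd_step)
        sqrt(sum(abs2, g)) < tol && break
        c .-= lr .* g
    end
    final = loss(c)
    if !(isfinite(final) && final <= start_loss)
        c .= start                              # restore constants on failure
        return start_loss
    end
    return final
end

loss(c) = (c[1] - 3.0)^2 + (c[2] + 1.0)^2
constants = [0.0, 0.0]
final_loss = toy_optimize!(loss, constants)
```

The restore step is what turns an optimizer that can diverge into one that is safe to call on every genome.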
Arborist.to_migrant — Method
to_migrant(g::TreeGenome{T}, fitness::Float64) -> MigrantGenome
Pack a TreeGenome's Node{T} into a MigrantGenome for cross-island (and cross-process) migration. The Node{T} is carried directly rather than going through the string-based serialize/deserialize path: direct transport avoids any parse ambiguity, preserves exact bit patterns of Float32 constants, and is independent of DynamicExpressions' string_tree output format. Julia's Distributed serializer handles Node{T} natively.
The destination island's OperatorEnum is reattached in from_migrant. Op indices stored in Node{T} are stable across islands because every island holds the same OperatorEnum built from the problem.
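The bit-pattern claim can be illustrated with Julia's Serialization standard library, the binary format underlying Distributed's transport. The migrant NamedTuple below is a hypothetical stand-in for a MigrantGenome; its payload field plays the role of the constants stored in a Node{T}.

```julia
using Serialization

# Hypothetical stand-in for a migrant carrying Float32 constants.
migrant = (payload = Float32[0.1f0, 1.0f0 / 3.0f0, -2.5f0], fitness = 0.25)

# Round-trip through the binary serializer, as cross-process transport would.
buf = IOBuffer()
Serialization.serialize(buf, migrant)
seekstart(buf)
received = Serialization.deserialize(buf)

# Compare raw bit patterns rather than numeric values.
same_bits = reinterpret(UInt32, received.payload) == reinterpret(UInt32, migrant.payload)
```

Binary transport involves no parsing step at all, which is why it also sidesteps any dependence on a printed tree format.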
Arborist.tree_depth — Method
tree_depth(g::TreeGenome) -> Int
Longest root-to-leaf path through the genome's expression tree, computed via DynamicExpressions.count_depth. A bare-leaf tree has depth 1.
CommonSolve.solve — Method
solve(problem::GPProblem{TreeGenome{T}}, algorithm::GeneticProgramming; ...) -> GPResult
Run genetic programming evolution with TreeGenome. Uses DynamicExpressions.jl for fast vectorized evaluation without @eval.
Arborist.from_migrant — Method
from_migrant(m::MigrantGenome, state::GenState) -> ExprGenome
Reconstruct an ExprGenome from a MigrantGenome using the local GenState.
Arborist.from_migrant — Method
from_migrant(m::MigrantGenome, ctx::GraphGenomeContext) -> GraphGenome
Context-dispatched form used by IslandModel (_inject_migrants_local! calls from_migrant(m, island.state) uniformly across genome types). GraphGenome migrants carry their own n_inputs / n_outputs, so the context is consulted only for dispatch.
Arborist.from_migrant — Method
from_migrant(m::MigrantGenome, ::Type{GraphGenome}) -> GraphGenome
Reconstruct a GraphGenome from a MigrantGenome.
Arborist.from_migrant — Method
from_migrant(m::MigrantGenome, primitives::Vector{Symbol},
             conditions::Vector{Symbol}, max_depth::Int) -> AntGenome
Reconstruct an AntGenome from a MigrantGenome.
Arborist.to_migrant — Method
to_migrant(g::AntGenome, fitness::Float64) -> MigrantGenome
Extract serializable data from an AntGenome for cross-process migration.
Arborist.to_migrant — Method
to_migrant(g::ExprGenome, fitness::Float64) -> MigrantGenome
Extract serializable data from an ExprGenome for cross-process migration.
Arborist.to_migrant — Method
to_migrant(g::GraphGenome, fitness::Float64) -> MigrantGenome
Extract serializable data from a GraphGenome for cross-process migration.
Arborist.fitnesses — Method
fitnesses(hof::HallOfFame) -> Vector{Float64}
Return the archive's fitness list in order (best first). Alias for hof.fitnesses, kept as a function so that callers stay stable if the internal representation changes.
Base.push! — Method
push!(hof::HallOfFame{G}, genome::G, fitness::Real)
Insert (genome, fitness) into the hall if it qualifies. Non-finite fitnesses are rejected. Duplicates (fitness within 1e-12) are rejected. When the archive is at capacity, the worst-fitness entry is evicted if the candidate is strictly better.
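The insertion rules can be sketched over a pair of parallel vectors kept sorted best first. toy_push! is a hypothetical helper, not HallOfFame's implementation; it assumes lower fitness is better, that duplicate detection compares fitness only, and that a Bool return value signals acceptance.

```julia
# Hypothetical archive insert over parallel, best-first sorted vectors.
function toy_push!(genomes::Vector, fits::Vector{Float64}, capacity::Int,
                   genome, fitness::Real)
    f = Float64(fitness)
    isfinite(f) || return false                          # reject NaN / Inf
    any(x -> abs(x - f) <= 1e-12, fits) && return false  # reject duplicates
    if length(fits) >= capacity
        f < fits[end] || return false                    # must beat the worst entry
        pop!(genomes)
        pop!(fits)
    end
    i = searchsortedfirst(fits, f)                       # keep best-first order
    insert!(genomes, i, genome)
    insert!(fits, i, f)
    return true
end

genomes = String[]
fits = Float64[]
results = [
    toy_push!(genomes, fits, 2, "a", 3.0),   # accepted
    toy_push!(genomes, fits, 2, "b", 1.0),   # accepted, becomes best
    toy_push!(genomes, fits, 2, "c", NaN),   # rejected: non-finite
    toy_push!(genomes, fits, 2, "d", 1.0),   # rejected: duplicate fitness
    toy_push!(genomes, fits, 2, "e", 2.0),   # accepted, evicts "a"
    toy_push!(genomes, fits, 2, "f", 5.0),   # rejected: not better than worst
]
```

Keeping the vectors sorted on insert makes the worst entry always the last one, so eviction is a constant-time pop.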
Arborist.to_dot — Function
to_dot(g) -> String
to_dot(io::IO, g)
Produce a Graphviz DOT document describing the genome g. Supported inputs are TreeGenome, ExprGenome, ADFGenome, AntGenome, and GraphGenome.
For tree-structured genomes the output is a directed acyclic graph with ellipse-shaped nodes labeled by operator / constant / variable. For GraphGenome the output is a left-to-right network diagram with distinct node shapes by role (input, output, bias, hidden) and edges labeled by connection weight, with disabled connections shown dashed and gray.
The one-argument form returns the document as a String. The two-argument form writes to io and returns io for chaining. Tree-genome methods do not throw on NaN / Inf constants; they are rendered literally.
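The two call forms and the tree-shaped output can be pictured with a small sketch that walks a plain Julia Expr instead of a genome. toy_to_dot, its node naming (n1, n2, ...), and the exact attribute layout are hypothetical; the real output format may differ.

```julia
# Hypothetical DOT writer for an Expr tree; returns io for chaining.
function toy_to_dot(io::IO, ex)
    println(io, "digraph G {")
    counter = Ref(0)
    function walk(node)
        id = (counter[] += 1)
        # Operators are labeled by name; leaves (variables, constants) by value.
        label = node isa Expr ? string(node.args[1]) : string(node)
        println(io, "  n$id [shape=ellipse, label=\"$label\"];")
        if node isa Expr
            for child in node.args[2:end]
                cid = walk(child)
                println(io, "  n$id -> n$cid;")
            end
        end
        return id
    end
    walk(ex)
    println(io, "}")
    return io
end

# One-argument form: render into a buffer and return the String.
toy_to_dot(ex) = String(take!(toy_to_dot(IOBuffer(), ex)))

dot = toy_to_dot(:(x1 + NaN))
```

The resulting text can be passed to the Graphviz dot command-line tool to render an image; the NaN leaf simply appears as a node labeled NaN.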
Constants
Arborist.ACTIVATION_FNS — Constant
ACTIVATION_FNS
Dictionary mapping activation Symbol names to their unary Function implementations, used by GraphEvaluator when propagating values through a GraphGenome. The built-in set is:
:sigmoid: NEAT-style steepened logistic, 1 / (1 + exp(-4.9·x)).
:tanh: hyperbolic tangent.
:relu: rectified linear, max(0, x).
:identity: x (pass-through).
:gauss: Gaussian bump, exp(-x²). Common in CPPN / HyperNEAT work.
:sin: plain sin(x). The substrate or network is expected to supply any frequency scaling.
:abs: absolute value, |x|.
:step: Heaviside step, 1.0 for x > 0, else 0.0.
New activations can be added by assigning into this dict before solving; each NodeGene stores the activation as a Symbol and looks the function up here at evaluation time.
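A Symbol-keyed lookup table of this kind can be sketched as follows. toy_activations mirrors the formulas listed above but is a hypothetical local dictionary, not the ACTIVATION_FNS constant itself, and :softplus is an illustrative addition rather than a built-in.

```julia
# Hypothetical local table built from the formulas listed above.
toy_activations = Dict{Symbol,Function}(
    :sigmoid  => x -> 1 / (1 + exp(-4.9 * x)),
    :tanh     => tanh,
    :relu     => x -> max(zero(x), x),
    :identity => identity,
    :gauss    => x -> exp(-x^2),
    :sin      => sin,
    :abs      => abs,
    :step     => x -> x > 0 ? 1.0 : 0.0,
)

# Registering a new activation is a plain dictionary assignment.
toy_activations[:softplus] = x -> log1p(exp(x))

# Nodes store only the Symbol; the function is looked up at evaluation time.
activate(name::Symbol, x) = toy_activations[name](x)

outputs = (activate(:sigmoid, 0.0), activate(:step, 0.0), activate(:softplus, 0.0))
```

Because lookup happens at evaluation time, an activation must be registered before any genome that names it is evaluated.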
Note: the NEAT mutation operators (AddNodeMutation, NEATDefaultMutation) only draw from :sigmoid, :tanh, :relu by default when adding a new hidden node. To make CPPN activations available to those operators, pass hidden_activations=[:sigmoid, :tanh, :gauss, :sin, :abs] (or similar) at construction.
Arborist.DEFAULT_TREE_GP_SYSTEM_PROMPT — Constant
Default system prompt for TreeGenome LLM mutation (prefix notation).