Algorithms — API Reference
Autogenerated reference for every exported algorithm type (GeneticProgramming, IslandModel, NSGAII, CMAES, MAPElites), the speciation strategies, migration topologies, and the shared solve entry points.
Types
Arborist.GeneticProgramming — Type
GeneticProgramming <: AbstractEvolutionaryAlgorithmStandard genetic programming algorithm configuration. All fields are set via keyword arguments with sensible defaults.
Fields
pop_size::Int: population size (default: 100)generations::Int: number of generations to run (default: 200)mutation_rate::Float64: probability of mutation per offspring (default: 0.3)crossover_rate::Float64: probability of crossover per offspring pair (default: 0.3)elitism::Int: number of top individuals carried forward unchanged (default: 2)bloat_penalty::Float64: coefficient oncomplexity(g)added to fitness (default: 0.0)parallel::Bool: enable threaded population evaluation (default: true)speciation::AbstractSpeciation: speciation strategy (default:NoSpeciation())mutation_ops::Vector{AbstractMutationOperator}: mutation operatorscrossover_ops::Vector{AbstractCrossoverOperator}: crossover operatorsselection::AbstractSelectionStrategy: selection strategyconvergence_threshold::Float64: best fitness below this value setsconverged=truein result (default:Inf, meaning never converged)constant_sampler::Union{Function, Nothing}: callable(rng) -> Tused byTreeGenomewhen a new numeric-literal leaf is created (Koza-style Ephemeral Random Constants).nothingpreserves the defaultT(randn(rng)). Seeerc_uniformfor a helper that builds the canonical uniform-over-range sampler.
Arborist.GeneticProgramming — Method
GeneticProgramming(; kwargs...) -> GeneticProgrammingConstruct a GeneticProgramming algorithm with keyword arguments and sensible defaults.
Arborist.IslandModel — Type
IslandModel <: AbstractEvolutionaryAlgorithmIsland model that runs multiple independent populations (islands) with periodic migration. Supports pluggable topologies and optional distributed execution where each island runs in its own worker process.
Fields
n_islands::Int: number of islands (default: 4)island_algorithm::GeneticProgramming: algorithm for each islandmigration_interval::Int: generations between migrations (default: 10)migration_size::Int: number of individuals to migrate per event (default: 2)topology::AbstractTopology: migration topology (default:RingTopology())distributed::Bool: run each island in a separate worker process (default: false)async::Bool: asynchronous evolution — islands evolve independently (default: false, requires distributed=true)
Arborist.IslandModel — Method
IslandModel(; n_islands=4, island_algorithm=GeneticProgramming(),
migration_interval=10, migration_size=2,
topology=RingTopology(), distributed=false, async=false)Construct an IslandModel with keyword arguments and sensible defaults. When distributed=true, each island runs in its own worker process via Distributed.jl, eliminating @eval contention between islands.
Arborist.AbstractMultiObjectiveEvaluator — Type
AbstractMultiObjectiveEvaluator <: AbstractEvaluatorBase type for multi-objective fitness evaluators. All objectives are minimized (lower is better), consistent with the rest of Arborist.
Concrete subtypes must implement:
evaluate_multi(e, genome::AbstractGenome) -> Vector{Float64}objective_names(e) -> Vector{String}input_signature(e)andoutput_signature(e)(inherited from AbstractEvaluator)
Arborist.NSGAII — Type
NSGAII <: AbstractEvolutionaryAlgorithmNSGA-II (Non-dominated Sorting Genetic Algorithm II) for multi-objective genetic programming. Uses Pareto non-dominated sorting and crowding distance for selection, with a (mu+lambda) replacement strategy.
No elitism field — NSGA-II's combined parent+offspring sorting IS the elitism mechanism. No speciation — crowding distance maintains diversity. No bloat_penalty — complexity should be an explicit objective via ParsimonyEvaluator.
Fields
pop_size::Int: population size, must be even (default: 100)generations::Int: number of generations (default: 200)mutation_rate::Float64: probability of mutation (default: 0.3)crossover_rate::Float64: probability of crossover (default: 0.3)parallel::Bool: enable threaded evaluation (default: true)mutation_ops::Vector{AbstractMutationOperator}: mutation operatorscrossover_ops::Vector{AbstractCrossoverOperator}: crossover operators
Arborist.NSGAII — Method
NSGAII(; kwargs...) -> NSGAIIConstruct an NSGAII algorithm with keyword arguments and sensible defaults.
Arborist.NSGAIIResult — Type
NSGAIIResult{G<:AbstractGenome} <: AbstractEvolutionResultResult returned by solve with NSGAII. Contains the Pareto front, full population, fitness vectors, and convergence history.
Fields
pareto_front::Vector{G}: non-dominated genomes from the final populationpareto_fitnesses::Vector{Vector{Float64}}: fitness vectors for the Pareto frontpopulation::Vector{G}: full final populationall_fitnesses::Vector{Vector{Float64}}: fitness vectors for all individualshypervolume_history::Vector{Float64}: hypervolume of front 1 per generationgenerations_run::Int: number of generations completedwall_time::Float64: elapsed wall-clock time in secondsobjective_names::Vector{String}: human-readable names for each objective
Arborist.ParsimonyEvaluator — Type
ParsimonyEvaluator{E<:AbstractEvaluator} <: AbstractMultiObjectiveEvaluatorWraps a single-objective evaluator to produce two objectives:
- The original fitness (from the wrapped evaluator)
- Genome complexity (from
complexity(genome))
This is the canonical GP multi-objective setup: accuracy vs. parsimony.
Example
inner = TreeFitnessEvaluator(X, y, operators)
evaluator = ParsimonyEvaluator(inner)
problem = GPProblem(evaluator, TreeGenome{Float32}; seed=42)
result = solve(problem, NSGAII(pop_size=100, generations=50))Arborist.MAPElites — Type
MAPElites{F, FB, MO, CO} <: AbstractEvolutionaryAlgorithmIlluminates a user-defined feature space by evolving a grid-based archive.
Fields (constructor kwargs with defaults)
generations::Int = 100: number of outer iterations (each generation samples and evaluatesbatch_sizechildren from the archive).batch_size::Int = 50: mutations produced per generation.n_init::Int = 50: random genomes generated at generation 0 to seed the archive before any sampling happens.feature_fn::F: genome ->NTuple{N, Float64}. Behavior descriptor used to place genomes into the discretized grid.feature_bounds::FB:Vector{Tuple{Float64,Float64}}of per-dimension(low, high)intervals. Features outside the range clamp to the extreme bin.n_bins::Vector{Int}: per-dimension bin count. Total archive capacity isprod(n_bins).mutation_ops::MO: vector of mutation operators (as inGeneticProgramming).crossover_ops::CO: optional vector of crossover operators. If empty, every child is produced by mutation only.crossover_rate::Float64 = 0.0: probability of picking crossover over mutation when both are available.parallel::Bool = true: threaded evaluation of a batch.elitism::Int = 0: placeholder for interface parity; MAP-Elites has archive-based elitism intrinsically, so this is unused.
Arborist.MAPElitesArchive — Type
MAPElitesArchive{G}Grid-based archive for MAP-Elites. Maps a cell index (an N-tuple of integers) to a (genome, fitness) pair. Fitness is lower-is-better, matching the framework convention.
Not thread-safe — MAP-Elites runs its own evaluation batches and serializes archive updates; parallel evaluation happens within a batch, insertion after the batch synchronizes.
Arborist.MAPElitesResult — Type
MAPElitesResult{G}Result returned by solve(problem, ::MAPElites). Carries the final archive and per-generation coverage/QD-score trajectories.
Fields
archive::MAPElitesArchive{G}: final grid.coverage_history::Vector{Float64}: coverage fraction per generation.qd_score_history::Vector{Float64}: qd_score per generation.best_genome::G: the single best (lowest-fitness) genome in the archive.best_fitness::Float64: its fitness.wall_time::Float64: seconds elapsed.generations_run::Int: outer iterations completed.
Arborist.CMAES — Type
CMAES <: AbstractEvolutionaryAlgorithmCovariance Matrix Adaptation Evolution Strategy. Suitable for continuous parameter optimization where the search dimension is fixed at solve time. Pairs with GraphGenome via flatten_weights / unflatten_weights! to optimize connection weights under a frozen topology.
Fields (all kwargs, sensible defaults)
generations::Int = 100: outer iteration count.pop_size::Int = 0: population size λ.0means auto:4 + floor(3 ln n), computed at solve time once n (the parameter count) is known.sigma0::Float64 = 1.0: initial step size.parallel::Bool = true: thread the per-generation evaluations.convergence_threshold::Float64 = -Inf: stop early when best fitness drops below this (lower-is-better convention; default never triggers).seed_genome::Bool = true: when true (default), the starting mean is taken from the problem's initial genome; when false, drawn from N(0, I) scaled by sigma0.
Arborist.AsyncStatusMessage — Type
AsyncStatusMessageStatus update sent from an async island worker to the coordinator. Concrete struct (not NamedTuple) to ensure reliable cross-process serialization.
Arborist.CompleteTopology — Type
CompleteTopology <: AbstractTopologyAll-to-all migration: island i sends to every other island. Fast dissemination of good solutions but rapidly reduces diversity.
Arborist.RandomTopology — Type
RandomTopology <: AbstractTopologyRandom migration: island i sends to n_targets randomly chosen islands. Good pairing with asynchronous evolution where the randomness of timing provides a natural diversity mechanism.
Fields
n_targets::Int: number of destination islands per migration event
Arborist.RingTopology — Type
RingTopology <: AbstractTopologyRing migration: island i sends to island (i % n) + 1. This is the default and most common topology in the literature. Sparse connectivity preserves diversity across islands.
Arborist.BehavioralSpeciation — Type
BehavioralSpeciation <: AbstractSpeciationSpeciation based on behavioral similarity rather than genomic distance. Two individuals are in the same species if their behavioral fingerprints are within threshold of each other under the provided distance function.
This is useful for behavioral optimization problems (e.g., bin packing, game playing) where multiple syntactically distinct programs can implement the same strategy. Genomic distance treats them as different; behavioral distance correctly collapses them.
Fields
fingerprint_fn::Function:(genome) -> fingerprint— computes a behavioral fingerprint. The return type can be anything thatdistance_fncan compare.distance_fn::Function:(a, b) -> Float64— behavioral distance between two fingerprints. Should return 0.0 for identical behavior.threshold::Float64: behavioral distance below which individuals are same-species (default: 0.15)min_species_size::Int: minimum species size before culling (default: 2)stagnation_limit::Int: generations without improvement before culling (default: 15)sharing_formula::Symbol: fitness sharing strength —:linear,:sqrt,:log2, or:none(default::sqrt)
Arborist.NoSpeciation — Type
NoSpeciation <: AbstractSpeciationTrivial speciation strategy: all individuals belong to a single species.
Arborist.ThresholdSpeciation — Type
ThresholdSpeciation <: AbstractSpeciationSpecies individuals by compatibility distance. Individuals are assigned to the first species whose representative has distance(g1, g2) <= threshold. Tracks stagnation per species and culls stagnant species below min_species_size.
The default threshold of 10.0 is appropriate for ExprGenome where distance is a symmetric-difference node count.
Fitness sharing penalizes members of large species to protect structural innovations in small species.
Fields
threshold::Float64: compatibility distance cutoff (default: 10.0)min_species_size::Int: minimum species size to avoid culling (default: 2)stagnation_limit::Int: generations without improvement before culling (default: 15)sharing_formula::Symbol: fitness sharing strength —:linear,:sqrt,:log2, or:none(default::log2)
Arborist.ThresholdSpeciation — Method
ThresholdSpeciation(; threshold=10.0, min_species_size=2, stagnation_limit=15, sharing_formula=:log2)Construct a ThresholdSpeciation with keyword arguments and sensible defaults.
Functions
Arborist.evaluate_multi — Function
evaluate_multi(e::AbstractMultiObjectiveEvaluator, genome::AbstractGenome) -> Vector{Float64}Evaluate a genome on all objectives. Returns a vector of fitness values, one per objective, all minimized (lower is better). Returns a vector of Inf values on evaluation failure.
Arborist.objective_names — Function
objective_names(e::AbstractMultiObjectiveEvaluator) -> Vector{String}Return human-readable names for each objective.
CommonSolve.solve — Method
solve(problem::GPProblem{G,E}, algorithm::NSGAII; verbose=false, callback=nothing) -> NSGAIIResult{G}Run NSGA-II multi-objective evolution. Returns an NSGAIIResult containing the Pareto front, fitness vectors, and hypervolume history.
Arguments
problem::GPProblem{G,E}: problem with a multi-objective evaluatoralgorithm::NSGAII: NSGA-II configuration
Keyword Arguments
verbose::Bool=false: print generation statisticscallback: optional(gen::Int, front_size::Int, hypervolume::Float64) -> nothing
Arborist.coverage — Method
coverage(a::MAPElitesArchive) -> Float64Fraction of bins in the grid that hold a genome. Ranges [0, 1].
Arborist.qd_score — Method
qd_score(a::MAPElitesArchive; worst_fitness=0.0) -> Float64Quality-Diversity score: sum over filled cells of (worst_fitness - f_cell). Rewards archives that are both deeply filled and broad. Since the framework uses lower-is-better fitness, larger QD score means the archive holds better solutions across more bins. Pass worst_fitness = maximum of the task's plausible loss range for a meaningful score.
CommonSolve.solve — Method
solve(problem::GPProblem{G,E}, alg::MAPElites; verbose=false, log=nothing) -> MAPElitesResult{G}Run MAP-Elites. Requires problem.evaluator to implement evaluate_genome(g, e). The feature function and grid are supplied via alg. Seeds the archive with alg.n_init random individuals via the problem's _initialize_population path, then alternates generations of (sample parent from archive → mutate/crossover → evaluate → maybe insert).
log::Union{Nothing, RunLog} records per-generation coverage and qd_score alongside the standard fitness stats (best from archive as a scalar).
Arborist.flatten_weights — Function
flatten_weights(g::AbstractGenome) -> Vector{Float64}Return the genome's continuous parameters as a Float64 vector. Defined for GraphGenome over its enabled connection weights, sorted by innovation number for determinism.
Raises MethodError for genome types that don't expose a continuous parameter vector — those genomes can't be searched by CMA-ES directly.
Arborist.unflatten_weights! — Function
unflatten_weights!(g::AbstractGenome, w::Vector{Float64}) -> AbstractGenomeInverse of flatten_weights: writes w back into g's continuous parameters. Mutates g and returns it. Length of w must match length(flatten_weights(g)).
CommonSolve.solve — Method
solve(problem::GPProblem{G,E}, alg::CMAES; verbose=false, log=nothing) -> GPResult{G}Run CMA-ES against the problem's evaluator on a fixed-topology genome. Requires flatten_weights(g) / unflatten_weights!(g, w) to be defined for G. The initial genome (constructed via _initialize_population) provides both the topology and, when alg.seed_genome=true, the starting mean of the search distribution.
Returns a GPResult{G} with the best genome found.
Arborist.setup_workers — Method
setup_workers(script_path::AbstractString)Convenience: load a user script on all workers. Equivalent to:
@everywhere include(abspath("your_script.jl"))Also ensures Arborist is loaded on all workers. The primary documented approach is @everywhere include(...) directly.
Arborist.behavioral_initialize — Method
behavioral_initialize(state, evaluator, fingerprint_fn, distance_fn,
target_size; kwargs...) -> Vector{ExprGenome}Generate a behaviorally diverse initial population using a MAP-Elites-style procedure:
- Generate
pool_sizerandom programs - Evaluate each, discard crashed/timed-out programs (fitness == Inf)
- Compute behavioral fingerprints
- Greedily bin by fingerprint distance (threshold-based clustering)
- Select the best-fitness representative from each bin
- Return
target_sizegenomes drawn from distinct bins
This replaces the standard random initialization with a pool that guarantees behavioral diversity — programs that do different things, not just look different syntactically.
Arguments
state::GenState: the genome state (function set, variable types, RNG)evaluator::AbstractEvaluator: for fitness evaluationfingerprint_fn:genome -> fingerprint(any type; same as BehavioralSpeciation)distance_fn:(fp_a, fp_b) -> Float64(same as BehavioralSpeciation)target_size::Int: desired population size
Keyword Arguments
pool_size::Int = target_size * 50: number of random programs to generatebin_threshold::Float64 = 0.1: distance below which two fingerprints are considered the same behaviorbody_generator = nothing: optional(state) -> Vector{Expr}for custom random program generation. Default: 3-7 random statements viacreate_random_statement.bloat_penalty::Float64 = 0.0: bloat penalty for evaluationparallel::Bool = false: use threads for evaluationverbose::Bool = false: print progress
Returns
A Vector{ExprGenome} of length target_size with diverse behaviors. If fewer than target_size distinct behaviors are found in the pool, remaining slots are filled with the best-fitness programs from the pool.
CommonSolve.solve — Method
solve(problem::GPProblem{G,E}, algorithm::GeneticProgramming;
verbose=false, callback=nothing, log=nothing,
checkpoint_every=0, checkpoint_path=nothing,
resume_from=nothing, allow_signature_mismatch=false,
initial_population=nothing, hall_of_fame_size=0) -> GPResult{G}Run a genetic programming evolution using the specified problem and algorithm configuration. Returns a GPResult containing the best genome, fitness history, and run metadata.
Arguments
problem::GPProblem{G,E}: the problem specification (evaluator, genome type, function set)algorithm::GeneticProgramming: the algorithm configuration
Keyword Arguments
verbose::Bool=false: if true, print generation statisticscallback: optional callback function(gen::Int, best_fitness::Float64, best_genome::G) -> nothinglog::Union{Nothing, RunLog}=nothing: optional structured per-generation log. When provided,record!is called once per generation with aggregate fitness, speciation snapshot, structural diversity, and wall-time.checkpoint_every::Int=0: write a checkpoint every N generations.0disables checkpointing. When> 0,checkpoint_pathmust also be set. Supported onExprGenomeandTreeGenomeonly.checkpoint_path::Union{Nothing, AbstractString}=nothing: file path the periodic checkpoints (and the final-generation checkpoint) are written to viasave_checkpoint. Writes are atomic (write-tmp + rename).resume_from::Union{Nothing, AbstractString}=nothing: path to a checkpoint produced by a priorsolve. Loads the population, generation counter, RNG state, fitness history, and all-time best from the checkpoint and continues from there. Mutually exclusive withinitial_population.allow_signature_mismatch::Bool=false: by default,resume_fromrejects a checkpoint whose_algorithm_signaturedoes not match the currentalgorithm. Passtrueto override intentionally (e.g., changing hyperparameters mid-run).initial_population::Union{Nothing, Vector{<:AbstractGenome}}=nothing: warm- start the GA from a user-provided seed pool. Length must equalalgorithm.pop_sizeand element type must matchG. Mutually exclusive withresume_from.hall_of_fame_size::Int=0: when> 0, attach aHallOfFame{G}archive of this capacity to the result. The archive tracks the top-K distinct fitnesses observed across all generations (best-first), surviving elitism loss. The archive is exposed asresult.hall_of_fame.0(default) disables the archive and leavesresult.hall_of_fame === nothing.
CommonSolve.solve — Method
solve(problem::GPProblem{G,E}, algorithm::IslandModel; verbose=false, callback=nothing) -> GPResult{G}Run an island-model evolution with multiple independent populations and periodic migration. Islands run sequentially (no threading). Migration uses a ring topology.
Returns a GPResult containing the global best genome across all islands.
Arborist.migration_targets — Method
migration_targets(topology, i, n_islands, rng) -> Vector{Int}Return destination island indices for emigrants from island i. The rng argument is required by the interface but ignored by deterministic topologies (Ring, Complete).
Arborist.apply_sharing — Method
apply_sharing(raw_fitness::Float64, species_size::Int, formula::Symbol) -> Float64Apply fitness sharing to penalize members of large species (for minimization: lower = better). A larger penalty means more diversity pressure.
Formula options:
:none— no sharing (raw fitness unchanged):linear— multiply by species_size (strong; NEAT default for topology protection):sqrt— multiply by sqrt(species_size) (moderate; good for behavioral optimization):log2— multiply by (1 + log2(species_size)) (gentle; singletons unpenalized)