Algorithms — API Reference

Autogenerated reference for every exported algorithm type (GeneticProgramming, IslandModel, NSGAII, CMAES, MAPElites), the speciation strategies, migration topologies, and the shared solve entry points.

Types

Arborist.GeneticProgramming — Type

GeneticProgramming <: AbstractEvolutionaryAlgorithm

Standard genetic programming algorithm configuration. All fields are set via keyword arguments with sensible defaults.

Fields

pop_size::Int: population size (default: 100)
generations::Int: number of generations to run (default: 200)
mutation_rate::Float64: probability of mutation per offspring (default: 0.3)
crossover_rate::Float64: probability of crossover per offspring pair (default: 0.3)
elitism::Int: number of top individuals carried forward unchanged (default: 2)
bloat_penalty::Float64: coefficient on complexity(g) added to fitness (default: 0.0)
parallel::Bool: enable threaded population evaluation (default: true)
speciation::AbstractSpeciation: speciation strategy (default: NoSpeciation())
mutation_ops::Vector{AbstractMutationOperator}: mutation operators
crossover_ops::Vector{AbstractCrossoverOperator}: crossover operators
selection::AbstractSelectionStrategy: selection strategy
convergence_threshold::Float64: best fitness below this value sets converged=true in result (default: Inf, meaning never converged)
constant_sampler::Union{Function, Nothing}: callable (rng) -> T used by TreeGenome when a new numeric-literal leaf is created (Koza-style Ephemeral Random Constants). nothing preserves the default T(randn(rng)). See erc_uniform for a helper that builds the canonical uniform-over-range sampler.

source

Arborist.GeneticProgramming — Method

GeneticProgramming(; kwargs...) -> GeneticProgramming

Construct a GeneticProgramming algorithm with keyword arguments and sensible defaults.

source

Arborist.IslandModel — Type

IslandModel <: AbstractEvolutionaryAlgorithm

Island model that runs multiple independent populations (islands) with periodic migration. Supports pluggable topologies and optional distributed execution where each island runs in its own worker process.

Fields

n_islands::Int: number of islands (default: 4)
island_algorithm::GeneticProgramming: algorithm for each island
migration_interval::Int: generations between migrations (default: 10)
migration_size::Int: number of individuals to migrate per event (default: 2)
topology::AbstractTopology: migration topology (default: RingTopology())
distributed::Bool: run each island in a separate worker process (default: false)
async::Bool: asynchronous evolution — islands evolve independently (default: false, requires distributed=true)

source

Arborist.IslandModel — Method

IslandModel(; n_islands=4, island_algorithm=GeneticProgramming(),
              migration_interval=10, migration_size=2,
              topology=RingTopology(), distributed=false, async=false)

Construct an IslandModel with keyword arguments and sensible defaults. When distributed=true, each island runs in its own worker process via Distributed.jl, eliminating @eval contention between islands.

source

Arborist.AbstractMultiObjectiveEvaluator — Type

AbstractMultiObjectiveEvaluator <: AbstractEvaluator

Base type for multi-objective fitness evaluators. All objectives are minimized (lower is better), consistent with the rest of Arborist.

Concrete subtypes must implement:

evaluate_multi(e, genome::AbstractGenome) -> Vector{Float64}
objective_names(e) -> Vector{String}
input_signature(e) and output_signature(e) (inherited from AbstractEvaluator)

source

Arborist.NSGAII — Type

NSGAII <: AbstractEvolutionaryAlgorithm

NSGA-II (Non-dominated Sorting Genetic Algorithm II) for multi-objective genetic programming. Uses Pareto non-dominated sorting and crowding distance for selection, with a (mu+lambda) replacement strategy.

No elitism field — NSGA-II's combined parent+offspring sorting IS the elitism mechanism. No speciation — crowding distance maintains diversity. No bloat_penalty — complexity should be an explicit objective via ParsimonyEvaluator.

Fields

pop_size::Int: population size, must be even (default: 100)
generations::Int: number of generations (default: 200)
mutation_rate::Float64: probability of mutation (default: 0.3)
crossover_rate::Float64: probability of crossover (default: 0.3)
parallel::Bool: enable threaded evaluation (default: true)
mutation_ops::Vector{AbstractMutationOperator}: mutation operators
crossover_ops::Vector{AbstractCrossoverOperator}: crossover operators

source

Arborist.NSGAII — Method

NSGAII(; kwargs...) -> NSGAII

Construct an NSGAII algorithm with keyword arguments and sensible defaults.

source

Arborist.NSGAIIResult — Type

NSGAIIResult{G<:AbstractGenome} <: AbstractEvolutionResult

Result returned by solve with NSGAII. Contains the Pareto front, full population, fitness vectors, and convergence history.

Fields

pareto_front::Vector{G}: non-dominated genomes from the final population
pareto_fitnesses::Vector{Vector{Float64}}: fitness vectors for the Pareto front
population::Vector{G}: full final population
all_fitnesses::Vector{Vector{Float64}}: fitness vectors for all individuals
hypervolume_history::Vector{Float64}: hypervolume of front 1 per generation
generations_run::Int: number of generations completed
wall_time::Float64: elapsed wall-clock time in seconds
objective_names::Vector{String}: human-readable names for each objective

source

Arborist.ParsimonyEvaluator — Type

ParsimonyEvaluator{E<:AbstractEvaluator} <: AbstractMultiObjectiveEvaluator

Wraps a single-objective evaluator to produce two objectives:

The original fitness (from the wrapped evaluator)
Genome complexity (from complexity(genome))

This is the canonical GP multi-objective setup: accuracy vs. parsimony.

Example

inner = TreeFitnessEvaluator(X, y, operators)
evaluator = ParsimonyEvaluator(inner)
problem = GPProblem(evaluator, TreeGenome{Float32}; seed=42)
result = solve(problem, NSGAII(pop_size=100, generations=50))

source

Arborist.MAPElites — Type

MAPElites{F, FB, MO, CO} <: AbstractEvolutionaryAlgorithm

Illuminates a user-defined feature space by evolving a grid-based archive.

Fields (constructor kwargs with defaults)

generations::Int = 100: number of outer iterations (each generation samples and evaluates batch_size children from the archive).
batch_size::Int = 50: mutations produced per generation.
n_init::Int = 50: random genomes generated at generation 0 to seed the archive before any sampling happens.
feature_fn::F: genome -> NTuple{N, Float64}. Behavior descriptor used to place genomes into the discretized grid.
feature_bounds::FB: Vector{Tuple{Float64,Float64}} of per-dimension (low, high) intervals. Features outside the range clamp to the extreme bin.
n_bins::Vector{Int}: per-dimension bin count. Total archive capacity is prod(n_bins).
mutation_ops::MO: vector of mutation operators (as in GeneticProgramming).
crossover_ops::CO: optional vector of crossover operators. If empty, every child is produced by mutation only.
crossover_rate::Float64 = 0.0: probability of picking crossover over mutation when both are available.
parallel::Bool = true: threaded evaluation of a batch.
elitism::Int = 0: placeholder for interface parity; MAP-Elites has archive-based elitism intrinsically, so this is unused.

source

Arborist.MAPElitesArchive — Type

MAPElitesArchive{G}

Grid-based archive for MAP-Elites. Maps a cell index (an N-tuple of integers) to a (genome, fitness) pair. Fitness is lower-is-better, matching the framework convention.

Not thread-safe — MAP-Elites runs its own evaluation batches and serializes archive updates; parallel evaluation happens within a batch, insertion after the batch synchronizes.

source

Arborist.MAPElitesResult — Type

MAPElitesResult{G}

Result returned by solve(problem, ::MAPElites). Carries the final archive and per-generation coverage/QD-score trajectories.

Fields

archive::MAPElitesArchive{G}: final grid.
coverage_history::Vector{Float64}: coverage fraction per generation.
qd_score_history::Vector{Float64}: qd_score per generation.
best_genome::G: the single best (lowest-fitness) genome in the archive.
best_fitness::Float64: its fitness.
wall_time::Float64: seconds elapsed.
generations_run::Int: outer iterations completed.

source

Arborist.CMAES — Type

CMAES <: AbstractEvolutionaryAlgorithm

Covariance Matrix Adaptation Evolution Strategy. Suitable for continuous parameter optimization where the search dimension is fixed at solve time. Pairs with GraphGenome via flatten_weights / unflatten_weights! to optimize connection weights under a frozen topology.

Fields (all kwargs, sensible defaults)

generations::Int = 100: outer iteration count.
pop_size::Int = 0: population size λ. 0 means auto: 4 + floor(3 ln n), computed at solve time once n (the parameter count) is known.
sigma0::Float64 = 1.0: initial step size.
parallel::Bool = true: thread the per-generation evaluations.
convergence_threshold::Float64 = -Inf: stop early when best fitness drops below this (lower-is-better convention; default never triggers).
seed_genome::Bool = true: when true (default), the starting mean is taken from the problem's initial genome; when false, drawn from N(0, I) scaled by sigma0.

source

Arborist.AsyncStatusMessage — Type

AsyncStatusMessage

Status update sent from an async island worker to the coordinator. Concrete struct (not NamedTuple) to ensure reliable cross-process serialization.

source

Arborist.CompleteTopology — Type

CompleteTopology <: AbstractTopology

All-to-all migration: island i sends to every other island. Fast dissemination of good solutions but rapidly reduces diversity.

source

Arborist.RandomTopology — Type

RandomTopology <: AbstractTopology

Random migration: island i sends to n_targets randomly chosen islands. Good pairing with asynchronous evolution where the randomness of timing provides a natural diversity mechanism.

Fields

n_targets::Int: number of destination islands per migration event

source

Arborist.RingTopology — Type

RingTopology <: AbstractTopology

Ring migration: island i sends to island (i % n) + 1. This is the default and most common topology in the literature. Sparse connectivity preserves diversity across islands.

source

Arborist.BehavioralSpeciation — Type

BehavioralSpeciation <: AbstractSpeciation

Speciation based on behavioral similarity rather than genomic distance. Two individuals are in the same species if their behavioral fingerprints are within threshold of each other under the provided distance function.

This is useful for behavioral optimization problems (e.g., bin packing, game playing) where multiple syntactically distinct programs can implement the same strategy. Genomic distance treats them as different; behavioral distance correctly collapses them.

Fields

fingerprint_fn::Function: (genome) -> fingerprint — computes a behavioral fingerprint. The return type can be anything that distance_fn can compare.
distance_fn::Function: (a, b) -> Float64 — behavioral distance between two fingerprints. Should return 0.0 for identical behavior.
threshold::Float64: behavioral distance below which individuals are same-species (default: 0.15)
min_species_size::Int: minimum species size before culling (default: 2)
stagnation_limit::Int: generations without improvement before culling (default: 15)
sharing_formula::Symbol: fitness sharing strength — :linear, :sqrt, :log2, or :none (default: :sqrt)

source

Arborist.NoSpeciation — Type

NoSpeciation <: AbstractSpeciation

Trivial speciation strategy: all individuals belong to a single species.

source

Arborist.ThresholdSpeciation — Type

ThresholdSpeciation <: AbstractSpeciation

Species individuals by compatibility distance. Individuals are assigned to the first species whose representative has distance(g1, g2) <= threshold. Tracks stagnation per species and culls stagnant species below min_species_size.

The default threshold of 10.0 is appropriate for ExprGenome where distance is a symmetric-difference node count.

Fitness sharing penalizes members of large species to protect structural innovations in small species.

Fields

threshold::Float64: compatibility distance cutoff (default: 10.0)
min_species_size::Int: minimum species size to avoid culling (default: 2)
stagnation_limit::Int: generations without improvement before culling (default: 15)
sharing_formula::Symbol: fitness sharing strength — :linear, :sqrt, :log2, or :none (default: :log2)

source

Arborist.ThresholdSpeciation — Method

ThresholdSpeciation(; threshold=10.0, min_species_size=2, stagnation_limit=15, sharing_formula=:log2)

Construct a ThresholdSpeciation with keyword arguments and sensible defaults.

source

Functions

Arborist.evaluate_multi — Function

evaluate_multi(e::AbstractMultiObjectiveEvaluator, genome::AbstractGenome) -> Vector{Float64}

Evaluate a genome on all objectives. Returns a vector of fitness values, one per objective, all minimized (lower is better). Returns a vector of Inf values on evaluation failure.

source

Arborist.objective_names — Function

objective_names(e::AbstractMultiObjectiveEvaluator) -> Vector{String}

Return human-readable names for each objective.

source

CommonSolve.solve — Method

solve(problem::GPProblem{G,E}, algorithm::NSGAII; verbose=false, callback=nothing) -> NSGAIIResult{G}

Run NSGA-II multi-objective evolution. Returns an NSGAIIResult containing the Pareto front, fitness vectors, and hypervolume history.

Arguments

problem::GPProblem{G,E}: problem with a multi-objective evaluator
algorithm::NSGAII: NSGA-II configuration

Keyword Arguments

verbose::Bool=false: print generation statistics
callback: optional (gen::Int, front_size::Int, hypervolume::Float64) -> nothing

source

Arborist.coverage — Method

coverage(a::MAPElitesArchive) -> Float64

Fraction of bins in the grid that hold a genome. Ranges [0, 1].

source

Arborist.qd_score — Method

qd_score(a::MAPElitesArchive; worst_fitness=0.0) -> Float64

Quality-Diversity score: sum over filled cells of (worst_fitness - f_cell). Rewards archives that are both deeply filled and broad. Since the framework uses lower-is-better fitness, larger QD score means the archive holds better solutions across more bins. Pass worst_fitness = maximum of the task's plausible loss range for a meaningful score.

source

CommonSolve.solve — Method

solve(problem::GPProblem{G,E}, alg::MAPElites; verbose=false, log=nothing) -> MAPElitesResult{G}

Run MAP-Elites. Requires problem.evaluator to implement evaluate_genome(g, e). The feature function and grid are supplied via alg. Seeds the archive with alg.n_init random individuals via the problem's _initialize_population path, then alternates generations of (sample parent from archive → mutate/crossover → evaluate → maybe insert).

log::Union{Nothing, RunLog} records per-generation coverage and qd_score alongside the standard fitness stats (best from archive as a scalar).

source

Arborist.flatten_weights — Function

flatten_weights(g::AbstractGenome) -> Vector{Float64}

Return the genome's continuous parameters as a Float64 vector. Defined for GraphGenome over its enabled connection weights, sorted by innovation number for determinism.

Raises MethodError for genome types that don't expose a continuous parameter vector — those genomes can't be searched by CMA-ES directly.

source

Arborist.unflatten_weights! — Function

unflatten_weights!(g::AbstractGenome, w::Vector{Float64}) -> AbstractGenome

Inverse of flatten_weights: writes w back into g's continuous parameters. Mutates g and returns it. Length of w must match length(flatten_weights(g)).

source

CommonSolve.solve — Method

solve(problem::GPProblem{G,E}, alg::CMAES; verbose=false, log=nothing) -> GPResult{G}

Run CMA-ES against the problem's evaluator on a fixed-topology genome. Requires flatten_weights(g) / unflatten_weights!(g, w) to be defined for G. The initial genome (constructed via _initialize_population) provides both the topology and, when alg.seed_genome=true, the starting mean of the search distribution.

Returns a GPResult{G} with the best genome found.

source

Arborist.setup_workers — Method

setup_workers(script_path::AbstractString)

Convenience: load a user script on all workers. Equivalent to:

@everywhere include(abspath("your_script.jl"))

Also ensures Arborist is loaded on all workers. The primary documented approach is @everywhere include(...) directly.

source

Arborist.behavioral_initialize — Method

behavioral_initialize(state, evaluator, fingerprint_fn, distance_fn,
                      target_size; kwargs...) -> Vector{ExprGenome}

Generate a behaviorally diverse initial population using a MAP-Elites-style procedure:

Generate pool_size random programs
Evaluate each, discard crashed/timed-out programs (fitness == Inf)
Compute behavioral fingerprints
Greedily bin by fingerprint distance (threshold-based clustering)
Select the best-fitness representative from each bin
Return target_size genomes drawn from distinct bins

This replaces the standard random initialization with a pool that guarantees behavioral diversity — programs that do different things, not just look different syntactically.

Arguments

state::GenState: the genome state (function set, variable types, RNG)
evaluator::AbstractEvaluator: for fitness evaluation
fingerprint_fn: genome -> fingerprint (any type; same as BehavioralSpeciation)
distance_fn: (fp_a, fp_b) -> Float64 (same as BehavioralSpeciation)
target_size::Int: desired population size

Keyword Arguments

pool_size::Int = target_size * 50: number of random programs to generate
bin_threshold::Float64 = 0.1: distance below which two fingerprints are considered the same behavior
body_generator = nothing: optional (state) -> Vector{Expr} for custom random program generation. Default: 3-7 random statements via create_random_statement.
bloat_penalty::Float64 = 0.0: bloat penalty for evaluation
parallel::Bool = false: use threads for evaluation
verbose::Bool = false: print progress

Returns

A Vector{ExprGenome} of length target_size with diverse behaviors. If fewer than target_size distinct behaviors are found in the pool, remaining slots are filled with the best-fitness programs from the pool.

source

CommonSolve.solve — Method

solve(problem::GPProblem{G,E}, algorithm::GeneticProgramming;
      verbose=false, callback=nothing, log=nothing,
      checkpoint_every=0, checkpoint_path=nothing,
      resume_from=nothing, allow_signature_mismatch=false,
      initial_population=nothing, hall_of_fame_size=0) -> GPResult{G}

Run a genetic programming evolution using the specified problem and algorithm configuration. Returns a GPResult containing the best genome, fitness history, and run metadata.

Arguments

problem::GPProblem{G,E}: the problem specification (evaluator, genome type, function set)
algorithm::GeneticProgramming: the algorithm configuration

Keyword Arguments

verbose::Bool=false: if true, print generation statistics
callback: optional callback function (gen::Int, best_fitness::Float64, best_genome::G) -> nothing
log::Union{Nothing, RunLog}=nothing: optional structured per-generation log. When provided, record! is called once per generation with aggregate fitness, speciation snapshot, structural diversity, and wall-time.
checkpoint_every::Int=0: write a checkpoint every N generations. 0 disables checkpointing. When > 0, checkpoint_path must also be set. Supported on ExprGenome and TreeGenome only.
checkpoint_path::Union{Nothing, AbstractString}=nothing: file path the periodic checkpoints (and the final-generation checkpoint) are written to via save_checkpoint. Writes are atomic (write-tmp + rename).
resume_from::Union{Nothing, AbstractString}=nothing: path to a checkpoint produced by a prior solve. Loads the population, generation counter, RNG state, fitness history, and all-time best from the checkpoint and continues from there. Mutually exclusive with initial_population.
allow_signature_mismatch::Bool=false: by default, resume_from rejects a checkpoint whose _algorithm_signature does not match the current algorithm. Pass true to override intentionally (e.g., changing hyperparameters mid-run).
initial_population::Union{Nothing, Vector{<:AbstractGenome}}=nothing: warm- start the GA from a user-provided seed pool. Length must equal algorithm.pop_size and element type must match G. Mutually exclusive with resume_from.
hall_of_fame_size::Int=0: when > 0, attach a HallOfFame{G} archive of this capacity to the result. The archive tracks the top-K distinct fitnesses observed across all generations (best-first), surviving elitism loss. The archive is exposed as result.hall_of_fame. 0 (default) disables the archive and leaves result.hall_of_fame === nothing.

source

CommonSolve.solve — Method

solve(problem::GPProblem{G,E}, algorithm::IslandModel; verbose=false, callback=nothing) -> GPResult{G}

Run an island-model evolution with multiple independent populations and periodic migration. Islands run sequentially (no threading). Migration uses a ring topology.

Returns a GPResult containing the global best genome across all islands.

source

Arborist.migration_targets — Method

migration_targets(topology, i, n_islands, rng) -> Vector{Int}

Return destination island indices for emigrants from island i. The rng argument is required by the interface but ignored by deterministic topologies (Ring, Complete).

source

Arborist.apply_sharing — Method

apply_sharing(raw_fitness::Float64, species_size::Int, formula::Symbol) -> Float64

Apply fitness sharing to penalize members of large species (for minimization: lower = better). A larger penalty means more diversity pressure.

Formula options:

:none — no sharing (raw fitness unchanged)
:linear — multiply by species_size (strong; NEAT default for topology protection)
:sqrt — multiply by sqrt(species_size) (moderate; good for behavioral optimization)
:log2 — multiply by (1 + log2(species_size)) (gentle; singletons unpenalized)

source

Algorithms — API Reference

Types

Functions

Constants