biota

Architecture

biota is a distributed quality-diversity search platform for Flow-Lenia. Two subsystems share the same vectorized PyTorch runtime and the same Ray cluster: a MAP-Elites search that fills a behavioral archive, and an ecosystem dispatch that runs selected archive creatures on shared canvases. Both run on a homelab 3-node RTX 5060 Ti cluster.

archive capacity

1,024 centroids

archive type

CVT-MAP-Elites

descriptor library

18 built-in axes

MAP-Elites search

MAP-Elites runs a loop: sample parameters, simulate a creature, measure its behavior, insert it into the matching archive cell if it beats whatever is already there. After thousands of rollouts the archive fills with creatures covering the behavioral space as broadly as possible — not optimizing toward one solution, but toward diversity.
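The loop described above can be sketched in a few lines. This is a toy illustration, not biota's implementation: `map_elites`, `toy_sample`, `toy_evaluate`, and `toy_cell` are hypothetical names, and the "creature" is a single float rather than a Flow-Lenia parameter set.

```python
import random

def map_elites(evaluate, sample, cell_of, budget, seed=0):
    """Toy MAP-Elites: keep the best solution found in each behavioral cell."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (quality, params)
    for _ in range(budget):
        params = sample(rng)                    # sample parameters
        quality, descriptor = evaluate(params)  # simulate and measure behavior
        cell = cell_of(descriptor)              # find the matching archive cell
        incumbent = archive.get(cell)
        if incumbent is None or quality > incumbent[0]:
            archive[cell] = (quality, params)   # insert only if it beats the incumbent
    return archive

# Hypothetical toy problem: the descriptor is the value itself, split into
# 10 cells, and quality peaks at the center of each cell.
def toy_sample(rng):
    return rng.random()

def toy_evaluate(x):
    return 1.0 - abs((x * 10) % 1 - 0.5), x

def toy_cell(descriptor):
    return min(int(descriptor * 10), 9)

archive = map_elites(toy_evaluate, toy_sample, toy_cell, budget=2000)
```

Each cell ends up holding its own local best, which is the sense in which the archive rewards coverage rather than a single global optimum.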

The archive is a CVT-MAP-Elites archive. Before search begins, a calibration phase runs a small number of rollouts and fits k-means centroids to the observed descriptor distribution. Archive cells are Voronoi regions around those centroids — dense where creatures naturally live in behavioral space, sparse where they do not. Each cell holds the highest-quality creature found in its region.
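As a rough sketch of the calibration idea, the snippet below fits centroids to a set of observed descriptors with plain Lloyd's k-means and then maps a new descriptor to its Voronoi cell by nearest-centroid lookup. The function names, the synthetic calibration data, and the choice of 16 centroids are assumptions for illustration; biota's actual calibration code may differ.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the closest centroid, i.e. the point's Voronoi cell."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def fit_centroids(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over calibration descriptors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

# Synthetic calibration descriptors (hypothetical): two clustered axes, one uniform.
rng = random.Random(1)
calibration = [(rng.gauss(0.3, 0.05), rng.gauss(0.7, 0.05), rng.random())
               for _ in range(500)]
centroids = fit_centroids(calibration, k=16)
cell = nearest((0.3, 0.7, 0.5), centroids)
```

Because the centroids follow the calibration data, cells crowd into the region where descriptors were actually observed.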

The driver owns the archive and the search loop. Each Ray task evaluates B creatures as a single (B, H, W) vectorized forward pass on one GPU. Workers are stateless; nothing persistent lives on the cluster between tasks. --workers N controls how many batches are in flight simultaneously: one means synchronous MAP-Elites with a maximally fresh archive, higher values trade freshness for throughput.
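The in-flight batching pattern can be illustrated without a cluster. The sketch below uses Python's `concurrent.futures` as a stand-in for Ray tasks, so it shows only the scheduling shape: the driver keeps up to `workers` batches pending and consumes each result as it completes. `evaluate_batch` and `run_search` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def evaluate_batch(batch_id, batch_size):
    """Stand-in for one stateless worker task that evaluates B creatures."""
    return [(batch_id, i) for i in range(batch_size)]

def run_search(budget, batch_size, workers):
    results = []
    submitted = 0
    next_id = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = set()
        while submitted < budget or pending:
            # Top up so that at most `workers` batches are in flight.
            while submitted < budget and len(pending) < workers:
                pending.add(pool.submit(evaluate_batch, next_id, batch_size))
                next_id += 1
                submitted += batch_size
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                # A real driver would insert these rollouts into the archive here.
                results.extend(future.result())
    return results

rollouts = run_search(budget=200, batch_size=32, workers=3)
```

With `workers=1` every batch is sampled after the previous one has been inserted, which is the synchronous case; with more workers, some batches are sampled from an archive that is a few insertions stale.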

Quality metric

Every rollout must pass three hard filters before it can enter the archive. First, the conserved quantity (mass + signal for signal runs, mass alone otherwise) must stay between 0.5 and 2.0 times its initial value; creatures that explode or die are rejected. Second, the mean bounding-box fraction over the trailing window must be below 0.6; creatures that spread across the entire grid are rejected. Third, the three behavioral descriptors, measured over adjacent 50-step windows, must not drift by more than 20% of the observed descriptor range; creatures that are still changing shape or speed are rejected.
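A minimal sketch of the three filters as a single predicate, assuming the inputs have already been measured from the rollout. The function name and argument shapes are hypothetical; only the thresholds come from the text above.

```python
def passes_hard_filters(conserved_ratios, mean_bbox_fraction, descriptor_drifts):
    """Return True if a rollout survives all three hard filters.

    conserved_ratios:   conserved quantity / initial value, sampled over the rollout
    mean_bbox_fraction: mean bounding-box fraction over the trailing window
    descriptor_drifts:  per-descriptor drift as a fraction of the observed range
    """
    # 1. Conservation: reject creatures that explode or die.
    if not all(0.5 <= r <= 2.0 for r in conserved_ratios):
        return False
    # 2. Localization: reject creatures that spread across the grid.
    if mean_bbox_fraction >= 0.6:
        return False
    # 3. Persistence: reject creatures whose descriptors are still drifting.
    if max(descriptor_drifts) > 0.2:
        return False
    return True
```

For example, `passes_hard_filters([1.0, 0.9, 0.8], 0.25, [0.05, 0.10, 0.02])` accepts the rollout, while raising any single drift above 0.2 rejects it.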

Survivors are ranked by a quality score that creates genuine selection pressure within the viable population:

quality score (non-signal): $q = 0.6 \cdot \text{compactness} + 0.4 \cdot \text{stability}$

quality score (signal): $q = 0.5 \cdot \text{compactness} + 0.3 \cdot \text{stability} + 0.2 \cdot \text{signal activity}$

compactness (two-point minimum)
compactness = min( compact(state T/2), compact(state T) )
where compact(s) = mass inside bounding box / total mass

stability (continuous drift score)
stability = clip( 1 - drift / 0.2, 0, 1 )
where drift = max descriptor change across adjacent 50-step windows

signal activity (signal runs only)
signal activity = clip( final_signal_mass / initial_signal_mass, 0, 1 )

The two-point compactness term is the key addition over a naive single-snapshot metric. Almost all viable Flow-Lenia solitons score above 0.95 compactness at the final step, making a single-point measurement nearly constant across the population and providing no selection pressure. By taking the minimum of the midpoint and final compactness, creatures that peak early and gradually become diffuse score below creatures that maintain their structure throughout — exactly the property that matters for long ecosystem runs.

The stability term converts the binary persistent filter into a continuous reward: a creature that barely passes (drift = 0.19) scores near zero, while a rock-stable creature (drift = 0.01) scores near 1.0. This ensures the archive prioritises behaviorally consistent creatures over lucky borderline survivors.
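The scoring formulas translate directly into code. The sketch below uses hypothetical function names and takes the compactness and drift measurements as given; the weights and the 0.2 drift threshold are the ones stated above.

```python
def clip(x, lo, hi):
    return max(lo, min(hi, x))

def stability(drift, threshold=0.2):
    """Continuous drift score: 1.0 at zero drift, 0.0 at the filter threshold."""
    return clip(1.0 - drift / threshold, 0.0, 1.0)

def quality(compact_mid, compact_final, drift, signal_ratio=None):
    """Quality score; pass signal_ratio (final / initial signal mass) for signal runs."""
    compactness = min(compact_mid, compact_final)  # two-point minimum
    s = stability(drift)
    if signal_ratio is None:
        return 0.6 * compactness + 0.4 * s
    signal_activity = clip(signal_ratio, 0.0, 1.0)
    return 0.5 * compactness + 0.3 * s + 0.2 * signal_activity
```

With these definitions a borderline creature with drift 0.19 gets a stability of about 0.05, and one with drift 0.01 gets about 0.95, matching the two cases described above.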

Ecosystem dispatch

Once the archive is populated, biota ecosystem takes specific archive cells and runs them on a shared grid to see how creatures interact. A homogeneous run spawns N copies of one species. A heterogeneous run mixes two or more species, each with its own full parameter set — kernel radii, growth windows, weights — using species-indexed LocalizedFlowLenia: per-cell species ownership tracks which lineage owns the local mass, blends growth fields by ownership, and advects with the flow.

After the simulation, a suite of spatial observables is computed from the captured snapshots without re-running the simulation. For heterogeneous runs: patch count per species over time, interface area and center-of-mass distance per species pair, and spatial entropy per species. For homogeneous runs: patch count over time, spatial entropy, and patch size distribution. Interaction coefficients between species are gated to snapshot windows where species actually co-occur, so they measure contact dynamics rather than spatial separation. A temporal outcome classifier assigns per-species labeled windows — coexistence, exclusion, merger, or fragmentation for heterogeneous runs; stable isolation, full merger, partial clustering, cannibalism, or fragmentation for homogeneous runs — and derives a dominant run-level label shown as a badge. The ecosystem viewer renders all charts alongside the animated GIF.
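As one example of such an observable, patch count can be computed from a single snapshot with a flood fill. This is a sketch under assumptions that are not in the source: patches are 4-connected regions of cells whose mass exceeds a hypothetical `threshold`, and the grid wraps at the edges as in a torus-bordered run.

```python
def count_patches(grid, threshold=0.1):
    """Count 4-connected regions of cells with mass above threshold on a torus."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    patches = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x] <= threshold or seen[y][x]:
                continue
            patches += 1
            seen[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = (cy + dy) % h, (cx + dx) % w  # wrap around the edges
                    if grid[ny][nx] > threshold and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return patches
```

Running this over every captured snapshot yields a patch count series without re-running the simulation.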

Ecosystem dispatch is Ray-correct in both directions: each experiment is a self-contained payload. The driver loads creatures from its local archive and ships them as part of the task input; workers simulate and render to bytes; the driver materializes outputs to its own local filesystem. No shared filesystem is assumed at any step, so experiments run correctly on real multi-node clusters without NFS or rsync setup.

Signal field

An optional chemical communication layer adds a shared (H, W, 16) signal field to the simulation. When --signal-field is passed to biota search, eight additional parameters become searchable per creature across 16 independent channels:

per-creature signal parameters
emission_vector   (C,) in [0, 1]           # how emitted signal distributes across channels
receptor_profile  (C,) in [−1, 1]          # channel weights; negative = inhibitory (GABA-like)
emission_rate     scalar in [0.001, 0.05]  # fraction of G⁺ converted to signal per step
decay_rates       (C,) in [0, 0.9]         # per-channel dissipation; low = long-range gradients
signal_kernel_r / a / b / w                # spatial diffusion kernel (per-creature)

One simulation step:
(1) convolve mass to get growth G(H,W);
(2) convolve signal field with signal kernel;
(3) compute reception: dot(convolved_signal, receptor_profile), where negative receptor weights are inhibitory;
(4) apply alpha_coupling: G × (1 + α × reception) clamped to [0,∞); positive α amplifies growth where signal is favorable, including inside other species’ territory (the cross-species predation pathway), and negative α is chemorepulsion;
(5) modulate emission rate via beta_modulation: rate × (1 + β × mean(reception)); positive β = quorum sensing, negative β = feedback inhibition;
(6) emit G⁺ × rate_eff × emission_vector, draining mass into signal field;
(7) reintegrate mass via Flow-Lenia advection;
(8) decay signal per-channel at decay_rates.
Total conserved: mass + signal.
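Steps (3) through (6) are plain arithmetic once the convolutions are done, so they can be sketched on a flattened list of cells. This is an illustrative reading, not biota's code. It assumes the clamp in step (4) applies to the multiplicative factor (1 + α × reception), that G⁺ means the positive part of the coupled growth, and that the mean in step (5) is taken over all cells passed in. The function name is hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def couple_and_emit(growth, convolved_signal, receptor_profile,
                    emission_vector, emission_rate, alpha, beta):
    """growth: G per cell; convolved_signal: one channel vector per cell."""
    # (3) reception: negative receptor weights make a channel inhibitory
    reception = [dot(sig, receptor_profile) for sig in convolved_signal]
    # (4) alpha coupling; ASSUMPTION: the factor, not the product, is clamped at 0
    coupled = [g * max(0.0, 1.0 + alpha * r) for g, r in zip(growth, reception)]
    # (5) beta modulation of the emission rate by mean reception
    rate_eff = emission_rate * (1.0 + beta * sum(reception) / len(reception))
    # (6) emission per cell and channel; ASSUMPTION: G+ = max(coupled G, 0)
    emitted = [[max(0.0, g) * rate_eff * e for e in emission_vector]
               for g in coupled]
    return coupled, rate_eff, emitted

coupled, rate_eff, emitted = couple_and_emit(
    growth=[0.5, -0.2],
    convolved_signal=[[1.0, 0.0], [0.0, 1.0]],
    receptor_profile=[1.0, -1.0],
    emission_vector=[0.5, 0.5],
    emission_rate=0.01, alpha=0.5, beta=1.0,
)
```

In this two-cell example the first cell sees a favorable signal and its growth is amplified, while the second cell has negative growth and therefore emits nothing. Draining the emitted amount from mass, advection, and decay are left out of the sketch.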

Searches with --signal-field automatically use 800 steps (vs 500 for standard) so emission and reception dynamics have time to build meaningful gradients. The quality metric gains a signal activity term: clip(final_signal_mass / initial_signal_mass, 0, 1) weighted at 0.2, rewarding creatures that maintain or build up signal mass rather than letting the background field decay. A creature mass floor of 0.2 × initial_mass is enforced as a hard filter. In ecosystem runs, all species share one signal field: each species emits and decays by its own parameters, ownership-weighted. Signal archives and non-signal archives cannot be mixed in a single ecosystem run.

The ecosystem viewer exposes signal-specific observables: total signal mass history, signal mass fraction per step (signal / (mass + signal)), receptor alignment per species per snapshot (dot(receptor_profile, mean signal received in territory)), and an emission-reception compatibility matrix (dot(emission_vector[i], receptor_profile[j])) showing which species pairs have chemically compatible signal profiles.
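The compatibility matrix is a table of dot products, which the following sketch makes concrete. The function name and the two example species are hypothetical; entry [i][j] describes how species j's receptors respond to what species i emits.

```python
def compatibility_matrix(emission_vectors, receptor_profiles):
    """matrix[i][j] = dot(emission_vector[i], receptor_profile[j])."""
    return [[sum(e * r for e, r in zip(emission, receptors))
             for receptors in receptor_profiles]
            for emission in emission_vectors]

# Two hypothetical species on two channels.
emissions = [[1.0, 0.0],   # species 0 emits only on channel 0
             [0.0, 1.0]]   # species 1 emits only on channel 1
receptors = [[0.0, 1.0],   # species 0 is attracted by channel 1
             [0.0, -1.0]]  # species 1 is inhibited by channel 1
matrix = compatibility_matrix(emissions, receptors)
```

Here species 0's emission is invisible to both species, species 1's emission attracts species 0 (entry [1][0] is 1.0), and species 1 is inhibited by its own signal (entry [1][1] is -1.0).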

Behavioral descriptors

Three scalars measured from each rollout index the archive's three axes. The active set is chosen per run — any three from the built-in library of eighteen (15 general + 3 signal-only).

velocity

Mean COM displacement per step over the trailing 50 steps. Separates drifters from stationary creatures.

gyradius

Mass-weighted RMS distance from the center of mass. Separates compact dots from spread-out rings and lattices.

spectral entropy

Shannon entropy of the radially-averaged FFT spectrum. Separates smooth blobs from sharp-edged structured creatures.

oscillation

Variance of bounding-box area over the trace tail. Separates pulsing, breathing creatures from rigid translators.

compactness

Fraction of total mass inside the bounding box. Separates tight, well-defined creatures from diffuse scattered ones.

mass asymmetry

Directional bias of COM motion. Separates straight-line movers from orbiters and creatures with circular trajectories.

PNG compressibility

Compressed-to-raw size ratio of the final state. Low = smooth and boring, high = noisy, middle = structured and interesting.

rotational symmetry

Angular variance of mass around the center of mass. Low = rings and disks, high = dumbbells, L-shapes, asymmetric gliders.

persistence score

Maximum descriptor drift across the trace tail. Low = behaviorally stable over time, high = the creature is still changing.

displacement ratio

Total displacement divided by total path length over the trace tail. Zero = pure orbiter, one = perfect straight-line glider.

angular velocity

Mean absolute angular speed of COM motion over the trace tail. Separates rotors and orbiters from translators and stationary creatures.

growth gradient

Mass-weighted mean spatial gradient magnitude at the final step. Low = smooth symmetric creature, high = labyrinthine internal structure.

morphological instability

Variance of gyradius over the trace tail. Low = rigid stable form, high = creature constantly reshapes or fragments over time.

activity

Mean absolute gyradius change per step. Measures internal work rate — static creatures score near zero, pulsing or morphing ones score high.

spatial entropy

Shannon entropy of mass distribution over a coarse spatial grid. Low = compact localized mass, high = diffuse or multi-body spread.

signal field variance (signal only)

Spatial variance of the total signal field (summed across channels) at the end of the rollout. High = signal is concentrated and localized near the creature body. Low = diffused evenly or barely accumulated. Captures the creature’s chemical footprint structure. Zero for non-signal rollouts.

signal mass ratio (signal only)

Final signal field mass divided by initial signal mass. Measures how much chemical substance has accumulated relative to the background field. A pure listener scores near 1 (field unchanged); a broadcaster scores high (field built up). Zero for non-signal rollouts.

dominant channel fraction (signal only)

Fraction of total signal mass carried by the channel with the highest accumulation. High = chemical specialist (strongly emits one channel). Low = generalist (spreads evenly across channels). Captures the effective emission_vector direction. Returns 1/C for non-signal rollouts.
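Two of the descriptors above are simple enough to sketch from their one-line definitions. The functions below are illustrative only: the names are hypothetical, positions are treated as plain Euclidean coordinates with no torus wrap-around, and a stationary trace is arbitrarily given a displacement ratio of zero.

```python
import math

def displacement_ratio(path):
    """Net displacement divided by path length for a list of (x, y) COM positions."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    if length == 0.0:
        return 0.0  # ASSUMPTION: a stationary trace scores zero
    return math.dist(path[0], path[-1]) / length

def gyradius(grid):
    """Mass-weighted RMS distance from the center of mass of a 2D grid."""
    cells = [(y, x, m) for y, row in enumerate(grid)
             for x, m in enumerate(row) if m > 0]
    total = sum(m for _, _, m in cells)
    if total == 0:
        return 0.0
    cy = sum(y * m for y, _, m in cells) / total
    cx = sum(x * m for _, x, m in cells) / total
    return math.sqrt(sum(m * ((y - cy) ** 2 + (x - cx) ** 2)
                         for y, x, m in cells) / total)
```

A straight-line trace such as `[(0, 0), (1, 0), (2, 0)]` gives a ratio of one, a closed square loop gives zero, and two unit masses two cells apart have a gyradius of one.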

Run your own atlas

biota is a self-contained CLI tool. You can run it on a laptop for quick iteration or across a Ray cluster for full-scale searches. The only hard dependencies are Python 3.12, PyTorch, and Ray.

Install

# Clone and install
git clone https://github.com/rkv0id/biota
cd biota
uv sync

Quick start

Run a search

The dev preset (64×64 grid, 200 steps) is fast enough for a laptop. No Ray needed.

uv run biota search --preset dev --budget 200

Build the viewer

Generates a self-contained HTML file for each run, plus an index.

python scripts/build_index.py
open runs/index.html

GPU and cluster

On Apple Silicon, pass --device mps --batch-size 32 for a meaningful speedup. On a CUDA cluster, use --batch-size 64 --workers N, where N is the number of nodes. The standard preset (192×192, 300 steps) at B=64 on an RTX 5060 Ti cluster runs 500 rollouts in about 97 seconds.

# Apple Silicon
uv run biota search --preset standard --budget 500 \
    --device mps --batch-size 32

# Single CUDA GPU, no Ray
uv run biota search --preset standard --budget 500 \
    --device cuda --batch-size 64

# Multi-node Ray cluster, custom descriptor axes
uv run biota search --ray-address head:6379 \
    --preset standard --budget 2000 \
    --device cuda --batch-size 64 --workers 3 \
    --descriptors oscillation,compactness,png_compressibility

# Signal field: adds emission, receptor, and kernel parameters to the search
uv run biota search --ray-address head:6379 \
    --preset standard --budget 2000 \
    --device cuda --batch-size 64 --workers 3 \
    --signal-field

Choosing descriptors

Pass --descriptors with three comma-separated names to control which behavioral axes the archive uses. With 18 built-ins (15 general + 3 signal-only) there are 816 possible three-axis configurations. You can also supply your own via --descriptor-module path/to/file.py — the file must define a list named DESCRIPTORS containing Descriptor objects.

# Default axes
uv run biota search --descriptors velocity,gyradius,spectral_entropy

# Shape and complexity axes
uv run biota search --descriptors oscillation,compactness,png_compressibility
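The count of 816 three-axis configurations is the number of unordered triples drawn from 18 descriptors, which is quick to verify. The second figure, for the 15 general descriptors alone, is derived here and is not quoted in the text.

```python
import math

all_axes = math.comb(18, 3)      # any three of the 18 built-ins
general_only = math.comb(15, 3)  # triples that avoid the 3 signal-only descriptors

print(all_axes, general_only)
```

So 361 of the 816 configurations include at least one signal-only axis.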

Ecosystem experiments

Once you have an archive, define one or more ecosystem experiments in a YAML config. A homogeneous run spawns multiple copies of one creature; a heterogeneous run mixes creatures from different archive cells.

# experiments.yaml
experiments:
  - name: self-interaction
    grid: 256
    steps: 5000
    snapshot_every: 50
    border: torus
    output_format: gif
    spawn: {seed: 42, min_dist: 80, patch: 48}
    sources:
      - run: <your-run-id>
        creature_id: <your-creature-id>
        n: 4

  - name: two-species
    grid: 256
    steps: 5000
    snapshot_every: 50
    border: torus
    output_format: gif
    spawn: {seed: 42, min_dist: 100, patch: 48}
    sources:
      - run: <your-run-id>
        creature_id: <your-creature-id>
        n: 2
      - run: <your-run-id>
        creature_id: <another-creature-id>
        n: 2

# Run experiments, then rebuild the atlas to include ecosystem results
biota ecosystem --config experiments.yaml --device cuda
python scripts/build_index.py --output-dir archive --ecosystem-dir ecosystem --publish

More detail

Full documentation is in the GitHub README.
