SAGESim (Scalable Agent-Based GPU-Enabled Simulator) is a scalable, pure-Python, general-purpose agent-based modeling framework that supports both distributed computing and GPU acceleration.
This tutorial walks through how to build and run agent-based simulations using SAGESim. The core idea centers on subclassing the Model class to define your custom model.
For architecture details, see Architecture Overview. For synchronization details, see Synchronization and Double Buffering.
A SAGESim simulation starts with a custom model class that subclasses the base `Model` class. The subclass inherits the built-in `simulate()` method, which runs the simulation.
The model class is responsible for:
- Registering Breeds: Register breeds in the model's `__init__()` method using `register_breed()`.
- Registering Global Properties: Register shared properties using `register_global_property()`.
- Creating and Connecting Agents: Use `create_agent_of_breed()` and `connect_agents()`.

```python
from sagesim.model import Model
from sagesim.space import NetworkSpace


class MyModel(Model):
    def __init__(self, p_infection=0.2) -> None:
        space = NetworkSpace()
        super().__init__(space)

        # Register breeds (MyBreed is defined in the breed example that follows)
        self._my_breed = MyBreed()
        self.register_breed(breed=self._my_breed)

        # Register global properties
        self.register_global_property("p_infection", p_infection)

    def create_agent(self, state):
        agent_id = self.create_agent_of_breed(self._my_breed, state=state)
        return agent_id

    def connect_agents(self, agent_0, agent_1):
        self.get_space().connect_agents(agent_0, agent_1)
```

Every agent in SAGESim belongs to a specific breed. To define a breed, subclass the `Breed` class:

- Register properties using `self.register_property(name, default_value)`.
- Register step functions using `self.register_step_func(func, file_path, priority)`.

```python
from sagesim.breed import Breed


class MyBreed(Breed):
    def __init__(self) -> None:
        super().__init__("MyBreed")
        self.register_property("state", 1)  # Default value = 1
        # my_step_func is the step function defined in the next example
        self.register_step_func(my_step_func, __file__, priority=0)
```

A step function defines how an agent behaves during each simulation tick. It must be decorated with `@jit.rawkernel(device="cuda")`.

```python
from cupyx import jit


@jit.rawkernel(device="cuda")
def my_step_func(
    tick,         # Current simulation tick
    agent_index,  # Index of this agent in the arrays
    globals,      # Global properties array
    agent_ids,    # Agent ID array
    breeds,       # Breed ID array
    locations,    # Neighbor indices array
    state,        # User-defined property arrays...
):
    """Agent behavior logic goes here."""
    # Read current state
    current_state = state[agent_index]

    # Access neighbors
    neighbor_indices = locations[agent_index]

    # Update state (placeholder: replace with your own update rule)
    new_value = current_state
    state[agent_index] = new_value
```

- Required parameters: `tick`, `agent_index`, `globals`, `agent_ids`, `breeds`, and `locations` must always be included, in this exact order.
- All properties included: Every registered property from every breed must appear in the signature, even if the step function does not use it.
- Property order: Properties appear in breed registration order, then in property registration order within each breed.

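The ordering rules above can be sketched as a small helper that derives the expected parameter list from the breeds and their properties. This is an illustration only: `expected_step_func_params` and the breed and property names in the usage example are hypothetical and not part of the SAGESim API. The sketch also does not address how a property name shared by several breeds would be handled.

```python
# Hypothetical helper (not part of SAGESim): derive the parameter order
# that a step function signature is expected to follow.

# The six required parameters, in the exact order the tutorial lists them.
REQUIRED_PARAMS = ["tick", "agent_index", "globals", "agent_ids", "breeds", "locations"]


def expected_step_func_params(breed_properties):
    """Return the full parameter list for a step function.

    breed_properties maps each breed name to its list of property names.
    Dict insertion order stands in for breed registration order, and list
    order stands in for property registration order within a breed.
    """
    params = list(REQUIRED_PARAMS)
    for breed_name in breed_properties:
        for prop in breed_properties[breed_name]:
            params.append(prop)
    return params


# Example with two made-up breeds, registered in this order.
example = expected_step_func_params(
    {
        "Human": ["state", "age"],
        "Mosquito": ["infected"],
    }
)
print(example)
```

Comparing such a list against a step function's actual signature (for example with `inspect.signature`) before launching a run is one way to catch an ordering mistake early.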
SAGESim uses CuPy's jit.rawkernel for GPU execution. When writing step functions, be aware of these constraints:

| Limitation | Workaround |
|---|---|
| NaN checks don't work normally | Use `x != x` to check for NaN |
| No dicts or custom objects | Use arrays and primitives only |
| No `*args` or `**kwargs` | Use fixed argument lists |
| No nested functions | Define helpers at module level |
| No for-each loops | Use `for i in range(n)` |
| No `return` statements | Write results to arrays |
| No `break` or `continue` | Use boolean flags |
| No variable reassignment in scopes | Declare variables at top level |
| No `-1` indexing | Use `len(array) - 1` |

See CuPy documentation for supported operations.
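
To make the workarounds in the table concrete, here is a minimal sketch written in plain Python so that it runs on the CPU without a GPU. `sum_until_nan` is a hypothetical function, not part of SAGESim or CuPy. It only illustrates the coding style: an index-based loop, a boolean flag instead of `break`, `x != x` as the NaN test, variables declared up front, and results written to an output array instead of returned. Whether a given body compiles under `jit.rawkernel` still has to be checked against the CuPy documentation.

```python
def sum_until_nan(values, n, out):
    """Sum values[0:n] up to the first NaN and write the results into out.

    out[0] receives the partial sum, out[1] the number of values summed.
    """
    # Declare every variable at the top level of the function body.
    total = 0.0
    count = 0
    v = 0.0
    stopped = False  # boolean flag used instead of `break`

    # Index-based loop instead of `for v in values`.
    for i in range(n):
        v = values[i]
        if not stopped:
            if v != v:  # NaN is the only value that is not equal to itself
                stopped = True
            else:
                total = total + v
                count = count + 1

    # Write results to the output array instead of returning them.
    out[0] = total
    out[1] = count


values = [1.0, 2.5, float("nan"), 4.0]
out = [0.0, 0.0]
sum_until_nan(values, len(values), out)
print(out)  # [3.5, 2]
```

The loop keeps iterating after the flag is set, which wastes a few iterations but avoids `break`; the trade-off is the one the table describes.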
If your simulation fits in one GPU's memory, use a single worker for best performance:

```python
# Run with: python my_simulation.py

# Create model and agents
model = MyModel(p_infection=0.2)
for i in range(1000):
    model.create_agent(state=1)

# Connect agents in a chain (agent i to agent i + 1)
for i in range(999):
    model.connect_agents(i, i + 1)

# Setup and run
model.setup(use_gpu=True)
model.simulate(ticks=100, sync_workers_every_n_ticks=1)

# Get results
for agent_id in range(10):
    state = model.get_agent_property_value(agent_id, "state")
    print(f"Agent {agent_id}: state={state}")
```

For simulations that exceed a single GPU's memory, distribute the work across multiple GPUs with one worker per GPU:

```bash
# 4 workers on 4 GPUs (one worker per GPU)
mpirun -n 4 python my_simulation.py
```

Recommendation: One Worker = One GPU

While MPI can run multiple workers on a single GPU, this is not recommended due to:
- MPI communication overhead between workers
- GPU memory contention
- No performance benefit over single-worker execution

For best performance, use one MPI worker per physical GPU. If your simulation fits in one GPU, use a single worker (`python my_simulation.py`). Only use multiple workers when distributing across multiple physical GPUs.

SAGESim is designed for HPC clusters where each compute node has multiple GPUs. The key principle is one MPI rank per GPU.

```bash
#!/bin/bash
#SBATCH -A your_account
#SBATCH -J sagesim_run
#SBATCH -o logs/sagesim_%j.out
#SBATCH -e logs/sagesim_%j.err
#SBATCH -t 00:30:00
#SBATCH -p batch
#SBATCH -N 10

# Load modules
module load PrgEnv-gnu/8.6.0
module load miniforge3/23.11.0-0
module load rocm/5.7.1
module load craype-accel-amd-gfx90a

# Activate environment
source activate your_env_name

# Run simulation (8 GPUs per node)
num_nodes=10
num_mpi_ranks=$((8 * num_nodes))
srun -N${num_nodes} -n${num_mpi_ranks} -c7 \
    --ntasks-per-gpu=1 --gpu-bind=closest \
    python3 -u ./run.py
```

- Match MPI ranks to GPUs: Set `num_ranks = gpus_per_node * num_nodes`.
- Use GPU binding: `--gpu-bind=closest` reduces memory latency.
- Isolate runs: Use job-specific output directories.
- Log management: Include `%j` in log filenames so each log carries the job ID.

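The rank arithmetic and the run-isolation tip above can be sketched in Python. The helpers below are hypothetical and not part of SAGESim. The only external fact they rely on is that Slurm sets the `SLURM_JOB_ID` environment variable inside a job; the `results` base directory and the `local` fallback label are assumptions made for illustration.

```python
import os
from pathlib import Path


def total_mpi_ranks(num_nodes, gpus_per_node):
    """One MPI rank per GPU: ranks = GPUs per node * number of nodes."""
    return gpus_per_node * num_nodes


def job_output_dir(base="results"):
    """Return a per-job output path such as results/job_12345.

    Falls back to <base>/job_local when SLURM_JOB_ID is not set,
    for example when running outside Slurm.
    """
    job_id = os.environ.get("SLURM_JOB_ID", "local")
    return Path(base) / f"job_{job_id}"


# Same numbers as the batch script: 10 nodes with 8 GPUs each.
print(total_mpi_ranks(num_nodes=10, gpus_per_node=8))  # 80

# Path only; nothing is created on disk here.
print(job_output_dir())
```

When every rank needs the directory to exist, calling `job_output_dir().mkdir(parents=True, exist_ok=True)` is safe to repeat across ranks because `exist_ok=True` tolerates an already-created directory.
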
Related documentation:

- Architecture Overview - System design and data flow
- Synchronization and Double Buffering - Race condition prevention
- Network Partitioning - Load balancing for distributed execution
- Runtime Optimizations - Performance tuning