Ecosystem Components¶
To support scalable epidemic modeling and research, the ecosystem is split across four repositories, each of which can evolve independently:
| Component | Role |
|---|---|
| epydemix | Epidemic modeling engine in Python with support for model calibration using Approximate Bayesian Computation |
| epymodelingsuite | YAML-configured modeling suite for routine epidemic forecasting with support for interventions like school closures and vaccines. Uses epydemix as engine. |
| epymodelingsuite-cloud | Cloud infrastructure and epycloud CLI for running parallel workloads on Google Cloud with local development support |
| Experiment data repository | YAML experiment configurations, shared surveillance data, and custom functions. Typically one per project (e.g. flu, COVID-19, RSV). |
1. epydemix (Engine)¶
epistorm/epydemix: epydemix, the ABC of epidemics.

General-purpose epidemic modeling engine that provides the core compartmental model framework.
What It Provides:
- Compartmental Models: Framework for building and simulating epidemic models (SIR, SEIR, etc.)
- Calibration: Model calibration using Approximate Bayesian Computation (ABC)
- Data Integration: Built-in support for population and mobility data
How It's Used: Installed as a dependency of epymodelingsuite.
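To make the calibration idea concrete, here is a minimal sketch of rejection-based Approximate Bayesian Computation applied to a toy SIR model. It deliberately uses only the Python standard library and does not call the epydemix API; the function names, the uniform prior range, the tolerance, and the deterministic discrete-time SIR update are all assumptions chosen for illustration.

```python
import random


def simulate_sir(beta, gamma, s0=990, i0=10, r0=0, days=30):
    """Toy deterministic, discrete-time SIR model (not epydemix).

    Returns the number of infectious individuals at the end of each day.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    infectious = []
    for _ in range(days):
        new_infections = min(s, beta * s * i / n)  # cannot exceed susceptibles
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infectious.append(i)
    return infectious


def distance(a, b):
    """Root-mean-square error between two equal-length series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def abc_rejection(observed, n_samples=2000, tolerance=25.0, gamma=0.1, seed=0):
    """Rejection ABC: keep prior draws whose simulation is close to the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        beta = rng.uniform(0.1, 0.6)  # hypothetical uniform prior on beta
        if distance(simulate_sir(beta, gamma), observed) <= tolerance:
            accepted.append(beta)
    return accepted


# Synthetic "observed" data generated with a known transmission rate.
observed = simulate_sir(beta=0.3, gamma=0.1)
posterior = abc_rejection(observed)
```

The accepted values in `posterior` approximate the posterior distribution of `beta`; tightening `tolerance` narrows it at the cost of accepting fewer samples.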
2. epymodelingsuite (Modeling Suite)¶
mobs-lab/epymodelingsuite: Modeling suite for infectious disease forecasting and scenario modeling.

YAML-configured modeling suite for routine epidemic forecasting with support for interventions like school closures and vaccines.
What It Provides:
- Configuration System: YAML-based experiment definitions with Pydantic validation
- Model Building: Builders for compartmental models, parameter sampling, scenario generation
- Execution Engine: Dispatcher for workload generation (dispatch_builder), simulation/calibration execution (dispatch_runner), and result aggregation (dispatch_output_generator)
Key Dependencies: epydemix (core modeling library), pandas, numpy, scipy, pydantic, PyYAML
How It's Used: Installed into the Docker image during build from the GitHub repository. The repository and branch/tag are configurable, allowing version pinning for reproducible runs.
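The build, run, and aggregate split described under Execution Engine can be sketched as a small fan-out/fan-in pipeline. The functions below are hypothetical stand-ins written for this example: they are not the real `dispatch_builder`, `dispatch_runner`, or `dispatch_output_generator`, whose signatures are not shown here, and the config keys and toy "simulation" are assumptions.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def build_tasks(config):
    """Builder stand-in: expand one experiment config into N independent tasks."""
    return [
        {"task_id": k, "beta": config["beta"], "seed": config["seed"] + k}
        for k in range(config["n_tasks"])
    ]


def run_task(task):
    """Runner stand-in: process a single task without any shared state."""
    rng = random.Random(task["seed"])
    # Toy "simulation": a noisy outcome that scales with beta.
    final_size = 1000 * task["beta"] + rng.gauss(0, 5)
    return {"task_id": task["task_id"], "final_size": final_size}


def aggregate(results):
    """Output stand-in: combine every task result into one summary."""
    sizes = [r["final_size"] for r in results]
    return {"n_tasks": len(results), "mean_final_size": statistics.mean(sizes)}


config = {"n_tasks": 4, "beta": 0.3, "seed": 42}  # hypothetical config keys
tasks = build_tasks(config)

# Tasks are independent, so they can run in parallel; map() keeps input order.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_task, tasks))

summary = aggregate(results)
```

Because each task carries everything it needs (including its own seed), the same task can be rerun in isolation and should produce the same result, which is what makes this shape a good fit for parallel batch execution.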
3. epymodelingsuite-cloud (Infrastructure)¶
mobs-lab/epymodelingsuite-cloud: Google Cloud pipeline for epymodelingsuite.

Provides serverless cloud infrastructure for running epymodelingsuite workflows as parallel workloads on Google Cloud. This is the current repository.
What It Provides:
- epycloud CLI: Command-line tool for managing the entire pipeline lifecycle (configuration, building, running, monitoring)
- Docker Images: Multi-stage Dockerfile packaging epymodelingsuite, all dependencies, and pipeline scripts
- Pipeline Scripts: Entry points for each stage (main_builder.py, main_runner.py, main_output.py) plus the storage abstraction layer
- Infrastructure as Code: Terraform modules for GCS, Artifact Registry, Cloud Workflows, Cloud Batch, IAM, and monitoring
- Configuration Management: Hierarchical config system with base config, secrets, environment overrides, and profile settings
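A hierarchical config system of this kind is often implemented as a recursive merge of layered dictionaries. The sketch below shows one way that could look; it is not the epycloud implementation. The precedence order (base, then secrets, then environment, then profile, with later layers winning) and every key name are assumptions made for the example.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where values in `override` win over `base`.

    Nested dicts are merged key by key; any other value is replaced whole.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(*layers):
    """Merge layers left to right, so later layers take precedence."""
    resolved = {}
    for layer in layers:
        resolved = deep_merge(resolved, layer)
    return resolved


# Hypothetical layers; real key names may differ.
base = {"cloud": {"region": "us-central1", "max_parallelism": 100}, "image_tag": "latest"}
secrets = {"github": {"token": "<redacted>"}}
environment = {"cloud": {"max_parallelism": 10}}
profile = {"image_tag": "flu-2024"}

config = resolve_config(base, secrets, environment, profile)
```

Here the environment layer lowers `max_parallelism` without touching `region`, and the profile layer replaces `image_tag`; the input dictionaries are left unmodified.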
4. Experiment Data Repository¶
Stores experiment configurations and input data separately from code, enabling version control of experiments and data sharing among researchers.
What It Provides:
- Experiment Configurations: YAML files defining modeling experiments (basemodel.yaml, modelset.yaml, output.yaml)
- Common Data: Shared datasets (surveillance data, population demographics, mobility data)
- Custom Functions: User-defined Python modules for experiment-specific modeling logic
How It's Used: In cloud mode, the builder stage clones the repository at runtime (using a GitHub PAT if the repository is private). In local mode, users copy experiment configurations to ./local/forecast/, which is mounted into the containers.
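The two modes can be pictured as a small decision function that either prepares a `git clone` command or points at the local directory. This is a sketch under stated assumptions, not the pipeline's actual builder code: the function names, the `/tmp/forecast` destination, the `main` default ref, and the example repository name are hypothetical, and the token-in-URL form is just one common way to authenticate HTTPS clones against GitHub.

```python
def clone_command(repo, dest, ref="main", token=None):
    """Build a shallow `git clone` command for a GitHub repository.

    `repo` is "owner/name". If a token is given, it is embedded in the
    HTTPS URL, which is one common (but not the only) way to use a PAT.
    """
    auth = f"x-access-token:{token}@" if token else ""
    url = f"https://{auth}github.com/{repo}.git"
    return ["git", "clone", "--depth", "1", "--branch", ref, url, dest]


def resolve_experiment_source(mode, repo=None, token=None, local_dir="./local/forecast/"):
    """Decide how the builder obtains experiment configurations."""
    if mode == "cloud":
        if not repo:
            raise ValueError("cloud mode requires a repository")
        # Hypothetical destination path inside the container.
        return {"action": "clone", "command": clone_command(repo, "/tmp/forecast", token=token)}
    if mode == "local":
        # Nothing to fetch: the directory is already mounted into the container.
        return {"action": "mount", "path": local_dir}
    raise ValueError(f"unknown mode: {mode!r}")


cloud = resolve_experiment_source("cloud", repo="example-org/flu-forecast", token="PAT")
local = resolve_experiment_source("local")
```

In practice the command list would be handed to something like `subprocess.run(cloud["command"], check=True)`. Note that a token embedded in a URL can end up in process listings and logs, so a real implementation may prefer a credential helper.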
How Components Work Together¶
sequenceDiagram
participant User
participant CLI as epycloud CLI
participant Docker as Docker Image
participant EPY as epymodelingsuite
participant FORECAST as Experiment Repo
User->>CLI: epycloud run workflow --exp-id test
CLI->>Docker: Start builder container
Docker->>FORECAST: git clone (cloud) or mount (local)
Docker->>FORECAST: Load config YAMLs
Docker->>EPY: dispatch_builder(config)
EPY->>Docker: Generate N input files
Note over Docker,EPY: Stage B (N parallel tasks)
Docker->>EPY: dispatch_runner(input_file)
EPY->>Docker: Return result file
Note over Docker,EPY: Stage C (aggregation)
Docker->>EPY: dispatch_output_generator(all_results)
EPY->>Docker: Generate CSV outputs
Docker->>User: Results available in GCS/local
Next Steps¶
- Pipeline Stages: How stages use these components
- Docker Images: How components are packaged
- User Guide: Day-to-day usage