Prerequisites

Before deploying to Google Cloud, ensure you have the following tools installed and configured.

Google Cloud Account

You need a Google Cloud account with billing enabled.

Note

The steps below use gcloud CLI commands, but most of these can also be done through the Google Cloud Console web interface.

  1. Create an account at cloud.google.com if you don't have one
  2. Create a project or use an existing one:

    $ gcloud projects create my-project-id --name="My Project"
    

    Project ID vs. Project Name

    • Project ID (my-project-id): A globally unique identifier used in CLI commands, APIs, and URLs. Cannot be changed after creation.
    • Project Name (My Project): A human-readable display label shown in the Cloud Console. Can be changed anytime.

    Choose a project ID that's meaningful to your team (e.g., mobs-epi-pipeline, epi-modeling-prod).

  3. Link a billing account at Billing Console
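Billing can also be linked from the CLI. A sketch, assuming you first look up your billing account ID (the `XXXXXX-XXXXXX-XXXXXX` value below is a placeholder):

```shell
$ # List billing accounts you have access to (note the ACCOUNT_ID column)
$ gcloud billing accounts list

$ # Link the project to a billing account
$ gcloud billing projects link my-project-id \
    --billing-account=XXXXXX-XXXXXX-XXXXXX
```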

gcloud CLI

The Google Cloud CLI (gcloud) is used for authentication, API management, and infrastructure operations.

See the gcloud CLI overview in the Google Cloud documentation.

Install

macOS (Homebrew):

$ brew install --cask google-cloud-sdk

Or use the official installer.

Linux:

$ curl https://sdk.cloud.google.com | bash
$ exec -l $SHELL

Or follow the official guide.

Windows: install inside your WSL2 Linux distribution using the Linux instructions above.

Configure

Set your default project and authenticate:

$ gcloud config set project my-project-id
$ gcloud auth login
$ gcloud auth application-default login
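To confirm that authentication and the default project took effect, you can check:

```shell
$ # Show the active (and any other authenticated) accounts
$ gcloud auth list

$ # Print the currently configured default project
$ gcloud config get-value project
```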

Enable Required APIs

The pipeline uses several Google Cloud services. Enable them all at once:

$ gcloud services enable \
    compute.googleapis.com \
    batch.googleapis.com \
    workflows.googleapis.com \
    storage.googleapis.com \
    artifactregistry.googleapis.com \
    cloudbuild.googleapis.com \
    logging.googleapis.com \
    monitoring.googleapis.com \
    secretmanager.googleapis.com
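You can verify that the services were enabled; each should appear in the filtered list (shown here for one service as an example):

```shell
$ # List enabled services, filtered to a single API to confirm it's on
$ gcloud services list --enabled --filter="name:batch.googleapis.com"
```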

IAM Permissions

To deploy and run the infrastructure, your user account needs these roles in addition to the Editor role (roles/editor):

  • Project IAM Admin (roles/resourcemanager.projectIamAdmin): manage project-level IAM bindings (Terraform)
  • Secret Manager Admin (roles/secretmanager.admin): manage IAM policies for secrets (Terraform)
  • Service Account Admin (roles/iam.serviceAccountAdmin): manage IAM policies on service accounts (Terraform)
  • Cloud Build Editor (roles/cloudbuild.builds.editor): submit and manage Cloud Build jobs (Docker builds)

Grant these roles to your account:

$ PROJECT_ID="my-project-id"
$ USER_EMAIL="user@example.com"

$ for ROLE in \
    roles/resourcemanager.projectIamAdmin \
    roles/secretmanager.admin \
    roles/iam.serviceAccountAdmin \
    roles/cloudbuild.builds.editor; do
  gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="user:$USER_EMAIL" \
    --role="$ROLE"
done
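To confirm the bindings were applied, you can list the roles attached to your account on the project:

```shell
$ # Show all roles bound to your user on the project
$ gcloud projects get-iam-policy $PROJECT_ID \
    --flatten="bindings[].members" \
    --filter="bindings.members:user:$USER_EMAIL" \
    --format="value(bindings.role)"
```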

Common permission errors:

  • "Error 403: Policy update access denied": missing Project IAM Admin or Secret Manager Admin
  • "Permission 'iam.serviceAccounts.setIamPolicy' denied": missing Service Account Admin
  • "The caller does not have permission" (Cloud Build): missing Cloud Build Editor

Ask your Google Cloud project administrator to grant these roles if needed.

Terraform

Terraform manages the cloud infrastructure (GCS bucket references, Artifact Registry, service accounts, Cloud Workflows, etc.).

See "What is Terraform" in the HashiCorp Developer documentation.

Install

macOS (Homebrew):

$ brew tap hashicorp/tap
$ brew install hashicorp/tap/terraform

Linux: follow the official guide for your distribution. For Ubuntu/Debian:

$ wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
$ sudo apt update && sudo apt install terraform

Verify

$ terraform --version
Terraform v1.5.0+

Requires Terraform 1.5 or higher.

Experiment Repository

The pipeline clones a GitHub repository at runtime to read experiment configuration files.

Repository structure

You need a repository with the following structure. Each folder under experiments/ is an experiment ID (the value you pass to --exp-id):

my-flu-experiment-repo/
├── experiments/
│   └── my-experiment-001/    # <-- this is the experiment ID
│       └── config/
│           ├── basemodel.yaml
│           ├── modelset.yaml
│           └── output.yaml
├── common-data/          # Optional: shared data files
└── functions/            # Optional: custom Python modules

Experiment ID naming

The experiment ID is the folder path under experiments/ and is used in storage paths and logs. It can be a single folder or nested directories. Choose a descriptive name, e.g., smc-rmse-202606-hosp, flu-calibration-v2, or 202606/smc-rmse-hosp.

If you already have an experiment repository, you can skip ahead. Otherwise, follow these steps to create one:

Create the repository

  1. Create a new repository on GitHub
  2. Clone it locally and set up the directory structure:

    $ git clone https://github.com/your-org/my-flu-experiment-repo.git
    $ cd my-flu-experiment-repo
    $ mkdir -p experiments/my-experiment-001/config
    $ mkdir -p common-data
    $ mkdir -p functions
    
  3. Add your experiment configuration files to experiments/my-experiment-001/config/

  4. Commit and push:

    $ git add .
    $ git commit -m "Add initial experiment config"
    $ git push origin main
    

The repository can be public or private (private requires a GitHub PAT).

Important

Experiments must be pushed to the default branch (usually main) before running cloud workflows, since Stage A clones the repository at runtime. To use a different branch, set forecast_repo_ref in your config or pass --forecast-repo-ref when submitting a workflow.

GitHub Personal Access Token

A GitHub PAT is required if you use private repositories (e.g., private experiment data repositories). It is used at runtime by Batch jobs to clone the repository. If all your repositories are public, you can skip this section.

Create a Fine-Grained PAT

  1. Go to GitHub Settings > Developer settings > Personal access tokens > Fine-grained tokens
  2. Click Generate new token
  3. Configure:
    • Token name: epi-pipeline (or similar)
    • Expiration: Set an appropriate expiration
    • Repository access: Select Only select repositories and add your private repositories (e.g., experiment data repository)
    • Repository permissions: Grant Contents > Read-only
  4. Click Generate token and copy it immediately

Save Your Token

GitHub only shows the token once. Copy it before closing the page.

See "Managing your personal access tokens" in the GitHub Docs.

Store in epycloud Secrets

The local secrets file is used for local Docker builds (epycloud build dev, epycloud build local). Cloud builds and cloud runs read the PAT from Secret Manager instead (see below).

Add the PAT to your local epycloud configuration:

$ epycloud config edit-secrets

Add your token:

# secrets.yaml
github:
  pat: "github_pat_xxxxxxxxxxxx"

This file is stored at ~/.config/epymodelingsuite-cloud/secrets.yaml with 0600 permissions (owner-only read/write).
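You can confirm the permissions yourself, assuming the default config location:

```shell
$ # Permissions column should read -rw------- (0600, owner-only)
$ ls -l ~/.config/epymodelingsuite-cloud/secrets.yaml
```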

Store in Google Secret Manager

The PAT must also be stored in Secret Manager for cloud builds and Batch jobs to access:

$ echo -n "github_pat_xxxxxxxxxxxx" | gcloud secrets create github-pat \
    --data-file=- \
    --project=my-project-id

To update an existing secret with a new token version:

$ echo -n "github_pat_new_token" | gcloud secrets versions add github-pat \
    --data-file=- \
    --project=my-project-id

Verify the secret exists:

$ gcloud secrets describe github-pat --project=my-project-id
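To confirm the stored value is actually readable, you can fetch the latest version. Note that this prints the token in plain text, so only do it in a trusted terminal:

```shell
$ # Print the latest secret version (outputs the raw PAT)
$ gcloud secrets versions access latest \
    --secret=github-pat \
    --project=my-project-id
```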

Important

  • The secret name must be github-pat to match the Terraform configuration
  • Never commit the PAT to version control
  • Rotate the token regularly and set an appropriate expiration date

GCS Bucket (Optional)

The pipeline stores artifacts and results in a Google Cloud Storage bucket. Terraform references an existing bucket rather than creating one, so if you don't already have a bucket, create one before deploying infrastructure.

$ gsutil mb -p my-project-id -l us-central1 gs://my-bucket-name/

Or create one in the Cloud Storage Console.
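If you prefer the newer gcloud-native storage commands over the legacy gsutil, the equivalent is:

```shell
$ # Create the bucket with gcloud storage (equivalent to the gsutil mb command)
$ gcloud storage buckets create gs://my-bucket-name \
    --project=my-project-id \
    --location=us-central1
```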

Tip

If you already have a GCS bucket you'd like to use, you can skip this step and provide its name during Setup.

Docker

Docker is required for building container images (both local development and pushing to Artifact Registry).

We recommend OrbStack for a lightweight, fast Docker engine on macOS.

See orbstack.dev.

Install via Homebrew or download the installer:

$ brew install orbstack

After installation, open OrbStack once to complete setup. It runs the Docker engine in the background.

Note

Docker Desktop also works if you already have it installed, though it is generally heavier and slower than OrbStack on macOS.

Install Docker Engine following the official guide for your distribution. The packages below require Docker's apt repository to be configured first (covered in the guide).

$ # Ubuntu/Debian (after adding Docker's apt repository)
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin

Windows requires WSL2 with a Linux distribution. All commands in this guide should be run inside WSL2.

See the "Install WSL" guide on learn.microsoft.com.
  1. Install WSL2 and Ubuntu (from PowerShell as Administrator):

    wsl --install
    

  2. Install Docker Engine inside WSL2 following the Linux instructions, or install Docker Desktop for Windows with WSL2 backend enabled.

Verify Docker is running:

$ docker info
$ docker compose version
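As a quick end-to-end check that the engine can pull and run containers:

```shell
$ # Pulls a tiny test image and runs it; prints "Hello from Docker!" on success
$ docker run --rm hello-world
```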

epycloud CLI

If you haven't installed epycloud yet, follow the Installation Guide.

Verify it's available:

$ epycloud --version

Checklist

Before proceeding to Setup, confirm:

  • Google Cloud account with billing enabled
  • gcloud CLI installed and authenticated
  • Required APIs enabled
  • IAM permissions granted
  • Terraform 1.5+ installed
  • GitHub PAT created and stored (only if using private repositories)
  • GCS bucket created or existing bucket identified
  • Experiment repository set up with configurations pushed
  • Docker installed and running
  • epycloud CLI installed
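The tool items on the checklist can be spot-checked with a small script. This is an illustrative sketch (not part of epycloud); it only verifies that each CLI is on your PATH, not that it is authenticated or the right version:

```shell
#!/bin/sh
# Report whether each required CLI is installed (PATH check only).
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
  fi
}

for cmd in gcloud terraform docker epycloud; do
  check_cmd "$cmd"
done
```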