Deployment Flow

Last updated: 2026-03-27

Overview

Code flows from local development to production via:

git push main -> GitHub Actions CI -> Docker build -> ECR push -> ECS deploy -> smoke test

There is one environment (see docs/admin/ops/environment-strategy.md). All merges to main trigger the full pipeline.

CI/CD Pipeline (`.github/workflows/ci.yml`)

Trigger

Push to main: runs full pipeline including deploy
Pull request to main: runs lint, build, security scan, and tests (no deploy)

Jobs

1. `lint-and-build` (all pushes and PRs)

Runs in parallel with security-scan. Concurrency group cancels superseded runs.

Checkout code
Setup Node.js (version from .nvmrc)
npm ci (falls back to npm install)
npm run lint
npm run type-check
npm run build

2. `security-scan` (all pushes and PRs)

Runs Semgrep with rulesets: p/typescript, p/security-audit, p/secrets, p/eslint-plugin-security. Fails on findings (--error).
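
To reproduce the scan locally before pushing, something like the following sketch may help. It assumes Semgrep is installed and that CI invokes it with plain `--config` flags; the exact invocation in ci.yml may differ. `run_semgrep` and the `SEMGREP_BIN` override are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: run Semgrep with the four rulesets listed above.
# SEMGREP_BIN is an illustrative override (not part of ci.yml) that lets you
# point at a different binary.
run_semgrep() {
  set --   # start from an empty argument list (local to this function)
  for ruleset in p/typescript p/security-audit p/secrets p/eslint-plugin-security; do
    set -- "$@" --config "$ruleset"
  done
  # --error makes semgrep exit non-zero when there are findings.
  ${SEMGREP_BIN:-semgrep} scan "$@" --error .
}

# Usage, from the repository root:
#   run_semgrep
```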

3. `test` (all pushes and PRs)

Depends on lint-and-build. Runs npm test -- --coverage. The 70% coverage threshold check is currently warn-only: falling below the threshold logs a warning but does not fail the job.
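
A minimal sketch of what a warn-only threshold check can look like. It assumes the coverage percentage has already been extracted from the test output (for example from a JSON summary report). The function name `check_coverage` is hypothetical, and the real step in ci.yml may be implemented differently.

```shell
# Hypothetical helper: compare a coverage percentage against a threshold.
# Prints a warning when coverage is below the threshold but always returns 0,
# so the calling job is never failed by this check.
check_coverage() {
  pct="$1"         # measured coverage, e.g. 68.4
  threshold="$2"   # required minimum, e.g. 70
  # awk does the floating-point comparison; exit status 0 means "below".
  if awk -v p="$pct" -v t="$threshold" 'BEGIN { exit !(p < t) }'; then
    echo "WARNING: coverage ${pct}% is below the ${threshold}% threshold"
  else
    echo "OK: coverage ${pct}% meets the ${threshold}% threshold"
  fi
  return 0
}

# Usage:
#   check_coverage 68.4 70
```

Turning this into an enforcing check would only require returning a non-zero status in the warning branch.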

4. `e2e-tests` (main branch push only)

Depends on lint-and-build + test. Spins up a pgvector/pgvector:pg16 service container, bootstraps the CI database (scripts/ci-bootstrap-db.sql), runs migrations (npm run db:migrate), builds the app, and runs Playwright E2E tests against the standalone build. Uses real Cognito credentials from GitHub secrets.

5. `deploy` (main branch push only)

Depends on lint-and-build + test + security-scan. Has its own concurrency group with cancel-in-progress: false (in-progress deploys are never cancelled).

Steps:

Change detection: Compares HEAD~1..HEAD. If only docs/tests/config changed, skips the deploy entirely.
AWS auth: OIDC federation via aws-actions/configure-aws-credentials@v4. Role ARN stored in AWS_ROLE_ARN GitHub secret. No long-lived IAM keys.
ECR login: aws-actions/amazon-ecr-login@v2
Docker build + push: Multi-stage build (see Dockerfile):
- Stage 1 (deps): npm ci in Alpine Node 24.12
- Stage 2 (builder): Copy deps, copy source, npm run build. NEXT_PUBLIC_* vars passed as build args (baked into client JS). Server-side secrets are NOT build args.
- Stage 3 (runner): Alpine Node 24.12, copies standalone output only, runs as non-root nextjs user.
- Image tagged with git SHA: <ecr-registry>/invapp-dev-app:<sha>
Register task definition: Fetches current ECS task definition, updates container image to new SHA tag, upserts environment variables (quality model config), registers new revision.
Update ECS service: Points service at new task definition revision, forces new deployment. If desired count is 0, sets it to 1.
Wait for stabilization: aws ecs wait services-stable blocks until the new task is healthy.
Smoke test: Curls https://app.nquir.ai/api/health and checks for HTTP 200.
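
The change detection step can be sketched roughly as below. The path patterns are assumptions chosen for illustration; the actual list of "docs/tests/config" paths lives in ci.yml and may differ. `needs_deploy` is a hypothetical name.

```shell
# Hypothetical classifier: reads changed paths from stdin, one per line.
# Returns 0 (deploy needed) if any path falls outside the docs/tests/config
# patterns, 1 if every path matches them. Patterns are illustrative only.
needs_deploy() {
  while IFS= read -r path; do
    case "$path" in
      "") ;;                                   # ignore blank lines
      docs/*|*.md) ;;                          # documentation
      tests/*|e2e/*|*.test.ts|*.test.tsx) ;;   # tests
      .github/*|.eslintrc*|.prettierrc*) ;;    # tooling config
      *) return 0 ;;                           # anything else affects the app
    esac
  done
  return 1
}

# Usage in a workflow step (the checkout must contain at least two commits):
#   if git diff --name-only HEAD~1 HEAD | needs_deploy; then
#     echo "deploy=true"
#   else
#     echo "deploy=false"
#   fi
```

Because only HEAD~1..HEAD is compared, a push containing several commits is classified by its last commit alone, which is worth keeping in mind when a docs-only commit lands on top of code changes.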

Key CI Environment Variables

| Variable | Source | Purpose |
| --- | --- | --- |
| `AWS_ROLE_ARN` | GitHub secret | OIDC role for AWS access |
| `ECR_REPOSITORY` | Hardcoded `invapp-dev-app` | ECR repo name |
| `ECS_CLUSTER` | Hardcoded `invapp-dev-cluster` | ECS cluster name |
| `ECS_SERVICE` | Hardcoded `invapp-dev-app-service` | ECS service name |
| `COGNITO_USER_POOL_ID` | GitHub secret | Auth config (baked into client build) |
| `COGNITO_CLIENT_ID` | GitHub secret | Auth config (baked into client build) |

Other Workflows

eval-check.yml: Evaluation/quality checks (separate from deploy)
retention-cron.yml: Scheduled retention/cleanup tasks

Infrastructure Architecture

Route53 (app.nquir.ai)
  -> CloudFront (CDN + WAF + Basic Auth gate)
    -> ALB (HTTPS termination, health checks)
      -> ECS Fargate (private subnet, 1 task)
        -> RDS PostgreSQL 15 (private subnet, encrypted, pgvector)
        -> ElastiCache Redis (private subnet, TLS + auth token)
        -> S3 (evidence files, signed URLs)
        -> Bedrock (Claude Sonnet 4, Haiku 4.5, Titan embeddings, Cohere rerank)
        -> Cognito (auth, MFA enabled)

All compute and data resources are in private subnets. Outbound traffic (Bedrock, external APIs) goes through NAT Gateway.

Database Access

Via SSM Port Forwarding (Bastion)

The bastion is a t3.micro EC2 instance in a private subnet with SSM agent. No SSH keys, no inbound security group rules. Access is via AWS Systems Manager only.

Prerequisites:

AWS CLI v2
Session Manager plugin installed (brew install --cask session-manager-plugin on macOS)
IAM permissions for ssm:StartSession

Get the bastion instance ID:

# From terraform output
cd infrastructure/terraform/environments/dev
terraform output bastion_instance_id

# Or find it in AWS console: EC2 -> Instances -> invapp-dev-bastion

Start a port forwarding session to RDS:

aws ssm start-session \
  --target <bastion-instance-id> \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{
    "host": ["<rds-endpoint>"],
    "portNumber": ["5432"],
    "localPortNumber": ["5433"]
  }'

This forwards localhost:5433 to the RDS instance on port 5432. Keep this terminal open.

Connect with psql:

psql -h localhost -p 5433 -U app_admin -d investigation_app

Get the RDS endpoint (the value for <rds-endpoint> in the port forwarding command):

cd infrastructure/terraform/environments/dev
terraform output database_endpoint
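
The lookup and tunnel steps can be combined into one helper, sketched here under a few assumptions: Terraform 0.14 or later (for `-chdir` and `output -raw`), and that `database_endpoint` may come back as `host:port`, in which case only the host is kept. The function names are hypothetical.

```shell
# Hypothetical helper: build the --parameters JSON for
# AWS-StartPortForwardingSessionToRemoteHost.
ssm_forward_params() {
  host="$1"; remote_port="$2"; local_port="$3"
  printf '{"host":["%s"],"portNumber":["%s"],"localPortNumber":["%s"]}' \
    "$host" "$remote_port" "$local_port"
}

# Hypothetical wrapper: read both Terraform outputs, then open the tunnel.
open_db_tunnel() {
  tf_dir="infrastructure/terraform/environments/dev"
  bastion_id="$(terraform -chdir="$tf_dir" output -raw bastion_instance_id)" || return 1
  rds_host="$(terraform -chdir="$tf_dir" output -raw database_endpoint)" || return 1
  # Assumption: strip a trailing ":port" if the output includes one.
  rds_host="${rds_host%%:*}"
  aws ssm start-session \
    --target "$bastion_id" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "$(ssm_forward_params "$rds_host" 5432 5433)"
}

# Usage, from the repository root (blocks until the session is closed):
#   open_db_tunnel
```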

Running Migrations

Migrations use the custom runner at scripts/run-migration.ts, NOT the Supabase CLI. The runner connects via pg and tracks applied migrations in a _migrations table.
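
To illustrate the tracking idea only: a runner of this kind typically compares the migration files on disk with the names recorded as applied and runs the difference in order. The sketch below is not the logic of scripts/run-migration.ts. It assumes applied migrations are identified by file name, and `pending_migrations` is a hypothetical helper.

```shell
# Hypothetical helper: list migration files that are not yet recorded as applied.
#   $1 = file containing one applied migration file name per line
#   $2 = directory containing the *.sql migration files
# Output is in glob (lexicographic) order, which matches timestamp-prefixed names.
pending_migrations() {
  applied_file="$1"; dir="$2"
  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue                 # directory has no .sql files
    name="$(basename "$f")"
    # -x whole line, -F fixed string: exact file-name match only.
    grep -qxF -- "$name" "$applied_file" || echo "$name"
  done
}

# Usage (the column name "name" is an assumption about the _migrations schema):
#   psql -h localhost -p 5433 -U app_admin -d investigation_app \
#     -Atc 'SELECT name FROM _migrations' > /tmp/applied.txt
#   pending_migrations /tmp/applied.txt supabase/migrations
```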

Locally (against local DB):

npm run db:migrate            # Run all pending
npm run db:migrate:run <file> # Run specific file

Against production RDS (via bastion tunnel):

Start the SSM port forwarding session (see above)
Set environment variables pointing to the tunnel:

DB_HOST=localhost DB_PORT=5433 DB_NAME=investigation_app \
  DB_USER=app_admin DB_PASSWORD=<password> DB_SSL=true \
  npm run db:migrate

Or for a specific migration:

DB_HOST=localhost DB_PORT=5433 DB_NAME=investigation_app \
  DB_USER=app_admin DB_PASSWORD=<password> DB_SSL=true \
  npm run db:migrate:run supabase/migrations/20260327000000_example.sql

Creating a new migration:

touch supabase/migrations/$(date +%Y%m%d%H%M%S)_migration_name.sql
# Edit the file, then run it

ECS Exec (Container Shell)

For debugging the running container:

aws ecs execute-command \
  --cluster invapp-dev-cluster \
  --task <task-id> \
  --container invapp-dev-app \
  --interactive \
  --command "/bin/sh"

enable_execute_command = true is set in the ECS module.
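
The <task-id> placeholder can be looked up from the service rather than copied from the console. A minimal sketch, assuming the cluster and service names from the CI variables table and that the first running task is the one you want; the function names are hypothetical.

```shell
# Extract the task ID (the last path segment) from a task ARN.
task_id_from_arn() {
  echo "${1##*/}"
}

# Hypothetical wrapper: find the first running task and open a shell in it.
ecs_shell() {
  cluster="invapp-dev-cluster"
  service="invapp-dev-app-service"
  task_arn="$(aws ecs list-tasks \
    --cluster "$cluster" \
    --service-name "$service" \
    --desired-status RUNNING \
    --query 'taskArns[0]' \
    --output text)" || return 1
  # With --output text, an empty result is printed as "None".
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for $service" >&2
    return 1
  fi
  aws ecs execute-command \
    --cluster "$cluster" \
    --task "$(task_id_from_arn "$task_arn")" \
    --container invapp-dev-app \
    --interactive \
    --command "/bin/sh"
}

# Usage:
#   ecs_shell
```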

Secrets Management

Server-side secrets are stored in AWS Secrets Manager and injected into ECS tasks at runtime via the task definition's secrets block. They are NOT baked into the Docker image.

Secrets managed:

DB_PASSWORD
REDIS_AUTH_TOKEN
STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET
RESEND_API_KEY
NEXT_PUBLIC_SENTRY_DSN
ANTHROPIC_API_KEY
CRON_SECRET

To update a secret, modify it in AWS Secrets Manager and force a new ECS deployment (the task definition references the secret by name, so the new value is pulled on next task start).
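
The two steps can be scripted along these lines. This is a sketch: the secret ID passed in is whatever name the task definition references (not listed in this document), and `run`, `update_secret`, and `DRY_RUN` are hypothetical names. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# update_secret SECRET_ID VALUE_FILE
# SECRET_ID: the Secrets Manager name or ARN (supply your own).
# VALUE_FILE: a local file holding the new secret value.
update_secret() {
  secret_id="$1"; value_file="$2"
  # file:// keeps the value out of shell history and process listings.
  run aws secretsmanager put-secret-value \
    --secret-id "$secret_id" \
    --secret-string "file://$value_file" || return 1
  # New tasks read the secret at startup, so force a fresh deployment.
  run aws ecs update-service \
    --cluster invapp-dev-cluster \
    --service invapp-dev-app-service \
    --force-new-deployment
}

# Usage:
#   DRY_RUN=1; update_secret <secret-id> ./new-value.txt
```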

Rollback

There is no automated rollback. To roll back:

Identify the last known-good git SHA
Update the ECS service to use the task definition revision that used that SHA's image
Or: revert the commit on main and let CI redeploy

ECR retains all pushed images (tagged by git SHA), so any previous version can be deployed without rebuilding.
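
The first rollback path (pointing the service at an older task definition revision) might be scripted as below. This is a sketch under stated assumptions: the task definition family name is not given in this document and must be supplied, and the app container is assumed to be the first container definition. All function names are hypothetical.

```shell
# Return 0 if an image URI is tagged with the given git SHA.
image_has_sha() {
  image="$1"; sha="$2"
  [ "${image##*:}" = "$sha" ]
}

# find_revision_for_sha FAMILY SHA
# Prints the newest task definition ARN whose first container uses the image
# tagged SHA. Makes one describe call per revision, so it can be slow.
find_revision_for_sha() {
  family="$1"; sha="$2"
  for arn in $(aws ecs list-task-definitions \
      --family-prefix "$family" --sort DESC \
      --query 'taskDefinitionArns[]' --output text); do
    image="$(aws ecs describe-task-definition \
      --task-definition "$arn" \
      --query 'taskDefinition.containerDefinitions[0].image' \
      --output text)" || return 1
    if image_has_sha "$image" "$sha"; then
      echo "$arn"
      return 0
    fi
  done
  return 1
}

# rollback_to_sha FAMILY SHA
rollback_to_sha() {
  arn="$(find_revision_for_sha "$1" "$2")" || {
    echo "no task definition revision found for $2" >&2
    return 1
  }
  aws ecs update-service \
    --cluster invapp-dev-cluster \
    --service invapp-dev-app-service \
    --task-definition "$arn"
}

# Usage:
#   rollback_to_sha <task-definition-family> <known-good-sha>
```

Note that the next push to main registers a new revision and redeploys, so a manual rollback of this kind holds only until the following pipeline run.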
