fiddy/docs/09_DEPLOYMENT_EXECUTION_PLAYBOOK.md
Nico a0514f0823
docs: switch active deployment runbooks from dokploy to ssh compose
2026-02-22 01:51:44 -08:00

# Deployment Execution Playbook (Hands-On Checkpoints)
Purpose: keep implementation work prepared in-repo, and call for operator actions only when local infrastructure access is required.
## Status Icon Legend
Use these in execution updates for fast scanning:
- `🔄` in progress
- `✅` completed
- `🧪` test/lint/verification result
- `📄` documentation update
- `🗄️` database or migration change
- `🚀` deploy/release step
- `⚠️` risk, blocker, or manual operator action needed
- `❌` failed command or unsuccessful attempt
- `ℹ️` informational context
- `🧭` recommendation or next-step option
## Phase 0: Preflight (No Infra Changes)
- [ ] `npm run lint`
- [ ] `npm test`
- [ ] `npm run build`
- [ ] Confirm docs are up to date:
  - [ ] `docs/public-launch-runbook.md`
  - [ ] `docs/07_PUBLIC_LAUNCH_CHECKLIST.md`
  - [ ] `docs/08_NGINX_PROXY_MANAGER_SETUP.md`
  - [ ] `docs/06_SECURITY_REVIEW.md`
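
The three preflight commands above can be chained so that a failure stops the run before any later step. This is a minimal sketch, not a script that exists in the repo: the helper names `step` and `run_preflight` are hypothetical, and it assumes the `lint`, `test`, and `build` npm scripts are defined as the checklist implies.

```shell
#!/usr/bin/env bash
# Hypothetical preflight wrapper; `step` and `run_preflight` are illustrative names.

# Run one command and report it using the playbook's status icons.
step() {
  echo "🔄 $*"
  if "$@"; then
    echo "✅ $*"
  else
    echo "❌ $*" >&2
    return 1
  fi
}

# Run lint, tests, and build in order; stop at the first failure.
run_preflight() {
  step npm run lint && step npm test && step npm run build
}
```

Sourcing the file and calling `run_preflight` from the repo root returns non-zero as soon as one step fails, so the remaining steps are skipped.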
## Phase 1: Registry + SSH Compose Wiring (Operator Needed)
Hands-on checkpoints:
1. Create/verify secrets in Gitea:
   - `REGISTRY_USER`
   - `REGISTRY_PASS`
   - `DEPLOY_KEY`
   - `DEPLOY_HOST`
   - `DEPLOY_USER`
   - `DEPLOY_HEALTHCHECK_URL`
2. Prepare deploy host for SSH Compose:
   - install Docker Engine + Compose plugin
   - create `/opt/fiddy/.env` with production variables
   - run `docker login git.nicosaya.com` as deploy user
3. Confirm production compose contract:
   - web image source: `git.nicosaya.com/nalalangan/fiddy/web`
   - scheduler image source: `git.nicosaya.com/nalalangan/fiddy/scheduler`
   - web publishes `3010:3000`
   - scheduler has no public port
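
A workflow step can fail fast when one of the secrets from checkpoint 1 was never created. The sketch below assumes the secrets are exposed to the step as environment variables of the same names; the function name `missing_vars` is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report every named variable that is unset or empty.
# Assumes the Gitea secrets are mapped to same-named environment variables.
missing_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "⚠️ missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `missing_vars REGISTRY_USER REGISTRY_PASS DEPLOY_KEY DEPLOY_HOST DEPLOY_USER DEPLOY_HEALTHCHECK_URL || exit 1` aborts the step and lists all missing names at once rather than failing on the first one.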

Validation:
- [ ] Push-to-main triggers `.gitea/workflows/deploy-ssh-compose.yml`
- [ ] SSH deploy updates both web and scheduler containers
- [ ] Deploy guard confirms web and scheduler are running
- [ ] Health gate completes via `scripts/wait-for-health.sh`
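
The deploy guard and health gate can be pictured as two small checks. This is a sketch under stated assumptions, not the contents of `scripts/wait-for-health.sh` or the workflow: the function names are hypothetical, the compose service names are assumed to be `web` and `scheduler`, and `docker compose` is assumed to run in the directory that holds the production compose file.

```shell
#!/usr/bin/env bash
# Hypothetical deploy guard: succeed only if every named compose service is running.
services_running() {
  local running svc
  running="$(docker compose ps --services --status running)" || return 2
  for svc in "$@"; do
    if ! grep -qxF "$svc" <<<"$running"; then
      echo "❌ ${svc} is not running" >&2
      return 1
    fi
  done
}

# Hypothetical health gate: poll a URL until it answers with a non-error status.
# Arguments: url, max attempts (default 30), delay in seconds (default 2).
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    # -f makes curl exit non-zero on HTTP 4xx/5xx responses.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "✅ healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "❌ not healthy after ${attempts} attempts" >&2
  return 1
}
```

A deploy step could then run `services_running web scheduler && wait_for_health "$DEPLOY_HEALTHCHECK_URL"` and treat any non-zero exit as a failed deploy.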
## Phase 2: NPM Edge Setup (Operator Needed)
Use `docs/08_NGINX_PROXY_MANAGER_SETUP.md`.
Execution order helper: `docs/10_NPM_HANDS_ON_RUNSHEET.md`.
Hands-on checkpoints:
1. Configure the Proxy Host for the Fiddy domain to forward to the internal app IP:port.
2. Proxy Host Advanced:
   - `docker/nginx/npm/proxy-host-advanced.conf.example`
3. Custom Location `/`:
   - `docker/nginx/npm/location-root-advanced.conf.example`
4. Custom auth/write locations:
   - `docker/nginx/npm/location-auth-advanced.conf.example`
   - `docker/nginx/npm/location-write-advanced.conf.example`
5. Global NPM `http` config includes:
   - `docker/nginx/npm/http_top.conf.example`

Validation:
- [ ] `scripts/smoke-public-launch.sh https://<domain>` passes
- [ ] Response header `X-Request-Id` present
- [ ] Response body includes `request_id`
- [ ] Rate limits are active under burst tests
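
The request-ID and rate-limit checks above can be spot-checked by hand with `curl`. The sketch below is illustrative and is not what `scripts/smoke-public-launch.sh` does: both function names are hypothetical, the URL to probe is left to the operator, and the status code returned by the limiter depends on the NPM config (nginx `limit_req` answers 503 unless `limit_req_status` sets another code such as 429).

```shell
#!/usr/bin/env bash
# Hypothetical check: the response carries an X-Request-Id header and the
# body mentions request_id. Returns 2 if the request itself fails.
check_request_id() {
  local url="$1" headers body rc=0
  headers="$(mktemp)"
  body="$(mktemp)"
  if ! curl -sS -D "$headers" -o "$body" "$url"; then
    rc=2
  else
    grep -qi '^x-request-id:' "$headers" || { echo "❌ X-Request-Id header missing" >&2; rc=1; }
    grep -q 'request_id' "$body" || { echo "❌ request_id missing from body" >&2; rc=1; }
  fi
  rm -f "$headers" "$body"
  return "$rc"
}

# Hypothetical burst test: send N sequential requests (default 30) and print
# how many times each HTTP status code came back.
burst_status_counts() {
  local url="$1" n="${2:-30}" i
  for ((i = 0; i < n; i++)); do
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
  done | sort | uniq -c
}
```

If `burst_status_counts https://<domain>/ 50` prints only `200` lines, either the limit is higher than the burst or the limiter is not active on that location; sequential requests may also be too slow to trip a per-second limit.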
## Phase 3: Host Security Baseline (Operator Needed)
Hands-on checkpoints:
1. Firewall baseline:
   - dry run: `SSH_ALLOW_CIDR=<cidr> DRY_RUN=1 scripts/harden-host-ufw.sh`
   - apply: `SSH_ALLOW_CIDR=<cidr> DRY_RUN=0 sudo scripts/harden-host-ufw.sh`
2. Security snapshot:
   - `scripts/check-host-security.sh`
3. Auto-ban tooling:
   - Fail2ban and/or CrowdSec using `docker/security/*`

Validation:
- [ ] Only expected public ports exposed (`80/443`)
- [ ] SSH restricted by allowlist/VPN
- [ ] Ban tooling sees nginx logs and can ban test offender
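
As a rough aid for the port check, listening sockets on the host can be compared against an allowlist. This sketch is independent of `scripts/check-host-security.sh`, whose behavior is not described here; the function name `unexpected_listeners` is hypothetical. It only reports non-loopback listeners, which the firewall may still block, so treat its output as a list of things to verify from outside the host.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `ss -H -tln` output on stdin and print every
# non-loopback listening address whose port is not in the allowlist.
unexpected_listeners() {
  local allowed=" $* " addr port host
  # Field 4 of `ss -tln` output is the local address:port.
  awk '{print $4}' | while read -r addr; do
    port="${addr##*:}"   # text after the last colon
    host="${addr%:*}"    # text before the last colon
    case "$host" in
      127.*|"[::1]") continue ;;   # loopback-only listeners are not exposed
    esac
    case "$allowed" in
      *" $port "*) ;;              # expected port
      *) echo "$addr" ;;
    esac
  done
}
```

On the host, `ss -H -tln | unexpected_listeners 22 80 443` prints nothing when only the expected ports are bound. A published container port such as `3010` would show up here and deserves a check against the firewall rules.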
## Phase 4: Observability + Alerts (Operator Needed)
Hands-on checkpoints:
1. Start stack:
   - `docker compose -f docker/observability/docker-compose.observability.yml up -d`
2. Grafana datasource:
   - Loki `http://loki:3100`
3. Uptime Kuma monitors:
   - `/api/health/live`
   - `/api/health/ready`
   - `/`

Validation:
- [ ] nginx logs appear in Loki (`job="nginx"`)
- [ ] alert rules configured (5xx/auth spikes/DB failures/resource pressure)
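
Before creating the Uptime Kuma monitors, the three monitored paths can be probed once from a shell. This is a minimal sketch with a hypothetical function name, `probe_paths`; treating 2xx and 3xx as passing is an assumption (for example, `/` might redirect), so adjust it to whatever the monitors will expect.

```shell
#!/usr/bin/env bash
# Hypothetical probe: request each monitored path and report its status code.
# Argument: base URL, e.g. https://<domain>. Returns 1 if any path fails.
probe_paths() {
  local base="${1%/}" path code rc=0
  for path in /api/health/live /api/health/ready /; do
    # -w '%{http_code}' prints only the status code; curl reports 000 on connection errors.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "${base}${path}")" || code="000"
    case "$code" in
      2??|3??) echo "✅ ${code} ${path}" ;;
      *) echo "❌ ${code} ${path}"; rc=1 ;;
    esac
  done
  return "$rc"
}
```

Running `probe_paths https://<domain>` prints one line per path, which makes it easy to see whether liveness passes while readiness fails.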
## Phase 5: Backup + DR (Operator Needed)
Hands-on checkpoints:
1. Schedule logical backups:
   - `scripts/backup-postgres.sh`
2. Schedule periodic base backups:
   - `PRIMARY_DATABASE_URL=<replication-url> scripts/basebackup-postgres.sh`
3. Run restore drill:
   - `scripts/restore-drill-postgres.sh <dump> <target_db_url>`
4. Log drill:
   - `scripts/log-restore-drill.sh <env> <dump> <target> <status> <rto_min> <notes>`

Validation:
- [ ] latest drill entry in `docs/restore-drill-log.csv`
- [ ] measured RTO acceptable
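
Steps 3 and 4 can be combined so that the RTO is measured rather than estimated. The wrapper below is a sketch with several assumptions: the function name `run_timed_drill` and the `RESTORE_CMD`/`LOG_CMD` overrides are hypothetical, the status strings `success` and `failed` are guesses at what `scripts/log-restore-drill.sh` accepts, and the RTO is taken as the wall-clock duration of the restore script rounded up to whole minutes.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: time the restore drill, then log the result.
# Arguments: env, dump, target, optional notes.
# RESTORE_CMD and LOG_CMD default to the playbook's scripts and exist only
# so the wrapper can be exercised without a real database.
run_timed_drill() {
  local env_name="$1" dump="$2" target="$3" notes="${4:-}"
  local start end status rto_min
  start="$(date +%s)"
  if "${RESTORE_CMD:-scripts/restore-drill-postgres.sh}" "$dump" "$target"; then
    status="success"   # assumed status value
  else
    status="failed"    # assumed status value
  fi
  end="$(date +%s)"
  # Round elapsed seconds up to whole minutes.
  rto_min=$(( (end - start + 59) / 60 ))
  "${LOG_CMD:-scripts/log-restore-drill.sh}" "$env_name" "$dump" "$target" "$status" "$rto_min" "$notes"
  [ "$status" = "success" ]
}
```

Logging happens whether or not the restore succeeds, so a failed drill still leaves an entry, and the function's exit status reflects the restore outcome.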
## Phase 6: Launch Gate
Run the final checklist:
- `docs/07_PUBLIC_LAUNCH_CHECKLIST.md`

Go live only after all required boxes are checked.
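
The gate can be backed by a mechanical count of open checkboxes. This sketch uses a hypothetical function name, `launch_gate`, and counts every unchecked `- [ ]` item; it cannot tell required boxes from optional ones, so a human still makes the final call.

```shell
#!/usr/bin/env bash
# Hypothetical gate: fail if the given Markdown file still has unchecked boxes.
launch_gate() {
  local file="$1" open
  [ -r "$file" ] || { echo "❌ cannot read ${file}" >&2; return 2; }
  # grep -c prints 0 and exits 1 when nothing matches, hence `|| true`.
  open="$(grep -cE '^[[:space:]]*[-*] \[ \]' "$file" || true)"
  if [ "$open" -gt 0 ]; then
    echo "⚠️ ${open} unchecked item(s) in ${file}"
    return 1
  fi
  echo "✅ no unchecked items in ${file}"
}
```

For example, `launch_gate docs/07_PUBLIC_LAUNCH_CHECKLIST.md` returns non-zero while any box in the checklist is still open.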
## Deferred Item (Intentional)
- NPM host-specific tailoring (domain/upstream/custom locations) is intentionally deferred and tracked for a later hands-on session.