Public Launch Runbook (Self-Hosted + SSH Compose)
1) Goals
- Deploy Fiddy publicly without stack rewrite.
- Keep Postgres self-hosted.
- Enable fast rollback and basic operational visibility.
- Keep security baseline enforceable for direct home-IP exposure.
2) Deploy Host (SSH Compose)
- Prepare Linux deploy host with Docker Engine + Compose plugin.
- Ensure deploy target directory exists (/opt/fiddy).
- Configure web image source: git.nicosaya.com/nalalangan/fiddy/web.
- Configure scheduler image source: git.nicosaya.com/nalalangan/fiddy/scheduler.
- Deploy by immutable tag (github.sha) and keep main as convenience tag.
- Configure health check endpoint: /api/health/ready.
- Keep previous image tags for rollback.
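The tagging scheme in this list can be sketched as a small helper. This is a minimal illustration, not the workflow's actual code: the function name image_ref and the GIT_SHA variable are assumptions, while the registry paths are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: compose a full image reference from a component
# name ("web" or "scheduler") and a tag (a commit SHA or "main").
REGISTRY_PREFIX="git.nicosaya.com/nalalangan/fiddy"

image_ref() {
  component="$1"
  tag="$2"
  case "$component" in
    web|scheduler) ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
  if [ -z "$tag" ]; then
    echo "tag must not be empty" >&2
    return 1
  fi
  printf '%s/%s:%s\n' "$REGISTRY_PREFIX" "$component" "$tag"
}

# Example: the immutable reference for one commit, and the convenience tag.
# image_ref web "$GIT_SHA"   # GIT_SHA is an assumed variable name
# image_ref web main
```

Rolling back then only requires passing an earlier SHA, which is why the previous tags need to stay in the registry.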
Required secrets/variables
- DATABASE_URL
- DATABASE_SSL
- ALLOWED_DB_NAMES
- SESSION_COOKIE_NAME
- SESSION_TTL_DAYS
- DEBUG_API=0
- SCHEDULER_POLL_MS (scheduler app, optional)
- SCHEDULER_BATCH_SIZE (scheduler app, optional)
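A pre-deploy check for these variables might look like the sketch below. It is not part of the repository as far as this runbook shows; the function name missing_vars and its output format are assumptions. The two optional scheduler variables are deliberately left out.

```shell
# Hypothetical pre-deploy check: report required variables that are
# unset or empty, and require DEBUG_API to be exactly 0.
REQUIRED_VARS="DATABASE_URL DATABASE_SSL ALLOWED_DB_NAMES SESSION_COOKIE_NAME SESSION_TTL_DAYS"

missing_vars() {
  missing=""
  for name in $REQUIRED_VARS; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ "${DEBUG_API:-}" != "0" ]; then
    missing="$missing DEBUG_API=0"
  fi
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  return 0
}
```

Calling missing_vars at the top of a deploy script and aborting on a non-zero status keeps a half-configured host from starting the containers.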
3) CI/CD (Gitea Actions)
- Use .gitea/workflows/deploy-ssh-compose.yml.
- Required secrets:
  - REGISTRY_USER
  - REGISTRY_PASS
  - DEPLOY_KEY
  - DEPLOY_HOST
  - DEPLOY_USER
  - DEPLOY_HEALTHCHECK_URL
- Health gate:
  - workflow calls scripts/wait-for-health.sh against DEPLOY_HEALTHCHECK_URL
  - default retry window: 5 minutes (30 attempts x 10s)
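The health gate's retry window can be illustrated with a generic retry loop. This is a sketch under stated assumptions, not the contents of scripts/wait-for-health.sh: the names wait_for_health, probe_url, ATTEMPTS, and DELAY are hypothetical, and only the defaults (30 attempts, 10 seconds apart) come from the runbook.

```shell
# Hypothetical retry loop: run a probe command up to ATTEMPTS times,
# sleeping DELAY seconds between failures. The defaults mirror the
# documented window: 30 attempts x 10 s = 300 s = 5 minutes.
wait_for_health() {
  attempts="${ATTEMPTS:-30}"
  delay="${DELAY:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Probe that fails on connection errors and on HTTP 4xx/5xx responses
# (--fail makes curl exit non-zero for those status codes).
probe_url() {
  curl --fail --silent --show-error --max-time 5 --output /dev/null "$1"
}

# Usage (DEPLOY_HEALTHCHECK_URL is the secret named above):
# wait_for_health probe_url "$DEPLOY_HEALTHCHECK_URL"
```

Passing the probe as a command keeps the loop testable without a network and lets the same loop gate other checks.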
4) Reverse Proxy + Network Hardening
- Use your existing Nginx reverse proxy/vhost.
- Apply the required Fiddy directives using docker/nginx/fiddy.conf and docker/nginx/includes/fiddy-proxy.conf as templates.
- For Nginx Proxy Manager-specific setup, follow docs/08_NGINX_PROXY_MANAGER_SETUP.md.
- NPM note: apply add_header/proxy_set_header in Custom Location / (and specific API locations), not only Proxy Host Advanced.
- Install certificate with Let's Encrypt.
- Route 443 -> app container only.
- Keep Postgres private; never expose 5432 publicly.
- Restrict SSH to allowlist/VPN.
- Add host firewall rules:
  - Allow inbound 80/443.
  - Deny all other inbound by default.
- Confirm Nginx writes JSON logs:
  - /var/log/nginx/fiddy-access.log
  - /var/log/nginx/fiddy-error.log
- If your log paths differ, update:
  - docker/observability/promtail-config.yml
  - docker/security/fail2ban/jail.d/fiddy-nginx.conf
  - docker/security/crowdsec/acquis.yaml
- Apply/verify host baseline using scripts:
  - dry-run firewall apply: SSH_ALLOW_CIDR=<your-cidr> DRY_RUN=1 scripts/harden-host-ufw.sh
  - real firewall apply: SSH_ALLOW_CIDR=<your-cidr> DRY_RUN=0 sudo scripts/harden-host-ufw.sh
  - host status audit: scripts/check-host-security.sh
- Auto-ban templates:
  - fail2ban: docker/security/fail2ban/*
  - crowdsec (optional): docker/security/crowdsec/acquis.yaml
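The firewall baseline described in this section roughly corresponds to the ufw commands printed by the sketch below. It is not the contents of scripts/harden-host-ufw.sh; the function name ufw_plan is hypothetical, and SSH on port 22 is an assumption. The function only prints commands and applies nothing.

```shell
# Hypothetical dry-run planner: print the ufw commands that would
# implement the baseline (deny inbound by default, allow 80/443, and
# allow SSH only from the allowlisted CIDR). Nothing is executed.
ufw_plan() {
  cidr="$1"
  if [ -z "$cidr" ]; then
    echo "SSH allowlist CIDR is required" >&2
    return 1
  fi
  echo "ufw default deny incoming"
  echo "ufw allow 80/tcp"
  echo "ufw allow 443/tcp"
  # Assumption: sshd listens on the default port 22.
  echo "ufw allow from $cidr to any port 22 proto tcp"
  echo "ufw --force enable"
}

# Example: review the plan for a documentation-range CIDR.
# ufw_plan 203.0.113.0/24
```

Reviewing the printed plan before applying it reduces the risk of locking yourself out of SSH.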
5) Observability
- Bring up monitoring stack:
docker compose -f docker/observability/docker-compose.observability.yml up -d
- Configure Grafana datasource to Loki (http://loki:3100).
- Verify nginx logs are ingested by Promtail (job="nginx").
- Add Uptime Kuma monitors:
  - /api/health/live
  - /api/health/ready
  - home page (/)
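To avoid typos when registering the monitors, the three URLs can be generated from the public base URL. A minimal sketch; the function name monitor_urls is hypothetical and https://your-domain stands in for the real host.

```shell
# Hypothetical helper: list the three URLs to register as monitors
# for a given base URL (one trailing slash on the base is tolerated).
monitor_urls() {
  base="${1%/}"
  printf '%s\n' "$base/api/health/live" "$base/api/health/ready" "$base/"
}

# Example:
# monitor_urls https://your-domain
```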
5.1) Deployment Smoke Check
- Run after every deploy and rollback:
scripts/smoke-public-launch.sh https://your-domain
- The script verifies:
  - /api/health/live and /api/health/ready return 200
  - both responses include X-Request-Id header
  - both response bodies include request_id
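The three checks can be expressed as small predicates plus one curl call per endpoint. This is a sketch of the documented behaviour, not the contents of scripts/smoke-public-launch.sh; all function names are hypothetical, and the body check assumes the literal string request_id appears in the response.

```shell
# Hypothetical re-implementation of the documented checks, split into
# predicates so each can be exercised without a network call.

# True when the status code is exactly 200.
status_ok() { [ "$1" = "200" ]; }

# True when the response headers (read from stdin) contain a non-empty
# X-Request-Id header; header names are matched case-insensitively.
has_request_id_header() { grep -qi '^x-request-id:[[:space:]]*[^[:space:]]'; }

# True when the body (read from stdin) mentions request_id.
body_has_request_id() { grep -q 'request_id'; }

# Fetch one endpoint with curl, then apply all three predicates.
check_endpoint() {
  url="$1"
  headers="$(mktemp)"
  body="$(mktemp)"
  code="$(curl --silent --max-time 10 --dump-header "$headers" \
    --output "$body" --write-out '%{http_code}' "$url")" || code="000"
  result=0
  status_ok "$code" || result=1
  has_request_id_header < "$headers" || result=1
  body_has_request_id < "$body" || result=1
  rm -f "$headers" "$body"
  return "$result"
}

# Example:
# check_endpoint https://your-domain/api/health/live
# check_endpoint https://your-domain/api/health/ready
```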
6) Backup + Restore
- Daily backup command:
scripts/backup-postgres.sh
- Periodic base backup (for faster full recovery):
PRIMARY_DATABASE_URL=<replication-url> scripts/basebackup-postgres.sh
- Retention:
  - default 7 days (RETENTION_DAYS=7)
- Restore drill:
scripts/restore-drill-postgres.sh backups/postgres/<file>.dump <target_database_url>
- Run restore drill on non-prod DB before public launch.
- Record drill outcome:
  - scripts/log-restore-drill.sh <environment> <backup_file> <restore_target> <status> <rto_minutes> <notes>
  - log file: docs/restore-drill-log.csv
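The retention rule can be sketched with find. This is an illustration only; how scripts/backup-postgres.sh actually applies RETENTION_DAYS is not shown in this runbook. The function name prune_backups is hypothetical, and the .dump extension follows the restore-drill example path.

```shell
# Hypothetical retention pass: delete *.dump files directly inside a
# backup directory that are older than RETENTION_DAYS (default 7).
# find's -mtime counts whole 24-hour periods, so "+7" matches files
# that are at least 8 days old.
prune_backups() {
  dir="$1"
  days="${RETENTION_DAYS:-7}"
  if [ ! -d "$dir" ]; then
    echo "backup directory not found: $dir" >&2
    return 1
  fi
  # -print lists each file as it is removed, which is useful in cron logs.
  find "$dir" -maxdepth 1 -type f -name '*.dump' -mtime +"$days" -print -delete
}

# Example:
# RETENTION_DAYS=7 prune_backups backups/postgres
```

Dropping -delete from the find call turns this into a safe preview of what would be removed.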
7) Incident Response Quick Flow
- Identify failing request and request_id.
- Correlate application logs (Loki) by request_id.
- Check /api/health/ready status and DB connectivity.
- Roll back to previous known-good image tag via SSH Compose if needed.
- Capture root cause and update this runbook/checklist.
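For the correlation step, a LogQL line filter on the request_id is usually enough. The sketch below builds such a query; the function name logql_for_request_id is hypothetical. The job label nginx is the one from the Observability section, and the label used for application logs is not stated in this runbook, so pass whatever your Promtail config assigns.

```shell
# Hypothetical helper: build a LogQL query that selects one job's log
# streams and keeps only lines containing the given request_id.
logql_for_request_id() {
  job="$1"
  request_id="$2"
  if [ -z "$job" ] || [ -z "$request_id" ]; then
    echo "usage: logql_for_request_id <job> <request_id>" >&2
    return 1
  fi
  printf '{job="%s"} |= "%s"\n' "$job" "$request_id"
}

# Example: paste the output into Grafana Explore with the Loki datasource.
# logql_for_request_id nginx "<request_id>"
```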
8) Rollback Checklist
- Select previous healthy image tag for both web and scheduler.
- Trigger rollback deploy and wait for completion.
- Run scripts/smoke-public-launch.sh https://your-domain.
- Verify error-rate drop in Grafana/Loki and confirm no DB migration mismatch.
- Log the rolled back version, timestamp, and reason.
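The final logging step can be as simple as appending one line per rollback to a file. A minimal sketch; the function name log_rollback, the tab-separated layout, and the log file location are all assumptions, since the runbook does not prescribe a format.

```shell
# Hypothetical rollback log: append one tab-separated line holding a
# UTC timestamp, the version (image tag), and the reason.
log_rollback() {
  logfile="$1"
  tag="$2"
  reason="$3"
  if [ -z "$logfile" ] || [ -z "$tag" ] || [ -z "$reason" ]; then
    echo "usage: log_rollback <logfile> <tag> <reason>" >&2
    return 1
  fi
  timestamp="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '%s\t%s\t%s\n' "$timestamp" "$tag" "$reason" >> "$logfile"
}

# Example (the path is an assumed location):
# log_rollback docs/rollback-log.tsv "<image-tag>" "<reason>"
```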