# Public Launch Runbook (Self-Hosted + Dokploy)

## 1) Goals

- Deploy Fiddy publicly without rewriting the stack.
- Keep Postgres self-hosted.
- Enable fast rollback and basic operational visibility.
- Keep an enforceable security baseline for direct home-IP exposure.

## 2) Deploy Control Plane (Dokploy)

1. Install Dokploy on your Proxmox Docker host.
2. Add a project in Dokploy and connect the Gitea repository.
3. Configure the image source: `git.nicosaya.com/nalalangan/fiddy/web`.
4. Deploy by immutable tag (`github.sha`) and keep `main` as a convenience tag.
5. Configure the health check endpoint: `/api/health/ready`.
6. Keep previous releases for rollback, and verify that the rollback button path works.

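
Step 4 can be sketched roughly as follows. This is a minimal sketch, not the project's actual workflow: the helper names `image_refs` and `push_release` are hypothetical, and it assumes the caller passes in the commit SHA (for example from `github.sha` in CI) and has already run `docker login`.

```sh
# Hypothetical helpers: publish one build under an immutable tag plus `main`.

# Print the two image references for a commit, one per line.
image_refs() {
  local repo="$1" sha="$2"
  printf '%s:%s\n' "$repo" "$sha"   # immutable tag, used for deploys and rollbacks
  printf '%s:main\n' "$repo"        # moving convenience tag
}

# Build once, tag twice, push both references.
push_release() {
  local repo="$1" sha="$2" ref
  docker build -t "${repo}:${sha}" . || return 1
  docker tag "${repo}:${sha}" "${repo}:main" || return 1
  for ref in $(image_refs "$repo" "$sha"); do
    docker push "$ref" || return 1
  done
}

# Usage (not executed here):
#   push_release git.nicosaya.com/nalalangan/fiddy/web "$COMMIT_SHA"
```

Deploying the SHA tag rather than `main` means a rollback always points at exactly the image that ran before.
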
### Required secrets/variables

- `DATABASE_URL`
- `DATABASE_SSL`
- `ALLOWED_DB_NAMES`
- `SESSION_COOKIE_NAME`
- `SESSION_TTL_DAYS`
- `DEBUG_API=0`

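
A preflight check along these lines can catch a missing variable before the container starts. It is only a sketch: `missing_vars` and `check_env` are hypothetical helpers, not part of the repository, and the only value it validates is `DEBUG_API`, since that is the only one this runbook pins.

```sh
# Hypothetical preflight: fail fast when a required variable is unset or empty.
REQUIRED_VARS="DATABASE_URL DATABASE_SSL ALLOWED_DB_NAMES SESSION_COOKIE_NAME SESSION_TTL_DAYS DEBUG_API"

# Print the names of required variables that are unset or empty, one per line.
missing_vars() {
  local name value
  for name in $REQUIRED_VARS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Return non-zero if anything is missing or if the debug API is enabled.
check_env() {
  local missing
  missing="$(missing_vars)"
  if [ -n "$missing" ]; then
    echo "missing required variables:" $missing >&2
    return 1
  fi
  if [ "$DEBUG_API" != "0" ]; then
    echo "DEBUG_API must be 0 for a public deployment" >&2
    return 1
  fi
}
```
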
## 3) CI/CD (Gitea Actions)

- Use `.gitea/workflows/deploy-dokploy.yml`.
- Required secrets:
  - `REGISTRY_USER`
  - `REGISTRY_PASS`
  - `DOKPLOY_DEPLOY_HOOK`
  - `DOKPLOY_HEALTHCHECK_URL`
- Health gate:
  - workflow calls `scripts/wait-for-health.sh` against `DOKPLOY_HEALTHCHECK_URL`
  - default retry window: 5 minutes (30 attempts x 10s)

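
The health gate's retry loop might look something like the sketch below. The real `scripts/wait-for-health.sh` is not shown in this runbook and may differ. The function names, the 5-second per-request timeout, and the argument order are assumptions; only the defaults of 30 attempts and 10 seconds come from the text above.

```sh
# Hypothetical sketch of a health gate.

# Fails on connection errors, timeouts, and HTTP 4xx/5xx responses.
probe() {
  curl -fsS -o /dev/null --max-time 5 "$1"
}

# wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$url" 2>/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # No need to sleep once the last attempt has failed.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage (not executed here):
#   wait_for_health "$DOKPLOY_HEALTHCHECK_URL" 30 10
```

A non-zero exit status is what lets the workflow mark the deploy as failed instead of silently continuing.
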
## 4) Reverse Proxy + Network Hardening

- Use `docker/nginx/fiddy.conf` as baseline.
- Install a certificate with Let's Encrypt.
- Route 443 -> app container only.
- Keep Postgres private; never expose 5432 publicly.
- Restrict SSH to allowlist/VPN.
- Add host firewall rules:
  - Allow inbound `80/443`.
  - Deny all other inbound traffic by default.
- Confirm Nginx writes logs (access log in JSON; note that Nginx's `error_log` format is fixed and cannot be set to JSON):
  - `/var/log/nginx/fiddy-access.log`
  - `/var/log/nginx/fiddy-error.log`
- Apply/verify host baseline using scripts:
  - dry-run firewall apply: `SSH_ALLOW_CIDR=<your-cidr> DRY_RUN=1 scripts/harden-host-ufw.sh`
  - real firewall apply: `SSH_ALLOW_CIDR=<your-cidr> DRY_RUN=0 sudo scripts/harden-host-ufw.sh`
  - host status audit: `scripts/check-host-security.sh`

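
For orientation, the firewall baseline above maps onto `ufw` commands roughly like this. It is a sketch under assumptions, not the contents of `scripts/harden-host-ufw.sh`: the function names are hypothetical, SSH is assumed to listen on port 22, and outbound traffic is assumed to stay open.

```sh
# Hypothetical sketch of the firewall baseline.

# Print the ufw commands for the baseline, one per line.
ufw_plan() {
  local ssh_cidr="$1"
  cat <<EOF
ufw default deny incoming
ufw default allow outgoing
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow from ${ssh_cidr} to any port 22 proto tcp
ufw --force enable
EOF
}

# Print the plan when DRY_RUN=1 (the default in this sketch); otherwise run it.
# Running it requires root.
apply_ufw() {
  local ssh_cidr="${SSH_ALLOW_CIDR:?set SSH_ALLOW_CIDR to your admin network}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    ufw_plan "$ssh_cidr"
  else
    ufw_plan "$ssh_cidr" | sh -e
  fi
}
```

Adding the SSH allow rule before `ufw --force enable` matters on a remote host: enabling a default-deny policy first would cut off the session applying it.
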
## 5) Observability

- Bring up the monitoring stack:
  - `docker compose -f docker/observability/docker-compose.observability.yml up -d`
- Point the Grafana datasource at Loki (`http://loki:3100`).
- Verify that Promtail ships nginx logs into Loki (`job="nginx"`).
- Add Uptime Kuma monitors:
  - `/api/health/live`
  - `/api/health/ready`
  - home page (`/`)

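
One way to confirm the Promtail step from the command line is to ask Loki which `job` label values it has seen recently. The sketch below uses Loki's label-values endpoint. The helper names are hypothetical, the `grep` match is deliberately crude, and `http://loki:3100` only resolves from inside the Docker network, so `LOKI_URL` may need overriding when run from the host.

```sh
# Hypothetical check: has Loki recently ingested streams labelled job="nginx"?
LOKI_URL="${LOKI_URL:-http://loki:3100}"

# Read a Loki label-values JSON response on stdin; succeed if it lists the job.
has_job() {
  local job="$1"
  grep "\"${job}\"" >/dev/null
}

# Ask Loki for the values of the `job` label and look for "nginx".
check_nginx_ingestion() {
  curl -fsS "${LOKI_URL}/loki/api/v1/label/job/values" | has_job nginx
}

# Usage (not executed here):
#   check_nginx_ingestion && echo "nginx logs are reaching Loki"
```
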
## 5.1) Deployment Smoke Check

- Run after every deploy and rollback:
  - `scripts/smoke-public-launch.sh https://your-domain`
- The script verifies:
  - `/api/health/live` and `/api/health/ready` return `200`
  - both responses include `X-Request-Id` header
  - both response bodies include `request_id`

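
The three checks listed above could be implemented along the following lines. This is a sketch, not the contents of `scripts/smoke-public-launch.sh`: the function names are hypothetical, and the body check only looks for the literal text `request_id` because the exact JSON shape is not specified here.

```sh
# Hypothetical sketch of the smoke check.

# check_response STATUS HEADERS_FILE BODY_FILE
# Succeeds when the status is 200, an X-Request-Id header is present,
# and the body mentions request_id.
check_response() {
  local status="$1" headers="$2" body="$3"
  if [ "$status" != "200" ]; then
    echo "unexpected status: $status" >&2
    return 1
  fi
  if ! grep -i '^x-request-id:' "$headers" >/dev/null; then
    echo "missing X-Request-Id header" >&2
    return 1
  fi
  if ! grep 'request_id' "$body" >/dev/null; then
    echo "missing request_id in body" >&2
    return 1
  fi
}

# smoke_endpoint URL: fetch one endpoint and validate the response.
smoke_endpoint() {
  local url="$1" tmp status rc=0
  tmp="$(mktemp -d)"
  status="$(curl -sS -o "$tmp/body" -D "$tmp/headers" -w '%{http_code}' "$url")" || status="000"
  check_response "$status" "$tmp/headers" "$tmp/body" || rc=1
  rm -rf "$tmp"
  return "$rc"
}

# smoke BASE_URL: check both health endpoints.
smoke() {
  local base="${1%/}" path
  for path in /api/health/live /api/health/ready; do
    smoke_endpoint "${base}${path}" || return 1
  done
  echo "smoke check passed"
}
```

The header match is case-insensitive because HTTP/2 responses carry lowercase header names.
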
## 6) Backup + Restore

- Daily backup command:
  - `scripts/backup-postgres.sh`
- Retention:
  - default 7 days (`RETENTION_DAYS=7`)
- Restore drill:
  - `scripts/restore-drill-postgres.sh backups/postgres/<file>.dump <target_database_url>`
- Run the restore drill against a non-prod DB before public launch.

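
For readers who want to see the moving parts, a backup, prune, and restore cycle can be sketched as below. The repository's own scripts are not reproduced here and may work differently. The function names, the `fiddy-` filename prefix, and the `BACKUP_DIR` variable are assumptions; `RETENTION_DAYS=7` and the `backups/postgres` path come from the text above.

```sh
# Hypothetical sketch of backup, retention, and restore drill.
BACKUP_DIR="${BACKUP_DIR:-backups/postgres}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"

# Dump the database in pg_dump's custom format (restorable with pg_restore).
backup_postgres() {
  local out
  mkdir -p "$BACKUP_DIR"
  out="${BACKUP_DIR}/fiddy-$(date -u +%Y%m%dT%H%M%SZ).dump"
  pg_dump --format=custom --file="$out" --dbname="$DATABASE_URL" || return 1
  echo "$out"
}

# Delete dumps whose modification time is more than RETENTION_DAYS days old.
prune_backups() {
  find "$BACKUP_DIR" -type f -name '*.dump' -mtime +"$RETENTION_DAYS" -exec rm -f {} +
}

# Restore a dump into a non-production database for the drill.
restore_drill() {
  local dump="$1" target_url="$2"
  pg_restore --clean --if-exists --no-owner --dbname="$target_url" "$dump"
}
```

Pruning by modification time keeps the retention rule independent of the filename format.
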
## 7) Incident Response Quick Flow

1. Identify the failing request and its `request_id`.
2. Correlate application logs (Loki) by `request_id`.
3. Check `/api/health/ready` status and DB connectivity.
4. Roll back to the previous known-good Dokploy release if needed.
5. Capture the root cause and update this runbook/checklist.

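
Step 2 can also be done without Grafana by querying Loki's `query_range` endpoint with a line filter. The sketch below is illustrative only: the helper names are hypothetical, the `job` label defaults to `nginx` because that is the only label this runbook names, and application logs may live under a different label in your setup.

```sh
# Hypothetical helper for step 2: pull log lines that mention a request_id.
LOKI_URL="${LOKI_URL:-http://loki:3100}"

# Build a LogQL line-filter query for one job and one request_id.
logql_for_request() {
  local job="$1" request_id="$2"
  printf '{job="%s"} |= "%s"' "$job" "$request_id"
}

# logs_for_request REQUEST_ID [JOB]
# Without start/end parameters, Loki searches the last hour.
logs_for_request() {
  local request_id="$1" job="${2:-nginx}"
  curl -fsS -G "${LOKI_URL}/loki/api/v1/query_range" \
    --data-urlencode "query=$(logql_for_request "$job" "$request_id")" \
    --data-urlencode "limit=200"
}

# Usage (not executed here):
#   logs_for_request "<request_id from the failing response>"
```
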
## 8) Rollback Checklist

1. Select the previous healthy image in the Dokploy release history.
2. Trigger the rollback and wait for the deployment to complete.
3. Run `scripts/smoke-public-launch.sh https://your-domain`.
4. Verify that the error rate drops in Grafana/Loki and confirm there is no DB migration mismatch.
5. Log the rolled-back version, timestamp, and reason.
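
For step 5, even a small append-only log is enough to reconstruct what happened later. The helper below is a hypothetical sketch: the `rollbacks.log` filename and the tab-separated layout are assumptions, not an existing project convention.

```sh
# Hypothetical helper for step 5: append one line per rollback to a log file.
ROLLBACK_LOG="${ROLLBACK_LOG:-rollbacks.log}"

# log_rollback VERSION REASON...
# Writes: UTC timestamp <TAB> version <TAB> reason
log_rollback() {
  local version="$1"
  shift
  printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$version" "$*" >> "$ROLLBACK_LOG"
}

# Usage (not executed here):
#   log_rollback "<image tag that was rolled back>" "ready check failing after deploy"
```
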