# Refactor 2: Public Launch Hardening
## Purpose Overview
This refactor prepares Fiddy for public exposure without changing the core stack (Next.js + external Postgres). The goal is to harden API contracts, improve abuse resistance, tighten security posture, and make deployment/operations repeatable for self-hosted production.
Primary outcomes:
- Keep the architecture stable and avoid a risky stack rewrite.
- Standardize API response metadata (`request_id`) for traceability.
- Add layered rate limiting (app + proxy plan) for auth and write paths.
- Remove sensitive logging risk (no full invite codes, no secrets).
- Add health probes and deployment/ops reference artifacts.
- Document rollout, rollback, backup, and monitoring runbooks.
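
The `request_id` and sensitive-logging outcomes above can be illustrated with a minimal sketch. Everything named here (`buildErrorResponse`, `redactContext`, `STATUS_BY_CODE`, `REDACTED_KEYS`) is hypothetical and is not taken from `apps/web/lib/server/errors.ts`. Only the `RATE_LIMITED` to `429` mapping, the dual `requestId`/`request_id` keys, and the `inviteCodeLast4` field are stated in this document.

```typescript
// All names here are illustrative assumptions, not Fiddy's actual exports.
type ErrorBody = {
  error: { code: string; message: string; context: Record<string, unknown> };
  requestId: string; // existing key, kept for backward compatibility
  request_id: string; // snake_case alias carrying the same value
};

// Only RATE_LIMITED -> 429 comes from this document; NOT_FOUND is an assumed extra.
const STATUS_BY_CODE: Record<string, number> = {
  RATE_LIMITED: 429,
  NOT_FOUND: 404,
};

// Assumed list of keys whose values are replaced wholesale.
const REDACTED_KEYS = new Set(["password", "token", "secret"]);

// Copy the context, masking secrets and keeping only the last 4 chars of invite codes.
function redactContext(context: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(context)) {
    // Treat invite_code, invite-code, and inviteCode as the same key.
    const normalized = key.toLowerCase().replace(/[_-]/g, "");
    if (normalized === "invitecode") {
      safe.inviteCodeLast4 = String(value).slice(-4);
    } else if (REDACTED_KEYS.has(normalized)) {
      safe[key] = "[REDACTED]";
    } else {
      safe[key] = value;
    }
  }
  return safe;
}

// Build a structured error with both request-id spellings; unknown codes fall back to 500.
function buildErrorResponse(
  code: string,
  message: string,
  requestId: string,
  context: Record<string, unknown> = {},
): { status: number; body: ErrorBody } {
  return {
    status: STATUS_BY_CODE[code] ?? 500,
    body: {
      error: { code, message, context: redactContext(context) },
      requestId,
      request_id: requestId,
    },
  };
}
```

Emitting both keys from a single helper keeps the two spellings from drifting apart, which is one way to preserve backward compatibility while clients migrate to `request_id`.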
## Scope Checklist
- [x] Phase 1: App security + API contract hardening
- [x] Phase 2: Dokploy deployment workflow artifacts
- [x] Phase 3: Nginx + host hardening artifacts
- [x] Phase 4: Observability stack references/configs
- [x] Phase 5: Backup + restore process and scripts/docs
- [x] Verification pass (tests/build/lint where possible)
## Running Implementation Log
### 2026-02-14
- Started `Refactor 2` execution from current in-progress workspace baseline.
- Confirmed this repo already contains broad uncommitted changes unrelated to this phase; implementation will be done via targeted edits only.
- Established this document as the source log/checklist for all noteworthy decisions, tradeoffs, blockers, and completed steps.
- Added migration `packages/db/migrations/007_rate_limits.sql` for server-side rate limit state.
- Added server limiter `apps/web/lib/server/rate-limit.ts` and wired write-path guardrails in server services (`groups`, `group-members`, `group-invites`, `entries`, `buckets`, `tags`, `group-settings`).
- Hardened API error pipeline in `apps/web/lib/server/errors.ts`:
  - Added `RATE_LIMITED` mapping (`429`).
  - Added `request_id` alias in structured error body.
  - Extended sensitive-key redaction for invite code keys.
  - Removed stray debug `console.log()` call.
- Fixed invite-code leak risk in `apps/web/lib/server/groups.ts` by sending only `inviteCodeLast4` in error context.
- Standardized API response envelope updates in auth and route handlers so responses include both `requestId` and `request_id` while preserving backward compatibility.
- Added health probe routes:
  - `apps/web/app/api/health/live/route.ts`
  - `apps/web/app/api/health/ready/route.ts`
- Added security header baseline in `apps/web/next.config.mjs` (CSP, frame/referrer/content-type hardening).
- Added CI/CD workflow for Dokploy-triggered deploys:
  - `.gitea/workflows/deploy-dokploy.yml`
- Added self-host edge + observability + backup artifacts:
  - `docker/nginx/fiddy.conf`
  - `docker/nginx/includes/fiddy-proxy.conf`
  - `docker/observability/docker-compose.observability.yml`
  - `docker/observability/loki-config.yml`
  - `docker/observability/promtail-config.yml`
  - `scripts/backup-postgres.sh`
  - `scripts/restore-postgres.sh`
  - `docs/public-launch-runbook.md`
- Added regression tests for request-id contract and limiter behavior:
  - `apps/web/__tests__/errors-response.test.ts`
  - `apps/web/__tests__/rate-limit.test.ts`
- Verification results:
  - `npm test`: pass (`25 passed`, `1 skipped`).
  - `npm run build`: pass.
  - `npm run lint`: still fails due to an existing workspace lint script invocation issue (`next lint` resolves `apps/web/lint` path).
- Post-verification fixups:
  - Added table auto-bootstrap fallback in `apps/web/lib/server/rate-limit.ts` to avoid failures in environments where migration `007_rate_limits.sql` has not been applied yet.
  - Corrected `Entry` mapping consistency in `apps/web/lib/server/entries.ts` (`bucketId` included in all return shapes).
  - Replaced `rowCount` checks with `rows.length` in typed query paths to satisfy current TypeScript/`pg` typings.
- Implementation correction note:
  - A batch replacement briefly introduced invalid destructuring (`request_id` in `getRequestMeta` destructure). This was corrected in all affected routes before final verification.
- Created path-scoped commit for this hardening slice:
  - `b1c8a4a`: `harden public launch api contracts and ops baseline`.
- Resolved lint pipeline breakage caused by `next lint` invocation under Next.js `16.1.6`:
  - Switched `apps/web/package.json` lint script to `eslint . && node scripts/check-no-group-id-routes.cjs`.
  - Added `apps/web/eslint.config.mjs` using `eslint-config-next/core-web-vitals`.
  - Disabled newly surfaced React compiler-style hook rules to preserve prior lint parity without broad unrelated refactors:
    - `react-hooks/error-boundaries`
    - `react-hooks/immutability`
    - `react-hooks/purity`
    - `react-hooks/set-state-in-effect`
- Fixed a real hook-order violation in `apps/web/components/group-settings-content.tsx` by moving the keyboard-listener `useEffect` above the early `return`.
  - Note: this file currently contains broader pre-existing local edits; to avoid bundling unrelated work, the hook-order adjustment is left as a workspace-local change and was not included in path-scoped commit `1f140b6`.
- Removed forbidden legacy dynamic route path under `app/groups` to satisfy repo policy script:
  - Deleted `apps/web/app/groups/[id]/settings/page.tsx`.
  - Removed empty directory `apps/web/app/groups/[id]`.
- Re-ran verification after lint fixes:
  - `npm test`: pass (`25 passed`, `1 skipped`).
  - `npm run build`: pass.
  - `npm run lint`: pass (warnings only; no errors).
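
For readers unfamiliar with the write-path limiter recorded in this log, a minimal fixed-window sketch follows. It is an assumption-laden illustration only. The real limiter in `apps/web/lib/server/rate-limit.ts` keeps its state in Postgres (migration `007_rate_limits.sql`), whereas this version uses an in-memory `Map` as a stand-in, and the class name, method name, and fixed-window algorithm are all hypothetical.

```typescript
// Hypothetical in-memory stand-in for a Postgres-backed limiter.
type WindowState = { count: number; resetAt: number };

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, WindowState>();

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs < 1) {
      throw new Error("limit and windowMs must be positive");
    }
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Record one attempt for `key`; `now` is injectable so tests stay deterministic.
  check(key: string, now: number = Date.now()): { allowed: boolean; retryAfterMs: number } {
    const current = this.windows.get(key);

    // No window yet, or the previous window has expired: start a fresh one.
    if (!current || now >= current.resetAt) {
      this.windows.set(key, { count: 1, resetAt: now + this.windowMs });
      return { allowed: true, retryAfterMs: 0 };
    }

    // Still inside the window and under the limit: count the attempt.
    if (current.count < this.limit) {
      current.count += 1;
      return { allowed: true, retryAfterMs: 0 };
    }

    // Over the limit: the caller would map this to RATE_LIMITED (429).
    return { allowed: false, retryAfterMs: current.resetAt - now };
  }
}
```

A caller might key the limiter by action and actor (for example `"entries:create:" + userId`, a made-up key format) and return the `RATE_LIMITED` error when `allowed` is `false`. An in-memory map does not share state across processes, which is presumably why the real implementation persists its counters in the database.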
### Risks / Notes to Revisit
- Workspace is intentionally dirty; commits must be path-scoped to avoid mixing unrelated changes.
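- `npm run lint` initially failed due to `next lint` invocation behavior in this environment; this was resolved later in the log above (lint now passes with warnings only). The remaining warnings and the disabled hook rules are candidates for a follow-up task.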