Add release checklist and operations runbook docs

This commit is contained in:
Rbanh 2026-02-18 20:37:43 -05:00
parent bf64195c45
commit 7b6b176b1b
3 changed files with 123 additions and 0 deletions

View File

@ -32,6 +32,10 @@ Operational limits:
- `MAX_BODY_BYTES` (default `2097152`) limits request payload size. - `MAX_BODY_BYTES` (default `2097152`) limits request payload size.
- `MAX_REQUESTS_PER_MINUTE` (default `120`) applies per-client throttling for POST API endpoints. - `MAX_REQUESTS_PER_MINUTE` (default `120`) applies per-client throttling for POST API endpoints.
Docs:
- `docs/release-checklist.md`
- `docs/operations-runbook.md`
## REST API ## REST API
### `POST /compile` ### `POST /compile`

View File

@ -0,0 +1,76 @@
# Schemeta Operations Runbook
This runbook covers baseline production operation for Schemeta API + UI.
## Runtime
- Node.js 18+ recommended.
- Start command: `npm run start`
- Default bind: `0.0.0.0:8787`
## Environment Variables
- `PORT` (default `8787`)
- `MAX_BODY_BYTES` (default `2097152`)
- Hard limit for request body size on `POST` endpoints.
- `MAX_REQUESTS_PER_MINUTE` (default `120`)
- Per-client IP rate limit window for `POST` endpoints.
- `CORS_ORIGIN` (optional)
- If set, CORS is enabled for this origin only.
## Endpoints
- `GET /health`
- Liveness probe, returns process uptime and status.
- `GET /`
- Serves workspace UI.
- `POST /compile`
- Compile + render with ERC/diagnostics and layout metrics.
- `POST /analyze`
- Topology and diagnostics summary.
- `GET /mcp/ui-bundle`
- Metadata for MCP UI embedding.
## Production Checks
1. Verify process liveness:
- `curl -s http://localhost:${PORT:-8787}/health`
2. Verify compile endpoint:
- post `frontend/sample.schemeta.json` to `/compile`.
3. Verify analyze endpoint:
- post same sample to `/analyze`.
4. Verify rate limiting:
- exceed `MAX_REQUESTS_PER_MINUTE` with repeated `POST` and confirm `429`.
## Incident Playbook
## High error rate (5xx)
1. Check process logs for stack traces and malformed payload spikes.
2. Validate request body sizes; lower/raise `MAX_BODY_BYTES` as appropriate.
3. Reproduce with `frontend/sample.schemeta.json` to isolate model-driven payload issues.
4. Roll back to previous known-good tag if regression confirmed.
## Elevated 429 responses
1. Confirm traffic source and whether bursts are expected.
2. If trusted internal clients are throttled, tune `MAX_REQUESTS_PER_MINUTE`.
3. Consider fronting with reverse proxy rate limit tiers for external users.
## UI/compile mismatch reports
1. Capture JSON from user (`Copy Repro` in workspace).
2. Re-run through `/compile` and inspect `warnings`, `errors`, and `layout_metrics`.
3. Compare with last release baseline for crossing/overlap regressions.
## Release / Rollback
1. Follow `docs/release-checklist.md`.
2. Tag releases after checklist completion and test pass.
3. Keep previous stable tag ready for fast rollback.
## Observability Recommendations
- Add structured request logs at reverse proxy layer.
- Track latency percentiles for `/compile` and `/analyze`.
- Track per-endpoint status code rates and top warning/error IDs.

43
docs/release-checklist.md Normal file
View File

@ -0,0 +1,43 @@
# Schemeta Release Checklist
Use this checklist before cutting a release tag.
## Pre-merge
- [ ] All relevant issue acceptance criteria are reviewed.
- [ ] PRs are linked to tracked Gitea issues.
- [ ] No unresolved critical/major diagnostics regressions.
## Validation Gates
- [ ] `npm test` passes.
- [ ] Core smoke flow tested in UI:
- [ ] Load sample
- [ ] Edit component/pin/net/symbol
- [ ] Undo/redo works
- [ ] Export/import roundtrip works
- [ ] Compile/analyze API smoke checks return expected envelopes.
- [ ] MCP tool calls (`schemeta_compile`, `schemeta_analyze`, `schemeta_ui_bundle`) work.
## Visual Quality
- [ ] Representative circuits reviewed for routing readability.
- [ ] Labels remain legible at common zoom levels.
- [ ] No major overlap/crossing regressions vs previous release baseline.
## Security / Operations
- [ ] Deployment env vars verified:
- [ ] `PORT`
- [ ] `MAX_BODY_BYTES`
- [ ] `MAX_REQUESTS_PER_MINUTE`
- [ ] `CORS_ORIGIN`
- [ ] Rate limiting behavior manually validated.
- [ ] Health endpoint checked in target environment.
## Release Artifacts
- [ ] `README.md` updated for any interface/version changes.
- [ ] `api_version` and `schema_version` changes reviewed/documented.
- [ ] Changelog/release notes generated with notable UX/compat changes.
- [ ] Tag pushed and release announcement links to milestone/issues.