Schemeta Operations Runbook
This runbook covers baseline production operation for Schemeta API + UI.
Runtime
- Node.js 18+ recommended.
- Start command: npm run start
- Default bind: 0.0.0.0:8787
Environment Variables
- PORT (default 8787)
- MAX_BODY_BYTES (default 2097152)
  - Hard limit for request body size on POST endpoints.
- MAX_REQUESTS_PER_MINUTE (default 120)
  - Per-client IP rate limit window for POST endpoints.
- SCHEMETA_AUTH_TOKEN (optional)
  - When set, all POST API routes require either:
    - Authorization: Bearer <token>
    - x-api-key: <token>
- CORS_ORIGIN (optional)
  - If set, CORS is enabled for this origin only.
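The defaults and the two accepted auth headers above can be mirrored in an operator-side helper so that smoke-test scripts agree with the server. This is a minimal sketch, not the server's own configuration code: `resolve_settings` and `auth_headers` are hypothetical names, and treating an empty optional variable as unset is an assumption.

```python
import os

# Defaults copied from the Environment Variables list above.
DEFAULTS = {"PORT": 8787, "MAX_BODY_BYTES": 2097152, "MAX_REQUESTS_PER_MINUTE": 120}


def resolve_settings(env):
    """Return the effective settings for a given environment mapping."""
    settings = {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}
    # Assumption: an absent or empty optional variable means the feature is off.
    settings["SCHEMETA_AUTH_TOKEN"] = env.get("SCHEMETA_AUTH_TOKEN") or None
    settings["CORS_ORIGIN"] = env.get("CORS_ORIGIN") or None
    return settings


def auth_headers(token, use_api_key=False):
    """Build the header a POST request needs when a token is configured."""
    if not token:
        return {}
    if use_api_key:
        return {"x-api-key": token}
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    settings = resolve_settings(os.environ)
    # Report whether auth is on without echoing the token itself.
    print(settings["PORT"], "auth enabled:", settings["SCHEMETA_AUTH_TOKEN"] is not None)
```

Either header form satisfies the documented requirement, so a client only needs to send one of them.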
Endpoints
- GET /health
  - Liveness probe, returns process uptime and status.
- GET /
  - Serves workspace UI.
- POST /compile
  - Compile + render with ERC/diagnostics and layout metrics.
- POST /analyze
  - Topology and diagnostics summary.
- GET /mcp/ui-bundle
  - Metadata for MCP UI embedding.
Request Correlation and Audit Logs
- Every response includes x-request-id.
- API envelopes include request_id for correlation in clients and logs.
- Server emits one JSON audit log entry per request on response finish with:
  - request_id
  - method
  - path
  - status
  - duration_ms
  - client
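Given the audit fields listed above, a request ID reported by a client can be matched to its log entry with a short script. The sketch below assumes each audit entry is written as a single JSON object on its own line, which the runbook does not state. `parse_audit_lines` and `find_request` are hypothetical helper names.

```python
import json

AUDIT_FIELDS = ("request_id", "method", "path", "status", "duration_ms", "client")


def parse_audit_lines(lines):
    """Yield audit entries from log lines, skipping anything that is not one."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # stack traces and other non-JSON output
        if isinstance(entry, dict) and all(field in entry for field in AUDIT_FIELDS):
            yield entry


def find_request(lines, request_id):
    """Return the audit entry for one request ID, or None if it is absent."""
    for entry in parse_audit_lines(lines):
        if entry["request_id"] == request_id:
            return entry
    return None
```

Calling `find_request(open("app.log"), "<id from x-request-id>")` would return the matching entry; the file name here is only a placeholder.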
Production Checks
- Verify process liveness:
  - curl -s http://localhost:${PORT:-8787}/health
- Verify compile endpoint:
  - Post frontend/sample.schemeta.json to /compile.
- Verify analyze endpoint:
  - Post the same sample to /analyze.
- Verify rate limiting:
  - Exceed MAX_REQUESTS_PER_MINUTE with repeated POST and confirm 429.
- Verify auth (if enabled):
  - Request POST /compile without token and confirm 401.
  - Request with valid token and confirm 200.
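The compile, analyze, and auth checks above can be scripted with the Python standard library. This is a sketch under assumptions: the `Content-Type: application/json` header is a guess at what the server expects, and `post_status` and `run_checks` are hypothetical names. The rate-limit check is left out on purpose because running it consumes the per-minute budget.

```python
import urllib.error
import urllib.request


def post_status(url, body, headers=None):
    """POST raw bytes as JSON and return only the HTTP status code."""
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/json")  # assumed content type
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 401, 429 and other non-2xx arrive as exceptions


def run_checks(base_url, body, token=None, send=post_status):
    """Return {check name: passed} for the compile, analyze, and auth checks."""
    auth = {"Authorization": f"Bearer {token}"} if token else {}
    results = {
        "compile": send(f"{base_url}/compile", body, auth) == 200,
        "analyze": send(f"{base_url}/analyze", body, auth) == 200,
    }
    if token:
        # Only meaningful when SCHEMETA_AUTH_TOKEN is set on the server.
        results["auth_rejects_missing_token"] = send(f"{base_url}/compile", body, {}) == 401
    return results
```

A typical call reads the sample file as bytes and passes it in, for example `run_checks("http://localhost:8787", open("frontend/sample.schemeta.json", "rb").read())`. The `send` parameter exists so the logic can be exercised without a running server.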
Incident Playbook
High error rate (5xx)
- Check process logs for stack traces and malformed payload spikes.
- Validate request body sizes; lower/raise MAX_BODY_BYTES as appropriate.
- Reproduce with frontend/sample.schemeta.json to isolate model-driven payload issues.
- Roll back to previous known-good tag if regression confirmed.
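When validating body sizes, it helps to see how close a captured payload sits to the limit. The following sketch compares a payload against the documented default. `body_size_report` is a hypothetical helper, and whether the server rejects at exactly the limit or only above it is an assumption (strictly above is used here).

```python
DEFAULT_MAX_BODY_BYTES = 2097152  # documented default (2 MiB)


def body_size_report(payload, limit=DEFAULT_MAX_BODY_BYTES):
    """Describe how a payload's size in bytes relates to the body limit."""
    size = len(payload)
    return {
        "bytes": size,
        "limit": limit,
        "over_limit": size > limit,  # assumption: the limit itself is allowed
        "percent_of_limit": round(100 * size / limit, 1),
    }
```

Pass the payload as bytes (for example, a file opened in binary mode) so the count matches what goes over the wire.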
Elevated 429 responses
- Confirm traffic source and whether bursts are expected.
- If trusted internal clients are throttled, tune MAX_REQUESTS_PER_MINUTE.
- Consider fronting with reverse proxy rate limit tiers for external users.
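To confirm the traffic source, the audit log can be grouped by client for throttled requests. This sketch assumes one JSON object per log line and that `status` is logged as an integer; `throttled_clients` is a hypothetical name.

```python
import json
from collections import Counter


def throttled_clients(lines, top=5):
    """Return the clients with the most 429 responses as (client, count) pairs."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON log output
        if isinstance(entry, dict) and entry.get("status") == 429:
            counts[entry.get("client", "unknown")] += 1
    return counts.most_common(top)
```

A single client dominating the result points at one noisy caller; a flat distribution suggests the limit itself is too low for current traffic.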
UI/compile mismatch reports
- Capture JSON from user (Copy Repro in workspace).
- Re-run through /compile and inspect warnings, errors, and layout_metrics.
- Compare with last release baseline for crossing/overlap regressions.
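The baseline comparison can be reduced to a small diff over the three fields named above. The response shape is not specified in this runbook, so this sketch assumes `warnings` and `errors` are lists and `layout_metrics` is a flat mapping, all found on the object passed in. `compare_compile_results` is a hypothetical helper.

```python
def compare_compile_results(baseline, current):
    """Summarize differences between two parsed /compile responses."""
    summary = {}
    for key in ("warnings", "errors"):
        before, after = baseline.get(key) or [], current.get(key) or []
        summary[key] = {"baseline": len(before), "current": len(after)}
    old_metrics = baseline.get("layout_metrics") or {}
    new_metrics = current.get("layout_metrics") or {}
    # Report only metrics whose value changed, as (baseline, current) pairs.
    summary["layout_metrics"] = {
        name: (old_metrics.get(name), new_metrics.get(name))
        for name in sorted(set(old_metrics) | set(new_metrics))
        if old_metrics.get(name) != new_metrics.get(name)
    }
    return summary
```

If the API envelope nests these fields under another key, pass that inner object instead of the whole response.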
Release / Rollback
- Follow docs/release-checklist.md.
- Tag releases after checklist completion and test pass.
- Keep previous stable tag ready for fast rollback.
Observability Recommendations
- Structured request logs are emitted by the app; keep proxy logs for edge-level traces.
- Track latency percentiles for /compile and /analyze.
- Track per-endpoint status code rates and top warning/error IDs.
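Where no metrics pipeline is in place yet, latency percentiles can be derived directly from the audit log's `path` and `duration_ms` fields. This is a sketch using the nearest-rank method, assuming one JSON object per log line; `percentile` and `latency_by_path` are hypothetical names.

```python
import json
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def latency_by_path(lines, paths=("/compile", "/analyze")):
    """Return {path: {"count", "p50", "p95", "p99"}} from audit log lines."""
    durations = defaultdict(list)
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore non-JSON log output
        if not isinstance(entry, dict) or "duration_ms" not in entry:
            continue
        if entry.get("path") in paths:
            durations[entry["path"]].append(entry["duration_ms"])
    report = {}
    for path, values in durations.items():
        values.sort()
        report[path] = {
            "count": len(values),
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "p99": percentile(values, 99),
        }
    return report
```

Nearest-rank always returns an observed value, which keeps the output easy to cross-check against individual log entries.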