Self-Hosted Deployment Guide
Follow this playbook to run Relay in your own infrastructure while preserving the governance, telemetry, and marketplace tooling available in managed environments.
Prerequisites
- Kubernetes cluster with ingress controller (NGINX or Istio)
- Postgres 14+ and Redis 6+ instances (managed services or self-hosted)
- Access to container registry for publishing Relay services
- TLS certificates managed via your preferred mechanism (Let’s Encrypt, ACM, etc.)
1. Prepare Environment Files
Create environment files for each tier:
.env.selfhosted # Core API/runtime variables
.env.telemetry # Datadog / OpenTelemetry credentials
.env.marketplace # Stripe + billing credentials (optional)
Populate secrets using the Secrets Management guide. Never commit these files to source control.
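Before deploying, it can help to confirm that each environment file defines the variables this guide relies on. The following is a minimal sketch, not part of Relay's tooling: it parses plain KEY=VALUE lines only, and the REQUIRED_KEYS mapping of variables to files is an assumption made for illustration.

```python
from pathlib import Path

# Hypothetical mapping: the variable names appear in this guide, but which
# file each one belongs in is an assumption for illustration.
REQUIRED_KEYS = {
    ".env.selfhosted": ["CACHE_URL", "SESSION_STORE_BACKEND"],
    ".env.telemetry": ["DATADOG_API_KEY", "DATADOG_APP_KEY", "DATADOG_SITE"],
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty in the env text."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    for filename, required in REQUIRED_KEYS.items():
        path = Path(filename)
        if not path.exists():
            print(f"{filename}: file not found")
            continue
        missing = missing_keys(path.read_text(), required)
        print(f"{filename}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```

The script only reports key names, never values, so its output is safe to paste into a ticket or CI log.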
2. Provision Dependencies
- Postgres: run migrations with python3 scripts/deployment/railway_migration_runner.py, pointing it at your database URL.
- Redis: configure CACHE_URL and SESSION_STORE_BACKEND=redis.
- Object storage (optional): if you enable function artefact storage, set ARTEFACT_STORAGE_URL.
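A quick reachability check can catch networking mistakes before you run migrations. The sketch below uses only the Python standard library and opens a TCP connection to the host and port found in a connection URL. It does not authenticate or check server versions, and the example URLs in the comments are placeholders rather than Relay defaults.

```python
import socket
from urllib.parse import urlparse

# Default ports applied when the URL does not specify one.
DEFAULT_PORTS = {"postgres": 5432, "postgresql": 5432, "redis": 6379, "rediss": 6379}


def host_and_port(url: str) -> tuple[str, int]:
    """Extract (host, port) from a postgresql:// or redis:// style URL."""
    parsed = urlparse(url)
    if not parsed.hostname:
        raise ValueError("connection URL has no host")
    port = parsed.port or DEFAULT_PORTS.get(parsed.scheme)
    if port is None:
        raise ValueError(f"no port given and no default for scheme {parsed.scheme!r}")
    return parsed.hostname, port


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the service can be opened."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (placeholder URLs):
#   is_reachable("postgresql://relay:secret@db.internal:5432/relay")
#   is_reachable("redis://cache.internal/0")
```

Error messages deliberately omit the full URL so that embedded credentials do not end up in logs.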
3. Deploy Core Services
Use the provided manifests under deployment/self_hosted/ or the Helm charts:
kubectl apply -k deployment/self_hosted/base
Core components:
- API service (relay-api)
- Execution engine workers (relay-worker)
- Scheduler / background jobs
- Monitoring sidecars (Prometheus exporters if needed)
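After applying the manifests, you may want to block until the core components report a completed rollout. The sketch below wraps kubectl rollout status. It assumes relay-api and relay-worker are Kubernetes Deployments with exactly those names, which this guide does not confirm, so adjust the list to match your manifests.

```python
import subprocess

# Assumption: these component names are also the Deployment names.
DEPLOYMENTS = ["relay-api", "relay-worker"]


def rollout_status_cmd(name: str, namespace: str = "default", timeout: str = "120s") -> list[str]:
    """Build the kubectl command that waits for one Deployment to roll out."""
    return [
        "kubectl", "rollout", "status", f"deployment/{name}",
        "--namespace", namespace,
        f"--timeout={timeout}",
    ]


def wait_for_rollouts(namespace: str = "default") -> bool:
    """Wait for every Deployment in turn; return False if any rollout fails."""
    all_ok = True
    for name in DEPLOYMENTS:
        result = subprocess.run(rollout_status_cmd(name, namespace), check=False)
        all_ok = all_ok and result.returncode == 0
    return all_ok
```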
4. Configure Networking & TLS
- Map domains (api.example.com, console.example.com) through your ingress controller.
- Attach certificates. For Istio, reference the Istio Control Plane guide.
- Enable Cloudflare Access or your corporate IdP in front of public endpoints.
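Once certificates are attached, a small check can confirm that each domain serves a valid certificate and show how long it has left. This is a sketch using Python's ssl module, and the hostnames you pass in are your own. A failed handshake (untrusted chain, hostname mismatch) raises an exception rather than returning a number.

```python
import socket
import ssl
import time


def days_remaining(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def cert_days_left(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Connect with default verification and return days until the cert expires."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_remaining(cert["notAfter"], time.time())


# Example: cert_days_left("api.example.com")
```

If an access proxy or IdP sits in front of the endpoint, the handshake still completes at the edge, so the check reports the certificate that clients actually see.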
5. Observability
- Set DATADOG_API_KEY, DATADOG_APP_KEY, and DATADOG_SITE.
- Run the telemetry smoke script: scripts/monitoring/api_telemetry_smoke.sh.
- Confirm dashboards load the baseline metrics defined in deployment/datadog/.
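If the smoke script fails, it can be useful to rule out a bad API key first. The sketch below calls Datadog's public key-validation endpoint (GET /api/v1/validate with a DD-API-KEY header). It assumes DATADOG_SITE holds a bare site domain such as datadoghq.com or datadoghq.eu, which may not match how Relay interprets the variable, and it is independent of the smoke script itself.

```python
import json
import os
import urllib.error
import urllib.request


def validate_url(site: str) -> str:
    """Build the validation URL for a Datadog site domain."""
    return f"https://api.{site}/api/v1/validate"


def api_key_is_valid(api_key: str, site: str = "datadoghq.com") -> bool:
    """Return True if Datadog accepts the API key for the given site."""
    request = urllib.request.Request(validate_url(site), headers={"DD-API-KEY": api_key})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return bool(json.load(response).get("valid"))
    except urllib.error.HTTPError:
        # Datadog answers with an HTTP error status for rejected keys.
        return False


if __name__ == "__main__" and "DATADOG_API_KEY" in os.environ:
    site = os.environ.get("DATADOG_SITE", "datadoghq.com")
    print("API key valid:", api_key_is_valid(os.environ["DATADOG_API_KEY"], site))
```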
6. Marketplace & Billing (Optional)
If you expose the marketplace or billing flows:
- Configure Stripe keys (STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET).
- Enable marketplace background jobs for entitlements and payouts.
- Verify billing webhooks with the sandbox environment before going live.
7. Operational Checklist
- Secrets rotated and stored outside git
- Databases backed up on schedule
- Monitoring + alerting verified (latency, error rate, cost budgets)
- Incident runbooks linked in your internal wiki
- Zero-trust audit (python3 dev_process/validation/zero_trust_audit.py) passes
8. Upgrades & Maintenance
- Use Git tags + RUFID metadata to track deployed versions.
- Run migrations before rolling out new workers.
- Perform rolling updates via Kubernetes deployments or blue/green strategies.
- Archive artefacts and logs per compliance requirements.
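The ordering above (migrations first, then the rollout) can be enforced with a small driver that stops at the first failing step. The sketch below is illustrative: it assumes the migration runner reads its database URL from the environment, that re-applying the base manifests is how you roll out a new version, and that relay-worker is a Deployment name. None of these is confirmed by this guide.

```python
import subprocess


def upgrade_steps(namespace: str = "default") -> list[list[str]]:
    """Commands in the order they must run; later steps depend on earlier ones."""
    return [
        # 1. Apply schema migrations before any new worker starts.
        ["python3", "scripts/deployment/railway_migration_runner.py"],
        # 2. Roll out the new version (assumes manifests already reference it).
        ["kubectl", "apply", "-k", "deployment/self_hosted/base"],
        # 3. Wait for the workers to finish their rolling update.
        ["kubectl", "rollout", "status", "deployment/relay-worker", "--namespace", namespace],
    ]


def run_upgrade(steps: list[list[str]], runner=subprocess.run) -> bool:
    """Run each step in order and stop at the first non-zero exit code."""
    for command in steps:
        if runner(command, check=False).returncode != 0:
            print("upgrade halted at:", " ".join(command))
            return False
    return True
```

Passing the runner as a parameter keeps the driver easy to dry-run: substitute a function that only logs the commands to preview an upgrade without touching the cluster.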
For detailed architecture and runtime behaviour, see the Execution Engine Deep Dive.