Deployment SOPs
Standard operating procedures for deploying Burdenoff products.
Deployment Overview
Environments
- Alpha: Staging/testing environment (alphaapp.[product].com)
- Production: Live production environment (app.[product].com)
Deployment Methods
- Automated: CI/CD via GitHub Actions
- Manual: Emergency deployments only
Pre-Deployment Checklist
Code Review
- All PRs reviewed and approved
- No pending review comments
- CI/CD pipeline passing
- Tests passing (unit, integration, E2E)
Testing
- Unit tests passing
- Integration tests passing
- Manual testing completed
- Security scans clean
- Performance tests passing
Documentation
- README updated
- API docs updated
- Changelog updated
- Migration notes documented
Configuration
- Environment variables set
- Secrets updated
- Database migrations ready
- Feature flags configured
Automated Deployment (CI/CD)
Alpha Deployment
Trigger
# Merge to alpha branch
git checkout alpha
git merge feature/my-feature
git push origin alpha
# Or create PR to alpha
gh pr create --base alpha
Pipeline Steps
- Checkout code
- Run linters and formatters
- Run security scans
- Run tests
- Build Docker image
- Push to Azure Container Registry
- Update Kubernetes deployment
- Run health checks
- Notify team
Monitoring
# Watch deployment
kubectl rollout status deployment/[product]-backend -n [product]-alpha
# Check logs
kubectl logs -f deployment/[product]-backend -n [product]-alpha
# Check pods
kubectl get pods -n [product]-alpha
Production Deployment
Trigger
# Merge to main branch (requires approval)
git checkout main
git merge alpha
git push origin main
Manual Approval
- PR created to main branch
- Review checklist completed
- Stakeholder approval
- Approval in GitHub Actions
- Deployment proceeds
Pipeline Steps
- All alpha pipeline steps
- Wait for manual approval
- Deploy to production
- Run smoke tests
- Monitor metrics
- Update status page
- Notify stakeholders
Manual Deployment
When to Use
- Emergency hotfixes
- Rollbacks
- Infrastructure changes
- CI/CD unavailable
Process
1. Build Image
# Build Docker image
docker build -t [product]-backend:manual-$(date +%s) .
# Tag for ACR
docker tag [product]-backend:manual-$(date +%s) \
burdenoff.azurecr.io/[product]-backend:manual-$(date +%s)
# Login to ACR
az acr login --name burdenoff
# Push image
docker push burdenoff.azurecr.io/[product]-backend:manual-$(date +%s)
2. Deploy to Kubernetes
# Update deployment
kubectl set image deployment/[product]-backend \
[product]-backend=burdenoff.azurecr.io/[product]-backend:manual-$(date +%s) \
-n [product]-alpha
# Watch rollout
kubectl rollout status deployment/[product]-backend -n [product]-alpha
3. Verify Deployment
# Check pods
kubectl get pods -n [product]-alpha
# Check logs
kubectl logs -f deployment/[product]-backend -n [product]-alpha
# Run health check
curl https://alphaapp.[product].com/health
Database Migrations
Pre-Migration
- Backup database
- Test migration locally
- Review migration script
- Plan rollback strategy
- Schedule maintenance window
Running Migration
Node.js (Prisma)
# Dry run
npm run migrate:deploy -- --create-only
# Apply migration
kubectl exec -it deployment/[product]-backend -- npm run migrate:deploy
# Verify
kubectl exec -it deployment/[product]-backend -- npm run prisma db pull
Python (Alembic)
# Check current version
kubectl exec -it deployment/[product]-backend -- \
poetry run alembic current
# Apply migration
kubectl exec -it deployment/[product]-backend -- \
poetry run alembic upgrade head
# Verify
kubectl exec -it deployment/[product]-backend -- \
poetry run alembic history
Post-Migration
- Verify data integrity
- Check application logs
- Monitor error rates
- Test affected features
- Update documentation
Rollback Procedures
Application Rollback
Kubernetes Rollback
# View rollout history
kubectl rollout history deployment/[product]-backend -n [product]-alpha
# Rollback to previous version
kubectl rollout undo deployment/[product]-backend -n [product]-alpha
# Rollback to specific revision
kubectl rollout undo deployment/[product]-backend \
--to-revision=3 \
-n [product]-alpha
Helm Rollback
# List releases
helm list -n [product]-alpha
# View history
helm history [product] -n [product]-alpha
# Rollback
helm rollback [product] [revision] -n [product]-alpha
Database Rollback
Prisma
# Rollback is not automatic - use restore from backup
# Restore database backup from before migration
Alembic
# Rollback one version
kubectl exec -it deployment/[product]-backend -- \
poetry run alembic downgrade -1
# Rollback to specific version
kubectl exec -it deployment/[product]-backend -- \
poetry run alembic downgrade [revision]
Health Checks
Application Health
# Check liveness
curl https://alphaapp.[product].com/health/live
# Check readiness
curl https://alphaapp.[product].com/health/ready
# Expected response
{
"status": "healthy",
"checks": {
"database": "ok",
"redis": "ok",
"storage": "ok"
}
}
Kubernetes Health
# Check pod status
kubectl get pods -n [product]-alpha
# Check events
kubectl get events -n [product]-alpha --sort-by='.lastTimestamp'
# Check resource usage
kubectl top pods -n [product]-alpha
Smoke Tests
Automated Smoke Tests
# Run smoke tests
npm run test:smoke
# Or using curl
curl -f https://alphaapp.[product].com/health || exit 1
curl -f https://alphaapp.[product].com/api/status || exit 1
Manual Smoke Tests
- Homepage loads
- Login works
- API responds
- Database queries work
- File uploads work
- Key features functional
Monitoring Post-Deployment
Metrics to Watch
# Error rate
rate(http_requests_total{status=~"5.."}[5m])
# Response time (p95)
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
# Request rate
rate(http_requests_total[5m])
# Pod restarts
kube_pod_container_status_restarts_total
Alerts to Monitor
- High error rate
- Slow response time
- Pod restarts
- High CPU/memory
- Failed health checks
Status Page Updates
During Deployment
Title: Scheduled Maintenance - [Product]
We are performing scheduled maintenance on [Product].
Expected duration: 30 minutes
Impact: Brief service interruption possible
Status: In Progress
After Deployment
Title: Maintenance Complete - [Product]
Scheduled maintenance has been completed successfully.
All systems are operational.
Status: Resolved
Communication
Team Notification
Slack: #deployments
Message: "🚀 Deploying [product] to alpha
- Version: v1.2.0
- Changes: [link to changelog]
- ETA: 10 minutes"
Stakeholder Notification
Subject: [Product] Deployment - v1.2.0
We are deploying version 1.2.0 to production.
New Features:
- Feature 1
- Feature 2
Bug Fixes:
- Fix 1
- Fix 2
Expected completion: [time]
Deployment Schedule
Alpha
- Anytime during business hours
- After tests pass
- Reviewed by team
Production
- Tuesday-Thursday (preferred)
- 10:00 AM - 4:00 PM UTC
- No Friday deployments
- No deployments before holidays
Emergency Hotfix
- Anytime as needed
- Follow expedited process
- Notify on-call team
- Document in post-mortem
Post-Deployment
Verification
- Deployment successful
- Health checks passing
- Smoke tests passing
- Metrics normal
- No alerts firing
- Status page updated
- Team notified
Documentation
- Update changelog
- Document any issues
- Update runbooks
- Share learnings
Emergency Procedures
Deployment Failure
- Alert on-call team
- Check logs and metrics
- Attempt rollback
- If rollback fails, restore from backup
- Investigate root cause
- Document incident
Data Loss
- Stop all write operations
- Restore from latest backup
- Verify data integrity
- Resume operations
- Incident post-mortem