Deployment SOPs

Standard operating procedures for deploying Burdenoff products.

Deployment Overview

Environments

  • Alpha: Staging/testing environment (alphaapp.[product].com)
  • Production: Live production environment (app.[product].com)
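
The hostname pattern above can be captured in a small helper so scripts never hard-code URLs. This is a minimal sketch: the function name `env_url` and the product name `acme` are hypothetical, not part of the existing tooling.

```bash
# Map an environment name to its base URL, following the pattern above.
# "env_url" and the product "acme" are illustrative assumptions.
env_url() {
  local product=$1 env=$2
  case "$env" in
    alpha)      echo "https://alphaapp.${product}.com" ;;
    production) echo "https://app.${product}.com" ;;
    *)          echo "unknown environment: $env" >&2; return 1 ;;
  esac
}

env_url acme alpha   # prints: https://alphaapp.acme.com
```

Rejecting unknown environment names keeps a typo from silently targeting the wrong host.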

Deployment Methods

  • Automated: CI/CD via GitHub Actions
  • Manual: Reserved for emergencies and for cases where CI/CD cannot be used (see Manual Deployment)

Pre-Deployment Checklist

Code Review

  • All PRs reviewed and approved
  • No pending review comments
  • CI/CD pipeline passing
  • Tests passing (unit, integration, E2E)

Testing

  • Unit tests passing
  • Integration tests passing
  • Manual testing completed
  • Security scans clean
  • Performance tests passing

Documentation

  • README updated
  • API docs updated
  • Changelog updated
  • Migration notes documented

Configuration

  • Environment variables set
  • Secrets updated
  • Database migrations ready
  • Feature flags configured
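
The "Environment variables set" item can be checked mechanically before a deployment starts. The sketch below assumes nothing about which variables a product needs; `require_env` and the variable names in the usage comment are hypothetical.

```bash
# Report every required variable that is unset or empty.
# Returns 1 if at least one is missing, 0 otherwise.
require_env() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names must be trusted input).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (hypothetical variable names):
#   require_env DATABASE_URL REDIS_URL || exit 1
```

Reporting all missing variables in one pass avoids the fix-one, rerun, fix-the-next loop.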

Automated Deployment (CI/CD)

Alpha Deployment

Trigger

# Merge to alpha branch
git checkout alpha
git merge feature/my-feature
git push origin alpha

# Or create PR to alpha
gh pr create --base alpha

Pipeline Steps

  1. Checkout code
  2. Run linters and formatters
  3. Run security scans
  4. Run tests
  5. Build Docker image
  6. Push to Azure Container Registry
  7. Update Kubernetes deployment
  8. Run health checks
  9. Notify team
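
The ordering above matters because a failure at any step should prevent the later ones from running, so a broken build never reaches the registry or the cluster. The real pipeline runs in GitHub Actions; the following is only a shell sketch of that fail-fast behaviour, and `run_pipeline` and the `step_*` names are hypothetical.

```bash
# Run the named steps in order and stop at the first failure.
# Each argument is the name of a shell function or command.
run_pipeline() {
  local step
  for step in "$@"; do
    echo "==> $step"
    if ! "$step"; then
      echo "step failed: $step" >&2
      return 1
    fi
  done
}

# Usage (hypothetical step functions defined elsewhere):
#   run_pipeline step_lint step_scan step_test step_build step_push step_deploy
```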

Monitoring

# Watch deployment
kubectl rollout status deployment/[product]-backend -n [product]-alpha

# Check logs
kubectl logs -f deployment/[product]-backend -n [product]-alpha

# Check pods
kubectl get pods -n [product]-alpha

Production Deployment

Trigger

# Merge to main branch (requires approval)
git checkout main
git merge alpha
git push origin main

# Or create PR from alpha to main
gh pr create --base main --head alpha

Manual Approval

  1. PR created to main branch
  2. Review checklist completed
  3. Stakeholder approval
  4. Approval in GitHub Actions
  5. Deployment proceeds

Pipeline Steps

  1. All alpha pipeline steps
  2. Wait for manual approval
  3. Deploy to production
  4. Run smoke tests
  5. Monitor metrics
  6. Update status page
  7. Notify stakeholders

Manual Deployment

When to Use

  • Emergency hotfixes
  • Rollbacks
  • Infrastructure changes
  • CI/CD unavailable

Process

1. Build Image

# Generate the tag once so every command refers to the same image
# (calling $(date +%s) in each command yields a different tag every time)
TAG="manual-$(date +%s)"

# Build Docker image
docker build -t [product]-backend:$TAG .

# Tag for ACR
docker tag [product]-backend:$TAG \
burdenoff.azurecr.io/[product]-backend:$TAG

# Login to ACR
az acr login --name burdenoff

# Push image
docker push burdenoff.azurecr.io/[product]-backend:$TAG

2. Deploy to Kubernetes

# Update deployment (run in the same shell session so the image tag from the build step is reused)
kubectl set image deployment/[product]-backend \
[product]-backend=burdenoff.azurecr.io/[product]-backend:$TAG \
-n [product]-alpha

# Watch rollout
kubectl rollout status deployment/[product]-backend -n [product]-alpha

3. Verify Deployment

# Check pods
kubectl get pods -n [product]-alpha

# Check logs
kubectl logs -f deployment/[product]-backend -n [product]-alpha

# Run health check
curl https://alphaapp.[product].com/health
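
A single curl right after a rollout can fail simply because the new pods are still starting. A bounded retry makes the check less flaky. This is a minimal sketch; the helper name `retry` and the attempt and delay values in the usage comment are assumptions, not project settings.

```bash
# Retry a command up to $1 times, sleeping $2 seconds between attempts.
# Succeeds as soon as the command succeeds; fails after the last attempt.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (illustrative values):
#   retry 10 3 curl -fsS https://alphaapp.[product].com/health
```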

Database Migrations

Pre-Migration

  • Backup database
  • Test migration locally
  • Review migration script
  • Plan rollback strategy
  • Schedule maintenance window

Running Migration

Node.js (Prisma)

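# Preview pending migrations (read-only; --create-only belongs to "prisma migrate dev", not "migrate deploy")
npx prisma migrate status

# Apply migration
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- npm run migrate:deploy

# Verify that no migrations remain pending
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- npx prisma migrate status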

Python (Alembic)

# Check current version
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- \
poetry run alembic current

# Apply migration
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- \
poetry run alembic upgrade head

# Verify (the current revision should now be marked as head)
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- \
poetry run alembic current

Post-Migration

  • Verify data integrity
  • Check application logs
  • Monitor error rates
  • Test affected features
  • Update documentation

Rollback Procedures

Application Rollback

Kubernetes Rollback

# View rollout history
kubectl rollout history deployment/[product]-backend -n [product]-alpha

# Rollback to previous version
kubectl rollout undo deployment/[product]-backend -n [product]-alpha

# Rollback to specific revision
kubectl rollout undo deployment/[product]-backend \
--to-revision=3 \
-n [product]-alpha
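
The rollout check and the rollback can be combined so that a stalled rollout is undone automatically. The sketch below is one possible wrapper, not the team's established script: `rollout_or_undo`, the default 300s timeout, and the `KUBECTL` override are assumptions.

```bash
# Wait for a rollout to finish; if it does not complete within the timeout,
# undo it and return failure so the caller can alert the on-call team.
# KUBECTL can be overridden, for example with a stub in tests.
rollout_or_undo() {
  local deployment=$1 namespace=$2 timeout=${3:-300s}
  if "${KUBECTL:-kubectl}" rollout status "deployment/$deployment" \
       -n "$namespace" --timeout="$timeout"; then
    return 0
  fi
  echo "rollout failed, rolling back $deployment" >&2
  "${KUBECTL:-kubectl}" rollout undo "deployment/$deployment" -n "$namespace"
  return 1
}

# Usage:
#   rollout_or_undo [product]-backend [product]-alpha 120s
```

The function returns failure even when the undo succeeds, since the deployment itself did not go through.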

Helm Rollback

# List releases
helm list -n [product]-alpha

# View history
helm history [product] -n [product]-alpha

# Rollback
helm rollback [product] [revision] -n [product]-alpha

Database Rollback

Prisma

# Prisma Migrate has no built-in rollback command
# Restore the database from the backup taken before the migration

Alembic

# Rollback one version
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- \
poetry run alembic downgrade -1

# Rollback to specific version
kubectl exec -it deployment/[product]-backend -n [product]-alpha -- \
poetry run alembic downgrade [revision]

Health Checks

Application Health

# Check liveness
curl https://alphaapp.[product].com/health/live

# Check readiness
curl https://alphaapp.[product].com/health/ready

# Expected response
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "storage": "ok"
  }
}
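
An HTTP 200 alone does not prove every dependency is up, so a script may want to inspect the body as well. The following is a dependency-free sketch that assumes the exact response shape shown above (a top-level "status" and a flat "checks" object); `health_ok` is a hypothetical name, and a real JSON parser such as jq would be more robust than this text matching.

```bash
# Read a health response on stdin. Succeed only if status is "healthy"
# and every entry in "checks" is "ok". Assumes the flat shape shown above.
health_ok() {
  local body checks
  body=$(tr -d '[:space:]')                 # drop whitespace to simplify matching
  case "$body" in
    *'"status":"healthy"'*) ;;
    *) return 1 ;;
  esac
  # Extract the contents of the "checks" object (no nested objects expected).
  checks=$(printf '%s' "$body" | sed -n 's/.*"checks":{\([^}]*\)}.*/\1/p')
  # Fail if any check has a value other than "ok" (or if checks is missing).
  ! printf '%s\n' "$checks" | tr ',' '\n' | grep -qv ':"ok"$'
}

# Usage:
#   curl -fsS https://alphaapp.[product].com/health/ready | health_ok
```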

Kubernetes Health

# Check pod status
kubectl get pods -n [product]-alpha

# Check events
kubectl get events -n [product]-alpha --sort-by='.lastTimestamp'

# Check resource usage
kubectl top pods -n [product]-alpha

Smoke Tests

Automated Smoke Tests

# Run smoke tests
npm run test:smoke

# Or using curl
curl -f https://alphaapp.[product].com/health || exit 1
curl -f https://alphaapp.[product].com/api/status || exit 1
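
Chaining curl calls with `|| exit 1` stops at the first failure, which hides any later broken endpoints. A loop that probes everything and then reports gives a fuller picture. This is a sketch under assumptions: `smoke`, the 10-second timeout, and the `CURL` override are illustrative.

```bash
# Probe every URL and report all failures instead of stopping at the first.
# CURL can be overridden, for example to stub the probe in a test.
smoke() {
  local url failed=0
  for url in "$@"; do
    if "${CURL:-curl}" -fsS -o /dev/null --max-time 10 "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# Usage:
#   smoke https://alphaapp.[product].com/health https://alphaapp.[product].com/api/status
```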

Manual Smoke Tests

  • Homepage loads
  • Login works
  • API responds
  • Database queries work
  • File uploads work
  • Key features functional

Monitoring Post-Deployment

Metrics to Watch

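# 5xx error rate (requests per second)
rate(http_requests_total{status=~"5.."}[5m])

# Response time (p95), aggregated across series
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))

# Request rate (requests per second)
rate(http_requests_total[5m])

# Pod restarts (cumulative counter; watch for increases)
kube_pod_container_status_restarts_total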

Alerts to Monitor

  • High error rate
  • Slow response time
  • Pod restarts
  • High CPU/memory
  • Failed health checks

Status Page Updates

During Deployment

Title: Scheduled Maintenance - [Product]

We are performing scheduled maintenance on [Product].
Expected duration: 30 minutes
Impact: Brief service interruption possible

Status: In Progress

After Deployment

Title: Maintenance Complete - [Product]

Scheduled maintenance has been completed successfully.
All systems are operational.

Status: Resolved

Communication

Team Notification

Slack: #deployments
Message: "🚀 Deploying [product] to alpha
- Version: v1.2.0
- Changes: [link to changelog]
- ETA: 10 minutes"

Stakeholder Notification

Subject: [Product] Deployment - v1.2.0

We are deploying version 1.2.0 to production.

New Features:
- Feature 1
- Feature 2

Bug Fixes:
- Fix 1
- Fix 2

Expected completion: [time]

Deployment Schedule

Alpha

  • Anytime during business hours
  • After tests pass
  • Reviewed by team

Production

  • Tuesday-Thursday (preferred)
  • 10:00 AM - 4:00 PM UTC
  • No Friday deployments
  • No deployments before holidays
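
The production window can be enforced by a guard at the start of a deploy script. The sketch below treats the preferred Tuesday-Thursday range as a hard rule and reads 4:00 PM as an exclusive end, both of which are assumptions; it does not know about holidays. The name `in_prod_window` is hypothetical.

```bash
# Succeed if the given UTC day and hour fall inside the production window.
# dow: ISO day of week (1 = Monday ... 7 = Sunday); hour: 00-23.
in_prod_window() {
  local dow=$1 hour=${2#0}      # strip a leading zero, e.g. "09" -> "9"
  hour=${hour:-0}               # "0" becomes empty after stripping
  [ "$dow" -ge 2 ] && [ "$dow" -le 4 ] && [ "$hour" -ge 10 ] && [ "$hour" -lt 16 ]
}

if in_prod_window "$(date -u +%u)" "$(date -u +%H)"; then
  echo "inside the production deployment window"
else
  echo "outside the production deployment window"
fi
```

Emergency hotfixes are exempt from the schedule, so a real script would need an explicit override.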

Emergency Hotfix

  • Anytime as needed
  • Follow expedited process
  • Notify on-call team
  • Document in post-mortem

Post-Deployment

Verification

  • Deployment successful
  • Health checks passing
  • Smoke tests passing
  • Metrics normal
  • No alerts firing
  • Status page updated
  • Team notified

Documentation

  • Update changelog
  • Document any issues
  • Update runbooks
  • Share learnings

Emergency Procedures

Deployment Failure

  1. Alert on-call team
  2. Check logs and metrics
  3. Attempt rollback
  4. If rollback fails, restore from backup
  5. Investigate root cause
  6. Document incident

Data Loss

  1. Stop all write operations
  2. Restore from latest backup
  3. Verify data integrity
  4. Resume operations
  5. Incident post-mortem
