Ep.10 FastAPI Docker Deployment: Complete Production Guide

To deploy FastAPI to production with Docker: build a multi-stage Dockerfile on a Python slim base for a smaller image, use Docker Compose locally with PostgreSQL and Redis services, add health checks and environment-based configuration, and automate testing and deployment with GitHub Actions. For production, orchestrate with Kubernetes (resource limits, auto-scaling), terminate SSL at an Nginx reverse proxy, monitor with Prometheus and Grafana, and aggregate logs centrally. This architecture yields a scalable, maintainable, production-ready deployment.

🎓 What You’ll Learn

In the previous episode, we covered JWT authentication. By the end of this tutorial, you’ll be able to:

  • Containerize FastAPI with Docker multi-stage builds
  • Set up local development with Docker Compose
  • Implement CI/CD pipelines with GitHub Actions
  • Deploy to Kubernetes clusters
  • Configure production-grade Nginx reverse proxy
  • Set up monitoring and alerting
  • Implement log aggregation
  • Deploy to cloud platforms (AWS, GCP, Azure)
  • Optimize for performance and cost
  • Implement zero-downtime deployments

📖 Understanding Production Deployment

Deployment Architecture

┌─────────────────────────────────────────────────┐
│                  Load Balancer                  │
│              (Nginx / Cloud LB)                 │
└────────────┬────────────────────────┬───────────┘
             │                        │
    ┌────────▼────────┐      ┌───────▼────────┐
    │   API Server 1  │      │  API Server 2  │
    │   (Container)   │      │  (Container)   │
    └────────┬────────┘      └────────┬────────┘
             │                        │
             └────────────┬───────────┘
                          │
                 ┌────────▼────────┐
                 │   PostgreSQL    │
                 │    (Managed)    │
                 └─────────────────┘

🛠️ Step-by-Step Implementation

Step 1: Create Production Dockerfile

Create Dockerfile:

# Multi-stage build for smaller final image
FROM python:3.11-slim as builder

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    postgresql-client \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir --user -r requirements.txt

# Final stage
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy Python dependencies from builder
COPY --from=builder /root/.local /root/.local

# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH

# Copy application code
COPY . .

# Create non-root user
RUN useradd -m -u 1000 fastapi && \
    chown -R fastapi:fastapi /app

# Switch to non-root user
USER fastapi

# Expose port
EXPOSE 8000

# Health check (stdlib urllib avoids relying on the requests package being installed)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/api/v1/health')"

# Run application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]

Create .dockerignore:

# Python
__pycache__
*.pyc
*.pyo
*.pyd
.Python
*.so
*.egg
*.egg-info
dist
build
venv/
ENV/

# IDEs
.vscode/
.idea/
*.swp
*.swo

# Git
.git
.gitignore

# Documentation
*.md
docs/

# Tests
tests/
test_*.py
.pytest_cache

# Environment
.env
.env.local
.env.production

# OS
.DS_Store
Thumbs.db

# Logs
*.log

# Database
*.db
*.sqlite

# Alembic
alembic/versions/*.pyc

Step 2: Create Docker Compose for Development

Create docker-compose.yml:

version: '3.8'

services:
  # PostgreSQL Database
  postgres:
    image: postgres:15-alpine
    container_name: aiverse_postgres
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER:-aiverse_user}
      POSTGRES_PASSWORD: ${DB_PASSWORD:-aiverse_pass}
      POSTGRES_DB: ${DB_NAME:-aiverse_db}
    ports:
      - "${DB_PORT:-5432}:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-aiverse_user}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - aiverse-network

  # Redis Cache
  redis:
    image: redis:7-alpine
    container_name: aiverse_redis
    restart: unless-stopped
    ports:
      - "${REDIS_PORT:-6379}:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 5
    networks:
      - aiverse-network

  # FastAPI Application
  api:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: aiverse_api
    restart: unless-stopped
    ports:
      - "${API_PORT:-8000}:8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://${DB_USER:-aiverse_user}:${DB_PASSWORD:-aiverse_pass}@postgres:5432/${DB_NAME:-aiverse_db}
      - REDIS_URL=redis://redis:6379/0
      - SECRET_KEY=${SECRET_KEY}
      - ENVIRONMENT=${ENVIRONMENT:-production}
      - DEBUG=${DEBUG:-False}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - ./logs:/app/logs
    networks:
      - aiverse-network
    healthcheck:
      # python:3.11-slim does not ship curl, so probe with stdlib urllib instead
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/api/v1/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # Nginx Reverse Proxy
  nginx:
    image: nginx:alpine
    container_name: aiverse_nginx
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
      - ./nginx/logs:/var/log/nginx
    depends_on:
      - api
    networks:
      - aiverse-network

volumes:
  postgres_data:
    driver: local
  redis_data:
    driver: local

networks:
  aiverse-network:
    driver: bridge

Create docker-compose.dev.yml for development:

version: '3.8'

services:
  postgres:
    environment:
      POSTGRES_DB: aiverse_dev
    ports:
      - "5433:5432"

  api:
    build:
      context: .
      dockerfile: Dockerfile.dev
    environment:
      - DEBUG=True
      - ENVIRONMENT=development
      - DATABASE_ECHO=True
    volumes:
      - .:/app
      - /app/venv
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Step 3: Create Nginx Configuration

Create nginx/nginx.conf:

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    'rt=$request_time uct="$upstream_connect_time" '
                    'uht="$upstream_header_time" urt="$upstream_response_time"';

    access_log /var/log/nginx/access.log main;

    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml text/javascript 
               application/json application/javascript application/xml+rss 
               application/rss+xml font/truetype font/opentype 
               application/vnd.ms-fontobject image/svg+xml;

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=auth_limit:10m rate=5r/m;

    # Upstream
    upstream api_backend {
        least_conn;
        server api:8000 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    # HTTP server (redirect to HTTPS)
    server {
        listen 80;
        server_name api.yourdomain.com;
        
        location /.well-known/acme-challenge/ {
            root /var/www/certbot;
        }

        location / {
            return 301 https://$server_name$request_uri;
        }
    }

    # HTTPS server
    server {
        listen 443 ssl http2;
        server_name api.yourdomain.com;

        # SSL configuration
        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;

        # Security headers
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
        add_header X-Frame-Options "DENY" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-XSS-Protection "1; mode=block" always;
        add_header Referrer-Policy "strict-origin-when-cross-origin" always;

        # Client body size
        client_max_body_size 10M;

        # API endpoints
        location /api/ {
            limit_req zone=api_limit burst=20 nodelay;

            proxy_pass http://api_backend;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_cache_bypass $http_upgrade;
            
            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;
        }

        # Auth endpoints with stricter rate limiting
        location /api/v1/auth/ {
            limit_req zone=auth_limit burst=5 nodelay;
            
            proxy_pass http://api_backend;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # Health check
        location /health {
            access_log off;
            proxy_pass http://api_backend/api/v1/health;
        }

        # Metrics (restrict access)
        location /metrics {
            allow 10.0.0.0/8;
            deny all;
            proxy_pass http://api_backend/metrics;
        }
    }
}
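
The `limit_req` zones above use nginx's leaky-bucket rate limiting; the closely related token-bucket idea is easy to see in a few lines of Python. A minimal illustrative sketch (not part of the app; the 10 req/s rate and burst of 20 mirror the `api_limit` zone):

```python
import time

class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 10 req/s with a burst of 20, like rate=10r/s + burst=20 above
bucket = TokenBucket(rate=10, capacity=20)
allowed = sum(bucket.allow() for _ in range(30))
print(allowed)  # roughly 20: the burst is consumed, then requests are throttled
```

One difference: nginx's `burst` queues excess requests rather than rejecting them outright (`nodelay` serves queued requests immediately while still counting them against the burst), but the accounting is the same.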

Step 4: Create GitHub Actions CI/CD Pipeline

Create .github/workflows/ci-cd.yml:

name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # Test Job
  test:
    runs-on: ubuntu-latest
    
    services:
      postgres:
        image: postgres:15-alpine
        env:
          POSTGRES_USER: test_user
          POSTGRES_PASSWORD: test_pass
          POSTGRES_DB: test_db
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest pytest-asyncio httpx

      - name: Run database migrations
        env:
          DATABASE_URL: postgresql+asyncpg://test_user:test_pass@localhost:5432/test_db
        run: |
          alembic upgrade head

      - name: Run tests
        env:
          DATABASE_URL: postgresql+asyncpg://test_user:test_pass@localhost:5432/test_db
          SECRET_KEY: test-secret-key
        run: |
          pytest tests/ -v --tb=short

      - name: Lint with flake8
        run: |
          pip install flake8
          flake8 app/ --count --select=E9,F63,F7,F82 --show-source --statistics

  # Build and Push Docker Image
  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push'
    
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Log in to Container Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v4
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha

      - name: Build and push Docker image
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # Deploy to Production
  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set Kubernetes context
        # Assumes a KUBE_CONFIG repository secret holding the cluster kubeconfig
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy to Kubernetes
        uses: azure/k8s-deploy@v4
        with:
          manifests: |
            k8s/deployment.yaml
            k8s/service.yaml
          images: |
            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          kubectl-version: 'latest'

      - name: Notify deployment
        if: success()
        run: |
          echo "Deployment successful!"
          # Add Slack/Discord notification here

Step 5: Create Kubernetes Manifests

Create k8s/deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aiverse-api
  labels:
    app: aiverse-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: aiverse-api
  template:
    metadata:
      labels:
        app: aiverse-api
    spec:
      containers:
      - name: api
        image: ghcr.io/yourusername/aiverse-backend:latest
        ports:
        - containerPort: 8000
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: aiverse-secrets
              key: database-url
        - name: SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: aiverse-secrets
              key: secret-key
        - name: ENVIRONMENT
          value: "production"
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /api/v1/health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /api/v1/health
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5

Create k8s/service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: aiverse-api-service
spec:
  selector:
    app: aiverse-api
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8000
  type: LoadBalancer

Create k8s/hpa.yaml for auto-scaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aiverse-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aiverse-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
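
The HPA's scaling decision follows a simple proportional formula: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the configured bounds. A quick sketch of the arithmetic:

```python
import math

def desired_replicas(current: int, utilization: float, target: float,
                     min_replicas: int = 3, max_replicas: int = 10) -> int:
    """Replica count the HPA aims for, clamped to the configured bounds."""
    desired = math.ceil(current * utilization / target)
    return max(min_replicas, min(max_replicas, desired))

# 3 pods averaging 90% CPU against the 70% target -> scale out to 4
print(desired_replicas(3, 90, 70))   # 4
# Load drops to 20% -> would shrink to 2, but minReplicas keeps it at 3
print(desired_replicas(4, 20, 70))   # 3
```

With both CPU and memory metrics configured, the HPA evaluates each metric and takes the larger of the resulting replica counts.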

Step 6: Add Monitoring with Prometheus

Create app/utils/metrics.py:

"""
Prometheus metrics for monitoring

Track application performance and health
"""

from prometheus_client import Counter, Histogram, Gauge, generate_latest
from prometheus_client import CONTENT_TYPE_LATEST
from starlette.responses import Response
import time

# Request metrics
http_requests_total = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['method', 'endpoint', 'status']
)

http_request_duration_seconds = Histogram(
    'http_request_duration_seconds',
    'HTTP request duration',
    ['method', 'endpoint']
)

# Database metrics
db_connections_active = Gauge(
    'db_connections_active',
    'Active database connections'
)

db_query_duration_seconds = Histogram(
    'db_query_duration_seconds',
    'Database query duration',
    ['query_type']
)

# Authentication metrics
auth_attempts_total = Counter(
    'auth_attempts_total',
    'Total authentication attempts',
    ['status']
)

# Business metrics
active_users = Gauge(
    'active_users',
    'Number of active users'
)

conversations_total = Counter(
    'conversations_total',
    'Total conversations created'
)


async def metrics_endpoint():
    """Prometheus metrics endpoint"""
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
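
To build intuition for what a `Histogram` exports: observations land in cumulative `le` ("less than or equal") buckets, plus a running sum and count. A stripped-down stdlib illustration of that bookkeeping (prometheus_client does all of this for you):

```python
class MiniHistogram:
    """Cumulative-bucket histogram, mimicking Prometheus exposition semantics."""

    def __init__(self, buckets=(0.1, 0.5, 1.0, float("inf"))):
        self.counts = {b: 0 for b in buckets}
        self.total = 0.0   # -> the _sum series
        self.n = 0         # -> the _count series

    def observe(self, value: float) -> None:
        self.n += 1
        self.total += value
        for bound in self.counts:
            if value <= bound:
                self.counts[bound] += 1  # cumulative: every bucket >= value counts it

h = MiniHistogram()
for latency in (0.05, 0.3, 0.7, 2.0):
    h.observe(latency)
print(h.counts)  # {0.1: 1, 0.5: 2, 1.0: 3, inf: 4}
```

This cumulative layout is why PromQL computes quantiles from bucket counts with `histogram_quantile()` instead of storing raw samples.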

Update app/middleware/performance_middleware.py:

# Add after existing imports
from app.utils.metrics import http_requests_total, http_request_duration_seconds

# In dispatch method, add:
async def dispatch(self, request: Request, call_next):
    start_time = time.time()
    
    response = await call_next(request)
    
    duration = time.time() - start_time
    
    # Record metrics
    http_requests_total.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code
    ).inc()
    
    http_request_duration_seconds.labels(
        method=request.method,
        endpoint=request.url.path
    ).observe(duration)
    
    # Existing slow request logging...
    
    return response
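
One caveat with the labels above: `request.url.path` contains raw IDs (`/users/42`, UUIDs, and so on), so every distinct ID becomes a new Prometheus label value and cardinality explodes. A sketch of a normalizer you could apply before labeling; the regex patterns are assumptions about your URL scheme:

```python
import re

# Assumed patterns -- adjust to your actual routes
_ID_RE = re.compile(r"/\d+")
_UUID_RE = re.compile(
    r"/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def normalize_endpoint(path: str) -> str:
    """Collapse numeric and UUID path segments into stable placeholders."""
    path = _UUID_RE.sub("/{uuid}", path)
    return _ID_RE.sub("/{id}", path)

print(normalize_endpoint("/api/v1/users/42/conversations/7"))
# /api/v1/users/{id}/conversations/{id}
```

With FastAPI specifically, `request.scope["route"].path` (when a route matched) already holds the templated path and avoids the regex entirely.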

Update app/main.py to add metrics endpoint:

# Add after imports
from app.utils.metrics import metrics_endpoint

# Add before routers
@app.get("/metrics")
async def metrics():
    """Prometheus metrics endpoint"""
    return await metrics_endpoint()

Create monitoring/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'aiverse-api'
    static_configs:
      - targets: ['api:8000']
    metrics_path: '/metrics'
    scrape_interval: 10s

Create monitoring/docker-compose.monitoring.yml:

version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: aiverse_prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - aiverse-network

  grafana:
    image: grafana/grafana:latest
    container_name: aiverse_grafana
    restart: unless-stopped
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_INSTALL_PLUGINS=grafana-piechart-panel
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
      - ./grafana/datasources:/etc/grafana/provisioning/datasources
    networks:
      - aiverse-network

volumes:
  prometheus_data:
  grafana_data:

networks:
  aiverse-network:
    external: true

Step 7: Add Logging with ELK Stack (Optional)

Create logging/docker-compose.logging.yml:

version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    container_name: aiverse_elasticsearch
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data
    networks:
      - aiverse-network

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    container_name: aiverse_logstash
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - "5000:5000"
    environment:
      - "LS_JAVA_OPTS=-Xmx256m -Xms256m"
    networks:
      - aiverse-network
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    container_name: aiverse_kibana
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    networks:
      - aiverse-network
    depends_on:
      - elasticsearch

volumes:
  elasticsearch_data:

networks:
  aiverse-network:
    external: true

Step 8: Create Deployment Scripts

Create scripts/deploy.sh:

#!/bin/bash
set -e

echo "🚀 Starting deployment..."

# Load environment
source .env.production

# Build and push Docker image
echo "📦 Building Docker image..."
docker build -t $IMAGE_NAME:$VERSION .
docker tag $IMAGE_NAME:$VERSION $IMAGE_NAME:latest

echo "📤 Pushing to registry..."
docker push $IMAGE_NAME:$VERSION
docker push $IMAGE_NAME:latest

# Apply Kubernetes manifests
echo "☸️  Deploying to Kubernetes..."
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/hpa.yaml

# Wait for rollout
echo "⏳ Waiting for rollout..."
kubectl rollout status deployment/aiverse-api

# Run health check
echo "🏥 Running health check..."
HEALTH_URL=$(kubectl get svc aiverse-api-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -f http://$HEALTH_URL/api/v1/health || exit 1

echo "✅ Deployment complete!"

Create scripts/rollback.sh:

#!/bin/bash
set -e

echo "🔄 Rolling back deployment..."

# Rollback Kubernetes deployment
kubectl rollout undo deployment/aiverse-api

# Wait for rollback
kubectl rollout status deployment/aiverse-api

echo "✅ Rollback complete!"

Create scripts/backup-db.sh:

#!/bin/bash
set -e

BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/backup_$DATE.sql.gz"

echo "💾 Starting database backup..."

# Create backup
pg_dump $DATABASE_URL | gzip > $BACKUP_FILE

# Upload to S3 (optional)
if [ ! -z "$AWS_S3_BUCKET" ]; then
    echo "☁️  Uploading to S3..."
    aws s3 cp $BACKUP_FILE s3://$AWS_S3_BUCKET/backups/
fi

# Clean old backups (keep last 7 days)
find $BACKUP_DIR -name "backup_*.sql.gz" -mtime +7 -delete

echo "✅ Backup complete: $BACKUP_FILE"

Make scripts executable:

chmod +x scripts/*.sh

Step 9: Create Health Check Endpoint

Update app/api/v1/endpoints/health.py:

# Add comprehensive health check
from datetime import datetime

from app.core.config import settings
from app.db.utils import check_database_connection
from app.utils.metrics import active_users, db_connections_active

@router.get("/health/detailed")
async def detailed_health_check():
    """
    Detailed health check with component status
    
    Returns health status of all system components
    """
    health_status = {
        "status": "healthy",
        "timestamp": datetime.now().isoformat(),
        "version": settings.APP_VERSION,
        "components": {}
    }
    
    # Check database
    try:
        db_healthy = await check_database_connection()
        health_status["components"]["database"] = {
            "status": "healthy" if db_healthy else "unhealthy",
            "response_time_ms": 0  # Add actual timing
        }
    except Exception as e:
        health_status["components"]["database"] = {
            "status": "unhealthy",
            "error": str(e)
        }
        health_status["status"] = "degraded"
    
    # Check Redis (if implemented)
    # health_status["components"]["redis"] = {...}
    
    # System metrics
    import psutil
    health_status["system"] = {
        "cpu_percent": psutil.cpu_percent(),
        "memory_percent": psutil.virtual_memory().percent,
        "disk_percent": psutil.disk_usage('/').percent
    }
    
    return health_status
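
The `response_time_ms` placeholder above can be filled in with `time.perf_counter()`. A minimal stdlib sketch of the timing pattern; `check_database_connection` here is a stand-in for the real check:

```python
import asyncio
import time

async def check_database_connection() -> bool:
    """Stand-in for the real check (e.g. a SELECT 1 round trip)."""
    await asyncio.sleep(0.01)  # simulate ~10 ms of network + query time
    return True

async def timed_component_check() -> dict:
    start = time.perf_counter()
    healthy = await check_database_connection()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {
        "status": "healthy" if healthy else "unhealthy",
        "response_time_ms": round(elapsed_ms, 2),
    }

result = asyncio.run(timed_component_check())
print(result["status"])  # healthy
```

`perf_counter()` is monotonic and high-resolution, which makes it the right clock for durations; `datetime.now()` can jump under NTP adjustments.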

Step 10: Cloud Platform Deployment Examples

AWS Elastic Beanstalk

Create Dockerrun.aws.json:

{
  "AWSEBDockerrunVersion": "1",
  "Image": {
    "Name": "ghcr.io/yourusername/aiverse-backend:latest",
    "Update": "true"
  },
  "Ports": [
    {
      "ContainerPort": 8000,
      "HostPort": 8000
    }
  ],
  "Logging": "/var/log/aiverse"
}

Create .ebextensions/01_packages.config:

packages:
  yum:
    postgresql15-devel: []
    
option_settings:
  aws:elasticbeanstalk:application:environment:
    DATABASE_URL: RDS_CONNECTION_STRING
    SECRET_KEY: YOUR_SECRET_KEY

Google Cloud Run

Create cloudbuild.yaml:

steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/aiverse-api:$COMMIT_SHA', '.']
  
  # Push the container image to Container Registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/aiverse-api:$COMMIT_SHA']
  
  # Deploy container image to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'aiverse-api'
      - '--image'
      - 'gcr.io/$PROJECT_ID/aiverse-api:$COMMIT_SHA'
      - '--region'
      - 'us-central1'
      - '--platform'
      - 'managed'
      - '--allow-unauthenticated'

images:
  - 'gcr.io/$PROJECT_ID/aiverse-api:$COMMIT_SHA'

Azure Container Apps

Create deployment script:

#!/bin/bash

# Create resource group
az group create --name aiverse-rg --location eastus

# Create container app environment
az containerapp env create \
  --name aiverse-env \
  --resource-group aiverse-rg \
  --location eastus

# Deploy container app
az containerapp create \
  --name aiverse-api \
  --resource-group aiverse-rg \
  --environment aiverse-env \
  --image ghcr.io/yourusername/aiverse-backend:latest \
  --target-port 8000 \
  --ingress external \
  --env-vars \
    DATABASE_URL=secretref:db-url \
    SECRET_KEY=secretref:secret-key

Step 11: Performance Optimization

Update app/core/config.py to add caching settings:

# Add to Settings class
    # Redis Configuration
    REDIS_URL: str = "redis://localhost:6379/0"
    REDIS_CACHE_ENABLED: bool = True
    REDIS_CACHE_TTL: int = 300  # 5 minutes

Create app/utils/cache.py:

"""
Redis caching utilities

Implements caching for frequently accessed data
"""

import redis.asyncio as redis
from typing import Optional, Any
import json
from functools import wraps
from app.core.config import settings

# Redis client
redis_client: Optional[redis.Redis] = None


async def get_redis() -> redis.Redis:
    """Get Redis client"""
    global redis_client
    if redis_client is None:
        redis_client = redis.from_url(
            settings.REDIS_URL,
            encoding="utf-8",
            decode_responses=True
        )
    return redis_client


async def cache_get(key: str) -> Optional[Any]:
    """Get value from cache"""
    if not settings.REDIS_CACHE_ENABLED:
        return None
    
    client = await get_redis()
    value = await client.get(key)
    
    if value:
        return json.loads(value)
    return None


async def cache_set(key: str, value: Any, ttl: Optional[int] = None) -> None:
    """Set value in cache"""
    if not settings.REDIS_CACHE_ENABLED:
        return
    
    client = await get_redis()
    ttl = ttl or settings.REDIS_CACHE_TTL
    
    await client.setex(
        key,
        ttl,
        json.dumps(value, default=str)
    )


async def cache_delete(key: str) -> None:
    """Delete value from cache"""
    if not settings.REDIS_CACHE_ENABLED:
        return
    
    client = await get_redis()
    await client.delete(key)


def cached(ttl: Optional[int] = None):
    """
    Decorator for caching function results
    
    Usage:
        @cached(ttl=300)
        async def get_user(user_id: int):
            ...
    """
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Generate cache key
            cache_key = f"{func.__name__}:{str(args)}:{str(kwargs)}"
            
            # Try to get from cache
            cached_value = await cache_get(cache_key)
            if cached_value is not None:
                return cached_value
            
            # Call function
            result = await func(*args, **kwargs)
            
            # Store in cache
            await cache_set(cache_key, result, ttl)
            
            return result
        return wrapper
    return decorator
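
A note on the key built from `str(args)`: object reprs can embed memory addresses, and a different kwarg order yields a different key for the same call. If that becomes a problem, hashing a canonical JSON form gives stable keys; a sketch (assumes the arguments are JSON-serializable, falling back to `str`):

```python
import hashlib
import json

def make_cache_key(func_name: str, *args, **kwargs) -> str:
    """Deterministic cache key from the function name and its arguments."""
    payload = json.dumps({"args": args, "kwargs": kwargs},
                         sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"{func_name}:{digest}"

# Same arguments in a different kwarg order produce the same key
k1 = make_cache_key("get_user_by_id", 42, locale="en", include_profile=True)
k2 = make_cache_key("get_user_by_id", 42, include_profile=True, locale="en")
print(k1 == k2)  # True
```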

Add caching to user service:

# In app/services/user_service.py
from app.utils.cache import cache_get, cache_set, cache_delete

# Note: @cached derives its key from all positional args, which for a bound
# method includes self's repr, making targeted invalidation unreliable.
# An explicit key keeps reads and invalidation in sync.
async def get_user_by_id(self, user_id: int) -> User:
    cache_key = f"user:{user_id}"
    cached_user = await cache_get(cache_key)
    if cached_user is not None:
        return cached_user
    user = ...  # existing lookup implementation
    await cache_set(cache_key, user)
    return user

# Invalidate the same key on update/delete
async def update_user(self, user_id: int, user_update: UserUpdate) -> User:
    result = await self.repository.update(user_id, **update_data)
    await cache_delete(f"user:{user_id}")
    return result

Step 12: Create Production Checklist

Create PRODUCTION_CHECKLIST.md:

# Production Deployment Checklist

## Pre-Deployment

### Security
- [ ] Strong SECRET_KEY generated (32+ characters)
- [ ] Database passwords are strong and unique
- [ ] All secrets stored in environment variables (not in code)
- [ ] HTTPS/TLS certificates configured
- [ ] CORS origins properly restricted
- [ ] Rate limiting configured
- [ ] Security headers configured in Nginx

### Database
- [ ] Database migrations tested
- [ ] Backup strategy implemented
- [ ] Connection pooling configured
- [ ] Read replicas set up (if needed)
- [ ] Database monitoring enabled

### Application
- [ ] DEBUG=False in production
- [ ] Error tracking configured (Sentry)
- [ ] Logging configured properly
- [ ] Health checks implemented
- [ ] Metrics collection enabled

### Infrastructure
- [ ] Docker images built and pushed
- [ ] Kubernetes manifests validated
- [ ] Auto-scaling configured
- [ ] Load balancer configured
- [ ] DNS records configured

### Monitoring
- [ ] Prometheus/Grafana dashboards set up
- [ ] Alerts configured (CPU, memory, errors)
- [ ] Log aggregation configured
- [ ] Uptime monitoring enabled
- [ ] Performance monitoring enabled

## During Deployment

- [ ] Notify team of deployment
- [ ] Create database backup
- [ ] Run database migrations
- [ ] Deploy new version
- [ ] Run smoke tests
- [ ] Monitor error rates
- [ ] Check performance metrics

## Post-Deployment

- [ ] Verify health checks passing
- [ ] Test critical user flows
- [ ] Monitor for errors (24 hours)
- [ ] Update documentation
- [ ] Notify stakeholders
- [ ] Tag release in Git

## Rollback Plan

If issues occur:
1. Run rollback script: `./scripts/rollback.sh`
2. Verify health checks
3. Restore database from backup if needed
4. Investigate root cause
5. Document incident

## Emergency Contacts

- DevOps Lead: [contact]
- Database Admin: [contact]
- On-Call Engineer: [contact]

📚 Frequently Asked Questions (FAQs)

What’s the difference between Docker and Docker Compose?

Docker runs single containers; Docker Compose orchestrates multiple containers on one host. Use Docker to build images (`docker build -t myapp .`) and Docker Compose for local development with multiple services (API + PostgreSQL + Redis): `docker-compose up`. In production, use Kubernetes or a managed container service for orchestration at scale. Docker Compose is ideal for development; Kubernetes handles multi-container deployments across multiple servers.
How do I implement zero-downtime deployments?

Use rolling updates in Kubernetes. Set `strategy: type: RollingUpdate` with `maxUnavailable: 0` and `maxSurge: 1` so Kubernetes starts new pods before terminating old ones. Implement health checks so new pods only receive traffic when ready. Use database migrations that are backward-compatible (add columns before making them required). Blue-green deployment is another option: deploy to a new environment and switch traffic after verification. Always test migrations on staging first.
How do I handle database migrations during deployment?

Run migrations before deploying new code, in a separate CI/CD step. Always make backward-compatible changes: add new columns as nullable, and deprecate old columns before removing them. Use Alembic's alembic upgrade head in the deployment pipeline. For large tables, use online migration tools to avoid locking. Always back up before migrations: pg_dump > backup.sql. Test on staging with production-like data. Never run migrations manually—automate them through CI/CD.
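In a GitHub Actions pipeline, the migration step could be a separate job that must succeed before the deploy job runs. This is a hedged sketch: the workflow, job names, and the DATABASE_URL secret are assumptions, not the tutorial's actual pipeline:

```yaml
name: deploy
on:
  push:
    branches: [main]

jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install alembic psycopg2-binary
      # Apply migrations before any new application code goes live.
      - run: alembic upgrade head
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

  deploy:
    needs: migrate          # deploy runs only if migrations succeeded
    runs-on: ubuntu-latest
    steps:
      - run: echo "roll out the new image here (e.g. kubectl set image ...)"
```

Ordering the jobs with `needs: migrate` guarantees the schema is upgraded before new code that depends on it is deployed.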
How do I manage secrets in production?

Never commit secrets to Git. Use Kubernetes Secrets: kubectl create secret generic aiverse-secrets --from-literal=SECRET_KEY=xxx. For AWS, use AWS Secrets Manager or Parameter Store. For GCP, use Secret Manager. Mount secrets as environment variables in containers. Rotate secrets regularly. Use different secrets per environment (dev/staging/prod). Tools like HashiCorp Vault provide advanced secret management with dynamic secrets and rotation.
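Mounting the secret into a container as environment variables is a one-line addition to the Deployment's container spec (the secret name follows the kubectl example above; the fragment below is illustrative, not a complete manifest):

```yaml
# Fragment of a Deployment's pod template:
spec:
  containers:
    - name: api
      image: aiverse-api:latest
      envFrom:
        - secretRef:
            name: aiverse-secrets   # every key in the secret becomes an env var
```

With envFrom, each key in aiverse-secrets (e.g. SECRET_KEY) appears as an environment variable inside the container, so the application reads it the same way in every environment.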
How do I monitor my FastAPI application in production?

Use multiple monitoring layers: (1) Metrics: Prometheus collects metrics (CPU, memory, request rate), Grafana visualizes them. (2) Logging: ELK stack (Elasticsearch, Logstash, Kibana) or cloud logging. (3) APM: Tools like Datadog and New Relic track request traces. (4) Uptime: Pingdom or UptimeRobot for availability. Set alerts for high error rates, slow responses, and high resource usage. Monitor business metrics (user signups, API usage) alongside system metrics.
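An error-rate alert like the one described above can be expressed as a Prometheus alerting rule. This sketch assumes the API is instrumented to expose an http_requests_total counter with a status label; your metric names may differ:

```yaml
groups:
  - name: api-alerts
    rules:
      - alert: HighErrorRate
        # Fraction of requests returning 5xx over the last 5 minutes.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m               # must stay above threshold for 5 minutes
        labels:
          severity: page
        annotations:
          summary: "More than 5% of requests are failing"
```

The `for: 5m` clause keeps short blips from paging anyone; only a sustained error rate fires the alert.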
How many replicas should I run in Kubernetes?

Minimum 3 replicas for high availability—this tolerates one pod failure during updates and one random failure. Use a HorizontalPodAutoscaler to scale between 3 and 10 replicas based on CPU (70%) and memory (80%) utilization. More replicas cost more but provide better availability and handle traffic spikes. For critical services, spread pods across availability zones. Monitor actual traffic patterns and adjust. Start with 3 and use autoscaling to handle peaks.
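The autoscaling policy above maps directly onto a HorizontalPodAutoscaler manifest (the deployment name is an assumption carried over from the earlier examples):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aiverse-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aiverse-api
  minReplicas: 3            # floor for high availability
  maxReplicas: 10           # ceiling for traffic spikes
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

Kubernetes scales up when either metric exceeds its target averaged across pods, and scales back down (never below 3) when load subsides.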
How do I secure my API in production?

Implement multiple security layers: (1) HTTPS only with TLS 1.2+. (2) Rate limiting with nginx (5 req/sec for login, 100 req/sec for API). (3) API key/JWT authentication on all endpoints. (4) CORS properly configured. (5) Security headers (HSTS, X-Frame-Options, CSP). (6) Input validation with Pydantic. (7) SQL injection prevention (use ORMs). (8) Regular security audits. (9) DDoS protection via Cloudflare or AWS Shield. (10) Monitoring for suspicious activity.
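The rate limits in point (2) can be sketched with nginx's limit_req module. Zone names, paths, and the upstream name below are illustrative assumptions:

```nginx
# Shared-memory zones keyed by client IP, matching the limits above.
limit_req_zone $binary_remote_addr zone=login:10m rate=5r/s;
limit_req_zone $binary_remote_addr zone=api:10m  rate=100r/s;

server {
    listen 443 ssl;

    location /auth/login {
        limit_req zone=login burst=10 nodelay;   # strict limit on login attempts
        proxy_pass http://api_upstream;
    }

    location /api/ {
        limit_req zone=api burst=50 nodelay;     # looser limit for general API traffic
        proxy_pass http://api_upstream;
    }
}
```

The `burst` parameter absorbs short spikes, while `nodelay` rejects excess requests immediately instead of queueing them.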
How do I reduce my Docker image size?

Use multi-stage builds (we do this in our Dockerfile). Start with python:3.11-slim, not python:3.11 (saves ~800MB). Use .dockerignore to exclude unnecessary files. Combine RUN commands: RUN apt-get update && apt-get install -y pkg && rm -rf /var/lib/apt/lists/*. Don't install dev dependencies in production. Use --no-cache-dir with pip. Remove build tools after use. Exploit layer caching: put frequently changing code last. Result: our image is ~200MB vs 1GB+ unoptimized.
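A minimal multi-stage Dockerfile combining these ideas might look like this (the application module path app.main:app is an assumption, not necessarily this tutorial's layout):

```dockerfile
# Stage 1: build wheels with the full toolchain available.
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt

# Stage 2: runtime image carries only installed packages and app code.
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

Because requirements.txt is copied before the application code, the dependency layers stay cached across builds where only the code changes.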
How should I configure CORS in production?

Configure it in FastAPI middleware with specific origins: allow_origins=["https://app.yourdomain.com"], never ["*"] in production. Set allow_credentials=True for cookie-based auth. Use nginx for preflight caching. For multiple subdomains, use a regex or list all of them explicitly. Consider an API gateway for centralized CORS. Different environments need different origins (localhost for dev, staging domain for staging, production domain for prod). Test with the browser DevTools Network tab.
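One way to keep per-environment origins explicit is a small lookup that refuses wildcards and unknown environments. The domain names below are illustrative placeholders, and the commented CORSMiddleware call shows where the result would plug into FastAPI:

```python
# Per-environment allowed origins; domains are illustrative placeholders.
ALLOWED_ORIGINS = {
    "development": ["http://localhost:3000", "http://localhost:5173"],
    "staging": ["https://staging.yourdomain.com"],
    "production": ["https://app.yourdomain.com"],
}


def cors_origins(environment: str) -> list[str]:
    """Return the explicit origin list for an environment; never a wildcard."""
    try:
        return ALLOWED_ORIGINS[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment}")


# The list would then be passed to FastAPI's CORSMiddleware, e.g.:
# app.add_middleware(
#     CORSMiddleware,
#     allow_origins=cors_origins(env),
#     allow_credentials=True,
#     allow_methods=["*"],
#     allow_headers=["*"],
# )
```

Failing loudly on an unknown environment beats silently falling back to a permissive default.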
Should I use Kubernetes or Docker Swarm?

Kubernetes is more complex but more powerful—the industry standard for production. It provides auto-scaling, rolling updates, self-healing, and an extensive ecosystem. Docker Swarm is simpler but less feature-rich—easier to learn, good for small deployments. Kubernetes has a steeper learning curve, but it is worth it for production. Most cloud providers offer managed Kubernetes (EKS, GKE, AKS). If you're just starting, use Docker Compose for local dev and managed Kubernetes for production. Docker Swarm is being phased out.
How do I debug issues in production containers?

Never SSH into containers—they're ephemeral. Use: (1) kubectl logs pod-name for application logs. (2) kubectl describe pod pod-name for events. (3) Centralized logging (ELK) to search logs. (4) Metrics dashboards to identify patterns. (5) kubectl exec -it pod-name -- /bin/bash for emergency inspection (temporary). (6) Increase the log level temporarily. (7) APM tools for distributed tracing. (8) Reproduce in staging with production data (sanitized). Enable DEBUG logging only in non-production.
How do I prepare for high-traffic events?

Preparation checklist: (1) Load testing with tools like k6 or Locust (simulate 10x traffic). (2) Increase the HPA max replicas before the event. (3) Scale the database (read replicas, connection pooling). (4) Enable Redis caching aggressively. (5) Use a CDN for static assets. (6) Rate limiting to prevent abuse. (7) Circuit breakers for external services. (8) Pre-warm instances before the traffic spike. (9) Monitoring dashboard ready. (10) On-call team available. (11) Rollback plan tested. (12) Post-event analysis to improve.

🎯 Summary

What You’ve Accomplished:

Complete Production Deployment

  • Docker containerization with multi-stage builds
  • Docker Compose for local development
  • Production-grade Nginx configuration
  • CI/CD pipeline with GitHub Actions
  • Kubernetes deployment manifests
  • Auto-scaling configuration

Monitoring & Observability

  • Prometheus metrics collection
  • Grafana dashboards
  • Health check endpoints
  • Log aggregation options
  • Performance monitoring

Cloud Deployment

  • AWS Elastic Beanstalk setup
  • Google Cloud Run configuration
  • Azure Container Apps deployment
  • Multi-cloud deployment strategies

Production Best Practices

  • Zero-downtime deployment
  • Graceful shutdown handling
  • Database backup strategies
  • Security hardening
  • Cost optimization

📊 Complete Tutorial Series – Final Status

✅ Ep. 1-5: FastAPI Core & Advanced Features
✅ Ep. 6: AI Integration with Ollama
✅ Ep. 7: React TypeScript Frontend
✅ Ep. 8: PostgreSQL Database Integration
✅ Ep. 9: JWT Authentication & Authorization
✅ Ep. 10: Production Deployment

🎉 COMPLETE PRODUCTION-READY APPLICATION 🎉

Version: 1.0.0
Status: Production Deployment Ready
Stack: FastAPI + PostgreSQL + Redis + React + Docker + Kubernetes

Congratulations! You’ve built and deployed a complete, production-ready, full-stack AI application! 🎊

Your application now includes:

  • ⚡ High-performance FastAPI backend
  • 🔐 Secure JWT authentication
  • 📊 PostgreSQL database with migrations
  • 🤖 AI integration with Ollama
  • ⚛️ Modern React TypeScript frontend
  • 🐳 Docker containerization
  • ☸️ Kubernetes orchestration
  • 📈 Monitoring and metrics
  • 🚀 CI/CD automation
  • ☁️ Multi-cloud deployment options

This is a massive achievement – you’ve gone from zero to a complete production system! 🌟
