# 🐳 Docker Architecture for Last-In AI

## Overview

This document outlines the containerization strategy for the Last-In AI application, including multi-stage builds, service dependencies, and security considerations.

## Base Image Selection

```dockerfile
# Build stage
FROM python:3.8-slim-bullseye AS builder

# Slim-bullseye chosen for:
# - Minimal size while maintaining compatibility
# - Security updates and stability
# - Python 3.8+ compatibility
```

## Multi-Stage Build Strategy

1. **Builder Stage**
   - Install build dependencies
   - Install Python packages
   - Compile any necessary components

2. **Production Stage**
   - Copy only necessary files from builder
   - Minimal runtime dependencies
   - Non-root user setup

## Dependencies and Requirements

- Python packages from requirements.txt
- System dependencies:
  - build-essential (for some Python packages)
  - libpq-dev (for PostgreSQL)
  - Optional: tesseract-ocr (for PDF processing)

## Directory Structure

```
/app
├── src/       # Application code
├── config/    # Configuration files
├── cache/     # Paper cache directory
├── data/      # Vector store data
└── logs/      # Application logs
```

## Environment Configuration

- Source: .env.example
- Runtime variables:
  - Database credentials
  - API keys
  - Redis configuration
  - Storage paths
  - Security settings

## Service Dependencies

1. **PostgreSQL**
   - Primary database
   - Persistent volume for data
   - Environment: POSTGRES_* variables

2. **Redis**
   - Caching layer
   - Port: 6379
   - No persistence needed

## Security Considerations

1. Non-root user execution
2. Secret management via Docker secrets
3. Read-only filesystem where possible
4. Minimal base image
5. Regular security updates
6. Proper file permissions

## Docker Compose Configuration

Services:

1. Main application
2. PostgreSQL database
3. Redis cache
4. Optional: Monitoring

## Resource Management

- Memory limits
- CPU allocation
- Volume mounts for:
  - Paper cache
  - Vector store
  - Logs

## Health Checks

```dockerfile
# Requires curl in the production image; python:3.8-slim-bullseye does not ship it.
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```

## Build and Run Commands

```bash
# Build
docker build --tag lastin-ai:prod --target production .

# Run (absolute host paths, since older Docker releases reject relative bind-mount paths)
docker run -p 8000:8000 \
    --env-file .env.prod \
    --volume "$(pwd)/cache:/app/cache" \
    --volume "$(pwd)/data:/app/data" \
    --volume "$(pwd)/logs:/app/logs" \
    lastin-ai:prod
```

## File Exclusions (.dockerignore)

```
.git
.env*
__pycache__
*.pyc
.pytest_cache
.coverage
htmlcov
.vscode
*.log
cache/*
data/*
logs/*
```

## Implementation Steps

1. Switch to Code mode
2. Create Dockerfile
3. Create docker-compose.yml
4. Create .dockerignore
5. Test build and deployment
6. Implement health checks
7. Configure monitoring

## Security Hardening Steps

1. Implement least privilege principle
2. Regular dependency updates
3. Image vulnerability scanning
4. Secrets management
5. Network security policies

## Recommendations

1. Use multi-stage builds for minimal production image
2. Implement proper logging configuration
3. Regular security audits
4. Backup strategy for persistent data
5. Monitoring and alerting setup

This containerization strategy aims to provide:

- Efficient builds
- Secure runtime
- Scalable deployment
- Proper resource management
- Easy maintenance
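
To make the builder and production stages from the Multi-Stage Build Strategy section concrete, a Dockerfile along the following lines might serve as a starting point. It is a sketch under assumptions, not the project's actual Dockerfile: the `/build` and `/opt/venv` paths, the `appuser` account, and the `src.main` entry point are hypothetical, and `libpq5` and `curl` are assumed to be the only runtime system packages needed (add `tesseract-ocr` if PDF processing is enabled).

```dockerfile
# Builder stage: compilers and headers live only here.
FROM python:3.8-slim-bullseye AS builder

RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Install packages into a virtual environment so the whole tree can be
# copied into the production stage in a single step.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Production stage: runtime libraries only, no build tools.
FROM python:3.8-slim-bullseye AS production

# libpq5 is the runtime counterpart of libpq-dev; curl is installed only
# because the health check calls it. "appuser" is a hypothetical name.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq5 curl \
    && rm -rf /var/lib/apt/lists/* \
    && useradd --create-home --uid 1000 appuser

COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /app
COPY src/ ./src/
COPY config/ ./config/

# Create the writable directories from the layout above and hand them
# to the non-root user.
RUN mkdir -p cache data logs && chown -R appuser:appuser /app

USER appuser
EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Assumed entry point; replace with the application's real start command.
CMD ["python", "-m", "src.main"]
```

Naming the second stage `production` is what lets `docker build --target production` select it. Keeping the virtual environment at the same path in both stages avoids broken interpreter paths after the copy.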