This repository has been archived on 2025-02-10. You can view files and clone it, but cannot push or open issues or pull requests.

🐳 Docker Architecture for Last-In AI

Overview

This document outlines the containerization strategy for the Last-In AI application, including multi-stage builds, service dependencies, and security considerations.

Base Image Selection

# Build stage
FROM python:3.8-slim-bullseye AS builder
# Slim-bullseye chosen for:
# - Minimal size while maintaining compatibility
# - Security updates and stability
# - Python 3.8+ compatibility

Multi-Stage Build Strategy

  1. Builder Stage

    • Install build dependencies
    • Install Python packages
    • Compile any necessary components
  2. Production Stage

    • Copy only necessary files from builder
    • Minimal runtime dependencies
    • Non-root user setup

Dependencies and Requirements

  • Python packages from requirements.txt
  • System dependencies:
    • build-essential (to compile Python packages that include C extensions)
    • libpq-dev (PostgreSQL client headers, needed to build PostgreSQL drivers)
    • Optional: tesseract-ocr (for OCR when processing scanned PDFs)

Directory Structure

/app
├── src/             # Application code
├── config/          # Configuration files
├── cache/           # Paper cache directory
├── data/            # Vector store data
└── logs/            # Application logs
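
The writable directories in this layout could be created at start-up, before the application first touches them. The sketch below is one way to do that. The `ensure_runtime_dirs` helper and its `base` parameter are illustrative names, not part of the project.

```python
from pathlib import Path

# Writable directories from the layout above; src/ and config/ ship with the image.
RUNTIME_DIRS = ("cache", "data", "logs")


def ensure_runtime_dirs(base="/app"):
    """Create the cache, data, and logs directories under `base` if they are missing."""
    created = []
    for name in RUNTIME_DIRS:
        path = Path(base) / name
        # exist_ok makes the call safe to repeat on every container start.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

When these paths are bind-mounted from the host, the directories may already exist, in which case the call is a no-op.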

Environment Configuration

  • Source: .env.example
  • Runtime variables:
    • Database credentials
    • API keys
    • Redis configuration
    • Storage paths
    • Security settings
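
As a rough illustration of how the application might consume these runtime variables, the sketch below reads them once into an immutable settings object and fails early when a credential is missing. Only the `POSTGRES_*` prefix and the Redis port come from this document. `POSTGRES_HOST`, `REDIS_HOST`, `REDIS_PORT`, `CACHE_DIR`, and all default values are assumptions made for the example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    postgres_host: str
    postgres_db: str
    postgres_user: str
    postgres_password: str
    redis_host: str
    redis_port: int
    cache_dir: str


def load_settings(env=None):
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Credentials have no sensible default, so their absence is an error.
    required = ("POSTGRES_USER", "POSTGRES_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        postgres_host=env.get("POSTGRES_HOST", "postgres"),  # hypothetical name and default
        postgres_db=env.get("POSTGRES_DB", "lastin"),        # hypothetical default
        postgres_user=env["POSTGRES_USER"],
        postgres_password=env["POSTGRES_PASSWORD"],
        redis_host=env.get("REDIS_HOST", "redis"),           # hypothetical name and default
        redis_port=int(env.get("REDIS_PORT", "6379")),
        cache_dir=env.get("CACHE_DIR", "/app/cache"),        # hypothetical name
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching the real environment.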

Service Dependencies

  1. PostgreSQL

    • Primary database
    • Persistent volume for data
    • Environment: POSTGRES_* variables
  2. Redis

    • Caching layer
    • Port: 6379
    • No persistence needed
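
Because the application depends on both services, it may help to wait until they accept connections before starting. The following is a minimal sketch of such a check using plain TCP connections. The `wait_for_port` helper is hypothetical, and a successful connection only shows that the port is open, not that the service is ready to serve queries.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again as soon as it has been established.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In a Compose setup this might be called as `wait_for_port("postgres", 5432)` and `wait_for_port("redis", 6379)`, assuming the services are named `postgres` and `redis` and PostgreSQL listens on its default port.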

Security Considerations

  1. Non-root user execution
  2. Secret management via Docker secrets
  3. Read-only filesystem where possible
  4. Minimal base image
  5. Regular security updates
  6. Proper file permissions
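
For item 2, Docker exposes each secret to the container as a file under `/run/secrets/`. The sketch below shows one way the application could read such a file, falling back to an environment variable for local development. The `read_secret` helper and the fallback behaviour are assumptions for illustration, not the project's actual code.

```python
import os
from pathlib import Path

# Docker mounts each secret as a file in this directory by default.
SECRETS_DIR = Path("/run/secrets")


def read_secret(name, secrets_dir=SECRETS_DIR, env=None):
    """Return a secret from a mounted file, falling back to an environment variable."""
    env = os.environ if env is None else env
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Secret files often end with a newline that is not part of the value.
        return secret_file.read_text(encoding="utf-8").strip()
    value = env.get(name.upper())
    if value is None:
        raise KeyError(f"secret {name!r} not found in {secrets_dir} or environment")
    return value
```

For example, `read_secret("postgres_password")` would read `/run/secrets/postgres_password` if it exists, and otherwise look for `POSTGRES_PASSWORD` in the environment.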

Docker Compose Configuration

Services:

  1. Main application
  2. PostgreSQL database
  3. Redis cache
  4. Optional: Monitoring

Resource Management

  • Memory limits
  • CPU allocation
  • Volume mounts for:
    • Paper cache
    • Vector store
    • Logs
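
To make these settings concrete, the sketch below assembles a `docker run` invocation with explicit memory and CPU limits and the three volume mounts. The `--memory`, `--cpus`, `--volume`, `--publish`, and `--env-file` flags are standard `docker run` options. The `docker_run_command` helper and the limit values are placeholders chosen for the example, not recommendations from this project.

```python
import shlex
from pathlib import Path


def docker_run_command(image="lastin-ai:prod", memory="1g", cpus="1.0", host_root="."):
    """Return the docker run argument list with resource limits and volume mounts."""
    root = Path(host_root).resolve()  # bind mounts are given as absolute host paths
    args = [
        "docker", "run",
        "--publish", "8000:8000",
        "--env-file", ".env.prod",
        "--memory", memory,  # hard memory limit for the container
        "--cpus", cpus,      # number of CPUs the container may use
    ]
    for name in ("cache", "data", "logs"):
        args += ["--volume", f"{root / name}:/app/{name}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command(memory="512m", cpus="0.5")))
```

The same limits can be declared in the Compose file instead; the list form here is only meant to show which options are involved.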

Health Checks

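# curl is not included in python:3.8-slim-bullseye; install it in the production stage for this check to work
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1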
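
The check above assumes the application answers `GET /health` on port 8000. The sketch below shows what such an endpoint and a matching probe could look like using only the Python standard library. The web framework actually used by the application is not stated here, so `HealthHandler`, `probe`, and the JSON body are illustrative only.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore proxy settings from the environment; the probe only targets the local container.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep periodic probe requests out of the server log


def probe(url="http://localhost:8000/health", timeout=5.0):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers connection errors and HTTP 4xx/5xx responses
        return False


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), HealthHandler).serve_forever()
```

A probe like this is similar in spirit to `curl -f` and could back the health check in images that do not ship curl, by exiting with a non-zero status when `probe()` returns False.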

Build and Run Commands

# Build
docker build --tag lastin-ai:prod --target production .

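# Run (absolute host paths, since older Docker versions reject relative bind-mount paths)
docker run -p 8000:8000 \
    --env-file .env.prod \
    --volume "$(pwd)/cache:/app/cache" \
    --volume "$(pwd)/data:/app/data" \
    --volume "$(pwd)/logs:/app/logs" \
    lastin-ai:prod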

File Exclusions (.dockerignore)

.git
.env*
__pycache__
*.pyc
.pytest_cache
.coverage
htmlcov
.vscode
*.log
cache/*
data/*
logs/*

Implementation Steps

  1. Switch to Code mode
  2. Create Dockerfile
  3. Create docker-compose.yml
  4. Create .dockerignore
  5. Test build and deployment
  6. Implement health checks
  7. Configure monitoring

Security Hardening Steps

  1. Implement least privilege principle
  2. Regular dependency updates
  3. Image vulnerability scanning
  4. Secrets management
  5. Network security policies

Recommendations

  1. Use multi-stage builds for minimal production image
  2. Implement proper logging configuration
  3. Regular security audits
  4. Backup strategy for persistent data
  5. Monitoring and alerting setup

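This containerization strategy is intended to provide:

  • Efficient builds (multi-stage build, trimmed build context via .dockerignore)
  • Secure runtime (non-root user, Docker secrets, minimal base image)
  • Scalable deployment (application, PostgreSQL, and Redis run as separate services)
  • Proper resource management (memory and CPU limits, dedicated volumes)
  • Easy maintenance (health checks, monitoring, regular updates)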