This repository has been archived on 2025-02-10. You can view files and clone it, but cannot push or open issues or pull requests.

# 🐳 Docker Architecture for Last-In AI
## Overview
This document outlines the containerization strategy for the Last-In AI application, including multi-stage builds, service dependencies, and security considerations.
## Base Image Selection
```dockerfile
# Build stage
FROM python:3.8-slim-bullseye AS builder
# Slim-bullseye chosen for:
# - Minimal size while maintaining compatibility
# - Security updates and stability
# - Python 3.8+ compatibility
```
## Multi-Stage Build Strategy
1. **Builder Stage**
   - Install build dependencies
   - Install Python packages
   - Compile any necessary components
2. **Production Stage**
   - Copy only necessary files from builder
   - Minimal runtime dependencies
   - Non-root user setup
## Dependencies and Requirements
- Python packages from `requirements.txt`
- System dependencies:
  - build-essential (for some Python packages)
  - libpq-dev (for PostgreSQL)
  - Optional: tesseract-ocr (for PDF processing)
## Directory Structure
```
/app
├── src/ # Application code
├── config/ # Configuration files
├── cache/ # Paper cache directory
├── data/ # Vector store data
└── logs/ # Application logs
```
## Environment Configuration
- Source: `.env.example`
- Runtime variables:
  - Database credentials
  - API keys
  - Redis configuration
  - Storage paths
  - Security settings
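
As a hedged illustration of the runtime variables listed above, the sketch below reads settings from the process environment and fails fast when a required one is missing. The variable names and defaults are assumptions chosen for illustration (only the `POSTGRES_*` prefix and the `/app/...` paths come from this document); the authoritative list is whatever `.env.example` defines.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; the real ones are defined in .env.example.
REQUIRED_VARS = ("POSTGRES_HOST", "POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB")

# Hypothetical optional settings with assumed defaults.
OPTIONAL_DEFAULTS = {
    "REDIS_HOST": "redis",
    "REDIS_PORT": "6379",
    "CACHE_DIR": "/app/cache",
    "DATA_DIR": "/app/data",
    "LOG_DIR": "/app/logs",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect settings from the environment, raising if a required one is absent or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    settings = {name: source[name] for name in REQUIRED_VARS}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = source.get(name, default)
    return settings
```

Calling `load_settings()` once at startup surfaces a misconfigured `--env-file` immediately instead of at the first database or cache access.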
## Service Dependencies
1. **PostgreSQL**
   - Primary database
   - Persistent volume for data
   - Environment: `POSTGRES_*` variables
2. **Redis**
   - Caching layer
   - Port: 6379
   - No persistence needed
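
Because the application depends on both services, it can help to wait until their ports accept connections before starting. The following is a minimal sketch of such a check, not the project's actual startup logic. The function names are hypothetical, and a reachable port only shows that the service is listening, not that it is ready to serve queries.

```python
import socket
import time
from typing import Iterable, Tuple


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again immediately; only reachability matters here.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def wait_for_services(services: Iterable[Tuple[str, int]], timeout: float = 30.0) -> bool:
    """Check each (host, port) pair in turn; stop at the first one that never becomes reachable."""
    return all(wait_for_port(host, port, timeout=timeout) for host, port in services)
```

An entrypoint could call `wait_for_services([("postgres", 5432), ("redis", 6379)])` and exit non-zero on `False`. The host names `postgres` and `redis` are assumed Compose service names, and 5432 is PostgreSQL's default port.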
## Security Considerations
1. Non-root user execution
2. Secret management via Docker secrets
3. Read-only filesystem where possible
4. Minimal base image
5. Regular security updates
6. Proper file permissions
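
Item 2 above relies on Docker secrets, which Docker mounts into the container as files under `/run/secrets/<secret_name>`. A minimal sketch of reading one from application code, with an environment-variable fallback for local runs, might look like this. The helper name `read_secret` and the fallback behavior are assumptions, not part of the project.

```python
import os
from pathlib import Path
from typing import Optional

# Docker mounts each secret as a file under /run/secrets/<secret_name>.
SECRETS_DIR = Path("/run/secrets")


def read_secret(name: str, secrets_dir: Path = SECRETS_DIR, env_fallback: bool = True) -> Optional[str]:
    """Return the secret from its mounted file, or from the environment as a fallback."""
    secret_file = secrets_dir / name
    if secret_file.is_file():
        # Strip surrounding whitespace; secret files often end with a newline.
        return secret_file.read_text(encoding="utf-8").strip()
    if env_fallback:
        # Assumed convention: secret "postgres_password" falls back to $POSTGRES_PASSWORD.
        return os.environ.get(name.upper())
    return None
```

For example, `read_secret("postgres_password")` (a hypothetical secret name) would return the contents of `/run/secrets/postgres_password` when that secret is attached to the service, and the value of `POSTGRES_PASSWORD` otherwise.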
## Docker Compose Configuration
Services:
1. Main application
2. PostgreSQL database
3. Redis cache
4. Optional: Monitoring
## Resource Management
- Memory limits
- CPU allocation
- Volume mounts for:
  - Paper cache
  - Vector store
  - Logs
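
Mounted volumes combined with a non-root user are a common source of permission errors, so a startup check on the mounted directories can be useful. The sketch below is illustrative only. The paths come from the directory structure in this document, while the function name is hypothetical.

```python
import os
from pathlib import Path
from typing import Iterable, List

# Mount points taken from the directory structure described in this document.
WRITABLE_DIRS = ("/app/cache", "/app/data", "/app/logs")


def find_unwritable(dirs: Iterable[str] = WRITABLE_DIRS) -> List[str]:
    """Return the directories that are missing or not writable by the current user."""
    problems = []
    for d in dirs:
        path = Path(d)
        # W_OK | X_OK: creating files in a directory needs write and search permission.
        if not path.is_dir() or not os.access(str(path), os.W_OK | os.X_OK):
            problems.append(d)
    return problems
```

If `find_unwritable()` returns a non-empty list at startup, the host directories behind the mounts probably need their ownership adjusted to match the container user.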
## Health Checks
```dockerfile
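# Note: curl is not included in python:*-slim images; install it in the
# production stage or replace this with a Python-based probe.
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1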
```
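
The `HEALTHCHECK` above expects the application to answer `GET /health` on port 8000 with a success status. The application's web framework is not specified here, so the following is a standard-library stand-in that shows the contract rather than the project's real endpoint. It also includes a small `probe` helper that could serve as the check command in images without curl. All names are hypothetical.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer GET /health with 200 and a small JSON body; everything else gets 404."""

    def do_GET(self) -> None:
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args: object) -> None:
        # Silence per-request logging so frequent probes do not flood the logs.
        pass


def make_server(host: str = "0.0.0.0", port: int = 8000) -> HTTPServer:
    """Build the server without starting it; call serve_forever() to run."""
    return HTTPServer((host, port), HealthHandler)


def probe(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> int:
    """Return 0 if the endpoint answers with a 2xx status, else 1 (usable as an exit code)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses (URLError subclasses OSError).
        return 1
```

A real endpoint would typically also verify that PostgreSQL and Redis are reachable before reporting healthy, so that the container is marked unhealthy when a dependency is down.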
## Build and Run Commands
```bash
# Build
docker build --tag lastin-ai:prod --target production .
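# Run (absolute host paths, since older Docker releases reject relative bind-mount sources)
docker run -p 8000:8000 \
    --env-file .env.prod \
    --volume "$(pwd)/cache:/app/cache" \
    --volume "$(pwd)/data:/app/data" \
    --volume "$(pwd)/logs:/app/logs" \
    lastin-ai:prod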
```
## File Exclusions (.dockerignore)
```
.git
.env*
__pycache__
*.pyc
.pytest_cache
.coverage
htmlcov
.vscode
*.log
cache/*
data/*
logs/*
```
## Implementation Steps
1. Create Dockerfile
2. Create docker-compose.yml
3. Create .dockerignore
4. Test build and deployment
5. Implement health checks
6. Configure monitoring
## Security Hardening Steps
1. Implement least privilege principle
2. Regular dependency updates
3. Image vulnerability scanning
4. Secrets management
5. Network security policies
## Recommendations
1. Use multi-stage builds for minimal production image
2. Implement proper logging configuration
3. Regular security audits
4. Backup strategy for persistent data
5. Monitoring and alerting setup

This containerization strategy is intended to provide:
- Efficient builds
- Secure runtime
- Scalable deployment
- Proper resource management
- Easy maintenance