Architecture

An overview of the high-level architecture of the RAG Chatbot Platform: its components, data flows, and technology stack.

System Overview

┌─────────────────────────────────────────────────────────┐
│                      Your Website                       │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │  Dashboard  │    │   Widget    │    │     API     │  │
│  │  (Next.js)  │    │  (Iframe)   │    │  (Clients)  │  │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘  │
└─────────┼──────────────────┼──────────────────┼─────────┘
          │                  │                  │
          ▼                  ▼                  ▼
┌─────────────────────────────────────────────────────────┐
│                  API Server (FastAPI)                   │
│  Auth │ Projects │ Chat │ Sources │ Rate Limit │ Cache  │
└────────────────────────────┬────────────────────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│  PostgreSQL   │    │     Redis     │    │      S3       │
│  (pgvector)   │    │  Cache/Queue  │    │    Storage    │
└───────────────┘    └───────────────┘    └───────────────┘

Components

API Server (FastAPI)

  • Authentication (JWT and API keys)
  • Chat API with RAG
  • Project & Source management
  • Rate limiting

Background Workers (Dramatiq)

  • Document parsing (PDF, URL)
  • Chunking
  • Embedding generation
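
The chunking step can be illustrated with a fixed-size splitter that overlaps consecutive chunks. This is only a minimal sketch, not the platform's actual chunker: the function name `chunk_text` and the `size` and `overlap` parameters are assumptions, and a real worker may well split on tokens or sentences rather than characters.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of `size`.

    Each window shares `overlap` characters with the previous one so that
    a sentence cut at a boundary still appears whole in one of the chunks.
    Hypothetical helper for illustration only.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

For example, `chunk_text("abcdefghij", size=4, overlap=1)` yields three chunks, each starting with the last character of the one before it.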

Database (PostgreSQL + pgvector)

  • Projects, Sources, Chunks
  • Vector embeddings

Cache and Queue (Redis)

  • Session cache
  • Rate limit counters
  • Background job queue
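
One common way to keep rate limit counters is a fixed-window counter: one key per client per time window, incremented on every request. With Redis this is typically an `INCR` on the key plus an `EXPIRE` so old windows disappear. The sketch below uses an in-memory dictionary as a stand-in for Redis so it runs on its own; the class name, key format, and parameters are assumptions and may not match how the platform actually counts requests.

```python
import time


class FixedWindowLimiter:
    """Fixed-window rate limiter (illustrative; a dict stands in for Redis)."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be controlled in tests
        self.counters: dict[str, int] = {}  # stand-in for Redis keys

    def allow(self, client_id: str) -> bool:
        # All requests in the same window share one counter key.
        window = int(self.clock() // self.window_seconds)
        key = f"ratelimit:{client_id}:{window}"  # hypothetical key format

        # With Redis this would be INCR key, followed by EXPIRE on first use.
        # The in-memory stand-in never evicts old keys.
        count = self.counters.get(key, 0) + 1
        self.counters[key] = count
        return count <= self.limit
```

A fixed window is simple and cheap, at the cost of allowing short bursts across a window boundary; a sliding window or token bucket smooths that out if it matters.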

Data Flow

Document Ingestion

User Upload → API → S3 → Queue → Worker Parse → Chunk → Embed → DB
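
The ingestion flow above can be sketched end to end with in-memory stand-ins. Everything here is hypothetical and for illustration only: the dictionaries and `queue.Queue` replace S3, the Redis-backed queue, and the database; `fake_embed` is a placeholder for a real embedding model; and "parsing" is reduced to decoding UTF-8 text split on blank lines.

```python
import queue
import uuid

object_store: dict[str, bytes] = {}  # stand-in for S3
job_queue = queue.Queue()            # stand-in for the Redis-backed job queue
chunk_table: list[dict] = []         # stand-in for the Chunks table


def fake_embed(text: str) -> list[float]:
    # Placeholder vector; a real worker would call an embedding model here.
    return [float(len(text)), float(text.count(" "))]


def handle_upload(data: bytes) -> str:
    """API side: persist the raw file, then enqueue a job that references it."""
    source_id = str(uuid.uuid4())
    object_store[source_id] = data
    job_queue.put(source_id)
    return source_id


def process_next_job() -> int:
    """Worker side: parse, chunk, embed, store. Returns the chunks written."""
    source_id = job_queue.get_nowait()
    text = object_store[source_id].decode("utf-8")                 # parse
    chunks = [p.strip() for p in text.split("\n\n") if p.strip()]  # chunk
    for index, chunk in enumerate(chunks):
        chunk_table.append({
            "source_id": source_id,
            "index": index,
            "text": chunk,
            "embedding": fake_embed(chunk),                        # embed
        })
    return len(chunks)
```

The point of the split is that `handle_upload` returns as soon as the file is stored and the job is queued, while the slower parse, chunk, and embed work happens later in the worker.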

Chat Request

Widget → API → Embed Query → Vector Search → LLM → Response
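
The retrieval part of this flow can be sketched in plain Python. This is a toy illustration under stated assumptions, not the platform's code: `embed` is a bag-of-words stand-in for a real embedding model, ranking happens in memory (in this stack it would presumably be a pgvector query, for example using its cosine distance operator `<=>`), and the LLM call is left out, so the sketch stops at the assembled prompt. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a stand-in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Embed the query, rank stored chunks by similarity, keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    # The retrieved chunks become the grounding context sent to the LLM.
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

A request handler would call `retrieve` with the user's message, pass the result to `build_prompt`, and send the prompt to the LLM to produce the response.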

Technology Stack

Component     Technology
API           FastAPI
Workers       Dramatiq
Database      PostgreSQL + pgvector
Cache/Queue   Redis
Storage       S3 (Cloudflare R2)
LLM           Azure OpenAI
Frontend      Next.js