Architecture Overview

DeepWiki is a modern AI-powered documentation generation system built with a scalable microservices architecture. This document provides a comprehensive overview of the system’s architecture, components, and design decisions.

High-Level System Architecture

Frontend Architecture

The frontend is built with Next.js 14, leveraging modern React patterns and TypeScript for type safety.

Component Structure

Key Frontend Features

Server-Side Rendering (SSR)
- Pre-renders pages for better SEO and initial load performance
- Dynamic route generation for repository pages
Real-time Communication
- WebSocket connection for streaming chat responses
- Progressive content rendering
Internationalization (i18n)
- Support for 10+ languages
- Context-based language switching
Responsive Design
- Mobile-first approach
- Adaptive layouts for different screen sizes

Backend Architecture

The backend is built with FastAPI, providing high-performance async capabilities and automatic API documentation.

API Structure

RAG Pipeline Architecture

The Retrieval-Augmented Generation (RAG) pipeline is the core of DeepWiki’s intelligence.

Data Flow

Wiki Generation Flow

Real-time Chat Flow

Component Interactions

Core Service Dependencies

Database and Storage Design

Storage Architecture

Cache Key Structure

deepwiki_cache_{repo_type}_{owner}_{repo}_{language}.json

Example:
deepwiki_cache_github_microsoft_vscode_en.json

Caching Strategies

Multi-Level Caching

Wiki Cache (Persistent)
- Full wiki structures stored as JSON
- Language-specific caching
- File-based for durability
Vector Cache (In-Memory)
- FAISS indices for fast retrieval
- Rebuilt on startup from embeddings
Conversation Cache (Session)
- Dialog history per session
- Memory-efficient circular buffer

Cache Invalidation

Security Architecture

Authentication & Authorization

Security Features

API Security
- CORS configuration for cross-origin requests
- Environment-based API key management
- Input validation and sanitization
Data Protection
- Token-based repository access
- Secure storage of credentials
- Encrypted communication channels
Rate Limiting
- Request throttling per IP
- WebSocket connection limits
- Resource usage monitoring

Scalability Considerations

Horizontal Scaling

Performance Optimizations

Async Processing
- Non-blocking I/O operations
- Concurrent request handling
- Stream processing for large responses
Resource Management
- Memory limits (6GB max, 2GB reserved)
- Connection pooling
- Garbage collection optimization
Caching Strategy
- Pre-computed wiki structures
- Embedding reuse
- Response streaming

Centralized Prompt Management

As of commit 36777d3, DeepWiki implements a centralized prompt management system for better maintainability and consistency across the application.

Prompt Architecture

Prompt Types and Usage

1. RAG System Prompt

Purpose: Core prompt for RAG-based question answering
Features:
- Language detection and response matching
- Markdown formatting guidelines
- Context integration with past conversations
- Code syntax highlighting support

2. Deep Research Prompts

Three-stage research prompt system for comprehensive analysis:

First Iteration: Research planning and initial investigation
Intermediate Iterations: Building on previous findings
Final Iteration: Synthesis and comprehensive conclusion

3. Simple Chat System Prompt

Purpose: Direct, concise responses without preambles
Features:
- No markdown headers in responses
- Direct answers without filler
- Optimized for quick interactions

Benefits of Centralized Management

Consistency: All prompts follow the same standards and formatting
Maintainability: Single source of truth for all prompt templates
Versioning: Easy to track prompt changes and iterations
Testability: Prompts can be unit tested and validated
Flexibility: Easy to add new prompts or modify existing ones

Implementation Example

# api/prompts.py
"""Module containing all prompts used in the DeepWiki project."""

# Centralized prompt definitions
RAG_SYSTEM_PROMPT = r"""
You are a code assistant which answers user questions on a Github Repo.
You will receive user query, relevant context, and past conversation history.

LANGUAGE DETECTION AND RESPONSE:
- Detect the language of the user's query
- Respond in the SAME language as the user's query
...
"""

# Usage in rag.py
from api.prompts import RAG_SYSTEM_PROMPT, RAG_TEMPLATE

def create_rag_prompt(query, contexts, history):
    return RAG_TEMPLATE.format(
        system_prompt=RAG_SYSTEM_PROMPT,
        input_str=query,
        contexts=contexts,
        conversation_history=history
    )

Technology Stack

Frontend Technologies

Category	Technology	Purpose
Framework	Next.js 14	React framework with SSR
Language	TypeScript	Type safety
Styling	Tailwind CSS	Utility-first CSS
UI Library	Radix UI	Accessible components
State	React Context	Global state management
HTTP	Fetch API	REST communication
WebSocket	Native WebSocket	Real-time communication
Markdown	react-markdown	Content rendering
Diagrams	Mermaid	Flowchart rendering

Backend Technologies

Category	Technology	Purpose
Framework	FastAPI	High-performance API
Language	Python 3.11+	Backend development
Server	Uvicorn	ASGI server
Vector DB	FAISS	Similarity search
ML Framework	AdalFlow	RAG implementation
Logging	Python logging	Application logs
Environment	python-dotenv	Configuration

AI/ML Stack

Category	Technology	Purpose
Embeddings	OpenAI/Gemini	Text embeddings
Generation	Multiple LLMs	Content generation
RAG	AdalFlow	Retrieval augmentation
Vector Search	FAISS	Semantic search

Infrastructure

Category	Technology	Purpose
Container	Docker	Application packaging
Orchestration	Docker Compose	Service management
Storage	File System	Persistent storage
Process Mgmt	Supervisor	Process control

Deployment Architecture

Best Practices & Design Principles

Architecture Principles

Separation of Concerns
- Clear boundaries between frontend and backend
- Modular service design
- Independent scaling capabilities
Resilience
- Graceful error handling
- Fallback mechanisms
- Health monitoring
Performance
- Async-first design
- Efficient caching
- Resource optimization
Maintainability
- Clean code architecture
- Comprehensive logging
- Type safety throughout

Development Guidelines

Code Organization
- Feature-based module structure
- Consistent naming conventions
- Separation of business logic
Testing Strategy
- Unit tests for core logic
- Integration tests for APIs
- End-to-end testing for critical flows
Documentation
- Inline code documentation
- API documentation (OpenAPI)
- Architecture decision records

Future Enhancements

Planned Improvements

Scalability
- Kubernetes deployment support
- Distributed caching with Redis
- Message queue integration
Features
- Real-time collaboration
- Advanced analytics
- Plugin system
Performance
- GraphQL API option
- Edge caching
- Optimized vector search

Architecture Evolution

The architecture is designed to evolve with:

Microservices decomposition
Event-driven architecture
Serverless function support
Multi-region deployment

Conclusion

DeepWiki’s architecture balances performance, scalability, and maintainability while providing a robust platform for AI-powered documentation generation. The modular design allows for easy extension and adaptation to changing requirements.

Get Started

Configuration

Advanced

Support

​Architecture Overview

​High-Level System Architecture

​Frontend Architecture

​Component Structure

​Key Frontend Features

​Backend Architecture

​API Structure

​RAG Pipeline Architecture

​Data Flow

​Wiki Generation Flow

​Real-time Chat Flow

​Component Interactions

​Core Service Dependencies

​Database and Storage Design

​Storage Architecture

​Cache Key Structure

​Caching Strategies

​Multi-Level Caching

​Cache Invalidation

​Security Architecture

​Authentication & Authorization

​Security Features

​Scalability Considerations

​Horizontal Scaling

​Performance Optimizations

​Centralized Prompt Management

​Prompt Architecture

​Prompt Types and Usage

​1. RAG System Prompt

​2. Deep Research Prompts

​3. Simple Chat System Prompt

​Benefits of Centralized Management

​Implementation Example

​Technology Stack

​Frontend Technologies

​Backend Technologies

​AI/ML Stack

​Infrastructure

​Deployment Architecture

​Best Practices & Design Principles

​Architecture Principles

​Development Guidelines

​Future Enhancements

​Planned Improvements

​Architecture Evolution

​Conclusion