# API Authentication
Source: https://asyncfunc.mintlify.app/endpoints/authentication
Authentication and security configuration for the DeepWiki-Open API including authorization modes and access control
The DeepWiki-Open API supports multiple authentication methods to secure your documentation generation service. This guide covers all authentication options, security configurations, and best practices.
## Authentication Overview
DeepWiki-Open uses a multi-layered authentication approach:
* API provider keys configured via environment variables
* Optional secret code protection for wiki generation
* Personal access tokens for private repository access
* Input validation and rate limiting protection
## Environment-Based Authentication
### API Provider Authentication
The API automatically validates configured AI provider credentials:
**Google**: Uses `GOOGLE_API_KEY` from the environment automatically.
**Validation endpoint:**
```bash
curl -X GET "http://localhost:8001/auth/validate/google"
```
**Response:**
```json
{
  "provider": "google",
  "status": "valid",
  "models_available": [
    "gemini-2.0-flash",
    "gemini-1.5-flash",
    "gemini-1.0-pro"
  ],
  "quota": {
    "requests_per_minute": 15,
    "tokens_per_minute": 32000,
    "usage_today": "12%"
  }
}
```
**OpenAI**: Uses `OPENAI_API_KEY` from the environment automatically.
**Validation endpoint:**
```bash
curl -X GET "http://localhost:8001/auth/validate/openai"
```
**Response:**
```json
{
  "provider": "openai",
  "status": "valid",
  "models_available": [
    "gpt-4o",
    "gpt-4.1",
    "o1",
    "o4-mini"
  ],
  "quota": {
    "tier": "tier-2",
    "requests_per_minute": 500,
    "tokens_per_minute": 30000,
    "current_usage": "8%"
  }
}
```
**Azure OpenAI**: Uses Azure credentials from the environment.
**Required environment variables:**
```env
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
```
**Validation:**
```bash
curl -X GET "http://localhost:8001/auth/validate/azure"
```
### Authentication Status
Check overall authentication status:
```bash cURL
curl -X GET "http://localhost:8001/auth/status"
```
```python Python
import requests
response = requests.get("http://localhost:8001/auth/status")
auth_status = response.json()
```
```javascript JavaScript
const response = await fetch('http://localhost:8001/auth/status');
const authStatus = await response.json();
```
```json Response
{
  "authentication": {
    "overall_status": "healthy",
    "providers": {
      "google": {
        "configured": true,
        "valid": true,
        "models_count": 3
      },
      "openai": {
        "configured": true,
        "valid": true,
        "models_count": 4
      },
      "openrouter": {
        "configured": false,
        "valid": null,
        "models_count": 0
      },
      "azure": {
        "configured": true,
        "valid": true,
        "models_count": 2
      },
      "ollama": {
        "configured": true,
        "valid": true,
        "models_count": 3,
        "local": true
      }
    },
    "recommended_provider": "google",
    "fallback_providers": ["openai", "azure"]
  },
  "authorization": {
    "mode": "disabled",
    "required": false
  }
}
```
## Authorization Mode
Optional access control requiring a secret code for wiki generation.
### Enabling Authorization Mode
Set authorization environment variables:
```env
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=your-secret-code-here
```
Choose a strong, unique authorization code. This provides basic access control.
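One way to generate such a code is with Python's standard `secrets` module (a minimal sketch; copy the output into `DEEPWIKI_AUTH_CODE`):

```python
import secrets

# 32 random bytes, URL-safe encoded; suitable as a DEEPWIKI_AUTH_CODE value
code = secrets.token_urlsafe(32)
print(code)
```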
Authorization mode requires a server restart:
```bash
# Stop current server
# Restart with new configuration
python -m api.main
```
Server logs should show: "Authorization mode: ENABLED"
```bash
curl -X GET "http://localhost:8001/auth/status"
```
Should show:
```json
{
  "authorization": {
    "mode": "enabled",
    "required": true
  }
}
```
### Using Authorization Mode
When authorization is enabled, all wiki generation requests must include the auth code:
```bash cURL with Auth
curl -X POST "http://localhost:8001/wiki/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/microsoft/vscode",
    "model_provider": "google",
    "auth_code": "your-secret-code-here"
  }'
```
```python Python with Auth
import requests
response = requests.post("http://localhost:8001/wiki/generate", json={
    "repo_url": "https://github.com/microsoft/vscode",
    "model_provider": "google",
    "auth_code": "your-secret-code-here"
})
```
```javascript JavaScript with Auth
const response = await fetch('http://localhost:8001/wiki/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    repo_url: 'https://github.com/microsoft/vscode',
    model_provider: 'google',
    auth_code: 'your-secret-code-here'
  })
});
```
### Authorization Validation
Test authorization codes before use:
```bash Validate Auth Code
curl -X POST "http://localhost:8001/auth/validate" \
  -H "Content-Type: application/json" \
  -d '{
    "auth_code": "your-secret-code-here"
  }'
```
```json Valid Code
{
  "valid": true,
  "message": "Authorization code accepted"
}
```
```json Invalid Code
{
  "valid": false,
  "message": "Invalid authorization code",
  "error": "AUTHORIZATION_FAILED"
}
```
## Repository Access Tokens
For accessing private repositories, provide personal access tokens.
### Supported Platforms
**GitHub token creation:**
1. Go to GitHub Settings → Developer settings → Personal access tokens
2. Generate new token (classic or fine-grained)
3. Select scopes: `repo` (full repository access)
**Token format:** `ghp_xxxxxxxxxxxxxxxxxxxx` (classic) or `github_pat_xxxx` (fine-grained)
**Usage in requests:**
```json
{
  "repo_url": "https://github.com/company/private-repo",
  "access_token": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "model_provider": "google"
}
```
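Client code can cheaply reject obviously malformed tokens before calling the API. A loose sanity check on the documented `ghp_` and `github_pat_` prefixes (illustrative only; exact token lengths vary, so only the prefix and character set are checked):

```python
import re

# Matches the documented classic (ghp_) and fine-grained (github_pat_) prefixes.
# This is a shape check, not a validity check; use the token validation
# endpoint for real verification.
GITHUB_TOKEN_RE = re.compile(r'^(ghp_[A-Za-z0-9]+|github_pat_[A-Za-z0-9_]+)$')

def looks_like_github_token(token: str) -> bool:
    return bool(GITHUB_TOKEN_RE.match(token))

print(looks_like_github_token("ghp_abc123"))   # → True
print(looks_like_github_token("not-a-token"))  # → False
```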
**GitLab token creation:**
1. Go to GitLab User Settings → Access Tokens
2. Create personal access token
3. Select scopes: `read_repository`
**Token format:** `glpat-xxxxxxxxxxxxxxxxxxxx`
**Usage in requests:**
```json
{
  "repo_url": "https://gitlab.com/company/private-repo",
  "access_token": "glpat-xxxxxxxxxxxxxxxxxxxx",
  "model_provider": "google"
}
```
**Bitbucket token creation:**
1. Go to Bitbucket Account Settings → App passwords
2. Create app password
3. Select permissions: `Repositories: Read`
**Token format:** App-specific password
**Usage in requests:**
```json
{
  "repo_url": "https://bitbucket.org/company/private-repo",
  "access_token": "your-app-password",
  "model_provider": "google"
}
```
### Token Validation
Validate repository access tokens:
```bash Validate GitHub Token
curl -X POST "http://localhost:8001/auth/validate-token" \
  -H "Content-Type: application/json" \
  -d '{
    "platform": "github",
    "token": "ghp_xxxxxxxxxxxxxxxxxxxx"
  }'
```
```json Valid Token
{
  "valid": true,
  "platform": "github",
  "scopes": ["repo", "user"],
  "user": "your-username",
  "expires_at": "2024-12-31T23:59:59Z"
}
```
```json Invalid Token
{
  "valid": false,
  "platform": "github",
  "error": "BAD_CREDENTIALS",
  "message": "Token is invalid or expired"
}
```
## Security Best Practices
### Environment Security
**Secure storage:**
```bash
# Use environment files with restricted permissions
chmod 600 .env
# Never commit API keys to version control
echo ".env" >> .gitignore
# Use different keys for development and production
# .env.development vs .env.production
```
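A minimal stdlib-only loader for such files (a sketch; the `python-dotenv` package provides the same via `load_dotenv`, and the `.env.{stage}` naming simply follows the convention in the comment above):

```python
import os

def load_env_file(path: str) -> None:
    """Load KEY=VALUE pairs from an env file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith('#') and '=' in line:
                key, _, value = line.partition('=')
                os.environ.setdefault(key.strip(), value.strip())

# Pick the file for the current stage (falls back to development)
stage = os.environ.get('STAGE', 'development')
if os.path.exists(f'.env.{stage}'):
    load_env_file(f'.env.{stage}')
```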
**Key rotation:**
```bash
# Regular key rotation schedule
# 1. Generate new API keys
# 2. Test with new keys
# 3. Update production environment
# 4. Revoke old keys
# 5. Monitor for any issues
```
**Monitoring:**
```json
{
  "api_key_monitoring": {
    "check_quota_daily": true,
    "alert_on_high_usage": true,
    "track_failed_requests": true,
    "log_security_events": true
  }
}
```
**HTTPS configuration:**
```nginx
# Nginx configuration for HTTPS
server {
    listen 443 ssl;
    server_name deepwiki.yourdomain.com;

    ssl_certificate /path/to/certificate.crt;
    ssl_certificate_key /path/to/private.key;

    location / {
        proxy_pass http://localhost:8001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
**Firewall rules:**
```bash
# Only allow necessary ports
sudo ufw allow 22 # SSH
sudo ufw allow 80 # HTTP (redirect to HTTPS)
sudo ufw allow 443 # HTTPS
sudo ufw deny 8001 # Block direct API access
sudo ufw enable
```
**Input sanitization:**
```python
# Example validation middleware helper
from urllib.parse import urlparse

def validate_repo_url(url: str) -> bool:
    allowed_domains = [
        'github.com',
        'gitlab.com',
        'bitbucket.org'
    ]
    parsed = urlparse(url)
    return (
        parsed.scheme == 'https' and
        parsed.netloc in allowed_domains and
        len(parsed.path.split('/')) >= 3
    )
```
**Rate limiting:**
```json
{
  "rate_limits": {
    "requests_per_minute": 10,
    "requests_per_hour": 100,
    "requests_per_day": 1000,
    "burst_allowance": 5
  }
}
```
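Limits like these map naturally onto a token bucket: the bucket holds the per-period budget plus the burst allowance and refills continuously. A minimal in-memory sketch (production deployments usually get this from middleware such as a reverse proxy or API gateway):

```python
import time

class TokenBucket:
    """Allow `rate` requests per `per` seconds, plus `burst` extra capacity."""

    def __init__(self, rate: float, per: float, burst: int = 0):
        self.capacity = rate + burst
        self.tokens = self.capacity
        self.fill_rate = rate / per
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# 10 requests/minute with a burst allowance of 5, as in the config above
bucket = TokenBucket(rate=10, per=60, burst=5)
```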
### Production Security
```env
# Production security settings
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=complex-random-string-here
# Additional security
NODE_ENV=production
LOG_LEVEL=WARNING
LOG_SENSITIVE_DATA=false
```
```bash
# Restrict API access to authorized IPs
# Using firewall or reverse proxy
iptables -A INPUT -p tcp --dport 8001 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8001 -j DROP
```
```json
{
  "security_monitoring": {
    "failed_auth_attempts": {
      "threshold": 10,
      "window": "1h",
      "action": "alert"
    },
    "unusual_usage_patterns": {
      "detect_anomalies": true,
      "baseline_days": 7
    },
    "api_abuse_detection": {
      "large_repository_attempts": true,
      "rapid_fire_requests": true
    }
  }
}
```
## Error Handling
### Authentication Errors
```json Missing API Keys
{
  "error": {
    "code": "MISSING_API_KEYS",
    "message": "No valid AI provider API keys configured",
    "details": {
      "configured_providers": [],
      "required_providers": ["google", "openai", "openrouter"],
      "suggestion": "Configure at least one AI provider API key"
    }
  }
}
```
```json Invalid Authorization
{
  "error": {
    "code": "AUTHORIZATION_REQUIRED",
    "message": "Authorization code required but not provided",
    "details": {
      "auth_mode": "enabled",
      "provided_code": null,
      "suggestion": "Include 'auth_code' in request body"
    }
  }
}
```
```json Repository Access Denied
{
  "error": {
    "code": "REPOSITORY_ACCESS_DENIED",
    "message": "Cannot access private repository",
    "details": {
      "repo_url": "https://github.com/company/private-repo",
      "platform": "github",
      "suggestion": "Provide valid access token with repository permissions"
    }
  }
}
```
### Token Validation Errors
```json
{
  "error": {
    "code": "TOKEN_EXPIRED",
    "message": "Repository access token has expired",
    "details": {
      "platform": "github",
      "expired_at": "2024-01-15T10:30:00Z",
      "suggestion": "Generate new personal access token"
    }
  }
}
```
**Resolution:**
1. Generate new personal access token
2. Update request with new token
3. Consider using longer-lived tokens for automation
```json
{
  "error": {
    "code": "INSUFFICIENT_PERMISSIONS",
    "message": "Token lacks required repository permissions",
    "details": {
      "platform": "github",
      "required_scopes": ["repo"],
      "current_scopes": ["user"],
      "suggestion": "Create token with 'repo' scope for private repositories"
    }
  }
}
```
**Resolution:**
1. Create new token with correct scopes
2. For GitHub: include `repo` scope for private repositories
3. For GitLab: include `read_repository` scope
## Integration Examples
### Middleware Authentication
```python Python Middleware
from functools import wraps
from flask import Flask, request, jsonify

app = Flask(__name__)

def require_auth(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        auth_code = (request.json or {}).get('auth_code')
        if not validate_auth_code(auth_code):  # your application's check
            return jsonify({
                'error': 'Invalid or missing authorization code'
            }), 401
        return f(*args, **kwargs)
    return decorated_function

@app.route('/wiki/generate', methods=['POST'])
@require_auth
def generate_wiki():
    # Wiki generation logic
    pass
```
```javascript Node.js Middleware
const authenticateRequest = (req, res, next) => {
  const { auth_code } = req.body;
  if (!validateAuthCode(auth_code)) {
    return res.status(401).json({
      error: 'Invalid or missing authorization code'
    });
  }
  next();
};

app.post('/wiki/generate', authenticateRequest, (req, res) => {
  // Wiki generation logic
});
```
### Client-Side Authentication
```javascript React Hook
import { useState, useEffect } from 'react';

export function useDeepWikiAuth() {
  const [authStatus, setAuthStatus] = useState(null);

  useEffect(() => {
    checkAuthStatus();
  }, []);

  const checkAuthStatus = async () => {
    try {
      const response = await fetch('/api/auth/status');
      const status = await response.json();
      setAuthStatus(status);
    } catch (error) {
      console.error('Auth check failed:', error);
    }
  };

  const validateAuthCode = async (code) => {
    try {
      const response = await fetch('/api/auth/validate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ auth_code: code })
      });
      return response.ok;
    } catch (error) {
      return false;
    }
  };

  return { authStatus, validateAuthCode, checkAuthStatus };
}
```
## Next Steps
* Learn about wiki generation and management endpoints
* Implement comprehensive security measures
* Deploy with proper authentication in production
* Set up authentication monitoring and alerts
# Chat API Endpoints
Source: https://asyncfunc.mintlify.app/endpoints/chat-endpoints
Complete API reference for chat functionality with streaming, research, and real-time features
DeepWikiOpen provides powerful chat capabilities with streaming responses, deep research mode, conversation management, and real-time WebSocket connections.
## Authentication
All chat endpoints require authentication using Bearer tokens:
```bash
Authorization: Bearer <your-api-key>
```
## Rate Limits
* **Standard Chat**: 100 requests per minute
* **Research Mode**: 20 requests per minute
* **WebSocket**: 1000 messages per minute
* **History Operations**: 200 requests per minute
***
## POST /chat/stream
Stream chat responses with RAG (Retrieval-Augmented Generation) capabilities.
* `message` (required): The user's message or question
* `conversation_id`: Unique identifier for conversation continuity; auto-generated if not provided
* `model`: AI model to use for response generation. Options: `gpt-4-turbo`, `gpt-3.5-turbo`, `claude-3-opus`, `claude-3-sonnet`
* `temperature`: Response creativity (0.0 to 2.0)
* Maximum response length in tokens (1 to 4096)
* `context_sources`: Specific knowledge sources to query. Options: `wikipedia`, `academic_papers`, `news`, `documentation`, `all`
* `stream_options`: Streaming configuration options:
  * `include_sources`: Include source citations in the stream
  * `include_metadata`: Include processing metadata
### Request Example
```bash curl
curl -X POST "https://api.deepwikiopen.com/chat/stream" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain quantum computing and its applications",
    "conversation_id": "conv_123",
    "model": "gpt-4-turbo",
    "temperature": 0.7,
    "context_sources": ["wikipedia", "academic_papers"],
    "stream_options": {
      "include_sources": true,
      "include_metadata": true
    }
  }'
```
```python Python
import requests
import json

def stream_chat(message):
    url = "https://api.deepwikiopen.com/chat/stream"
    headers = {
        "Authorization": "Bearer your-api-key",
        "Content-Type": "application/json"
    }
    data = {
        "message": message,
        "model": "gpt-4-turbo",
        "temperature": 0.7,
        "context_sources": ["wikipedia", "academic_papers"],
        "stream_options": {
            "include_sources": True
        }
    }
    response = requests.post(url, headers=headers, json=data, stream=True)
    for line in response.iter_lines():
        if line:
            decoded_line = line.decode('utf-8')
            if decoded_line.startswith('data: '):
                try:
                    chunk = json.loads(decoded_line[6:])
                    if chunk.get('type') == 'content':
                        print(chunk['content'], end='', flush=True)
                except json.JSONDecodeError:
                    pass

stream_chat("Explain quantum computing and its applications")
```
```javascript JavaScript
async function streamChat(message) {
  const response = await fetch('https://api.deepwikiopen.com/chat/stream', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      message: message,
      model: 'gpt-4-turbo',
      temperature: 0.7,
      context_sources: ['wikipedia', 'academic_papers'],
      stream_options: {
        include_sources: true
      }
    })
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      const chunk = decoder.decode(value);
      const lines = chunk.split('\n');

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          try {
            const data = JSON.parse(line.slice(6));
            if (data.type === 'content') {
              process.stdout.write(data.content);
            }
          } catch (e) {
            // Skip invalid JSON
          }
        }
      }
    }
  } finally {
    reader.releaseLock();
  }
}

streamChat('Explain quantum computing and its applications');
```
### Streaming Response Format (SSE)
The response is streamed using Server-Sent Events (SSE) format:
* `type`: Type of stream chunk. Values: `start`, `content`, `sources`, `metadata`, `end`, `error`
* `content`: Text content (for `content` type chunks)
* `sources`: Retrieved knowledge sources (for `sources` type chunks), each with:
  * `title`: Source document title
  * `url`: Source URL or identifier
  * `snippet`: Relevant text excerpt
  * `relevance_score`: Relevance score (0.0 to 1.0)
* `metadata`: Processing metadata (for `metadata` type chunks):
  * `tokens_used`: Total tokens consumed
  * `processing_time`: Processing time in milliseconds
  * `sources_queried`: Number of sources searched
### Example Stream Response
```
data: {"type": "start", "conversation_id": "conv_123", "timestamp": "2024-01-15T10:30:00Z"}
data: {"type": "content", "content": "Quantum computing is a revolutionary"}
data: {"type": "content", "content": " approach to computation that leverages"}
data: {"type": "sources", "sources": [
{
"title": "Quantum Computing - Wikipedia",
"url": "https://en.wikipedia.org/wiki/Quantum_computing",
"snippet": "Quantum computing is a type of computation whose operations can harness the phenomena of quantum mechanics...",
"relevance_score": 0.92
}
]}
data: {"type": "content", "content": " quantum mechanical phenomena..."}
data: {"type": "metadata", "metadata": {"tokens_used": 156, "processing_time": 1250, "sources_queried": 5}}
data: {"type": "end", "conversation_id": "conv_123", "total_tokens": 156}
```
***
## POST /chat/research
Trigger deep research mode for comprehensive, multi-source analysis.
* `query` (required): Research question or topic
* `research_depth`: Depth of research analysis. Options: `quick`, `standard`, `deep`, `comprehensive`
* `source_types`: Types of sources to include in research. Options: `academic`, `wikipedia`, `news`, `documentation`, `books`, `patents`
* `time_range`: Filter sources by publication date:
  * `start_date`: Start date (ISO 8601 format)
  * `end_date`: End date (ISO 8601 format)
* `max_sources`: Maximum number of sources to analyze (5 to 100)
* `output_format`: Response format structure. Options: `structured`, `narrative`, `bullet_points`, `academic`
### Request Example
```bash curl
curl -X POST "https://api.deepwikiopen.com/chat/research" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Latest developments in CRISPR gene editing 2024",
    "research_depth": "deep",
    "source_types": ["academic", "news", "patents"],
    "time_range": {
      "start_date": "2024-01-01",
      "end_date": "2024-12-31"
    },
    "max_sources": 50,
    "output_format": "structured"
  }'
```
```python Python
import requests

def deep_research(query):
    url = "https://api.deepwikiopen.com/chat/research"
    headers = {
        "Authorization": "Bearer your-api-key",
        "Content-Type": "application/json"
    }
    data = {
        "query": query,
        "research_depth": "deep",
        "source_types": ["academic", "news", "patents"],
        "time_range": {
            "start_date": "2024-01-01",
            "end_date": "2024-12-31"
        },
        "max_sources": 50,
        "output_format": "structured"
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()

result = deep_research("Latest developments in CRISPR gene editing 2024")
print(result)
```
```javascript JavaScript
async function deepResearch(query) {
  const response = await fetch('https://api.deepwikiopen.com/chat/research', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      query: query,
      research_depth: 'deep',
      source_types: ['academic', 'news', 'patents'],
      time_range: {
        start_date: '2024-01-01',
        end_date: '2024-12-31'
      },
      max_sources: 50,
      output_format: 'structured'
    })
  });
  return await response.json();
}

deepResearch('Latest developments in CRISPR gene editing 2024')
  .then(result => console.log(result));
```
### Response Format
* `research_id`: Unique identifier for the research session
* `query`: Original research query
* `summary`: Executive summary of findings
* `key_findings`: Main discoveries and insights, each with:
  * `finding`: Key insight or discovery
  * `confidence`: Confidence score (0.0 to 1.0)
  * `supporting_sources`: Sources supporting this finding
* A comprehensive analysis text
* `sources`: All sources analyzed, each with:
  * `title`: Source title
  * `authors`: Author names
  * `publication_date`: Publication date
  * `url`: Source URL
  * `relevance_score`: Relevance to the query (0.0 to 1.0)
  * `key_points`: Important points from the source
* Research session metadata: total sources processed, time taken in seconds, and overall confidence in the findings
***
## GET /chat/history
Retrieve conversation history for a specific conversation or user.
* `conversation_id`: Specific conversation ID to retrieve
* `user_id`: User ID to get all conversations for
* `limit`: Maximum number of messages to return (1 to 1000)
* `offset`: Number of messages to skip for pagination
* A start-date filter: only messages after this date (ISO 8601)
* An end-date filter: only messages before this date (ISO 8601)
### Request Example
```bash curl
curl -X GET "https://api.deepwikiopen.com/chat/history?conversation_id=conv_123&limit=100" \
  -H "Authorization: Bearer your-api-key"
```
```python Python
import requests

def get_chat_history(conversation_id, limit=50):
    url = "https://api.deepwikiopen.com/chat/history"
    headers = {"Authorization": "Bearer your-api-key"}
    params = {
        "conversation_id": conversation_id,
        "limit": limit
    }
    response = requests.get(url, headers=headers, params=params)
    return response.json()

history = get_chat_history("conv_123", limit=100)
```
```javascript JavaScript
async function getChatHistory(conversationId, limit = 50) {
  const params = new URLSearchParams({
    conversation_id: conversationId,
    limit: limit.toString()
  });
  const response = await fetch(`https://api.deepwikiopen.com/chat/history?${params}`, {
    headers: {
      'Authorization': 'Bearer your-api-key'
    }
  });
  return await response.json();
}

getChatHistory('conv_123', 100);
```
### Response Format
* `conversation_id`: Conversation identifier
* An array of conversation messages, each with:
  * A unique message identifier
  * The sender role (`user`, `assistant`, or `system`)
  * The message content
  * The message timestamp (ISO 8601)
  * Additional message data: the AI model that generated the response, tokens consumed, and knowledge sources used
* Pagination information: total messages in the conversation, whether more messages are available, and the offset for the next page
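Assembled into a single payload, a history response could look like this (field names are illustrative reconstructions from the descriptions above, not confirmed by the API):

```json
{
  "conversation_id": "conv_123",
  "messages": [
    {
      "id": "msg_001",
      "role": "user",
      "content": "Explain quantum computing",
      "timestamp": "2024-01-15T10:30:00Z",
      "metadata": {
        "model": "gpt-4-turbo",
        "tokens_used": 12,
        "sources": []
      }
    }
  ],
  "pagination": {
    "total_messages": 42,
    "has_more": true,
    "next_offset": 50
  }
}
```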
***
## DELETE /chat/history
Clear conversation history for a specific conversation or user.
* `conversation_id`: Specific conversation to clear (optional if `user_id` provided)
* `user_id`: Clear all conversations for a user (optional if `conversation_id` provided)
* An optional cutoff date (ISO 8601): only clear messages before this date
### Request Example
```bash curl
curl -X DELETE "https://api.deepwikiopen.com/chat/history" \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation_id": "conv_123"
  }'
```
```python Python
import requests

def clear_chat_history(conversation_id):
    url = "https://api.deepwikiopen.com/chat/history"
    headers = {
        "Authorization": "Bearer your-api-key",
        "Content-Type": "application/json"
    }
    data = {"conversation_id": conversation_id}
    response = requests.delete(url, headers=headers, json=data)
    return response.json()

result = clear_chat_history("conv_123")
```
```javascript JavaScript
async function clearChatHistory(conversationId) {
  const response = await fetch('https://api.deepwikiopen.com/chat/history', {
    method: 'DELETE',
    headers: {
      'Authorization': 'Bearer your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      conversation_id: conversationId
    })
  });
  return await response.json();
}

clearChatHistory('conv_123');
```
### Response Format
* Whether the operation was successful
* The number of messages deleted
* The number of conversations affected
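For example (field names illustrative, following the descriptions above):

```json
{
  "success": true,
  "messages_deleted": 42,
  "conversations_affected": 1
}
```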
***
## WebSocket /ws/chat
Real-time bidirectional chat connection with live streaming and collaborative features.
### Connection URL
```
wss://api.deepwikiopen.com/ws/chat?token=your-api-key&conversation_id=conv_123
```
### Query Parameters
* `token`: Your API authentication token
* `conversation_id`: Conversation to connect to (auto-generated if not provided)
* `user_id`: User identifier for multi-user conversations
### Connection Example
```javascript JavaScript
const ws = new WebSocket('wss://api.deepwikiopen.com/ws/chat?token=your-api-key&conversation_id=conv_123');

ws.onopen = function(event) {
  console.log('Connected to chat WebSocket');
  // Send initial message
  ws.send(JSON.stringify({
    type: 'message',
    content: 'Hello, can you help me with quantum computing?',
    model: 'gpt-4-turbo'
  }));
};

ws.onmessage = function(event) {
  const data = JSON.parse(event.data);
  console.log('Received:', data);

  switch (data.type) {
    case 'content':
      // Stream AI response content
      process.stdout.write(data.content);
      break;
    case 'user_message':
      // Another user's message in group chat
      console.log(`${data.user}: ${data.content}`);
      break;
    case 'typing':
      // Typing indicator
      console.log(`${data.user} is typing...`);
      break;
  }
};

ws.onerror = function(error) {
  console.error('WebSocket error:', error);
};

ws.onclose = function(event) {
  console.log('Disconnected from chat WebSocket');
};
```
```python Python
import asyncio
import websockets
import json

async def chat_websocket():
    uri = "wss://api.deepwikiopen.com/ws/chat?token=your-api-key&conversation_id=conv_123"
    async with websockets.connect(uri) as websocket:
        # Send initial message
        await websocket.send(json.dumps({
            "type": "message",
            "content": "Hello, can you help me with quantum computing?",
            "model": "gpt-4-turbo"
        }))

        # Listen for responses
        async for message in websocket:
            data = json.loads(message)
            if data["type"] == "content":
                print(data["content"], end="", flush=True)
            elif data["type"] == "user_message":
                print(f"\n{data['user']}: {data['content']}")
            elif data["type"] == "typing":
                print(f"\n{data['user']} is typing...")

# Run the WebSocket client
asyncio.run(chat_websocket())
```
### Message Types
#### Client to Server Messages
* `type`: Message type. Values: `message`, `typing`, `stop_generation`, `join_room`, `leave_room`
* `content`: Message content (for `message` type)
* `model`: AI model to use (for `message` type)
* `room_id`: Room to join or leave (for room operations)
#### Server to Client Messages
* `type`: Response type. Values: `content`, `sources`, `user_message`, `typing`, `error`, `connection_info`
* `content`: Response content or user message
* `user`: Username for user messages and typing indicators
* `conversation_id`: Current conversation identifier
* Error information (for `error` type): an error code and description
### Real-time Features
#### Typing Indicators
```javascript
// Send typing indicator
ws.send(JSON.stringify({
  type: 'typing',
  is_typing: true
}));

// Stop typing indicator
ws.send(JSON.stringify({
  type: 'typing',
  is_typing: false
}));
```
#### Multi-user Chat Rooms
```javascript
// Join a chat room
ws.send(JSON.stringify({
  type: 'join_room',
  room_id: 'quantum_physics_discussion'
}));

// Leave a chat room
ws.send(JSON.stringify({
  type: 'leave_room',
  room_id: 'quantum_physics_discussion'
}));
```
#### Stop AI Generation
```javascript
// Stop the AI from generating more content
ws.send(JSON.stringify({
  type: 'stop_generation'
}));
```
***
## Error Handling
### HTTP Error Codes
* **400 Bad Request**: Invalid request parameters or malformed JSON
* **401 Unauthorized**: Missing or invalid API key
* **403 Forbidden**: Insufficient permissions or quota exceeded
* **404 Not Found**: Conversation or resource not found
* **429 Too Many Requests**: Rate limit exceeded
* **500 Internal Server Error**: Server error; try again later
### Error Response Format
```json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded. Try again in 60 seconds.",
    "details": {
      "retry_after": 60,
      "current_usage": 100,
      "limit": 100
    }
  }
}
```
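A client can read `details.retry_after` from that body to decide how long to back off before retrying. A small helper (a sketch; the payload shape follows the example above):

```python
def retry_delay(error_payload: dict, default: float = 1.0) -> float:
    """Seconds to wait before retrying, taken from the error body's
    details.retry_after field; falls back to `default`."""
    error = error_payload.get("error", {})
    if error.get("code") == "RATE_LIMIT_EXCEEDED":
        return float(error.get("details", {}).get("retry_after", default))
    return default

payload = {
    "error": {
        "code": "RATE_LIMIT_EXCEEDED",
        "message": "Rate limit exceeded. Try again in 60 seconds.",
        "details": {"retry_after": 60, "current_usage": 100, "limit": 100}
    }
}
print(retry_delay(payload))  # → 60.0
```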
### WebSocket Error Codes
* Authentication token is invalid or expired
* Too many messages sent too quickly
* Message format is invalid or unsupported
* Specified conversation ID doesn't exist
***
## Real Conversation Examples
### Basic Q\&A with Streaming
```bash
# Request
curl -X POST "https://api.deepwikiopen.com/chat/stream" \
  -H "Authorization: Bearer demo-key" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What causes climate change?",
    "context_sources": ["wikipedia", "academic_papers"]
  }'

# Streaming Response
data: {"type": "start", "conversation_id": "conv_abc123"}
data: {"type": "content", "content": "Climate change is primarily caused by human activities that increase"}
data: {"type": "content", "content": " greenhouse gas concentrations in the atmosphere. The main causes include:"}
data: {"type": "sources", "sources": [
  {
    "title": "Climate Change - IPCC Report 2023",
    "url": "https://www.ipcc.ch/report/ar6/",
    "snippet": "Human influence has unequivocally warmed the planet...",
    "relevance_score": 0.95
  }
]}
data: {"type": "content", "content": "\n\n**1. Fossil Fuel Combustion**\nBurning coal, oil, and natural gas..."}
data: {"type": "end", "conversation_id": "conv_abc123", "total_tokens": 342}
```
### Deep Research Mode
```bash
# Research Request
curl -X POST "https://api.deepwikiopen.com/chat/research" \
  -H "Authorization: Bearer demo-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Impact of AI on healthcare diagnostics",
    "research_depth": "comprehensive",
    "source_types": ["academic", "news", "documentation"],
    "max_sources": 25
  }'

# Response (truncated for brevity)
{
  "research_id": "research_xyz789",
  "query": "Impact of AI on healthcare diagnostics",
  "summary": "AI is revolutionizing healthcare diagnostics through machine learning algorithms that can analyze medical images, predict disease outcomes, and assist in early detection with unprecedented accuracy...",
  "key_findings": [
    {
      "finding": "AI diagnostic systems show 94% accuracy in detecting skin cancer vs 86% for dermatologists",
      "confidence": 0.92,
      "supporting_sources": ["Nature Medicine 2024", "JAMA Dermatology 2024"]
    }
  ],
  "sources": [
    {
      "title": "AI in Medical Diagnosis: A Systematic Review",
      "authors": ["Smith, J.", "Chen, L.", "Kumar, R."],
      "publication_date": "2024-03-15",
      "url": "https://doi.org/10.1038/s41591-024-2847-2",
      "relevance_score": 0.96,
      "key_points": [
        "Machine learning models outperform traditional diagnostic methods",
        "Reduced diagnostic errors by 23% in clinical trials"
      ]
    }
  ]
}
```
### WebSocket Multi-user Chat
```javascript
// User A connects and sends a message
const wsA = new WebSocket('wss://api.deepwikiopen.com/ws/chat?token=user-a-key&room_id=ai_discussion');
wsA.send(JSON.stringify({
  type: 'message',
  content: 'What are the latest developments in transformer models?'
}));

// User B receives the message and AI response
// Message from User A: {"type": "user_message", "user": "UserA", "content": "What are..."}
// AI starts responding: {"type": "content", "content": "The latest developments..."}

// User B sends a follow-up
const wsB = new WebSocket('wss://api.deepwikiopen.com/ws/chat?token=user-b-key&room_id=ai_discussion');
wsB.send(JSON.stringify({
  type: 'message',
  content: 'Can you elaborate on attention mechanisms?'
}));
```
## SDK Libraries
For easier integration, use our official SDK libraries:
* **Python**: `pip install deepwikiopen`
* **JavaScript/Node.js**: `npm install deepwikiopen-sdk`
* **Go**: `go get github.com/deepwikiopen/go-sdk`
* **Rust**: `cargo add deepwikiopen`
Each SDK provides typed interfaces, automatic retry logic, streaming support, and WebSocket management.
Source: https://asyncfunc.mintlify.app/endpoints/model-endpoints
# Model Endpoints API
DeepWiki provides a flexible provider-based model selection system that supports multiple LLM providers. This documentation covers the model-related API endpoints and how to work with different model providers.
## Overview
DeepWiki's model provider system allows you to choose from various AI model providers including:
* **Google** - Gemini models
* **OpenAI** - GPT models
* **OpenRouter** - Access to multiple model providers through a unified API
* **Azure OpenAI** - Azure-hosted OpenAI models
* **Ollama** - Locally running open-source models
* **AWS Bedrock** - Amazon's managed AI models
* **DashScope** - Alibaba's AI models
Each provider offers different models with specific capabilities and pricing. The system is designed to be extensible, allowing service providers to add custom models as needed.
## Authentication
Before using any model provider, you need to configure the appropriate API keys as environment variables:
```bash
# Google Gemini
GOOGLE_API_KEY=your_google_api_key
# OpenAI
OPENAI_API_KEY=your_openai_api_key
# OpenRouter
OPENROUTER_API_KEY=your_openrouter_api_key
# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=your_azure_openai_endpoint
AZURE_OPENAI_VERSION=your_azure_openai_version
# AWS Bedrock
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_REGION=your_aws_region
# Ollama (if not local)
OLLAMA_HOST=http://your-ollama-host:11434
# DashScope
DASHSCOPE_API_KEY=your_dashscope_api_key
```
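A quick preflight check can save a failed request later. The sketch below reports which credentials are still unset for the provider you intend to use; the variable names come from the list above, but the per-provider grouping and the `missing_keys` helper itself are illustrative, not part of DeepWiki.

```python
import os

# Credential variables per provider, taken from the list above.
# The grouping is illustrative, not an official mapping.
REQUIRED = {
    "google": ["GOOGLE_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "azure": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_VERSION"],
    "dashscope": ["DASHSCOPE_API_KEY"],
}

def missing_keys(provider: str, env=os.environ) -> list:
    """Return the names of required variables that are unset or empty."""
    return [k for k in REQUIRED.get(provider, []) if not env.get(k)]
```

Calling `missing_keys("azure")` at startup and failing fast on a non-empty result is cheaper than debugging a 401 later.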
## Endpoints
### Get Model Configuration
Retrieves the available model providers and their supported models.
```http
GET /models/config
```
#### Response
```json
{
"providers": [
{
"id": "google",
"name": "Google",
"supportsCustomModel": true,
"models": [
{
"id": "gemini-2.0-flash",
"name": "gemini-2.0-flash"
},
{
"id": "gemini-2.5-flash-preview-05-20",
"name": "gemini-2.5-flash-preview-05-20"
},
{
"id": "gemini-2.5-pro-preview-03-25",
"name": "gemini-2.5-pro-preview-03-25"
}
]
},
{
"id": "openai",
"name": "Openai",
"supportsCustomModel": true,
"models": [
{
"id": "gpt-4o",
"name": "gpt-4o"
},
{
"id": "gpt-4.1",
"name": "gpt-4.1"
},
{
"id": "o1",
"name": "o1"
},
{
"id": "o3",
"name": "o3"
},
{
"id": "o4-mini",
"name": "o4-mini"
}
]
}
],
"defaultProvider": "google"
}
```
#### Example Requests
**cURL:**
```bash
curl -X GET "http://localhost:8001/models/config" \
-H "Accept: application/json"
```
**Python:**
```python
import requests
response = requests.get("http://localhost:8001/models/config")
config = response.json()
# List all providers
for provider in config["providers"]:
    print(f"Provider: {provider['name']}")
    for model in provider["models"]:
        print(f" - {model['id']}")
```
**JavaScript:**
```javascript
const response = await fetch('http://localhost:8001/models/config');
const config = await response.json();
// Get available models for a specific provider
const googleModels = config.providers
.find(p => p.id === 'google')
?.models || [];
```
### Using Models in Chat Completions
The model selection is integrated into the chat completions endpoint. You specify the provider and model when making requests.
```http
POST /chat/completions/stream
```
#### Request Body
```json
{
"repo_url": "https://github.com/user/repo",
"messages": [
{
"role": "user",
"content": "Explain the main functionality of this repository"
}
],
"provider": "google",
"model": "gemini-2.0-flash",
"language": "en",
"token": "optional_github_token_for_private_repos"
}
```
#### Parameters
| Parameter | Type | Required | Description |
| ---------- | ------ | -------- | ------------------------------------------------------------------------------ |
| `repo_url` | string | Yes | URL of the repository to analyze |
| `messages` | array | Yes | Array of chat messages |
| `provider` | string | No | Model provider ID (default: "google") |
| `model` | string | No | Model ID for the specified provider (uses provider's default if not specified) |
| `language` | string | No | Language for content generation (default: "en") |
| `token` | string | No | Personal access token for private repositories |
| `type` | string | No | Repository type: "github", "gitlab", or "bitbucket" (default: "github") |
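The defaults in the table can be folded into a small payload builder so optional fields are only sent when set. The `build_chat_request` helper below is illustrative (not part of any SDK); its defaults mirror the table above.

```python
def build_chat_request(repo_url, messages, provider="google", model=None,
                       language="en", token=None, repo_type="github"):
    """Assemble a /chat/completions/stream payload using the table's defaults.

    Optional fields (`model`, `token`) are omitted entirely when unset, so the
    server falls back to the provider's default model and public access.
    """
    payload = {
        "repo_url": repo_url,
        "messages": messages,
        "provider": provider,
        "language": language,
        "type": repo_type,
    }
    if model:
        payload["model"] = model
    if token:
        payload["token"] = token
    return payload
```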
#### Example Requests
**cURL with Google Gemini:**
```bash
curl -X POST "http://localhost:8001/chat/completions/stream" \
-H "Content-Type: application/json" \
-d '{
"repo_url": "https://github.com/asyncfuncai/deepwiki-open",
"messages": [
{
"role": "user",
"content": "What is the main purpose of this project?"
}
],
"provider": "google",
"model": "gemini-2.0-flash"
}'
```
**Python with OpenAI:**
```python
import requests
import json
url = "http://localhost:8001/chat/completions/stream"
data = {
"repo_url": "https://github.com/asyncfuncai/deepwiki-open",
"messages": [
{
"role": "user",
"content": "Explain the architecture of this application"
}
],
"provider": "openai",
"model": "gpt-4o"
}
response = requests.post(url, json=data, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))
```
**JavaScript with OpenRouter:**
```javascript
const response = await fetch('http://localhost:8001/chat/completions/stream', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
repo_url: 'https://github.com/asyncfuncai/deepwiki-open',
messages: [
{
role: 'user',
content: 'What are the key features of this repository?'
}
],
provider: 'openrouter',
model: 'anthropic/claude-3.5-sonnet'
})
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value));
}
```
## Model Provider Details
### Google (Gemini)
Default provider with fast and capable models.
**Available Models:**
* `gemini-2.0-flash` - Fast, efficient model (default)
* `gemini-2.5-flash-preview-05-20` - Preview of upcoming flash model
* `gemini-2.5-pro-preview-03-25` - Preview of pro model
**Configuration:**
```json
{
"provider": "google",
"model": "gemini-2.0-flash"
}
```
### OpenAI
Industry-standard GPT models.
**Available Models:**
* `gpt-4o` - Latest GPT-4 model (default)
* `gpt-4.1` - Updated GPT-4 version
* `o1` - Reasoning model
* `o3` - Advanced model
* `o4-mini` - Smaller, faster model
**Configuration:**
```json
{
"provider": "openai",
"model": "gpt-4o"
}
```
### OpenRouter
Access multiple model providers through a unified API.
**Available Models:**
* `openai/gpt-4o` - OpenAI GPT-4 (default)
* `deepseek/deepseek-r1` - DeepSeek reasoning model
* `anthropic/claude-3.7-sonnet` - Claude 3.7 Sonnet
* `anthropic/claude-3.5-sonnet` - Claude 3.5 Sonnet
* And many more...
**Configuration:**
```json
{
"provider": "openrouter",
"model": "anthropic/claude-3.5-sonnet"
}
```
### Azure OpenAI
Azure-hosted OpenAI models with enterprise features.
**Available Models:**
* `gpt-4o` - GPT-4 on Azure (default)
* `gpt-4` - Standard GPT-4
* `gpt-35-turbo` - GPT-3.5 Turbo
* `gpt-4-turbo` - GPT-4 Turbo
**Configuration:**
```json
{
"provider": "azure",
"model": "gpt-4o"
}
```
**Note:** Requires Azure OpenAI endpoint and API version configuration.
### Ollama
Run models locally for privacy and cost efficiency.
**Available Models:**
* `qwen3:1.7b` - Small, fast model (default)
* `llama3:8b` - Llama 3 8B model
* `qwen3:8b` - Qwen 3 8B model
**Configuration:**
```json
{
"provider": "ollama",
"model": "llama3:8b"
}
```
**Note:** Requires Ollama to be running locally or accessible via `OLLAMA_HOST`.
### AWS Bedrock
Amazon's managed AI service.
**Available Models:**
* `anthropic.claude-3-sonnet-20240229-v1:0` - Claude 3 Sonnet (default)
* `anthropic.claude-3-haiku-20240307-v1:0` - Claude 3 Haiku
* `anthropic.claude-3-opus-20240229-v1:0` - Claude 3 Opus
* `amazon.titan-text-express-v1` - Amazon Titan
* `cohere.command-r-v1:0` - Cohere Command R
* `ai21.j2-ultra-v1` - AI21 Jurassic
**Configuration:**
```json
{
"provider": "bedrock",
"model": "anthropic.claude-3-sonnet-20240229-v1:0"
}
```
### DashScope
Alibaba's AI models.
**Available Models:**
* `qwen-plus` - Qwen Plus (default)
* `qwen-turbo` - Qwen Turbo
* `deepseek-r1` - DeepSeek R1
**Configuration:**
```json
{
"provider": "dashscope",
"model": "qwen-plus"
}
```
## Custom Models
Providers that support custom models (where `supportsCustomModel: true`) allow you to specify model IDs not listed in the predefined options. This is useful for:
* Newly released models
* Fine-tuned models
* Private or custom deployments
**Example with custom model:**
```json
{
"provider": "openai",
"model": "ft:gpt-3.5-turbo-0125:custom:model:id"
}
```
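Whether a provider will accept a custom model ID can be checked against the `/models/config` response shown earlier. This sketch inspects a config dict already fetched from that endpoint; the helper name is illustrative.

```python
def supports_custom_model(config: dict, provider_id: str) -> bool:
    """True if the provider advertises supportsCustomModel in /models/config."""
    provider = next((p for p in config["providers"] if p["id"] == provider_id), None)
    return bool(provider and provider.get("supportsCustomModel"))
```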
## Error Handling
The API returns standard HTTP status codes and error messages.
### Common Errors
**400 Bad Request:**
```json
{
"detail": "No messages provided"
}
```
**401 Unauthorized:**
```json
{
"detail": "Invalid API key for provider"
}
```
**404 Not Found:**
```json
{
"detail": "Model not found for provider"
}
```
**500 Internal Server Error:**
```json
{
"detail": "Error preparing retriever: No valid document embeddings found"
}
```
### Error Handling Examples
**Python:**
```python
try:
    response = requests.post(url, json=data)
    response.raise_for_status()
    result = response.json()
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 400:
        print(f"Bad request: {e.response.json()['detail']}")
    elif e.response.status_code == 500:
        print(f"Server error: {e.response.json()['detail']}")
```
**JavaScript:**
```javascript
try {
const response = await fetch(url, options);
if (!response.ok) {
const error = await response.json();
throw new Error(error.detail);
}
const data = await response.json();
} catch (error) {
console.error('API Error:', error.message);
}
```
## Rate Limiting
Rate limiting depends on the model provider being used:
* **Google Gemini**: Subject to Google AI Studio quotas
* **OpenAI**: Based on your OpenAI tier and usage
* **OpenRouter**: Depends on the specific model and your OpenRouter credits
* **Azure OpenAI**: Based on your Azure deployment quotas
* **Ollama**: Limited by local hardware resources
* **AWS Bedrock**: Subject to AWS service quotas
* **DashScope**: Based on Alibaba Cloud quotas
It's recommended to implement retry logic with exponential backoff for production applications.
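One way to implement that recommendation, sketched here with a generic zero-argument callable so it works with any HTTP client; the function name and defaults are illustrative.

```python
import random
import time

def post_with_retry(send, max_retries=4, base_delay=1.0):
    """Call `send` (a zero-argument callable that raises on retryable failures
    such as HTTP 429/5xx), backing off exponentially with jitter between tries."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except Exception:
            if attempt == max_retries:
                raise
            # Delay doubles each attempt; random() adds jitter to avoid
            # synchronized retries from multiple clients.
            time.sleep(base_delay * (2 ** attempt + random.random()))
```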
## Best Practices
1. **Model Selection**: Choose models based on your specific needs:
* Use faster models (e.g., `gemini-2.0-flash`, `gpt-4o-mini`) for simple queries
* Use more capable models (e.g., `gpt-4o`, `claude-3.5-sonnet`) for complex analysis
2. **Error Handling**: Always implement proper error handling for API calls
3. **Streaming**: The chat endpoint supports streaming responses for better user experience
4. **Caching**: DeepWiki automatically caches wiki generation results to improve performance
5. **Security**: Never expose API keys in client-side code; use environment variables
6. **Cost Optimization**: Monitor usage and costs, especially with premium models
## Configuration Files
DeepWiki uses JSON configuration files to manage model settings:
* `api/config/generator.json` - Model provider configurations
* `api/config/embedder.json` - Embedding model settings
* `api/config/repo.json` - Repository processing settings
You can customize these files or use the `DEEPWIKI_CONFIG_DIR` environment variable to specify a custom configuration directory.
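Resolving a configuration file then comes down to honoring the override. A minimal sketch (the `config_path` helper is an assumption for illustration, not DeepWiki's internal API):

```python
import os
from pathlib import Path

def config_path(name: str) -> Path:
    """Resolve a config file, honoring the DEEPWIKI_CONFIG_DIR override."""
    base = os.environ.get("DEEPWIKI_CONFIG_DIR", "api/config")
    return Path(base) / name
```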
# API Overview
Source: https://asyncfunc.mintlify.app/endpoints/overview
Complete guide to the DeepWiki-Open REST API for programmatic wiki generation and repository analysis
The DeepWiki-Open API provides programmatic access to all wiki generation and repository analysis features. Built with FastAPI, it offers high-performance endpoints for integration into your development workflows.
## API Base URL
Base URL: `http://localhost:8001` (development)
For production deployments, replace with your actual API server URL.
## API Architecture
The DeepWiki API is organized into several key areas:
Generate comprehensive documentation wikis from repository URLs
Interactive Q\&A with repository content using RAG
Configure and manage AI model providers
Real-time streaming for generation progress and chat
## Quick Start
### Authentication
Most endpoints require authentication via environment-configured API keys. The API validates your configured providers automatically.
```bash Test API Health
curl -X GET "http://localhost:8001/health"
```
```json Response
{
"status": "healthy",
"version": "0.1.0",
"timestamp": "2024-01-15T10:30:00Z"
}
```
### Basic Wiki Generation
```bash cURL
curl -X POST "http://localhost:8001/wiki/generate" \
-H "Content-Type: application/json" \
-d '{
"repo_url": "https://github.com/microsoft/TypeScript-Node-Starter",
"model_provider": "google",
"model_name": "gemini-2.0-flash",
"force_regenerate": false
}'
```
```python Python
import requests
response = requests.post("http://localhost:8001/wiki/generate", json={
    "repo_url": "https://github.com/microsoft/TypeScript-Node-Starter",
    "model_provider": "google",
    "model_name": "gemini-2.0-flash",
    "force_regenerate": False
})
wiki_data = response.json()
```
```javascript JavaScript
const response = await fetch('http://localhost:8001/wiki/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
repo_url: 'https://github.com/microsoft/TypeScript-Node-Starter',
model_provider: 'google',
model_name: 'gemini-2.0-flash',
force_regenerate: false
})
});
const wikiData = await response.json();
```
```json Success Response
{
"success": true,
"wiki_id": "wiki_1234567890",
"repo_info": {
"url": "https://github.com/microsoft/TypeScript-Node-Starter",
"name": "TypeScript-Node-Starter",
"owner": "microsoft",
"platform": "github"
},
"generation_status": "completed",
"wiki_pages": [
{
"id": "overview",
"title": "Project Overview",
"content": "TypeScript Node Starter is...",
"importance": "high",
"filePaths": ["README.md", "package.json"],
"relatedPages": ["setup", "architecture"]
}
],
"metadata": {
"generated_at": "2024-01-15T10:30:00Z",
"model_used": "gemini-2.0-flash",
"processing_time": 45.2,
"total_files_analyzed": 127
}
}
```
## Core Endpoints
### Wiki Generation
**Generate Wiki** (`POST /wiki/generate`): Generate a comprehensive wiki from a repository URL.
**Request Body:**
* `repo_url` (string, required): Repository URL
* `model_provider` (string): AI provider (`google`, `openai`, `openrouter`, `azure`, `ollama`)
* `model_name` (string): Specific model to use
* `force_regenerate` (boolean): Force regeneration even if cached
* `access_token` (string): Repository access token for private repos
* `auth_code` (string): Authorization code (if auth mode enabled)
**Response:** Complete wiki structure with pages and metadata
**List Projects**: List all processed repositories and their wiki status.
**Query Parameters:**
* `limit` (integer): Number of results to return
* `offset` (integer): Pagination offset
**Response:** Array of processed projects with generation status
**Get Wiki**: Retrieve a specific generated wiki by project ID.
**Path Parameters:**
* `project_id` (string): Unique project identifier
**Response:** Complete wiki data including all pages
### Chat & RAG
**Chat Stream** (`POST /chat/stream`): Stream chat responses using RAG on repository content.
**Request Body:**
* `message` (string, required): User question
* `repo_url` (string, required): Repository URL for context
* `conversation_history` (array): Previous conversation messages
* `model_provider` (string): AI provider for responses
* `deep_research` (boolean): Enable multi-turn research mode
**Response:** Server-sent events with streaming chat responses
### Model Configuration
**Get Configuration** (`GET /models/config`): Get available model providers and configurations.
**Response:** Available providers, models, and their parameters
**Validate Models**: Validate API keys and model availability.
**Request Body:**
* `provider` (string): Provider to validate
* `model_name` (string): Specific model to test
**Response:** Validation status and model information
## Data Models
### WikiPage
* `id` (string): Unique identifier for the wiki page
* `title` (string): Human-readable page title
* `content` (string): Full page content in Markdown format with Mermaid diagrams
* `filePaths` (array): Source file paths that contributed to this page
* `importance` (string): Page importance level: `high`, `medium`, or `low`
* `relatedPages` (array): Array of related page IDs for cross-references
### RepoInfo
* `url` (string): Full repository URL
* `name` (string): Repository name
* `owner` (string): Repository owner/organization
* `platform` (string): Git platform: `github`, `gitlab`, or `bitbucket`
* Whether the repository is private
* Default branch name (usually `main` or `master`)
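For typed client code, the WikiPage shape maps directly onto a dataclass; the field names below follow the JSON example earlier on this page, and the class itself is a sketch rather than an official SDK type.

```python
from dataclasses import dataclass

@dataclass
class WikiPage:
    id: str
    title: str
    content: str        # Markdown, may embed Mermaid diagrams
    filePaths: list     # source files that contributed to this page
    importance: str     # "high" | "medium" | "low"
    relatedPages: list  # IDs of related pages
```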
## Error Handling
The API uses standard HTTP status codes and returns detailed error information:
```json Error Response
{
"error": {
"code": "INVALID_REPOSITORY",
"message": "Repository not found or not accessible",
"details": {
"repo_url": "https://github.com/invalid/repo",
"status_code": 404,
"suggestion": "Verify the repository URL and access permissions"
},
"timestamp": "2024-01-15T10:30:00Z"
}
}
```
### Common Error Codes
**400 Bad Request**
**Common causes:**
* Invalid repository URL format
* Missing required parameters
* Invalid model provider/name combination
**Example:**
```json
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Invalid repository URL format",
"details": {
"field": "repo_url",
"provided": "not-a-valid-url",
"expected": "https://github.com/owner/repo"
}
}
}
```
**401 Unauthorized**
**Common causes:**
* Missing or invalid API keys
* Repository access token required but not provided
* Invalid authorization code
**Example:**
```json
{
"error": {
"code": "API_KEY_INVALID",
"message": "Google API key is invalid or expired",
"details": {
"provider": "google",
"suggestion": "Check your GOOGLE_API_KEY environment variable"
}
}
}
```
**404 Not Found**
**Common causes:**
* Repository doesn't exist
* Repository is private and requires access token
* Wiki not found for the given project ID
**Example:**
```json
{
"error": {
"code": "REPOSITORY_NOT_FOUND",
"message": "Repository not accessible",
"details": {
"repo_url": "https://github.com/private/repo",
"suggestion": "Provide access_token for private repositories"
}
}
}
```
**429 Too Many Requests**
**Common causes:**
* API rate limits exceeded
* AI provider rate limits reached
**Example:**
```json
{
"error": {
"code": "RATE_LIMIT_EXCEEDED",
"message": "OpenAI API rate limit exceeded",
"details": {
"provider": "openai",
"retry_after": 60,
"suggestion": "Wait 60 seconds before retrying or use a different provider"
}
}
}
```
**500 Internal Server Error**
**Common causes:**
* AI model generation failures
* Repository processing errors
* Configuration issues
**Example:**
```json
{
"error": {
"code": "GENERATION_FAILED",
"message": "Wiki generation failed due to model error",
"details": {
"model": "gpt-4o",
"stage": "content_generation",
"suggestion": "Try a different model or contact support"
}
}
}
```
## Rate Limits
Rate limits depend on your AI provider's limits. DeepWiki doesn't impose additional rate limits.
### Provider Rate Limits
**OpenAI:**
* **GPT-4**: 500 requests/minute, 30,000 tokens/minute
* **GPT-3.5**: 3,500 requests/minute, 90,000 tokens/minute
* Varies by usage tier and model
**Google Gemini:**
* **Free tier**: 15 requests/minute, 32,000 tokens/minute
* **Paid tier**: 360 requests/minute, 120,000 tokens/minute
* Varies by model and payment plan
**OpenRouter:**
* Some models have per-request costs
* Check [OpenRouter pricing](https://openrouter.ai/docs#limits)
**Azure OpenAI:**
* Configurable limits per deployment
* Enterprise-grade quotas available
* Contact Microsoft for high-volume needs
## WebSocket API
For real-time updates during wiki generation and chat:
```javascript WebSocket Connection
const ws = new WebSocket('ws://localhost:8001/ws/wiki/generate');
ws.onopen = () => {
ws.send(JSON.stringify({
repo_url: 'https://github.com/microsoft/vscode',
model_provider: 'google',
model_name: 'gemini-2.0-flash'
}));
};
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
switch(data.type) {
case 'progress':
console.log(`Progress: ${data.progress}%`);
break;
case 'page_generated':
console.log(`Generated page: ${data.page.title}`);
break;
case 'completed':
console.log('Wiki generation completed!');
break;
case 'error':
console.error('Error:', data.error);
break;
}
};
```
## SDK Examples
### Python SDK Usage
```python
import asyncio
import aiohttp
import json

class DeepWikiClient:
    def __init__(self, base_url="http://localhost:8001"):
        self.base_url = base_url

    async def generate_wiki(self, repo_url, model_provider="google",
                            model_name="gemini-2.0-flash"):
        async with aiohttp.ClientSession() as session:
            payload = {
                "repo_url": repo_url,
                "model_provider": model_provider,
                "model_name": model_name
            }
            async with session.post(
                f"{self.base_url}/wiki/generate",
                json=payload
            ) as response:
                return await response.json()

    async def ask_question(self, question, repo_url):
        async with aiohttp.ClientSession() as session:
            payload = {
                "message": question,
                "repo_url": repo_url
            }
            async with session.post(
                f"{self.base_url}/chat/stream",
                json=payload
            ) as response:
                async for line in response.content:
                    line = line.decode().strip()
                    if line:  # skip keep-alive blank lines in the stream
                        yield json.loads(line)

# Usage
async def main():
    client = DeepWikiClient()
    # Generate wiki
    wiki = await client.generate_wiki(
        "https://github.com/fastapi/fastapi"
    )
    print(f"Generated {len(wiki['wiki_pages'])} pages")
    # Ask questions
    async for response in client.ask_question(
        "How does FastAPI handle dependency injection?",
        "https://github.com/fastapi/fastapi"
    ):
        print(response['content'], end='', flush=True)

asyncio.run(main())
```
### Node.js SDK Usage
```javascript
class DeepWikiClient {
constructor(baseUrl = 'http://localhost:8001') {
this.baseUrl = baseUrl;
}
async generateWiki(repoUrl, options = {}) {
const {
modelProvider = 'google',
modelName = 'gemini-2.0-flash',
forceRegenerate = false
} = options;
const response = await fetch(`${this.baseUrl}/wiki/generate`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
repo_url: repoUrl,
model_provider: modelProvider,
model_name: modelName,
force_regenerate: forceRegenerate
})
});
if (!response.ok) {
throw new Error(`API Error: ${response.status}`);
}
return await response.json();
}
async *askQuestion(question, repoUrl) {
const response = await fetch(`${this.baseUrl}/chat/stream`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: question,
repo_url: repoUrl
})
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
const lines = chunk.split('\n').filter(line => line.trim());
for (const line of lines) {
if (line.startsWith('data: ')) {
try {
yield JSON.parse(line.slice(6));
} catch (e) {
console.warn('Failed to parse SSE data:', line);
}
}
}
}
}
}
// Usage
const client = new DeepWikiClient();
async function example() {
try {
// Generate wiki
const wiki = await client.generateWiki(
'https://github.com/expressjs/express'
);
console.log(`Generated ${wiki.wiki_pages.length} pages`);
// Ask question with streaming response
console.log('\nAsking question...');
for await (const chunk of client.askQuestion(
'Explain Express.js middleware system',
'https://github.com/expressjs/express'
)) {
if (chunk.type === 'content') {
process.stdout.write(chunk.content);
}
}
} catch (error) {
console.error('Error:', error.message);
}
}
example();
```
## Next Steps
Detailed wiki generation and management endpoints
Interactive chat and RAG endpoints for repository Q\&A
Real-time streaming APIs for live updates
API authentication and security configuration
# WebSocket API
Source: https://asyncfunc.mintlify.app/endpoints/websocket-api
Real-time streaming API for chat completions and wiki generation
DeepWiki uses WebSocket connections to provide real-time streaming responses for chat completions and wiki generation. This enables a more responsive user experience with lower latency compared to traditional HTTP streaming.
## Overview
The WebSocket API replaces HTTP streaming endpoints with persistent WebSocket connections that:
* Stream AI responses in real-time as they're generated
* Provide immediate feedback for long-running operations
* Support bidirectional communication for future enhancements
* Automatically fall back to HTTP if WebSocket connection fails
## Connection Establishment
### Endpoint
```
ws://localhost:8001/ws/chat
```
For production deployments with SSL:
```
wss://your-domain.com/ws/chat
```
### Client Connection Example
```typescript
// Convert HTTP URL to WebSocket URL
const getWebSocketUrl = () => {
const baseUrl = process.env.SERVER_BASE_URL || 'http://localhost:8001';
// Replace http:// with ws:// or https:// with wss://
const wsBaseUrl = baseUrl.replace(/^http/, 'ws');
return `${wsBaseUrl}/ws/chat`;
};
// Create WebSocket connection
const ws = new WebSocket(getWebSocketUrl());
ws.onopen = () => {
console.log('WebSocket connection established');
// Send the request after connection is open
ws.send(JSON.stringify(requestData));
};
```
## Message Formats
### Request Format
After establishing the connection, send a JSON message with the following structure:
```typescript
interface ChatCompletionRequest {
repo_url: string; // Repository URL (GitHub, GitLab, BitBucket)
messages: ChatMessage[]; // Conversation history
filePath?: string; // Optional: Focus on specific file
token?: string; // Optional: Access token for private repos
type?: string; // Repository type: 'github' | 'gitlab' | 'bitbucket'
provider?: string; // AI provider: 'google' | 'openai' | 'openrouter' | 'ollama' | 'azure'
model?: string; // Model name for the provider
language?: string; // Response language: 'en' | 'ja' | 'zh' | 'es' | 'kr' | 'vi'
excluded_dirs?: string; // Newline-separated list of directories to exclude
excluded_files?: string; // Newline-separated list of file patterns to exclude
included_dirs?: string; // Newline-separated list of directories to include
included_files?: string; // Newline-separated list of file patterns to include
}
interface ChatMessage {
role: 'user' | 'assistant' | 'system';
content: string;
}
```
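Since a malformed request causes an immediate server-side error (see Error Handling below), it is worth validating it client-side before opening the connection. A minimal check mirroring the interface above; the function and its messages are illustrative:

```python
REQUIRED_FIELDS = ("repo_url", "messages")

def validate_chat_request(req: dict) -> list:
    """Return a list of problems; an empty list means the request looks sendable."""
    problems = [f for f in REQUIRED_FIELDS if f not in req]
    messages = req.get("messages") or []
    if not messages:
        problems.append("messages must be non-empty")
    elif messages[-1].get("role") != "user":
        # Mirrors the server error "Last message must be from the user".
        problems.append("last message must be from the user")
    return problems
```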
### Response Format
The server streams text responses as they're generated:
```typescript
// Each WebSocket message contains a text chunk
ws.onmessage = (event) => {
const textChunk = event.data; // Plain text chunk
console.log('Received:', textChunk);
// Append to the full response
fullResponse += textChunk;
};
```
## Event Types
### Connection Events
```typescript
// Connection established
ws.onopen = (event) => {
console.log('Connected to DeepWiki WebSocket');
// Send your request here
};
// Connection closed
ws.onclose = (event) => {
console.log('WebSocket connection closed');
// Handle completion or reconnection logic
};
// Connection error
ws.onerror = (error) => {
console.error('WebSocket error:', error);
// Fall back to HTTP streaming
};
```
### Message Flow
1. **Client connects** to WebSocket endpoint
2. **Client sends** JSON request after connection opens
3. **Server streams** text responses
4. **Server closes** connection when complete
5. **Client handles** close event
## Streaming Features
### Wiki Generation
For wiki page generation, the WebSocket streams Markdown content in real-time:
```typescript
const generateWikiPage = async (page: WikiPage) => {
const ws = new WebSocket(wsUrl);
await new Promise((resolve, reject) => {
ws.onopen = () => {
ws.send(JSON.stringify({
repo_url: repoUrl,
messages: [{
role: 'user',
content: generatePagePrompt(page)
}],
provider: selectedProvider,
model: selectedModel,
language: language
}));
};
let content = '';
ws.onmessage = (event) => {
content += event.data;
// Update UI with streaming content
updatePageContent(content);
};
ws.onclose = () => resolve(content);
ws.onerror = (error) => reject(error);
});
};
```
### Chat/Ask Feature
The Ask feature uses WebSocket for real-time streaming with support for:
* **Conversation History**: Maintains context across multiple questions
* **Deep Research Mode**: Multi-turn research with automatic continuation
* **File Context**: Include specific file content in queries
```typescript
// Example: Deep Research request
const deepResearchRequest = {
repo_url: 'https://github.com/user/repo',
messages: [
{
role: 'user',
content: '[DEEP RESEARCH] How does the authentication system work?'
}
],
provider: 'google',
model: 'gemini-2.0-flash',
language: 'en'
};
ws.send(JSON.stringify(deepResearchRequest));
```
## Connection Lifecycle
### Automatic Closure
The server automatically closes the WebSocket connection after:
* Completing the response stream
* Encountering an error
* Detecting client disconnect
### Reconnection Strategy
```typescript
const createResilientWebSocket = (
request: ChatCompletionRequest,
maxRetries: number = 3
) => {
let retryCount = 0;
const connect = () => {
const ws = new WebSocket(getWebSocketUrl());
ws.onerror = (error) => {
if (retryCount < maxRetries) {
retryCount++;
console.log(`Retrying connection (${retryCount}/${maxRetries})...`);
setTimeout(connect, 1000 * 2 ** (retryCount - 1)); // Exponential backoff
} else {
console.error('Max retries reached, falling back to HTTP');
fallbackToHttp(request);
}
};
ws.onopen = () => {
retryCount = 0; // Reset on successful connection
ws.send(JSON.stringify(request));
};
return ws;
};
return connect();
};
```
## Error Handling
### Server-Side Errors
The server sends error messages as text before closing the connection:
```typescript
ws.onmessage = (event) => {
const message = event.data;
if (message.startsWith('Error:')) {
// Handle error message
console.error('Server error:', message);
// Error types:
// - "Error: No valid document embeddings found..."
// - "Error: No messages provided"
// - "Error: Last message must be from the user"
// - "Error preparing retriever: ..."
} else {
// Handle normal response
processResponse(message);
}
};
```
### Client-Side Error Handling
```typescript
const handleWebSocketError = (error: Event) => {
console.error('WebSocket error:', error);
// Fallback to HTTP streaming
return fetch('/api/chat/stream', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(request)
});
};
```
## Status Codes
WebSocket connections use standard close codes:
| Code | Status | Description |
| ---- | ---------------- | ------------------------------ |
| 1000 | Normal Closure | Request completed successfully |
| 1001 | Going Away | Server is shutting down |
| 1002 | Protocol Error | Invalid message format |
| 1003 | Unsupported Data | Invalid request data |
| 1006 | Abnormal Closure | Connection lost unexpectedly |
| 1011 | Internal Error | Server encountered an error |
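A client can use the close code to decide whether reconnecting is worthwhile. One reasonable policy, shown here as a sketch (the mapping is a client-side suggestion, not server behavior):

```python
# Codes where a retry may succeed: transient server or network conditions.
RETRYABLE_CLOSE_CODES = {1001, 1006, 1011}

def should_retry(close_code: int) -> bool:
    """1000 is a clean completion; 1002/1003 indicate a client-side bug,
    so resending the same request will not help."""
    return close_code in RETRYABLE_CLOSE_CODES
```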
## Security Considerations
### Authentication
For private repositories, include the access token in the request:
```typescript
const secureRequest = {
repo_url: 'https://github.com/org/private-repo',
token: 'ghp_xxxxxxxxxxxx', // GitHub personal access token
type: 'github',
messages: [...]
};
```
### Connection Security
1. **Use WSS in Production**: Always use `wss://` (WebSocket Secure) in production
2. **Token Validation**: Tokens are validated server-side before accessing repositories
3. **Origin Validation**: Consider implementing origin checks for CORS security
4. **Rate Limiting**: Implement connection rate limiting to prevent abuse
### Example Security Headers
```python
# Server-side (FastAPI): WebSocket routes do not accept extra response
# headers, so security headers are attached to HTTP responses via middleware.
from fastapi import FastAPI

app = FastAPI()

@app.middleware("http")
async def add_security_headers(request, call_next):
    response = await call_next(request)
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-XSS-Protection"] = "1; mode=block"
    return response
```
## Complete Client Implementation Example
```typescript
import { ChatCompletionRequest } from '@/types/chat';
class DeepWikiWebSocketClient {
private ws: WebSocket | null = null;
private messageBuffer: string = '';
async streamCompletion(
request: ChatCompletionRequest,
onChunk: (chunk: string) => void,
onError: (error: Error) => void,
onComplete: () => void
): Promise<void> {
return new Promise((resolve, reject) => {
try {
// Close existing connection if any
this.close();
// Create new WebSocket connection
this.ws = new WebSocket(this.getWebSocketUrl());
// Set binary type for potential future binary support
this.ws.binaryType = 'arraybuffer';
// Connection opened
this.ws.onopen = () => {
console.log('WebSocket connection established');
this.ws!.send(JSON.stringify(request));
};
// Message received
this.ws.onmessage = (event) => {
try {
const chunk = event.data;
this.messageBuffer += chunk;
onChunk(chunk);
} catch (error) {
console.error('Error processing message:', error);
onError(error as Error);
}
};
// Connection closed
this.ws.onclose = (event) => {
console.log('WebSocket closed:', event.code, event.reason);
onComplete();
resolve();
};
// Connection error
this.ws.onerror = (error) => {
console.error('WebSocket error:', error);
onError(new Error('WebSocket connection failed'));
reject(error);
};
// Set timeout for connection
setTimeout(() => {
if (this.ws?.readyState === WebSocket.CONNECTING) {
this.close();
reject(new Error('WebSocket connection timeout'));
}
}, 5000);
} catch (error) {
reject(error);
}
});
}
private getWebSocketUrl(): string {
const baseUrl = process.env.NEXT_PUBLIC_SERVER_BASE_URL || 'http://localhost:8001';
return baseUrl.replace(/^http/, 'ws') + '/ws/chat';
}
close(): void {
if (this.ws && this.ws.readyState === WebSocket.OPEN) {
this.ws.close(1000, 'Client closing connection');
}
this.ws = null;
this.messageBuffer = '';
}
getFullResponse(): string {
return this.messageBuffer;
}
}
// Usage example
const client = new DeepWikiWebSocketClient();
await client.streamCompletion(
{
repo_url: 'https://github.com/user/repo',
messages: [{ role: 'user', content: 'Explain the main functionality' }],
provider: 'google',
model: 'gemini-2.0-flash',
language: 'en'
},
(chunk) => {
// Update UI with streaming chunk
setResponse(prev => prev + chunk);
},
(error) => {
console.error('Streaming error:', error);
// Fall back to HTTP
},
() => {
console.log('Streaming complete');
// Enable UI for next question
}
);
```
## Performance Considerations
### Advantages over HTTP Streaming
1. **Lower Latency**: No HTTP overhead for each chunk
2. **Bidirectional**: Enables future features like progress updates
3. **Connection Reuse**: Single connection for entire session
4. **Binary Support**: Can handle binary data if needed
### Best Practices
1. **Implement Fallback**: Always have HTTP streaming as fallback
2. **Handle Disconnects**: Gracefully handle unexpected disconnections
3. **Buffer Management**: Clear buffers after each completion
4. **Resource Cleanup**: Close connections when component unmounts
```typescript
// React cleanup example
useEffect(() => {
  return () => {
    client.close(); // Clean up WebSocket on unmount
  };
}, []);
```
## Troubleshooting
### Common Issues
1. **Connection Refused**
* Check if the API server is running on port 8001
* Verify WebSocket endpoint URL is correct
* Check for proxy/firewall blocking WebSocket connections
2. **Immediate Disconnect**
* Verify request JSON format is valid
* Check for missing required fields
* Ensure messages array is not empty
3. **No Response**
* Check server logs for errors
* Verify model provider credentials are set
* Ensure repository URL is accessible
### Debug Logging
Enable detailed logging for troubleshooting:
```typescript
const debugWebSocket = (ws: WebSocket) => {
  ws.addEventListener('open', (e) => console.log('WS Open:', e));
  ws.addEventListener('message', (e) => console.log('WS Message:', e.data));
  ws.addEventListener('error', (e) => console.log('WS Error:', e));
  ws.addEventListener('close', (e) => console.log('WS Close:', e.code, e.reason));
};
```
## Future Enhancements
The WebSocket infrastructure enables future real-time features:
* Progress indicators for long operations
* Cancel/interrupt ongoing generation
* Real-time collaboration features
* Live repository updates
* Streaming file analysis
* Interactive debugging sessions
# Wiki Generation API
Source: https://asyncfunc.mintlify.app/endpoints/wiki-endpoints
Complete API reference for DeepWikiOpen wiki generation and management endpoints
## Overview
The DeepWikiOpen API provides endpoints for generating comprehensive documentation from repositories using AI-powered analysis. All endpoints require authentication and support JSON request/response formats.
**Base URL**: `https://api.deepwikiopen.com/v1`
**Authentication**: All endpoints require an API key in the Authorization header.
***
## POST /wiki/generate
Generate comprehensive wiki documentation from a repository URL or uploaded codebase.
**Request body parameters:**

* `repository_url` (string, required): The GitHub repository URL to analyze and generate documentation for
* `project_name` (string, optional): Custom name for the project (defaults to the repository name)
* `config` (object, optional): Generation configuration options:
  * `include_code_examples` (boolean): Include code examples in the generated documentation
  * `max_depth` (integer): Maximum directory depth to analyze (1-10)
  * `exclude_patterns` (array of strings): File patterns to exclude from analysis (e.g., `["*.test.js", "node_modules/*"]`)
  * Whether to include private/internal functions in the documentation
  * `output_format` (string): Output format: "markdown", "html", or "json"
* `webhook_url` (string, optional): Optional webhook URL for completion notifications
### Request Example
```bash curl
curl -X POST "https://api.deepwikiopen.com/v1/wiki/generate" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"repository_url": "https://github.com/username/awesome-project",
"project_name": "Awesome Project",
"config": {
"include_code_examples": true,
"max_depth": 3,
"exclude_patterns": ["*.test.js", "dist/*"],
"output_format": "markdown"
},
"webhook_url": "https://your-domain.com/webhooks/wiki-complete"
}'
```
```python Python
import requests
url = "https://api.deepwikiopen.com/v1/wiki/generate"
headers = {
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json"
}
data = {
"repository_url": "https://github.com/username/awesome-project",
"project_name": "Awesome Project",
"config": {
"include_code_examples": True,
"max_depth": 3,
"exclude_patterns": ["*.test.js", "dist/*"],
"output_format": "markdown"
}
}
response = requests.post(url, headers=headers, json=data)
result = response.json()
print(f"Task ID: {result['task_id']}")
```
```javascript JavaScript
const response = await fetch('https://api.deepwikiopen.com/v1/wiki/generate', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
repository_url: 'https://github.com/username/awesome-project',
project_name: 'Awesome Project',
config: {
include_code_examples: true,
max_depth: 3,
exclude_patterns: ['*.test.js', 'dist/*'],
output_format: 'markdown'
}
})
});
const result = await response.json();
console.log('Task ID:', result.task_id);
```
### Response
* `task_id`: Unique identifier for tracking the generation task
* `project_id`: Unique project identifier for accessing results
* `status`: Current task status: "queued", "processing", "completed", or "failed"
* `estimated_duration`: Estimated completion time in seconds
* `webhook_registered`: Whether webhook notification was successfully registered
```json Response Example
{
"task_id": "task_12345abcde",
"project_id": "proj_awesome_project_67890",
"status": "queued",
"estimated_duration": 180,
"webhook_registered": true,
"created_at": "2024-01-15T10:30:00Z"
}
```
***
## GET /wiki/projects
Retrieve a list of all processed repositories and their current status.
**Query parameters:**

* `page`: Page number for pagination
* `limit`: Number of projects per page (max: 100)
* `status`: Filter by status: "completed", "processing", "failed"
* A search term to filter projects by name or repository URL
### Request Example
```bash curl
curl -X GET "https://api.deepwikiopen.com/v1/wiki/projects?page=1&limit=10&status=completed" \
-H "Authorization: Bearer YOUR_API_KEY"
```
```python Python
import requests
url = "https://api.deepwikiopen.com/v1/wiki/projects"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
params = {
"page": 1,
"limit": 10,
"status": "completed"
}
response = requests.get(url, headers=headers, params=params)
projects = response.json()
```
```javascript JavaScript
const params = new URLSearchParams({
page: '1',
limit: '10',
status: 'completed'
});
const response = await fetch(`https://api.deepwikiopen.com/v1/wiki/projects?${params}`, {
headers: {
'Authorization': 'Bearer YOUR_API_KEY'
}
});
const projects = await response.json();
```
### Response
* Array of project objects, each containing:
  * Unique project identifier
  * Display name of the project
  * Original repository URL
  * Current project status
  * ISO timestamp of creation
  * ISO timestamp of completion (if applicable)
  * Number of files processed
  * Number of documentation pages generated
* Pagination information:
  * Current page number
  * Total number of pages
  * Total number of projects
  * Whether there are more pages
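To walk every page of results, a client can keep requesting until a page comes back short. A minimal sketch under that assumption (the `fetch_page` callable and the stub data are illustrative; a real implementation would wrap `requests.get` with the `page` and `limit` parameters above):

```python
def iter_all_projects(fetch_page, limit=10):
    """Yield every project by requesting successive pages until a short page."""
    page = 1
    while True:
        projects = fetch_page(page)
        yield from projects
        if len(projects) < limit:  # a short (or empty) page means we are done
            break
        page += 1

# Stubbed usage; the real fetch_page would call GET /wiki/projects
def fake_fetch(page):
    data = {1: ["proj_a", "proj_b"], 2: ["proj_c"]}
    return data.get(page, [])

all_projects = list(iter_all_projects(fake_fetch, limit=2))
```

Alternatively, stop when the pagination object reports no more pages; the short-page check avoids depending on its exact field names.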
***
## GET /wiki/{project_id}
Retrieve the complete wiki documentation for a specific project.
**Parameters:**

* `project_id` (path): The unique project identifier
* `format` (query): Response format: "json", "markdown", or "html"
* `include_metadata` (query): Include project metadata in response
### Request Example
```bash curl
curl -X GET "https://api.deepwikiopen.com/v1/wiki/proj_awesome_project_67890?format=json" \
-H "Authorization: Bearer YOUR_API_KEY"
```
```python Python
import requests
project_id = "proj_awesome_project_67890"
url = f"https://api.deepwikiopen.com/v1/wiki/{project_id}"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
params = {"format": "json", "include_metadata": True}
response = requests.get(url, headers=headers, params=params)
wiki_data = response.json()
```
```javascript JavaScript
const projectId = 'proj_awesome_project_67890';
const params = new URLSearchParams({
format: 'json',
include_metadata: 'true'
});
const response = await fetch(`https://api.deepwikiopen.com/v1/wiki/${projectId}?${params}`, {
headers: {
'Authorization': 'Bearer YOUR_API_KEY'
}
});
const wikiData = await response.json();
```
### Response
* Project identifier
* Project display name
* Original repository URL
* ISO timestamp of wiki generation
* Array of documentation pages, each containing:
  * Unique page identifier
  * Page title
  * Page content (format depends on the `format` query parameter)
  * File path in repository
  * Page type: "overview", "api", "guide", or "reference"
  * Display order
* Project metadata (if `include_metadata=true`), containing:
  * Programming language distribution
  * Total files processed
  * Project complexity rating (0-10)
  * Last commit hash from repository
***
## DELETE /wiki/{project_id}
Delete a project's cached wiki data and generated documentation.
**Path parameter:**

* `project_id`: The unique project identifier to delete
### Request Example
```bash curl
curl -X DELETE "https://api.deepwikiopen.com/v1/wiki/proj_awesome_project_67890" \
-H "Authorization: Bearer YOUR_API_KEY"
```
```python Python
import requests
project_id = "proj_awesome_project_67890"
url = f"https://api.deepwikiopen.com/v1/wiki/{project_id}"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
response = requests.delete(url, headers=headers)
result = response.json()
```
```javascript JavaScript
const projectId = 'proj_awesome_project_67890';
const response = await fetch(`https://api.deepwikiopen.com/v1/wiki/${projectId}`, {
method: 'DELETE',
headers: {
'Authorization': 'Bearer YOUR_API_KEY'
}
});
const result = await response.json();
```
### Response
* `success`: Whether the deletion was successful
* `project_id`: The deleted project identifier
* `message`: Confirmation message
* `deleted_at`: ISO timestamp of deletion
```json Response Example
{
"success": true,
"project_id": "proj_awesome_project_67890",
"message": "Project wiki data successfully deleted",
"deleted_at": "2024-01-15T14:30:00Z"
}
```
***
## POST /wiki/regenerate
Force regenerate wiki documentation for an existing project with updated configuration.
**Request body parameters:**

* `project_id` (string, required): The project identifier to regenerate
* `config` (object, optional): Updated generation configuration (same structure as /wiki/generate)
* `force_refresh` (boolean, optional): Force a refresh from the repository (ignore cache)
* `webhook_url` (string, optional): Optional webhook URL for completion notifications
### Request Example
```bash curl
curl -X POST "https://api.deepwikiopen.com/v1/wiki/regenerate" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"project_id": "proj_awesome_project_67890",
"config": {
"include_code_examples": true,
"max_depth": 5,
"output_format": "html"
},
"force_refresh": true
}'
```
```python Python
import requests
url = "https://api.deepwikiopen.com/v1/wiki/regenerate"
headers = {
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json"
}
data = {
"project_id": "proj_awesome_project_67890",
"config": {
"include_code_examples": True,
"max_depth": 5,
"output_format": "html"
},
"force_refresh": True
}
response = requests.post(url, headers=headers, json=data)
result = response.json()
```
```javascript JavaScript
const response = await fetch('https://api.deepwikiopen.com/v1/wiki/regenerate', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
project_id: 'proj_awesome_project_67890',
config: {
include_code_examples: true,
max_depth: 5,
output_format: 'html'
},
force_refresh: true
})
});
const result = await response.json();
```
### Response
Same response structure as POST /wiki/generate.
***
## GET /wiki/status/{task_id}
Check the status of a wiki generation task and get progress updates.
**Path parameter:**

* `task_id`: The task identifier returned from the generation request
### Request Example
```bash curl
curl -X GET "https://api.deepwikiopen.com/v1/wiki/status/task_12345abcde" \
-H "Authorization: Bearer YOUR_API_KEY"
```
```python Python
import requests
task_id = "task_12345abcde"
url = f"https://api.deepwikiopen.com/v1/wiki/status/{task_id}"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
response = requests.get(url, headers=headers)
status = response.json()
```
```javascript JavaScript
const taskId = 'task_12345abcde';
const response = await fetch(`https://api.deepwikiopen.com/v1/wiki/status/${taskId}`, {
headers: {
'Authorization': 'Bearer YOUR_API_KEY'
}
});
const status = await response.json();
```
### Response
* `task_id`: The task identifier
* `project_id`: Associated project identifier
* `status`: Current status: "queued", "processing", "completed", or "failed"
* `progress`: Progress information (when status is "processing"):
  * `percentage`: Completion percentage (0-100)
  * `current_step`: Current processing step
  * `files_processed`: Number of files processed so far
  * `total_files`: Total files to process
  * `estimated_remaining`: Estimated seconds remaining
* Results (when status is "completed"):
  * URL to access the generated wiki
  * Number of pages created
  * Total processing time in seconds
* Error information (when status is "failed"):
  * Error code
  * Error description
  * Additional error details
* `created_at`: ISO timestamp when the task was created
* `updated_at`: ISO timestamp of last status update
```json Response Example (Processing)
{
"task_id": "task_12345abcde",
"project_id": "proj_awesome_project_67890",
"status": "processing",
"progress": {
"percentage": 65,
"current_step": "Generating API documentation",
"files_processed": 42,
"total_files": 65,
"estimated_remaining": 45
},
"created_at": "2024-01-15T10:30:00Z",
"updated_at": "2024-01-15T10:32:15Z"
}
```
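For clients without an SDK, completion can be watched with a simple polling loop against this endpoint. A sketch (the `fetch_status` callable stands in for a `requests.get` on `/wiki/status/{task_id}`; the stubbed state sequence below is illustrative):

```python
import time

def wait_for_task(fetch_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll a status callable until the task completes or fails."""
    waited = 0.0
    while waited <= timeout:
        status = fetch_status()
        if status["status"] in ("completed", "failed"):
            return status
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError("task did not finish within the timeout")

# Stubbed usage: three successive status responses
states = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
final = wait_for_task(lambda: next(states), poll_interval=0.0, sleep=lambda _: None)
```

When a `webhook_url` was registered, prefer the webhook notification over polling and use the loop only as a fallback.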
***
## Authentication
All API endpoints require authentication using an API key passed in the Authorization header:
```
Authorization: Bearer YOUR_API_KEY
```
### Getting an API Key
1. Sign up for a DeepWikiOpen account at [console.deepwikiopen.com](https://console.deepwikiopen.com)
2. Navigate to the API Keys section
3. Generate a new API key
4. Store it securely (it won't be shown again)
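Rather than hard-coding the key, read it from an environment variable at startup. A minimal sketch (the `DEEPWIKI_API_KEY` variable name is a convention of this example, not something the API mandates):

```python
import os

# Fall back to a placeholder so the example runs without the variable set
api_key = os.environ.get("DEEPWIKI_API_KEY", "YOUR_API_KEY")
headers = {"Authorization": f"Bearer {api_key}"}
```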
***
## Rate Limiting
API requests are rate limited to prevent abuse:
* **Standard Plan**: 100 requests per hour
* **Pro Plan**: 1,000 requests per hour
* **Enterprise Plan**: 10,000 requests per hour
Rate limit headers are included in all responses:
```http
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642248000
```
When rate limited, you'll receive a 429 status code:
```json
{
"error": {
"code": "RATE_LIMIT_EXCEEDED",
"message": "Too many requests. Limit resets at 2024-01-15T11:00:00Z"
}
}
```
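Since `X-RateLimit-Reset` is a Unix timestamp, a client can compute exactly how long to pause once `X-RateLimit-Remaining` hits zero. A minimal helper, kept pure so it is easy to test (pair it with `time.time()` and `time.sleep()` in a real client):

```python
def seconds_until_reset(headers, now):
    """Compute a sleep duration from the X-RateLimit-* response headers.

    `now` is the current Unix time; the reset header is also Unix time.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", str(int(now))))
    return max(0.0, reset - now)

# With the example headers above and a clock 30 seconds before the reset:
wait = seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1642248000"},
    now=1642247970,
)
```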
***
## Error Responses
All endpoints follow consistent error response format:
### HTTP Status Codes
| Code | Description |
| ---- | --------------------- |
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 429 | Rate Limited |
| 500 | Internal Server Error |
| 503 | Service Unavailable |
### Error Response Format
```json
{
"error": {
"code": "ERROR_CODE",
"message": "Human-readable error message",
"details": "Additional error context",
"request_id": "req_12345abcde"
}
}
```
### Common Error Codes
| Code | Description |
| ------------------------ | ---------------------------------------------- |
| `INVALID_API_KEY` | API key is missing or invalid |
| `PROJECT_NOT_FOUND` | Specified project doesn't exist |
| `TASK_NOT_FOUND` | Specified task doesn't exist |
| `INVALID_REPOSITORY_URL` | Repository URL is malformed or inaccessible |
| `GENERATION_FAILED` | Wiki generation failed due to processing error |
| `RATE_LIMIT_EXCEEDED` | Too many requests within time window |
| `INSUFFICIENT_CREDITS` | Not enough API credits remaining |
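Because every endpoint shares this envelope, error handling can be centralized in one helper. A sketch (the exception class and the set of retryable codes are illustrative, not part of an official SDK):

```python
class DeepWikiAPIError(Exception):
    """Raised for any structured error payload returned by the API."""
    def __init__(self, code, message, request_id=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.request_id = request_id

# Codes worth retrying with backoff rather than surfacing to the user
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED"}

def raise_for_api_error(payload):
    """Raise DeepWikiAPIError if the JSON payload contains an error object."""
    error = payload.get("error")
    if not error:
        return
    raise DeepWikiAPIError(error["code"], error["message"], error.get("request_id"))

try:
    raise_for_api_error({"error": {"code": "PROJECT_NOT_FOUND",
                                   "message": "Specified project doesn't exist",
                                   "request_id": "req_12345abcde"}})
except DeepWikiAPIError as exc:
    caught = exc
```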
***
## Webhooks
When providing a `webhook_url` in generation requests, DeepWikiOpen will send HTTP POST notifications when tasks complete:
### Webhook Payload
```json
{
"event": "wiki.generation.completed",
"task_id": "task_12345abcde",
"project_id": "proj_awesome_project_67890",
"status": "completed",
"result": {
"wiki_url": "https://api.deepwikiopen.com/v1/wiki/proj_awesome_project_67890",
"pages_generated": 15,
"processing_time": 142
},
"timestamp": "2024-01-15T10:35:00Z"
}
```
### Webhook Security
Webhooks include a signature header for verification:
```http
X-DeepWiki-Signature: sha256=1234567890abcdef...
```
Verify the signature using your webhook secret (available in your dashboard).
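The header format suggests an HMAC-SHA256 of the raw request body keyed with your webhook secret, which is the usual convention for signatures of this shape; confirm the exact scheme in your dashboard. Under that assumption, verification in Python looks like:

```python
import hashlib
import hmac

def verify_webhook_signature(secret, body, signature_header):
    """Check an X-DeepWiki-Signature header of the form 'sha256=<hexdigest>'."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    received = signature_header.removeprefix("sha256=")
    # Constant-time comparison avoids leaking the digest via timing
    return hmac.compare_digest(expected, received)

# Illustrative check with a made-up secret and payload
body = b'{"event": "wiki.generation.completed"}'
sig = "sha256=" + hmac.new(b"my-webhook-secret", body, hashlib.sha256).hexdigest()
ok = verify_webhook_signature("my-webhook-secret", body, sig)
```

Always verify against the raw request bytes, before any JSON parsing or re-serialization, since whitespace differences change the digest.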
***
## SDK Examples
### Python SDK
```python
from deepwikiopen import DeepWikiClient
client = DeepWikiClient(api_key="YOUR_API_KEY")
# Generate wiki
task = client.generate_wiki(
repository_url="https://github.com/username/repo",
config={
"include_code_examples": True,
"max_depth": 3
}
)
# Wait for completion
result = client.wait_for_completion(task.task_id)
print(f"Wiki generated: {result.wiki_url}")
```
### Node.js SDK
```javascript
const { DeepWikiClient } = require('deepwikiopen');
const client = new DeepWikiClient({ apiKey: 'YOUR_API_KEY' });
// Generate wiki
const task = await client.generateWiki({
repositoryUrl: 'https://github.com/username/repo',
config: {
includeCodeExamples: true,
maxDepth: 3
}
});
// Poll for completion
const result = await client.waitForCompletion(task.taskId);
console.log(`Wiki generated: ${result.wikiUrl}`);
```
***
## Support
For API support and questions:
* **Documentation**: [docs.deepwikiopen.com](https://docs.deepwikiopen.com)
* **Support Email**: [support@deepwikiopen.com](mailto:support@deepwikiopen.com)
* **Status Page**: [status.deepwikiopen.com](https://status.deepwikiopen.com)
* **Community Discord**: [discord.gg/deepwikiopen](https://discord.gg/deepwikiopen)
Source: https://asyncfunc.mintlify.app/examples/basic-usage
# Basic Usage
This guide provides practical examples of using DeepWiki's core features. All examples are designed to be runnable and progressively introduce more complex functionality.
## Table of Contents
1. [Simple Wiki Generation](#simple-wiki-generation)
2. [Using the Ask Feature](#using-the-ask-feature)
3. [Using DeepResearch Feature](#using-deepresearch-feature)
4. [Working with Public Repositories](#working-with-public-repositories)
5. [Basic API Calls with curl](#basic-api-calls-with-curl)
6. [Python Client Example](#python-client-example)
7. [JavaScript/Node.js Example](#javascriptnodejs-example)
8. [Common Use Cases and Workflows](#common-use-cases-and-workflows)
## Simple Wiki Generation
The most basic use of DeepWiki is generating documentation from your codebase.
### Example 1: Generate a basic wiki
```bash
# Using the CLI
deepwiki generate --input ./my-project --output ./wiki
# With specific file types
deepwiki generate --input ./my-project --output ./wiki --include "*.py,*.js"
# Exclude certain directories
deepwiki generate --input ./my-project --output ./wiki --exclude "node_modules,dist"
```
### Example 2: Generate with custom configuration
```yaml
# deepwiki.config.yaml
input:
path: ./my-project
include:
- "src/**/*.py"
- "lib/**/*.js"
exclude:
- "**/test/**"
- "**/__pycache__/**"
output:
path: ./documentation
format: markdown
generation:
include_source_links: true
max_file_size: 1MB
language_support:
- python
- javascript
```
```bash
# Use the configuration file
deepwiki generate --config deepwiki.config.yaml
```
## Using the Ask Feature
The Ask feature allows you to query your codebase using natural language.
### Example 3: Basic questions about your code
```bash
# Ask about function usage
deepwiki ask "How does the authentication system work?"
# Ask about specific components
deepwiki ask "What does the UserService class do?"
# Ask about dependencies
deepwiki ask "What external libraries does this project use?"
```
### Example 4: Advanced Ask queries with context
```python
from deepwiki import DeepWikiClient
client = DeepWikiClient(api_key="your-api-key")
# Ask with specific context
response = client.ask(
question="How can I add a new payment method?",
context={
"files": ["src/payments/*.py", "src/models/payment.py"],
"focus": "implementation details"
}
)
print(response.answer)
print(f"Confidence: {response.confidence}")
print(f"Sources: {response.sources}")
```
## Using DeepResearch Feature
DeepResearch performs in-depth analysis of your codebase to answer complex questions.
### Example 5: Basic research query
```bash
# Research architectural patterns
deepwiki research "What architectural patterns are used in this project?"
# Research performance optimizations
deepwiki research "What are the potential performance bottlenecks?"
# Research security concerns
deepwiki research "Are there any security vulnerabilities in the authentication flow?"
```
### Example 6: Comprehensive research with multiple aspects
```python
from deepwiki import DeepWikiClient
client = DeepWikiClient(api_key="your-api-key")
# Perform comprehensive research
research_config = {
"topic": "Database optimization opportunities",
"aspects": [
"query performance",
"indexing strategies",
"connection pooling",
"caching mechanisms"
],
"depth": "comprehensive",
"include_recommendations": True
}
result = client.deep_research(research_config)
# Process results
for finding in result.findings:
    print(f"Finding: {finding.title}")
    print(f"Impact: {finding.impact}")
    print(f"Recommendation: {finding.recommendation}")
    print("---")
```
## Working with Public Repositories
DeepWiki can analyze public repositories directly from GitHub.
### Example 7: Analyze a public GitHub repository
```bash
# Analyze a public repository
deepwiki analyze --repo https://github.com/username/repo-name
# Generate documentation for a specific branch
deepwiki generate --repo https://github.com/username/repo-name --branch develop
# Analyze with specific focus
deepwiki analyze --repo https://github.com/username/repo-name --focus "api-design"
```
### Example 8: Batch analysis of multiple repositories
```python
from deepwiki import DeepWikiClient
client = DeepWikiClient(api_key="your-api-key")
# Analyze multiple related repositories
repos = [
"https://github.com/org/frontend",
"https://github.com/org/backend",
"https://github.com/org/shared-lib"
]
for repo in repos:
    analysis = client.analyze_repository(repo)
    print(f"Repository: {repo}")
    print(f"Tech Stack: {', '.join(analysis.tech_stack)}")
    print(f"Complexity Score: {analysis.complexity_score}")
    print(f"Documentation Coverage: {analysis.doc_coverage}%")
    print("---")
```
## Basic API Calls with curl
### Example 9: Generate wiki via API
```bash
# Generate wiki for a repository
curl -X POST https://api.deepwiki.com/v1/generate \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"repository": "https://github.com/username/repo",
"output_format": "markdown",
"include_diagrams": true
}'
# Check generation status
curl -X GET https://api.deepwiki.com/v1/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_API_KEY"
```
### Example 10: Ask questions via API
```bash
# Ask a question about your codebase
curl -X POST https://api.deepwiki.com/v1/ask \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"question": "How does the caching system work?",
"repository": "https://github.com/username/repo",
"context_depth": "detailed"
}'
```
### Example 11: Deep research via API
```bash
# Initiate deep research
curl -X POST https://api.deepwiki.com/v1/research \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"topic": "Security vulnerabilities in API endpoints",
"repository": "https://github.com/username/repo",
"analysis_type": "security",
"include_fixes": true
}'
```
## Python Client Example
### Example 12: Complete Python workflow
````python
from deepwiki import DeepWikiClient
import json
# Initialize client
client = DeepWikiClient(
api_key="your-api-key",
base_url="https://api.deepwiki.com/v1" # Optional: for self-hosted
)
# 1. Generate documentation
def generate_documentation(repo_path):
    """Generate comprehensive documentation for a repository."""
    job = client.generate_wiki(
        input_path=repo_path,
        output_format="markdown",
        options={
            "include_api_docs": True,
            "include_examples": True,
            "generate_diagrams": True,
            "languages": ["python", "javascript"]
        }
    )
    # Wait for completion
    result = job.wait_for_completion(timeout=300)
    if result.status == "completed":
        print(f"Documentation generated at: {result.output_path}")
        return result.output_path
    else:
        print(f"Generation failed: {result.error}")
        return None

# 2. Interactive Q&A session
def interactive_qa_session(repo_path):
    """Run an interactive Q&A session."""
    print("DeepWiki Q&A Session (type 'exit' to quit)")
    while True:
        question = input("\nYour question: ")
        if question.lower() == 'exit':
            break
        response = client.ask(
            question=question,
            repository=repo_path,
            options={
                "include_code_snippets": True,
                "max_sources": 5
            }
        )
        print(f"\nAnswer: {response.answer}")
        if response.code_snippets:
            print("\nRelevant code:")
            for snippet in response.code_snippets:
                print(f"File: {snippet.file}")
                print(f"```{snippet.language}")
                print(snippet.code)
                print("```")

# 3. Automated code analysis
def analyze_code_quality(repo_path):
    """Perform automated code quality analysis."""
    analysis = client.analyze(
        repository=repo_path,
        analysis_types=[
            "code_quality",
            "security",
            "performance",
            "maintainability"
        ]
    )
    # Generate report
    report = {
        "summary": analysis.summary,
        "scores": {
            "quality": analysis.quality_score,
            "security": analysis.security_score,
            "performance": analysis.performance_score,
            "maintainability": analysis.maintainability_score
        },
        "issues": [
            {
                "type": issue.type,
                "severity": issue.severity,
                "file": issue.file,
                "line": issue.line,
                "message": issue.message,
                "suggestion": issue.suggestion
            }
            for issue in analysis.issues
        ],
        "recommendations": analysis.recommendations
    }
    with open("analysis_report.json", "w") as f:
        json.dump(report, f, indent=2)
    return report

# 4. Research complex topics
def research_topic(repo_path, topic):
    """Conduct deep research on a specific topic."""
    research = client.deep_research(
        topic=topic,
        repository=repo_path,
        research_config={
            "depth": "comprehensive",
            "include_external_resources": True,
            "compare_with_best_practices": True,
            "generate_recommendations": True
        }
    )
    # Process findings
    for finding in research.findings:
        print(f"\n## {finding.title}")
        print(f"Importance: {finding.importance}/10")
        print(f"\nDescription: {finding.description}")
        if finding.code_examples:
            print("\nCode Examples:")
            for example in finding.code_examples:
                print(f"```{example.language}")
                print(example.code)
                print("```")
        if finding.recommendations:
            print("\nRecommendations:")
            for rec in finding.recommendations:
                print(f"- {rec}")

# Main workflow
if __name__ == "__main__":
    repo_path = "./my-project"

    # Generate documentation
    doc_path = generate_documentation(repo_path)

    # Analyze code quality
    report = analyze_code_quality(repo_path)
    print(f"Overall quality score: {report['scores']['quality']}/100")

    # Research specific topic
    research_topic(repo_path, "API performance optimization strategies")

    # Start Q&A session
    interactive_qa_session(repo_path)
````
## JavaScript/Node.js Example
### Example 13: Complete Node.js workflow
````javascript
const { DeepWikiClient } = require('deepwiki-js');
// Initialize client
const client = new DeepWikiClient({
apiKey: process.env.DEEPWIKI_API_KEY,
baseUrl: 'https://api.deepwiki.com/v1' // Optional: for self-hosted
});
// 1. Generate documentation with progress tracking
async function generateDocumentation(repoPath) {
console.log('Starting documentation generation...');
const job = await client.generateWiki({
inputPath: repoPath,
outputFormat: 'markdown',
options: {
includeApiDocs: true,
includeExamples: true,
generateDiagrams: true,
languages: ['javascript', 'typescript']
}
});
// Track progress
job.on('progress', (progress) => {
console.log(`Progress: ${progress.percentage}% - ${progress.status}`);
});
try {
const result = await job.waitForCompletion();
console.log(`Documentation generated at: ${result.outputPath}`);
return result;
} catch (error) {
console.error('Generation failed:', error);
throw error;
}
}
// 2. Real-time Q&A with streaming
async function askWithStreaming(question, repoPath) {
const stream = await client.askStream({
question,
repository: repoPath,
options: {
includeCodeSnippets: true,
streamResponse: true
}
});
console.log('\nAnswer: ');
for await (const chunk of stream) {
process.stdout.write(chunk.text);
// Handle code snippets
if (chunk.codeSnippet) {
console.log(`\n\nCode from ${chunk.codeSnippet.file}:`);
console.log('```' + chunk.codeSnippet.language);
console.log(chunk.codeSnippet.code);
console.log('```\n');
}
}
}
// 3. Analyze repository with webhooks
async function analyzeWithWebhooks(repoUrl, webhookUrl) {
const analysis = await client.analyze({
repository: repoUrl,
analysisTypes: ['all'],
webhook: {
url: webhookUrl,
events: ['started', 'progress', 'completed', 'failed']
}
});
console.log(`Analysis started with ID: ${analysis.id}`);
// Poll for results
let result;
while (!result || result.status === 'processing') {
await new Promise(resolve => setTimeout(resolve, 5000));
result = await client.getAnalysis(analysis.id);
console.log(`Status: ${result.status}`);
}
return result;
}
// 4. Batch operations
async function batchAnalyzeRepositories(repositories) {
const batch = client.createBatch();
// Add operations to batch
for (const repo of repositories) {
batch.addOperation('analyze', {
repository: repo,
analysisTypes: ['code_quality', 'security']
});
}
// Execute batch
const results = await batch.execute();
// Process results
results.forEach((result, index) => {
console.log(`\nRepository: ${repositories[index]}`);
if (result.success) {
console.log(`Quality Score: ${result.data.qualityScore}`);
console.log(`Security Score: ${result.data.securityScore}`);
} else {
console.log(`Failed: ${result.error}`);
}
});
}
// 5. Research with caching
class CachedResearchClient {
constructor(client) {
this.client = client;
this.cache = new Map();
}
async research(topic, repoPath, useCache = true) {
const cacheKey = `${topic}:${repoPath}`;
if (useCache && this.cache.has(cacheKey)) {
console.log('Returning cached result');
return this.cache.get(cacheKey);
}
const result = await this.client.deepResearch({
topic,
repository: repoPath,
config: {
depth: 'comprehensive',
includeExternalResources: true
}
});
this.cache.set(cacheKey, result);
return result;
}
}
// 6. Express.js integration
const express = require('express');
const app = express();
app.use(express.json());
// Wiki generation endpoint
app.post('/api/generate-wiki', async (req, res) => {
const { repository, format = 'markdown' } = req.body;
try {
const job = await client.generateWiki({
repository,
outputFormat: format
});
res.json({
jobId: job.id,
status: 'processing',
checkUrl: `/api/jobs/${job.id}`
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Ask endpoint
app.post('/api/ask', async (req, res) => {
const { question, repository } = req.body;
try {
const response = await client.ask({
question,
repository
});
res.json({
answer: response.answer,
confidence: response.confidence,
sources: response.sources
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
// Job status endpoint
app.get('/api/jobs/:jobId', async (req, res) => {
try {
const job = await client.getJob(req.params.jobId);
res.json(job);
} catch (error) {
res.status(404).json({ error: 'Job not found' });
}
});
// Main execution
async function main() {
const repoPath = './my-project';
try {
// Generate documentation
await generateDocumentation(repoPath);
// Ask questions with streaming
await askWithStreaming(
'What are the main components of this application?',
repoPath
);
// Batch analyze multiple repos
await batchAnalyzeRepositories([
'https://github.com/org/repo1',
'https://github.com/org/repo2',
'https://github.com/org/repo3'
]);
// Start API server
app.listen(3000, () => {
console.log('DeepWiki API server running on port 3000');
});
} catch (error) {
console.error('Error:', error);
process.exit(1);
}
}
// Run if called directly
if (require.main === module) {
main();
}
module.exports = {
generateDocumentation,
askWithStreaming,
analyzeWithWebhooks,
batchAnalyzeRepositories,
CachedResearchClient
};
````
## Common Use Cases and Workflows
### Example 14: Onboarding new developers
```bash
#!/bin/bash
# onboard-developer.sh
# Generate comprehensive documentation
deepwiki generate \
  --input . \
  --output ./docs/wiki \
  --include-getting-started \
  --include-architecture

# Create developer guide
deepwiki research "What should a new developer know about this codebase?" \
  --output ./docs/developer-guide.md

# Generate API reference
deepwiki generate-api-docs \
  --input ./src/api \
  --output ./docs/api-reference

# Create interactive tutorial
deepwiki create-tutorial \
  --topics "setup,first-feature,testing" \
  --output ./docs/tutorial
```
### Example 15: Continuous documentation workflow
```yaml
# .github/workflows/deepwiki.yml
name: DeepWiki Documentation

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  update-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup DeepWiki
        uses: deepwiki/setup-action@v1
        with:
          api-key: ${{ secrets.DEEPWIKI_API_KEY }}
      - name: Generate Documentation
        run: |
          deepwiki generate \
            --input . \
            --output ./docs \
            --config .deepwiki.yml
      - name: Analyze Changes
        if: github.event_name == 'pull_request'
        run: |
          deepwiki analyze-pr \
            --base ${{ github.base_ref }} \
            --head ${{ github.head_ref }} \
            --comment
      - name: Update Wiki
        if: github.ref == 'refs/heads/main'
        run: |
          deepwiki publish \
            --source ./docs \
            --destination wiki
```
### Example 16: API documentation generation
```python
from deepwiki import DeepWikiClient
from pathlib import Path

client = DeepWikiClient(api_key="your-api-key")

def generate_api_documentation(api_dir):
    """Generate comprehensive API documentation."""
    # Analyze API endpoints
    api_analysis = client.analyze_api({
        "directory": api_dir,
        "frameworks": ["express", "fastapi", "django"],
        "include_examples": True
    })

    # Generate OpenAPI spec
    openapi_spec = client.generate_openapi({
        "analysis": api_analysis,
        "version": "3.0.0",
        "include_schemas": True
    })

    # Create interactive documentation
    interactive_docs = client.generate_interactive_docs({
        "openapi_spec": openapi_spec,
        "theme": "modern",
        "include_playground": True
    })

    # Save outputs
    Path("./docs/api/openapi.yaml").write_text(openapi_spec)
    Path("./docs/api/interactive").mkdir(parents=True, exist_ok=True)

    return {
        "openapi": "./docs/api/openapi.yaml",
        "interactive": "./docs/api/interactive/index.html"
    }
```
### Example 17: Code migration assistant
```javascript
const { DeepWikiClient } = require('deepwiki-js');

async function migrationAssistant(sourcePath, targetFramework) {
  const client = new DeepWikiClient({
    apiKey: process.env.DEEPWIKI_API_KEY
  });

  // Analyze current codebase
  const analysis = await client.analyze({
    repository: sourcePath,
    focus: ['architecture', 'dependencies', 'patterns']
  });

  // Research migration path
  const migrationPlan = await client.deepResearch({
    topic: `Migrating from ${analysis.mainFramework} to ${targetFramework}`,
    context: analysis,
    includeStepByStep: true
  });

  // Generate migration checklist
  const checklist = await client.generateChecklist({
    plan: migrationPlan,
    includeTimeEstimates: true,
    includeDependencies: true
  });

  // Create migration documentation
  const docs = await client.generateMigrationDocs({
    currentState: analysis,
    targetState: targetFramework,
    plan: migrationPlan,
    checklist: checklist
  });

  return {
    analysis,
    plan: migrationPlan,
    checklist,
    documentation: docs
  };
}

// Usage
migrationAssistant('./legacy-app', 'react')
  .then(result => {
    console.log('Migration plan generated successfully');
    console.log(`Total steps: ${result.checklist.steps.length}`);
    console.log(`Estimated time: ${result.checklist.totalHours} hours`);
  });
```
### Example 18: Security audit workflow
```python
import asyncio
from deepwiki import DeepWikiClient

async def security_audit(repo_path):
    """Perform comprehensive security audit."""
    client = DeepWikiClient(api_key="your-api-key")

    # 1. Initial security scan
    scan_result = await client.security_scan({
        "repository": repo_path,
        "scan_types": [
            "dependency_vulnerabilities",
            "code_vulnerabilities",
            "configuration_issues",
            "secret_detection"
        ]
    })

    # 2. Deep research on vulnerabilities
    vulnerabilities = []
    for vuln in scan_result.vulnerabilities:
        research = await client.deep_research({
            "topic": f"Fix for {vuln.type}: {vuln.description}",
            "context": {
                "file": vuln.file,
                "line": vuln.line,
                "severity": vuln.severity
            }
        })
        vulnerabilities.append({
            "vulnerability": vuln,
            "research": research,
            "fix_recommendation": research.recommendations[0]
        })

    # 3. Generate security report
    report = await client.generate_security_report({
        "scan_result": scan_result,
        "vulnerability_research": vulnerabilities,
        "include_remediation": True,
        "format": "html"
    })

    # 4. Create fix PRs for critical issues
    for vuln in vulnerabilities:
        if vuln['vulnerability'].severity == 'critical':
            await client.create_fix_pr({
                "vulnerability": vuln['vulnerability'],
                "fix": vuln['fix_recommendation'],
                "branch_name": f"fix-{vuln['vulnerability'].id}",
                "pr_title": f"Security: Fix {vuln['vulnerability'].type}",
                "pr_body": report.get_section(vuln['vulnerability'].id)
            })

    return report

# Run security audit
if __name__ == "__main__":
    report = asyncio.run(security_audit("./my-app"))
    print(f"Security audit complete. Report saved to: {report.path}")
```
## Best Practices
1. **Start Simple**: Begin with basic wiki generation before moving to advanced features
2. **Use Configuration Files**: Store settings in `deepwiki.config.yaml` for consistency
3. **Leverage Caching**: Cache results for frequently asked questions or repeated analyses
4. **Batch Operations**: Group multiple operations together for better performance
5. **Monitor Progress**: Use progress callbacks for long-running operations
6. **Handle Errors Gracefully**: Implement proper error handling and retries
7. **Secure API Keys**: Never hardcode API keys; use environment variables
8. **Regular Updates**: Keep documentation in sync with code changes using CI/CD
## Next Steps
* Explore [Advanced Features](./advanced-features.mdx) for more complex scenarios
* Check out [API Reference](../api-reference/index.mdx) for detailed API documentation
* Join our [Community](https://deepwiki.com/community) for support and updates
# Custom Models
Source: https://asyncfunc.mintlify.app/examples/custom-models
Configure and use custom AI models with DeepWiki
# Using Custom Models with DeepWiki
DeepWiki supports a wide range of AI models through various providers. This guide covers how to configure and use custom models for optimal performance and cost efficiency.
## Overview
DeepWiki's flexible architecture allows you to use models from:
* OpenRouter (access to 100+ models)
* Ollama (local models)
* Azure OpenAI
* Any OpenAI-compatible endpoint
* Custom API endpoints
## OpenRouter Integration
OpenRouter provides access to multiple model providers through a single API.
### Configuration
```json
// generator.json
{
  "provider": "openrouter",
  "apiKey": "YOUR_OPENROUTER_API_KEY",
  "model": "anthropic/claude-3-opus",
  "baseURL": "https://openrouter.ai/api/v1",
  "headers": {
    "HTTP-Referer": "https://yourapp.com",
    "X-Title": "DeepWiki"
  }
}
```
### Available Models
Popular models on OpenRouter:
* `anthropic/claude-3-opus` - Best for complex reasoning
* `anthropic/claude-3-sonnet` - Balanced performance/cost
* `openai/gpt-4-turbo` - GPT-4 Turbo variant
* `google/gemini-pro` - Google's Gemini Pro model
* `meta-llama/llama-3-70b` - Open source alternative
### Usage Example
```typescript
// app/lib/ai/generator.ts
import { OpenRouter } from '@openrouter/sdk';

const client = new OpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY,
  defaultHeaders: {
    'HTTP-Referer': process.env.APP_URL,
    'X-Title': 'DeepWiki'
  }
});

export async function generateContent(prompt: string) {
  const response = await client.chat.completions.create({
    model: 'anthropic/claude-3-opus',
    messages: [{ role: 'user', content: prompt }],
    temperature: 0.7,
    max_tokens: 4000
  });
  return response.choices[0].message.content;
}
```
## Ollama for Local Models
Run models locally for privacy and zero API costs.
### Installation
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull models
ollama pull llama3
ollama pull mistral
ollama pull codellama
```
### Configuration
```json
// generator.json
{
  "provider": "ollama",
  "baseURL": "http://localhost:11434",
  "model": "llama3:70b",
  "options": {
    "temperature": 0.7,
    "num_predict": 4096
  }
}
```
### Integration
```typescript
// app/lib/ai/ollama-provider.ts
export class OllamaProvider {
  private baseURL: string;

  constructor(baseURL = 'http://localhost:11434') {
    this.baseURL = baseURL;
  }

  async generate(prompt: string, model = 'llama3') {
    const response = await fetch(`${this.baseURL}/api/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        prompt,
        stream: false,
        options: {
          temperature: 0.7,
          num_predict: 4096
        }
      })
    });
    const data = await response.json();
    return data.response;
  }
}
```
## Azure OpenAI Configuration
Use Azure's enterprise-grade OpenAI deployment.
### Setup
```json
// generator.json
{
  "provider": "azure-openai",
  "apiKey": "YOUR_AZURE_API_KEY",
  "baseURL": "https://YOUR_RESOURCE.openai.azure.com",
  "apiVersion": "2024-02-15-preview",
  "deployment": "gpt-4-turbo",
  "model": "gpt-4-turbo"
}
```
### Environment Variables
```bash
# .env.local
AZURE_OPENAI_API_KEY=your_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4-turbo
AZURE_OPENAI_API_VERSION=2024-02-15-preview
```
### Implementation
```typescript
// app/lib/ai/azure-provider.ts
import { OpenAIClient, AzureKeyCredential } from '@azure/openai';

const client = new OpenAIClient(
  process.env.AZURE_OPENAI_ENDPOINT!,
  new AzureKeyCredential(process.env.AZURE_OPENAI_API_KEY!)
);

export async function generateWithAzure(prompt: string) {
  const result = await client.getChatCompletions(
    process.env.AZURE_OPENAI_DEPLOYMENT!,
    [{ role: 'user', content: prompt }],
    {
      temperature: 0.7,
      maxTokens: 4000
    }
  );
  return result.choices[0].message?.content;
}
```
## Custom Model Selection UI
Implement a model selector in your DeepWiki interface.
### Model Selector Component
```tsx
// app/components/model-selector.tsx
import { useState } from 'react';
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select';

const AVAILABLE_MODELS = [
  { id: 'gpt-4-turbo', name: 'GPT-4 Turbo', provider: 'openai' },
  { id: 'claude-3-opus', name: 'Claude 3 Opus', provider: 'anthropic' },
  { id: 'llama3:70b', name: 'Llama 3 70B', provider: 'ollama' },
  { id: 'mistral-large', name: 'Mistral Large', provider: 'mistral' }
];

export function ModelSelector({ onModelChange }: { onModelChange: (model: string) => void }) {
  const [selectedModel, setSelectedModel] = useState('gpt-4-turbo');

  const handleChange = (value: string) => {
    setSelectedModel(value);
    onModelChange(value);
  };

  return (
    <Select value={selectedModel} onValueChange={handleChange}>
      <SelectTrigger>
        <SelectValue placeholder="Select a model" />
      </SelectTrigger>
      <SelectContent>
        {AVAILABLE_MODELS.map(model => (
          <SelectItem key={model.id} value={model.id}>
            {model.name} ({model.provider})
          </SelectItem>
        ))}
      </SelectContent>
    </Select>
  );
}
```
### Dynamic Model Configuration
```typescript
// app/lib/ai/model-config.ts
export interface ModelConfig {
  provider: string;
  model: string;
  apiKey?: string;
  baseURL?: string;
  temperature?: number;
  maxTokens?: number;
}

export const MODEL_CONFIGS: Record<string, ModelConfig> = {
  'gpt-4-turbo': {
    provider: 'openai',
    model: 'gpt-4-turbo-preview',
    temperature: 0.7,
    maxTokens: 4000
  },
  'claude-3-opus': {
    provider: 'openrouter',
    model: 'anthropic/claude-3-opus',
    baseURL: 'https://openrouter.ai/api/v1',
    temperature: 0.7,
    maxTokens: 4000
  },
  'llama3:70b': {
    provider: 'ollama',
    model: 'llama3:70b',
    baseURL: 'http://localhost:11434',
    temperature: 0.8,
    maxTokens: 4096
  }
};
```
## Modifying generator.json
The `generator.json` file controls model configuration.
### Basic Structure
```json
{
  "provider": "openai",
  "model": "gpt-4-turbo",
  "apiKey": "${OPENAI_API_KEY}",
  "temperature": 0.7,
  "maxTokens": 4000,
  "systemPrompt": "You are a helpful wiki content generator...",
  "retryAttempts": 3,
  "retryDelay": 1000
}
```
### Multi-Provider Configuration
```json
{
  "providers": {
    "primary": {
      "provider": "openai",
      "model": "gpt-4-turbo",
      "apiKey": "${OPENAI_API_KEY}"
    },
    "fallback": {
      "provider": "openrouter",
      "model": "meta-llama/llama-3-70b",
      "apiKey": "${OPENROUTER_API_KEY}",
      "baseURL": "https://openrouter.ai/api/v1"
    },
    "local": {
      "provider": "ollama",
      "model": "llama3",
      "baseURL": "http://localhost:11434"
    }
  },
  "strategy": "fallback",
  "timeout": 30000
}
```
## OpenAI-Compatible Endpoints
Many providers offer OpenAI-compatible APIs.
### Generic Configuration
```typescript
// app/lib/ai/openai-compatible.ts
export class OpenAICompatibleProvider {
  private apiKey: string;
  private baseURL: string;

  constructor(config: { apiKey: string; baseURL: string }) {
    this.apiKey = config.apiKey;
    this.baseURL = config.baseURL;
  }

  async chat(messages: any[], options: any = {}) {
    const response = await fetch(`${this.baseURL}/v1/chat/completions`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        messages,
        ...options
      })
    });
    return response.json();
  }
}
```
### Supported Providers
* **Perplexity AI**: `https://api.perplexity.ai`
* **Together AI**: `https://api.together.xyz/v1`
* **Anyscale**: `https://api.endpoints.anyscale.com/v1`
* **Groq**: `https://api.groq.com/openai/v1`
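Because these endpoints all speak the same protocol, a provider class like `OpenAICompatibleProvider` only needs the right base URL to switch vendors. A small illustrative helper for resolving the URLs listed above (the map and function name are ours, not part of DeepWiki):

```typescript
// Base URLs copied from the provider list above; note some already include
// a /v1 path segment and some do not.
const COMPATIBLE_BASE_URLS: Record<string, string> = {
  perplexity: 'https://api.perplexity.ai',
  together: 'https://api.together.xyz/v1',
  anyscale: 'https://api.endpoints.anyscale.com/v1',
  groq: 'https://api.groq.com/openai/v1'
};

// Resolve a provider name to its OpenAI-compatible base URL,
// failing loudly on unknown providers.
export function resolveBaseURL(provider: string): string {
  const url = COMPATIBLE_BASE_URLS[provider.toLowerCase()];
  if (!url) {
    throw new Error(`Unknown OpenAI-compatible provider: ${provider}`);
  }
  return url;
}
```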
## Performance Comparisons
### Benchmark Results
| Model | Tokens/Second | Quality Score | Cost/1M Tokens |
| ------------------- | ------------- | ------------- | -------------- |
| GPT-4 Turbo | 50 | 9.5/10 | \$10.00 |
| Claude 3 Opus | 40 | 9.3/10 | \$15.00 |
| Llama 3 70B (Local) | 30 | 8.5/10 | \$0.00 |
| Mistral Large | 60 | 8.8/10 | \$8.00 |
| GPT-3.5 Turbo | 80 | 7.5/10 | \$0.50 |
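Since the cost column is linear in token count, the table doubles as a quick budget estimator. A sketch using the table's illustrative rates (not live provider pricing):

```typescript
// Cost per 1M tokens, copied from the benchmark table above.
const COST_PER_1M_TOKENS: Record<string, number> = {
  'gpt-4-turbo': 10.0,
  'claude-3-opus': 15.0,
  'llama3:70b': 0.0,
  'mistral-large': 8.0,
  'gpt-3.5-turbo': 0.5
};

// Estimate the dollar cost of processing `tokens` tokens with `model`.
export function estimateCost(model: string, tokens: number): number {
  const rate = COST_PER_1M_TOKENS[model];
  if (rate === undefined) throw new Error(`No pricing data for ${model}`);
  return (tokens / 1_000_000) * rate;
}
```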
### Performance Testing Script
```typescript
// scripts/benchmark-models.ts
async function benchmarkModel(provider: any, prompt: string) {
  const startTime = Date.now();
  let tokens = 0;
  try {
    const response = await provider.generate(prompt);
    tokens = response.usage?.total_tokens || 0;
    const duration = Date.now() - startTime;
    return {
      duration,
      tokens,
      tokensPerSecond: tokens / (duration / 1000),
      cost: calculateCost(provider.model, tokens)
    };
  } catch (error) {
    return { error: error.message };
  }
}
```
## Cost Optimization Strategies
### 1. Model Cascading
Use cheaper models first, escalate to expensive ones only when needed.
```typescript
// app/lib/ai/cascade-strategy.ts
export async function generateWithCascade(prompt: string, complexity: 'low' | 'medium' | 'high') {
  const models = {
    low: 'gpt-3.5-turbo',
    medium: 'claude-3-sonnet',
    high: 'gpt-4-turbo'
  };
  const model = models[complexity];
  return await generate(prompt, { model });
}
```
### 2. Caching Responses
```typescript
// app/lib/ai/cache-manager.ts
import { createHash } from 'crypto';
import { Redis } from '@upstash/redis';

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_URL,
  token: process.env.UPSTASH_REDIS_TOKEN
});

export async function getCachedOrGenerate(
  prompt: string,
  generator: () => Promise<string>
) {
  const cacheKey = `ai:${createHash('sha256').update(prompt).digest('hex')}`;

  // Check cache
  const cached = await redis.get(cacheKey);
  if (cached) return cached;

  // Generate and cache
  const result = await generator();
  await redis.set(cacheKey, result, { ex: 3600 }); // 1 hour TTL
  return result;
}
```
### 3. Batch Processing
```typescript
// app/lib/ai/batch-processor.ts
export async function processBatch(prompts: string[], model: string) {
  const batchSize = 10;
  const results = [];
  for (let i = 0; i < prompts.length; i += batchSize) {
    const batch = prompts.slice(i, i + batchSize);
    const batchResults = await Promise.all(
      batch.map(prompt => generate(prompt, { model }))
    );
    results.push(...batchResults);
  }
  return results;
}
```
### 4. Token Optimization
```typescript
// app/lib/ai/token-optimizer.ts
import GPT3Tokenizer from 'gpt3-tokenizer';

export function optimizePrompt(prompt: string, maxTokens: number = 2000) {
  // Remove unnecessary whitespace
  let optimized = prompt.replace(/\s+/g, ' ').trim();

  // Truncate if too long
  const encoder = new GPT3Tokenizer({ type: 'gpt3' });
  const tokens = encoder.encode(optimized);
  if (tokens.length > maxTokens) {
    const truncated = tokens.slice(0, maxTokens);
    optimized = encoder.decode(truncated);
  }
  return optimized;
}
```
## Best Practices
### 1. Error Handling
```typescript
export async function generateWithRetry(
  prompt: string,
  options: any,
  maxRetries = 3
) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await generate(prompt, options);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
    }
  }
}
```
### 2. Model Selection Logic
```typescript
export function selectOptimalModel(requirements: {
  maxCost?: number;
  minQuality?: number;
  maxLatency?: number;
}) {
  const models = getAvailableModels();
  return models
    .filter(m => m.costPer1M <= (requirements.maxCost || Infinity))
    .filter(m => m.qualityScore >= (requirements.minQuality || 0))
    .filter(m => m.avgLatency <= (requirements.maxLatency || Infinity))
    .sort((a, b) => b.qualityScore - a.qualityScore)[0];
}
```
### 3. Monitoring and Logging
```typescript
export async function trackModelUsage(
  model: string,
  tokens: number,
  duration: number
) {
  await db.modelUsage.create({
    data: {
      model,
      tokens,
      duration,
      cost: calculateCost(model, tokens),
      timestamp: new Date()
    }
  });
}
```
## Conclusion
DeepWiki's flexible model system allows you to optimize for your specific needs:
* Use **OpenRouter** for access to multiple models
* Deploy **Ollama** for privacy and zero API costs
* Choose **Azure OpenAI** for enterprise requirements
* Implement **cascading strategies** for cost optimization
* Monitor usage and performance to make informed decisions
Remember to regularly review your model usage and costs to ensure you're using the most appropriate models for your use case.
# Enterprise Setup
Source: https://asyncfunc.mintlify.app/examples/enterprise-setup
# Enterprise Deployment Guide
This guide covers comprehensive enterprise deployment strategies for DeepWikiOpen, focusing on security, scalability, and compliance requirements for production environments.
## Table of Contents
1. [Private API Channel Configuration](#private-api-channel-configuration)
2. [Custom Base URL Setup](#custom-base-url-setup)
3. [Authorization Mode Configuration](#authorization-mode-configuration)
4. [Multi-User Deployment Strategies](#multi-user-deployment-strategies)
5. [Security Hardening](#security-hardening)
6. [Corporate Authentication Integration](#corporate-authentication-integration)
7. [Monitoring and Logging](#monitoring-and-logging)
8. [High Availability Configuration](#high-availability-configuration)
9. [Backup and Disaster Recovery](#backup-and-disaster-recovery)
10. [Compliance Considerations](#compliance-considerations)
## Private API Channel Configuration
Configure DeepWikiOpen to use private API channels for enhanced security and control.
### API Gateway Setup
```yaml
# api-gateway-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: deepwikiopen-api-config
data:
  api-config.json: |
    {
      "channels": {
        "private": {
          "endpoint": "https://api-internal.company.com/deepwiki",
          "protocol": "https",
          "port": 443,
          "timeout": 30000,
          "retryPolicy": {
            "maxRetries": 3,
            "backoffMultiplier": 2
          }
        }
      },
      "routing": {
        "defaultChannel": "private",
        "fallbackEnabled": false
      }
    }
```
### Environment Configuration
```bash
# .env.production
DEEPWIKI_API_CHANNEL=private
DEEPWIKI_API_ENDPOINT=https://api-internal.company.com/deepwiki
DEEPWIKI_API_KEY_ROTATION_DAYS=30
DEEPWIKI_API_RATE_LIMIT=1000
DEEPWIKI_API_BURST_LIMIT=2000
```
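A common way to enforce a sustained rate with a separate burst ceiling, as the `DEEPWIKI_API_RATE_LIMIT`/`DEEPWIKI_API_BURST_LIMIT` pair above suggests, is a token bucket. A minimal sketch (a generic illustration, not DeepWikiOpen's internal limiter; treating the rate as per-second is a simplifying assumption):

```typescript
// Token bucket: refills at a sustained rate, capped at a burst size.
export class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private ratePerSec: number, // sustained refill rate (DEEPWIKI_API_RATE_LIMIT)
    private burst: number,      // maximum burst size (DEEPWIKI_API_BURST_LIMIT)
    now: number = Date.now()
  ) {
    this.tokens = burst; // start with a full bucket
    this.lastRefill = now;
  }

  // Returns true and consumes a token if the request is allowed.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```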
### Network Isolation
```yaml
# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deepwikiopen-api-isolation
spec:
  podSelector:
    matchLabels:
      app: deepwikiopen
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: api-gateway
      ports:
        - protocol: TCP
          port: 443
```
## Custom Base URL Setup
Configure custom base URLs for corporate proxy environments.
### Proxy Configuration
```javascript
// config/proxy.js
module.exports = {
  production: {
    baseUrl: process.env.CORPORATE_BASE_URL || 'https://deepwiki.company.com',
    proxy: {
      host: process.env.PROXY_HOST || 'proxy.company.com',
      port: process.env.PROXY_PORT || 8080,
      auth: {
        username: process.env.PROXY_USERNAME,
        password: process.env.PROXY_PASSWORD
      },
      protocol: 'https'
    },
    bypassList: [
      'localhost',
      '127.0.0.1',
      '.company.internal'
    ]
  }
};
```
### Reverse Proxy Setup (NGINX)
```nginx
# nginx.conf
upstream deepwikiopen_backend {
    least_conn;
    server backend1.internal:8080 weight=5;
    server backend2.internal:8080 weight=5;
    server backend3.internal:8080 weight=5;
    keepalive 32;
    keepalive_requests 100;
    keepalive_timeout 60s;
}

server {
    listen 443 ssl http2;
    server_name deepwiki.company.com;

    ssl_certificate /etc/nginx/certs/company.crt;
    ssl_certificate_key /etc/nginx/certs/company.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass https://deepwikiopen_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Corporate headers
        proxy_set_header X-Company-User $http_x_company_user;
        proxy_set_header X-Company-Department $http_x_company_department;

        # Security headers
        add_header X-Content-Type-Options nosniff;
        add_header X-Frame-Options DENY;
        add_header X-XSS-Protection "1; mode=block";
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";
    }
}
```
## Authorization Mode Configuration
Implement robust authorization modes for enterprise environments.
### RBAC Configuration
```yaml
# rbac-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: deepwikiopen-rbac
data:
  roles.json: |
    {
      "roles": {
        "admin": {
          "permissions": ["read", "write", "delete", "admin"],
          "resources": ["*"]
        },
        "developer": {
          "permissions": ["read", "write"],
          "resources": ["projects/*", "documents/*"]
        },
        "viewer": {
          "permissions": ["read"],
          "resources": ["documents/*", "public/*"]
        }
      },
      "defaultRole": "viewer",
      "superAdmins": ["admin@company.com"]
    }
```
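At request time the roles file above reduces to a permission-and-resource check. A sketch of how it could be evaluated (the helper names and the trailing-wildcard matching are our illustration, not DeepWikiOpen internals):

```typescript
// Role definitions mirroring the roles.json above.
type Role = { permissions: string[]; resources: string[] };

const ROLES: Record<string, Role> = {
  admin: { permissions: ['read', 'write', 'delete', 'admin'], resources: ['*'] },
  developer: { permissions: ['read', 'write'], resources: ['projects/*', 'documents/*'] },
  viewer: { permissions: ['read'], resources: ['documents/*', 'public/*'] }
};

// Supports the patterns used in the config: "*" and trailing "/*" prefixes.
function matchesResource(pattern: string, resource: string): boolean {
  if (pattern === '*') return true;
  if (pattern.endsWith('/*')) return resource.startsWith(pattern.slice(0, -1));
  return pattern === resource;
}

// A request is allowed only if the role grants the permission AND
// some resource pattern matches the target resource.
export function isAllowed(role: string, permission: string, resource: string): boolean {
  const def = ROLES[role];
  if (!def) return false;
  return def.permissions.includes(permission) &&
    def.resources.some(p => matchesResource(p, resource));
}
```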
### OAuth2/OIDC Integration
```javascript
// auth/oauth-config.js
const OIDCStrategy = require('passport-openidconnect').Strategy;

module.exports = {
  strategy: new OIDCStrategy({
    issuer: process.env.OIDC_ISSUER || 'https://auth.company.com',
    authorizationURL: 'https://auth.company.com/authorize',
    tokenURL: 'https://auth.company.com/token',
    userInfoURL: 'https://auth.company.com/userinfo',
    clientID: process.env.OIDC_CLIENT_ID,
    clientSecret: process.env.OIDC_CLIENT_SECRET,
    callbackURL: 'https://deepwiki.company.com/auth/callback',
    scope: ['openid', 'profile', 'email', 'groups']
  }, (issuer, profile, done) => {
    // Map corporate groups to DeepWiki roles
    const roles = mapCorporateGroups(profile.groups);
    return done(null, {
      id: profile.id,
      email: profile.email,
      roles: roles,
      department: profile.department
    });
  }),

  sessionConfig: {
    secret: process.env.SESSION_SECRET,
    resave: false,
    saveUninitialized: false,
    cookie: {
      secure: true,
      httpOnly: true,
      maxAge: 8 * 60 * 60 * 1000, // 8 hours
      sameSite: 'strict'
    }
  }
};
```
### API Key Management
```javascript
// auth/api-key-manager.js
class EnterpriseAPIKeyManager {
  constructor(config) {
    this.rotationDays = config.rotationDays || 30;
    this.keyVault = config.keyVault;
  }

  async rotateKeys() {
    const keys = await this.keyVault.listKeys();
    const now = Date.now();
    for (const key of keys) {
      if (now - key.createdAt > this.rotationDays * 24 * 60 * 60 * 1000) {
        // Generate new key
        const newKey = await this.generateSecureKey();

        // Store in vault
        await this.keyVault.store({
          id: key.id,
          key: newKey,
          createdAt: now,
          previousKey: key.key,
          rotatedAt: now
        });

        // Notify administrators
        await this.notifyKeyRotation(key.id);
      }
    }
  }

  async generateSecureKey() {
    const crypto = require('crypto');
    return crypto.randomBytes(32).toString('base64');
  }
}
```
## Multi-User Deployment Strategies
### Kubernetes Deployment
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepwikiopen
  labels:
    app: deepwikiopen
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: deepwikiopen
  template:
    metadata:
      labels:
        app: deepwikiopen
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - deepwikiopen
              topologyKey: kubernetes.io/hostname
      containers:
        - name: deepwikiopen
          image: company.registry.io/deepwikiopen:enterprise-1.0
          ports:
            - containerPort: 8080
          env:
            - name: NODE_ENV
              value: "production"
            - name: MULTI_TENANT_MODE
              value: "true"
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```
### Multi-Tenant Configuration
```javascript
// config/multi-tenant.js
module.exports = {
  multiTenant: {
    enabled: true,
    isolation: 'database', // 'database' | 'schema' | 'row'
    strategy: {
      identification: 'subdomain', // 'subdomain' | 'header' | 'jwt'
      headerName: 'X-Tenant-ID',
      defaultTenant: 'default'
    },
    database: {
      poolSize: 10,
      idleTimeout: 30000,
      connectionLimit: 100
    },
    cache: {
      ttl: 3600,
      checkPeriod: 600,
      maxKeys: 10000
    }
  }
};
```
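With `identification: 'subdomain'`, the tenant is typically derived from the leading hostname label, falling back to `defaultTenant`. A minimal sketch (the resolver is illustrative; a real implementation would also honor the `X-Tenant-ID` header when the 'header' strategy is configured):

```typescript
// Resolve a tenant ID from the request's Host value.
// `baseDomain` is the shared suffix, e.g. 'deepwiki.company.com'.
export function resolveTenant(
  host: string,
  baseDomain: string,
  defaultTenant = 'default'
): string {
  const hostname = host.split(':')[0]; // drop any port
  if (hostname === baseDomain || !hostname.endsWith('.' + baseDomain)) {
    return defaultTenant; // bare domain or foreign host -> default tenant
  }
  const sub = hostname.slice(0, -(baseDomain.length + 1));
  // For multi-level subdomains, use the label closest to the base domain.
  return sub.includes('.') ? sub.split('.').pop()! : sub;
}
```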
## Security Hardening
### Security Headers
```javascript
// middleware/security.js
const helmet = require('helmet');

module.exports = {
  configure: (app) => {
    app.use(helmet({
      contentSecurityPolicy: {
        directives: {
          defaultSrc: ["'self'"],
          styleSrc: ["'self'", "'unsafe-inline'"],
          scriptSrc: ["'self'"],
          imgSrc: ["'self'", "data:", "https:"],
          connectSrc: ["'self'", "wss:", "https:"],
          fontSrc: ["'self'"],
          objectSrc: ["'none'"],
          mediaSrc: ["'self'"],
          frameSrc: ["'none'"],
        },
      },
      hsts: {
        maxAge: 31536000,
        includeSubDomains: true,
        preload: true
      }
    }));

    // Additional security middleware
    app.use(require('./rate-limiter'));
    app.use(require('./input-validation'));
    app.use(require('./sql-injection-prevention'));
  }
};
```
### Encryption at Rest
```yaml
# encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}
```
### Network Security
```yaml
# pod-security-policy.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: deepwikiopen-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true
```
## Corporate Authentication Integration
### LDAP Integration
```javascript
// auth/ldap-config.js
const fs = require('fs');
const LdapAuth = require('ldapauth-fork');

module.exports = {
  ldap: new LdapAuth({
    url: process.env.LDAP_URL || 'ldaps://ldap.company.com:636',
    bindDN: process.env.LDAP_BIND_DN,
    bindCredentials: process.env.LDAP_BIND_PASSWORD,
    searchBase: process.env.LDAP_SEARCH_BASE || 'ou=users,dc=company,dc=com',
    searchFilter: '(&(objectClass=user)(sAMAccountName={{username}}))',
    searchAttributes: ['displayName', 'mail', 'memberOf', 'department'],
    groupSearchBase: 'ou=groups,dc=company,dc=com',
    groupSearchFilter: '(&(objectClass=group)(member={{dn}}))',
    groupSearchAttributes: ['cn', 'description'],
    cache: true,
    cacheTime: 3600000, // 1 hour
    tlsOptions: {
      ca: [fs.readFileSync('./certs/company-ca.crt')]
    }
  }),

  mapUserRoles: (ldapUser) => {
    const roles = [];
    if (ldapUser.memberOf) {
      const groups = Array.isArray(ldapUser.memberOf)
        ? ldapUser.memberOf
        : [ldapUser.memberOf];
      groups.forEach(group => {
        if (group.includes('CN=DeepWiki-Admins')) roles.push('admin');
        if (group.includes('CN=DeepWiki-Developers')) roles.push('developer');
        if (group.includes('CN=DeepWiki-Users')) roles.push('viewer');
      });
    }
    return roles.length > 0 ? roles : ['viewer'];
  }
};
```
### SAML Integration
```javascript
// auth/saml-config.js
const fs = require('fs');
const saml = require('passport-saml');

module.exports = {
  strategy: new saml.Strategy({
    callbackUrl: 'https://deepwiki.company.com/auth/saml/callback',
    entryPoint: process.env.SAML_ENTRY_POINT,
    issuer: 'deepwikiopen',
    cert: fs.readFileSync('./certs/saml-cert.pem', 'utf-8'),
    privateKey: fs.readFileSync('./certs/saml-key.pem', 'utf-8'),
    identifierFormat: 'urn:oasis:names:tc:SAML:2.0:nameid-format:persistent',
    wantAssertionsSigned: true,
    signatureAlgorithm: 'sha256'
  }, (profile, done) => {
    return done(null, {
      id: profile.nameID,
      email: profile.email,
      displayName: profile.displayName,
      department: profile['http://schemas.company.com/claims/department'],
      roles: profile['http://schemas.company.com/claims/roles']
    });
  })
};
```
## Monitoring and Logging
### Prometheus Metrics
```javascript
// monitoring/metrics.js
const prometheus = require('prom-client');

// Custom metrics
const httpRequestDuration = new prometheus.Histogram({
  name: 'deepwiki_http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10]
});

const activeUsers = new prometheus.Gauge({
  name: 'deepwiki_active_users',
  help: 'Number of active users',
  labelNames: ['tenant', 'department']
});

const apiCallsTotal = new prometheus.Counter({
  name: 'deepwiki_api_calls_total',
  help: 'Total number of API calls',
  labelNames: ['endpoint', 'method', 'status']
});

// Export metrics endpoint
module.exports = {
  endpoint: '/metrics',
  register: prometheus.register,
  metrics: {
    httpRequestDuration,
    activeUsers,
    apiCallsTotal
  }
};
```
### ELK Stack Configuration
```yaml
# filebeat.yml
filebeat.inputs:
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log
    processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/lib/docker/containers/"
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
  indices:
    - index: "deepwiki-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        kubernetes.labels.app: "deepwikiopen"

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
```
### Application Performance Monitoring
```javascript
// monitoring/apm.js
const apm = require('elastic-apm-node');

module.exports = {
  init: () => {
    apm.start({
      serviceName: 'deepwikiopen',
      secretToken: process.env.APM_SECRET_TOKEN,
      serverUrl: process.env.APM_SERVER_URL,
      environment: process.env.NODE_ENV,
      captureBody: 'all',
      captureHeaders: true,
      transactionSampleRate: 0.1,
      spanFramesMinDuration: '5ms',
      usePathAsTransactionName: true,
      ignoreUrls: ['/health', '/metrics'],
      errorOnAbortedRequests: true,
      captureErrorLogStackTraces: 'always',
      logLevel: 'info'
    });
  }
};
```
## High Availability Configuration
### Database Clustering
```yaml
# postgres-ha.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: deepwiki-postgres-cluster
spec:
  instances: 3
  primaryUpdateStrategy: unsupervised
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "1GB"
      maintenance_work_mem: "64MB"
      checkpoint_completion_target: "0.9"
      wal_buffers: "16MB"
      default_statistics_target: "100"
      random_page_cost: "1.1"
      effective_io_concurrency: "200"
      work_mem: "4MB"
      min_wal_size: "1GB"
      max_wal_size: "2GB"
  bootstrap:
    initdb:
      database: deepwiki
      owner: deepwiki_user
      secret:
        name: deepwiki-db-secret
  monitoring:
    enabled: true
    customQueriesConfigMap:
      - name: deepwiki-metrics
        key: queries.yaml
```
### Redis Sentinel Configuration
```yaml
# redis-sentinel.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-sentinel-config
data:
  sentinel.conf: |
    port 26379
    bind 0.0.0.0
    sentinel announce-ip ${HOSTNAME}.redis-sentinel
    sentinel monitor mymaster redis-master 6379 2
    sentinel down-after-milliseconds mymaster 5000
    sentinel parallel-syncs mymaster 1
    sentinel failover-timeout mymaster 60000
    sentinel auth-pass mymaster ${REDIS_PASSWORD}
```
### Load Balancer Configuration
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: deepwikiopen-lb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
spec:
  type: LoadBalancer
  selector:
    app: deepwikiopen
  ports:
    - port: 443
      targetPort: 8080
      protocol: TCP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
```
## Backup and Disaster Recovery
### Automated Backup Strategy
```yaml
# velero-backup.yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
name: deepwiki-daily-backup
spec:
schedule: "0 2 * * *"
template:
ttl: 720h0m0s
includedNamespaces:
- deepwikiopen
includedResources:
- '*'
storageLocation: deepwiki-backup-location
volumeSnapshotLocations:
- deepwiki-snapshots
hooks:
resources:
- name: database-backup
includedNamespaces:
- deepwikiopen
labelSelector:
matchLabels:
app: postgres
pre:
- exec:
container: postgres
command:
- /bin/bash
- -c
- pg_dump -U $POSTGRES_USER -d $POSTGRES_DB > /backup/dump.sql
```
### Disaster Recovery Plan
```javascript
// dr/recovery-plan.js
class DisasterRecoveryPlan {
constructor(config) {
this.rto = config.rto || '4 hours'; // Recovery Time Objective
this.rpo = config.rpo || '1 hour'; // Recovery Point Objective
}
async executeRecovery(disaster) {
const steps = [
this.assessDamage(disaster),
this.activateSecondaryRegion(),
this.restoreFromBackup(),
this.validateDataIntegrity(),
this.switchTraffic(),
this.notifyStakeholders()
];
for (const step of steps) {
await step;
await this.logRecoveryStep(step);
}
}
async testDRPlan() {
// Automated DR testing
const scenarios = [
'database-failure',
'region-outage',
'data-corruption',
'security-breach'
];
for (const scenario of scenarios) {
await this.simulateDisaster(scenario);
await this.measureRecoveryMetrics();
}
}
}
```
### Cross-Region Replication
```yaml
# cross-region-sync.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: cross-region-sync
spec:
schedule: "*/15 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: sync
image: company.registry.io/deepwiki-sync:latest
env:
- name: SOURCE_REGION
value: "us-east-1"
- name: TARGET_REGION
value: "us-west-2"
- name: SYNC_MODE
value: "incremental"
command:
- /bin/sh
- -c
- |
rsync -avz --delete \
--exclude='temp/*' \
--exclude='cache/*' \
/data/source/ /data/target/
```
## Compliance Considerations
### GDPR Compliance
```javascript
// compliance/gdpr.js
class GDPRCompliance {
constructor() {
this.consentManager = new ConsentManager();
this.dataProcessor = new PersonalDataProcessor();
}
async handleDataRequest(userId, requestType) {
switch (requestType) {
case 'access':
return await this.exportUserData(userId);
case 'rectification':
return await this.correctUserData(userId);
case 'erasure':
return await this.deleteUserData(userId);
case 'portability':
return await this.exportPortableData(userId);
default:
throw new Error('Invalid request type');
}
}
async anonymizeData(data) {
// Implement data anonymization
const anonymized = {
...data,
email: this.hashEmail(data.email),
name: 'ANONYMIZED',
ipAddress: this.anonymizeIP(data.ipAddress)
};
return anonymized;
}
}
```
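The `hashEmail` and `anonymizeIP` helpers above are left abstract; a concrete sketch in Python (SHA-256 for emails and zeroing the last IPv4 octet are common choices, not GDPR-mandated ones):

```python
import hashlib

def hash_email(email: str) -> str:
    # One-way hash; normalizing first keeps the value usable as a stable key
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

def anonymize_ip(ip: str) -> str:
    # Zero the host octet of an IPv4 address; fall back for anything else
    parts = ip.split(".")
    if len(parts) == 4:
        parts[3] = "0"
        return ".".join(parts)
    return "ANONYMIZED"
```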
### Audit Logging
```javascript
// compliance/audit-logger.js
class AuditLogger {
constructor(config) {
this.storage = config.storage;
this.encryption = config.encryption;
}
async log(event) {
const auditEntry = {
timestamp: new Date().toISOString(),
userId: event.userId,
action: event.action,
resource: event.resource,
result: event.result,
ipAddress: event.ipAddress,
userAgent: event.userAgent,
department: event.department,
metadata: event.metadata
};
// Encrypt sensitive data
const encrypted = await this.encryption.encrypt(auditEntry);
// Store with integrity check
await this.storage.store({
data: encrypted,
hash: this.calculateHash(encrypted),
signature: await this.sign(encrypted)
});
}
async generateComplianceReport(startDate, endDate) {
const entries = await this.storage.query({
startDate,
endDate
});
return {
summary: this.summarizeActivity(entries),
accessPatterns: this.analyzeAccess(entries),
anomalies: this.detectAnomalies(entries),
privilegedActions: this.filterPrivileged(entries)
};
}
}
```
### Data Retention Policies
```yaml
# data-retention-policy.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: data-retention-policy
data:
policy.json: |
{
"policies": {
"user-data": {
"retentionDays": 2555,
"archiveAfterDays": 365,
"deletePersonalInfoAfterDays": 1095
},
"audit-logs": {
"retentionDays": 2555,
"immutable": true,
"compressionAfterDays": 90
},
"application-logs": {
"retentionDays": 90,
"compressionAfterDays": 7
},
"metrics": {
"retentionDays": 365,
"aggregationRules": {
"1m": 24,
"5m": 168,
"1h": 730
}
}
}
}
```
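One plausible reading of the `aggregationRules` map above is resolution → retention window in hours; under that assumption, selecting the finest resolution still available for a given metric age looks like:

```python
# Assumption: aggregationRules maps resolution -> retention window in hours
AGGREGATION_RULES = {"1m": 24, "5m": 168, "1h": 730}

def resolution_for_age(age_hours: float):
    """Return the finest resolution still retained for data this old."""
    for res in ("1m", "5m", "1h"):  # finest to coarsest
        if age_hours <= AGGREGATION_RULES[res]:
            return res
    return None  # older than all raw windows; only archives remain
```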
## Security Best Practices Summary
1. **Zero Trust Architecture**: Never trust, always verify
2. **Principle of Least Privilege**: Grant minimum required permissions
3. **Defense in Depth**: Multiple layers of security controls
4. **Regular Security Audits**: Quarterly penetration testing
5. **Incident Response Plan**: Documented and tested procedures
6. **Encryption Everywhere**: In transit and at rest
7. **Continuous Monitoring**: Real-time threat detection
8. **Patch Management**: Automated security updates
9. **Access Reviews**: Regular audit of user permissions
10. **Security Training**: Regular employee security awareness
## Deployment Checklist
* [ ] Configure private API channels
* [ ] Set up custom base URLs and proxies
* [ ] Implement authentication and authorization
* [ ] Deploy multi-user infrastructure
* [ ] Apply security hardening measures
* [ ] Integrate corporate authentication
* [ ] Set up monitoring and logging
* [ ] Configure high availability
* [ ] Implement backup strategy
* [ ] Ensure compliance requirements
* [ ] Test disaster recovery plan
* [ ] Conduct security audit
* [ ] Document operational procedures
* [ ] Train operations team
* [ ] Schedule regular reviews
## Support and Resources
* **Enterprise Support Portal**: [https://support.deepwikiopen.enterprise](https://support.deepwikiopen.enterprise)
* **Security Updates**: [https://security.deepwikiopen.enterprise](https://security.deepwikiopen.enterprise)
* **Compliance Documentation**: [https://compliance.deepwikiopen.enterprise](https://compliance.deepwikiopen.enterprise)
* **Operations Runbook**: [https://runbook.deepwikiopen.enterprise](https://runbook.deepwikiopen.enterprise)
***
For additional enterprise features or custom requirements, please contact your DeepWikiOpen enterprise support team.
# Troubleshooting Scenarios
Source: https://asyncfunc.mintlify.app/examples/troubleshooting-scenarios
Real-world troubleshooting examples and solutions for common DeepWikiOpen issues
This guide provides step-by-step solutions for common issues you may encounter when using DeepWikiOpen.
## 1. API Key Validation Issues
### Scenario: "Invalid API key" error when starting the application
#### Symptoms
```bash
Error: Invalid API key provided
Status: 401 Unauthorized
```
#### Debugging Steps
1. **Verify API key format**
```bash
# Check if API key is properly set
echo $ANTHROPIC_API_KEY
# Should start with "sk-ant-api03-"
```
2. **Check environment file**
```bash
# Ensure .env file exists
ls -la .env
# Verify contents
grep ANTHROPIC_API_KEY .env
```
3. **Test API key directly**
```bash
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{
"model": "claude-3-5-sonnet-20241022",
"max_tokens": 10,
"messages": [{"role": "user", "content": "Hello"}]
}'
```
#### Solutions
```bash
# Solution 1: Export API key correctly
export ANTHROPIC_API_KEY="sk-ant-api03-your-actual-key"
# Solution 2: Fix .env file formatting (append rather than overwrite the file)
echo 'ANTHROPIC_API_KEY="sk-ant-api03-your-actual-key"' >> .env
# Solution 3: Use Docker with proper env passing
docker run -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
deepwikiopen/app:latest
# Solution 4: Check for special characters
# Ensure no trailing spaces or newlines
ANTHROPIC_API_KEY=$(echo "$ANTHROPIC_API_KEY" | tr -d ' \n')
```
## 2. Connection Problems (CORS, Network)
### Scenario: CORS errors when accessing the API
#### Symptoms
```
Access to XMLHttpRequest at 'http://localhost:8000/api' from origin
'http://localhost:3000' has been blocked by CORS policy
```
#### Debugging Steps
1. **Check server CORS configuration**
```python
# backend/main.py
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
CORSMiddleware,
allow_origins=["http://localhost:3000"], # Add your frontend URL
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
```
2. **Verify network connectivity**
```bash
# Test backend is running
curl http://localhost:8000/health
# Check ports are not blocked
netstat -an | grep -E "8000|3000"
```
3. **Inspect browser network tab**
* Open Developer Tools → Network tab
* Look for preflight OPTIONS requests
* Check response headers for CORS headers
#### Solutions
Solution 1: dev-server proxy in `frontend/package.json`:

```json
{
  "proxy": "http://localhost:8000"
}
```

Solution 2: environment-specific API URLs:

```javascript
const API_BASE = process.env.REACT_APP_API_URL || 'http://localhost:8000';
```

Solution 3: nginx reverse proxy:

```nginx
# nginx.conf
server {
  location /api {
    proxy_pass http://backend:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
  }
}
```
## 3. Large Repository Handling
### Scenario: Application crashes when analyzing large repositories
#### Symptoms
```
Error: JavaScript heap out of memory
FATAL ERROR: Reached heap limit Allocation failed
```
#### Debugging Steps
1. **Monitor memory usage**
```bash
# Check current memory usage
docker stats
# Raise the Node.js heap limit
node --max-old-space-size=4096 server.js
```
2. **Profile repository size**
```bash
# Check repository size
du -sh /path/to/repo
# Count files
find /path/to/repo -type f | wc -l
```
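Beyond `du` and `find`, a pre-scan can decide up front which files are worth analyzing; a minimal sketch (the size cap and the set of ignored directories are illustrative):

```python
import os

def scan_repo(root: str, max_file_bytes: int = 1_000_000):
    """Walk a repository, skipping oversized files and common vendor dirs."""
    total, count, skipped = 0, 0, []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that should never be analyzed
        dirnames[:] = [d for d in dirnames if d not in {".git", "node_modules"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > max_file_bytes:
                skipped.append(path)  # too large to process inline
                continue
            total += size
            count += 1
    return total, count, skipped
```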
#### Solutions
```javascript
// Solution 1: Implement streaming file processing
const fs = require('fs');
const processLargeRepo = async (filePath) => {
  const stream = fs.createReadStream(filePath);
  const chunks = [];
  for await (const chunk of stream) {
    // Process in chunks, buffering results
    chunks.push(await processChunk(chunk));
    // Flush periodically to keep memory bounded
    if (chunks.length > 1000) {
      await flushToDatabase(chunks);
      chunks.length = 0;
    }
  }
  if (chunks.length > 0) await flushToDatabase(chunks);
};
// Solution 2: Use worker threads
const { Worker } = require('worker_threads');
const analyzeInWorker = (filePath) => {
return new Promise((resolve, reject) => {
const worker = new Worker('./analyzer.js', {
workerData: { filePath }
});
worker.on('message', resolve);
worker.on('error', reject);
});
};
```

Solution 3: Docker memory limits in `docker-compose.yml`:

```yaml
services:
  app:
    mem_limit: 4g
    memswap_limit: 4g
```
## 4. Memory and Performance Issues
### Scenario: Slow response times and high memory usage
#### Debugging Steps
1. **Profile application performance**
```bash
# Node.js profiling
node --inspect server.js
# Python profiling
python -m cProfile -o profile.stats app.py
```
2. **Monitor resource usage**
```bash
# Real-time monitoring
htop
iotop
# Docker resource usage
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
```
#### Solutions
```python
# Solution 1: Implement caching
from functools import lru_cache
import json
import redis
cache = redis.Redis(host='localhost', port=6379)
@lru_cache(maxsize=1000)
def expensive_operation(repo_path):
# Check cache first
cached = cache.get(f"analysis:{repo_path}")
if cached:
return json.loads(cached)
result = perform_analysis(repo_path)
cache.set(f"analysis:{repo_path}", json.dumps(result), ex=3600)
return result
# Solution 2: Batch processing
import asyncio
import gc

async def batch_analyze(files, batch_size=100):
for i in range(0, len(files), batch_size):
batch = files[i:i + batch_size]
await asyncio.gather(*[analyze_file(f) for f in batch])
# Allow garbage collection
gc.collect()
```
## 5. Docker Container Problems
### Scenario: Container fails to start or crashes immediately
#### Symptoms
```
docker: Error response from daemon: OCI runtime create failed
Container exited with code 137 (Out of Memory)
```
#### Debugging Steps
1. **Check container logs**
```bash
docker logs deepwikiopen-app
docker logs --tail 50 -f deepwikiopen-app
```
2. **Inspect container**
```bash
docker inspect deepwikiopen-app
docker exec -it deepwikiopen-app /bin/sh
```
#### Solutions
```dockerfile
# Solution 1: Rebuild with proper base image
FROM node:18-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
# Solution 2: Fix permission issues
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nodejs -u 1001
USER nodejs
# Solution 3: Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s \
CMD node healthcheck.js || exit 1
```
## 6. Model Selection Errors
### Scenario: "Model not found" or incorrect model responses
#### Symptoms
```
Error: Model 'claude-3-opus' not found
Available models: ['claude-3-5-sonnet-20241022', 'claude-3-5-haiku-20241022']
```
#### Debugging Steps
1. **List available models**
```python
import anthropic
client = anthropic.Anthropic()
# Check model availability based on your API tier
```
2. **Verify model configuration**
```javascript
// config/models.js
export const AVAILABLE_MODELS = {
'claude-3-5-sonnet-20241022': {
maxTokens: 8192,
contextWindow: 200000
},
'claude-3-5-haiku-20241022': {
maxTokens: 8192,
contextWindow: 200000
}
};
```
#### Solutions
```javascript
// Solution 1: Implement model fallback
const getModel = (preferred) => {
const fallbackChain = [
'claude-3-5-sonnet-20241022',
'claude-3-5-haiku-20241022',
'claude-3-haiku-20240307'
];
if (AVAILABLE_MODELS[preferred]) {
return preferred;
}
return fallbackChain.find(model => AVAILABLE_MODELS[model]);
};
// Solution 2: Dynamic model selection
const selectOptimalModel = (contextLength, speed = 'balanced') => {
if (speed === 'fast' || contextLength < 10000) {
return 'claude-3-5-haiku-20241022';
}
return 'claude-3-5-sonnet-20241022';
};
```
## 7. Private Repository Access Issues
### Scenario: Cannot access private GitHub repositories
#### Symptoms
```
Error: Repository not found or you don't have access
Status: 404
```
#### Debugging Steps
1. **Verify GitHub token**
```bash
# Test token permissions
curl -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/user/repos
```
2. **Check token scopes**
```bash
curl -H "Authorization: token $GITHUB_TOKEN" \
-I https://api.github.com/user
# Look for X-OAuth-Scopes header
```
#### Solutions
```bash
# Solution 1: Create token with correct scopes
# Go to GitHub Settings → Developer settings → Personal access tokens
# Required scopes: repo, read:org (for private repos)
```

Solution 2: Use GitHub App authentication:

```javascript
// github-app-auth.js
const { createAppAuth } = require("@octokit/auth-app");
const auth = createAppAuth({
  appId: process.env.GITHUB_APP_ID,
  privateKey: process.env.GITHUB_PRIVATE_KEY,
  installationId: process.env.GITHUB_INSTALLATION_ID,
});
```

Solution 3: SSH key authentication for cloning:

```bash
ssh-keygen -t ed25519 -C "deepwikiopen@example.com"
# Add the public key to your GitHub account settings
```
## 8. WebSocket Connection Failures
### Scenario: Real-time updates not working, WebSocket disconnects
#### Symptoms
```
WebSocket connection to 'ws://localhost:8000/ws' failed
Error: Connection closed before established
```
#### Debugging Steps
1. **Test WebSocket endpoint**
```javascript
// Test WebSocket connection
const ws = new WebSocket('ws://localhost:8000/ws');
ws.onopen = () => console.log('Connected');
ws.onerror = (error) => console.error('Error:', error);
ws.onclose = (event) => console.log('Closed:', event.code, event.reason);
```
2. **Check server WebSocket implementation**
```python
# Ensure WebSocket endpoint exists
from fastapi import WebSocket, WebSocketDisconnect

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            await websocket.send_text(await websocket.receive_text())  # echo
    except WebSocketDisconnect:
        pass
```
#### Solutions
```javascript
// Solution 1: Implement reconnection logic
class ReconnectingWebSocket {
constructor(url, options = {}) {
this.url = url;
this.reconnectInterval = options.reconnectInterval || 1000;
this.maxReconnectInterval = options.maxReconnectInterval || 30000;
this.reconnectDecay = options.reconnectDecay || 1.5;
this.reconnectAttempts = 0;
this.connect();
}
connect() {
this.ws = new WebSocket(this.url);
this.ws.onclose = () => {
this.reconnect();
};
this.ws.onerror = (error) => {
console.error('WebSocket error:', error);
this.ws.close();
};
}
reconnect() {
this.reconnectAttempts++;
const timeout = Math.min(
this.reconnectInterval * Math.pow(this.reconnectDecay, this.reconnectAttempts),
this.maxReconnectInterval
);
setTimeout(() => this.connect(), timeout);
}
}
// Solution 2: Use Socket.IO for better reliability
const io = require('socket.io')(server, {
cors: {
origin: "http://localhost:3000",
methods: ["GET", "POST"]
},
reconnection: true,
reconnectionAttempts: 5,
reconnectionDelay: 1000,
});
```
## 9. Rate Limiting Scenarios
### Scenario: API rate limit exceeded errors
#### Symptoms
```
Error: Rate limit exceeded. Please retry after 30 seconds.
Status: 429 Too Many Requests
```
#### Debugging Steps
1. **Monitor rate limit headers**
```javascript
// Check response headers
response.headers.get('X-RateLimit-Limit')
response.headers.get('X-RateLimit-Remaining')
response.headers.get('X-RateLimit-Reset')
```
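The same headers can drive a server-side wait calculation; a small sketch in Python, assuming the common convention that `X-RateLimit-Reset` carries a Unix timestamp:

```python
import time

def seconds_until_reset(headers, now=None):
    """Return how long to sleep before retrying, based on rate-limit headers."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # budget left; no need to wait
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)
```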
#### Solutions
```javascript
// Solution 1: Implement rate limit handling
class RateLimiter {
constructor(maxRequests = 50, windowMs = 60000) {
this.maxRequests = maxRequests;
this.windowMs = windowMs;
this.requests = [];
}
async acquire() {
const now = Date.now();
this.requests = this.requests.filter(time => now - time < this.windowMs);
if (this.requests.length >= this.maxRequests) {
const oldestRequest = this.requests[0];
const waitTime = this.windowMs - (now - oldestRequest);
await new Promise(resolve => setTimeout(resolve, waitTime));
return this.acquire();
}
this.requests.push(now);
}
}
// Solution 2: Implement exponential backoff
async function makeRequestWithRetry(fn, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await fn();
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000;
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
// Solution 3: Queue requests
const { default: PQueue } = require('p-queue');
const queue = new PQueue({
concurrency: 2,
interval: 1000,
intervalCap: 10
});
```
## 10. Common Error Messages and Solutions
### Error: "Context length exceeded"
#### Solution
```javascript
// Implement context window management
const truncateContext = (messages, maxTokens = 150000) => {
let totalTokens = 0;
const truncated = [];
for (let i = messages.length - 1; i >= 0; i--) {
const tokens = estimateTokens(messages[i].content);
if (totalTokens + tokens > maxTokens) break;
totalTokens += tokens;
truncated.unshift(messages[i]);
}
return truncated;
};
```
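The JavaScript above leaves `estimateTokens` undefined; the same idea in Python with a rough chars-per-token heuristic (the 4-characters-per-token figure is an approximation, not the model's real tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text
    return max(1, len(text) // 4)

def truncate_context(messages, max_tokens=150_000):
    """Keep the most recent messages that fit within the token budget."""
    total, kept = 0, []
    for msg in reversed(messages):
        tokens = estimate_tokens(msg["content"])
        if total + tokens > max_tokens:
            break
        total += tokens
        kept.insert(0, msg)  # preserve chronological order
    return kept
```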
### Error: "WebContainer initialization failed"
#### Solution
```javascript
// Ensure proper WebContainer setup
import { WebContainer } from '@webcontainer/api';
let webcontainerInstance;
async function initWebContainer() {
try {
webcontainerInstance = await WebContainer.boot();
console.log('WebContainer initialized');
} catch (error) {
console.error('Failed to initialize WebContainer:', error);
// Fallback to server-side execution
return initServerSideContainer();
}
}
```
### Error: "Database connection timeout"
#### Solution
```python
# Implement connection pooling and retry logic
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool
import time

def create_db_engine(retry_count=3, retry_delay=5):
    for attempt in range(retry_count):
        try:
            engine = create_engine(
                DATABASE_URL,
                poolclass=QueuePool,
                pool_size=10,
                max_overflow=20,
                pool_timeout=30,
                pool_recycle=3600
            )
            # Test connection (text() is required by SQLAlchemy 2.x)
            with engine.connect() as conn:
                conn.execute(text("SELECT 1"))
            return engine
return engine
except Exception as e:
if attempt < retry_count - 1:
time.sleep(retry_delay)
continue
raise e
```
## Best Practices for Troubleshooting
1. **Enable verbose logging**
```bash
export DEBUG=deepwikiopen:*
export LOG_LEVEL=debug
```
2. **Use health check endpoints**
```bash
curl http://localhost:8000/health
curl http://localhost:8000/api/status
```
3. **Monitor system resources**
```bash
# Create monitoring script
#!/bin/bash
while true; do
echo "=== $(date) ==="
docker stats --no-stream
sleep 5
done
```
4. **Implement comprehensive error handling**
```javascript
app.use((err, req, res, next) => {
console.error('Error details:', {
message: err.message,
stack: err.stack,
timestamp: new Date().toISOString(),
request: {
method: req.method,
url: req.url,
headers: req.headers
}
});
res.status(err.status || 500).json({
error: err.message,
timestamp: new Date().toISOString(),
requestId: req.id
});
});
```
## Getting Help
If you encounter issues not covered here:
1. Check the [GitHub Issues](https://github.com/deepwikiopen/deepwikiopen/issues)
2. Enable debug logging and collect logs
3. Provide system information:
```bash
node --version
npm --version
docker --version
uname -a
```
4. Include minimal reproduction steps
5. Join our [Discord community](https://discord.gg/deepwikiopen) for real-time support
# Authorization Mode
Source: https://asyncfunc.mintlify.app/getting-started/authorization-mode
Secure your DeepWiki-Open instance with authorization mode - requiring authentication codes for wiki generation and API access
Authorization mode provides access control for your DeepWiki-Open instance by requiring users to enter a secret authorization code before generating wikis. This is essential for production deployments, shared environments, or when you want to control who can generate documentation from repositories.
## What is Authorization Mode?
Authorization mode is a simple but effective security feature:

* **Web interface**: users must enter an authorization code before generating wikis
* **API**: all requests must include the authorization code
* **Cost control**: prevents unauthorized usage of AI provider quotas
* **Auditing**: tracks and logs authentication attempts
## When to Use Authorization Mode
### Recommended Scenarios
**Always enable authorization mode in production environments:**
* Public-facing instances
* Shared hosting environments
* Corporate deployments
* Any instance accessible from the internet
**Benefits:**
* Prevents unauthorized access to AI providers
* Controls API usage and costs
* Maintains audit trail of access
**Shared development environments:**
* Team wikis for private repositories
* Collaborative documentation projects
* Educational environments
* Demo instances with controlled access
**Benefits:**
* Ensures only authorized team members can generate docs
* Prevents accidental quota exhaustion
* Maintains consistent access control
**AI provider quota protection:**
* Limited API budgets
* Pay-per-use providers
* Rate-limited accounts
* Enterprise cost control
**Benefits:**
* Prevents unexpected charges
* Controls usage patterns
* Enables usage tracking per authorization
### When NOT to Use Authorization Mode
Authorization mode can be disabled for:
* Local development instances
* Personal single-user setups
* Internal networks with existing security
* Testing and experimentation environments
## Configuration
### Environment Variables
Configure authorization mode using these environment variables:
* `DEEPWIKI_AUTH_MODE`: enables or disables authorization mode
* `DEEPWIKI_AUTH_CODE`: the secret authorization code required for access
### Basic Configuration
Add to your `.env` file:
```env
# Enable authorization mode
DEEPWIKI_AUTH_MODE=true
# Set your secret authorization code
DEEPWIKI_AUTH_CODE=your-secure-code-here
```
Choose a strong, unique authorization code. Avoid common passwords or easily guessable codes.
Authorization mode requires a full restart:
```bash
# Stop containers
docker-compose down
# Start with new configuration
docker-compose up -d
# Verify authorization mode is enabled
docker-compose logs api | grep "Authorization"
```
```bash
# Stop backend API
pkill -f "python -m api.main"
# Stop frontend
pkill -f "npm run dev"
# Start backend with new configuration
python -m api.main &
# Start frontend
npm run dev &
```
Check authorization status:
```bash
curl -X GET "http://localhost:8001/auth/status"
```
Expected response:
```json
{
"auth_required": true
}
```
Server logs should show: "Authorization mode: ENABLED"
### Advanced Configuration
```env Production Environment
# Production authorization settings
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=prod-secure-auth-2024-XYZ789
# Additional security settings
NODE_ENV=production
LOG_LEVEL=WARNING
LOG_SENSITIVE_DATA=false
# Server configuration
SERVER_BASE_URL=https://deepwiki.yourdomain.com
PORT=8001
```
```env Development Environment
# Development with light authorization
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=dev-auth-123
# Development settings
NODE_ENV=development
LOG_LEVEL=DEBUG
LOG_SENSITIVE_DATA=true
# Local server configuration
SERVER_BASE_URL=http://localhost:8001
PORT=8001
```
```env Team Shared Environment
# Team authorization settings
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=team-wiki-access-2024
# Shared environment configuration
NODE_ENV=production
LOG_LEVEL=INFO
# Team server settings
SERVER_BASE_URL=https://team-wiki.company.com
PORT=8001
```
## Frontend Usage
### Authorization Code Input
When authorization mode is enabled, users see an authorization code input field:
The field is labeled "Authorization Required" with the hint "This code is required to generate wikis".
### User Experience Flow
User visits the DeepWiki-Open interface and sees:
* Normal repository URL input
* Authorization code field (when mode is enabled)
* Clear indication that authorization is required
User enters the authorization code:
* Code is masked (password field)
* Real-time validation (optional)
* Clear error messages for invalid codes
After successful authorization:
* Normal wiki generation flow continues
* Code is included in API requests automatically
* User doesn't need to re-enter code for the session
### Frontend Implementation
The frontend automatically detects authorization mode:
```javascript
// Authorization status check
const checkAuthStatus = async () => {
const response = await fetch('/api/auth/status');
const { auth_required } = await response.json();
if (auth_required) {
// Show authorization code input
setShowAuthInput(true);
}
};
// Authorization validation
const validateAuthCode = async (code) => {
const response = await fetch('/api/auth/validate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ code })
});
const { success } = await response.json();
return success;
};
```
## API Usage
### Including Authorization Codes
All API requests must include the authorization code when mode is enabled:
```bash cURL
curl -X POST "http://localhost:8001/wiki/generate" \
-H "Content-Type: application/json" \
-d '{
"repo_url": "https://github.com/facebook/react",
"model_provider": "google",
"authorization_code": "your-secure-code-here"
}'
```
```python Python
import requests
# Wiki generation with authorization
response = requests.post("http://localhost:8001/wiki/generate", json={
"repo_url": "https://github.com/facebook/react",
"model_provider": "google",
"authorization_code": "your-secure-code-here"
})
if response.status_code == 401:
print("Invalid authorization code")
else:
wiki_data = response.json()
```
```javascript JavaScript
// API request with authorization
const generateWiki = async (repoUrl, authCode) => {
try {
const response = await fetch('/api/wiki/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
repo_url: repoUrl,
model_provider: 'google',
authorization_code: authCode
})
});
if (response.status === 401) {
throw new Error('Invalid authorization code');
}
return await response.json();
} catch (error) {
console.error('Wiki generation failed:', error);
throw error;
}
};
```
### Authorization Validation Endpoint
Test authorization codes before use:
```bash Validate Code
curl -X POST "http://localhost:8001/auth/validate" \
-H "Content-Type: application/json" \
-d '{
"code": "your-secure-code-here"
}'
```
```python Python Validation
import requests
def validate_auth_code(code):
response = requests.post("http://localhost:8001/auth/validate",
json={"code": code})
result = response.json()
return result.get("success", False)
# Usage
if validate_auth_code("test-code"):
print("Authorization code is valid")
else:
print("Invalid authorization code")
```
```javascript JavaScript Validation
const validateAuthCode = async (code) => {
const response = await fetch('/api/auth/validate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ code })
});
const { success } = await response.json();
return success;
};
```
```json Valid Code
{
"success": true
}
```
```json Invalid Code
{
"success": false
}
```
## Security Considerations
### Authorization Code Security
**Strong authorization codes:**
```bash
# Good examples
team-wiki-2024-XRT789-secure
prod-deepwiki-auth-94KL23P
company-docs-access-2024-QW45
# Avoid these
123456
password
admin
test
deepwiki
```
**Generation methods:**
```bash
# Generate random codes
openssl rand -hex 16
# Output: a3f7c9e2b4d8f6a1c5e9b7d3f8a2c6e4
# Generate shorter alphanumeric codes
openssl rand -base64 12 | tr -d "=+/" | cut -c1-16
# Output: kH7mP9qR3sL4wX8t
# Custom format
echo "wiki-$(date +%Y)-$(openssl rand -hex 4)"
# Output: wiki-2024-7a8b9c2d
```
**Secure storage:**
* Store codes in environment variables, not source code
* Use secrets management in production (AWS Secrets Manager, Azure Key Vault)
* Restrict file permissions on `.env` files (`chmod 600`)
* Never commit authorization codes to version control
**Regular rotation:**
```bash
# Rotation schedule
# - Development: Monthly
# - Staging: Bi-weekly
# - Production: Weekly or after security incidents
# Rotation process:
# 1. Generate new authorization code
# 2. Test with new code in staging
# 3. Update production environment variables
# 4. Restart services
# 5. Notify authorized users
# 6. Monitor for authentication failures
```
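Steps 1 and 3 of the rotation process can be scripted; a minimal sketch that rewrites `DEEPWIKI_AUTH_CODE` in a `.env` file (the code format and file path are assumptions about your deployment, and services still need a restart afterwards):

```python
import re
import secrets

def rotate_auth_code(env_path=".env"):
    """Generate a fresh code and rewrite DEEPWIKI_AUTH_CODE in the env file."""
    new_code = f"wiki-{secrets.token_hex(8)}"
    with open(env_path) as f:
        content = f.read()
    content, replaced = re.subn(r"(?m)^DEEPWIKI_AUTH_CODE=.*$",
                                f"DEEPWIKI_AUTH_CODE={new_code}", content)
    if replaced == 0:
        content += f"\nDEEPWIKI_AUTH_CODE={new_code}\n"  # key was absent; append
    with open(env_path, "w") as f:
        f.write(content)
    return new_code
```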
**Code management:**
```bash
# Environment-specific codes
DEEPWIKI_AUTH_CODE_DEV=dev-auth-code-123
DEEPWIKI_AUTH_CODE_STAGING=staging-auth-456
DEEPWIKI_AUTH_CODE_PROD=prod-auth-789
# Use deployment scripts to select appropriate code
export DEEPWIKI_AUTH_CODE=${DEEPWIKI_AUTH_CODE_PROD}
```
**HTTPS requirement:**
```nginx
# Always use HTTPS in production
server {
listen 443 ssl;
server_name deepwiki.yourdomain.com;
# SSL configuration
ssl_certificate /path/to/certificate.crt;
ssl_certificate_key /path/to/private.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers HIGH:!aNULL:!MD5;
location / {
proxy_pass http://localhost:8001;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
# Redirect HTTP to HTTPS
server {
listen 80;
server_name deepwiki.yourdomain.com;
return 301 https://$server_name$request_uri;
}
```
**Access restrictions:**
```bash
# Firewall rules for additional security
# Only allow specific IPs to access API directly
iptables -A INPUT -p tcp --dport 8001 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8001 -j DROP
# Allow HTTPS from anywhere (proxied)
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
```
### Attack Prevention
Authorization mode provides basic access control but is not a complete security solution. Consider additional measures:
* Rate limiting to prevent brute force attacks
* IP whitelisting for sensitive environments
* Web Application Firewall (WAF) for production
* Regular security audits and monitoring
* Multi-factor authentication for critical deployments
**Rate limiting example:**
```python
# Basic rate limiting implementation
from collections import defaultdict
from time import time
auth_attempts = defaultdict(list)
MAX_ATTEMPTS = 5
WINDOW_SECONDS = 300 # 5 minutes
def check_rate_limit(ip_address):
now = time()
attempts = auth_attempts[ip_address]
# Remove old attempts
attempts[:] = [t for t in attempts if now - t < WINDOW_SECONDS]
if len(attempts) >= MAX_ATTEMPTS:
return False # Rate limited
attempts.append(now)
return True
```
## Troubleshooting
### Common Issues
#### Authorization code not accepted

**Symptoms:**
* "Invalid authorization code" errors
* Authentication consistently fails
* API returns 401 status codes
**Solutions:**
1. **Verify environment variables:**
```bash
echo "Auth Mode: $DEEPWIKI_AUTH_MODE"
echo "Auth Code: $DEEPWIKI_AUTH_CODE"
```
2. **Check code formatting:**
```bash
# Ensure no extra spaces or characters
export DEEPWIKI_AUTH_CODE=$(echo "$DEEPWIKI_AUTH_CODE" | xargs)
```
3. **Restart services completely:**
```bash
# Kill all processes
pkill -f "python -m api.main"
pkill -f "npm"
# Wait a moment, then restart
sleep 2
python -m api.main &
npm run dev &
```
4. **Test with curl:**
```bash
curl -X POST "http://localhost:8001/auth/validate" \
-H "Content-Type: application/json" \
-d '{"code": "your-exact-code-here"}'
```
**Symptoms:**
* Authorization mode enabled but no auth input field
* Frontend behaves as if authorization is disabled
* No indication that auth is required
**Solutions:**
1. **Check API status endpoint:**
```bash
curl -X GET "http://localhost:8001/auth/status"
```
2. **Verify frontend API connection:**
```javascript
// Check in browser console
fetch('/api/auth/status')
  .then(r => r.json())
  .then(console.log);
```
3. **Clear browser cache:**
```bash
# No terminal command needed — use the browser dev tools:
# Application > Storage > Clear Site Data
```
4. **Check Next.js API routes:**
```bash
# Verify routes exist
ls -la src/app/api/auth/
```
**Symptoms:**
* Frontend shows auth required but validation fails
* Network errors in browser console
* Proxy/forwarding failures
**Solutions:**
1. **Check SERVER\_BASE\_URL:**
```bash
echo "Server URL: $SERVER_BASE_URL"
# Should match your backend API location
```
2. **Test direct API access:**
```bash
# Test backend directly
curl -X GET "http://localhost:8001/auth/status"
# Test frontend proxy
curl -X GET "http://localhost:3000/api/auth/status"
```
3. **Verify port configuration:**
```bash
# Backend should be running on PORT
netstat -tulpn | grep :8001
# Frontend should be running on 3000
netstat -tulpn | grep :3000
```
### Debugging Steps
```bash
# Verify environment variables are loaded
python -c "
import os
from dotenv import load_dotenv
load_dotenv()
print('Auth Mode:', os.getenv('DEEPWIKI_AUTH_MODE'))
print('Auth Code Set:', bool(os.getenv('DEEPWIKI_AUTH_CODE')))
print('Server URL:', os.getenv('SERVER_BASE_URL'))
"
```
```bash
# Test status endpoint
curl -v -X GET "http://localhost:8001/auth/status"
# Test validation endpoint
curl -v -X POST "http://localhost:8001/auth/validate" \
-H "Content-Type: application/json" \
-d '{"code": "test-code"}'
# Test wiki generation (should require auth)
curl -v -X POST "http://localhost:8001/wiki/generate" \
-H "Content-Type: application/json" \
-d '{"repo_url": "https://github.com/octocat/Hello-World", "model_provider": "google"}'
```
```bash
# Backend logs
tail -f api/logs/application.log
# Look for:
# - "Authorization mode: ENABLED/DISABLED"
# - Authentication attempt logs
# - Error messages related to auth
```
```javascript
// In browser console
// Check authorization status
fetch('/api/auth/status').then(r => r.json()).then(console.log);

// Test validation
fetch('/api/auth/validate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ code: 'test-code' })
}).then(r => r.json()).then(console.log);
```
## Production Deployment
### Docker Configuration
```yaml docker-compose.yml
services:
  deepwiki-api:
    build: .
    environment:
      # Authorization configuration
      - DEEPWIKI_AUTH_MODE=true
      - DEEPWIKI_AUTH_CODE=${DEEPWIKI_AUTH_CODE}
      # API keys
      - GOOGLE_API_KEY=${GOOGLE_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      # Server configuration
      - PORT=8001
      - SERVER_BASE_URL=https://api.deepwiki.yourdomain.com
      # Security settings
      - NODE_ENV=production
      - LOG_LEVEL=WARNING
    ports:
      - "8001:8001"
    restart: unless-stopped

  deepwiki-frontend:
    build:
      context: .
      dockerfile: Dockerfile.frontend
    environment:
      - SERVER_BASE_URL=https://api.deepwiki.yourdomain.com
      - NEXT_PUBLIC_API_URL=https://api.deepwiki.yourdomain.com
    ports:
      - "3000:3000"
    depends_on:
      - deepwiki-api
    restart: unless-stopped
```
```env .env.production
# Production authorization settings
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=prod-secure-auth-2024-XYZ789
# API Keys
GOOGLE_API_KEY=your_production_google_key
OPENAI_API_KEY=your_production_openai_key
# Server configuration
SERVER_BASE_URL=https://api.deepwiki.yourdomain.com
PORT=8001
# Security
NODE_ENV=production
LOG_LEVEL=WARNING
LOG_SENSITIVE_DATA=false
```
### Kubernetes Deployment
```yaml deepwiki-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: deepwiki-auth
type: Opaque
stringData:
  auth-code: "prod-secure-auth-2024-XYZ789"
  google-api-key: "your_google_api_key"
  openai-api-key: "your_openai_api_key"
```
```yaml deepwiki-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepwiki-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: deepwiki-api
  template:
    metadata:
      labels:
        app: deepwiki-api
    spec:
      containers:
        - name: api
          image: deepwiki/api:latest
          ports:
            - containerPort: 8001
          env:
            - name: DEEPWIKI_AUTH_MODE
              value: "true"
            - name: DEEPWIKI_AUTH_CODE
              valueFrom:
                secretKeyRef:
                  name: deepwiki-auth
                  key: auth-code
            - name: GOOGLE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: deepwiki-auth
                  key: google-api-key
            - name: SERVER_BASE_URL
              value: "https://deepwiki.yourdomain.com"
            - name: NODE_ENV
              value: "production"
```
### Monitoring and Alerting
```python
# Example monitoring setup
import logging
from collections import defaultdict
from time import time

# Metrics collection
auth_metrics = {
    'successful_auths': 0,
    'failed_auths': 0,
    'unique_ips': set(),
    'requests_per_hour': defaultdict(int)
}

def send_security_alert(message):
    # Placeholder: wire this into your alerting system (email, Slack, PagerDuty)
    logging.warning("SECURITY ALERT: %s", message)

def track_auth_attempt(ip_address, success):
    current_hour = int(time()) // 3600
    if success:
        auth_metrics['successful_auths'] += 1
    else:
        auth_metrics['failed_auths'] += 1
    auth_metrics['unique_ips'].add(ip_address)
    auth_metrics['requests_per_hour'][current_hour] += 1

    # Alert on suspicious activity
    if auth_metrics['failed_auths'] > 50:
        send_security_alert("High number of failed authentications")
```
```json
{
  "dashboard": {
    "title": "DeepWiki Authorization",
    "panels": [
      {
        "title": "Authentication Success Rate",
        "type": "stat",
        "targets": [
          {
            "expr": "rate(deepwiki_auth_success_total[5m]) / rate(deepwiki_auth_attempts_total[5m]) * 100"
          }
        ]
      },
      {
        "title": "Failed Authentication Attempts",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(deepwiki_auth_failures_total[1m])"
          }
        ]
      },
      {
        "title": "Unique IPs per Hour",
        "type": "bargraph",
        "targets": [
          {
            "expr": "increase(deepwiki_unique_ips_total[1h])"
          }
        ]
      }
    ]
  }
}
```
```bash
# Analyze authentication logs
grep "Authorization" api/logs/application.log | tail -100
# Failed authentication attempts
grep "Authorization.*invalid" api/logs/application.log | wc -l
# Unique IP addresses attempting authentication
grep "Authorization" api/logs/application.log | \
grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | \
sort | uniq -c | sort -nr
# Successful authentications per hour
grep "Authorization.*success" api/logs/application.log | \
awk '{print $1" "$2}' | cut -c1-13 | uniq -c
```
## Best Practices
### Development Workflow
```bash
# Use simple auth codes for development
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=dev-123
# Enable detailed logging
LOG_LEVEL=DEBUG
```
```bash
# Use production-like codes
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=staging-secure-2024
# Test authorization workflows thoroughly
```
```bash
# Strong authorization codes
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=complex-random-production-code
# Minimal logging
LOG_LEVEL=WARNING
LOG_SENSITIVE_DATA=false
```
### Team Management
**Secure methods for sharing authorization codes:**
1. **Encrypted communication:**
* Use encrypted messaging (Signal, encrypted email)
* Password managers with secure sharing
* Company secrets management systems
2. **Documentation:**
```markdown
# Team Wiki Access
Authorization code: [Available in team password manager]
Location: Team Vault > DeepWiki > Production Access
Updated: 2024-01-15
Next rotation: 2024-02-15
```
3. **Onboarding process:**
* Include authorization code in new team member setup
* Document where to find current codes
* Explain rotation schedule and notifications
**Multiple authorization codes for different access levels:**
```env
# Different codes for different environments
DEEPWIKI_AUTH_CODE_READONLY=readonly-access-2024
DEEPWIKI_AUTH_CODE_FULL=full-access-2024-secure
DEEPWIKI_AUTH_CODE_ADMIN=admin-access-ultra-secure-2024
```
**Implementation:**
```python
# Multiple authorization codes
import os

VALID_AUTH_CODES = {
    'readonly': os.getenv('DEEPWIKI_AUTH_CODE_READONLY'),
    'full': os.getenv('DEEPWIKI_AUTH_CODE_FULL'),
    'admin': os.getenv('DEEPWIKI_AUTH_CODE_ADMIN')
}

def validate_auth_code(code, required_level='readonly'):
    # Each required level lists which tiers are allowed to satisfy it
    allowed_levels = {
        'readonly': ['readonly', 'full', 'admin'],
        'full': ['full', 'admin'],
        'admin': ['admin']
    }
    for level in allowed_levels[required_level]:
        if VALID_AUTH_CODES[level] == code:
            return True, level
    return False, None
```
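To make the tier semantics concrete, here is a self-contained sketch of the same lookup with hard-coded example codes (hypothetical values for illustration only, not real credentials): a higher-tier code satisfies a lower-tier requirement, but not the reverse.

```python
# Hypothetical example codes for illustration only
VALID_AUTH_CODES = {
    'readonly': 'readonly-access-2024',
    'full': 'full-access-2024-secure',
    'admin': 'admin-access-ultra-secure-2024',
}

def validate_auth_code(code, required_level='readonly'):
    # Each required level lists which tiers are allowed to satisfy it
    allowed_levels = {
        'readonly': ['readonly', 'full', 'admin'],
        'full': ['full', 'admin'],
        'admin': ['admin'],
    }
    for level in allowed_levels[required_level]:
        if VALID_AUTH_CODES[level] == code:
            return True, level
    return False, None

print(validate_auth_code('admin-access-ultra-secure-2024', 'readonly'))  # (True, 'admin')
print(validate_auth_code('readonly-access-2024', 'full'))                # (False, None)
```

An admin code passes a read-only check, while a read-only code is rejected when full access is required.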
### Security Maintenance
```bash
#!/bin/bash
# auth-rotation.sh - Automated code rotation
# Generate new auth code
NEW_CODE="wiki-$(date +%Y%m%d)-$(openssl rand -hex 6)"
# Update environment
echo "New authorization code: $NEW_CODE"
echo "Update DEEPWIKI_AUTH_CODE in production environment"
# Record the rotation without writing the code itself to disk
echo "$(date): Authorization code rotated" >> auth-rotation.log
```
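The same rotation step can be done in Python with the standard `secrets` module, which is designed for cryptographic randomness (a sketch; adjust the prefix and length to your own policy):

```python
import secrets
from datetime import date

def generate_auth_code(prefix="wiki"):
    # token_hex(6) yields 12 hex characters, matching the openssl example above
    return f"{prefix}-{date.today():%Y%m%d}-{secrets.token_hex(6)}"

code = generate_auth_code()
print(code)  # e.g. wiki-20240115-a1b2c3d4e5f6
```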
```python
# Audit script
def audit_auth_access():
    with open('api/logs/application.log', 'r') as f:
        auth_events = [line for line in f if 'Authorization' in line]

    print(f"Total auth events: {len(auth_events)}")
    print(f"Successful: {len([e for e in auth_events if 'success' in e])}")
    print(f"Failed: {len([e for e in auth_events if 'invalid' in e])}")
```
```bash
# Monitor for security issues
tail -f api/logs/application.log | grep -i "auth\|security\|unauthorized"
# Set up alerts for suspicious activity
grep -c "Authorization.*invalid" api/logs/application.log | \
awk '{if ($1 > 10) print "ALERT: High auth failure rate"}'
```
## Next Steps
Complete production deployment with authorization mode
Learn about complete API authentication and security
Configure access to private repositories with tokens
Implement comprehensive security measures
# Environment Variables
Source: https://asyncfunc.mintlify.app/getting-started/environment-variables
Complete guide to configuring DeepWiki-Open environment variables for all supported AI providers and features
DeepWiki-Open uses environment variables to configure AI providers, server settings, authentication, and advanced features. This guide covers all available environment variables and their usage.
## Required Environment Variables
At minimum, you need API keys for at least one AI provider:
```bash Minimal Setup
# Choose ONE of these AI providers
GOOGLE_API_KEY=your_google_api_key_here
# OR
OPENAI_API_KEY=your_openai_api_key_here
```
```bash Complete .env File
# AI Provider API Keys (choose at least one)
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Azure OpenAI (Enterprise)
AZURE_OPENAI_API_KEY=your_azure_openai_api_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
# AWS Bedrock
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_REGION=us-east-1
# Local/Self-Hosted Models
OLLAMA_HOST=http://localhost:11434
OPENAI_BASE_URL=https://api.openai.com/v1
# Server Configuration
PORT=8001
SERVER_BASE_URL=http://localhost:8001
# Authorization & Security
DEEPWIKI_AUTH_MODE=false
DEEPWIKI_AUTH_CODE=your_secret_code_here
# Logging & Debugging
LOG_LEVEL=INFO
LOG_FILE_PATH=./api/logs/application.log
# Advanced Configuration
DEEPWIKI_CONFIG_DIR=./api/config
REDIS_URL=redis://localhost:6379
```
## AI Provider Configuration
### Google Gemini
GOOGLE\_API\_KEY
Google AI Studio API key for Gemini models.
**How to get:**
1. Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Click "Create API Key"
3. Copy the generated key
**Supported models:**
* `gemini-2.0-flash` (default, recommended)
* `gemini-1.5-flash`
* `gemini-1.0-pro`
Google Gemini offers generous free tier limits and excellent performance for documentation generation.
### OpenAI
OPENAI\_API\_KEY
OpenAI API key for GPT models.
OPENAI\_BASE\_URL
Default: `https://api.openai.com/v1`
Custom OpenAI API endpoint (for OpenAI-compatible services).
**How to get:**
1. Visit [OpenAI Platform](https://platform.openai.com/api-keys)
2. Create new secret key
3. Copy the key (starts with `sk-`)
**Supported models:**
* `gpt-4o` (default)
* `gpt-4.1`
* `o1`
* `o3`
* `o4-mini`
OpenAI requires a paid account. Free tier users cannot access the API.
### OpenRouter
OPENROUTER\_API\_KEY
OpenRouter API key for access to multiple model providers.
**How to get:**
1. Sign up at [OpenRouter](https://openrouter.ai/)
2. Go to Keys section
3. Create new API key
**Available models:**
* `openai/gpt-4o`
* `anthropic/claude-3.5-sonnet`
* `deepseek/deepseek-r1`
* `google/gemini-pro`
* And 100+ more models
OpenRouter provides access to multiple AI providers through a single API, perfect for comparing models.
### Azure OpenAI
AZURE\_OPENAI\_API\_KEY
Azure OpenAI service API key.
AZURE\_OPENAI\_ENDPOINT
Your Azure OpenAI resource endpoint URL.
AZURE\_OPENAI\_VERSION
API version (e.g., `2024-02-15-preview`).
**How to get:**
1. Create Azure OpenAI resource in [Azure Portal](https://portal.azure.com/)
2. Deploy a model (GPT-4, GPT-3.5-turbo, etc.)
3. Get endpoint and API key from resource overview
4. Note the API version from the deployment
**Example configuration:**
```env
AZURE_OPENAI_API_KEY=abc123def456ghi789
AZURE_OPENAI_ENDPOINT=https://my-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
```
### AWS Bedrock
AWS\_ACCESS\_KEY\_ID
AWS access key for Bedrock access.
AWS\_SECRET\_ACCESS\_KEY
AWS secret access key.
AWS\_REGION
Default: `us-east-1`
AWS region where Bedrock models are available.
**Supported models:**
* `anthropic.claude-3-sonnet-20240229-v1:0`
* `anthropic.claude-3-haiku-20240307-v1:0`
* `anthropic.claude-3-opus-20240229-v1:0`
* `amazon.titan-text-express-v1`
### Ollama (Local Models)
OLLAMA\_HOST
Default: `http://localhost:11434`
Ollama server URL for local AI models.
**Setup Ollama:**
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Start service
ollama serve
# Pull a model
ollama pull llama3:8b
```
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Start as service
sudo systemctl start ollama
sudo systemctl enable ollama
# Pull a model
ollama pull qwen3:8b
```
```bash
# Run Ollama in container
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# Pull a model
docker exec -it ollama ollama pull llama3:8b
```
**Supported models:**
* `qwen3:1.7b` (lightweight)
* `llama3:8b` (balanced)
* `qwen3:8b` (high context)
### DashScope (Alibaba)
DASHSCOPE\_API\_KEY
Alibaba DashScope API key for Qwen models.
**How to get:**
1. Sign up at [DashScope](https://dashscope.aliyun.com/)
2. Create API key in console
3. Add key to environment
**Supported models:**
* `qwen-plus`
* `qwen-turbo`
* `deepseek-r1`
## Server Configuration
PORT
Default: `8001`
Port for the FastAPI backend server.
SERVER\_BASE\_URL
Default: `http://localhost:8001`
Base URL for API server (used by frontend).
NODE\_ENV
Node environment (`development` or `production`).
DEEPWIKI\_AUTH\_MODE
Default: `false`
Enable authorization requirement for wiki generation.
DEEPWIKI\_AUTH\_CODE
Secret code required when authorization mode is enabled.
**Usage:**
```env
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=my-secret-code-123
```
When enabled, users must enter the auth code to generate wikis.
Authorization mode provides basic frontend protection but doesn't secure direct API access.
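As an illustration (not DeepWiki's actual implementation), a backend check for such a code should compare strings in constant time to avoid timing side channels — Python's `hmac.compare_digest` does exactly that:

```python
import hmac
import os

def code_is_valid(submitted: str) -> bool:
    expected = os.getenv("DEEPWIKI_AUTH_CODE", "")
    # compare_digest runs in time independent of where the strings first differ
    return bool(expected) and hmac.compare_digest(submitted, expected)

os.environ["DEEPWIKI_AUTH_CODE"] = "my-secret-code-123"
print(code_is_valid("my-secret-code-123"))  # True
print(code_is_valid("wrong-code"))          # False
```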
## Logging & Debugging
LOG\_LEVEL
Default: `INFO`
Logging verbosity: `DEBUG`, `INFO`, `WARNING`, or `ERROR`.
LOG\_FILE\_PATH
Default: `./api/logs/application.log`
Path for log file output.
**Example logging configuration:**
```env
LOG_LEVEL=DEBUG
LOG_FILE_PATH=./logs/deepwiki-debug.log
```
In production, use `INFO` or `WARNING` level to reduce log volume.
## Advanced Configuration
DEEPWIKI\_CONFIG\_DIR
Default: `./api/config`
Directory containing configuration JSON files.
REDIS\_URL
Redis connection URL for caching (optional).
**Example:**
```env
DEEPWIKI_CONFIG_DIR=/custom/config/path
REDIS_URL=redis://localhost:6379/0
```
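You can sanity-check a `REDIS_URL` without a running server by parsing it into its scheme, host, port, and database number (a small sketch using only the standard library):

```python
from urllib.parse import urlparse

def parse_redis_url(url):
    parsed = urlparse(url)
    # rediss:// is the TLS variant
    assert parsed.scheme in ("redis", "rediss"), "expected redis:// or rediss://"
    db = int(parsed.path.lstrip("/") or 0)
    return parsed.hostname, parsed.port or 6379, db

print(parse_redis_url("redis://localhost:6379/0"))  # ('localhost', 6379, 0)
```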
## Environment File Templates
### Development
```env .env.development
# Development Environment
NODE_ENV=development
LOG_LEVEL=DEBUG
LOG_FILE_PATH=./api/logs/development.log
# Server Config
PORT=8001
SERVER_BASE_URL=http://localhost:8001
# API Keys
GOOGLE_API_KEY=your_development_google_key
OPENAI_API_KEY=your_development_openai_key
# Local Ollama
OLLAMA_HOST=http://localhost:11434
# No authentication for dev
DEEPWIKI_AUTH_MODE=false
```
### Production
```env .env.production
# Production Environment
NODE_ENV=production
LOG_LEVEL=INFO
LOG_FILE_PATH=/var/log/deepwiki/application.log
# Server Config
PORT=8001
SERVER_BASE_URL=https://api.yourdomain.com
# Production API Keys
GOOGLE_API_KEY=your_production_google_key
OPENAI_API_KEY=your_production_openai_key
AZURE_OPENAI_API_KEY=your_azure_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
# Enable authentication
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=your_secure_production_code
# Redis caching
REDIS_URL=redis://redis-server:6379/0
```
### Docker
```env .env.docker
# Docker Environment
NODE_ENV=production
LOG_LEVEL=INFO
# Container networking
PORT=8001
SERVER_BASE_URL=http://deepwiki-api:8001
# API Keys
GOOGLE_API_KEY=your_google_key
OPENAI_API_KEY=your_openai_key
# External Ollama
OLLAMA_HOST=http://ollama-server:11434
# Persistent data
LOG_FILE_PATH=/app/logs/application.log
DEEPWIKI_CONFIG_DIR=/app/config
```
## Validation & Testing
```bash
# Check environment variables are loaded
python -c "
import os
from dotenv import load_dotenv
load_dotenv()

# Check API keys
providers = {
    'Google': os.getenv('GOOGLE_API_KEY'),
    'OpenAI': os.getenv('OPENAI_API_KEY'),
    'OpenRouter': os.getenv('OPENROUTER_API_KEY'),
    'Azure': os.getenv('AZURE_OPENAI_API_KEY'),
}
for name, key in providers.items():
    status = '✓ Configured' if key else '✗ Missing'
    print(f'{name}: {status}')
"
```
```bash
# Test backend startup
python -m api.main
# Check for successful startup messages
# Look for: "Starting Streaming API on port 8001"
# No API key warnings for your configured providers
```
```bash
# Start frontend
npm run dev
# Test API connection at http://localhost:3000
# Model selection should show your configured providers
```
## Security Best Practices
* Never commit `.env` files to version control
* Use different API keys for development and production
* Regularly rotate API keys
* Monitor API usage for unexpected activity
* Use environment-specific keys when possible
```bash
# Secure file permissions
chmod 600 .env
# Use secrets management in production
# - AWS Secrets Manager
# - Azure Key Vault
# - Kubernetes secrets
# - Docker secrets
```
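On POSIX systems you can verify programmatically that a secrets file is not readable by other users (a sketch using only the standard library; `example.env` is a throwaway file created for the demonstration):

```python
import os
import stat

def check_env_permissions(path=".env"):
    mode = os.stat(path).st_mode
    # Fail if group or others have any read/write/execute bits set
    return not (mode & (stat.S_IRWXG | stat.S_IRWXO))

# Example: create a file, lock it down to owner-only, and check it
with open("example.env", "w") as f:
    f.write("GOOGLE_API_KEY=placeholder\n")
os.chmod("example.env", 0o600)
print(check_env_permissions("example.env"))  # True
```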
* Use HTTPS in production
* Configure proper CORS settings
* Use private networks for internal components
* Enable authorization mode for public deployments
## Troubleshooting
**Symptoms:** API key errors, default values used
**Solutions:**
1. Verify `.env` file is in project root
2. Check file permissions (readable by application)
3. Ensure no syntax errors in `.env` file
4. Restart application after changes
**Symptoms:** "Invalid API key" errors
**Solutions:**
1. Test API keys with provider's documentation
2. Check for extra spaces or characters
3. Verify key has correct permissions/scopes
4. Confirm key hasn't expired or been revoked
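Before calling the provider, you can catch the most common key problems (stray whitespace, wrong prefix) locally. This checks format only, not validity — OpenAI secret keys start with `sk-`; other providers vary:

```python
def basic_key_check(key, expected_prefix=None):
    """Format-only sanity check; a passing key may still be invalid server-side."""
    if key is None or not key.strip():
        return "missing"
    if key != key.strip():
        return "has surrounding whitespace"
    if expected_prefix and not key.startswith(expected_prefix):
        return f"does not start with {expected_prefix!r}"
    return "format looks ok"

print(basic_key_check(" sk-abc123 ", "sk-"))  # has surrounding whitespace
print(basic_key_check("sk-abc123", "sk-"))    # format looks ok
```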
**Symptoms:** "Port already in use" errors
**Solutions:**
1. Change PORT environment variable
2. Kill existing processes on the port
3. Use Docker with port mapping
4. Configure reverse proxy
## Next Steps
Configure specific AI model providers and their settings
Deploy DeepWiki with production-ready configuration
Learn about JSON configuration files for advanced customization
Implement security best practices for production deployments
# Generate Your First Wiki
Source: https://asyncfunc.mintlify.app/getting-started/first-wiki
Step-by-step guide to creating your first repository wiki with DeepWiki-Open
Now that DeepWiki-Open is running, let's generate your first repository wiki! This guide walks you through the complete process from entering a repository URL to exploring the generated documentation.
## Quick Start
Navigate to [http://localhost:3000](http://localhost:3000) in your browser.
You should see the main DeepWiki interface with:
* Repository URL input field
* Model provider selection
* Generate Wiki button
The interface loads without errors and shows all input fields.
For your first wiki, start with a small, well-structured repository:
```
https://github.com/octocat/Hello-World
https://github.com/microsoft/TypeScript-Node-Starter
https://github.com/expressjs/express
```
```
https://github.com/microsoft/autogen
https://github.com/openai/openai-python
https://github.com/vercel/next.js
```
Enter your own repository URL:
* GitHub: `https://github.com/username/repo-name`
* GitLab: `https://gitlab.com/username/repo-name`
* Bitbucket: `https://bitbucket.org/username/repo-name`
Start with repositories under 100 files for faster initial generation and testing.
Select your AI model provider and specific model:
* **Provider**: Google
* **Model**: `gemini-2.0-flash` (default)
* **Why**: Fast, reliable, generous free tier
* **Best for**: General documentation, good balance of speed and quality
* **Provider**: OpenAI
* **Model**: `gpt-4o` (default)
* **Why**: High-quality, detailed documentation
* **Best for**: Complex codebases requiring detailed analysis
* **Provider**: OpenRouter
* **Model**: `openai/gpt-4o` or `anthropic/claude-3.5-sonnet`
* **Why**: Access to multiple providers through one API
* **Best for**: Experimenting with different models
Ensure you have valid API keys configured for your chosen provider.
Click the **"Generate Wiki"** button and watch the progress:
1. **Repository Analysis**: DeepWiki clones and analyzes the repository structure
2. **Code Processing**: Creates embeddings for intelligent code understanding
3. **AI Generation**: Generates documentation using your selected AI model
4. **Diagram Creation**: Creates Mermaid diagrams for visualization
5. **Wiki Assembly**: Organizes everything into a structured wiki
Generation typically takes 30 seconds to 2 minutes for small repos, 5-10 minutes for larger ones.
## Understanding the Generated Wiki
Once generation completes, you'll see your interactive wiki with several key sections:
### Wiki Structure
**What you'll see:**
* Repository summary and purpose
* Key features and functionality
* Architecture overview
* Main technologies used
**Example content:**
* Project description
* Installation instructions
* Usage examples
* Contributing guidelines
**What you'll see:**
* File structure breakdown
* Component relationships
* Data flow analysis
* Key functions and classes
**Includes:**
* Code snippets with explanations
* Function parameter details
* Class inheritance diagrams
* Module dependencies
**Mermaid diagrams showing:**
* System architecture
* Data flow
* Component relationships
* Process workflows
* Database schemas (if applicable)
**Interactive features:**
* Zoom and pan diagrams
* Clickable nodes
* Responsive design
### Navigation Features
Hierarchical navigation showing all wiki pages and sections
Full-text search across all generated documentation
Links between related pages and code sections
Direct links to source files and line numbers
## Using the Ask Feature
The Ask feature lets you chat with your repository using RAG (Retrieval Augmented Generation):
Look for the "Ask" or chat interface in your generated wiki.
Try these example questions:
```text General Questions
What is the main purpose of this repository?
How do I install and run this project?
What are the key features?
```
```text Technical Questions
How does the authentication system work?
What databases does this project use?
Explain the API endpoints
```
```text Code-Specific Questions
Show me the main entry point
How is error handling implemented?
What testing framework is used?
```
The AI will provide answers based on your actual repository code, including:
* Direct code references
* File locations
* Relevant context
* Example usage
## Deep Research Mode
For complex analysis, try the Deep Research feature:
Toggle the "Deep Research" switch in the Ask interface.
Deep Research is perfect for questions like:
* "Analyze the security implementation across the entire codebase"
* "How is performance optimized in this application?"
* "Explain the complete data flow from user input to database"
Deep Research conducts multiple investigation rounds:
1. **Initial Analysis**: Basic research and findings
2. **Deep Dive**: Detailed investigation of specific aspects
3. **Cross-References**: Analysis of relationships and dependencies
4. **Final Synthesis**: Comprehensive conclusion with recommendations
## Customizing Your Wiki
### Model Selection
You can change models and regenerate sections:
1. Go to model configuration
2. Select a different provider/model
3. Regenerate specific sections
4. Compare results from different models
For advanced users, you can adjust:
* Temperature (creativity vs. consistency)
* Top-p (response diversity)
* Max tokens (response length)
* Custom model names (for compatible providers)
### Private Repository Access
1. Go to GitHub Settings → Developer settings → Personal access tokens
2. Generate new token (classic)
3. Select scopes: `repo` (for private repos)
4. Copy the generated token
1. Go to GitLab User Settings → Access Tokens
2. Create token with `read_repository` scope
3. Copy the generated token
1. Go to Bitbucket Account Settings → App passwords
2. Create password with `Repositories: Read` permission
3. Copy the generated password
1. Click "+ Add access tokens" in the DeepWiki interface
2. Paste your personal access token
3. The token is stored locally and used for repository access
Now you can generate wikis for your private repositories using the same process.
Tokens are stored locally in your browser. Never share your access tokens.
## Troubleshooting Your First Wiki
**Common causes:**
* Invalid repository URL
* Private repository without access token
* API key issues
* Repository too large
**Solutions:**
1. Verify the repository URL is accessible
2. Check API key configuration
3. Try a smaller repository first
4. Check the browser developer console for errors
**Possible issues:**
* Wrong model selected
* Repository has unusual structure
* Limited code documentation
**Improvements:**
1. Try a different AI model
2. Use OpenAI GPT-4 for complex repositories
3. Enable Deep Research mode
4. Ask specific questions to get better answers
**Normal for:**
* Large repositories (1000+ files)
* Complex codebases
* Detailed analysis requests
**Speed tips:**
1. Start with smaller repositories
2. Use faster models like `gemini-2.0-flash`
3. Avoid Deep Research for initial testing
4. Check your internet connection
## Example Walkthrough
Let's generate a wiki for the Express.js repository:
* Repository: `https://github.com/expressjs/express`
* Model: Google Gemini 2.0 Flash
* No authentication needed (public repo)
The generated wiki should include:
* Express.js framework overview
* Middleware architecture explanation
* Router and route handling
* Request/response cycle diagrams
* API reference for core methods
Try asking:
* "How does Express.js middleware work?"
* "Show me how to create a simple server"
* "Explain the routing system"
You should get detailed answers with code examples from the actual Express.js codebase.
## Next Steps
Learn about advanced wiki generation options and customization
Configure multiple AI providers to compare results
Deploy DeepWiki for team or production use
Integrate DeepWiki into your development workflow
Congratulations! You've successfully generated your first repository wiki with DeepWiki-Open. The AI-powered analysis and documentation generation should give you a comprehensive understanding of any codebase in minutes rather than hours of manual exploration.
# Installation Guide
Source: https://asyncfunc.mintlify.app/getting-started/installation
Detailed installation instructions for DeepWiki-Open including dependencies, system requirements, and setup options
This guide covers detailed installation instructions for DeepWiki-Open, including system requirements, dependencies, and various setup options.
## System Requirements
* **OS**: Linux, macOS, or Windows 10+
* **Python**: 3.8 or higher
* **Node.js**: 16.0 or higher
* **Memory**: 4GB RAM minimum
* **Storage**: 2GB free space
* **Network**: Internet connection for AI API calls
* **OS**: Linux (Ubuntu 20.04+) or macOS
* **Python**: 3.9 or 3.10
* **Node.js**: 18.0 or higher
* **Memory**: 8GB RAM or more
* **Storage**: 10GB free space (for repository caches)
* **Network**: Stable broadband connection
## Prerequisites
Before installing DeepWiki-Open, ensure you have:
```bash
# Using Homebrew
brew install python@3.10
# Or download from python.org
# https://www.python.org/downloads/
```
```bash
sudo apt update
sudo apt install python3.10 python3.10-venv python3-pip
```
```powershell
# Using Chocolatey
choco install python3
# Or download from python.org
# https://www.python.org/downloads/
```
Verify Python installation: `python --version` should show 3.8+
```bash
# Using Homebrew
brew install node
# Or using Node Version Manager
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
nvm install 18
nvm use 18
```
```bash
# Using NodeSource repository
curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash -
sudo apt-get install -y nodejs
# Or using NVM
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
nvm install 18
nvm use 18
```
```powershell
# Using Chocolatey
choco install nodejs
# Or download from nodejs.org
# https://nodejs.org/
```
Verify Node.js installation: `node --version` should show 16.0+
```bash
# Git comes with Xcode Command Line Tools
xcode-select --install
# Or using Homebrew
brew install git
```
```bash
sudo apt update
sudo apt install git
```
```powershell
# Using Chocolatey
choco install git
# Or download from git-scm.com
# https://git-scm.com/download/win
```
Verify Git installation: `git --version`
## Installation Methods
Choose the installation method that best fits your needs:
### Docker Installation
Docker provides the easiest and most consistent setup experience.
#### Prerequisites for Docker
Download and install [Docker Desktop for Mac](https://docs.docker.com/desktop/mac/install/)
```bash
# Ubuntu/Debian
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in for group change to take effect
```
Download and install [Docker Desktop for Windows](https://docs.docker.com/desktop/windows/install/)
Docker Compose is typically included with Docker Desktop. For Linux:
```bash
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
```
#### Docker Setup
```bash
git clone https://github.com/AsyncFuncAI/deepwiki-open.git
cd deepwiki-open
```
```bash
# Create .env file
cp .env.example .env
# Edit with your API keys
nano .env # or use your preferred editor
```
```bash
docker-compose up -d
```
This starts both backend (port 8001) and frontend (port 3000) services.
### Manual Installation
For development or when you need more control over the setup.
```bash
git clone https://github.com/AsyncFuncAI/deepwiki-open.git
cd deepwiki-open
```
```bash
# Create Python virtual environment
python -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
# Install Python dependencies
pip install -r api/requirements.txt
```
Using a virtual environment is strongly recommended to avoid dependency conflicts.
```bash
# Install Node.js dependencies
npm install
# or if you prefer yarn:
# yarn install
```
Create a `.env` file in the project root:
```bash
# Copy example environment file
cp .env.example .env
# Edit environment variables
nano .env
```
Required variables:
```env
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
```
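Before starting the services, it can help to confirm that at least one provider key is actually set. A minimal sketch — the helper name and warning text are illustrative, not part of DeepWiki's codebase:

```python
import os

REQUIRED_ANY = ["GOOGLE_API_KEY", "OPENAI_API_KEY"]

def missing_provider_keys(env) -> list:
    """Return the provider keys that are unset or empty in the given mapping."""
    return [key for key in REQUIRED_ANY if not env.get(key)]

# DeepWiki needs at least one of the two keys to generate wikis.
if missing_provider_keys(os.environ) == REQUIRED_ANY:
    print("Warning: set GOOGLE_API_KEY or OPENAI_API_KEY in your .env file")
```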
Open two terminal windows:
Terminal 1 (Backend):
```bash
# Ensure virtual environment is activated
source venv/bin/activate # or venv\Scripts\activate on Windows
python -m api.main
```
Terminal 2 (Frontend):
```bash
npm run dev
# or: yarn dev
```
### Development Setup
For contributors and developers who want to modify DeepWiki-Open.
1. Fork the repository on GitHub
2. Clone your fork:
```bash
git clone https://github.com/YOUR_USERNAME/deepwiki-open.git
cd deepwiki-open
```
```bash
# Install Python development dependencies
pip install -r api/requirements.txt
pip install -r api/requirements-dev.txt  # if the file exists
# Install Node.js development dependencies
npm install
# Install pre-commit hooks (if available)
pre-commit install
```
Create `.env.development`:
```env
# Development environment
NODE_ENV=development
LOG_LEVEL=DEBUG
LOG_FILE_PATH=./api/logs/development.log
# API Configuration
PORT=8001
SERVER_BASE_URL=http://localhost:8001
# Your API keys
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
```
For development with hot reloading:
```bash
# Terminal 1: Backend with auto-reload
python -m api.main # Already includes reload in development
# Terminal 2: Frontend with Turbopack
npm run dev # Uses Next.js Turbopack for fast refresh
```
## Verification
After installation, verify that everything is working correctly:
```bash
# Test backend API
curl http://localhost:8001/health
# Or visit in browser
open http://localhost:8001/docs
```
Expected response:
```json
{"status": "healthy", "version": "0.1.0"}
```
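If `curl` is unavailable, the same health check can be scripted. The endpoint and response shape follow the example above; the helper name is an illustration, not a DeepWiki API:

```python
import json
from urllib.request import urlopen

def is_healthy(base_url: str = "http://localhost:8001") -> bool:
    """Return True when the backend's /health endpoint reports a healthy status."""
    try:
        with urlopen(f"{base_url}/health", timeout=5) as resp:
            payload = json.loads(resp.read().decode())
        return payload.get("status") == "healthy"
    except (OSError, ValueError):
        # Connection refused, timeout, or malformed JSON all count as unhealthy.
        return False
```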
Visit [http://localhost:3000](http://localhost:3000) in your browser.
You should see:
* DeepWiki-Open interface
* Repository URL input field
* Model selection dropdown
* Generate Wiki button
1. Enter a small public repository (e.g., `https://github.com/octocat/Hello-World`)
2. Select an AI model provider
3. Click "Generate Wiki"
4. Verify wiki generation completes successfully
## Optional Dependencies
To run AI models locally using Ollama:
```bash
# Install Ollama using Homebrew
brew install ollama
# Or download the macOS installer from https://ollama.ai/download
# Start Ollama service
ollama serve
# Pull a model
ollama pull llama3:8b
```
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Start as service
sudo systemctl start ollama
sudo systemctl enable ollama
# Pull a model
ollama pull llama3:8b
```
Download and install from [Ollama website](https://ollama.ai/download)
Then in PowerShell:
```powershell
ollama pull llama3:8b
```
Update your `.env`:
```env
OLLAMA_HOST=http://localhost:11434
```
For improved performance in production:
```bash
brew install redis
brew services start redis
```
```bash
sudo apt update
sudo apt install redis-server
sudo systemctl start redis
sudo systemctl enable redis
```
```bash
docker run -d -p 6379:6379 redis:alpine
```
Add to `.env`:
```env
REDIS_URL=redis://localhost:6379
```
## Troubleshooting Installation
**Problem**: `python: command not found`
**Solutions**:
```bash
# Check if python3 is available
python3 --version
# Create alias (add to ~/.bashrc or ~/.zshrc)
alias python=python3
# Or install Python properly
sudo apt install python-is-python3 # Ubuntu/Debian
```
**Problem**: `npm ERR! EACCES: permission denied`
**Solutions**:
```bash
# Use Node Version Manager (recommended)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
nvm install 18
nvm use 18
# Or fix npm permissions
sudo chown -R $(whoami) ~/.npm
```
**Problem**: `Cannot connect to Docker daemon`
**Solutions**:
```bash
# Start Docker service
sudo systemctl start docker
# Add user to docker group
sudo usermod -aG docker $USER
# Then log out and back in
# Or run with sudo (not recommended for development)
sudo docker-compose up
```
**Problem**: `Port already in use`
**Solutions**:
```bash
# Find what's using the port
lsof -i :3000 # or :8001
# Kill the process
kill -9 <PID>  # replace <PID> with the number shown by lsof
# Or use different ports in .env
PORT=8002
# Update frontend to use new backend port
```
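If both default ports are busy, a quick way to pick an alternative is to ask the operating system for a free one (generic Python, unrelated to DeepWiki itself):

```python
import socket

def find_free_port() -> int:
    """Bind to port 0 so the OS assigns an unused TCP port, then report it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

print(find_free_port())
```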
## Next Steps
Configure API keys and environment settings
Set up AI model providers for documentation generation
Create your first repository wiki
Deploy DeepWiki for production use
# Introduction to DeepWiki-Open
Source: https://asyncfunc.mintlify.app/getting-started/introduction
AI-powered documentation generator that creates beautiful, interactive wikis for any GitHub, GitLab, or BitBucket repository
**DeepWiki-Open** is an innovative AI-powered documentation generator that automatically creates comprehensive, interactive wikis for any GitHub, GitLab, or BitBucket repository. Transform your codebase into beautiful, navigable documentation in seconds.
## What is DeepWiki-Open?
DeepWiki-Open analyzes your repository's code structure, understands the relationships between components, and generates:
Comprehensive documentation with code analysis and explanations
Automatic Mermaid diagrams showing architecture and data flow
RAG-powered Q\&A system to chat with your repository
Multi-turn AI research for complex code analysis
## Key Features
### 🚀 Instant Documentation Generation
Simply enter a repository URL and DeepWiki will analyze the code structure, generate comprehensive documentation, and create visual diagrams to explain how everything works.
### 🔐 Private Repository Support
Securely access private repositories with personal access tokens from GitHub, GitLab, or BitBucket.
### 🤖 Multiple AI Model Providers
Choose from leading AI providers:
* **Google Gemini** (default: gemini-2.0-flash)
* **OpenAI** (gpt-4o, o1, o3, o4-mini)
* **OpenRouter** (access to Claude, Llama, Mistral, and more)
* **Azure OpenAI** (enterprise-grade deployment)
* **Ollama** (local open-source models)
* **Amazon Bedrock** (AWS-hosted models)
* **DashScope** (Alibaba Qwen models)
### 💬 Intelligent Q\&A System
The **Ask Feature** uses Retrieval Augmented Generation (RAG) to provide accurate answers based on your repository's actual code.
### 🔍 Advanced Research Capabilities
**DeepResearch** conducts multi-turn investigations for complex topics, providing comprehensive analysis through multiple research iterations.
### 🌍 International Support
Built-in support for 10+ languages including English, Chinese, Japanese, Spanish, Korean, Vietnamese, Portuguese, French, and Russian.
## How It Works
DeepWiki clones and analyzes your repository structure, including private repos with token authentication.
Creates embeddings of the code for smart retrieval and uses AI to generate documentation with context-aware understanding.
Generates Mermaid diagrams to explain code relationships and system architecture.
Organizes everything into a structured, navigable wiki with intelligent Q\&A capabilities.
## Architecture Overview
```mermaid
graph TD
A[User inputs GitHub/GitLab/BitBucket repo] --> AA{Private repo?}
AA -->|Yes| AB[Add access token]
AA -->|No| B[Clone Repository]
AB --> B
B --> C[Analyze Code Structure]
C --> D[Create Code Embeddings]
D --> M{Select Model Provider}
M -->|Google Gemini| E1[Generate with Gemini]
M -->|OpenAI| E2[Generate with OpenAI]
M -->|OpenRouter| E3[Generate with OpenRouter]
M -->|Local Ollama| E4[Generate with Ollama]
M -->|Azure| E5[Generate with Azure]
E1 --> E[Generate Documentation]
E2 --> E
E3 --> E
E4 --> E
E5 --> E
D --> F[Create Visual Diagrams]
E --> G[Organize as Wiki]
F --> G
G --> H[Interactive DeepWiki]
classDef process stroke-width:2px;
classDef data stroke-width:2px;
classDef result stroke-width:2px;
classDef decision stroke-width:2px;
class A,D data;
class AA,M decision;
class B,C,E,F,G,AB,E1,E2,E3,E4,E5 process;
class H result;
```
## Project Structure
The DeepWiki-Open project consists of:
```
src/
├── app/ # Next.js app directory
│ ├── page.tsx # Main application page
│ └── api/ # API routes
├── components/ # React components
│ ├── Ask.tsx # Q&A interface
│ ├── Mermaid.tsx # Diagram renderer
│ └── ...
└── utils/ # Utility functions
```
```
api/
├── main.py # API entry point
├── api.py # FastAPI implementation
├── rag.py # RAG system
├── data_pipeline.py # Data processing
├── config/ # Configuration files
│ ├── generator.json # Model configurations
│ ├── embedder.json # Embedding settings
│ └── repo.json # Repository settings
└── requirements.txt # Python dependencies
```
## Quick Start
Ready to get started? Choose your preferred setup method:
Get up and running in minutes with Docker Compose
Install and configure manually for development
You'll need API keys from your chosen AI provider (Google, OpenAI, etc.) to get started. See the [Environment Variables](/getting-started/environment-variables) guide for details.
## Next Steps
Get DeepWiki running in under 5 minutes
Configure your AI model providers
Learn about all DeepWiki features
Explore the REST API endpoints
# Model Providers Setup
Source: https://asyncfunc.mintlify.app/getting-started/model-providers
Configure multiple AI model providers for optimal documentation generation with DeepWiki-Open
DeepWiki-Open supports multiple AI model providers, each with unique strengths for different documentation needs. This guide covers setup, configuration, and optimization for all supported providers.
## Supported Providers
Fast, reliable, generous free tier
High-quality, detailed documentation
Access to 100+ models through one API
Enterprise-grade with enhanced security
AWS-hosted models with enterprise features
Local, private, cost-free AI models
## Google Gemini
Google's Gemini models offer excellent performance with generous free tiers, making them ideal for getting started.
### Setup
1. Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the generated key (starts with `AIza`)
API key generated and copied
Add to your `.env` file:
```env
GOOGLE_API_KEY=AIzaSyC...your_actual_key_here
```
Never commit API keys to version control. Add `.env` to your `.gitignore`.
Test the configuration by starting DeepWiki:
```bash
python -m api.main
# Should show: "Google API key configured successfully"
```
### Available Models
**Best for:** Most documentation tasks
* **Speed:** Very fast (1-3 seconds per request)
* **Quality:** Excellent for code analysis
* **Context:** 1M+ tokens input, 8K output
* **Cost:** Free tier: 15 RPM, 1M TPM
**Ideal for:**
* General repository documentation
* Quick prototyping and testing
* Regular development workflows
* Small to medium repositories
**Best for:** Stable, proven performance
* **Speed:** Fast (2-4 seconds per request)
* **Quality:** Very good, well-tested
* **Context:** 1M+ tokens input, 8K output
* **Cost:** Free tier: 15 RPM, 1M TPM
**Ideal for:**
* Production environments requiring stability
* Projects where consistency is critical
* Long-term documentation maintenance
**Best for:** Detailed analysis
* **Speed:** Moderate (3-6 seconds per request)
* **Quality:** High detail and accuracy
* **Context:** 32K tokens input/output
* **Cost:** Free tier: 60 RPM
**Ideal for:**
* Complex architectural analysis
* Detailed technical documentation
* Academic or research projects
### Optimization Tips
**Free tier limits:**
* 15 requests per minute (Flash models)
* 60 requests per minute (Pro models)
* 32,000 tokens per minute
**Best practices:**
```javascript
{
"rate_limiting": {
"requests_per_minute": 12, // Stay below limit
"retry_delay": 5, // Wait 5s on rate limit
"batch_processing": true // Process files in batches
}
}
```
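Client-side pacing of this kind can be sketched as a sliding-window limiter. This is illustrative only — DeepWiki does not expose this exact configuration, and the class below is not part of its code:

```python
import time
from collections import deque

class RateLimiter:
    """Track request timestamps and report how long to wait before the next send."""

    def __init__(self, requests_per_minute: int = 12):
        self.rpm = requests_per_minute
        self.sent = deque()  # monotonic timestamps of recent requests

    def acquire(self, now=None) -> float:
        """Return 0.0 if a slot is free, otherwise the seconds to wait."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            self.sent.append(now)
            return 0.0
        # Wait until the oldest request in the window expires.
        return 60 - (now - self.sent[0])
```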
Gemini models have large context windows. Optimize usage:
* **Large repositories**: Use full context for better understanding
* **Complex files**: Include more surrounding context
* **API documentation**: Include related endpoints together
```javascript
{
"context_settings": {
"max_file_size": 100000, // 100KB per file
"include_dependencies": true, // Include related files
"context_overlap": 0.1 // 10% overlap between chunks
}
}
```
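The `context_overlap` idea above — consecutive chunks sharing 10% of their content — can be sketched in a few lines. The parameter names mirror the config, but the function itself is an illustration, not DeepWiki's chunking code:

```python
def chunk_text(text: str, chunk_size: int = 1000, context_overlap: float = 0.1):
    """Split text into fixed-size chunks; each chunk repeats the tail of the previous one."""
    step = max(1, int(chunk_size * (1 - context_overlap)))
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```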
## OpenAI
OpenAI's GPT models provide exceptional quality documentation with advanced reasoning capabilities.
### Setup
1. Sign up at [OpenAI Platform](https://platform.openai.com/)
2. Add payment method (required for API access)
3. Purchase credits or set up billing
4. Navigate to [API Keys](https://platform.openai.com/api-keys)
OpenAI requires a paid account. Free ChatGPT accounts cannot access the API.
1. Click "Create new secret key"
2. Add a name (e.g., "DeepWiki-Development")
3. Copy the key (starts with `sk-`)
4. Store securely (you won't see it again)
API key generated and stored securely
```env
OPENAI_API_KEY=sk-proj-...your_actual_key_here
# Optional: Custom endpoint for compatible services
OPENAI_BASE_URL=https://api.openai.com/v1
```
### Available Models
**Best for:** State-of-the-art documentation generation with advanced reasoning
* **Speed:** Fast to moderate (3-8 seconds per request)
* **Quality:** Next-generation AI capabilities with superior understanding
* **Context:** 256K tokens input/output (estimated)
* **Temperature:** 1.0 (default for creative yet accurate responses)
* **Availability:** Rolling out to API users (check availability in your region)
**Ideal for:**
* Cutting-edge documentation projects
* Complex architectural documentation
* Multi-language codebases
* Advanced technical analysis
* Projects requiring latest AI capabilities
GPT-5 is now the default model in DeepWiki as of commit 05693d5. Ensure your OpenAI account has access to the GPT-5 API.
**Best for:** High-quality, comprehensive documentation
* **Speed:** Moderate (5-10 seconds per request)
* **Quality:** Exceptional writing and analysis
* **Context:** 128K tokens input/output
* **Cost:** $15/1M input tokens, $60/1M output tokens
* **Temperature:** 0.7 (default)
* **Top-p:** 0.8 (default)
**Ideal for:**
* Production documentation
* Complex enterprise applications
* Publication-quality content
* Detailed architectural analysis
* Fallback when GPT-5 is unavailable
**Best for:** Enhanced reasoning and analysis
* **Speed:** Moderate to slow (8-15 seconds)
* **Quality:** Superior technical analysis
* **Context:** 128K tokens
* **Cost:** Premium pricing
**Ideal for:**
* Complex system analysis
* Advanced architectural documentation
* Research and academic projects
**Best for:** Complex problem solving and analysis
* **o1-preview:** Advanced reasoning, slower but thorough
* **o1-mini:** Faster reasoning for simpler tasks
* **Cost:** Higher than standard GPT-4
**Ideal for:**
* Complex debugging documentation
* System optimization analysis
* Security assessment documentation
**Best for:** Budget-conscious high-quality documentation
* **Speed:** Fast (3-6 seconds per request)
* **Quality:** Very good for most tasks
* **Context:** 128K tokens
* **Cost:** Lower than GPT-4o
**Ideal for:**
* Regular documentation updates
* Smaller projects with quality requirements
* Development and testing workflows
### Cost Optimization
Monitor and optimize token consumption:
```javascript
{
"token_optimization": {
"max_input_tokens": 100000, // Limit input size
"target_output_tokens": 4000, // Reasonable output length
"preprocessing": true, // Clean input before sending
"compression": "smart" // Remove redundant content
}
}
```
**Cost calculation example:**
* Large repository: \~200K input tokens, 8K output tokens
* GPT-5 cost: Pricing to be announced (expected similar or slightly higher than GPT-4o)
* GPT-4o cost: $3.00 input + $0.48 output = \$3.48 per generation
* Monthly usage (10 repos): \~\$35-50/month (estimated)
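The GPT-4o arithmetic above can be checked with a couple of lines. The rates are the ones listed in this section; actual provider pricing may change:

```python
def generation_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost in dollars, with rates expressed per 1M tokens."""
    return input_tokens / 1_000_000 * input_rate + output_tokens / 1_000_000 * output_rate

# GPT-4o at $15/1M input and $60/1M output, for a large repository:
cost = generation_cost(200_000, 8_000, 15, 60)
print(f"${cost:.2f} per generation")
```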
**Match model to task complexity:**
1. **Simple projects:** Use o4-mini for cost savings
2. **Standard projects:** Use gpt-5 for latest capabilities or gpt-4o for proven reliability
3. **Complex analysis:** Use gpt-5 for advanced reasoning or o1 series for deep insights
4. **Budget constraints:** Start with o4-mini, upgrade if needed
5. **Cutting-edge needs:** Use gpt-5 for state-of-the-art performance
```javascript
{
"auto_model_selection": {
"repository_size": {
"small": "o4-mini", // < 100 files
"medium": "gpt-5", // 100-1000 files (if available, else gpt-4o)
"large": "gpt-5" // 1000+ files (if available, else gpt-4o)
},
"complexity_factors": [
"multiple_languages",
"microservice_architecture",
"complex_algorithms"
]
}
}
```
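The size thresholds in the config above translate directly into a selection helper. This is a sketch; the GPT-5 fallback behavior is an assumption, not DeepWiki's actual logic:

```python
def select_model(file_count: int, gpt5_available: bool = True) -> str:
    """Pick a model from repository size, falling back to gpt-4o when GPT-5 is unavailable."""
    if file_count < 100:
        return "o4-mini"  # small repositories
    return "gpt-5" if gpt5_available else "gpt-4o"
```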
## OpenRouter
OpenRouter provides access to 100+ AI models through a single API, perfect for comparison and specialized needs.
### Setup
1. Sign up at [OpenRouter](https://openrouter.ai/)
2. Verify your email address
3. Add payment method for paid models
4. Navigate to the Keys section
Some models are free, others require credits. Check individual model pricing.
1. Click "Create Key"
2. Name your key (e.g., "DeepWiki-Prod")
3. Copy the key (starts with `sk-or-`)
4. Optionally set spending limits
OpenRouter API key generated with spending limits configured
```env
OPENROUTER_API_KEY=sk-or-...your_actual_key_here
```
### Popular Models
**Models:** `anthropic/claude-3.5-sonnet`, `anthropic/claude-3-haiku`
**Best for:**
* Excellent code analysis and explanation
* Clear, structured documentation
* Complex reasoning tasks
* Safe, helpful responses
**Pricing:** $3/1M input tokens, $15/1M output tokens (3.5 Sonnet)
**Use cases:**
* API documentation generation
* Code architecture explanation
* Security-focused analysis
**Models:** `google/gemini-pro`, `google/gemini-pro-vision`
**Best for:**
* Multimodal analysis (code + diagrams)
* Fast processing
* Good balance of quality and speed
**Pricing:** Often lower than direct Google API
**Use cases:**
* Visual diagram analysis
* Multi-language projects
* Quick documentation updates
**Models:** `meta-llama/llama-3-70b`, `mistralai/mixtral-8x7b`
**Best for:**
* Cost-effective documentation
* Privacy-conscious projects
* Experimentation and development
**Pricing:** Usually $0.50-$2.00 per 1M tokens
**Use cases:**
* Large-scale documentation projects
* Internal/proprietary code analysis
* Development and testing
**Models:** `deepseek/deepseek-coder`, `phind/phind-codellama`
**Best for:**
* Code-specific analysis
* Programming language expertise
* Technical documentation
**Use cases:**
* Algorithm explanation
* Code optimization documentation
* Programming tutorial generation
### Model Comparison Strategy
Start with a reliable, fast model:
```javascript
{
"baseline_model": "anthropic/claude-3.5-sonnet",
"test_repository": "https://github.com/small/test-repo"
}
```
Compare models for your specific use case:
```javascript
{
"comparison_models": [
"openai/gpt-4o",
"google/gemini-pro",
"meta-llama/llama-3-70b"
],
"evaluation_criteria": [
"accuracy",
"completeness",
"code_understanding",
"diagram_quality",
"cost_per_generation"
]
}
```
Select the best model based on results:
```javascript
{
"selected_model": "anthropic/claude-3.5-sonnet",
"reason": "Best code analysis with reasonable cost",
"fallback_model": "google/gemini-pro",
"reason_fallback": "Faster generation when speed needed"
}
```
## Azure OpenAI
Enterprise-grade OpenAI models with enhanced security, compliance, and control.
### Setup
1. Sign in to [Azure Portal](https://portal.azure.com/)
2. Create new Azure OpenAI resource
3. Choose region (check model availability)
4. Configure pricing tier and network settings
5. Wait for deployment completion
Azure OpenAI may require approval for access. Check the application status.
1. Go to Azure OpenAI Studio
2. Navigate to Deployments
3. Deploy required models (GPT-4, GPT-3.5-turbo, etc.)
4. Note deployment names and endpoints
Models deployed and endpoints configured
Collect the required information:
* **Endpoint:** `https://your-resource.openai.azure.com`
* **API Key:** From resource keys section
* **API Version:** e.g., `2024-02-15-preview`
```env
AZURE_OPENAI_API_KEY=abc123...your_actual_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
```
### Enterprise Features
**Key benefits:**
* Data processed within your Azure tenant
* No data used for model training
* GDPR, SOC 2, HIPAA compliance available
* Private networking with VNet integration
**Configuration:**
```javascript
{
"privacy_settings": {
"data_residency": "eu-west", // Keep data in specific region
"logging": "minimal", // Reduce data logging
"retention": "30_days", // Automatic data deletion
"private_endpoint": true // Use private networking
}
}
```
**Built-in safety features:**
* Automatic content filtering for harmful content
* Customizable filter levels
* Compliance with organizational policies
**Configuration:**
```javascript
{
"content_filter": {
"severity_level": "medium", // low, medium, high
"categories": ["hate", "violence", "self_harm", "sexual"],
"custom_blocklists": ["internal_terms"],
"action_on_filter": "block" // block, warn, log
}
}
```
**Enterprise-grade performance:**
* Dedicated capacity options
* Predictable performance
* Custom rate limits
* Multi-region deployment
**Configuration:**
```javascript
{
"performance_settings": {
"capacity_type": "provisioned", // provisioned vs pay-per-token
"throughput_units": 100, // Dedicated throughput
"auto_scaling": true, // Scale with demand
"load_balancing": "round_robin" // Distribute across regions
}
}
```
## AWS Bedrock
AWS-hosted AI models with enterprise features and AWS service integration.
### Setup
1. Ensure you have an AWS account
2. Enable AWS Bedrock in your region
3. Request access to required models (may require approval)
4. Create IAM user with Bedrock permissions
Bedrock is not available in all AWS regions. Check regional availability.
Create IAM policy for Bedrock access:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream"
],
"Resource": "*"
}
]
}
```
```env
AWS_ACCESS_KEY_ID=AKIA...your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_access_key_here
AWS_REGION=us-east-1
```
AWS credentials configured and Bedrock access verified
### Available Models
**Models:**
* `anthropic.claude-3-sonnet-20240229-v1:0`
* `anthropic.claude-3-haiku-20240307-v1:0`
* `anthropic.claude-3-opus-20240229-v1:0`
**Best for:** Code analysis, documentation, safety-conscious generation
**Pricing:** \$3-15 per 1M tokens depending on model
**Models:**
* `amazon.titan-text-express-v1`
* `amazon.titan-text-lite-v1`
**Best for:** Cost-effective text generation
**Pricing:** \$0.50-2.00 per 1M tokens
**Models:**
* `ai21.j2-ultra-v1`
* `ai21.j2-mid-v1`
**Best for:** Long-form documentation, detailed analysis
**Pricing:** Varies by model
## Ollama (Local Models)
Run AI models locally for complete privacy, cost control, and offline capability.
### Setup
```bash
# Using Homebrew
brew install ollama
# Or download the macOS installer from https://ollama.ai/download
```
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Start as system service
sudo systemctl enable ollama
sudo systemctl start ollama
```
Download and install from [Ollama website](https://ollama.ai/download/windows)
Or use Windows Subsystem for Linux (WSL) with the Linux instructions.
```bash
# Run Ollama in Docker
docker run -d \
-v ollama:/root/.ollama \
-p 11434:11434 \
--name ollama \
ollama/ollama
# Pull and run a model
docker exec -it ollama ollama run llama3:8b
```
Download models you want to use:
```bash
# Recommended models for documentation
ollama pull qwen3:8b # Excellent for code
ollama pull llama3:8b # Good general model
ollama pull qwen3:1.7b # Lightweight option
# Verify installation
ollama list
```
Models downloaded and verified
```env
OLLAMA_HOST=http://localhost:11434
```
For remote Ollama servers:
```env
OLLAMA_HOST=http://ollama-server.internal:11434
```
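To confirm the configured host is reachable, you can send a minimal non-streaming request to Ollama's `/api/generate` endpoint. The host and model values mirror the settings above; the helper function is illustrative:

```python
import json
from urllib.request import Request, urlopen

def build_generate_request(host: str, model: str, prompt: str) -> Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return Request(f"{host}/api/generate", data=body,
                   headers={"Content-Type": "application/json"})

req = build_generate_request("http://localhost:11434", "qwen3:8b", "Say hello")
# urlopen(req) would return a JSON body whose "response" field holds the completion.
```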
### Model Selection
**qwen3:8b (Recommended)**
* **Size:** 4.8GB download
* **RAM:** 8GB required
* **Strengths:** Excellent code understanding, multilingual
* **Best for:** Most documentation tasks
**deepseek-coder:6.7b**
* **Size:** 3.8GB download
* **RAM:** 6GB required
* **Strengths:** Specialized for code generation and analysis
* **Best for:** Technical documentation, API docs
**llama3:8b**
* **Size:** 4.7GB download
* **RAM:** 8GB required
* **Strengths:** Well-balanced, good reasoning
* **Best for:** General documentation, explanations
**llama3:70b** (High-end)
* **Size:** 40GB download
* **RAM:** 64GB+ required
* **Strengths:** Excellent quality, very detailed
* **Best for:** High-quality documentation with powerful hardware
**qwen3:1.7b**
* **Size:** 1.0GB download
* **RAM:** 2GB required
* **Strengths:** Fast, efficient, good for simple tasks
* **Best for:** Quick documentation, low-resource environments
**tinyllama:1.1b**
* **Size:** 637MB download
* **RAM:** 1GB required
* **Strengths:** Very fast, minimal resources
* **Best for:** Testing, simple explanations
### Performance Optimization
**Minimum specs by model size:**
* **1B-3B models:** 4GB RAM, any modern CPU
* **7B-8B models:** 8GB RAM, modern CPU (preferably 8+ cores)
* **13B models:** 16GB RAM, high-performance CPU
* **70B+ models:** 64GB+ RAM, server-grade hardware
**GPU acceleration (optional):**
```bash
# Enable GPU support (NVIDIA)
ollama pull llama3:8b
CUDA_VISIBLE_DEVICES=0 ollama run llama3:8b
# Check GPU usage
nvidia-smi
```
**Optimize memory usage:**
```bash
# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_MAX_QUEUE=4
# Configure model parameters
export OLLAMA_NUM_PARALLEL=2
export OLLAMA_FLASH_ATTENTION=1
```
**Model configuration:**
```javascript
{
"model_config": {
"num_ctx": 4096, // Context window size
"num_predict": 2048, // Max output tokens
"temperature": 0.7, // Randomness
"top_p": 0.8, // Nucleus sampling
"repeat_penalty": 1.1 // Avoid repetition
}
}
```
## Multi-Provider Strategy
### Provider Selection Matrix
| Project Type | Primary | Fallback | Reason |
| --------------- | ------------- | --------------------- | ----------------------- |
| **Open Source** | Google Gemini | OpenRouter | Free tier, good quality |
| **Enterprise** | Azure OpenAI | OpenAI | Security, compliance |
| **Startup** | OpenRouter | Google | Cost optimization |
| **Research** | OpenAI GPT-4o | Claude via OpenRouter | Highest quality |
| **Personal** | Ollama | Google | Privacy, no cost |
| Size | Primary | Reason |
| ----------------------- | ------------------- | ---------------------------- |
| **Small (\<100 files)** | Google Gemini Flash | Fast, sufficient quality |
| **Medium (100-1000)** | OpenAI GPT-4o | Better architecture analysis |
| **Large (1000+)** | Claude 3.5 Sonnet | Excellent at large contexts |
| **Enterprise** | Azure OpenAI | Security and compliance |
| Use Case | Best Provider | Model | Why |
| --------------------- | -------------------------- | ----------------- | ------------------- |
| **API Documentation** | OpenAI | GPT-4o | Structured output |
| **Architecture Docs** | Anthropic (via OpenRouter) | Claude 3.5 Sonnet | System thinking |
| **Code Comments** | Google | Gemini Flash | Speed + accuracy |
| **Security Docs** | Azure OpenAI | GPT-4o | Enterprise security |
| **Cost-Conscious** | Ollama | Qwen3:8b | No API costs |
### Auto-Failover Configuration
```javascript
{
"provider_strategy": {
"primary": {
"provider": "google",
"model": "gemini-2.0-flash",
"timeout": 30
},
"fallback_chain": [
{
"provider": "openrouter",
"model": "anthropic/claude-3.5-sonnet",
"condition": "rate_limit_exceeded"
},
{
"provider": "ollama",
"model": "qwen3:8b",
"condition": "network_error"
}
],
"retry_logic": {
"max_retries": 3,
"backoff_factor": 2,
"jitter": true
}
}
}
```
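In code, a fallback chain like this amounts to trying each provider in order until one succeeds. A generic sketch — the provider callables and exception handling are placeholders, not DeepWiki's implementation:

```python
def generate_with_failover(prompt, providers):
    """Try each (name, callable) provider in order; return the first successful result."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # e.g. rate limit exceeded or network error
            errors[name] = exc
    raise RuntimeError(f"All providers failed: {list(errors)}")
```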
## Next Steps
Set up access control for your DeepWiki deployment
Create your first repository documentation
Deploy with multiple providers for production use
Integrate provider selection into your workflows
# Quick Start Guide
Source: https://asyncfunc.mintlify.app/getting-started/quick-start
Get DeepWiki-Open running in under 5 minutes with Docker or manual setup
Get DeepWiki-Open up and running quickly with either Docker (recommended) or manual setup.
## Docker Setup (Recommended)
The fastest way to get started with DeepWiki-Open is using Docker Compose.
```bash
git clone https://github.com/AsyncFuncAI/deepwiki-open.git
cd deepwiki-open
```
Repository cloned successfully
Create a `.env` file with your API keys:
```bash .env
# Required: Choose at least one AI provider
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
# Optional: Additional providers
OPENROUTER_API_KEY=your_openrouter_api_key
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=your_azure_openai_endpoint
AZURE_OPENAI_VERSION=your_azure_openai_version
OLLAMA_HOST=http://localhost:11434
```
At minimum, you need either `GOOGLE_API_KEY` or `OPENAI_API_KEY` to get started.
```bash
docker-compose up
```
This will start both the backend API server (port 8001) and frontend web app (port 3000).
Open your browser to [http://localhost:3000](http://localhost:3000)
You should see the DeepWiki interface ready to generate your first wiki!
## Manual Setup
For development or custom configurations, you can set up DeepWiki manually.
Create a `.env` file in the project root:
```bash Minimal Setup
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
```
```bash Complete Setup
# AI Provider Keys
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
# Azure OpenAI (Enterprise)
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=your_azure_openai_endpoint
AZURE_OPENAI_VERSION=your_azure_openai_version
# Local/External Services
OLLAMA_HOST=http://localhost:11434
OPENAI_BASE_URL=https://api.openai.com/v1
# Server Configuration
PORT=8001
SERVER_BASE_URL=http://localhost:8001
# Authorization (Optional)
DEEPWIKI_AUTH_MODE=false
DEEPWIKI_AUTH_CODE=your_secret_code
# Logging
LOG_LEVEL=INFO
LOG_FILE_PATH=./api/logs/application.log
```
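Note that `DEEPWIKI_AUTH_MODE=false` arrives as a string in the environment, so any code reading it must parse it explicitly. A hedged sketch of such parsing (not DeepWiki's actual implementation):

```python
def env_flag(value, default=False) -> bool:
    """Interpret common truthy spellings of an environment variable value."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}
```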
```bash
# Install Python dependencies
pip install -r api/requirements.txt
# Start the API server
python -m api.main
```
The API server will start on port 8001 by default.
Open a new terminal and run:
```bash
npm install
npm run dev
```
```bash
yarn install
yarn dev
```
The frontend will be available at [http://localhost:3000](http://localhost:3000)
## Generate Your First Wiki
In the DeepWiki interface, enter a GitHub, GitLab, or BitBucket repository URL:
* `https://github.com/openai/codex`
* `https://github.com/microsoft/autogen`
* `https://gitlab.com/gitlab-org/gitlab`
* `https://bitbucket.org/redradish/atlassian_app_versions`
Start with a smaller repository for your first test to see faster results.
If accessing a private repository:
1. Click "+ Add access tokens"
2. Enter your GitHub, GitLab, or BitBucket personal access token
Ensure your token has appropriate repository access permissions.
Choose your preferred AI model provider and model:
* **Default**: `gemini-2.0-flash`
* **Also available**: `gemini-1.5-flash`, `gemini-1.0-pro`
* **Best for**: Fast, reliable documentation generation
* **Default**: `gpt-4o`
* **Also available**: `o1`, `o3`, `o4-mini`
* **Best for**: High-quality, detailed documentation
* **Access to**: Claude, Llama, Mistral, and 100+ models
* **Best for**: Trying different models without multiple API keys
Click "Generate Wiki" and watch the magic happen!
Generation time varies by repository size. Smaller repos take 30 seconds to 2 minutes, while larger ones may take 5-10 minutes.
## API Key Setup
1. Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Create a new API key
3. Add to `.env` as `GOOGLE_API_KEY=your_key_here`
Google Gemini offers generous free tier limits and fast performance.
1. Visit [OpenAI Platform](https://platform.openai.com/api-keys)
2. Create a new API key
3. Add to `.env` as `OPENAI_API_KEY=your_key_here`
OpenAI requires a paid account for API access. Check your usage limits.
1. Sign up at [OpenRouter](https://openrouter.ai/)
2. Get your API key from the dashboard
3. Add to `.env` as `OPENROUTER_API_KEY=your_key_here`
OpenRouter provides access to 100+ models through one API, great for experimentation.
1. Create an Azure OpenAI resource in [Azure Portal](https://portal.azure.com/)
2. Deploy a model (e.g., GPT-4o)
3. Get your endpoint, API key, and API version
4. Add all three to your `.env` file
Azure OpenAI is ideal for enterprise deployments with enhanced security.
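Pulling the four providers together, a complete `.env` might look like the sketch below. All values are placeholders; set only the providers you actually use.

```env
# Set only the providers you plan to use
GOOGLE_API_KEY=your_google_key_here
OPENAI_API_KEY=your_openai_key_here
OPENROUTER_API_KEY=your_openrouter_key_here
AZURE_OPENAI_API_KEY=your_azure_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
```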
## Verification
Visit [http://localhost:8001/docs](http://localhost:8001/docs) to see the FastAPI documentation and test endpoints.
```bash
# Test API health
curl http://localhost:8001/health
```
The frontend at [http://localhost:3000](http://localhost:3000) should show:
* Repository input field
* Model selection dropdown
* Generate Wiki button
Check that your environment variables are loaded correctly:
```bash
# Start the API and watch the startup logs
python -m api.main
# Expected output:
# INFO - Starting Streaming API on port 8001
# (no warnings about missing API keys for your chosen provider)
```
## Next Steps
Explore wiki generation, Ask feature, and DeepResearch
Set up multiple AI providers and choose the best models
Deploy DeepWiki for production use
Integrate DeepWiki into your existing workflows
## Troubleshooting
**Cannot connect to the backend:**
* Ensure the backend is running on port 8001
* Check firewall settings
* Verify no other services are using port 8001
**API keys not detected:**
* Check `.env` file exists in project root
* Verify API keys are correctly formatted
* Ensure no extra spaces in environment variables
**Provider authentication errors:**
* Double-check API key accuracy
* Verify API key permissions and quotas
* Test API key with provider's documentation
For more detailed troubleshooting, see the [Troubleshooting Guide](/reference/troubleshooting).
# Ask Feature Guide
Source: https://asyncfunc.mintlify.app/guides/ask-feature
Learn how to use the RAG-powered Q&A system to interact with your code documentation
The Ask feature is a powerful RAG (Retrieval Augmented Generation) system that enables natural language conversations with your code documentation. It combines advanced retrieval techniques with AI to provide accurate, contextual answers about your codebase.
## What is the Ask Feature?
The Ask feature transforms your static documentation into an interactive knowledge base where you can:
* **Ask natural questions** about your code, APIs, and documentation
* **Get contextual answers** with relevant code snippets and explanations
* **Maintain conversation history** for follow-up questions
* **Access real-time streaming responses** for immediate feedback
* **Query programmatically** via API for integration into other tools
The Ask feature uses Retrieval Augmented Generation (RAG) to combine the best of both worlds: precise information retrieval from your docs with the natural language capabilities of large language models.
## How RAG Works with Code
### 1. Document Indexing
When your wiki is generated, DeepWiki-Open creates semantic embeddings of your documentation:
```mermaid
graph TD
A[Code Repository] --> B[Documentation Generation]
B --> C[Text Chunking]
C --> D[Embedding Creation]
D --> E[Vector Database]
E --> F[Searchable Index]
```
### 2. Query Processing
When you ask a question, the system:
1. **Converts your question** into a semantic embedding
2. **Searches the vector database** for relevant documentation chunks
3. **Retrieves contextual information** from multiple sources
4. **Generates a comprehensive answer** using the retrieved context
```typescript Query Processing Flow
// 1. Question embedding
const questionEmbedding = await embedQuery(userQuestion);

// 2. Similarity search
const relevantChunks = await vectorStore.similaritySearch(
  questionEmbedding,
  { k: 5, threshold: 0.7 }
);

// 3. Context preparation
const context = relevantChunks.map(chunk => ({
  content: chunk.content,
  source: chunk.metadata.source,
  score: chunk.score
}));

// 4. Response generation
const response = await generateResponse(userQuestion, context);
```
```python Python Example
# Vector similarity search
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

def find_relevant_docs(question, doc_embeddings, docs, top_k=5):
    # With unit-normalized vectors, the dot product equals cosine similarity;
    # doc_embeddings are assumed to be normalized the same way
    question_embedding = model.encode([question], normalize_embeddings=True)
    similarities = np.dot(question_embedding, doc_embeddings.T)
    top_indices = np.argsort(similarities[0])[-top_k:][::-1]
    return [docs[i] for i in top_indices]
```
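To see the ranking logic in isolation, here is a self-contained sketch that skips the embedding model entirely and uses tiny, pre-normalized stand-in vectors. The document names and embeddings are made up for illustration:

```python
def find_relevant_docs(question_embedding, doc_embeddings, docs, top_k=2):
    # With unit-normalized vectors, cosine similarity is just a dot product
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    scored = sorted(
        ((dot(question_embedding, emb), doc) for emb, doc in zip(doc_embeddings, docs)),
        reverse=True,
    )
    return [doc for _, doc in scored[:top_k]]

# Tiny, pre-normalized stand-in embeddings (illustrative only)
docs = ["auth guide", "db setup", "deploy notes"]
doc_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]]
question = [0.9939, 0.1104]  # roughly unit length

print(find_relevant_docs(question, doc_embeddings, docs))  # → ['auth guide', 'deploy notes']
```

In a real pipeline the question vector would come from the same model that produced the document embeddings, as in the example above.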
## Using the Ask Interface
### Basic Question Interface
The Ask feature provides an intuitive chat interface in your generated wiki:
### Conversation Flow
Type your question in natural language in the Ask input field
```
How do I authenticate users in this API?
```
Watch as the system processes your query with streaming responses
```
Searching documentation... ⚡
Found 3 relevant sections... 📚
Generating response... 🤖
```
Get a comprehensive answer with code examples and source references
Continue the conversation with contextual follow-up queries
## Types of Questions That Work Best
### ✅ Excellent Question Types
**How-to questions** — examples:
* "How do I set up authentication?"
* "How to handle errors in the API?"
* "How do I deploy this application?"
**What-is questions** — examples:
* "What is the UserService class?"
* "What does the config.json file contain?"
* "What are the available endpoints?"
**Where-is questions** — examples:
* "Where is the database connection configured?"
* "Where are the API routes defined?"
* "Which file contains the main application logic?"
**Best-practice questions** — examples:
* "What's the recommended way to handle validation?"
* "Best practices for error handling?"
* "How should I structure my tests?"
### ❌ Less Effective Question Types
These types of questions may not yield optimal results:
* Vague questions without context ("How does this work?")
* Questions about code not in your documentation
* Highly specific implementation details not documented
* Questions requiring real-time data or external information
## Example Questions and Responses
### API Documentation Questions
**User:** "How do I authenticate requests to the API?"
**AI:** Based on your API documentation, authentication is handled using JWT tokens. Here's how to authenticate requests:
## Authentication Method
```typescript
// Include the JWT token in the Authorization header
const response = await fetch('/api/users', {
  headers: {
    'Authorization': `Bearer ${jwtToken}`,
    'Content-Type': 'application/json'
  }
});
```
## Getting a Token
First, authenticate with your credentials:
```typescript
const authResponse = await fetch('/api/auth/login', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    email: 'user@example.com',
    password: 'yourpassword'
  })
});

const { token } = await authResponse.json();
```
**Sources:**
* `/docs/api/authentication.md`
* `/examples/auth-example.js`
### Code Structure Questions
**User:** "What's the structure of the database models?"
**AI:** Your application uses several database models organized in the `/models` directory:
## Core Models
```typescript
// User Model
interface User {
  id: string;
  email: string;
  name: string;
  createdAt: Date;
  updatedAt: Date;
}

// Project Model
interface Project {
  id: string;
  title: string;
  description: string;
  ownerId: string;
  members: string[];
}
```
## Model Relationships
```mermaid
graph TD
A[User] -->|owns| B[Project]
A -->|member of| B
B -->|contains| C[Task]
C -->|assigned to| A
```
**Sources:**
* `/models/User.ts`
* `/models/Project.ts`
* `/docs/database-schema.md`
## Conversation History and Context
### Context Retention
The Ask feature maintains conversation context throughout your session:
```typescript
// Example conversation flow
const conversation = [
  {
    role: 'user',
    content: 'How do I set up the database?',
    timestamp: '2024-01-15T10:00:00Z'
  },
  {
    role: 'assistant',
    content: 'To set up the database, you need to...',
    sources: ['/docs/setup.md', '/config/database.js'],
    timestamp: '2024-01-15T10:00:05Z'
  },
  {
    role: 'user',
    content: 'What about migrations?', // Context: still about database setup
    timestamp: '2024-01-15T10:02:00Z'
  }
];
```
### Follow-up Question Examples
**Q:** "What API endpoints are available?"
**A:** Lists all available endpoints with descriptions...
**Follow-up:** "How do I use the user endpoint?"
**Follow-up:** "What's the rate limiting for these endpoints?"
**Follow-up:** "Are there any authentication requirements?"
**Q:** "How does error handling work?"
**A:** Explains the error handling strategy with examples...
**Follow-up:** "What about async errors?"
**Follow-up:** "How do I log these errors?"
**Follow-up:** "Are there custom error types?"
## Streaming Responses and Real-time Interaction
### Streaming Implementation
The Ask feature provides real-time streaming responses for immediate feedback:
```typescript Client-Side Streaming
const askQuestion = async (question: string) => {
  const response = await fetch('/api/ask', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ question })
  });

  const reader = response.body?.getReader();
  if (!reader) throw new Error('Streaming not supported by this response');
  const decoder = new TextDecoder();

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    const chunk = decoder.decode(value);
    const lines = chunk.split('\n');

    for (const line of lines) {
      if (line.startsWith('data: ')) {
        const data = JSON.parse(line.slice(6));
        updateUI(data); // Real-time UI updates
      }
    }
  }
};
```
```python Server-Side Streaming
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import json

app = FastAPI()

@app.post("/api/ask")
async def ask_question(question: dict):
    def generate_response():
        # search_documents and generate_answer are assumed to be defined elsewhere
        yield f"data: {json.dumps({'type': 'status', 'message': 'Searching...'})}\n\n"
        relevant_docs = search_documents(question['question'])
        yield f"data: {json.dumps({'type': 'status', 'message': f'Found {len(relevant_docs)} relevant sections'})}\n\n"

        # Generate streaming response
        for chunk in generate_answer(question['question'], relevant_docs):
            yield f"data: {json.dumps({'type': 'content', 'chunk': chunk})}\n\n"

        yield f"data: {json.dumps({'type': 'complete'})}\n\n"

    # text/event-stream is the conventional media type for SSE-style "data:" lines
    return StreamingResponse(generate_response(), media_type="text/event-stream")
```
### Visual Feedback
The interface provides visual indicators for:
* **Search progress** with animated loading states
* **Token-by-token streaming** for real-time response building
* **Source highlighting** as references are found
* **Completion status** when the response is finished
## API Usage for Programmatic Q\&A
### REST API Endpoints
```bash cURL Example
curl -X POST "https://your-wiki.com/api/ask" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "question": "How do I implement user authentication?",
    "conversation_id": "conv_123",
    "include_sources": true,
    "stream": false
  }'
```
```javascript JavaScript SDK
import { DeepWikiClient } from '@deepwiki/client';

const client = new DeepWikiClient({
  apiKey: 'your-api-key',
  baseUrl: 'https://your-wiki.com'
});

// Simple question
const response = await client.ask({
  question: "How do I set up the development environment?",
  includeSource: true
});

console.log(response.answer);
console.log(response.sources);

// Streaming question
const stream = client.askStreaming({
  question: "Explain the authentication flow",
  conversationId: "conv_123"
});

for await (const chunk of stream) {
  if (chunk.type === 'content') {
    process.stdout.write(chunk.text);
  }
}
```
```python Python SDK
from deepwiki import DeepWikiClient

client = DeepWikiClient(
    api_key="your-api-key",
    base_url="https://your-wiki.com"
)

# Synchronous question
response = client.ask(
    question="What are the available API endpoints?",
    include_sources=True
)

print(response.answer)
for source in response.sources:
    print(f"Source: {source.file} (Score: {source.relevance_score})")

# Asynchronous streaming (run inside an async function)
async def stream_answer():
    async for chunk in client.ask_streaming(
        question="How does error handling work?",
        conversation_id="conv_456"
    ):
        if chunk.type == "content":
            print(chunk.text, end="", flush=True)
```
### API Response Format
```json
{
  "id": "ask_123456789",
  "question": "How do I authenticate users?",
  "answer": "To authenticate users in this API, you use JWT tokens...",
  "sources": [
    {
      "file": "/docs/authentication.md",
      "title": "Authentication Guide",
      "relevance_score": 0.95,
      "excerpt": "JWT tokens are used for authentication..."
    },
    {
      "file": "/examples/auth.js",
      "title": "Authentication Example",
      "relevance_score": 0.87,
      "excerpt": "const token = jwt.sign(payload, secret);"
    }
  ],
  "conversation_id": "conv_123",
  "timestamp": "2024-01-15T10:30:00Z",
  "processing_time_ms": 1250
}
```
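A response like this can be post-processed like any JSON payload. The sketch below, using the field names from the response format above, ranks sources by relevance before display:

```python
# Example payload using the field names from the response format above
payload = {
    "answer": "To authenticate users in this API, you use JWT tokens...",
    "sources": [
        {"file": "/examples/auth.js", "relevance_score": 0.87},
        {"file": "/docs/authentication.md", "relevance_score": 0.95},
    ],
    "processing_time_ms": 1250,
}

# Rank sources by relevance before showing them to the user
ranked = sorted(payload["sources"], key=lambda s: s["relevance_score"], reverse=True)
for source in ranked:
    print(f"{source['file']} (score: {source['relevance_score']:.2f})")
# → /docs/authentication.md (score: 0.95)
# → /examples/auth.js (score: 0.87)
```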
## Best Practices for Effective Questions
### 🎯 Writing Better Questions
**Instead of:** "How does this work?"
**Try:** "How does user authentication work in the login endpoint?"
**Instead of:** "Fix this error"
**Try:** "How do I handle the 'Database connection failed' error when starting the server?"
**Good:** "What configuration options are available for the email service?"
**Less Good:** "How do I integrate with an undocumented third-party service?"
**Instead of:** "How do I create users?"
**Try:** "How do I format a POST request to create a new user with email and password fields?"
### 📝 Question Templates
```
How do I [specific action] in [specific context]?
Examples:
- How do I configure Redis caching in the production environment?
- How do I validate user input in the registration form?
- How do I set up database migrations for the User model?
```
```
What is [specific component/concept] and how does it [function/relate]?
Examples:
- What is the AuthMiddleware class and how does it validate tokens?
- What is the config.database object and how does it connect to PostgreSQL?
- What is the UserService and how does it interact with the database?
```
```
Where can I find [specific functionality/configuration] for [context]?
Examples:
- Where can I find the API rate limiting configuration?
- Where are the email templates stored for password reset?
- Where is the logging configuration for production deployment?
```
```
What's the recommended way to [specific task] in this codebase?
Examples:
- What's the recommended way to handle async errors in API endpoints?
- What's the recommended way to structure test files for components?
- What's the recommended way to manage environment variables?
```
### 💡 Advanced Query Techniques
Break complex questions into parts:
**Instead of:** "How do I set up the database, configure authentication, and deploy to production?"
**Try:**
1. "How do I set up the database for development?"
2. "How do I configure JWT authentication?" (follow-up)
3. "What's the deployment process for production?" (follow-up)
Build on previous answers:
**After asking about authentication:**
* "How do I test these authentication endpoints?"
* "What error codes does this authentication return?"
* "How do I refresh expired tokens mentioned earlier?"
Request concrete examples:
**Good questions:**
* "Can you show me an example of making a POST request to create a user?"
* "What does a typical error response look like?"
* "Can you provide a sample configuration file?"
## Limitations and Troubleshooting
### Known Limitations
**Current limitations of the Ask feature:**
* **Documentation Scope**: Only searches indexed documentation and code files
* **Real-time Data**: Cannot access live databases or external APIs
* **Code Execution**: Cannot run or test code, only explain existing code
* **Version Specificity**: May not distinguish between different versions of your code
* **Complex Logic**: May struggle with highly complex business logic without clear documentation
### Common Issues and Solutions
**Symptoms:** The AI responds with "I couldn't find relevant information"
**Solutions:**
* Check if your question relates to documented features
* Try rephrasing with different terminology
* Ensure your documentation was properly indexed
* Ask more specific questions about documented components
**Example Fix:**
* ❌ "How do I use advanced features?"
* ✅ "How do I configure the email notification system?"
**Symptoms:** The response doesn't match your actual codebase
**Solutions:**
* Verify your documentation is up-to-date
* Re-index your documentation if you've made recent changes
* Provide more context in your question
* Ask follow-up questions for clarification
**Example:**
```
Q: "The authentication process you described doesn't match my code"
Follow-up: "Can you show me specifically how JWT tokens are validated in the middleware?"
```
**Symptoms:** Questions take a long time to process
**Solutions:**
* Check if you're asking very broad questions (try being more specific)
* Verify your internet connection for streaming responses
* Consider if your documentation is very large (may require longer processing)
* Use the API with appropriate timeout settings
**Optimization:**
```javascript
// Set appropriate timeouts for API calls
const response = await client.ask({
  question: "How do I configure logging?",
  timeout: 30000 // 30 seconds
});
```
**Symptoms:** Follow-up questions don't reference previous conversation
**Solutions:**
* Ensure you're using the same conversation session
* Check that cookies/session storage is enabled
* For API usage, pass the same `conversation_id`
* Start a new conversation if context becomes confusing
**API Example:**
```javascript
// Maintain conversation context
const conversationId = 'conv_' + Date.now();

const response1 = await client.ask({
  question: "How do I set up authentication?",
  conversationId
});

const response2 = await client.ask({
  question: "What about authorization for these endpoints?",
  conversationId // Same ID maintains context
});
```
### Performance Optimization Tips
**Optimize your questions for better performance:**
```markdown
✅ Good Performance:
- Specific questions about documented features
- Questions referencing specific file names or classes
- Well-structured questions with clear context
❌ Poor Performance:
- Extremely broad questions requiring full codebase analysis
- Questions about undocumented or external dependencies
- Vague questions without specific context
```
**Optimize API calls:**
```javascript
// Use appropriate parameters
const response = await client.ask({
  question: "How do I configure the database?",
  maxSources: 3, // Limit sources for faster processing
  includeSourceText: false, // Skip full source text if not needed
  stream: true // Use streaming for better perceived performance
});
```
**Manage conversations efficiently:**
```javascript
// Clear conversation context when changing topics
if (isNewTopic(question)) {
  conversationId = generateNewConversationId();
}

// Archive old conversations to improve performance
if (conversationAge > MAX_CONVERSATION_AGE) {
  await client.archiveConversation(conversationId);
  conversationId = generateNewConversationId();
}
```
### Debug Mode
Enable debug mode to troubleshoot issues:
```javascript Client Debug
const client = new DeepWikiClient({
  apiKey: 'your-api-key',
  debug: true // Enables detailed logging
});

const response = await client.ask({
  question: "How does caching work?",
  debug: true // Request debug information
});

console.log('Debug Info:', response.debug);
// Shows: query processing time, sources found, embedding similarity scores
```
```python Python Debug
client = DeepWikiClient(
    api_key="your-api-key",
    debug=True
)

response = client.ask(
    question="How does error handling work?",
    debug=True
)

print("Sources found:", len(response.sources))
print("Processing time:", response.processing_time_ms)
print("Query embedding dimensions:", response.debug.embedding_dimensions)
```
***
**Still having issues?**
* Check the [API Documentation](/api-reference) for detailed endpoint information
* Review [Common Patterns](/guides/common-patterns) for question examples
* Contact support with specific error messages and question examples
* Join our community Discord for real-time help
## Next Steps
Detailed API documentation for programmatic access
Learn how to integrate Ask into your applications
Explore advanced Ask feature capabilities
Tips and patterns for effective documentation Q\&A
Source: https://asyncfunc.mintlify.app/guides/deep-research
# Deep Research Guide
Deep Research is DeepWiki-Open's advanced multi-turn analysis feature that enables comprehensive investigation of complex topics through iterative AI-powered research cycles.
## What is Deep Research?
Deep Research is an intelligent research methodology that breaks down complex questions into manageable components, conducting thorough analysis through multiple iterative turns. Unlike traditional single-response queries, Deep Research employs a systematic approach:
1. **Planning Phase**: AI analyzes the question and creates a structured research plan
2. **Execution Phase**: Multiple research turns explore different aspects
3. **Synthesis Phase**: Findings are consolidated into comprehensive conclusions
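The three phases can be sketched as a simple loop. Everything here (function names, return values) is illustrative, not the actual DeepWiki-Open internals:

```python
def deep_research(question, plan_fn, research_fn, synthesize_fn, max_turns=5):
    """Illustrative plan -> execute -> synthesize cycle."""
    plan = plan_fn(question)                  # 1. Planning phase
    findings = []
    for step in plan[:max_turns]:             # 2. Execution phase: one turn per step
        findings.append(research_fn(step, findings))  # each turn sees prior findings
    return synthesize_fn(question, findings)  # 3. Synthesis phase

# Toy stand-ins for the three phases
summary = deep_research(
    "How has remote work changed operations?",
    plan_fn=lambda q: ["baseline stats", "technology", "management"],
    research_fn=lambda step, prior: f"notes on {step}",
    synthesize_fn=lambda q, notes: "; ".join(notes),
)
print(summary)  # → notes on baseline stats; notes on technology; notes on management
```

The key structural point is that each execution turn receives the accumulated findings, which is what lets later turns build on earlier ones.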
### Core Components
* **Multi-turn Analysis**: Sequential research iterations building on previous findings
* **Adaptive Planning**: Research direction adjusts based on discovered information
* **Source Integration**: Combines multiple authoritative sources
* **Progressive Refinement**: Each turn deepens understanding and fills knowledge gaps
## Deep Research vs Regular Ask
| Feature | Regular Ask | Deep Research |
| -------------- | ------------------------------- | ----------------------------- |
| **Scope** | Single focused question | Complex, multi-faceted topics |
| **Process** | One-turn response | Multi-turn iterative analysis |
| **Depth** | Surface-level or specific facts | Comprehensive investigation |
| **Sources** | Limited source consultation | Extensive source integration |
| **Time** | Immediate response | Extended research process |
| **Complexity** | Simple to moderate | High complexity topics |
### When to Use Deep Research
**Choose Deep Research for:**
* Complex historical events with multiple perspectives
* Scientific topics requiring interdisciplinary analysis
* Market research spanning multiple industries
* Policy analysis with various stakeholder viewpoints
* Technical investigations requiring multiple sources
* Comparative studies across different domains
**Choose Regular Ask for:**
* Quick fact checks
* Simple definitions
* Specific date/number queries
* Direct procedural questions
* Basic how-to inquiries
## The Deep Research Process
### 1. Research Planning
The AI begins by analyzing your question and creating a structured research plan:
```json
{
  "research_question": "How has climate change affected global food security?",
  "research_plan": [
    "Analyze climate change impacts on agriculture",
    "Examine regional variations in food production",
    "Investigate supply chain disruptions",
    "Study adaptation strategies and technologies",
    "Evaluate policy responses and effectiveness"
  ],
  "estimated_turns": "4-6",
  "key_sources": ["IPCC reports", "FAO data", "agricultural journals"]
}
```
### 2. Iterative Research Turns
Each research turn focuses on specific aspects:
**Turn 1: Foundation Building**
* Establish core concepts and definitions
* Identify primary data sources
* Map the research landscape
**Turn 2-N: Deep Dive Analysis**
* Explore specific subtopics
* Gather supporting evidence
* Identify patterns and connections
**Final Turn: Synthesis**
* Integrate findings across all turns
* Draw comprehensive conclusions
* Highlight limitations and future research needs
### 3. Progress Updates
Throughout the process, you receive real-time updates:
```
🔍 Research Progress: Turn 2/5
📊 Sources analyzed: 23
🎯 Current focus: Regional climate impacts on crop yields
⏱️ Estimated completion: 3-4 minutes
```
## Types of Questions Suitable for Deep Research
### Scientific & Technical Topics
**Example**: "What are the latest developments in quantum computing and their potential applications?"
**Research Approach**:
* Current quantum computing technologies
* Recent breakthroughs and publications
* Industry applications and use cases
* Challenges and limitations
* Future prospects and timeline
### Historical Analysis
**Example**: "How did the Silk Road influence cultural exchange between East and West?"
**Research Approach**:
* Historical context and timeline
* Trade routes and major cities
* Cultural, religious, and technological exchanges
* Economic impacts on civilizations
* Long-term historical significance
### Policy & Social Issues
**Example**: "What are the effects of universal basic income pilot programs worldwide?"
**Research Approach**:
* UBI pilot program overview
* Implementation models across countries
* Economic outcomes and metrics
* Social and behavioral impacts
* Policy implications and scalability
### Business & Market Analysis
**Example**: "How is artificial intelligence transforming the healthcare industry?"
**Research Approach**:
* AI applications in healthcare sectors
* Market size and growth projections
* Key players and technologies
* Regulatory challenges and compliance
* Patient outcomes and case studies
## Research Stages and Iteration Process
### Stage 1: Question Decomposition
```mermaid
graph TD
A[Complex Question] --> B[Identify Key Components]
B --> C[Create Research Subtopics]
C --> D[Prioritize Investigation Areas]
D --> E[Establish Success Criteria]
```
### Stage 2: Information Gathering
Each turn systematically explores different aspects:
1. **Primary Source Analysis**: Academic papers, official reports, government data
2. **Cross-referencing**: Validating information across multiple sources
3. **Trend Identification**: Recognizing patterns and emerging themes
4. **Gap Analysis**: Identifying areas needing additional investigation
### Stage 3: Synthesis and Conclusions
```mermaid
graph LR
A[Turn 1 Findings] --> D[Synthesis Engine]
B[Turn 2 Findings] --> D
C[Turn N Findings] --> D
D --> E[Integrated Analysis]
E --> F[Comprehensive Conclusions]
```
## Example Deep Research Session
### Question: "How has remote work adoption changed business operations post-pandemic?"
#### Research Plan
```
📋 Research Plan (5 turns estimated)
1. Pre-pandemic vs current remote work statistics
2. Technology infrastructure adaptations
3. Management and productivity impacts
4. Employee experience and workplace culture
5. Long-term business strategy changes
```
#### Turn 1: Statistical Foundation
```
🔍 Turn 1/5: Baseline Statistics
📊 Key Findings:
- Remote work adoption increased from 24% (2019) to 71% (2023)
- Hybrid models now represent 49% of work arrangements
- Technology spending on collaboration tools up 300%
📚 Sources: Bureau of Labor Statistics, McKinsey Global Institute, Gartner
⏱️ Next: Technology infrastructure analysis
```
#### Turn 2: Technology Infrastructure
```
🔍 Turn 2/5: Technology Adaptations
🛠️ Key Findings:
- Cloud infrastructure adoption accelerated by 3-5 years
- Cybersecurity spending increased 40% for remote access
- Video conferencing usage up 2000% since 2020
💡 Emerging Trend: AI-powered productivity tools gaining adoption
⏱️ Next: Management impact analysis
```
#### Turn 3: Management Evolution
```
🔍 Turn 3/5: Management & Productivity
📈 Key Findings:
- 67% of managers report improved productivity metrics
- Performance evaluation shifted to outcome-based models
- Middle management roles evolved toward coaching
⚠️ Challenge: Maintaining team cohesion and company culture
⏱️ Next: Employee experience deep dive
```
#### Turn 4: Employee Experience
```
🔍 Turn 4/5: Employee Perspectives
👥 Key Findings:
- 87% prefer hybrid/remote work options
- Work-life balance improved for 74% of workers
- Career development concerns increased 23%
🔄 Adaptation: New onboarding and mentorship programs
⏱️ Next: Strategic business implications
```
#### Turn 5: Strategic Business Changes
```
🔍 Turn 5/5: Long-term Strategy Impact
🏢 Key Findings:
- Office space reduction: average 30% smaller footprint
- Geographic talent acquisition expanded 340%
- Cost savings average $11K per remote employee annually
🎯 Strategic Shift: From location-based to talent-based hiring
⏱️ Finalizing comprehensive analysis...
```
#### Final Synthesis
```
📋 COMPREHENSIVE ANALYSIS: Remote Work Business Transformation
🔍 EXECUTIVE SUMMARY
The pandemic-driven shift to remote work has fundamentally transformed business operations,
creating a permanent change in workplace dynamics with significant strategic implications.
📊 KEY TRANSFORMATIONS
1. Operational Model: Hybrid-first approach now standard (71% adoption)
2. Technology Infrastructure: Accelerated digital transformation ($2.3T investment)
3. Management Philosophy: Outcome-based performance evaluation
4. Talent Strategy: Geographic boundaries eliminated for recruitment
5. Cost Structure: Average 20-30% reduction in operational expenses
🎯 STRATEGIC IMPLICATIONS
- Companies maintaining remote flexibility report 25% better talent retention
- Productivity gains sustained at 13-50% above pre-pandemic levels
- New competitive advantages through expanded talent pools
- Risk mitigation through distributed workforce models
⚠️ ONGOING CHALLENGES
- Maintaining organizational culture and team cohesion
- Ensuring equitable career development opportunities
- Managing cybersecurity in distributed environments
- Balancing flexibility with collaboration needs
🔮 FUTURE OUTLOOK
The remote work transformation appears permanent, with 83% of companies
planning long-term hybrid models. Success factors include:
- Investment in collaborative technologies
- Adaptive management training
- Culture-building initiatives
- Flexible policy frameworks
📚 Research based on 47 sources including McKinsey, Deloitte, Harvard Business Review,
and primary data from Fortune 500 companies.
```
## API Usage for Automated Research
### Basic Deep Research Request
```python
import requests

# Start a deep research session; use json= so the body is sent as JSON,
# not form-encoded
response = requests.post(
    'https://api.deepwikiopen.com/v1/deep-research',
    json={
        'question': 'How is blockchain technology affecting supply chain management?',
        'max_turns': 6,
        'focus_areas': ['technology', 'case_studies', 'challenges', 'future_trends'],
        'source_preferences': ['academic', 'industry_reports', 'case_studies']
    }
)

research_id = response.json()['research_id']
```
### Monitor Research Progress
```python
# Check research status
status_response = requests.get(f'https://api.deepwikiopen.com/v1/deep-research/{research_id}/status')
progress = status_response.json()
print(f"Progress: {progress['current_turn']}/{progress['total_turns']}")
print(f"Current focus: {progress['current_focus']}")
print(f"Sources analyzed: {progress['sources_count']}")
```
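Because a session runs for several minutes, a client typically polls the status endpoint until it completes. A hedged sketch: the `fetch_status` callable stands in for the GET request above, and the `state` field name is an assumption rather than a documented response field:

```python
import time

def wait_for_research(research_id, fetch_status, interval=15, timeout=1800):
    """Poll until the research session reports completion or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = fetch_status(research_id)  # e.g. wraps the GET .../status request
        if status.get("state") == "complete":  # 'state' is an assumed field name
            return status
        print(f"Turn {status['current_turn']}/{status['total_turns']}: {status['current_focus']}")
        time.sleep(interval)
    raise TimeoutError(f"Research {research_id} did not finish within {timeout}s")
```

In practice `fetch_status` would be something like `lambda rid: requests.get(f"{base}/deep-research/{rid}/status").json()`.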
### Retrieve Results
```python
# Get final results
results_response = requests.get(f'https://api.deepwikiopen.com/v1/deep-research/{research_id}/results')
research_data = results_response.json()

# Access structured findings
for turn in research_data['turns']:
    print(f"Turn {turn['number']}: {turn['focus']}")
    print(f"Key findings: {turn['key_findings']}")
    print(f"Sources: {len(turn['sources'])}")

# Get final synthesis
synthesis = research_data['synthesis']
print(f"Comprehensive analysis: {synthesis['executive_summary']}")
```
### Advanced API Configuration
```python
# Advanced research configuration
advanced_config = {
    'question': 'Impact of AI on financial services',
    'research_parameters': {
        'depth_level': 'comprehensive',  # standard, comprehensive, exhaustive
        'time_constraint': '15_minutes',
        'source_diversity': 'high',
        'regional_focus': ['north_america', 'europe', 'asia'],
        'time_range': '2020-2024',
        'confidence_threshold': 0.8
    },
    'output_preferences': {
        'include_citations': True,
        'visual_elements': ['charts', 'timelines'],
        'summary_length': 'detailed',
        'technical_level': 'expert'
    }
}
```
## Performance Considerations
### Research Duration
| Question Complexity | Typical Turns | Duration | Sources |
| ------------------- | ------------- | ------------- | -------- |
| **Simple** | 2-3 turns | 2-4 minutes | 10-20 |
| **Moderate** | 3-5 turns | 5-8 minutes | 25-50 |
| **Complex** | 5-8 turns | 8-15 minutes | 50-100+ |
| **Exhaustive** | 8-12 turns | 15-25 minutes | 100-200+ |
### Optimization Strategies
**For Faster Results:**
* Use focused question framing
* Specify known time constraints
* Limit source types if appropriate
* Set explicit scope boundaries
**For Deeper Analysis:**
* Allow flexible turn limits
* Include interdisciplinary sources
* Enable follow-up question generation
* Request comprehensive citations
### Resource Usage
```python
# Monitor token consumption
research_stats = {
    'input_tokens': 2500,        # Question + context
    'processing_tokens': 45000,  # Multi-turn analysis
    'output_tokens': 8000,       # Final synthesis
    'total_cost': '$0.32'        # Estimated API cost
}
```
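Those token counts can be turned into a cost estimate with a small helper. A sketch only: the per-1,000-token rates below are placeholder values, not real pricing, so substitute your provider's actual rates.

```python
def estimate_cost(input_tokens, processing_tokens, output_tokens,
                  input_rate=0.005, processing_rate=0.005, output_rate=0.015):
    """Estimated cost in dollars. Rates are per 1,000 tokens and are
    placeholders; replace them with your provider's published pricing."""
    return (input_tokens / 1000 * input_rate
            + processing_tokens / 1000 * processing_rate
            + output_tokens / 1000 * output_rate)
```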
## Best Practices and Optimization Tips
### Question Formulation
**✅ Effective Deep Research Questions:**
* "How has cryptocurrency regulation evolved globally and what are the implications for adoption?"
* "What are the environmental and economic impacts of vertical farming technologies?"
* "How do different countries approach data privacy legislation and enforcement?"
**❌ Less Suitable for Deep Research:**
* "What is the capital of France?" (too simple)
* "How do I change a tire?" (procedural, not analytical)
* "What's the weather today?" (requires current, real-time data rather than research)
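The distinction above can be approximated in client code with a simple pre-flight heuristic before choosing between Regular Ask and Deep Research. This is a hypothetical sketch, not part of the API, and a real suitability check would be far more nuanced:

```python
def is_deep_research_candidate(question):
    """Rough heuristic: analytical, multi-faceted questions tend to open
    with 'how'/'why', mention impact or comparison, and run long; short
    factual or procedural questions do not."""
    q = question.lower()
    analytical_markers = ('how ', 'why ', 'impact', 'implications',
                          'evolved', 'compare', 'approach')
    is_analytical = any(marker in q for marker in analytical_markers)
    is_substantial = len(q.split()) >= 8
    return is_analytical and is_substantial
```

Questions that fail the check can be routed to Regular Ask, mirroring the complexity-assessment branch in the flowchart later in this section.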
### Scope Management
```markdown
## Research Scope Framework
### 🎯 Well-Scoped Questions
- Clear temporal boundaries (e.g., "since 2020", "in the next decade")
- Defined geographic focus (e.g., "in developed nations", "across Asia-Pacific")
- Specific industry/domain (e.g., "in healthcare", "for SMB companies")
- Measurable aspects (e.g., "economic impact", "user adoption rates")
### ⚠️ Overly Broad Questions
- "How does technology affect society?" → Too general
- "What is the future of business?" → Too vague
- "How does everything connect?" → No clear scope
```
### Source Quality Optimization
**High-Quality Source Preferences:**
* Peer-reviewed academic papers
* Government and institutional reports
* Industry analysis from recognized firms
* Primary research and surveys
* Expert interviews and case studies
**Source Validation Indicators:**
* Author credentials and affiliations
* Publication date and relevance
* Citation count and impact factor
* Methodology transparency
* Cross-reference validation
### Iterative Refinement
```python
# Example of research refinement workflow
refinement_process = {
    'initial_question': 'How is AI changing education?',
    'refined_questions': [
        'How is AI being implemented in K-12 education systems?',
        'What are the learning outcomes of AI-powered educational tools?',
        'How do educators perceive AI integration in classrooms?',
        'What are the equity implications of AI in education?'
    ],
    'focus_areas': [
        'implementation_models',
        'effectiveness_metrics',
        'teacher_training',
        'accessibility_concerns'
    ]
}
```
### Quality Assurance
**Research Quality Metrics:**
* **Source Diversity**: Multiple perspectives and methodologies
* **Temporal Coverage**: Recent and historical context
* **Evidence Strength**: Primary data and expert consensus
* **Bias Mitigation**: Balanced viewpoint representation
* **Completeness**: Addressing all aspects of the question
### Common Pitfalls to Avoid
1. **Question Too Narrow**: Results in shallow, limited analysis
2. **Unrealistic Time Expectations**: Complex topics need adequate research time
3. **Source Bias**: Over-relying on single source types or viewpoints
4. **Scope Creep**: Allowing research to expand beyond useful boundaries
5. **Insufficient Context**: Not providing relevant background information
### Integration with Regular Ask
```mermaid
flowchart TD
A[User Question] --> B{Complexity Assessment}
B -->|Simple/Direct| C[Regular Ask]
B -->|Complex/Multi-faceted| D[Deep Research]
C --> E[Immediate Response]
D --> F[Multi-turn Analysis]
F --> G[Comprehensive Report]
G --> H{Follow-up Needed?}
H -->|Yes| I[Additional Regular Ask]
H -->|No| J[Research Complete]
```
## Advanced Features
### Real-time Research Updates
Enable live progress tracking:
```javascript
// WebSocket connection for real-time updates
const ws = new WebSocket('wss://api.deepwikiopen.com/v1/research-stream');
ws.onmessage = function(event) {
  const update = JSON.parse(event.data);
  console.log(`Turn ${update.turn}: ${update.status}`);
  console.log(`Sources analyzed: ${update.sources_count}`);
  console.log(`Current focus: ${update.current_focus}`);
};
```
### Collaborative Research
Multiple researchers can contribute to the same deep research session:
```python
# Add collaborative researchers
collaboration_config = {
    'research_id': research_id,
    'collaborators': [
        {'role': 'domain_expert', 'focus': 'technical_analysis'},
        {'role': 'market_analyst', 'focus': 'commercial_implications'},
        {'role': 'policy_expert', 'focus': 'regulatory_landscape'}
    ],
    'coordination_mode': 'parallel_with_synthesis'
}
```
### Research Templates
Use pre-configured research templates for common analysis types:
```python
# Market analysis template
market_template = {
    'template_id': 'market_analysis_v2',
    'standard_turns': [
        'market_size_and_trends',
        'competitive_landscape',
        'customer_segments',
        'regulatory_environment',
        'future_projections'
    ],
    'required_sources': ['industry_reports', 'financial_data', 'market_surveys'],
    'output_format': 'executive_summary_with_appendices'
}
```
## Conclusion
Deep Research represents a paradigm shift in AI-powered information analysis, enabling comprehensive investigation of complex topics through systematic, iterative research processes. By understanding when and how to leverage this powerful feature, users can obtain insights that go far beyond traditional search and single-turn AI responses.
The key to successful deep research lies in proper question formulation, appropriate scope management, and leveraging the full potential of multi-turn analysis. As AI capabilities continue to evolve, Deep Research will become an increasingly valuable tool for researchers, analysts, and decision-makers across all industries.
***
*For more information about DeepWikiOpen's Deep Research capabilities, visit our [API documentation](/api/deep-research) or explore our [example research sessions](/examples/deep-research).*
# Docker Deployment Guide
Source: https://asyncfunc.mintlify.app/guides/docker-deployment
This comprehensive guide covers everything you need to know about deploying DeepWikiOpen using Docker, from basic setups to production-grade deployments.
## Table of Contents
* [Prerequisites](#prerequisites)
* [Quick Start with Pre-built Images](#quick-start-with-pre-built-images)
* [Building Custom Images](#building-custom-images)
* [Docker Compose Setup](#docker-compose-setup)
* [Environment Configuration](#environment-configuration)
* [Volume Mounts and Data Persistence](#volume-mounts-and-data-persistence)
* [Container Networking](#container-networking)
* [Health Checks and Monitoring](#health-checks-and-monitoring)
* [Scaling Strategies](#scaling-strategies)
* [Security Considerations](#security-considerations)
* [Troubleshooting](#troubleshooting)
* [Production Deployments](#production-deployments)
## Prerequisites
Before you begin, ensure you have the following installed:
* **Docker** (version 20.10+)
* **Docker Compose** (version 2.0+)
* **Git** (for cloning repositories)
### Installation Verification
```bash
# Check Docker version
docker --version
docker-compose --version
# Verify Docker is running
docker info
```
## Quick Start with Pre-built Images
### Using GitHub Container Registry
DeepWikiOpen provides pre-built images through GitHub Container Registry (GHCR). This is the fastest way to get started.
```bash
# Pull the latest image
docker pull ghcr.io/your-org/deepwikiopen:latest
# Run a simple container
docker run -d \
--name deepwikiopen \
-p 3000:3000 \
-e NODE_ENV=production \
ghcr.io/your-org/deepwikiopen:latest
```
### Available Image Tags
| Tag | Description | Use Case |
| --------- | --------------------- | --------------------------------- |
| `latest` | Latest stable release | Production |
| `develop` | Development branch | Testing |
| `v1.2.3` | Specific version | Production pinning |
| `slim` | Minimal image size | Resource-constrained environments |
### Basic Docker Run Command
```bash
docker run -d \
--name deepwikiopen-app \
--restart unless-stopped \
-p 3000:3000 \
-e DATABASE_URL="postgresql://user:pass@host:5432/db" \
-e JWT_SECRET="your-secret-key" \
-e REDIS_URL="redis://redis:6379" \
-v deepwikiopen-data:/app/data \
ghcr.io/your-org/deepwikiopen:latest
```
## Building Custom Images
### Dockerfile Explanation
Here's a production-ready Dockerfile with explanations:
```dockerfile
# Use Node.js LTS Alpine for smaller image size
FROM node:18-alpine AS base
# Install system dependencies
RUN apk add --no-cache \
    curl \
    python3 \
    make \
    g++ \
    cairo-dev \
    jpeg-dev \
    pango-dev \
    musl-dev \
    giflib-dev \
    pixman-dev \
    pangomm-dev \
    libjpeg-turbo-dev \
    freetype-dev
# Set working directory
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install dependencies
FROM base AS dependencies
RUN npm ci --omit=dev
# Development dependencies for building
FROM base AS dev-dependencies
RUN npm ci
# Build stage
FROM dev-dependencies AS build
COPY . .
RUN npm run build
RUN npm run test:unit
# Production stage
FROM base AS production
# Create non-root user for security
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001
# Copy production dependencies
COPY --from=dependencies /app/node_modules ./node_modules
# Copy built application
COPY --from=build --chown=nextjs:nodejs /app/.next ./.next
COPY --from=build --chown=nextjs:nodejs /app/public ./public
COPY --from=build --chown=nextjs:nodejs /app/package.json ./package.json
# Create data directory
RUN mkdir -p /app/data && chown -R nextjs:nodejs /app/data
# Switch to non-root user
USER nextjs
# Expose port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/api/health || exit 1
# Start the application
CMD ["npm", "start"]
```
### Building the Image
```bash
# Build with default tag
docker build -t deepwikiopen:local .
# Build with specific tag and build args
docker build \
--build-arg NODE_ENV=production \
--build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ') \
-t deepwikiopen:v1.0.0 .
# Multi-platform build
docker buildx build \
--platform linux/amd64,linux/arm64 \
-t deepwikiopen:multi-arch \
--push .
```
### Optimized Development Dockerfile
```dockerfile
FROM node:18-alpine AS development
# Install development tools
RUN apk add --no-cache \
    git \
    curl \
    vim
WORKDIR /app
# Copy package files
COPY package*.json ./
RUN npm install
# Copy source code
COPY . .
# Expose port and start dev server
EXPOSE 3000
CMD ["npm", "run", "dev"]
```
## Docker Compose Setup
### Complete Production Setup
```yaml
# docker-compose.yml
version: '3.8'

services:
  # Main application
  app:
    image: ghcr.io/your-org/deepwikiopen:latest
    container_name: deepwikiopen-app
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://postgres:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
      - REDIS_URL=redis://redis:6379
      - JWT_SECRET=${JWT_SECRET}
      - NEXTAUTH_URL=${NEXTAUTH_URL}
      - NEXTAUTH_SECRET=${NEXTAUTH_SECRET}
    volumes:
      - app-data:/app/data
      - app-logs:/app/logs
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    networks:
      - deepwikiopen-network
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # PostgreSQL database
  postgres:
    image: postgres:15-alpine
    container_name: deepwikiopen-postgres
    restart: unless-stopped
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_INITDB_ARGS=--encoding=UTF-8 --lc-collate=C --lc-ctype=C
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./docker/postgres/init:/docker-entrypoint-initdb.d
    ports:
      - "5432:5432"
    networks:
      - deepwikiopen-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  # Redis cache
  redis:
    image: redis:7-alpine
    container_name: deepwikiopen-redis
    restart: unless-stopped
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
      - ./docker/redis/redis.conf:/usr/local/etc/redis/redis.conf
    ports:
      - "6379:6379"
    networks:
      - deepwikiopen-network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3

  # Elasticsearch for search
  elasticsearch:
    image: elasticsearch:8.8.0
    container_name: deepwikiopen-elasticsearch
    restart: unless-stopped
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
      - xpack.security.enabled=false
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    networks:
      - deepwikiopen-network
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:9200/_cluster/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Nginx reverse proxy
  nginx:
    image: nginx:alpine
    container_name: deepwikiopen-nginx
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./docker/nginx/nginx.conf:/etc/nginx/nginx.conf
      - ./docker/nginx/conf.d:/etc/nginx/conf.d
      - ./docker/ssl:/etc/nginx/ssl
      - nginx-logs:/var/log/nginx
    depends_on:
      - app
    networks:
      - deepwikiopen-network

  # Monitoring with Prometheus
  prometheus:
    image: prom/prometheus:latest
    container_name: deepwikiopen-prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./docker/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
    networks:
      - deepwikiopen-network

  # Grafana for visualization
  grafana:
    image: grafana/grafana:latest
    container_name: deepwikiopen-grafana
    restart: unless-stopped
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana-data:/var/lib/grafana
      - ./docker/grafana/provisioning:/etc/grafana/provisioning
    networks:
      - deepwikiopen-network

volumes:
  app-data:
    driver: local
  app-logs:
    driver: local
  postgres-data:
    driver: local
  redis-data:
    driver: local
  elasticsearch-data:
    driver: local
  nginx-logs:
    driver: local
  prometheus-data:
    driver: local
  grafana-data:
    driver: local

networks:
  deepwikiopen-network:
    driver: bridge
    ipam:
      config:
        - subnet: 172.20.0.0/16
```
### Development Compose Setup
```yaml
# docker-compose.dev.yml
version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile.dev
    container_name: deepwikiopen-dev
    ports:
      - "3000:3000"
      - "9229:9229" # Node.js debugger
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/deepwikiopen_dev
      - REDIS_URL=redis://redis:6379
    volumes:
      - .:/app
      - /app/node_modules
      - dev-logs:/app/logs
    depends_on:
      - postgres
      - redis
    networks:
      - dev-network
    command: npm run dev

  postgres:
    image: postgres:15-alpine
    container_name: deepwikiopen-postgres-dev
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=deepwikiopen_dev
    volumes:
      - postgres-dev-data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    networks:
      - dev-network

  redis:
    image: redis:7-alpine
    container_name: deepwikiopen-redis-dev
    ports:
      - "6379:6379"
    networks:
      - dev-network

volumes:
  postgres-dev-data:
  dev-logs:

networks:
  dev-network:
    driver: bridge
```
## Environment Configuration
### Environment Variables Structure
Create a comprehensive `.env` file:
```bash
# .env
# Application Settings
NODE_ENV=production
PORT=3000
APP_URL=https://your-domain.com
APP_NAME="DeepWikiOpen"
# Database Configuration
DATABASE_URL=postgresql://username:password@postgres:5432/deepwikiopen
DB_HOST=postgres
DB_PORT=5432
DB_NAME=deepwikiopen
DB_USER=username
DB_PASSWORD=secure_password
# Redis Configuration
REDIS_URL=redis://redis:6379
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=redis_password
# Authentication
JWT_SECRET=your-super-secret-jwt-key-change-this
NEXTAUTH_URL=https://your-domain.com
NEXTAUTH_SECRET=another-super-secret-key
# OAuth Providers
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
GOOGLE_CLIENT_ID=your_google_client_id
GOOGLE_CLIENT_SECRET=your_google_client_secret
# Email Configuration
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_USER=your-email@gmail.com
SMTP_PASS=your-app-password
# Storage Configuration
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-s3-bucket
# Monitoring
SENTRY_DSN=https://your-sentry-dsn
NEW_RELIC_LICENSE_KEY=your-new-relic-key
# Security
CORS_ORIGIN=https://your-domain.com
RATE_LIMIT_MAX=100
RATE_LIMIT_WINDOW_MS=900000
# Feature Flags
ENABLE_SEARCH=true
ENABLE_ANALYTICS=true
ENABLE_NOTIFICATIONS=true
```
### Environment Files for Different Stages
```bash
# .env.local (development)
NODE_ENV=development
DATABASE_URL=postgresql://postgres:password@localhost:5432/deepwikiopen_dev
REDIS_URL=redis://localhost:6379
JWT_SECRET=dev-secret
# .env.staging
NODE_ENV=staging
DATABASE_URL=postgresql://user:pass@staging-db:5432/deepwikiopen_staging
REDIS_URL=redis://staging-redis:6379
# .env.production
NODE_ENV=production
DATABASE_URL=postgresql://user:pass@prod-db:5432/deepwikiopen
REDIS_URL=redis://prod-redis:6379
```
### Docker Compose Environment Override
```yaml
# docker-compose.override.yml
version: '3.8'

services:
  app:
    environment:
      - DEBUG=true
      - LOG_LEVEL=debug
    volumes:
      - ./logs:/app/logs
```
## Volume Mounts and Data Persistence
### Volume Types and Use Cases
```yaml
# Named volumes (recommended for production)
volumes:
  # Database data persistence
  postgres-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/postgres

  # Application data
  app-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/data

  # Logs
  app-logs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/logs

  # Backups
  backup-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/backups
```
### Bind Mounts for Development
```yaml
services:
  app:
    volumes:
      # Source code (development)
      - ./src:/app/src
      - ./public:/app/public
      - ./package.json:/app/package.json
      # Configuration files
      - ./config:/app/config
      - ./docker/app/entrypoint.sh:/app/entrypoint.sh
      # Exclude node_modules
      - /app/node_modules
```
### NFS Volumes for Multi-Host Setup
```yaml
volumes:
  shared-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nfs.example.com,rw
      device: ":/path/to/shared/data"
```
### Volume Backup Strategy
```bash
#!/bin/bash
# Backup script
# Create backup directory
mkdir -p /backups/$(date +%Y%m%d)
# Backup PostgreSQL
docker exec deepwikiopen-postgres pg_dump -U postgres deepwikiopen > \
/backups/$(date +%Y%m%d)/postgres-backup.sql
# Backup volumes
docker run --rm \
-v deepwikiopen_postgres-data:/source \
-v /backups/$(date +%Y%m%d):/backup \
alpine tar czf /backup/postgres-data.tar.gz -C /source .
docker run --rm \
-v deepwikiopen_app-data:/source \
-v /backups/$(date +%Y%m%d):/backup \
alpine tar czf /backup/app-data.tar.gz -C /source .
```
## Container Networking
### Custom Bridge Network
```yaml
networks:
  deepwikiopen-network:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.20.0.0/16
          gateway: 172.20.0.1
```
### Service Discovery
```yaml
services:
  app:
    networks:
      deepwikiopen-network:
        aliases:
          - webapp
          - api

  postgres:
    networks:
      deepwikiopen-network:
        aliases:
          - database
          - db
```
### Port Configuration
```yaml
services:
  # Internal communication only
  app-internal:
    expose:
      - "3000"
    networks:
      - internal

  # External access
  app-external:
    ports:
      - "80:3000"   # HTTP
      - "443:3000"  # HTTPS
      - "3000:3000" # Direct access
    networks:
      - external

networks:
  internal:
    driver: bridge
    internal: true
  external:
    driver: bridge
```
### Network Security
```yaml
# docker-compose.security.yml
version: '3.8'

services:
  app:
    networks:
      - frontend
      - backend

  postgres:
    networks:
      - backend
    # No external ports exposed

  redis:
    networks:
      - backend
    # No external ports exposed

  nginx:
    networks:
      - frontend
    ports:
      - "80:80"
      - "443:443"

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true
```
## Health Checks and Monitoring
### Application Health Checks
```dockerfile
# Dockerfile health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/api/health || exit 1
```
```yaml
# Docker Compose health checks
services:
  app:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
```
### Custom Health Check Endpoint
```javascript
// pages/api/health.js
export default function handler(req, res) {
  const checks = {
    status: 'ok',
    timestamp: new Date().toISOString(),
    checks: {
      database: 'checking...',
      redis: 'checking...',
      memory: process.memoryUsage(),
      uptime: process.uptime()
    }
  };

  // Check database connection
  try {
    // Your database check logic
    checks.checks.database = 'healthy';
  } catch (error) {
    checks.checks.database = 'unhealthy';
    checks.status = 'error';
  }

  // Check Redis connection
  try {
    // Your Redis check logic
    checks.checks.redis = 'healthy';
  } catch (error) {
    checks.checks.redis = 'unhealthy';
    checks.status = 'error';
  }

  const statusCode = checks.status === 'ok' ? 200 : 503;
  res.status(statusCode).json(checks);
}
```
### Monitoring with Docker Stats
```bash
# Monitor container resources
docker stats deepwikiopen-app
# Get detailed container information
docker inspect deepwikiopen-app
# View container logs
docker logs -f deepwikiopen-app
# Execute commands in running container
docker exec -it deepwikiopen-app sh
```
### Prometheus Metrics
```yaml
# docker/prometheus/prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'deepwikiopen'
    static_configs:
      - targets: ['app:3000']
    metrics_path: '/api/metrics'

  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']

  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']

  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx-exporter:9113']
```
## Scaling Strategies
### Docker Compose Scale
```bash
# Scale application horizontally
docker-compose up -d --scale app=3
# Scale with load balancer
docker-compose -f docker-compose.yml -f docker-compose.scale.yml up -d
```
```yaml
# docker-compose.scale.yml
version: '3.8'

services:
  app:
    deploy:
      replicas: 3

  nginx:
    depends_on:
      - app
    volumes:
      - ./docker/nginx/nginx-scale.conf:/etc/nginx/nginx.conf
```
### Load Balancer Configuration
```nginx
# docker/nginx/nginx-scale.conf
upstream app_servers {
    server app_1:3000;
    server app_2:3000;
    server app_3:3000;
}

server {
    listen 80;
    server_name localhost;

    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
### Docker Swarm Setup
```bash
# Initialize Docker Swarm
docker swarm init
# Deploy stack
docker stack deploy -c docker-compose.swarm.yml deepwikiopen
# Scale services
docker service scale deepwikiopen_app=5
```
```yaml
# docker-compose.swarm.yml
version: '3.8'

services:
  app:
    image: ghcr.io/your-org/deepwikiopen:latest
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.role == worker
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
      update_config:
        parallelism: 1
        delay: 10s
        failure_action: rollback
        order: start-first

  postgres:
    image: postgres:15-alpine
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 1G
        reservations:
          memory: 512M

networks:
  deepwikiopen-network:
    driver: overlay
    attachable: true
```
### Auto-scaling with Docker Swarm
```bash
# Create auto-scaling service
docker service create \
--name deepwikiopen-app \
--replicas 2 \
--limit-cpu 0.5 \
--limit-memory 512m \
--reserve-cpu 0.25 \
--reserve-memory 256m \
--update-parallelism 1 \
--update-delay 10s \
ghcr.io/your-org/deepwikiopen:latest
```
## Security Considerations
### Container Security Best Practices
```dockerfile
# Use non-root user
FROM node:18-alpine
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001

# Set proper file permissions
COPY --chown=nextjs:nodejs . .
USER nextjs
```

Pair the hardened image with run-time flags:

* `docker run --read-only --tmpfs /tmp deepwikiopen:latest` (read-only root filesystem)
* `docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE deepwikiopen:latest` (drop unneeded capabilities)
* `docker run --security-opt=no-new-privileges:true deepwikiopen:latest` (block privilege escalation)
### Secrets Management
```yaml
# docker-compose.secrets.yml
version: '3.8'

services:
  app:
    secrets:
      - db_password
      - jwt_secret
    environment:
      - DATABASE_PASSWORD_FILE=/run/secrets/db_password
      - JWT_SECRET_FILE=/run/secrets/jwt_secret

secrets:
  db_password:
    file: ./secrets/db_password.txt
  jwt_secret:
    file: ./secrets/jwt_secret.txt
```
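Application code then reads each secret from the mounted file path rather than from the environment variable directly. A sketch of the common `*_FILE` convention: the helper is hypothetical (shown in Python for brevity, though the stack above is Node-based), and the fallback order mirrors how official Docker images such as `postgres` handle `_FILE` variants.

```python
import os

def read_secret(name, default=None):
    """Resolve NAME_FILE (a Docker secret path) first, then the plain
    NAME environment variable, then an optional default."""
    file_path = os.environ.get(name + '_FILE')
    if file_path and os.path.exists(file_path):
        with open(file_path) as f:
            return f.read().strip()
    return os.environ.get(name, default)
```

For example, `read_secret('DATABASE_PASSWORD')` would return the contents of `/run/secrets/db_password` under the compose file above.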
### Network Security
```yaml
services:
  app:
    networks:
      - frontend
    # Only expose necessary ports
    expose:
      - "3000"

  postgres:
    networks:
      - backend
    # No external ports
    # Use internal network only

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true # No external access
```
### Security Scanning
```bash
# Scan images for vulnerabilities
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
-v /tmp:/tmp anchore/grype:latest \
ghcr.io/your-org/deepwikiopen:latest
# Use Trivy for comprehensive scanning
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
aquasec/trivy image ghcr.io/your-org/deepwikiopen:latest
```
### Resource Limits
```yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 512M
          pids: 100
        reservations:
          cpus: '0.25'
          memory: 256M
    ulimits:
      nofile:
        soft: 1024
        hard: 2048
```
## Troubleshooting
### Common Issues and Solutions
#### Container Won't Start
```bash
# Check container logs
docker logs deepwikiopen-app
# Check container configuration
docker inspect deepwikiopen-app
# Debug with shell access
docker run -it --entrypoint /bin/sh ghcr.io/your-org/deepwikiopen:latest
```
#### Database Connection Issues
```bash
# Test database connection
docker exec deepwikiopen-app nc -zv postgres 5432
# Check PostgreSQL logs
docker logs deepwikiopen-postgres
# Test with psql
docker exec -it deepwikiopen-postgres psql -U postgres -d deepwikiopen
```
#### Memory Issues
```bash
# Monitor memory usage
docker stats deepwikiopen-app
# Check for memory leaks
docker exec deepwikiopen-app cat /proc/meminfo
# Increase memory limits
docker run -m 1g deepwikiopen:latest
```
#### Permission Issues
```bash
# Check file permissions
docker exec deepwikiopen-app ls -la /app
# Fix ownership
docker exec deepwikiopen-app chown -R nextjs:nodejs /app/data
# Run as different user
docker run --user 1001:1001 deepwikiopen:latest
```
### Debugging Tools
```bash
# Enter running container
docker exec -it deepwikiopen-app sh
# Copy files from container
docker cp deepwikiopen-app:/app/logs ./local-logs
# Run health checks manually
docker exec deepwikiopen-app curl -f http://localhost:3000/api/health
# Check network connectivity
docker exec deepwikiopen-app nslookup postgres
docker exec deepwikiopen-app ping redis
```
### Performance Debugging
```bash
# Monitor container performance
docker stats --no-stream deepwikiopen-app
# Profile application
docker exec deepwikiopen-app node --prof app.js
# Check disk usage
docker exec deepwikiopen-app df -h
docker system df
```
## Production Deployments
### Production-Ready Compose File
```yaml
# docker-compose.prod.yml
version: '3.8'

x-common-variables: &common-variables
  POSTGRES_DB: ${POSTGRES_DB:-deepwikiopen}
  POSTGRES_USER: ${POSTGRES_USER:-postgres}
  POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}

services:
  app:
    image: ghcr.io/your-org/deepwikiopen:${TAG:-latest}
    restart: unless-stopped
    environment:
      <<: *common-variables
      NODE_ENV: production
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
    volumes:
      - app-data:/app/data:rw
      - app-logs:/app/logs:rw
    networks:
      - app-network
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '1.0'
        reservations:
          memory: 512M
          cpus: '0.5'

  postgres:
    image: postgres:15-alpine
    restart: unless-stopped
    environment:
      <<: *common-variables
      POSTGRES_INITDB_ARGS: "--encoding=UTF-8 --lc-collate=C --lc-ctype=C"
    volumes:
      - postgres-data:/var/lib/postgresql/data:rw
      - postgres-backups:/backups:rw
    networks:
      - app-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'

  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data:rw
    networks:
      - app-network
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 3s
      retries: 3

  nginx:
    image: nginx:alpine
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
      - nginx-logs:/var/log/nginx:rw
    depends_on:
      - app
    networks:
      - app-network

volumes:
  app-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/data
  app-logs:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/logs
  postgres-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/postgres
  postgres-backups:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /opt/deepwikiopen/backups
  redis-data:
    driver: local
  nginx-logs:
    driver: local

networks:
  app-network:
    driver: bridge
```
### Deployment Scripts
```bash
#!/bin/bash
# deploy.sh
set -e
echo "🚀 Starting production deployment..."
# Load environment variables
source .env.production
# Pull latest images
echo "📥 Pulling latest images..."
docker-compose -f docker-compose.prod.yml pull
# Create backup before deployment
echo "💾 Creating backup..."
./scripts/backup.sh
# Stop services gracefully
echo "🛑 Stopping services..."
docker-compose -f docker-compose.prod.yml down --remove-orphans
# Start services
echo "▶️ Starting services..."
docker-compose -f docker-compose.prod.yml up -d
# Wait for health checks
echo "⏳ Waiting for services to be healthy..."
timeout 300 bash -c \
  'until [ "$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/health)" = "200" ]; do sleep 5; done'
# Run database migrations
echo "🔄 Running database migrations..."
docker-compose -f docker-compose.prod.yml exec app npm run migrate
# Clean up old images
echo "🧹 Cleaning up..."
docker image prune -f
echo "✅ Deployment completed successfully!"
```
### Zero-Downtime Deployment
```bash
#!/bin/bash
# zero-downtime-deploy.sh
set -e
# Blue-green deployment script. The active color is tracked in a state
# file, since `docker-compose ps -q` returns container IDs, not colors.
CURRENT_COLOR=$(cat .active-color 2>/dev/null || echo "blue")
NEW_COLOR=$([ "$CURRENT_COLOR" = "blue" ] && echo "green" || echo "blue")
echo "🔄 Starting zero-downtime deployment (switching to $NEW_COLOR)..."
# Start new version alongside current
docker-compose -f docker-compose.$NEW_COLOR.yml up -d
# Health check new version
echo "⏳ Waiting for new version to be ready..."
timeout 300 bash -c "until curl -f http://localhost:3001/api/health; do sleep 5; done"
# Switch traffic
echo "🔀 Switching traffic..."
./scripts/switch-traffic.sh $NEW_COLOR
# Stop old version and record the new active color
echo "🛑 Stopping old version..."
docker-compose -f docker-compose.$CURRENT_COLOR.yml down
echo "$NEW_COLOR" > .active-color
echo "✅ Zero-downtime deployment completed!"
```
### Monitoring and Alerting
```yaml
# monitoring/docker-compose.monitoring.yml
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - "3001:3000"

  alertmanager:
    image: prom/alertmanager:latest
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"

volumes:
  prometheus-data:
  grafana-data:
```
### SSL/TLS Configuration
```nginx
# nginx/nginx.conf
server {
    listen 443 ssl http2;
    server_name your-domain.com;

    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;

    location / {
        proxy_pass http://app:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

server {
    listen 80;
    server_name your-domain.com;
    return 301 https://$server_name$request_uri;
}
```
This comprehensive Docker deployment guide covers everything from basic setups to production-grade deployments. Use the appropriate sections based on your deployment needs and gradually implement more advanced features as your application grows.
Remember to:
* Always test deployments in a staging environment first
* Keep your images updated with security patches
* Monitor resource usage and adjust limits accordingly
* Implement proper backup and disaster recovery procedures
* Use secrets management for sensitive data
* Perform regular security audits of your containers and configurations
# Internationalization (i18n) Guide
Source: https://asyncfunc.mintlify.app/guides/internationalization
Comprehensive guide to DeepWiki's multi-language support, language detection, and global configuration options
DeepWiki offers robust internationalization support with built-in language detection,
comprehensive translation coverage, and intelligent AI model responses in your preferred language.
## Overview
DeepWiki provides comprehensive internationalization (i18n) support designed for global teams and diverse repositories. The platform automatically detects languages, provides localized interfaces, and generates documentation that respects cultural and linguistic preferences.
### Key Features
* **10+ Supported Languages**: Native support for major world languages
* **Automatic Language Detection**: Smart detection from browser settings and repository content
* **Contextual AI Responses**: AI models understand and respond in the appropriate language
* **Cultural Adaptation**: Documentation generated with cultural considerations
* **Multi-language Repository Handling**: Support for repositories containing multiple languages
## Supported Languages
DeepWiki currently supports the following languages:
**Code**: `en`
Default language with full feature support
**Code**: `zh`
中文 - Full localization support
**Code**: `zh-tw`
繁體中文 - Traditional Chinese variant
**Code**: `ja`
日本語 - Complete Japanese localization
**Code**: `kr`
한국어 - Korean language support
**Code**: `es`
Español - Spanish localization
**Code**: `vi`
Tiếng Việt - Vietnamese support
**Code**: `pt-br`
Português Brasileiro - Brazilian variant
**Code**: `fr`
Français - French localization
**Code**: `ru`
Русский - Russian language support
## Language Detection
### Automatic Browser Detection
DeepWiki automatically detects your preferred language using a sophisticated algorithm:
```text
// Language detection priority
1. Stored user preference (localStorage)
2. Browser language settings (navigator.language)
3. Repository primary language
4. System default (English)
```
The system examines `navigator.language` and extracts the language code:
```javascript
// Examples of browser language detection
'en-US' → 'en' (English)
'ja-JP' → 'ja' (Japanese)
'zh-CN' → 'zh' (Simplified Chinese)
'zh-TW' → 'zh-tw' (Traditional Chinese)
'es-ES' → 'es' (Spanish)
```
When a language isn't supported, DeepWiki uses intelligent fallbacks:
```javascript
// Fallback examples
'pt-PT' → 'pt-br' (Portuguese variants)
'zh-HK' → 'zh-tw' (Traditional Chinese variants)
'en-GB' → 'en' (English variants)
'fr-CA' → 'fr' (French variants)
```
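The detection chain and fallback table above can be expressed as a small resolver. The sketch below is a hypothetical helper mirroring the documented behavior, not DeepWiki's actual implementation:

```python
# Hypothetical sketch of the language resolution logic described above:
# exact match, then explicit fallbacks, then the base language, then default.

SUPPORTED = {"en", "zh", "zh-tw", "ja", "kr", "es", "vi", "pt-br", "fr", "ru"}

# Regional variants that map onto a supported code (from the fallback table).
FALLBACKS = {
    "pt": "pt-br",     # pt-PT and other Portuguese variants
    "zh-hk": "zh-tw",  # Hong Kong uses traditional characters
}

def resolve_language(browser_lang: str, default: str = "en") -> str:
    """Map a BCP 47 tag like 'ja-JP' to a supported DeepWiki code."""
    tag = browser_lang.lower()
    base = tag.split("-")[0]
    for candidate in (tag, FALLBACKS.get(tag), FALLBACKS.get(base), base):
        if candidate in SUPPORTED:
            return candidate
    return default
```

Unsupported languages like `de-DE` fall through every step and land on the English default.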
### Repository Language Detection
For documentation generation, DeepWiki analyzes repository content:
Detects the main programming language used in the repository
Identifies existing documentation language (README, comments)
Considers regional coding patterns and naming conventions
## UI Language Configuration
### Manual Language Selection
Users can manually override automatic detection through the language selector:
```tsx Language Selector Component
import { useLanguage } from '@/contexts/LanguageContext';
function LanguageSelector() {
  const { language, setLanguage, supportedLanguages } = useLanguage();

  return (
    <select
      value={language}
      onChange={(e) => setLanguage(e.target.value)}
    >
      {Object.entries(supportedLanguages).map(([code, label]) => (
        <option key={code} value={code}>{label}</option>
      ))}
    </select>
  );
}
```
```javascript Storage Persistence
// Language preference is automatically stored
localStorage.setItem('language', 'ja'); // Persists across sessions
document.documentElement.lang = 'ja'; // Updates HTML lang attribute
```
### Language Context Integration
```tsx
import { useLanguage } from '@/contexts/LanguageContext';
function MyComponent() {
  const { messages, language } = useLanguage();

  return (
    <div>
      <h1>{messages.common.appName}</h1>
      <p>Current language: {language}</p>
    </div>
  );
}
```
```typescript
import { getRequestConfig } from 'next-intl/server';

export default getRequestConfig(async ({ locale }) => {
  return {
    locale,
    messages: (await import(`./messages/${locale}.json`)).default
  };
});
```
## API Language Configuration
### Request Headers
Configure language preferences through API requests:
```bash
curl -X POST https://api.deepwiki.ai/generate-wiki \
  -H "Accept-Language: ja,en;q=0.9" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "owner/repo",
    "language": "ja",
    "wiki_language": "ja"
  }'
```
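Server-side, a header like `Accept-Language: ja,en;q=0.9` can be negotiated against the supported set. The following is an illustrative sketch of that negotiation, not the production parser:

```python
# Minimal Accept-Language negotiation sketch: pick the highest-q
# language that DeepWiki supports, falling back to a default.

SUPPORTED = {"en", "zh", "zh-tw", "ja", "kr", "es", "vi", "pt-br", "fr", "ru"}

def negotiate_language(header: str, default: str = "en") -> str:
    candidates = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        lang, _, qpart = piece.partition(";")
        q = 1.0  # per RFC 9110, a missing q parameter means q=1
        if qpart.strip().startswith("q="):
            try:
                q = float(qpart.strip()[2:])
            except ValueError:
                q = 0.0
        candidates.append((q, lang.strip().lower()))
    # Highest q wins; unsupported languages are skipped.
    for _, lang in sorted(candidates, reverse=True):
        if lang in SUPPORTED:
            return lang
    return default
```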
### Configuration Files
Set default languages in your configuration:
```json lang.json
{
  "supported_languages": {
    "en": "English",
    "ja": "Japanese (日本語)",
    "zh": "Mandarin Chinese (中文)",
    "zh-tw": "Traditional Chinese (繁體中文)",
    "es": "Spanish (Español)",
    "kr": "Korean (한국어)",
    "vi": "Vietnamese (Tiếng Việt)",
    "pt-br": "Brazilian Portuguese (Português Brasileiro)",
    "fr": "French (Français)",
    "ru": "Russian (Русский)"
  },
  "default": "en"
}
```
```yaml Environment Variables
# .env.local
DEEPWIKI_DEFAULT_LANGUAGE=en
DEEPWIKI_UI_LANGUAGE=auto
DEEPWIKI_WIKI_LANGUAGE=auto
DEEPWIKI_AI_LANGUAGE_CONTEXT=true
```
## AI Model Language Context
### Language-Aware Generation
DeepWiki's AI models understand linguistic context and generate appropriate documentation:
```markdown
# User Authentication System
This module provides secure user authentication with JWT tokens.
## Features
- Password hashing with bcrypt
- JWT token generation and validation
- Role-based access control (RBAC)
```
```markdown
# ユーザー認証システム
このモジュールはJWTトークンを使用した安全なユーザー認証を提供します。
## 機能
- bcryptによるパスワードハッシュ化
- JWTトークンの生成と検証
- ロールベースアクセス制御(RBAC)
```
```markdown
# 用户身份验证系统
该模块提供基于JWT令牌的安全用户身份验证。
## 功能特性
- 使用bcrypt进行密码哈希加密
- JWT令牌生成和验证
- 基于角色的访问控制(RBAC)
```
### Model Configuration
Configure AI models for multilingual responses:
```json
{
  "generator": {
    "language_context": true,
    "cultural_adaptation": true,
    "preserve_technical_terms": true,
    "localization_depth": "comprehensive"
  },
  "embedder": {
    "multilingual_embeddings": true,
    "cross_language_similarity": true
  }
}
```
## Multi-Language Repository Handling
### Repository Analysis
DeepWiki intelligently handles repositories with multiple languages:
Identifies the main programming language and documentation language
Recognizes additional languages and their contexts
Appropriately processes files with mixed language content
### Documentation Strategy
Generate single documentation in the dominant language with technical terms preserved
Create separate documentation versions for major languages in the repository
### Code Examples
```typescript
// English-dominant repository with Japanese comments
class UserService {
  /**
   * ユーザーを作成します
   * Creates a new user account
   */
  async createUser(userData: UserData): Promise<User> {
    // データバリデーション (Data validation)
    const validatedData = this.validateUserData(userData);

    // ユーザー保存 (Save user)
    return await this.userRepository.save(validatedData);
  }
}
```
Generated documentation preserves both contexts:
```markdown
## UserService クラス
`UserService` class provides comprehensive user management functionality with bilingual support.
### createUser メソッド
Creates a new user account with data validation.
**Parameters:**
- `userData: UserData` - User information to be processed
**Returns:**
- `Promise<User>` - Created user object
**Implementation Notes:**
- データバリデーション (Data validation) ensures input integrity
- ユーザー保存 (User saving) persists data to repository
```
## Cultural Considerations
### Regional Preferences
DeepWiki adapts to regional documentation preferences:
```markdown
# Quick Start Guide
Get started with DeepWiki in 3 easy steps:
1. Clone the repository
2. Install dependencies
3. Run the application
## Prerequisites
- Node.js 18+
- Git
```
```markdown
# クイックスタートガイド
DeepWikiを3つの簡単なステップで始めましょう:
【手順1】リポジトリをクローンする
【手順2】依存関係をインストールする
【手順3】アプリケーションを実行する
## 前提条件
- Node.js 18以上
- Git
```
### Technical Term Handling
Keep technical terms in original language
**Example**: `useState` remains `useState` in all languages
Provide local explanations for complex terms
**Example**: "JWT (JSON Web Token / JSONウェブトークン)"
## Best Practices for International Teams
### 1. Language Strategy
Choose a primary language for technical documentation (usually English)
Decide which content needs full localization vs. technical preservation
Establish terminology guidelines for mixed-language projects
### 2. Repository Organization
```
docs/
├── en/          # English documentation
├── ja/          # Japanese documentation
├── zh/          # Chinese documentation
└── shared/      # Language-neutral resources
    ├── diagrams/
    └── code-samples/
```
### 3. Development Guidelines
```typescript
// Use primary language for code comments
// 主要言語でコードコメントを記述

/**
 * User authentication service
 * ユーザー認証サービス
 * @param credentials - Login credentials / ログイン資格情報
 */
```
```yaml
# Generate API docs in multiple languages
openapi: 3.0.0
info:
  title:
    en: "DeepWiki API"
    ja: "DeepWiki API"
    zh: "DeepWiki API"
  description:
    en: "AI-powered documentation generation API"
    ja: "AI駆動ドキュメント生成API"
    zh: "AI驱动的文档生成API"
```
### 4. Quality Assurance
Have native speakers review translated documentation for accuracy and cultural appropriateness
Ensure technical terms and concepts remain accurate across all languages
Use automated tools to check consistency between language versions
Test documentation with users from different linguistic backgrounds
## Advanced Configuration
### Custom Language Support
Extend DeepWiki with additional languages:
```json Add New Language
{
  "supported_languages": {
    "de": "Deutsch (German)",
    "it": "Italiano (Italian)",
    "nl": "Nederlands (Dutch)"
  }
}
```
```javascript Message Files
// messages/de.json
{
  "common": {
    "appName": "DeepWiki-Open",
    "generateWiki": "Wiki generieren",
    "loading": "Laden..."
  }
}
```
### Language-Specific Features
```json
{
  "language_features": {
    "ja": {
      "ruby_annotations": true,
      "vertical_text_support": true
    },
    "zh": {
      "traditional_characters": true,
      "pinyin_support": true
    },
    "ar": {
      "rtl_support": true,
      "arabic_numerals": true
    }
  }
}
```
## Troubleshooting
### Common Issues
**Solution**: Check browser language settings and clear localStorage:
```javascript
localStorage.removeItem('language');
location.reload();
```
**Cause**: Repository contains multiple languages without clear primary language.
**Solution**: Explicitly set the wiki language in generation settings.
**Check**:
* API language header configuration
* Model language context settings
* Repository language detection accuracy
**Solution**: Add missing keys to message files:
```json
// messages/[lang].json
{
  "newFeature": {
    "title": "Translated Title",
    "description": "Translated Description"
  }
}
```
### Debug Mode
Enable language debugging for troubleshooting:
```javascript
// Enable in browser console
localStorage.setItem('deepwiki_debug_i18n', 'true');
// View language detection logs
console.log('Browser language:', navigator.language);
console.log('Detected language:', detectedLanguage);
console.log('Available languages:', supportedLanguages);
```
## Migration Guide
### Updating Language Support
When upgrading DeepWiki versions:
```bash
cp api/config/lang.json api/config/lang.json.backup
```
Merge new language options with existing preferences
Add new translation keys to all language files
Verify all languages work correctly after update
Always test language functionality after updates, as new features may require additional translations.
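Merging new language options with existing preferences boils down to a dictionary merge that never overwrites local customizations. A sketch of such a merge (a hypothetical helper, not a DeepWiki utility):

```python
def merge_lang_config(existing: dict, upstream: dict) -> dict:
    """Merge upstream language options into an existing lang.json dict,
    keeping locally customized labels and the locally chosen default."""
    merged = dict(existing)
    langs = dict(upstream.get("supported_languages", {}))
    # Local entries win over upstream ones with the same key.
    langs.update(existing.get("supported_languages", {}))
    merged["supported_languages"] = langs
    # Only adopt the upstream default if no local default exists.
    merged.setdefault("default", upstream.get("default", "en"))
    return merged
```

Load both files with `json.load`, merge, and write the result back over `api/config/lang.json` (keeping the backup from step 1).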
## API Reference
### Language Configuration Endpoints
```bash
# Get supported languages
GET /api/lang/config

# Set user language preference
POST /api/user/language
{
  "language": "ja",
  "persist": true
}

# Generate wiki with specific language
POST /api/wiki/generate
{
  "repo_url": "owner/repo",
  "language": "zh",
  "wiki_language": "zh"
}
```
### Language Detection API
```bash
# Detect repository language
POST /api/detect/language
{
  "repo_url": "owner/repo"
}

# Response
{
  "primary_language": "JavaScript",
  "documentation_language": "en",
  "suggested_wiki_language": "en",
  "confidence": 0.95
}
```
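A client consuming this response might honor `suggested_wiki_language` only when the detector is confident enough. A hypothetical sketch (the threshold is an assumption, not an API field):

```python
def pick_wiki_language(detection: dict, threshold: float = 0.8,
                       fallback: str = "en") -> str:
    """Use the detector's suggested wiki language only when its
    confidence score clears the threshold; otherwise fall back."""
    if detection.get("confidence", 0.0) >= threshold:
        return detection.get("suggested_wiki_language", fallback)
    return fallback
```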
DeepWiki's internationalization system ensures that teams around the world can generate high-quality documentation in their preferred languages while maintaining technical accuracy and cultural appropriateness.
# Manual Setup Guide
Source: https://asyncfunc.mintlify.app/guides/manual-setup
A comprehensive guide for developers who prefer hands-on control over their DeepWiki-Open development environment.
## Prerequisites
Before starting, ensure you have the following installed on your system:
* **Python 3.12+** (Required by pyproject.toml)
* **Node.js 18+** (Required for Next.js)
* **Git** (For repository cloning)
* **Basic terminal/command line knowledge**
## 1. Environment Setup
### 1.1 Python Environment Setup
#### Option A: Using Virtual Environment (Recommended)
```bash
# Create a virtual environment
python -m venv deepwiki-env
# Activate the virtual environment
# On Windows:
deepwiki-env\Scripts\activate
# On macOS/Linux:
source deepwiki-env/bin/activate
# Verify Python version
python --version # Should be 3.12+
```
#### Option B: Using Conda
```bash
# Create conda environment
conda create -n deepwiki python=3.12
conda activate deepwiki
# Verify installation
python --version
which python # Should point to conda environment
```
#### Option C: Using pyenv (Advanced)
```bash
# Install Python 3.12 if not available
pyenv install 3.12.0
pyenv local 3.12.0
# Create virtual environment
python -m venv deepwiki-env
source deepwiki-env/bin/activate
```
### 1.2 Node.js and Package Manager Setup
#### Install Node.js
**Option A: Using Node Version Manager (Recommended)**
```bash
# Install nvm (macOS/Linux)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
source ~/.bashrc
# Install and use Node.js LTS
nvm install --lts
nvm use --lts
nvm alias default node
```
**Option B: Direct Installation**
Download from [nodejs.org](https://nodejs.org/) or use package managers:
```bash
# macOS with Homebrew
brew install node
# Ubuntu/Debian
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -
sudo apt-get install -y nodejs
# CentOS/RHEL/Fedora
curl -fsSL https://rpm.nodesource.com/setup_lts.x | sudo bash -
sudo yum install -y nodejs
```
#### Choose Package Manager
```bash
# npm (comes with Node.js)
npm --version
# Yarn (optional, faster alternative)
npm install -g yarn
yarn --version
# pnpm (optional, efficient alternative)
npm install -g pnpm
pnpm --version
```
## 2. Project Setup
### 2.1 Clone and Initial Setup
```bash
# Clone the repository
git clone https://github.com/AsyncFuncAI/deepwiki-open.git
cd deepwiki-open
# Create necessary directories
mkdir -p logs
mkdir -p ~/.adalflow/{repos,databases,wikicache}
```
### 2.2 Python Dependencies Installation
#### Using pip with requirements.txt
```bash
# Ensure virtual environment is activated
# Install backend dependencies
pip install -r api/requirements.txt
# Verify installation
pip list | grep fastapi
pip list | grep uvicorn
```
#### Using uv (Modern Python Package Manager)
```bash
# Install uv if not available
pip install uv
# Install dependencies using uv
uv pip install -r api/requirements.txt
# Alternative: Use pyproject.toml
uv pip install -e .
```
#### Troubleshooting Python Dependencies
```bash
# If you encounter version conflicts
pip install --upgrade pip
pip install --no-cache-dir -r api/requirements.txt
# For Apple Silicon Macs (M1/M2)
pip install --no-cache-dir --compile --no-use-pep517 numpy
pip install -r api/requirements.txt
# For systems with limited resources
pip install --no-cache-dir -r api/requirements.txt
```
### 2.3 Node.js Dependencies Installation
```bash
# Using npm
npm install
# Using yarn
yarn install
# Using pnpm
pnpm install
# Verify installation
npm list --depth=0
# or
ls node_modules/
```
## 3. Environment Configuration
### 3.1 Environment Variables Setup
Create a `.env` file in the project root:
```bash
# Create .env file
touch .env
```
**Basic Configuration:**
```env
# Required API Keys (choose at least one)
GOOGLE_API_KEY=your_google_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
# Optional API Keys
OPENROUTER_API_KEY=your_openrouter_api_key_here
AZURE_OPENAI_API_KEY=your_azure_openai_api_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_VERSION=2023-12-01-preview
# Ollama Configuration (if using local models)
OLLAMA_HOST=http://localhost:11434
# Server Configuration
PORT=8001
SERVER_BASE_URL=http://localhost:8001
# Authorization (optional)
DEEPWIKI_AUTH_MODE=false
DEEPWIKI_AUTH_CODE=your_secret_code_here
# Logging Configuration
LOG_LEVEL=INFO
LOG_FILE_PATH=./api/logs/application.log
# Custom Configuration Directory (optional)
DEEPWIKI_CONFIG_DIR=./api/config
# OpenAI Base URL (for custom endpoints)
OPENAI_BASE_URL=https://api.openai.com/v1
```
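Since at least one provider key is required, a small startup check can fail fast when the environment is misconfigured. The following is a sketch of such a check (a hypothetical helper, not part of DeepWiki):

```python
import os

# At least one of these must be set (per the "choose at least one" note).
REQUIRED_ANY = ("GOOGLE_API_KEY", "OPENAI_API_KEY")

def validate_env(env=os.environ) -> list[str]:
    """Return a list of problems; an empty list means the env looks usable."""
    problems = []
    if not any(env.get(key) for key in REQUIRED_ANY):
        problems.append("Set at least one of GOOGLE_API_KEY or OPENAI_API_KEY")
    # Azure needs all three of its variables, or none at all.
    azure = ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT",
             "AZURE_OPENAI_VERSION"]
    present = [k for k in azure if env.get(k)]
    if present and len(present) != len(azure):
        problems.append("Azure OpenAI requires all of: " + ", ".join(azure))
    return problems
```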
**Development Configuration:**
```env
# Development-specific settings
LOG_LEVEL=DEBUG
NODE_ENV=development
NEXT_PUBLIC_API_URL=http://localhost:8001
```
**Production Configuration:**
```env
# Production-specific settings
LOG_LEVEL=WARNING
NODE_ENV=production
NEXT_PUBLIC_API_URL=https://your-domain.com/api
```
### 3.2 API Key Acquisition
#### Google AI Studio
1. Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Create a new project or select existing
3. Generate API key
4. Copy to `GOOGLE_API_KEY` in `.env`
#### OpenAI Platform
1. Visit [OpenAI Platform](https://platform.openai.com/api-keys)
2. Create account and add billing information
3. Generate new secret key
4. Copy to `OPENAI_API_KEY` in `.env`
#### OpenRouter
1. Visit [OpenRouter](https://openrouter.ai/)
2. Sign up and add credits
3. Generate API key from dashboard
4. Copy to `OPENROUTER_API_KEY` in `.env`
#### Azure OpenAI
1. Go to [Azure Portal](https://portal.azure.com/)
2. Create Azure OpenAI resource
3. Get keys and endpoint from resource
4. Configure all three Azure variables in `.env`
## 4. Database and Storage Setup
### 4.1 Local Storage Directories
DeepWiki-Open uses local file storage. Create required directories:
```bash
# Create storage directories
mkdir -p ~/.adalflow/repos # Cloned repositories
mkdir -p ~/.adalflow/databases # Vector embeddings
mkdir -p ~/.adalflow/wikicache # Generated wikis
mkdir -p ./api/logs # Application logs
# Set appropriate permissions
chmod 755 ~/.adalflow
chmod 755 ~/.adalflow/repos
chmod 755 ~/.adalflow/databases
chmod 755 ~/.adalflow/wikicache
chmod 755 ./api/logs
```
### 4.2 FAISS Vector Database
DeepWiki uses FAISS for vector storage (included in requirements):
```bash
# Verify FAISS installation
python -c "import faiss; print('FAISS version:', faiss.__version__)"
# For GPU acceleration (optional)
pip install faiss-gpu # Only if you have CUDA
```
### 4.3 Storage Configuration
Edit `api/config/embedder.json` to customize storage settings:
```json
{
  "embedder": {
    "model": "text-embedding-ada-002",
    "provider": "openai"
  },
  "retriever": {
    "similarity_top_k": 5,
    "vector_store_type": "faiss"
  },
  "text_splitter": {
    "type": "recursive_character",
    "chunk_size": 1000,
    "chunk_overlap": 200
  }
}
```
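The `chunk_size`/`chunk_overlap` settings control how documents are split before embedding. A simplified character-window sketch of that behavior (the actual splitter is recursive and boundary-aware, so this is only an approximation):

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200):
    """Fixed-size sliding window: each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so consecutive chunks share
    chunk_overlap characters of context."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Larger overlap improves retrieval continuity across chunk boundaries at the cost of more embeddings to store.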
## 5. Service Configuration
### 5.1 Backend API Configuration
#### FastAPI Server Settings
Create `api/config/server.json`:
```json
{
  "host": "0.0.0.0",
  "port": 8001,
  "reload": true,
  "workers": 1,
  "log_config": {
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
      "default": {
        "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
      }
    },
    "handlers": {
      "default": {
        "formatter": "default",
        "class": "logging.StreamHandler",
        "stream": "ext://sys.stdout"
      }
    },
    "root": {
      "level": "INFO",
      "handlers": ["default"]
    }
  }
}
```
#### CORS Configuration
The API allows all origins by default. For production, modify `api/api.py`:
```python
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000", "https://yourdomain.com"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
```
### 5.2 Frontend Configuration
#### Next.js Configuration
Edit `next.config.ts`:
```typescript
import type { NextConfig } from 'next';
const nextConfig: NextConfig = {
  env: {
    NEXT_PUBLIC_API_URL: process.env.NEXT_PUBLIC_API_URL || 'http://localhost:8001',
  },
  async rewrites() {
    return [
      {
        source: '/api/:path*',
        destination: `${process.env.NEXT_PUBLIC_API_URL}/api/:path*`,
      },
    ];
  },
};

export default nextConfig;
```
#### Internationalization Setup
Configure supported languages in `src/i18n.ts`:
```typescript
import {notFound} from 'next/navigation';
import {getRequestConfig} from 'next-intl/server';
export const locales = ['en', 'zh', 'ja', 'es', 'fr', 'ko', 'vi', 'pt-br', 'ru', 'zh-tw'];
export default getRequestConfig(async ({locale}) => {
  if (!locales.includes(locale as any)) notFound();

  return {
    messages: (await import(`./messages/${locale}.json`)).default
  };
});
```
## 6. Development vs Production Configurations
### 6.1 Development Configuration
**Backend Development:**
```bash
# Install development dependencies
pip install -r api/requirements.txt
pip install pytest black flake8 mypy # Additional dev tools
# Run in development mode
cd api
python -m uvicorn main:app --reload --port 8001 --log-level debug
```
**Frontend Development:**
```bash
# Enable development features
export NODE_ENV=development
export NEXT_PUBLIC_API_URL=http://localhost:8001
# Run development server
npm run dev
# or
yarn dev
```
**Development `.env`:**
```env
NODE_ENV=development
LOG_LEVEL=DEBUG
NEXT_PUBLIC_API_URL=http://localhost:8001
DEEPWIKI_AUTH_MODE=false
```
### 6.2 Production Configuration
**Backend Production:**
```bash
# Install production server
pip install gunicorn
# Create gunicorn configuration
touch gunicorn.conf.py
```
`gunicorn.conf.py`:
```python
import multiprocessing
bind = "0.0.0.0:8001"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
max_requests = 10000
max_requests_jitter = 1000
timeout = 300
keepalive = 5
preload_app = True
```
**Frontend Production:**
```bash
# Build for production
npm run build
# Start production server
npm start
```
**Production `.env`:**
```env
NODE_ENV=production
LOG_LEVEL=WARNING
NEXT_PUBLIC_API_URL=https://your-domain.com
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=your-secure-code
```
## 7. Process Management
### 7.1 Using PM2 (Recommended)
#### Install PM2
```bash
npm install -g pm2
```
#### Create PM2 Configuration
Create `ecosystem.config.js`:
```javascript
module.exports = {
  apps: [
    {
      name: 'deepwiki-api',
      script: 'python',
      args: '-m uvicorn api.main:app --host 0.0.0.0 --port 8001',
      cwd: '/path/to/deepwiki-open',
      interpreter: '/path/to/deepwiki-env/bin/python',
      env: {
        NODE_ENV: 'production',
        LOG_LEVEL: 'INFO'
      },
      instances: 1,
      autorestart: true,
      watch: false,
      max_memory_restart: '2G',
      error_file: './logs/api-error.log',
      out_file: './logs/api-out.log',
      log_file: './logs/api-combined.log'
    },
    {
      name: 'deepwiki-frontend',
      script: 'npm',
      args: 'start',
      cwd: '/path/to/deepwiki-open',
      env: {
        NODE_ENV: 'production',
        PORT: 3000
      },
      instances: 1,
      autorestart: true,
      watch: false,
      max_memory_restart: '1G',
      error_file: './logs/frontend-error.log',
      out_file: './logs/frontend-out.log',
      log_file: './logs/frontend-combined.log'
    }
  ]
};
```
#### PM2 Commands
```bash
# Start services
pm2 start ecosystem.config.js
# Monitor services
pm2 monit
# View logs
pm2 logs
# Restart services
pm2 restart all
# Stop services
pm2 stop all
# Save PM2 configuration
pm2 save
# Setup PM2 to start on boot
pm2 startup
```
### 7.2 Using systemd (Linux)
#### Backend Service
Create `/etc/systemd/system/deepwiki-api.service`:
```ini
[Unit]
Description=DeepWiki API Server
After=network.target
[Service]
Type=exec
User=yourusername
Group=yourusername
WorkingDirectory=/path/to/deepwiki-open
Environment=PATH=/path/to/deepwiki-env/bin
EnvironmentFile=/path/to/deepwiki-open/.env
ExecStart=/path/to/deepwiki-env/bin/python -m uvicorn api.main:app --host 0.0.0.0 --port 8001
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
#### Frontend Service
Create `/etc/systemd/system/deepwiki-frontend.service`:
```ini
[Unit]
Description=DeepWiki Frontend Server
After=network.target deepwiki-api.service
[Service]
Type=exec
User=yourusername
Group=yourusername
WorkingDirectory=/path/to/deepwiki-open
Environment=NODE_ENV=production
Environment=PORT=3000
EnvironmentFile=/path/to/deepwiki-open/.env
ExecStart=/usr/bin/npm start
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
#### systemd Commands
```bash
# Reload systemd configuration
sudo systemctl daemon-reload
# Enable services to start on boot
sudo systemctl enable deepwiki-api.service
sudo systemctl enable deepwiki-frontend.service
# Start services
sudo systemctl start deepwiki-api.service
sudo systemctl start deepwiki-frontend.service
# Check status
sudo systemctl status deepwiki-api.service
sudo systemctl status deepwiki-frontend.service
# View logs
sudo journalctl -u deepwiki-api.service -f
sudo journalctl -u deepwiki-frontend.service -f
```
## 8. Monitoring and Logging Setup
### 8.1 Application Logging
#### Python Logging Configuration
Create `api/logging_config.py`:
```python
import logging
import logging.handlers
import os
from pathlib import Path

def setup_logging():
    log_level = os.getenv('LOG_LEVEL', 'INFO').upper()
    log_file = os.getenv('LOG_FILE_PATH', './api/logs/application.log')

    # Create logs directory
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)

    # Configure logging
    logging.basicConfig(
        level=getattr(logging, log_level),
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[
            logging.StreamHandler(),
            logging.handlers.RotatingFileHandler(
                log_file,
                maxBytes=10*1024*1024,  # 10MB
                backupCount=5
            )
        ]
    )
```
#### Next.js Logging
Create `src/utils/logger.ts`:
```typescript
interface LogEntry {
  timestamp: string;
  level: 'info' | 'warn' | 'error' | 'debug';
  message: string;
  data?: any;
}

class Logger {
  private isDevelopment = process.env.NODE_ENV === 'development';

  private log(level: LogEntry['level'], message: string, data?: any) {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      message,
      data
    };

    if (this.isDevelopment) {
      console[level](entry);
    }

    // Send to backend logging endpoint in production
    if (!this.isDevelopment && level === 'error') {
      this.sendToServer(entry);
    }
  }

  private async sendToServer(entry: LogEntry) {
    try {
      await fetch('/api/logs', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(entry)
      });
    } catch (error) {
      console.error('Failed to send log to server:', error);
    }
  }

  info(message: string, data?: any) {
    this.log('info', message, data);
  }

  warn(message: string, data?: any) {
    this.log('warn', message, data);
  }

  error(message: string, data?: any) {
    this.log('error', message, data);
  }

  debug(message: string, data?: any) {
    this.log('debug', message, data);
  }
}

export const logger = new Logger();
```
### 8.2 Health Monitoring
#### Health Check Endpoint
Add to `api/api.py`:
```python
# Required near the top of api/api.py if not already imported
import os
from datetime import datetime

@app.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "timestamp": datetime.utcnow().isoformat(),
        "version": "0.1.0",
        "services": {
            "api": "running",
            "storage": "accessible" if os.path.exists(os.path.expanduser("~/.adalflow")) else "unavailable"
        }
    }
```
#### Monitoring Script
Create `scripts/monitor.py`:
```python
#!/usr/bin/env python3
import requests
import sys
import os

def check_service(url, service_name):
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            print(f"✅ {service_name} is healthy")
            return True
        else:
            print(f"❌ {service_name} returned status {response.status_code}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"❌ {service_name} is unreachable: {e}")
        return False

def main():
    api_url = os.getenv('SERVER_BASE_URL', 'http://localhost:8001')
    frontend_url = os.getenv('FRONTEND_URL', 'http://localhost:3000')

    services = [
        (f"{api_url}/health", "API Server"),
        (frontend_url, "Frontend Server")
    ]

    all_healthy = True
    for url, name in services:
        if not check_service(url, name):
            all_healthy = False

    if not all_healthy:
        sys.exit(1)
    print("🎉 All services are healthy!")

if __name__ == "__main__":
    main()
```
### 8.3 Performance Monitoring
#### Simple Performance Tracking
Create `scripts/performance_monitor.sh`:
```bash
#!/bin/bash
# Configuration
API_URL="http://localhost:8001"
LOG_FILE="./logs/performance.log"
# Create logs directory
mkdir -p logs
# Function to log with timestamp
log_with_timestamp() {
echo "$(date '+%Y-%m-%d %H:%M:%S') $1" >> "$LOG_FILE"
}
# Monitor API response time
monitor_api() {
start_time=$(date +%s.%N)
response=$(curl -s -w "%{http_code}" -o /dev/null "$API_URL/health")
end_time=$(date +%s.%N)
response_time=$(echo "$end_time - $start_time" | bc)
if [ "$response" = "200" ]; then
log_with_timestamp "API_HEALTH_OK response_time=${response_time}s"
else
log_with_timestamp "API_HEALTH_ERROR http_code=$response"
fi
}
# Monitor system resources
monitor_resources() {
# CPU usage
cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
# Memory usage
memory_usage=$(free | grep Mem | awk '{printf "%.1f", $3/$2 * 100.0}')
# Disk usage
disk_usage=$(df -h . | tail -1 | awk '{print $5}' | cut -d'%' -f1)
log_with_timestamp "RESOURCES cpu=${cpu_usage}% memory=${memory_usage}% disk=${disk_usage}%"
}
# Main monitoring loop
while true; do
monitor_api
monitor_resources
sleep 60 # Monitor every minute
done
```
## 9. Backup and Maintenance
### 9.1 Data Backup Strategy
#### Backup Script
Create `scripts/backup.sh`:
```bash
#!/bin/bash
# Configuration
BACKUP_DIR="$HOME/deepwiki-backups"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_NAME="deepwiki_backup_$DATE"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Backup function
create_backup() {
echo "🔄 Starting backup process..."
# Create backup folder
BACKUP_PATH="$BACKUP_DIR/$BACKUP_NAME"
mkdir -p "$BACKUP_PATH"
# Backup configuration
echo "📁 Backing up configuration..."
cp -r api/config "$BACKUP_PATH/"
cp .env "$BACKUP_PATH/" 2>/dev/null || echo "No .env file found"
# Backup generated wikis
echo "📚 Backing up wiki cache..."
if [ -d "$HOME/.adalflow/wikicache" ]; then
cp -r "$HOME/.adalflow/wikicache" "$BACKUP_PATH/"
fi
# Backup vector databases
echo "🗄️ Backing up databases..."
if [ -d "$HOME/.adalflow/databases" ]; then
cp -r "$HOME/.adalflow/databases" "$BACKUP_PATH/"
fi
# Backup logs
echo "📊 Backing up logs..."
cp -r logs "$BACKUP_PATH/" 2>/dev/null || echo "No logs directory found"
# Create archive
echo "🗜️ Creating archive..."
cd "$BACKUP_DIR"
tar -czf "$BACKUP_NAME.tar.gz" "$BACKUP_NAME"
rm -rf "$BACKUP_NAME"
echo "✅ Backup completed: $BACKUP_DIR/$BACKUP_NAME.tar.gz"
# Cleanup old backups (keep last 7 days)
find "$BACKUP_DIR" -name "deepwiki_backup_*.tar.gz" -mtime +7 -delete
echo "🧹 Cleaned up old backups"
}
# Restore function
restore_backup() {
if [ -z "$1" ]; then
echo "Usage: $0 restore <backup-file>"
exit 1
fi
BACKUP_FILE="$1"
if [ ! -f "$BACKUP_FILE" ]; then
echo "❌ Backup file not found: $BACKUP_FILE"
exit 1
fi
echo "🔄 Restoring from backup: $BACKUP_FILE"
# Extract backup
TEMP_DIR=$(mktemp -d)
tar -xzf "$BACKUP_FILE" -C "$TEMP_DIR"
# Restore configuration
echo "📁 Restoring configuration..."
cp -r "$TEMP_DIR"/*/config api/ 2>/dev/null || echo "No config backup found"
cp "$TEMP_DIR"/*/.env . 2>/dev/null || echo "No .env backup found"
# Restore wiki cache
echo "📚 Restoring wiki cache..."
mkdir -p "$HOME/.adalflow"
cp -r "$TEMP_DIR"/*/wikicache "$HOME/.adalflow/" 2>/dev/null || echo "No wikicache backup found"
# Restore databases
echo "🗄️ Restoring databases..."
cp -r "$TEMP_DIR"/*/databases "$HOME/.adalflow/" 2>/dev/null || echo "No databases backup found"
# Cleanup
rm -rf "$TEMP_DIR"
echo "✅ Restore completed"
}
# Main script
case "$1" in
"backup")
create_backup
;;
"restore")
restore_backup "$2"
;;
*)
echo "Usage: $0 {backup|restore <backup-file>}"
echo "Example: $0 backup"
echo "Example: $0 restore ~/deepwiki-backups/deepwiki_backup_20231201_120000.tar.gz"
exit 1
;;
esac
```
### 9.2 Maintenance Tasks
#### Database Cleanup Script
Create `scripts/maintenance.py`:
```python
#!/usr/bin/env python3
import os
import shutil
import glob
from datetime import datetime, timedelta
from pathlib import Path
def cleanup_old_repositories(days_old=30):
"""Remove repositories older than specified days"""
repos_dir = Path.home() / ".adalflow" / "repos"
if not repos_dir.exists():
print("No repositories directory found")
return
cutoff_date = datetime.now() - timedelta(days=days_old)
cleaned_count = 0
for repo_dir in repos_dir.iterdir():
if repo_dir.is_dir():
mod_time = datetime.fromtimestamp(repo_dir.stat().st_mtime)
if mod_time < cutoff_date:
print(f"Removing old repository: {repo_dir.name}")
shutil.rmtree(repo_dir)
cleaned_count += 1
print(f"Cleaned up {cleaned_count} old repositories")
def cleanup_old_wikis(days_old=30):
"""Remove wiki cache older than specified days"""
wiki_dir = Path.home() / ".adalflow" / "wikicache"
if not wiki_dir.exists():
print("No wiki cache directory found")
return
cutoff_date = datetime.now() - timedelta(days=days_old)
cleaned_count = 0
for wiki_file in wiki_dir.glob("*.json"):
mod_time = datetime.fromtimestamp(wiki_file.stat().st_mtime)
if mod_time < cutoff_date:
print(f"Removing old wiki: {wiki_file.name}")
wiki_file.unlink()
cleaned_count += 1
print(f"Cleaned up {cleaned_count} old wiki files")
def cleanup_logs(days_old=7):
"""Remove log files older than specified days"""
logs_dir = Path("logs")
if not logs_dir.exists():
print("No logs directory found")
return
cutoff_date = datetime.now() - timedelta(days=days_old)
cleaned_count = 0
for log_file in logs_dir.glob("*.log*"):
if log_file.is_file():
mod_time = datetime.fromtimestamp(log_file.stat().st_mtime)
if mod_time < cutoff_date:
print(f"Removing old log: {log_file.name}")
log_file.unlink()
cleaned_count += 1
print(f"Cleaned up {cleaned_count} old log files")
def optimize_vector_databases():
"""Optimize vector databases by removing unused indexes"""
db_dir = Path.home() / ".adalflow" / "databases"
if not db_dir.exists():
print("No databases directory found")
return
repos_dir = Path.home() / ".adalflow" / "repos"
active_repos = set()
if repos_dir.exists():
active_repos = {repo.name for repo in repos_dir.iterdir() if repo.is_dir()}
cleaned_count = 0
for db_dir_item in db_dir.iterdir():
if db_dir_item.is_dir() and db_dir_item.name not in active_repos:
print(f"Removing unused database: {db_dir_item.name}")
shutil.rmtree(db_dir_item)
cleaned_count += 1
print(f"Cleaned up {cleaned_count} unused databases")
def main():
print(f"🧹 Starting maintenance tasks at {datetime.now()}")
try:
cleanup_old_repositories(30)
cleanup_old_wikis(30)
cleanup_logs(7)
optimize_vector_databases()
print("✅ Maintenance tasks completed successfully")
except Exception as e:
print(f"❌ Error during maintenance: {e}")
if __name__ == "__main__":
main()
```
#### Automated Maintenance with Cron
Add to crontab (`crontab -e`):
```bash
# Daily maintenance at 2 AM
0 2 * * * /path/to/deepwiki-open/scripts/maintenance.py >> /path/to/deepwiki-open/logs/maintenance.log 2>&1
# Weekly backup on Sundays at 3 AM
0 3 * * 0 /path/to/deepwiki-open/scripts/backup.sh backup >> /path/to/deepwiki-open/logs/backup.log 2>&1
# Performance monitoring (started once at boot; the script loops every minute internally)
@reboot /path/to/deepwiki-open/scripts/performance_monitor.sh >> /path/to/deepwiki-open/logs/performance.log 2>&1
```
## 10. Troubleshooting
### 10.1 Common Issues and Solutions
#### Python Environment Issues
```bash
# Issue: ModuleNotFoundError
# Solution: Verify virtual environment activation
which python
pip list | grep fastapi
# Issue: Permission denied
# Solution: Check file permissions
chmod +x scripts/*.sh
chmod +x scripts/*.py
# Issue: Port already in use
# Solution: Find and kill process
lsof -ti:8001 | xargs kill -9
lsof -ti:3000 | xargs kill -9
```
#### Node.js Issues
```bash
# Issue: npm ERR! permission denied
# Solution: Use nvm or fix npm permissions
npm config set prefix '~/.npm-global'
export PATH=~/.npm-global/bin:$PATH
# Issue: Module not found
# Solution: Clear cache and reinstall
rm -rf node_modules package-lock.json
npm cache clean --force
npm install
```
#### API Connection Issues
```bash
# Check if services are running
curl -I http://localhost:8001/health
curl -I http://localhost:3000
# Check firewall settings
# Ubuntu/Debian
sudo ufw status
sudo ufw allow 8001
sudo ufw allow 3000
# CentOS/RHEL
sudo firewall-cmd --list-ports
sudo firewall-cmd --add-port=8001/tcp --permanent
sudo firewall-cmd --add-port=3000/tcp --permanent
sudo firewall-cmd --reload
```
### 10.2 Performance Optimization
#### System Optimization
```bash
# Increase file descriptor limits
echo "* soft nofile 65536" | sudo tee -a /etc/security/limits.conf
echo "* hard nofile 65536" | sudo tee -a /etc/security/limits.conf
# Optimize Python performance
export PYTHONUNBUFFERED=1
export PYTHONDONTWRITEBYTECODE=1
# Node.js optimization
export NODE_OPTIONS="--max-old-space-size=4096"
```
#### Application Optimization
Edit `api/main.py` for production optimizations:
```python
import uvicorn
if __name__ == "__main__":
uvicorn.run(
"api.api:app",
host="0.0.0.0",
port=8001,
workers=4, # Adjust based on CPU cores
loop="uvloop", # Performance improvement
http="httptools", # Performance improvement
access_log=False, # Disable in production
server_header=False, # Security
date_header=False, # Performance
)
```
## 11. Security Considerations
### 11.1 API Security
#### Rate Limiting
Add to `api/api.py`:
```python
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
@app.get("/api/wiki/generate")
@limiter.limit("5/minute")
async def generate_wiki(request: Request, ...):
# Implementation
pass
```
#### Input Validation
```python
from typing import Optional
from pydantic import BaseModel, validator
import re
class RepositoryRequest(BaseModel):
repo_url: str
access_token: Optional[str] = None
@validator('repo_url')
def validate_repo_url(cls, v):
pattern = r'^https?://(github\.com|gitlab\.com|bitbucket\.org)/[\w\-\.]+/[\w\-\.]+/?$'
if not re.match(pattern, v):
raise ValueError('Invalid repository URL format')
return v
```
### 11.2 Environment Security
```bash
# Secure .env file
chmod 600 .env
# Use environment-specific configurations
# Development
export DEEPWIKI_ENV=development
# Production
export DEEPWIKI_ENV=production
```
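The `DEEPWIKI_ENV` switch above can drive a small settings lookup so that development and production never share defaults. A minimal sketch (the profile names and fields here are illustrative, not DeepWiki's actual configuration):

```python
import os

# Hypothetical per-environment defaults -- adjust to your deployment.
PROFILES = {
    "development": {"debug": True, "workers": 1, "access_log": True},
    "production": {"debug": False, "workers": 4, "access_log": False},
}

def load_profile(env=None):
    """Pick a settings profile from DEEPWIKI_ENV, defaulting to development."""
    name = env or os.environ.get("DEEPWIKI_ENV", "development")
    if name not in PROFILES:
        raise ValueError(f"Unknown DEEPWIKI_ENV: {name!r}")
    return PROFILES[name]

print(load_profile("production")["workers"])  # 4
```

Failing loudly on an unknown environment name is deliberate: a typo in `DEEPWIKI_ENV` should never silently fall back to debug settings in production.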
## 12. Advanced Configuration
### 12.1 Custom Model Configurations
Edit `api/config/generator.json`:
```json
{
"providers": {
"google": {
"default_model": "gemini-2.0-flash",
"models": ["gemini-2.0-flash", "gemini-1.5-flash", "gemini-1.0-pro"],
"api_base": "https://generativelanguage.googleapis.com/v1beta",
"parameters": {
"temperature": 0.7,
"top_p": 0.9,
"max_tokens": 8192
}
},
"openai": {
"default_model": "gpt-4o",
"models": ["gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo"],
"api_base": "https://api.openai.com/v1",
"parameters": {
"temperature": 0.7,
"top_p": 1.0,
"max_tokens": 4096
}
}
}
}
```
### 12.2 Custom Embedding Configuration
Edit `api/config/embedder.json`:
```json
{
"embedder": {
"provider": "openai",
"model": "text-embedding-ada-002",
"dimensions": 1536,
"batch_size": 100
},
"retriever": {
"similarity_top_k": 5,
"similarity_threshold": 0.7,
"vector_store_type": "faiss",
"index_type": "IndexFlatIP"
},
"text_splitter": {
"type": "recursive_character",
"chunk_size": 1000,
"chunk_overlap": 200,
"separators": ["\n\n", "\n", " ", ""]
}
}
```
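Before restarting the API after editing these files, a quick sanity check can catch obvious mistakes such as a chunk overlap larger than the chunk size. A minimal sketch (the required sections mirror the JSON above; the checks themselves are illustrative, not part of DeepWiki):

```python
import json

# Inline copy of the embedder.json shape shown above; in practice you would
# json.load() the file from api/config/embedder.json instead.
RAW = """
{
  "embedder": {"provider": "openai", "model": "text-embedding-ada-002",
               "dimensions": 1536, "batch_size": 100},
  "retriever": {"similarity_top_k": 5, "similarity_threshold": 0.7},
  "text_splitter": {"chunk_size": 1000, "chunk_overlap": 200}
}
"""

def validate_embedder_config(cfg):
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems = []
    for section in ("embedder", "retriever", "text_splitter"):
        if section not in cfg:
            problems.append(f"missing section: {section}")
    splitter = cfg.get("text_splitter", {})
    if splitter.get("chunk_overlap", 0) >= splitter.get("chunk_size", 1):
        problems.append("chunk_overlap must be smaller than chunk_size")
    return problems

cfg = json.loads(RAW)
print(validate_embedder_config(cfg))  # [] -> no problems found
```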
## Conclusion
This manual setup guide provides comprehensive control over your DeepWiki-Open installation. The manual approach offers:
* **Full Control**: Complete visibility into every component and configuration
* **Customization**: Ability to modify any aspect of the system
* **Debugging**: Direct access to logs and processes for troubleshooting
* **Performance Tuning**: Fine-grained control over resource allocation
* **Security**: Implementation of custom security measures
Choose the components and configurations that best fit your development workflow and production requirements. Regular maintenance and monitoring will ensure optimal performance and reliability of your DeepWiki-Open installation.
For additional support, refer to the project's GitHub repository or community forums.
# Mermaid Diagrams in DeepWiki
Source: https://asyncfunc.mintlify.app/guides/mermaid-diagrams
Comprehensive guide to creating, customizing, and managing Mermaid diagrams in DeepWiki documentation
DeepWiki leverages the power of Mermaid to automatically generate beautiful, interactive diagrams from your codebase and documentation. This guide covers everything you need to know about working with Mermaid diagrams in DeepWiki.
## What are Mermaid Diagrams?
Mermaid is a JavaScript-based diagramming tool that renders markdown-inspired text definitions to create diagrams dynamically. In DeepWiki, these diagrams help visualize:
* Code architecture and relationships
* Data flow and process workflows
* System interactions and dependencies
* Database schemas and entity relationships
* Project timelines and milestones
### Benefits of Mermaid Diagrams
* **Version Control Friendly**: Text-based definitions that work with Git
* **Automatically Generated**: DeepWiki creates diagrams from your code structure
* **Interactive**: Zoom, pan, and click functionality
* **Customizable**: Extensive styling and theming options
* **Lightweight**: No external image files needed
* **Accessible**: Screen reader compatible with proper markup
## Types of Diagrams Generated
DeepWiki automatically generates various diagram types based on your codebase analysis:
### 1. Flowcharts
Perfect for visualizing code logic, decision trees, and process flows.
```mermaid
graph TD
A[Start] --> B{Is user authenticated?}
B -->|Yes| C[Load Dashboard]
B -->|No| D[Show Login Form]
D --> E[Validate Credentials]
E -->|Valid| F[Create Session]
E -->|Invalid| G[Show Error]
F --> C
G --> D
C --> H[End]
```
### 2. Sequence Diagrams
Ideal for API interactions, user flows, and system communications.
```mermaid
sequenceDiagram
participant User
participant Frontend
participant API
participant Database
User->>Frontend: Submit Form
Frontend->>API: POST /api/users
API->>Database: INSERT user data
Database-->>API: Success response
API-->>Frontend: User created (201)
Frontend-->>User: Success message
```
### 3. Class Diagrams
Generated from your codebase to show object-oriented relationships.
```mermaid
classDiagram
class User {
-String id
-String email
-String password
+login()
+logout()
+updateProfile()
}
class Admin {
-List~Permission~ permissions
+manageUsers()
+viewAnalytics()
}
class Article {
-String title
-String content
-Date publishedAt
+publish()
+archive()
}
User <|-- Admin
User "1" --> "*" Article : creates
```
### 4. Entity Relationship Diagrams
Database schema visualization from your models and migrations.
```mermaid
erDiagram
USERS {
uuid id PK
string email UK
string password_hash
timestamp created_at
timestamp updated_at
}
ARTICLES {
uuid id PK
uuid author_id FK
string title
text content
enum status
timestamp published_at
timestamp created_at
timestamp updated_at
}
COMMENTS {
uuid id PK
uuid article_id FK
uuid user_id FK
text content
timestamp created_at
}
USERS ||--o{ ARTICLES : writes
ARTICLES ||--o{ COMMENTS : has
USERS ||--o{ COMMENTS : writes
```
### 5. State Diagrams
System state transitions and workflow states.
```mermaid
stateDiagram-v2
[*] --> Draft
Draft --> Review : submit
Review --> Approved : approve
Review --> Rejected : reject
Review --> Draft : request_changes
Approved --> Published : publish
Published --> Archived : archive
Rejected --> Draft : revise
Archived --> [*]
```
### 6. Git Graphs
Version control branch visualization.
```mermaid
gitGraph
commit
commit
branch feature
checkout feature
commit
commit
checkout main
commit
merge feature
commit
branch hotfix
checkout hotfix
commit
checkout main
merge hotfix
```
### 7. Gantt Charts
Project timelines and development schedules.
```mermaid
gantt
title DeepWiki Development Timeline
dateFormat YYYY-MM-DD
section Phase 1
Research & Planning :a1, 2024-01-01, 30d
Design Architecture :a2, after a1, 20d
section Phase 2
Core Development :b1, after a2, 45d
Testing & QA :b2, after b1, 15d
section Phase 3
Documentation :c1, after b2, 10d
Deployment :c2, after c1, 5d
```
## Automatic Diagram Generation
DeepWiki's AI analyzes your codebase and automatically generates appropriate diagrams:
### Code Analysis Process
1. **File Structure Analysis**: Scans directories, files, and imports
2. **Dependency Mapping**: Identifies relationships between modules
3. **Data Flow Detection**: Traces function calls and data transformations
4. **Pattern Recognition**: Identifies common architectural patterns
5. **Diagram Selection**: Chooses optimal diagram types for each context
### Automatic Generation Features
* **Smart Context Detection**: Different diagrams for different code contexts
* **Relationship Inference**: Automatically detects class inheritance and composition
* **API Flow Mapping**: Generates sequence diagrams from API endpoint analysis
* **Database Schema Extraction**: Creates ERDs from model definitions
* **Workflow Recognition**: Identifies state machines and business processes
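As a rough illustration of the diagram-selection step, a generator can map source files to likely diagram types before any deeper analysis. This is a toy heuristic, not DeepWiki's actual implementation:

```python
def suggest_diagram_type(path):
    """Map a source path to a plausible Mermaid diagram type (toy heuristic)."""
    name = path.lower()
    if "migration" in name or "models" in name or name.endswith(".sql"):
        return "erDiagram"        # schema files -> entity relationship diagram
    if "routes" in name or "api" in name:
        return "sequenceDiagram"  # endpoint code -> request/response flow
    if name.endswith((".java", ".cs")) or "class" in name:
        return "classDiagram"     # OO code -> class relationships
    return "flowchart"            # default: control/process flow

print(suggest_diagram_type("db/migrations/001_create_users.sql"))  # erDiagram
```

A real analyzer would inspect file contents and imports rather than paths, but the shape of the decision is the same.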
## Interactive Features
DeepWiki's Mermaid diagrams are fully interactive:
### Navigation Controls
* **Zoom**: Mouse wheel or pinch gestures
* **Pan**: Click and drag to move around large diagrams
* **Reset**: Double-click to return to original view
* **Full Screen**: Expand diagrams for detailed viewing
### Click Interactions
```mermaid
graph LR
A[Component A] --> B[Component B]
B --> C[Component C]
C --> D[Component D]
click A "https://github.com/user/repo/blob/main/src/ComponentA.js"
click B "https://github.com/user/repo/blob/main/src/ComponentB.js"
click C "https://github.com/user/repo/blob/main/src/ComponentC.js"
click D "https://github.com/user/repo/blob/main/src/ComponentD.js"
```
* **Source Code Links**: Click nodes to jump to corresponding code
* **Documentation Links**: Navigate to related documentation sections
* **External Resources**: Link to external APIs, services, or documentation
## Diagram Customization
### Theme Configuration
DeepWiki supports multiple built-in themes and custom styling:
```javascript
// Custom theme configuration
{
"theme": "dark",
"themeVariables": {
"primaryColor": "#ff6b6b",
"primaryTextColor": "#ffffff",
"primaryBorderColor": "#ff4757",
"lineColor": "#70a1ff",
"sectionBkgColor": "#2f3542",
"altSectionBkgColor": "#57606f",
"gridColor": "#3d4454",
"secondaryColor": "#7bed9f",
"tertiaryColor": "#5352ed"
}
}
```
### Style Customization
Apply custom CSS classes to diagram elements:
```mermaid
graph TD
A[Normal Node]
B[Highlighted Node]:::highlight
C[Warning Node]:::warning
D[Success Node]:::success
classDef highlight fill:#ff6b6b,stroke:#ff4757,color:#fff
classDef warning fill:#ffa502,stroke:#ff8c00,color:#fff
classDef success fill:#2ed573,stroke:#26de81,color:#fff
```
### Advanced Styling Options
* **Custom CSS Classes**: Define reusable styles
* **Node Shapes**: Choose from various shapes (rectangles, circles, diamonds)
* **Edge Styles**: Customize line types (solid, dashed, dotted)
* **Color Schemes**: Brand-specific color palettes
* **Font Customization**: Typography and sizing options
## Troubleshooting Diagram Rendering
### Common Issues and Solutions
#### 1. Diagram Not Rendering
**Symptoms**: Blank space where diagram should appear
**Solutions**:
* Check Mermaid syntax for errors
* Verify JavaScript is enabled in browser
* Clear browser cache and refresh
* Check browser console for error messages
#### 2. Syntax Errors
**Symptoms**: Error messages in diagram area
**Common fixes** (note that Mermaid comments use `%%`, not `//`):
```mermaid
graph TD
%% ❌ "A -> B" fails: -> is not a valid flowchart arrow
%% ✅ Use --> instead
A --> B
```
#### 3. Performance Issues with Large Diagrams
**Symptoms**: Slow loading or browser freezing
**Solutions**:
* Break large diagrams into smaller sections
* Use subgraphs to organize complex diagrams
* Implement lazy loading for diagram-heavy pages
* Consider static image generation for very complex diagrams
#### 4. Mobile Rendering Issues
**Symptoms**: Diagrams too small or not responsive
**Solutions**:
* Use responsive container sizing
* Implement touch-friendly zoom controls
* Consider simplified mobile versions
* Test on various device sizes
### Debug Mode
Enable debug mode for detailed error information:
```javascript
// Enable Mermaid debug mode
mermaid.initialize({
startOnLoad: true,
logLevel: 'debug',
securityLevel: 'loose'
});
```
## Manual Diagram Editing
### Inline Diagram Creation
Add custom diagrams directly in your documentation:
````markdown
## System Architecture
```mermaid
graph TB
subgraph "Frontend"
UI[React UI]
Store[Redux Store]
end
subgraph "Backend"
API[Express API]
Auth[Auth Service]
DB[(Database)]
end
UI --> Store
Store --> API
API --> Auth
API --> DB
```
````
### Dynamic Diagram Generation
Use DeepWiki's API to generate diagrams programmatically:
```javascript
// Generate diagram from code analysis
const diagram = await deepwiki.generateDiagram({
type: 'flowchart',
source: 'src/components/',
options: {
theme: 'dark',
direction: 'TD'
}
});
```
### Diagram Templates
Create reusable diagram templates:
```yaml
# diagram-templates.yaml
api_flow:
type: sequence
participants:
- Client
- API Gateway
- Microservice
- Database
interactions:
- Client->API Gateway: Request
- API Gateway->Microservice: Forward
- Microservice->Database: Query
- Database-->Microservice: Result
- Microservice-->API Gateway: Response
- API Gateway-->Client: Response
```
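A template like this expands into a Mermaid definition with only a few lines of code. A sketch, using a plain dict in place of the parsed YAML (loading the file itself would need a YAML parser such as PyYAML; the rendering function is illustrative, not a DeepWiki API):

```python
# Parsed form of the api_flow template above (a YAML loader would produce this dict).
TEMPLATE = {
    "type": "sequence",
    "participants": ["Client", "API Gateway", "Microservice", "Database"],
    "interactions": [
        "Client->API Gateway: Request",
        "API Gateway->Microservice: Forward",
        "Microservice->Database: Query",
        "Database-->Microservice: Result",
        "Microservice-->API Gateway: Response",
        "API Gateway-->Client: Response",
    ],
}

def render_sequence(template):
    """Expand a template dict into Mermaid sequenceDiagram source."""
    lines = ["sequenceDiagram"]
    lines += [f"    participant {p}" for p in template["participants"]]
    # Template arrows map onto Mermaid's ->> (solid) and -->> (dashed reply);
    # a single replace turns "->" into "->>" and "-->" into "-->>".
    for step in template["interactions"]:
        lines.append("    " + step.replace("->", "->>"))
    return "\n".join(lines)

print(render_sequence(TEMPLATE))
```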
## Exporting Diagrams
### SVG Export
Export diagrams as scalable SVG files:
```javascript
// Export current diagram as SVG (mermaid.render returns a Promise in Mermaid v10+)
const { svg } = await mermaid.render('diagram-id', diagramDefinition);
const svgBlob = new Blob([svg], {type: 'image/svg+xml'});
const downloadUrl = URL.createObjectURL(svgBlob);
```
### PNG/JPEG Export
Convert to raster formats for presentations:
```javascript
// Convert an SVG string (svgData, e.g. from mermaid.render) to PNG using a canvas
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
const img = new Image();
img.onload = () => {
canvas.width = img.width;
canvas.height = img.height;
ctx.drawImage(img, 0, 0);
const pngUrl = canvas.toDataURL('image/png');
};
img.src = 'data:image/svg+xml;base64,' + btoa(svgData);
```
### PDF Integration
Include diagrams in PDF documentation:
```javascript
// Generate PDF with embedded diagrams (addSVG requires a jsPDF SVG plugin)
const pdf = new jsPDF();
const { svg: diagramSvg } = await mermaid.render('diagram', definition);
pdf.addSVG(diagramSvg, 10, 10, 190, 100);
pdf.save('documentation.pdf');
```
## Best Practices for Diagram-Heavy Documentation
### 1. Strategic Diagram Placement
* **Overview First**: Start with high-level architecture diagrams
* **Progressive Detail**: Follow with detailed component diagrams
* **Context Switching**: Use diagrams to transition between topics
* **Reference Points**: Include diagrams as visual anchors
### 2. Diagram Organization
```markdown
## Architecture Overview
### Frontend Architecture
### Backend Services
### Data Flow
### Database Schema
```
### 3. Accessibility Considerations
* **Alt Text**: Provide descriptive text alternatives
* **High Contrast**: Ensure sufficient color contrast
* **Keyboard Navigation**: Support keyboard-only users
* **Screen Reader Support**: Include semantic markup
```mermaid
graph TD
A[Start Process]
B{Decision Point}
C[Path A]
D[Path B]
E[End Process]
A --> B
B -->|Condition 1| C
B -->|Condition 2| D
C --> E
D --> E
%% Accessibility improvements
A -.- AA["`**Start**: Initialize the process with user input`"]
B -.- BB["`**Decision**: Evaluate user permissions and route accordingly`"]
E -.- EE["`**End**: Complete process and notify user`"]
```
### 4. Performance Optimization
* **Lazy Loading**: Load diagrams on demand
* **Caching**: Cache generated diagrams
* **Compression**: Optimize SVG output
* **Progressive Enhancement**: Provide fallback text descriptions
### 5. Version Control Best Practices
* **Atomic Commits**: Include diagram updates with related code changes
* **Meaningful Names**: Use descriptive diagram IDs and titles
* **Documentation**: Comment complex diagram definitions
* **Review Process**: Include diagrams in code review workflows
## Advanced Examples
### Complex System Architecture
```mermaid
graph TB
subgraph "Client Layer"
Web[Web App]
Mobile[Mobile App]
API_GW[API Gateway]
end
subgraph "Service Layer"
Auth[Auth Service]
User[User Service]
Content[Content Service]
Analytics[Analytics Service]
end
subgraph "Data Layer"
UserDB[(User DB)]
ContentDB[(Content DB)]
Cache[(Redis Cache)]
Queue[(Message Queue)]
end
subgraph "External Services"
CDN[CDN]
Email[Email Service]
Payment[Payment Gateway]
end
Web --> API_GW
Mobile --> API_GW
API_GW --> Auth
API_GW --> User
API_GW --> Content
API_GW --> Analytics
Auth --> UserDB
User --> UserDB
Content --> ContentDB
Analytics --> Queue
User --> Cache
Content --> Cache
Content --> CDN
User --> Email
User --> Payment
style Auth fill:#ff9999
style User fill:#99ff99
style Content fill:#9999ff
style Analytics fill:#ffff99
```
### Database Relationship Diagram
```mermaid
erDiagram
ORGANIZATION {
uuid id PK
string name
string slug UK
text description
timestamp created_at
timestamp updated_at
}
USERS {
uuid id PK
uuid organization_id FK
string email UK
string name
enum role
timestamp last_login
timestamp created_at
timestamp updated_at
}
PROJECTS {
uuid id PK
uuid organization_id FK
uuid owner_id FK
string name
string repository_url
enum status
json config
timestamp last_processed
timestamp created_at
timestamp updated_at
}
WIKI_PAGES {
uuid id PK
uuid project_id FK
string path
string title
text content
text summary
json metadata
timestamp generated_at
timestamp updated_at
}
DIAGRAMS {
uuid id PK
uuid page_id FK
string type
text definition
json config
string svg_data
timestamp generated_at
}
ORGANIZATION ||--o{ USERS : contains
ORGANIZATION ||--o{ PROJECTS : owns
USERS ||--o{ PROJECTS : manages
PROJECTS ||--o{ WIKI_PAGES : contains
WIKI_PAGES ||--o{ DIAGRAMS : includes
```
### API Interaction Flow
```mermaid
sequenceDiagram
participant Client
participant Gateway
participant Auth
participant Wiki
participant AI
participant Storage
Note over Client,Storage: Wiki Generation Request
Client->>Gateway: POST /api/wiki/generate
Gateway->>Auth: Validate token
Auth-->>Gateway: User info
Gateway->>Wiki: Process repository
Wiki->>Storage: Fetch repository data
Storage-->>Wiki: Repository files
Wiki->>AI: Analyze codebase
AI-->>Wiki: Structure analysis
Wiki->>AI: Generate content
AI-->>Wiki: Wiki pages
Wiki->>Storage: Save pages
Wiki->>Wiki: Generate diagrams
Wiki-->>Gateway: Generation complete
Gateway-->>Client: Success response
Note over Client,Storage: Real-time Updates
loop Real-time updates
Wiki->>Client: WebSocket update
Client->>Client: Update UI
end
```
## Conclusion
Mermaid diagrams in DeepWiki provide a powerful way to visualize and communicate complex software architectures, processes, and relationships. By leveraging automatic generation, interactive features, and extensive customization options, you can create documentation that is both informative and engaging.
Remember to:
* Use appropriate diagram types for different contexts
* Optimize for performance with large documentation sites
* Consider accessibility in diagram design
* Maintain diagrams as living documentation alongside your code
* Leverage automation while retaining the flexibility for manual customization
With these tools and techniques, you can create documentation that truly brings your codebase to life through visual storytelling.
# Private Repository Access
Source: https://asyncfunc.mintlify.app/guides/private-repositories
Complete guide to accessing private repositories on GitHub, GitLab, and BitBucket with personal access tokens, security best practices, and troubleshooting
DeepWiki-Open can access private repositories across multiple platforms using personal access tokens. This comprehensive guide covers token creation, management, security best practices, and troubleshooting for GitHub, GitLab, and BitBucket.
## Overview
Private repository access requires authentication tokens that prove your authorization to access the repository content. DeepWiki supports multiple platforms and token types with different permission models.
Classic and fine-grained personal access tokens with repository scope
Personal access tokens with read_repository scope
App passwords with repository read permissions
Organization and enterprise-specific considerations
## GitHub Access Tokens
GitHub offers two types of personal access tokens with different scopes and capabilities.
### Classic Personal Access Tokens
1. Go to GitHub.com and sign in
2. Click your profile picture → **Settings**
3. In the left sidebar, click **Developer settings**
4. Click **Personal access tokens** → **Tokens (classic)**
Classic tokens provide broad access but are easier to set up for multiple repositories.
1. Click **Generate new token** → **Generate new token (classic)**
2. Enter a descriptive note (e.g., "DeepWiki Documentation Access")
3. Set expiration (recommended: 90 days for security)
4. Select scopes based on your needs:
**Required scope:** `repo`
* Full control of private repositories
* Includes read/write access to code, issues, pull requests
* **Use when:** Accessing private repositories you own or have been granted access to
**Required scope:** `public_repo`
* Access to public repositories only
* More limited permissions
* **Use when:** Only accessing public repositories (DeepWiki works without tokens for public repos)
**Additional scope:** `read:org`
* Read organization membership
* Required for some organization private repositories
* **Use when:** Accessing private repositories owned by organizations
1. Click **Generate token**
2. **Important:** Copy the token immediately - you won't see it again
3. Store securely (see security best practices below)
GitHub classic tokens start with `ghp_` and are 40 characters long. Never share or commit tokens to code repositories.
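Given the prefix and length noted above, a lightweight format check can catch obviously malformed tokens before any API call is made. Token formats can change over time, so treat this as a first-pass check only (the function is illustrative, not part of DeepWiki):

```python
def looks_like_github_classic_token(token):
    """Heuristic check: classic tokens are 'ghp_' plus 36 characters (40 total)."""
    return (
        token.startswith("ghp_")
        and len(token) == 40
        and token[4:].isalnum()
    )

print(looks_like_github_classic_token("ghp_" + "A" * 36))  # True
print(looks_like_github_classic_token("glpat-something"))  # False
```

The only authoritative validation is an actual authenticated request (e.g. to `/user`), but failing fast on a truncated paste saves a round trip.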
### Fine-Grained Personal Access Tokens (Beta)
For more granular control over repository access:
1. In **Developer settings** → **Personal access tokens**
2. Click **Fine-grained tokens** → **Generate new token**
3. Configure token details:
* **Token name:** Descriptive name
* **Expiration:** 90 days recommended
* **Resource owner:** Select your account or organization
Choose repository access level:
**Best for:** Specific repositories
* Click "Selected repositories"
* Choose specific repositories from dropdown
* More secure, limited scope
**Permissions needed:**
* **Repository permissions:** Contents (Read)
* **Metadata:** Read
**Best for:** Multiple repositories
* Select "All repositories" under resource owner
* Grants access to all current and future repositories
* Less secure but more convenient
**Permissions needed:**
* **Repository permissions:** Contents (Read)
* **Metadata:** Read
Set minimum required permissions:
```json
{
"Contents": "Read",
"Metadata": "Read",
"Pull requests": "Read",
"Issues": "Read"
}
```
Fine-grained tokens provide better security through specific repository and permission selection.
### GitHub Enterprise
For GitHub Enterprise Server instances:
**Token creation process:**
1. Navigate to your GitHub Enterprise instance
2. Follow same steps as GitHub.com
3. Ensure your DeepWiki instance can reach enterprise server
**Additional considerations:**
* Network connectivity requirements
* Certificate trust for self-signed certificates
* Custom API endpoints configuration
```javascript
{
"repo_url": "https://github.enterprise.com/company/private-repo",
"access_token": "ghp_xxxxxxxxxxxxxxxxxxxx",
"enterprise_base_url": "https://github.enterprise.com"
}
```
**With SAML SSO enabled:**
1. Create personal access token normally
2. Authorize token for SSO:
* Go to token settings
* Click "Configure SSO" next to your organization
* Click "Authorize"
**Enterprise considerations:**
* Organization policies may restrict token creation
* Admin approval may be required
* Token expiration policies may be enforced
## GitLab Access Tokens
GitLab uses personal access tokens with specific scope-based permissions.
### Personal Access Token Creation
1. Sign in to GitLab.com or your GitLab instance
2. Click your avatar → **Edit profile**
3. In the left sidebar, click **Access Tokens**
GitLab tokens are more granular than GitHub classic tokens, allowing precise permission control.
1. Click **Add new token**
2. Configure token settings:
* **Token name:** Descriptive name (e.g., "DeepWiki Access")
* **Expiration date:** Set appropriate expiration
* **Select scopes:** Choose required permissions
**For private repository access:**
* ✅ `read_repository` - Clone and pull from repositories
* ⚠️ `read_user` - Read user information (optional)
* ⚠️ `read_api` - Read API access (if using API features)
**Avoid unnecessary scopes:**
* ❌ `write_repository` - Not needed for documentation
* ❌ `api` - Full API access (too broad)
**For group projects:**
* Add `read_repository` scope
* Ensure you're a group member
* Check group access policies
**For specific projects:**
* Token automatically inherits project access
* No additional configuration needed
* Respects project visibility settings
1. Click **Create personal access token**
2. Copy the generated token immediately
3. Store securely with appropriate labels
GitLab tokens start with `glpat-` followed by 20 characters. They cannot be viewed again after creation.
### GitLab Self-Managed
For self-hosted GitLab instances:
**Configuration requirements:**
* Same token creation process
* Verify network connectivity to your GitLab instance
* Check SSL certificate configuration
```javascript
{
"repo_url": "https://gitlab.company.com/team/private-project",
"access_token": "glpat-xxxxxxxxxxxxxxxxxxxx",
"gitlab_base_url": "https://gitlab.company.com"
}
```
**Instance-level settings:**
* Personal access token policies
* Maximum token lifetime
* Required scopes restrictions
* API rate limiting
**Security policies:**
* LDAP/SAML integration impacts
* IP allowlists for API access
* Audit logging for token usage
## BitBucket Access
BitBucket uses app passwords instead of traditional tokens, with different permission models.
### App Password Creation
1. Sign in to BitBucket.org
2. Click your avatar → **Personal BitBucket settings**
3. In the left menu, click **App passwords**
BitBucket app passwords are repository-specific credentials with granular permissions.
1. Click **Create app password**
2. Configure password settings:
* **Label:** Descriptive name (e.g., "DeepWiki Documentation")
* **Permissions:** Select required access levels
**Required permissions:**
* ✅ **Repositories: Read** - Access repository content
* ✅ **Pull requests: Read** - Access PR information (optional)
* ⚠️ **Issues: Read** - Access issues (optional)
**Avoid unnecessary permissions:**
* ❌ **Repositories: Write** - Not needed for documentation
* ❌ **Repositories: Admin** - Administrative access
* ❌ **Account** - Account-level access
**Team repositories:**
* Ensure you have repository access
* App password inherits team permissions
* Verify with team administrators
**Personal repositories:**
* Full access to your own repositories
* No additional configuration needed
1. Click **Create**
2. Copy the generated app password
3. Store with username for authentication
BitBucket app passwords are unique strings (not prefixed). You'll need both your username and app password for authentication.
### BitBucket Server/Data Center
For on-premises BitBucket instances:
**BitBucket Server uses personal access tokens:**
1. Go to your BitBucket Server instance
2. Click your avatar → **Manage account**
3. Click **Personal access tokens**
4. Create token with **Repository read** permission
```javascript
{
  "repo_url": "https://bitbucket.company.com/projects/TEAM/repos/private-repo",
  "access_token": "your-personal-access-token",
  "bitbucket_base_url": "https://bitbucket.company.com"
}
```
**Alternative authentication method:**
* Use username and app password
* Base64 encode credentials
* Less secure than tokens
```javascript
{
  "repo_url": "https://bitbucket.company.com/projects/TEAM/repos/private-repo",
  "username": "your-username",
  "password": "your-app-password"
}
```
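When the username/password pair above is sent to BitBucket directly, it travels as an HTTP Basic `Authorization` header. A short sketch of how that header value is derived (the credentials are illustrative):

```python
import base64

def basic_auth_header(username: str, app_password: str) -> str:
    """Build an HTTP Basic Authorization header from username + app password."""
    credentials = f"{username}:{app_password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

print(basic_auth_header("alice", "app-pass"))  # Basic YWxpY2U6YXBwLXBhc3M=
```

Note that base64 is an encoding, not encryption, which is why this method is less secure than tokens: anyone who captures the header can recover the credentials.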
## Token Security Best Practices
### Secure Token Storage
**Local development:**
```bash
# Use environment files with restricted permissions
echo "GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx" > .env.local
chmod 600 .env.local

# Add environment files to .gitignore
echo ".env*" >> .gitignore
echo "*.token" >> .gitignore
```
**Configuration management:**
```javascript
// Use environment variables
const config = {
  github_token: process.env.GITHUB_TOKEN,
  gitlab_token: process.env.GITLAB_TOKEN,
  bitbucket_token: process.env.BITBUCKET_TOKEN
};

// Never hardcode tokens
// ❌ Bad: const token = "ghp_actual_token_here";
// ✅ Good:
const token = process.env.GITHUB_TOKEN;
```
**Container secrets:**
```bash
# Create a Docker secret from a token file
docker secret create github_token /path/to/token/file
```

```yaml
# docker-compose.yml
version: '3.8'
services:
  deepwiki:
    image: deepwiki-open
    secrets:
      - github_token
    environment:
      - GITHUB_TOKEN_FILE=/run/secrets/github_token

secrets:
  github_token:
    external: true
```
**Kubernetes secrets:**
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: repository-tokens
type: Opaque
data:
  github-token: <base64-encoded-github-token>
  gitlab-token: <base64-encoded-gitlab-token>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepwiki
spec:
  template:
    spec:
      containers:
        - name: deepwiki
          env:
            - name: GITHUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: repository-tokens
                  key: github-token
```
**Automated rotation strategy:**
1. **Generate new tokens** before current ones expire
2. **Test new tokens** in staging environment
3. **Update production** with new tokens
4. **Revoke old tokens** after successful deployment
5. **Monitor for failures** and rollback if needed
**Rotation schedule:**
* **High-security environments:** 30-60 days
* **Standard environments:** 90 days
* **Development environments:** 180 days
```bash
#!/bin/bash
# Token rotation script example
NEW_TOKEN="ghp_new_token_here"
OLD_TOKEN="ghp_old_token_here"

# Test the new token (-f makes curl fail on HTTP errors such as 401)
if curl -sf -H "Authorization: token $NEW_TOKEN" https://api.github.com/user > /dev/null; then
  # Update production
  kubectl patch secret repository-tokens \
    -p='{"data":{"github-token":"'"$(echo -n "$NEW_TOKEN" | base64)"'"}}'
  # Revoke old token (manual step)
  echo "Update successful. Revoke old token: $OLD_TOKEN"
else
  echo "New token validation failed. Aborting rotation."
  exit 1
fi
```
### Access Control
**Minimize token permissions:**
* Use read-only scopes when possible
* Avoid admin or write permissions
* Prefer fine-grained tokens over classic tokens
* Audit token permissions regularly
Review token permissions quarterly to ensure they match current needs.
**Restrict token usage:**
```nginx
# Nginx configuration for IP restrictions
location /api/wiki/generate {
    allow 192.168.1.0/24;  # Internal network
    allow 10.0.0.0/8;      # Private network
    deny  all;             # Block external access

    proxy_pass http://deepwiki-backend;
}
```
**API endpoint protection:**
```python
# Rate limiting by token
from functools import lru_cache

class RateLimitExceeded(Exception):
    """Raised when a token exceeds its request budget."""

# RateLimiter is an application-provided helper (e.g. a sliding-window counter)
@lru_cache(maxsize=1000)  # reuse one limiter instance per token
def get_rate_limit(token: str):
    return RateLimiter(requests_per_minute=10)

def validate_token_request(token: str):
    rate_limiter = get_rate_limit(token)
    if not rate_limiter.allow_request():
        raise RateLimitExceeded("Token rate limit exceeded")
```
**Token usage monitoring:**
```javascript
{
  "token_monitoring": {
    "track_usage": true,
    "alert_on_failures": true,
    "log_access_patterns": true,
    "detect_anomalies": true
  },
  "alerts": {
    "token_expiry_warning": "7d",
    "unusual_usage_pattern": true,
    "failed_authentication": {
      "threshold": 5,
      "window": "1h"
    }
  }
}
```
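A small helper can turn the `token_expiry_warning` window above into an actual check. This sketch assumes expiry timestamps are tracked alongside each token; the shorthand parser only handles the `d`/`h` units used in the config:

```python
from datetime import datetime, timedelta

def parse_window(window: str) -> timedelta:
    """Parse shorthand like '7d' or '1h' into a timedelta."""
    value, unit = int(window[:-1]), window[-1]
    return {"d": timedelta(days=value), "h": timedelta(hours=value)}[unit]

def expiry_warning_due(expires_at: datetime, warning: str = "7d") -> bool:
    """True once a token is within the warning window of its expiry."""
    return datetime.now() >= expires_at - parse_window(warning)
```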
**Security event logging:**
```python
import hashlib
import logging
from datetime import datetime

security_logger = logging.getLogger('security')

def hash_token(token: str) -> str:
    # Never log raw tokens; log a short digest instead
    return hashlib.sha256(token.encode()).hexdigest()[:12]

def log_token_usage(token_hash: str, repo_url: str, success: bool):
    security_logger.info({
        'event': 'token_usage',
        'token_hash': token_hash,
        'repository': repo_url,
        'success': success,
        'timestamp': datetime.now().isoformat(),
        'ip_address': get_client_ip()  # request-context helper (application-provided)
    })
```
## Using Tokens with DeepWiki
### API Integration
**Direct API calls:**
```bash
# GitHub private repository
curl -X POST "http://localhost:8001/wiki/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/company/private-repo",
    "access_token": "ghp_xxxxxxxxxxxxxxxxxxxx",
    "model_provider": "google"
  }'

# GitLab private repository
curl -X POST "http://localhost:8001/wiki/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://gitlab.com/company/private-project",
    "access_token": "glpat-xxxxxxxxxxxxxxxxxxxx",
    "model_provider": "google"
  }'

# BitBucket private repository
curl -X POST "http://localhost:8001/wiki/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://bitbucket.org/company/private-repo",
    "access_token": "your-app-password",
    "username": "your-username",
    "model_provider": "google"
  }'
```
**Python client example:**
```python
import requests
import os

class DeepWikiClient:
    def __init__(self, base_url: str):
        self.base_url = base_url

    def generate_wiki(self, repo_url: str, platform: str = "github"):
        token_env_map = {
            "github": "GITHUB_TOKEN",
            "gitlab": "GITLAB_TOKEN",
            "bitbucket": "BITBUCKET_TOKEN"
        }
        token = os.getenv(token_env_map[platform])
        if not token:
            raise ValueError(f"No token found for {platform}")

        payload = {
            "repo_url": repo_url,
            "access_token": token,
            "model_provider": "google",
            "force_regenerate": False
        }

        # Add username for BitBucket
        if platform == "bitbucket":
            payload["username"] = os.getenv("BITBUCKET_USERNAME")

        response = requests.post(
            f"{self.base_url}/wiki/generate",
            json=payload
        )
        return response.json()

# Usage
client = DeepWikiClient("http://localhost:8001")
result = client.generate_wiki(
    "https://github.com/company/private-repo",
    platform="github"
)
```
**JavaScript client example:**
```javascript
class DeepWikiClient {
  constructor(baseUrl) {
    this.baseUrl = baseUrl;
  }

  async generateWiki(repoUrl, options = {}) {
    const {
      platform = 'github',
      accessToken,
      username,
      modelProvider = 'google',
      forceRegenerate = false
    } = options;

    const payload = {
      repo_url: repoUrl,
      access_token: accessToken,
      model_provider: modelProvider,
      force_regenerate: forceRegenerate
    };

    // Add username for BitBucket
    if (platform === 'bitbucket' && username) {
      payload.username = username;
    }

    const response = await fetch(`${this.baseUrl}/wiki/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload)
    });
    return response.json();
  }
}

// Usage with environment variables
const client = new DeepWikiClient('http://localhost:8001');
const result = await client.generateWiki(
  'https://github.com/company/private-repo',
  {
    platform: 'github',
    accessToken: process.env.GITHUB_TOKEN
  }
);
```
### Web Interface Integration
**Secure token input:**
```javascript
import { useState } from 'react';

function TokenInput({ onTokenChange, platform }) {
  const [token, setToken] = useState('');
  const [showToken, setShowToken] = useState(false);

  const handleTokenChange = (value) => {
    setToken(value);
    onTokenChange(value);
  };

  const tokenPatterns = {
    github: /^(ghp_[a-zA-Z0-9]{36}|github_pat_[a-zA-Z0-9]+)$/,
    gitlab: /^glpat-[a-zA-Z0-9]{20}$/,
    bitbucket: /^[a-zA-Z0-9]+$/
  };
  const isValidToken = tokenPatterns[platform]?.test(token);

  return (
    <div>
      <input
        type={showToken ? 'text' : 'password'}
        value={token}
        onChange={(e) => handleTokenChange(e.target.value)}
      />
      <button type="button" onClick={() => setShowToken(!showToken)}>
        {showToken ? 'Hide' : 'Show'}
      </button>
      {token && !isValidToken && <span>Invalid token format</span>}
    </div>
  );
}
```
## Frontend Architecture
The frontend is built with Next.js 14, leveraging modern React patterns and TypeScript for type safety.
### Component Structure
### Key Frontend Features
1. **Server-Side Rendering (SSR)**
* Pre-renders pages for better SEO and initial load performance
* Dynamic route generation for repository pages
2. **Real-time Communication**
* WebSocket connection for streaming chat responses
* Progressive content rendering
3. **Internationalization (i18n)**
* Support for 10+ languages
* Context-based language switching
4. **Responsive Design**
* Mobile-first approach
* Adaptive layouts for different screen sizes
## Backend Architecture
The backend is built with FastAPI, providing high-performance async capabilities and automatic API documentation.
### API Structure
```mermaid
graph LR
    subgraph "API Endpoints"
        Auth["/auth/*"]
        Chat["/chat/*"]
        Wiki["/api/wiki_cache"]
        Models["/models/config"]
        Export["/export/wiki"]
        Health["/health"]
    end

    subgraph "Core Modules"
        Main[main.py]
        API[api.py]
        RAG_Module[rag.py]
        Pipeline[data_pipeline.py]
    end

    subgraph "Client Adapters"
        OpenAI_Client[openai_client.py]
        Google_Client[gemini_client.py]
        Bedrock_Client[bedrock_client.py]
        Azure_Client[azureai_client.py]
        Ollama_Client[ollama_patch.py]
    end

    API --> Auth
    API --> Chat
    API --> Wiki
    API --> Models
    API --> Export
    Chat --> RAG_Module
    Wiki --> Pipeline
    RAG_Module --> OpenAI_Client
    RAG_Module --> Google_Client
    RAG_Module --> Bedrock_Client
```
### RAG Pipeline Architecture
The Retrieval-Augmented Generation (RAG) pipeline is the core of DeepWiki's intelligence.
```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant API
    participant Cache
    participant RAG
    participant VCS
    participant AI

    User->>Frontend: Request Wiki for Repo
    Frontend->>API: Check Wiki Cache
    API->>Cache: Query Cache

    alt Cache Hit
        Cache-->>API: Return Cached Wiki
        API-->>Frontend: Send Wiki Data
    else Cache Miss
        API->>VCS: Fetch Repository
        VCS-->>API: Repository Data
        API->>RAG: Process Repository
        RAG->>AI: Generate Wiki Structure
        AI-->>RAG: Wiki Structure
        loop For Each Page
            RAG->>AI: Generate Page Content
            AI-->>RAG: Page Content
        end
        RAG-->>API: Complete Wiki
        API->>Cache: Store Wiki
        API-->>Frontend: Send Wiki Data
    end

    Frontend-->>User: Display Wiki
```
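The cache-hit/miss branch in the diagram above can be sketched as a small handler. All names here (`fetch_repository`, `generate_wiki`, the dict-backed cache) are illustrative stand-ins, not DeepWiki's actual module API:

```python
def get_wiki(repo_url: str, cache: dict, fetch_repository, generate_wiki):
    """Cache-first wiki lookup: return the cached wiki or generate and store it."""
    if repo_url in cache:                    # Cache hit: skip fetch and generation
        return cache[repo_url]
    repo_data = fetch_repository(repo_url)   # Cache miss: pull the repository
    wiki = generate_wiki(repo_data)          # RAG pipeline builds the wiki
    cache[repo_url] = wiki                   # Store for subsequent requests
    return wiki
```

The key property is that the expensive fetch-and-generate path runs at most once per repository until the cache entry is invalidated (e.g. by `force_regenerate`).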
### Real-time Chat Flow
```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant WebSocket
    participant RAG
    participant VectorDB
    participant AI

    User->>Frontend: Ask Question
    Frontend->>WebSocket: Open Connection
    WebSocket->>RAG: Process Query
    RAG->>VectorDB: Search Context
    VectorDB-->>RAG: Relevant Documents
    RAG->>AI: Generate Response
    AI-->>RAG: Stream Tokens

    loop Streaming
        RAG-->>WebSocket: Send Token
        WebSocket-->>Frontend: Update UI
        Frontend-->>User: Display Token
    end

    WebSocket->>WebSocket: Close Connection
```
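The streaming loop in the diagram maps naturally onto a relay that forwards each token as it arrives while accumulating the full answer. This is a minimal sketch; the AI stream and the `send` callback (e.g. a websocket's send method) are assumed interfaces:

```python
def relay_tokens(ai_stream, send):
    """Stream tokens to the client while accumulating the full response."""
    full_response = []
    for token in ai_stream:        # AI provider yields tokens incrementally
        send(token)                # push each token to the client immediately
        full_response.append(token)
    return "".join(full_response)  # complete answer, e.g. for logging or caching
```

Pushing tokens as they arrive is what gives the progressive-rendering effect in the UI, rather than waiting for the complete response.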
## Component Interactions
### Core Service Dependencies
```mermaid
graph TB
    subgraph "Service Layer"
        API[API Service]
        RAG[RAG Service]
        Cache[Cache Service]
        Embed[Embedding Service]
    end

    subgraph "Data Layer"
        FileSystem[File System]
        VectorStore[Vector Store]
        Memory[Memory Store]
    end

    subgraph "External"
        AIProviders[AI Providers]
        VCSProviders[VCS Providers]
    end

    API --> RAG
    API --> Cache
    RAG --> Embed
    Cache --> FileSystem
    Embed --> VectorStore
    RAG --> Memory
    RAG --> AIProviders
    API --> VCSProviders

    style API fill:#f9f,stroke:#333,stroke-width:4px
    style RAG fill:#bbf,stroke:#333,stroke-width:4px
```
## Database and Storage Design
### Storage Architecture