DeepWiki-Open supports multiple AI model providers, each with unique strengths for different documentation needs. This guide covers setup, configuration, and optimization for all supported providers.

Supported Providers

Google Gemini

Google’s Gemini models offer excellent performance with generous free tiers, making them ideal for getting started.

Setup

Step 1: Get API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click “Create API Key”
  4. Copy the generated key (starts with AIza)
API key generated and copied
Step 2: Configure Environment

Add to your .env file:
GOOGLE_API_KEY=AIzaSyC...your_actual_key_here
Never commit API keys to version control. Add .env to your .gitignore.
Step 3: Verify Setup

Test the configuration by starting DeepWiki:
python -m api.main
# Should show: "Google API key configured successfully"
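
To verify the key independently of DeepWiki, a minimal sketch using the google-generativeai Python package (assumed installed via pip install google-generativeai) can issue a one-off request:

import os
import google.generativeai as genai

# Read the key from the same variable DeepWiki uses
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# gemini-2.0-flash matches the default model referenced later in this guide
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Reply with the word 'ok'.")
print(response.text)  # any reply confirms the key is valid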

Optimization Tips

Free tier limits:
  • 15 requests per minute (Flash models)
  • 60 requests per minute (Pro models)
  • 32,000 tokens per minute
Best practices:
{
  "rate_limiting": {
    "requests_per_minute": 12,  // Stay below limit
    "retry_delay": 5,           // Wait 5s on rate limit
    "batch_processing": true    // Process files in batches
  }
}
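
The JSON above is a documentation sketch; client-side, the same policy can be approximated with a simple throttle plus retry-on-failure. A hypothetical helper (names are illustrative, not part of DeepWiki's API):

import time

def throttled_map(items, call, requests_per_minute=12, retry_delay=5, max_retries=3):
    """Apply call(item) to each item while staying under the free-tier RPM limit."""
    interval = 60.0 / requests_per_minute   # 12 RPM -> one request every 5 seconds
    results = []
    for item in items:
        for attempt in range(max_retries):
            try:
                results.append(call(item))
                break
            except Exception:                # e.g. an HTTP 429 rate-limit error
                time.sleep(retry_delay)      # wait before retrying
        time.sleep(interval)                 # pace requests below the limit
    return results
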
Gemini models have large context windows. Optimize usage:
  • Large repositories: Use full context for better understanding
  • Complex files: Include more surrounding context
  • API documentation: Include related endpoints together
{
  "context_settings": {
    "max_file_size": 100000,     // 100KB per file
    "include_dependencies": true,  // Include related files
    "context_overlap": 0.1        // 10% overlap between chunks
  }
}
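
As an illustration of the context_overlap setting, a sketch of character-based chunking with 10% overlap (DeepWiki's actual chunker may work differently):

def chunk_with_overlap(text, chunk_size=100_000, overlap=0.1):
    """Split text into chunks that each repeat the tail of the previous chunk."""
    step = int(chunk_size * (1 - overlap))   # advance 90% per chunk at 10% overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]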

OpenAI

OpenAI’s GPT models provide exceptional quality documentation with advanced reasoning capabilities.

Setup

Step 1: Create Account & Get Credits

  1. Sign up at OpenAI Platform
  2. Add payment method (required for API access)
  3. Purchase credits or set up billing
  4. Navigate to API Keys
OpenAI requires a paid account. Free ChatGPT accounts cannot access the API.
Step 2: Generate API Key

  1. Click “Create new secret key”
  2. Add a name (e.g., “DeepWiki-Development”)
  3. Copy the key (starts with sk-)
  4. Store securely (you won’t see it again)
API key generated and stored securely
Step 3: Configure Environment

OPENAI_API_KEY=sk-proj-...your_actual_key_here
# Optional: Custom endpoint for compatible services
OPENAI_BASE_URL=https://api.openai.com/v1
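
Both variables map directly onto the official openai Python client, which makes for a quick sanity check (the model name is only an example):

import os
from openai import OpenAI

# base_url falls back to the official endpoint when the variable is unset
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
)

resp = client.chat.completions.create(
    model="gpt-4o",  # substitute gpt-5 once your account has access
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)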

Available Models

GPT-5
Best for: State-of-the-art documentation generation with advanced reasoning
  • Speed: Fast to moderate (3-8 seconds per request)
  • Quality: Next-generation AI capabilities with superior understanding
  • Context: 256K tokens input/output (estimated)
  • Temperature: 1.0 (default for creative yet accurate responses)
  • Availability: Rolling out to API users (check availability in your region)
Ideal for:
  • Cutting-edge documentation projects
  • Complex architectural documentation
  • Multi-language codebases
  • Advanced technical analysis
  • Projects requiring latest AI capabilities
GPT-5 is now the default model in DeepWiki as of commit 05693d5. Ensure your OpenAI account has access to the GPT-5 API.

Cost Optimization

Monitor and optimize token consumption:
{
  "token_optimization": {
    "max_input_tokens": 100000,    // Limit input size
    "target_output_tokens": 4000,  // Reasonable output length
    "preprocessing": true,         // Clean input before sending
    "compression": "smart"         // Remove redundant content
  }
}
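
Input size can be measured and capped before sending with tiktoken (a sketch; GPT-5's tokenizer is not public, so the GPT-4o encoding serves as an approximation):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")  # proxy encoding for newer models

def truncate_to_budget(text, max_input_tokens=100_000):
    """Trim text to the input-token budget from the config above."""
    tokens = enc.encode(text)
    if len(tokens) <= max_input_tokens:
        return text
    return enc.decode(tokens[:max_input_tokens])
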
Cost calculation example:
  • Large repository: ~200K input tokens, 8K output tokens
  • GPT-5 cost: Pricing to be announced (expected similar or slightly higher than GPT-4o)
  • GPT-4o cost: $3.00 input + $0.48 output = $3.48 per generation
  • Monthly usage (10 repos): ~$35-50/month (estimated)
Match model to task complexity:
  1. Simple projects: Use o4-mini for cost savings
  2. Standard projects: Use gpt-5 for latest capabilities or gpt-4o for proven reliability
  3. Complex analysis: Use gpt-5 for advanced reasoning or o1 series for deep insights
  4. Budget constraints: Start with o4-mini, upgrade if needed
  5. Cutting-edge needs: Use gpt-5 for state-of-the-art performance
{
  "auto_model_selection": {
    "repository_size": {
      "small": "o4-mini",      // < 100 files
      "medium": "gpt-5",       // 100-1000 files (if available, else gpt-4o)
      "large": "gpt-5"         // 1000+ files (if available, else gpt-4o)
    },
    "complexity_factors": [
      "multiple_languages",
      "microservice_architecture", 
      "complex_algorithms"
    ]
  }
}
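
Read literally, that config reduces to a small helper (hypothetical, not part of DeepWiki):

def select_model(file_count, gpt5_available=True):
    """Pick a model tier by repository size, with a fallback when GPT-5 is unavailable."""
    if file_count < 100:
        return "o4-mini"
    return "gpt-5" if gpt5_available else "gpt-4o"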

OpenRouter

OpenRouter provides access to 100+ AI models through a single API, perfect for comparison and specialized needs.

Setup

Step 1: Create Account

  1. Sign up at OpenRouter
  2. Verify your email address
  3. Add payment method for paid models
  4. Navigate to the Keys section
Some models are free, others require credits. Check individual model pricing.
Step 2: Generate API Key

  1. Click “Create Key”
  2. Name your key (e.g., “DeepWiki-Prod”)
  3. Copy the key (starts with sk-or-)
  4. Optionally set spending limits
OpenRouter API key generated with spending limits configured
Step 3: Configure Environment

OPENROUTER_API_KEY=sk-or-...your_actual_key_here
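
OpenRouter exposes an OpenAI-compatible API, so the same openai Python client works with a different base URL (a minimal sketch):

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",  # any model ID from the OpenRouter catalog
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)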

Available Models

Anthropic Claude
Models: anthropic/claude-3.5-sonnet, anthropic/claude-3-haiku
Best for:
  • Excellent code analysis and explanation
  • Clear, structured documentation
  • Complex reasoning tasks
  • Safe, helpful responses
Pricing: $3/1M input tokens, $15/1M output tokens (3.5 Sonnet)
Use cases:
  • API documentation generation
  • Code architecture explanation
  • Security-focused analysis

Model Comparison Strategy

Step 1: Baseline Generation

Start with a reliable, fast model:
{
  "baseline_model": "anthropic/claude-3.5-sonnet",
  "test_repository": "https://github.com/small/test-repo"
}
Step 2: A/B Testing

Compare models for your specific use case:
{
  "comparison_models": [
    "openai/gpt-4o",
    "google/gemini-pro", 
    "meta-llama/llama-3-70b"
  ],
  "evaluation_criteria": [
    "accuracy",
    "completeness", 
    "code_understanding",
    "diagram_quality",
    "cost_per_generation"
  ]
}
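
A tiny harness for such a comparison can loop one prompt over each candidate and record latency and output size, leaving quality scoring manual (a sketch; the client setup mirrors the OpenRouter example above):

import os, time
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENROUTER_API_KEY"],
                base_url="https://openrouter.ai/api/v1")

def compare(models, prompt):
    rows = []
    for model in models:
        start = time.time()
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}])
        rows.append({"model": model,
                     "seconds": round(time.time() - start, 1),
                     "output_tokens": resp.usage.completion_tokens})
    return rows
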
Step 3: Optimization

Select the best model based on results:
{
  "selected_model": "anthropic/claude-3.5-sonnet",
  "reason": "Best code analysis with reasonable cost",
  "fallback_model": "google/gemini-pro",
  "reason_fallback": "Faster generation when speed needed"
}

Azure OpenAI

Enterprise-grade OpenAI models with enhanced security, compliance, and control.

Setup

Step 1: Create Azure OpenAI Resource

  1. Sign in to Azure Portal
  2. Create new Azure OpenAI resource
  3. Choose region (check model availability)
  4. Configure pricing tier and network settings
  5. Wait for deployment completion
Azure OpenAI may require approval for access. Check the application status.
Step 2: Deploy Models

  1. Go to Azure OpenAI Studio
  2. Navigate to Deployments
  3. Deploy required models (GPT-4, GPT-3.5-turbo, etc.)
  4. Note deployment names and endpoints
Models deployed and endpoints configured
Step 3: Get Configuration Details

Collect the required information:
  • Endpoint: https://your-resource.openai.azure.com
  • API Key: From resource keys section
  • API Version: e.g., 2024-02-15-preview
AZURE_OPENAI_API_KEY=abc123...your_actual_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
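
These variables map onto the AzureOpenAI client in the openai Python package; note that Azure addresses models by deployment name rather than model name (the one below is an example):

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ["AZURE_OPENAI_VERSION"],
)

resp = client.chat.completions.create(
    model="my-gpt-4-deployment",  # your deployment name from Azure OpenAI Studio
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)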

Enterprise Features

Key benefits:
  • Data processed within your Azure tenant
  • No data used for model training
  • GDPR, SOC 2, HIPAA compliance available
  • Private networking with VNet integration
Configuration:
{
  "privacy_settings": {
    "data_residency": "eu-west",      // Keep data in specific region
    "logging": "minimal",             // Reduce data logging
    "retention": "30_days",           // Automatic data deletion
    "private_endpoint": true          // Use private networking
  }
}
Built-in safety features:
  • Automatic content filtering for harmful content
  • Customizable filter levels
  • Compliance with organizational policies
Configuration:
{
  "content_filter": {
    "severity_level": "medium",       // low, medium, high
    "categories": ["hate", "violence", "self_harm", "sexual"],
    "custom_blocklists": ["internal_terms"],
    "action_on_filter": "block"       // block, warn, log
  }
}
Enterprise-grade performance:
  • Dedicated capacity options
  • Predictable performance
  • Custom rate limits
  • Multi-region deployment
Configuration:
{
  "performance_settings": {
    "capacity_type": "provisioned",   // provisioned vs pay-per-token
    "throughput_units": 100,          // Dedicated throughput
    "auto_scaling": true,             // Scale with demand
    "load_balancing": "round_robin"   // Distribute across regions
  }
}

AWS Bedrock

AWS-hosted AI models with enterprise features and AWS service integration.

Setup

Step 1: AWS Account Setup

  1. Ensure you have an AWS account
  2. Enable AWS Bedrock in your region
  3. Request access to required models (may require approval)
  4. Create IAM user with Bedrock permissions
Bedrock is not available in all AWS regions. Check regional availability.
Step 2: Configure IAM Permissions

Create IAM policy for Bedrock access:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
Step 3: Configure Environment

AWS_ACCESS_KEY_ID=AKIA...your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_access_key_here
AWS_REGION=us-east-1
AWS credentials configured and Bedrock access verified

Available Models

Models:
  • anthropic.claude-3-sonnet-20240229-v1:0
  • anthropic.claude-3-haiku-20240307-v1:0
  • anthropic.claude-3-opus-20240229-v1:0
Best for: Code analysis, documentation, safety-conscious generation
Pricing: $3-15 per 1M tokens depending on model
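
With the credentials above, a Claude model can be invoked through boto3's bedrock-runtime client using the Anthropic messages format Bedrock expects (a sketch):

import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Reply with 'ok'."}],
}
resp = client.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=json.dumps(body),
)
print(json.loads(resp["body"].read())["content"][0]["text"])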

Ollama (Local Models)

Run AI models locally for complete privacy, cost control, and offline capability.

Setup

Step 1: Install Ollama

# Using Homebrew
brew install ollama

# Or download installer from https://ollama.ai
curl -fsSL https://ollama.ai/install.sh | sh
Step 2: Pull Models

Download models you want to use:
# Recommended models for documentation
ollama pull qwen3:8b          # Excellent for code
ollama pull llama3:8b         # Good general model
ollama pull qwen3:1.7b        # Lightweight option

# Verify installation
ollama list
Models downloaded and verified
Step 3: Configure DeepWiki

OLLAMA_HOST=http://localhost:11434
For remote Ollama servers:
OLLAMA_HOST=http://ollama-server.internal:11434
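
Ollama serves a small REST API on that host, so connectivity can be checked with the requests library (a sketch):

import os
import requests

host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

# List locally available models -- the API equivalent of `ollama list`
print(requests.get(f"{host}/api/tags").json())

# One-off generation against a pulled model
resp = requests.post(f"{host}/api/generate", json={
    "model": "qwen3:8b",
    "prompt": "Reply with 'ok'.",
    "stream": False,
})
print(resp.json()["response"])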

Model Selection

qwen3:8b (Recommended)
  • Size: 4.8GB download
  • RAM: 8GB required
  • Strengths: Excellent code understanding, multilingual
  • Best for: Most documentation tasks
deepseek-coder:6.7b
  • Size: 3.8GB download
  • RAM: 6GB required
  • Strengths: Specialized for code generation and analysis
  • Best for: Technical documentation, API docs

Performance Optimization

Minimum specs by model size:
  • 1B-3B models: 4GB RAM, any modern CPU
  • 7B-8B models: 8GB RAM, modern CPU (preferably 8+ cores)
  • 13B models: 16GB RAM, high-performance CPU
  • 70B+ models: 64GB+ RAM, server-grade hardware
GPU acceleration (optional):
# Enable GPU support (NVIDIA)
ollama pull llama3:8b
CUDA_VISIBLE_DEVICES=0 ollama run llama3:8b

# Check GPU usage
nvidia-smi
Optimize memory usage:
# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_MAX_QUEUE=4

# Configure model parameters
export OLLAMA_NUM_PARALLEL=2
export OLLAMA_FLASH_ATTENTION=1
Model configuration:
{
  "model_config": {
    "num_ctx": 4096,          // Context window size
    "num_predict": 2048,      // Max output tokens
    "temperature": 0.7,       // Randomness
    "top_p": 0.8,            // Nucleus sampling
    "repeat_penalty": 1.1     // Avoid repetition
  }
}
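
Those parameters correspond to the options field of Ollama's generate endpoint, so they can also be set per request (a sketch):

import requests

resp = requests.post("http://localhost:11434/api/generate", json={
    "model": "qwen3:8b",
    "prompt": "Summarize this repository's README.",
    "stream": False,
    "options": {                 # the same knobs as the JSON above
        "num_ctx": 4096,
        "num_predict": 2048,
        "temperature": 0.7,
        "top_p": 0.8,
        "repeat_penalty": 1.1,
    },
})
print(resp.json()["response"])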

Multi-Provider Strategy

Provider Selection Matrix

Project Type | Primary        | Fallback              | Reason
Open Source  | Google Gemini  | OpenRouter            | Free tier, good quality
Enterprise   | Azure OpenAI   | OpenAI                | Security, compliance
Startup      | OpenRouter     | Google                | Cost optimization
Research     | OpenAI GPT-4o  | Claude via OpenRouter | Highest quality
Personal     | Ollama         | Google                | Privacy, no cost

Auto-Failover Configuration

{
  "provider_strategy": {
    "primary": {
      "provider": "google",
      "model": "gemini-2.0-flash",
      "timeout": 30
    },
    "fallback_chain": [
      {
        "provider": "openrouter", 
        "model": "anthropic/claude-3.5-sonnet",
        "condition": "rate_limit_exceeded"
      },
      {
        "provider": "ollama",
        "model": "qwen3:8b", 
        "condition": "network_error"
      }
    ],
    "retry_logic": {
      "max_retries": 3,
      "backoff_factor": 2,
      "jitter": true
    }
  }
}
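
In code, that strategy reduces to trying providers in order with jittered exponential backoff. A hypothetical sketch, where each entry in providers is a callable wrapping one of the clients shown earlier:

import random
import time

def generate_with_failover(prompt, providers, max_retries=3, backoff_factor=2):
    """Try each provider in order; retry transient failures with backoff."""
    for call in providers:           # e.g. [call_gemini, call_openrouter, call_ollama]
        for attempt in range(max_retries):
            try:
                return call(prompt)
            except Exception:
                # 1s, 2s, 4s ... plus jitter to avoid synchronized retries
                time.sleep(backoff_factor ** attempt + random.random())
    raise RuntimeError("all providers failed")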

Next Steps