DeepWiki-Open supports multiple AI model providers, each with unique strengths for different documentation needs. This guide covers setup, configuration, and optimization for all supported providers.

Supported Providers

Google Gemini

Google’s Gemini models offer excellent performance with generous free tiers, making them ideal for getting started.

Setup

Step 1: Get API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click “Create API Key”
  4. Copy the generated key (starts with AIza)
API key generated and copied
Step 2: Configure Environment

Add to your .env file:
GOOGLE_API_KEY=AIzaSyC...your_actual_key_here
Never commit API keys to version control. Add .env to your .gitignore.
Step 3: Verify Setup

Test the configuration by starting DeepWiki:
python -m api.main
# Should show: "Google API key configured successfully"
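
To verify the key independently of DeepWiki, a minimal sketch using the google-generativeai Python package (assumed installed via pip install google-generativeai) can issue a one-off request:

import os
import google.generativeai as genai

# Read the key from the same variable DeepWiki uses
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# gemini-2.0-flash matches the default model referenced later in this guide
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Reply with the word 'ok'.")
print(response.text)  # any reply confirms the key is valid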

Optimization Tips

Free tier limits:
  • 15 requests per minute (Flash models)
  • 60 requests per minute (Pro models)
  • 32,000 tokens per minute
Best practices:
{
  "rate_limiting": {
    "requests_per_minute": 12,  // Stay below limit
    "retry_delay": 5,           // Wait 5s on rate limit
    "batch_processing": true    // Process files in batches
  }
}
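
The JSON above is a documentation sketch; client-side, the same policy can be approximated with a simple throttle plus retry-on-failure. A hypothetical helper (names are illustrative, not part of DeepWiki's API):

import time

def throttled_map(items, call, requests_per_minute=12, retry_delay=5, max_retries=3):
    """Apply call(item) to each item while staying under the free-tier RPM limit."""
    interval = 60.0 / requests_per_minute   # 12 RPM -> one request every 5 seconds
    results = []
    for item in items:
        for attempt in range(max_retries):
            try:
                results.append(call(item))
                break
            except Exception:                # e.g. an HTTP 429 rate-limit error
                time.sleep(retry_delay)      # wait before retrying
        time.sleep(interval)                 # pace requests below the limit
    return results
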
Gemini models have large context windows. Optimize usage:
  • Large repositories: Use full context for better understanding
  • Complex files: Include more surrounding context
  • API documentation: Include related endpoints together
{
  "context_settings": {
    "max_file_size": 100000,     // 100KB per file
    "include_dependencies": true,  // Include related files
    "context_overlap": 0.1        // 10% overlap between chunks
  }
}
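
As an illustration of the context_overlap setting, a sketch of character-based chunking with 10% overlap (DeepWiki's actual chunker may work differently):

def chunk_with_overlap(text, chunk_size=100_000, overlap=0.1):
    """Split text into chunks that each repeat the tail of the previous chunk."""
    step = int(chunk_size * (1 - overlap))   # advance 90% per chunk at 10% overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]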

OpenAI

OpenAI’s GPT models provide exceptional quality documentation with advanced reasoning capabilities.

Setup

Step 1: Create Account & Get Credits

  1. Sign up at OpenAI Platform
  2. Add payment method (required for API access)
  3. Purchase credits or set up billing
  4. Navigate to API Keys
OpenAI requires a paid account. Free ChatGPT accounts cannot access the API.
Step 2: Generate API Key

  1. Click “Create new secret key”
  2. Add a name (e.g., “DeepWiki-Development”)
  3. Copy the key (starts with sk-)
  4. Store securely (you won’t see it again)
API key generated and stored securely
Step 3: Configure Environment

OPENAI_API_KEY=sk-proj-...your_actual_key_here
# Optional: Custom endpoint for compatible services
OPENAI_BASE_URL=https://api.openai.com/v1
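
Both variables map directly onto the official openai Python client, which makes for a quick sanity check (the model name is only an example):

import os
from openai import OpenAI

# base_url falls back to the official endpoint when the variable is unset
client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
)

resp = client.chat.completions.create(
    model="gpt-4o",  # substitute gpt-5 once your account has access
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)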

Available Models

GPT-5
Best for: State-of-the-art documentation generation with advanced reasoning
  • Speed: Fast to moderate (3-8 seconds per request)
  • Quality: Next-generation AI capabilities with superior understanding
  • Context: 256K tokens input/output (estimated)
  • Temperature: 1.0 (default for creative yet accurate responses)
  • Availability: Rolling out to API users (check availability in your region)
Ideal for:
  • Cutting-edge documentation projects
  • Complex architectural documentation
  • Multi-language codebases
  • Advanced technical analysis
  • Projects requiring latest AI capabilities
GPT-5 is now the default model in DeepWiki as of commit 05693d5. Ensure your OpenAI account has access to the GPT-5 API.

Cost Optimization

Monitor and optimize token consumption:
{
  "token_optimization": {
    "max_input_tokens": 100000,    // Limit input size
    "target_output_tokens": 4000,  // Reasonable output length
    "preprocessing": true,         // Clean input before sending
    "compression": "smart"         // Remove redundant content
  }
}
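
Input size can be measured and capped before sending with tiktoken (a sketch; GPT-5's tokenizer is not public, so the GPT-4o encoding serves as an approximation):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")  # proxy encoding for newer models

def truncate_to_budget(text, max_input_tokens=100_000):
    """Trim text to the input-token budget from the config above."""
    tokens = enc.encode(text)
    if len(tokens) <= max_input_tokens:
        return text
    return enc.decode(tokens[:max_input_tokens])
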
Cost calculation example:
  • Large repository: ~200K input tokens, 8K output tokens
  • GPT-5 cost: Pricing to be announced (expected similar or slightly higher than GPT-4o)
  • GPT-4o cost: $3.00 input + $0.48 output = $3.48 per generation
  • Monthly usage (10 repos): ~$35-50/month (estimated)
Match model to task complexity:
  1. Simple projects: Use o4-mini for cost savings
  2. Standard projects: Use gpt-5 for latest capabilities or gpt-4o for proven reliability
  3. Complex analysis: Use gpt-5 for advanced reasoning or o1 series for deep insights
  4. Budget constraints: Start with o4-mini, upgrade if needed
  5. Cutting-edge needs: Use gpt-5 for state-of-the-art performance
{
  "auto_model_selection": {
    "repository_size": {
      "small": "o4-mini",      // < 100 files
      "medium": "gpt-5",       // 100-1000 files (if available, else gpt-4o)
      "large": "gpt-5"         // 1000+ files (if available, else gpt-4o)
    },
    "complexity_factors": [
      "multiple_languages",
      "microservice_architecture", 
      "complex_algorithms"
    ]
  }
}
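
Read literally, that config reduces to a small helper (hypothetical, not part of DeepWiki):

def select_model(file_count, gpt5_available=True):
    """Pick a model tier by repository size, with a fallback when GPT-5 is unavailable."""
    if file_count < 100:
        return "o4-mini"
    return "gpt-5" if gpt5_available else "gpt-4o"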

OpenRouter

OpenRouter provides access to 100+ AI models through a single API, perfect for comparison and specialized needs.

Setup

Step 1: Create Account

  1. Sign up at OpenRouter
  2. Verify your email address
  3. Add payment method for paid models
  4. Navigate to the Keys section
Some models are free, others require credits. Check individual model pricing.
Step 2: Generate API Key

  1. Click “Create Key”
  2. Name your key (e.g., “DeepWiki-Prod”)
  3. Copy the key (starts with sk-or-)
  4. Optionally set spending limits
OpenRouter API key generated with spending limits configured
Step 3: Configure Environment

OPENROUTER_API_KEY=sk-or-...your_actual_key_here
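
OpenRouter exposes an OpenAI-compatible API, so the same openai Python client works with a different base URL (a minimal sketch):

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",  # any model ID from the OpenRouter catalog
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)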

Available Models

Anthropic Claude
Models: anthropic/claude-3.5-sonnet, anthropic/claude-3-haiku
Best for:
  • Excellent code analysis and explanation
  • Clear, structured documentation
  • Complex reasoning tasks
  • Safe, helpful responses
Pricing: $3/1M input tokens, $15/1M output tokens (3.5 Sonnet)
Use cases:
  • API documentation generation
  • Code architecture explanation
  • Security-focused analysis

Model Comparison Strategy

Step 1: Baseline Generation

Start with a reliable, fast model:
{
  "baseline_model": "anthropic/claude-3.5-sonnet",
  "test_repository": "https://github.com/small/test-repo"
}
Step 2: A/B Testing

Compare models for your specific use case:
{
  "comparison_models": [
    "openai/gpt-4o",
    "google/gemini-pro", 
    "meta-llama/llama-3-70b"
  ],
  "evaluation_criteria": [
    "accuracy",
    "completeness", 
    "code_understanding",
    "diagram_quality",
    "cost_per_generation"
  ]
}
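
A tiny harness for such a comparison can loop one prompt over each candidate and record latency and output size, leaving quality scoring manual (a sketch; the client setup mirrors the OpenRouter example above):

import os, time
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENROUTER_API_KEY"],
                base_url="https://openrouter.ai/api/v1")

def compare(models, prompt):
    rows = []
    for model in models:
        start = time.time()
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}])
        rows.append({"model": model,
                     "seconds": round(time.time() - start, 1),
                     "output_tokens": resp.usage.completion_tokens})
    return rows
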
Step 3: Optimization

Select the best model based on results:
{
  "selected_model": "anthropic/claude-3.5-sonnet",
  "reason": "Best code analysis with reasonable cost",
  "fallback_model": "google/gemini-pro",
  "reason_fallback": "Faster generation when speed needed"
}

Azure OpenAI

Enterprise-grade OpenAI models with enhanced security, compliance, and control.

Setup

Step 1: Create Azure OpenAI Resource

  1. Sign in to Azure Portal
  2. Create new Azure OpenAI resource
  3. Choose region (check model availability)
  4. Configure pricing tier and network settings
  5. Wait for deployment completion
Azure OpenAI may require approval for access. Check the application status.
Step 2: Deploy Models

  1. Go to Azure OpenAI Studio
  2. Navigate to Deployments
  3. Deploy required models (GPT-4, GPT-3.5-turbo, etc.)
  4. Note deployment names and endpoints
Models deployed and endpoints configured
Step 3: Get Configuration Details

Collect the required information:
  • Endpoint: https://your-resource.openai.azure.com
  • API Key: From resource keys section
  • API Version: e.g., 2024-02-15-preview
AZURE_OPENAI_API_KEY=abc123...your_actual_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
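
These variables map onto the AzureOpenAI client in the openai Python package; note that Azure addresses models by deployment name rather than model name (the one below is an example):

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ["AZURE_OPENAI_VERSION"],
)

resp = client.chat.completions.create(
    model="my-gpt-4-deployment",  # your deployment name from Azure OpenAI Studio
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(resp.choices[0].message.content)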

Enterprise Features

Key benefits:
  • Data processed within your Azure tenant
  • No data used for model training
  • GDPR, SOC 2, HIPAA compliance available
  • Private networking with VNet integration
Configuration:
{
  "privacy_settings": {
    "data_residency": "eu-west",      // Keep data in specific region
    "logging": "minimal",             // Reduce data logging
    "retention": "30_days",           // Automatic data deletion
    "private_endpoint": true          // Use private networking
  }
}
Built-in safety features:
  • Automatic content filtering for harmful content
  • Customizable filter levels
  • Compliance with organizational policies
Configuration:
{
  "content_filter": {
    "severity_level": "medium",       // low, medium, high
    "categories": ["hate", "violence", "self_harm", "sexual"],
    "custom_blocklists": ["internal_terms"],
    "action_on_filter": "block"       // block, warn, log
  }
}
Enterprise-grade performance:
  • Dedicated capacity options
  • Predictable performance
  • Custom rate limits
  • Multi-region deployment
Configuration:
{
  "performance_settings": {
    "capacity_type": "provisioned",   // provisioned vs pay-per-token
    "throughput_units": 100,          // Dedicated throughput
    "auto_scaling": true,             // Scale with demand
    "load_balancing": "round_robin"   // Distribute across regions
  }
}

AWS Bedrock

AWS-hosted AI models with enterprise features and AWS service integration.

Setup

Step 1: AWS Account Setup

  1. Ensure you have an AWS account
  2. Enable AWS Bedrock in your region
  3. Request access to required models (may require approval)
  4. Create IAM user with Bedrock permissions
Bedrock is not available in all AWS regions. Check regional availability.
Step 2: Configure IAM Permissions

Create IAM policy for Bedrock access:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
Step 3: Configure Environment

AWS_ACCESS_KEY_ID=AKIA...your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_access_key_here
AWS_REGION=us-east-1
AWS credentials configured and Bedrock access verified

Available Models

Models:
  • anthropic.claude-3-sonnet-20240229-v1:0
  • anthropic.claude-3-haiku-20240307-v1:0
  • anthropic.claude-3-opus-20240229-v1:0
Best for: Code analysis, documentation, safety-conscious generation
Pricing: $3-15 per 1M tokens depending on model
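
With the credentials above, a Claude model can be invoked through boto3's bedrock-runtime client using the Anthropic messages format Bedrock expects (a sketch):

import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Reply with 'ok'."}],
}
resp = client.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=json.dumps(body),
)
print(json.loads(resp["body"].read())["content"][0]["text"])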

Ollama (Local Models)

Run AI models locally for complete privacy, cost control, and offline capability.

Setup

Step 1: Install Ollama

# Using Homebrew
brew install ollama

# Or download installer from https://ollama.ai
curl -fsSL https://ollama.ai/install.sh | sh
Step 2: Pull Models

Download models you want to use:
# Recommended models for documentation
ollama pull qwen3:8b          # Excellent for code
ollama pull llama3:8b         # Good general model
ollama pull qwen3:1.7b        # Lightweight option

# Verify installation
ollama list
Models downloaded and verified
Step 3: Configure DeepWiki

OLLAMA_HOST=http://localhost:11434
For remote Ollama servers:
OLLAMA_HOST=http://ollama-server.internal:11434
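
Ollama serves a small REST API on that host, so connectivity can be checked with the requests library (a sketch):

import os
import requests

host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

# List locally available models -- the API equivalent of `ollama list`
print(requests.get(f"{host}/api/tags").json())

# One-off generation against a pulled model
resp = requests.post(f"{host}/api/generate", json={
    "model": "qwen3:8b",
    "prompt": "Reply with 'ok'.",
    "stream": False,
})
print(resp.json()["response"])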

Model Selection

qwen3:8b (Recommended)
  • Size: 4.8GB download
  • RAM: 8GB required
  • Strengths: Excellent code understanding, multilingual
  • Best for: Most documentation tasks
deepseek-coder:6.7b
  • Size: 3.8GB download
  • RAM: 6GB required
  • Strengths: Specialized for code generation and analysis
  • Best for: Technical documentation, API docs

Performance Optimization

Minimum specs by model size:
  • 1B-3B models: 4GB RAM, any modern CPU
  • 7B-8B models: 8GB RAM, modern CPU (preferably 8+ cores)
  • 13B models: 16GB RAM, high-performance CPU
  • 70B+ models: 64GB+ RAM, server-grade hardware
GPU acceleration (optional):
# Enable GPU support (NVIDIA)
ollama pull llama3:8b
CUDA_VISIBLE_DEVICES=0 ollama run llama3:8b

# Check GPU usage
nvidia-smi
Optimize memory usage:
# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_MAX_QUEUE=4

# Configure model parameters
export OLLAMA_NUM_PARALLEL=2
export OLLAMA_FLASH_ATTENTION=1
Model configuration:
{
  "model_config": {
    "num_ctx": 4096,          // Context window size
    "num_predict": 2048,      // Max output tokens
    "temperature": 0.7,       // Randomness
    "top_p": 0.8,            // Nucleus sampling
    "repeat_penalty": 1.1     // Avoid repetition
  }
}
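
Those parameters correspond to the options field of Ollama's generate endpoint, so they can also be set per request (a sketch):

import requests

resp = requests.post("http://localhost:11434/api/generate", json={
    "model": "qwen3:8b",
    "prompt": "Summarize this repository's README.",
    "stream": False,
    "options": {                 # the same knobs as the JSON above
        "num_ctx": 4096,
        "num_predict": 2048,
        "temperature": 0.7,
        "top_p": 0.8,
        "repeat_penalty": 1.1,
    },
})
print(resp.json()["response"])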

Multi-Provider Strategy

Provider Selection Matrix

Project Type | Primary        | Fallback              | Reason
Open Source  | Google Gemini  | OpenRouter            | Free tier, good quality
Enterprise   | Azure OpenAI   | OpenAI                | Security, compliance
Startup      | OpenRouter     | Google                | Cost optimization
Research     | OpenAI GPT-4o  | Claude via OpenRouter | Highest quality
Personal     | Ollama         | Google                | Privacy, no cost

Auto-Failover Configuration

{
  "provider_strategy": {
    "primary": {
      "provider": "google",
      "model": "gemini-2.0-flash",
      "timeout": 30
    },
    "fallback_chain": [
      {
        "provider": "openrouter", 
        "model": "anthropic/claude-3.5-sonnet",
        "condition": "rate_limit_exceeded"
      },
      {
        "provider": "ollama",
        "model": "qwen3:8b", 
        "condition": "network_error"
      }
    ],
    "retry_logic": {
      "max_retries": 3,
      "backoff_factor": 2,
      "jitter": true
    }
  }
}
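
In code, that strategy reduces to trying providers in order with jittered exponential backoff. A hypothetical sketch, where each entry in providers is a callable wrapping one of the clients shown earlier:

import random
import time

def generate_with_failover(prompt, providers, max_retries=3, backoff_factor=2):
    """Try each provider in order; retry transient failures with backoff."""
    for call in providers:           # e.g. [call_gemini, call_openrouter, call_ollama]
        for attempt in range(max_retries):
            try:
                return call(prompt)
            except Exception:
                # 1s, 2s, 4s ... plus jitter to avoid synchronized retries
                time.sleep(backoff_factor ** attempt + random.random())
    raise RuntimeError("all providers failed")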

Next Steps