# Model Providers Setup

> Configure multiple AI model providers for optimal documentation generation with DeepWiki-Open

DeepWiki-Open supports multiple AI model providers, each with unique strengths for different documentation needs. This guide covers setup, configuration, and optimization for all supported providers.

## Supported Providers

* **Google Gemini:** Fast, reliable, generous free tier
* **OpenAI:** High-quality, detailed documentation
* **OpenRouter:** Access to 100+ models through one API
* **Azure OpenAI:** Enterprise-grade with enhanced security
* **AWS Bedrock:** AWS-hosted models with enterprise features
* **Ollama:** Local, private, cost-free AI models

## Google Gemini

Google's Gemini models offer excellent performance with generous free tiers, making them ideal for getting started.

### Setup

1. Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the generated key (starts with `AIza`)

API key generated and copied

Add to your `.env` file:

```env theme={null}
GOOGLE_API_KEY=AIzaSyC...your_actual_key_here
```

Never commit API keys to version control. Add `.env` to your `.gitignore`.
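Before starting the server, it can help to confirm that the key from your `.env` file actually reached the process environment. The sketch below is a minimal check and not part of DeepWiki-Open: `check_google_api_key` is a hypothetical helper, and it only tests that `GOOGLE_API_KEY` is present and starts with `AIza`, as described in the setup steps. It assumes the variable has already been exported or loaded by whatever tool reads your `.env` file.

```python
import os


def check_google_api_key(env=None):
    """Hypothetical helper: return (ok, message) for GOOGLE_API_KEY.

    `env` defaults to the process environment; pass a dict to test other values.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        return False, "GOOGLE_API_KEY is not set"
    if not key.startswith("AIza"):
        # The setup steps say generated keys start with this prefix.
        return False, "GOOGLE_API_KEY does not start with 'AIza'"
    return True, "GOOGLE_API_KEY looks plausible"


if __name__ == "__main__":
    ok, message = check_google_api_key()
    print(message)
```

A passing check only means the value has the expected shape. Whether the key is valid is still decided by the API when DeepWiki makes its first request.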
Test the configuration by starting DeepWiki:

```bash theme={null}
python -m api.main
# Should show: "Google API key configured successfully"
```

### Available Models

**Best for:** Most documentation tasks

* **Speed:** Very fast (1-3 seconds per request)
* **Quality:** Excellent for code analysis
* **Context:** 1M+ tokens input, 8K output
* **Cost:** Free tier: 15 RPM, 1M TPM

**Ideal for:**

* General repository documentation
* Quick prototyping and testing
* Regular development workflows
* Small to medium repositories

**Best for:** Stable, proven performance

* **Speed:** Fast (2-4 seconds per request)
* **Quality:** Very good, well-tested
* **Context:** 1M+ tokens input, 8K output
* **Cost:** Free tier: 15 RPM, 1M TPM

**Ideal for:**

* Production environments requiring stability
* Projects where consistency is critical
* Long-term documentation maintenance

**Best for:** Detailed analysis

* **Speed:** Moderate (3-6 seconds per request)
* **Quality:** High detail and accuracy
* **Context:** 32K tokens input/output
* **Cost:** Free tier: 60 RPM

**Ideal for:**

* Complex architectural analysis
* Detailed technical documentation
* Academic or research projects

### Optimization Tips

**Free tier limits:**

* 15 requests per minute (Flash models)
* 60 requests per minute (Pro models)
* 32,000 tokens per minute

**Best practices:**

```javascript theme={null}
{
  "rate_limiting": {
    "requests_per_minute": 12,  // Stay below limit
    "retry_delay": 5,           // Wait 5s on rate limit
    "batch_processing": true    // Process files in batches
  }
}
```

Gemini models have large context windows.
Optimize usage:

* **Large repositories**: Use full context for better understanding
* **Complex files**: Include more surrounding context
* **API documentation**: Include related endpoints together

```javascript theme={null}
{
  "context_settings": {
    "max_file_size": 100000,       // 100KB per file
    "include_dependencies": true,  // Include related files
    "context_overlap": 0.1         // 10% overlap between chunks
  }
}
```

## OpenAI

OpenAI's GPT models provide exceptional quality documentation with advanced reasoning capabilities.

### Setup

1. Sign up at [OpenAI Platform](https://platform.openai.com/)
2. Add a payment method (required for API access)
3. Purchase credits or set up billing
4. Navigate to [API Keys](https://platform.openai.com/api-keys)

OpenAI requires a paid account. Free ChatGPT accounts cannot access the API.

1. Click "Create new secret key"
2. Add a name (e.g., "DeepWiki-Development")
3. Copy the key (starts with `sk-`)
4. Store it securely (you won't see it again)

API key generated and stored securely

```env theme={null}
OPENAI_API_KEY=sk-proj-...your_actual_key_here
# Optional: Custom endpoint for compatible services
OPENAI_BASE_URL=https://api.openai.com/v1
```

### Available Models

**Best for:** State-of-the-art documentation generation with advanced reasoning

* **Speed:** Fast to moderate (3-8 seconds per request)
* **Quality:** Next-generation AI capabilities with superior understanding
* **Context:** 256K tokens input/output (estimated)
* **Temperature:** 1.0 (default for creative yet accurate responses)
* **Availability:** Rolling out to API users (check availability in your region)

**Ideal for:**

* Cutting-edge documentation projects
* Complex architectural documentation
* Multi-language codebases
* Advanced technical analysis
* Projects requiring latest AI capabilities

GPT-5 is the default model in DeepWiki as of commit 05693d5. Ensure your OpenAI account has access to the GPT-5 API.
**Best for:** High-quality, comprehensive documentation

* **Speed:** Moderate (5-10 seconds per request)
* **Quality:** Exceptional writing and analysis
* **Context:** 128K tokens input/output
* **Cost:** $15/1M input tokens, $60/1M output tokens
* **Temperature:** 0.7 (default)
* **Top-p:** 0.8 (default)

**Ideal for:**

* Production documentation
* Complex enterprise applications
* Publication-quality content
* Detailed architectural analysis
* Fallback when GPT-5 is unavailable

**Best for:** Enhanced reasoning and analysis

* **Speed:** Moderate to slow (8-15 seconds)
* **Quality:** Superior technical analysis
* **Context:** 128K tokens
* **Cost:** Premium pricing

**Ideal for:**

* Complex system analysis
* Advanced architectural documentation
* Research and academic projects

**Best for:** Complex problem solving and analysis

* **o1-preview:** Advanced reasoning, slower but thorough
* **o1-mini:** Faster reasoning for simpler tasks
* **Cost:** Higher than standard GPT-4

**Ideal for:**

* Complex debugging documentation
* System optimization analysis
* Security assessment documentation

**Best for:** Budget-conscious high-quality documentation

* **Speed:** Fast (3-6 seconds per request)
* **Quality:** Very good for most tasks
* **Context:** 128K tokens
* **Cost:** Lower than GPT-4o

**Ideal for:**

* Regular documentation updates
* Smaller projects with quality requirements
* Development and testing workflows

### Cost Optimization

Monitor and optimize token consumption:

```javascript theme={null}
{
  "token_optimization": {
    "max_input_tokens": 100000,     // Limit input size
    "target_output_tokens": 4000,   // Reasonable output length
    "preprocessing": true,          // Clean input before sending
    "compression": "smart"          // Remove redundant content
  }
}
```

**Cost calculation example:**

* Large repository: \~200K input tokens, 8K output tokens
* GPT-5 cost: Pricing to be announced (expected similar or slightly higher than GPT-4o)
* GPT-4o cost: $3.00 input + $0.48 output = \$3.48 per
generation
* Monthly usage (10 repos): \~\$35-50/month (estimated)

**Match model to task complexity:**

1. **Simple projects:** Use o4-mini for cost savings
2. **Standard projects:** Use gpt-5 for latest capabilities or gpt-4o for proven reliability
3. **Complex analysis:** Use gpt-5 for advanced reasoning or o1 series for deep insights
4. **Budget constraints:** Start with o4-mini, upgrade if needed
5. **Cutting-edge needs:** Use gpt-5 for state-of-the-art performance

```javascript theme={null}
{
  "auto_model_selection": {
    "repository_size": {
      "small": "o4-mini",   // < 100 files
      "medium": "gpt-5",    // 100-1000 files (if available, else gpt-4o)
      "large": "gpt-5"      // 1000+ files (if available, else gpt-4o)
    },
    "complexity_factors": [
      "multiple_languages",
      "microservice_architecture",
      "complex_algorithms"
    ]
  }
}
```

## OpenRouter

OpenRouter provides access to 100+ AI models through a single API, perfect for comparison and specialized needs.

### Setup

1. Sign up at [OpenRouter](https://openrouter.ai/)
2. Verify your email address
3. Add payment method for paid models
4. Navigate to the Keys section

Some models are free, others require credits. Check individual model pricing.

1. Click "Create Key"
2. Name your key (e.g., "DeepWiki-Prod")
3. Copy the key (starts with `sk-or-`)
4. Optionally set spending limits

OpenRouter API key generated with spending limits configured

```env theme={null}
OPENROUTER_API_KEY=sk-or-...your_actual_key_here
```

### Popular Models

**Models:** `anthropic/claude-3.5-sonnet`, `anthropic/claude-3-haiku`

**Best for:**

* Excellent code analysis and explanation
* Clear, structured documentation
* Complex reasoning tasks
* Safe, helpful responses

**Pricing:** $3/1M input tokens, $15/1M output tokens (3.5 Sonnet)

**Use cases:**

* API documentation generation
* Code architecture explanation
* Security-focused analysis

**Models:** `google/gemini-pro`, `google/gemini-pro-vision`

**Best for:**

* Multimodal analysis (code + diagrams)
* Fast processing
* Good balance of quality and speed

**Pricing:** Often lower than direct Google API

**Use cases:**

* Visual diagram analysis
* Multi-language projects
* Quick documentation updates

**Models:** `meta-llama/llama-3-70b`, `mistralai/mixtral-8x7b`

**Best for:**

* Cost-effective documentation
* Privacy-conscious projects
* Experimentation and development

**Pricing:** Usually $0.50-$2.00 per 1M tokens

**Use cases:**

* Large-scale documentation projects
* Internal/proprietary code analysis
* Development and testing

**Models:** `deepseek/deepseek-coder`, `phind/phind-codellama`

**Best for:**

* Code-specific analysis
* Programming language expertise
* Technical documentation

**Use cases:**

* Algorithm explanation
* Code optimization documentation
* Programming tutorial generation

### Model Comparison Strategy

Start with a reliable, fast model:

```javascript theme={null}
{
  "baseline_model": "anthropic/claude-3.5-sonnet",
  "test_repository": "https://github.com/small/test-repo"
}
```

Compare models for your specific use case:

```javascript theme={null}
{
  "comparison_models": [
    "openai/gpt-4o",
    "google/gemini-pro",
    "meta-llama/llama-3-70b"
  ],
  "evaluation_criteria": [
    "accuracy",
    "completeness",
    "code_understanding",
    "diagram_quality",
    "cost_per_generation"
  ]
}
```

Select the best model
based on results:

```javascript theme={null}
{
  "selected_model": "anthropic/claude-3.5-sonnet",
  "reason": "Best code analysis with reasonable cost",
  "fallback_model": "google/gemini-pro",
  "reason_fallback": "Faster generation when speed needed"
}
```

## Azure OpenAI

Enterprise-grade OpenAI models with enhanced security, compliance, and control.

### Setup

1. Sign in to [Azure Portal](https://portal.azure.com/)
2. Create new Azure OpenAI resource
3. Choose region (check model availability)
4. Configure pricing tier and network settings
5. Wait for deployment completion

Azure OpenAI may require approval for access. Check the application status.

1. Go to Azure OpenAI Studio
2. Navigate to Deployments
3. Deploy required models (GPT-4, GPT-3.5-turbo, etc.)
4. Note deployment names and endpoints

Models deployed and endpoints configured

Collect the required information:

* **Endpoint:** `https://your-resource.openai.azure.com`
* **API Key:** From resource keys section
* **API Version:** e.g., `2024-02-15-preview`

```env theme={null}
AZURE_OPENAI_API_KEY=abc123...your_actual_key_here
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_VERSION=2024-02-15-preview
```

### Enterprise Features

**Key benefits:**

* Data processed within your Azure tenant
* No data used for model training
* GDPR, SOC 2, HIPAA compliance available
* Private networking with VNet integration

**Configuration:**

```javascript theme={null}
{
  "privacy_settings": {
    "data_residency": "eu-west",   // Keep data in specific region
    "logging": "minimal",          // Reduce data logging
    "retention": "30_days",        // Automatic data deletion
    "private_endpoint": true       // Use private networking
  }
}
```

**Built-in safety features:**

* Automatic content filtering for harmful content
* Customizable filter levels
* Compliance with organizational policies

**Configuration:**

```javascript theme={null}
{
  "content_filter": {
    "severity_level": "medium",    // low, medium, high
    "categories": ["hate", "violence", "self_harm", "sexual"],
    "custom_blocklists": ["internal_terms"],
    "action_on_filter": "block"    // block, warn, log
  }
}
```

**Enterprise-grade performance:**

* Dedicated capacity options
* Predictable performance
* Custom rate limits
* Multi-region deployment

**Configuration:**

```javascript theme={null}
{
  "performance_settings": {
    "capacity_type": "provisioned",   // provisioned vs pay-per-token
    "throughput_units": 100,          // Dedicated throughput
    "auto_scaling": true,             // Scale with demand
    "load_balancing": "round_robin"   // Distribute across regions
  }
}
```

## AWS Bedrock

AWS-hosted AI models with enterprise features and AWS service integration.

### Setup

1. Ensure you have an AWS account
2. Enable AWS Bedrock in your region
3. Request access to required models (may require approval)
4. Create IAM user with Bedrock permissions

Bedrock is not available in all AWS regions. Check regional availability.

Create IAM policy for Bedrock access:

```json theme={null}
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  ]
}
```

```env theme={null}
AWS_ACCESS_KEY_ID=AKIA...your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_access_key_here
AWS_REGION=us-east-1
```

AWS credentials configured and Bedrock access verified

### Available Models

**Models:**

* `anthropic.claude-3-sonnet-20240229-v1:0`
* `anthropic.claude-3-haiku-20240307-v1:0`
* `anthropic.claude-3-opus-20240229-v1:0`

**Best for:** Code analysis, documentation, safety-conscious generation

**Pricing:** \$3-15 per 1M tokens depending on model

**Models:**

* `amazon.titan-text-express-v1`
* `amazon.titan-text-lite-v1`

**Best for:** Cost-effective text generation

**Pricing:** \$0.50-2.00 per 1M tokens

**Models:**

* `ai21.j2-ultra-v1`
* `ai21.j2-mid-v1`

**Best for:** Long-form documentation, detailed analysis

**Pricing:** Varies by model

## Ollama (Local Models)

Run AI models locally for complete privacy, cost control, and
offline capability.

### Setup

**macOS**

```bash theme={null}
# Using Homebrew
brew install ollama

# Or download installer from https://ollama.ai
curl -fsSL https://ollama.ai/install.sh | sh
```

**Linux**

```bash theme={null}
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start as system service
sudo systemctl enable ollama
sudo systemctl start ollama
```

**Windows**

Download and install from [Ollama website](https://ollama.ai/download/windows)

Or use Windows Subsystem for Linux (WSL) with Linux instructions.

**Docker**

```bash theme={null}
# Run Ollama in Docker
docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

# Pull and run a model
docker exec -it ollama ollama run llama3:8b
```

Download models you want to use:

```bash theme={null}
# Recommended models for documentation
ollama pull qwen3:8b      # Excellent for code
ollama pull llama3:8b     # Good general model
ollama pull qwen3:1.7b    # Lightweight option

# Verify installation
ollama list
```

Models downloaded and verified

```env theme={null}
OLLAMA_HOST=http://localhost:11434
```

For remote Ollama servers:

```env theme={null}
OLLAMA_HOST=http://ollama-server.internal:11434
```

### Model Selection

**qwen3:8b (Recommended)**

* **Size:** 4.8GB download
* **RAM:** 8GB required
* **Strengths:** Excellent code understanding, multilingual
* **Best for:** Most documentation tasks

**deepseek-coder:6.7b**

* **Size:** 3.8GB download
* **RAM:** 6GB required
* **Strengths:** Specialized for code generation and analysis
* **Best for:** Technical documentation, API docs

**llama3:8b**

* **Size:** 4.7GB download
* **RAM:** 8GB required
* **Strengths:** Well-balanced, good reasoning
* **Best for:** General documentation, explanations

**llama3:70b** (High-end)

* **Size:** 40GB download
* **RAM:** 64GB+ required
* **Strengths:** Excellent quality, very detailed
* **Best for:** High-quality documentation with powerful hardware

**qwen3:1.7b**

* **Size:** 1.0GB download
* **RAM:** 2GB required
* **Strengths:** Fast, efficient,
good for simple tasks
* **Best for:** Quick documentation, low-resource environments

**tinyllama:1.1b**

* **Size:** 637MB download
* **RAM:** 1GB required
* **Strengths:** Very fast, minimal resources
* **Best for:** Testing, simple explanations

### Performance Optimization

**Minimum specs by model size:**

* **1B-3B models:** 4GB RAM, any modern CPU
* **7B-8B models:** 8GB RAM, modern CPU (preferably 8+ cores)
* **13B models:** 16GB RAM, high-performance CPU
* **70B+ models:** 64GB+ RAM, server-grade hardware

**GPU acceleration (optional):**

```bash theme={null}
# Enable GPU support (NVIDIA)
ollama pull llama3:8b
CUDA_VISIBLE_DEVICES=0 ollama run llama3:8b

# Check GPU usage
nvidia-smi
```

**Optimize memory usage:**

```bash theme={null}
# Set memory limits
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_MAX_QUEUE=4

# Configure model parameters
export OLLAMA_NUM_PARALLEL=2
export OLLAMA_FLASH_ATTENTION=1
```

**Model configuration:**

```javascript theme={null}
{
  "model_config": {
    "num_ctx": 4096,         // Context window size
    "num_predict": 2048,     // Max output tokens
    "temperature": 0.7,      // Randomness
    "top_p": 0.8,            // Nucleus sampling
    "repeat_penalty": 1.1    // Avoid repetition
  }
}
```

## Multi-Provider Strategy

### Provider Selection Matrix

| Project Type    | Primary       | Fallback              | Reason                  |
| --------------- | ------------- | --------------------- | ----------------------- |
| **Open Source** | Google Gemini | OpenRouter            | Free tier, good quality |
| **Enterprise**  | Azure OpenAI  | OpenAI                | Security, compliance    |
| **Startup**     | OpenRouter    | Google                | Cost optimization       |
| **Research**    | OpenAI GPT-4o | Claude via OpenRouter | Highest quality         |
| **Personal**    | Ollama        | Google                | Privacy, no cost        |

| Size                    | Primary             | Reason                       |
| ----------------------- | ------------------- | ---------------------------- |
| **Small (\<100 files)** | Google Gemini Flash | Fast, sufficient quality     |
| **Medium (100-1000)**   | OpenAI GPT-4o       | Better architecture analysis |
| **Large (1000+)**       | Claude 3.5 Sonnet   | Excellent at large contexts  |
| **Enterprise**          | Azure OpenAI        | Security and compliance      |

| Use Case              | Best Provider              | Model             | Why                 |
| --------------------- | -------------------------- | ----------------- | ------------------- |
| **API Documentation** | OpenAI                     | GPT-4o            | Structured output   |
| **Architecture Docs** | Anthropic (via OpenRouter) | Claude 3.5 Sonnet | System thinking     |
| **Code Comments**     | Google                     | Gemini Flash      | Speed + accuracy    |
| **Security Docs**     | Azure OpenAI               | GPT-4o            | Enterprise security |
| **Cost-Conscious**    | Ollama                     | Qwen3:8b          | No API costs        |

### Auto-Failover Configuration

```javascript theme={null}
{
  "provider_strategy": {
    "primary": {
      "provider": "google",
      "model": "gemini-2.0-flash",
      "timeout": 30
    },
    "fallback_chain": [
      {
        "provider": "openrouter",
        "model": "anthropic/claude-3.5-sonnet",
        "condition": "rate_limit_exceeded"
      },
      {
        "provider": "ollama",
        "model": "qwen3:8b",
        "condition": "network_error"
      }
    ],
    "retry_logic": {
      "max_retries": 3,
      "backoff_factor": 2,
      "jitter": true
    }
  }
}
```

## Next Steps

* Set up access control for your DeepWiki deployment
* Create your first repository documentation
* Deploy with multiple providers for production use
* Integrate provider selection into your workflows
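
One way to read the auto-failover configuration shown earlier is as a retry loop with a conditional hand-off. The sketch below illustrates that reading under stated assumptions and is not DeepWiki-Open's implementation: `ProviderError`, `backoff_delays`, `generate_with_failover`, and the `call(provider, model)` callback are hypothetical names. It retries the current provider with exponential backoff, then switches to the first fallback whose `condition` matches the last error. The `timeout` field is left out for brevity.

```python
import random
import time

# Mirrors the "provider_strategy" config from this guide (timeout omitted).
STRATEGY = {
    "primary": {"provider": "google", "model": "gemini-2.0-flash"},
    "fallback_chain": [
        {"provider": "openrouter", "model": "anthropic/claude-3.5-sonnet",
         "condition": "rate_limit_exceeded"},
        {"provider": "ollama", "model": "qwen3:8b",
         "condition": "network_error"},
    ],
    "retry_logic": {"max_retries": 3, "backoff_factor": 2, "jitter": True},
}


class ProviderError(Exception):
    """Hypothetical error type; `condition` matches the config's condition strings."""

    def __init__(self, condition):
        super().__init__(condition)
        self.condition = condition


def backoff_delays(max_retries, backoff_factor, jitter=True, rng=random.random):
    """Yield one delay in seconds per retry: factor**0, factor**1, ..."""
    for attempt in range(max_retries):
        delay = float(backoff_factor ** attempt)
        if jitter:
            delay *= 0.5 + rng() / 2  # scale into [50%, 100%) of the base delay
        yield delay


def generate_with_failover(call, strategy=STRATEGY, sleep=time.sleep):
    """Call the primary provider, retrying and failing over as configured."""
    retry = strategy["retry_logic"]
    target = strategy["primary"]
    fallbacks = list(strategy["fallback_chain"])
    while True:
        delays = backoff_delays(
            retry["max_retries"], retry["backoff_factor"], retry["jitter"]
        )
        error = None
        # One initial attempt plus `max_retries` retries on the same provider.
        for attempt in range(retry["max_retries"] + 1):
            try:
                return call(target["provider"], target["model"])
            except ProviderError as exc:
                error = exc
                if attempt < retry["max_retries"]:
                    sleep(next(delays))
        # Retries exhausted: use the first fallback matching the last error.
        match = next(
            (f for f in fallbacks if f["condition"] == error.condition), None
        )
        if match is None:
            raise error
        fallbacks.remove(match)
        target = match
```

In this reading, a caller would pass a function that performs the actual request and raises `ProviderError("rate_limit_exceeded")` or `ProviderError("network_error")` as appropriate. Jitter spreads retries out so that several workers hitting the same rate limit do not all retry at the same instant.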