# Model Endpoints API

DeepWiki provides a flexible provider-based model selection system that supports multiple LLM providers. This documentation covers the model-related API endpoints and how to work with different model providers.

## Overview

DeepWiki's model provider system allows you to choose from various AI model providers including:

* **Google** - Gemini models
* **OpenAI** - GPT models
* **OpenRouter** - Access to multiple model providers through a unified API
* **Azure OpenAI** - Azure-hosted OpenAI models
* **Ollama** - Locally running open-source models
* **AWS Bedrock** - Amazon's managed AI models
* **DashScope** - Alibaba's AI models

Each provider offers different models with specific capabilities and pricing. The system is designed to be extensible, allowing service providers to add custom models as needed.

## Authentication

Before using any model provider, you need to configure the appropriate API keys as environment variables:

```bash
# Google Gemini
GOOGLE_API_KEY=your_google_api_key

# OpenAI
OPENAI_API_KEY=your_openai_api_key

# OpenRouter
OPENROUTER_API_KEY=your_openrouter_api_key

# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_ENDPOINT=your_azure_openai_endpoint
AZURE_OPENAI_VERSION=your_azure_openai_version

# AWS Bedrock
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_REGION=your_aws_region

# Ollama (if not local)
OLLAMA_HOST=http://your-ollama-host:11434

# DashScope
DASHSCOPE_API_KEY=your_dashscope_api_key
```
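
A quick way to catch a missing key before sending a request is to check the environment up front. The sketch below is illustrative only: the variable names come from the list above, while the grouping by provider ID and the `missing_env_vars` helper are assumptions, not part of DeepWiki.

```python
import os

# Variable names are taken from the list above; the mapping by provider ID
# is an illustrative assumption, not DeepWiki's own code.
REQUIRED_ENV_VARS = {
    "google": ["GOOGLE_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "azure": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_VERSION"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION"],
    "dashscope": ["DASHSCOPE_API_KEY"],
    "ollama": [],  # OLLAMA_HOST is only needed for a non-local Ollama server
}


def missing_env_vars(provider, env=None):
    """Return the variables for `provider` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]


if __name__ == "__main__":
    for provider in REQUIRED_ENV_VARS:
        missing = missing_env_vars(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.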

## Endpoints

### Get Model Configuration

Retrieves the available model providers and their supported models.

```http
GET /models/config
```

#### Response

```json
{
  "providers": [
    {
      "id": "google",
      "name": "Google",
      "supportsCustomModel": true,
      "models": [
        {
          "id": "gemini-2.0-flash",
          "name": "gemini-2.0-flash"
        },
        {
          "id": "gemini-2.5-flash-preview-05-20",
          "name": "gemini-2.5-flash-preview-05-20"
        },
        {
          "id": "gemini-2.5-pro-preview-03-25",
          "name": "gemini-2.5-pro-preview-03-25"
        }
      ]
    },
    {
      "id": "openai",
      "name": "Openai",
      "supportsCustomModel": true,
      "models": [
        {
          "id": "gpt-4o",
          "name": "gpt-4o"
        },
        {
          "id": "gpt-4.1",
          "name": "gpt-4.1"
        },
        {
          "id": "o1",
          "name": "o1"
        },
        {
          "id": "o3",
          "name": "o3"
        },
        {
          "id": "o4-mini",
          "name": "o4-mini"
        }
      ]
    }
  ],
  "defaultProvider": "google"
}
```

#### Example Requests

**cURL:**

```bash
curl -X GET "http://localhost:8001/models/config" \
  -H "Accept: application/json"
```

**Python:**

```python
import requests

response = requests.get("http://localhost:8001/models/config")
config = response.json()

# List all providers
for provider in config["providers"]:
    print(f"Provider: {provider['name']}")
    for model in provider["models"]:
        print(f"  - {model['id']}")
```

**JavaScript:**

```javascript
const response = await fetch('http://localhost:8001/models/config');
const config = await response.json();

// Get available models for a specific provider
const googleModels = config.providers
  .find(p => p.id === 'google')
  ?.models || [];
```
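
Once the configuration is fetched, a client usually needs to turn it into a concrete provider and model pair. The following is a minimal sketch of that step, assuming the payload shape shown in the sample response. The `resolve_model` helper is hypothetical, and treating the first listed model as the fallback is an assumption of this sketch rather than documented server behavior.

```python
def resolve_model(config, provider_id=None):
    """Pick a provider and its first listed model from a /models/config payload.

    Falls back to the payload's `defaultProvider` when `provider_id` is None.
    Using the first listed model as the fallback is an assumption of this sketch.
    """
    provider_id = provider_id or config["defaultProvider"]
    for provider in config["providers"]:
        if provider["id"] == provider_id:
            models = provider["models"]
            return provider_id, (models[0]["id"] if models else None)
    raise KeyError(f"unknown provider: {provider_id}")


# Trimmed copy of the sample response shown earlier in this section.
sample_config = {
    "providers": [
        {"id": "google", "name": "Google", "supportsCustomModel": True,
         "models": [{"id": "gemini-2.0-flash", "name": "gemini-2.0-flash"}]},
        {"id": "openai", "name": "Openai", "supportsCustomModel": True,
         "models": [{"id": "gpt-4o", "name": "gpt-4o"}]},
    ],
    "defaultProvider": "google",
}

print(resolve_model(sample_config))            # ('google', 'gemini-2.0-flash')
print(resolve_model(sample_config, "openai"))  # ('openai', 'gpt-4o')
```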

### Using Models in Chat Completions

Model selection is part of the chat completions endpoint: pass `provider` and `model` in the request body.

```http
POST /chat/completions/stream
```

#### Request Body

```json
{
  "repo_url": "https://github.com/user/repo",
  "messages": [
    {
      "role": "user",
      "content": "Explain the main functionality of this repository"
    }
  ],
  "provider": "google",
  "model": "gemini-2.0-flash",
  "language": "en",
  "token": "optional_github_token_for_private_repos"
}
```

#### Parameters

| Parameter  | Type   | Required | Description                                                                    |
| ---------- | ------ | -------- | ------------------------------------------------------------------------------ |
| `repo_url` | string | Yes      | URL of the repository to analyze                                               |
| `messages` | array  | Yes      | Array of chat messages                                                         |
| `provider` | string | No       | Model provider ID (default: "google")                                          |
| `model`    | string | No       | Model ID for the specified provider (uses provider's default if not specified) |
| `language` | string | No       | Language for content generation (default: "en")                                |
| `token`    | string | No       | Personal access token for private repositories                                 |
| `type`     | string | No       | Repository type: "github", "gitlab", or "bitbucket" (default: "github")        |
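
The parameter table can be mirrored by a small request builder that fills in the documented defaults and rejects obviously invalid input before anything is sent. This is a sketch under assumptions: `build_chat_request` and its client-side checks are hypothetical and do not reflect the server's actual validation.

```python
VALID_REPO_TYPES = {"github", "gitlab", "bitbucket"}


def build_chat_request(repo_url, messages, provider="google", model=None,
                       language="en", token=None, repo_type="github"):
    """Assemble a JSON-serializable body for POST /chat/completions/stream."""
    if not repo_url:
        raise ValueError("repo_url is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    if repo_type not in VALID_REPO_TYPES:
        raise ValueError(f"unsupported repository type: {repo_type}")

    body = {
        "repo_url": repo_url,
        "messages": messages,
        "provider": provider,
        "language": language,
        "type": repo_type,
    }
    # Optional fields are left out so the server can apply its own defaults.
    if model is not None:
        body["model"] = model
    if token is not None:
        body["token"] = token
    return body
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.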

#### Example Requests

**cURL with Google Gemini:**

```bash
curl -X POST "http://localhost:8001/chat/completions/stream" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/asyncfuncai/deepwiki-open",
    "messages": [
      {
        "role": "user",
        "content": "What is the main purpose of this project?"
      }
    ],
    "provider": "google",
    "model": "gemini-2.0-flash"
  }'
```

**Python with OpenAI:**

```python
import requests

url = "http://localhost:8001/chat/completions/stream"
data = {
    "repo_url": "https://github.com/asyncfuncai/deepwiki-open",
    "messages": [
        {
            "role": "user",
            "content": "Explain the architecture of this application"
        }
    ],
    "provider": "openai",
    "model": "gpt-4o"
}

response = requests.post(url, json=data, stream=True)
response.raise_for_status()  # fail early on 4xx/5xx instead of printing an error body
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))
```

**JavaScript with OpenRouter:**

```javascript
const response = await fetch('http://localhost:8001/chat/completions/stream', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    repo_url: 'https://github.com/asyncfuncai/deepwiki-open',
    messages: [
      {
        role: 'user',
        content: 'What are the key features of this repository?'
      }
    ],
    provider: 'openrouter',
    model: 'anthropic/claude-3.5-sonnet'
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  console.log(decoder.decode(value, { stream: true }));
}
```

## Model Provider Details

### Google (Gemini)

Default provider with fast and capable models.

**Available Models:**

* `gemini-2.0-flash` - Fast, efficient model (default)
* `gemini-2.5-flash-preview-05-20` - Preview release of Gemini 2.5 Flash
* `gemini-2.5-pro-preview-03-25` - Preview release of Gemini 2.5 Pro

**Configuration:**

```json
{
  "provider": "google",
  "model": "gemini-2.0-flash"
}
```

### OpenAI

GPT and o-series reasoning models.

**Available Models:**

* `gpt-4o` - GPT-4o multimodal model (default)
* `gpt-4.1` - Newer GPT-4.1 model
* `o1` - Reasoning model
* `o3` - More advanced reasoning model
* `o4-mini` - Smaller, faster reasoning model

**Configuration:**

```json
{
  "provider": "openai",
  "model": "gpt-4o"
}
```

### OpenRouter

Access multiple model providers through a unified API.

**Available Models:**

* `openai/gpt-4o` - OpenAI GPT-4o (default)
* `deepseek/deepseek-r1` - DeepSeek reasoning model
* `anthropic/claude-3.7-sonnet` - Claude 3.7 Sonnet
* `anthropic/claude-3.5-sonnet` - Claude 3.5 Sonnet
* And many more...

**Configuration:**

```json
{
  "provider": "openrouter",
  "model": "anthropic/claude-3.5-sonnet"
}
```

### Azure OpenAI

Azure-hosted OpenAI models with enterprise features.

**Available Models:**

* `gpt-4o` - GPT-4o on Azure (default)
* `gpt-4` - Standard GPT-4
* `gpt-35-turbo` - GPT-3.5 Turbo
* `gpt-4-turbo` - GPT-4 Turbo

**Configuration:**

```json
{
  "provider": "azure",
  "model": "gpt-4o"
}
```

**Note:** Requires `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_VERSION` to be set in addition to the API key.

### Ollama

Run models locally for privacy and cost efficiency.

**Available Models:**

* `qwen3:1.7b` - Small, fast model (default)
* `llama3:8b` - Llama 3 8B model
* `qwen3:8b` - Qwen 3 8B model

**Configuration:**

```json
{
  "provider": "ollama",
  "model": "llama3:8b"
}
```

**Note:** Requires Ollama to be running locally or reachable at the address set in `OLLAMA_HOST`.

### AWS Bedrock

Amazon's managed AI service.

**Available Models:**

* `anthropic.claude-3-sonnet-20240229-v1:0` - Claude 3 Sonnet (default)
* `anthropic.claude-3-haiku-20240307-v1:0` - Claude 3 Haiku
* `anthropic.claude-3-opus-20240229-v1:0` - Claude 3 Opus
* `amazon.titan-text-express-v1` - Amazon Titan
* `cohere.command-r-v1:0` - Cohere Command R
* `ai21.j2-ultra-v1` - AI21 Jurassic

**Configuration:**

```json
{
  "provider": "bedrock",
  "model": "anthropic.claude-3-sonnet-20240229-v1:0"
}
```

### DashScope

Alibaba's AI models.

**Available Models:**

* `qwen-plus` - Qwen Plus (default)
* `qwen-turbo` - Qwen Turbo
* `deepseek-r1` - DeepSeek R1

**Configuration:**

```json
{
  "provider": "dashscope",
  "model": "qwen-plus"
}
```

## Custom Models

Providers that support custom models (where `supportsCustomModel: true`) allow you to specify model IDs not listed in the predefined options. This is useful for:

* Newly released models
* Fine-tuned models
* Private or custom deployments

**Example with custom model:**

```json
{
  "provider": "openai",
  "model": "ft:gpt-3.5-turbo-0125:custom:model:id"
}
```
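
A client can use the `supportsCustomModel` flag to decide whether to accept a model ID that is not in the predefined list. The helper below is a hypothetical sketch of that check against a `/models/config` payload. The `locked` provider in the sample data is invented purely to show the negative case.

```python
def is_model_allowed(config, provider_id, model_id):
    """Return True if `model_id` is listed for the provider or custom IDs are accepted."""
    for provider in config["providers"]:
        if provider["id"] == provider_id:
            listed = {m["id"] for m in provider["models"]}
            return model_id in listed or bool(provider.get("supportsCustomModel"))
    return False  # unknown provider


# Sample data: "openai" mirrors the documented response; "locked" is hypothetical.
example_config = {
    "providers": [
        {"id": "openai", "supportsCustomModel": True,
         "models": [{"id": "gpt-4o", "name": "gpt-4o"}]},
        {"id": "locked", "supportsCustomModel": False,
         "models": [{"id": "model-a", "name": "model-a"}]},
    ],
}

print(is_model_allowed(example_config, "openai", "ft:gpt-3.5-turbo-0125:custom:model:id"))  # True
print(is_model_allowed(example_config, "locked", "model-b"))  # False
```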

## Error Handling

The API returns standard HTTP status codes, with a JSON body whose `detail` field describes the error.

### Common Errors

**400 Bad Request:**

```json
{
  "detail": "No messages provided"
}
```

**401 Unauthorized:**

```json
{
  "detail": "Invalid API key for provider"
}
```

**404 Not Found:**

```json
{
  "detail": "Model not found for provider"
}
```

**500 Internal Server Error:**

```json
{
  "detail": "Error preparing retriever: No valid document embeddings found"
}
```

### Error Handling Examples

**Python:**

```python
import requests

# `url` and `data` hold the endpoint URL and request body for your call
try:
    response = requests.post(url, json=data)
    response.raise_for_status()
    result = response.json()
except requests.exceptions.HTTPError as e:
    detail = e.response.json()['detail']
    if e.response.status_code == 400:
        print(f"Bad request: {detail}")
    elif e.response.status_code == 500:
        print(f"Server error: {detail}")
    else:
        print(f"HTTP {e.response.status_code}: {detail}")
```

**JavaScript:**

```javascript
// `url` and `options` hold the endpoint URL and fetch options for your call
try {
  const response = await fetch(url, options);
  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.detail);
  }
  const data = await response.json();
} catch (error) {
  console.error('API Error:', error.message);
}
```

## Rate Limiting

Rate limiting depends on the model provider being used:

* **Google Gemini**: Subject to Google AI Studio quotas
* **OpenAI**: Based on your OpenAI tier and usage
* **OpenRouter**: Depends on the specific model and your OpenRouter credits
* **Azure OpenAI**: Based on your Azure deployment quotas
* **Ollama**: Limited by local hardware resources
* **AWS Bedrock**: Subject to AWS service quotas
* **DashScope**: Based on Alibaba Cloud quotas

It's recommended to implement retry logic with exponential backoff for production applications.
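
As one possible shape for that retry logic, the sketch below wraps any callable and waits progressively longer between attempts, with random jitter. Everything in it is an assumption for illustration: `RetryableError` and `call_with_backoff` are hypothetical names, and deciding which failures count as retryable (for example HTTP 429 or 5xx from the upstream provider) is left to the caller.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. HTTP 429 or 5xx)."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call `func` until it succeeds, doubling the wait cap after each retryable failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads out retries from concurrent clients.
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists so the waiting can be replaced in tests. In real use, `func` would typically perform the HTTP request and raise `RetryableError` when the status code indicates a transient failure.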

## Best Practices

1. **Model Selection**: Choose models based on your specific needs:
   * Use faster models (e.g., `gemini-2.0-flash`, `gpt-4o-mini`) for simple queries
   * Use more capable models (e.g., `gpt-4o`, or `anthropic/claude-3.5-sonnet` via OpenRouter) for complex analysis

2. **Error Handling**: Check the HTTP status of every API call and read the `detail` field of error responses

3. **Streaming**: Read the `/chat/completions/stream` response incrementally so users see output as it is generated

4. **Caching**: DeepWiki automatically caches wiki generation results to improve performance

5. **Security**: Never expose API keys in client-side code; keep them in server-side environment variables

6. **Cost Optimization**: Monitor usage and costs, especially with premium models

## Configuration Files

DeepWiki uses JSON configuration files to manage model settings:

* `api/config/generator.json` - Model provider configurations
* `api/config/embedder.json` - Embedding model settings
* `api/config/repo.json` - Repository processing settings

You can customize these files or use the `DEEPWIKI_CONFIG_DIR` environment variable to specify a custom configuration directory.
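
To illustrate how a script might locate these files, here is a minimal sketch that prefers `DEEPWIKI_CONFIG_DIR` when it is set and otherwise falls back to `api/config`. The `config_path` and `load_config` helpers are hypothetical, and the lookup order is an assumption about the intended behavior rather than DeepWiki's actual loader.

```python
import json
import os
from pathlib import Path

DEFAULT_CONFIG_DIR = Path("api/config")  # directory named in the list above


def config_path(filename, env=None):
    """Return the path for `filename`, preferring DEEPWIKI_CONFIG_DIR when set."""
    env = os.environ if env is None else env
    custom_dir = env.get("DEEPWIKI_CONFIG_DIR")
    base = Path(custom_dir) if custom_dir else DEFAULT_CONFIG_DIR
    return base / filename


def load_config(filename, env=None):
    """Read and parse one of the JSON configuration files."""
    with open(config_path(filename, env), encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_config("generator.json")` would read the model provider configuration from whichever directory is in effect.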
