DeepWiki-Open’s core feature is intelligent wiki generation that transforms any repository into comprehensive, navigable documentation. This guide covers how generation works, the available generation options, customization, quality optimization, and troubleshooting.

How Wiki Generation Works

1. Repository Analysis

DeepWiki clones and analyzes the repository structure, identifying:
  • File types and programming languages
  • Directory organization and architecture patterns
  • Dependencies and configuration files
  • Documentation and README files
  • Test structures and examples
When this step succeeds, the repository has been analyzed and indexed.
2. Code Embedding

DeepWiki creates vector embeddings of code content for intelligent retrieval, covering:
  • Function and class definitions
  • Comments and documentation
  • Configuration settings
  • API endpoints and interfaces
  • Database schemas and models
Embeddings enable semantic search and context-aware documentation generation.
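As a rough illustration of how embeddings enable semantic search, the sketch below ranks code chunks by cosine similarity to a query vector. It is not DeepWiki's actual retrieval code: the function names, the chunk shape (`path`, `embedding`) and the toy 3-dimensional vectors are all assumptions made for the example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are closest to the query embedding.
function topKChunks(queryEmbedding, chunks, k) {
  return chunks
    .map((chunk) => ({
      path: chunk.path,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy vectors (hypothetical); real embeddings have hundreds of dimensions.
const exampleChunks = [
  { path: "src/auth.js", embedding: [1, 0, 0] },
  { path: "src/db.js", embedding: [0, 1, 0] },
  { path: "README.md", embedding: [0.9, 0.1, 0] },
];
const nearest = topKChunks([1, 0, 0], exampleChunks, 2);
console.log(nearest.map((r) => r.path)); // [ 'src/auth.js', 'README.md' ]
```

The chunks returned by a search like this are what a generator would pass to the model as context for a given wiki page.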
3. AI-Powered Documentation

DeepWiki uses your selected AI model to generate:
  • Project overview and purpose
  • Installation and setup instructions
  • Architecture explanations
  • Component relationships
  • Usage examples and best practices
Different AI models produce varying documentation styles. Experiment to find your preference.
4. Visual Diagram Generation

DeepWiki automatically creates Mermaid diagrams showing:
  • System architecture
  • Data flow and processing
  • Component relationships
  • Database schemas
  • API endpoint structures
When this step succeeds, interactive diagrams are generated and embedded in the documentation.
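To make the diagram step more concrete, here is a minimal sketch that turns a list of component relationships into Mermaid flowchart source. The edge format (`from`, `to`, optional `label`) and both function names are hypothetical; DeepWiki produces its diagrams through the AI model, not through a helper like this.

```javascript
// Mermaid node IDs should be simple identifiers, so replace other characters.
function nodeId(name) {
  return name.replace(/[^A-Za-z0-9_]/g, "_");
}

// Build Mermaid flowchart source from directed component relationships.
// Note: names containing double quotes are not escaped in this sketch.
function toMermaidFlowchart(edges) {
  const lines = ["flowchart TD"];
  for (const { from, to, label } of edges) {
    const arrow = label ? `-->|${label}|` : "-->";
    lines.push(`  ${nodeId(from)}["${from}"] ${arrow} ${nodeId(to)}["${to}"]`);
  }
  return lines.join("\n");
}

console.log(
  toMermaidFlowchart([
    { from: "Web UI", to: "API Server", label: "HTTP" },
    { from: "API Server", to: "Vector Store" },
  ])
);
```

Pasting the printed text into any Mermaid renderer should draw a three-node top-down flowchart.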

Generation Options

Model Selection

Choose the AI model that best fits your documentation needs.

Generation Parameters

Use force regeneration when:
  • Repository has been significantly updated
  • You want to try a different AI model
  • Previous generation had errors
  • You want fresh documentation with latest model improvements
{
  "force_regenerate": true,
  "reason": "Updated to use new architecture patterns"
}
Force regeneration will overwrite existing cached documentation. Consider backing up important custom modifications.
An access token is required for:
  • Private GitHub repositories
  • Private GitLab repositories
  • Private BitBucket repositories
  • Organizations with restricted access
Token permissions needed:
  • GitHub: repo scope (full repository access)
  • GitLab: read_repository scope
  • BitBucket: Repositories: Read permission
{
  "repo_url": "https://github.com/company/private-repo",
  "access_token": "ghp_xxxxxxxxxxxxxxxxxxxx"
}
Tokens are used only for repository access and are not stored permanently.
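One way to keep tokens out of source files is to read them from the environment when building the request. The sketch below assumes a hypothetical `GITHUB_TOKEN` environment variable and hypothetical helper names; only the `repo_url` and `access_token` field names come from the example above.

```javascript
// Build a generation request body. Only repo_url and access_token are taken
// from the documented example; the helper itself is illustrative.
function buildWikiRequest(repoUrl, accessToken) {
  const url = new URL(repoUrl); // throws a TypeError on malformed URLs
  const body = { repo_url: url.toString() };
  if (accessToken) body.access_token = accessToken; // omit for public repos
  return body;
}

// Mask a token for log output so the secret never appears in full.
function maskToken(token) {
  if (!token) return "";
  return token.length <= 8 ? "****" : `${token.slice(0, 4)}****`;
}

// GITHUB_TOKEN is an assumed variable name; use whatever your setup provides.
const token = process.env.GITHUB_TOKEN;
const request = buildWikiRequest("https://github.com/company/private-repo", token);
console.log("Requesting wiki for", request.repo_url, "token:", maskToken(token));
```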
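Advanced users can adjust model parameters. The // comments below are explanatory only; strict JSON does not allow comments, so remove them before sending the payload: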
{
  "model_config": {
    "temperature": 0.7,      // Creativity vs consistency (0.0-1.0)
    "top_p": 0.8,           // Response diversity (0.0-1.0) 
    "max_tokens": 4000,     // Maximum response length
    "top_k": 20             // Token selection diversity (Gemini only)
  }
}
Parameter effects:
  • Lower temperature (0.1-0.3): More consistent, factual documentation
  • Higher temperature (0.7-0.9): More creative, varied explanations
  • Lower top_p (0.3-0.5): More focused responses
  • Higher top_p (0.8-1.0): More diverse vocabulary and examples
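The ranges above can be checked before a request is sent. The following sketch validates a `model_config` object against the documented 0.0-1.0 ranges and offers two presets that mirror the parameter effects list. The preset names and exact preset values are assumptions picked from within the listed ranges, not DeepWiki defaults.

```javascript
// Ranges taken from the comments in the model_config example.
const LIMITS = {
  temperature: [0.0, 1.0],
  top_p: [0.0, 1.0],
};

// Hypothetical presets chosen from the ranges in the parameter effects list.
const PRESETS = {
  factual: { temperature: 0.2, top_p: 0.4 },
  creative: { temperature: 0.8, top_p: 0.9 },
};

// Return a list of human-readable problems; an empty list means "looks valid".
function validateModelConfig(config) {
  const errors = [];
  for (const [key, [min, max]] of Object.entries(LIMITS)) {
    const value = config[key];
    if (value === undefined) continue; // unset parameters use model defaults
    if (typeof value !== "number" || value < min || value > max) {
      errors.push(`${key} must be a number between ${min} and ${max}`);
    }
  }
  if (
    config.max_tokens !== undefined &&
    (!Number.isInteger(config.max_tokens) || config.max_tokens <= 0)
  ) {
    errors.push("max_tokens must be a positive integer");
  }
  return errors;
}

console.log(validateModelConfig({ ...PRESETS.factual, max_tokens: 4000 })); // []
console.log(validateModelConfig({ temperature: 1.5 }));
```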

Repository Types & Optimization

Programming Languages

DeepWiki optimizes documentation generation for different languages:
JavaScript/TypeScript:
  • React, Vue, Angular component analysis
  • Node.js server architecture
  • API endpoint documentation
  • Package.json and dependency analysis
Python:
  • Django/Flask application structure
  • FastAPI endpoint documentation
  • Class and function analysis
  • Requirements and virtual environment setup
Examples:
  • Express.js servers → API endpoint documentation
  • React apps → Component hierarchy and props
  • Django projects → Model, view, template analysis

Repository Size Optimization

Characteristics of small repositories:
  • Fast generation (30 seconds - 2 minutes)
  • Comprehensive coverage of all files
  • Detailed analysis of each component
Optimization tips:
  • Use any model (all will perform well)
  • Enable detailed analysis
  • Include all file types
  • Generate comprehensive diagrams
Example repositories:
  • Personal projects
  • Small libraries
  • Configuration repositories
  • Simple applications
Characteristics of medium repositories:
  • Moderate generation time (2-10 minutes)
  • Focus on important files and patterns
  • Good balance of detail and overview
Optimization tips:
  • Use fast models like gemini-2.0-flash
  • Focus on core directories
  • Skip generated/compiled files
  • Prioritize documented code
Example repositories:
  • Open source libraries
  • Medium-sized applications
  • Framework implementations
  • Multi-component projects
Characteristics of large repositories:
  • Longer generation time (10-30 minutes)
  • High-level architecture focus
  • Selective detailed analysis
  • Emphasis on main components
Optimization tips:
  • Use efficient models (Gemini Flash series)
  • Configure file filters
  • Focus on main source directories
  • Skip test files for initial generation
  • Use incremental regeneration
Example repositories:
  • Large frameworks (React, Vue, Angular)
  • Enterprise applications
  • Monorepos with multiple projects
  • Complex distributed systems
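Several of the tips above (skip generated or compiled files, skip test files, configure file filters) come down to filtering the file list before analysis. The sketch below shows one possible filter. The exclusion patterns and the function name are hypothetical and should be adapted to the repository; this is not DeepWiki's built-in filter syntax.

```javascript
// Hypothetical exclusion patterns; adjust them to the repository at hand.
const EXCLUDE_PATTERNS = [
  /(^|\/)node_modules\//, // vendored dependencies
  /(^|\/)(dist|build)\//, // generated or compiled output
  /\.min\.js$/, // minified bundles
  /(^|\/)(tests?|__tests__)\//, // test directories
  /\.(test|spec)\.[jt]sx?$/, // test files that live next to source files
];

// Keep only the paths that match none of the exclusion patterns.
function filterFiles(paths, patterns = EXCLUDE_PATTERNS) {
  return paths.filter((p) => !patterns.some((re) => re.test(p)));
}

const kept = filterFiles([
  "src/index.js",
  "src/app.test.js",
  "node_modules/lodash/index.js",
  "dist/app.js",
  "README.md",
]);
console.log(kept); // [ 'src/index.js', 'README.md' ]
```

For a first pass on a large repository, a stricter pattern list keeps generation time down; test files can be added back in a later regeneration.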

Customizing Generated Documentation

Content Customization

1. Repository-Specific Prompts

DeepWiki automatically adapts to repository types, but you can customize the focus:
{
  "focus_areas": [
    "architecture_patterns",
    "api_documentation", 
    "deployment_setup",
    "security_implementation"
  ],
  "exclude_areas": [
    "test_files",
    "generated_code",
    "vendor_dependencies"
  ]
}
2. Documentation Depth

Control the level of detail in generated documentation:
  • High Detail: Complete analysis of all components
  • Medium Detail: Focus on public APIs and main components
  • Overview: High-level architecture and key features only
{
  "detail_level": "medium",
  "include_private_methods": false,
  "focus_on_public_api": true
}
3. Diagram Types

Specify which types of diagrams to generate:
  • Architecture diagrams: System components and relationships
  • Data flow diagrams: Information processing flow
  • Database diagrams: Schema and relationships
  • API diagrams: Endpoint structure and data flow
  • Process diagrams: Workflow and business logic
{
  "diagram_types": [
    "architecture",
    "data_flow", 
    "api_structure"
  ]
}
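The customization options shown in this section can be combined into one object and checked before use. The sketch below merges overrides with defaults and rejects unknown values. Only `"medium"`, `"architecture"`, `"data_flow"` and `"api_structure"` appear in the examples above; the other accepted values, the defaults and the function name are assumptions made for illustration.

```javascript
// "medium" is documented; "high" and "overview" are assumed identifiers
// for the High Detail and Overview levels.
const DETAIL_LEVELS = ["high", "medium", "overview"];

// The first three are documented; "database" and "process" are assumed names
// for the database and process diagram types.
const DIAGRAM_TYPES = ["architecture", "data_flow", "api_structure", "database", "process"];

// Hypothetical defaults, not DeepWiki's actual ones.
const DEFAULT_OPTIONS = {
  detail_level: "medium",
  include_private_methods: false,
  focus_on_public_api: true,
  diagram_types: ["architecture"],
};

// Merge overrides into the defaults and validate the enumerated fields.
function resolveOptions(overrides = {}) {
  const options = { ...DEFAULT_OPTIONS, ...overrides };
  if (!DETAIL_LEVELS.includes(options.detail_level)) {
    throw new Error(`Unknown detail_level: ${options.detail_level}`);
  }
  const unknown = options.diagram_types.filter((t) => !DIAGRAM_TYPES.includes(t));
  if (unknown.length > 0) {
    throw new Error(`Unknown diagram types: ${unknown.join(", ")}`);
  }
  return options;
}

console.log(resolveOptions({ diagram_types: ["architecture", "data_flow"] }));
```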

Output Format Options

Format: Hierarchical pages with cross-references
Best for:
  • General documentation browsing
  • Team onboarding
  • Project understanding
  • Code exploration
Features:
  • Navigation tree
  • Search functionality
  • Cross-page linking
  • Embedded diagrams

Quality Optimization

Improving Documentation Quality

Before generation, optimize your repository:
  1. Update README.md with current project information
  2. Add code comments for complex logic
  3. Update package.json/requirements.txt with current dependencies
  4. Add configuration examples (.env.example, config samples)
  5. Include API documentation (OpenAPI specs, GraphQL schemas)
Impact: 40-60% improvement in documentation accuracy and completeness
Match models to repository characteristics:
  • Simple projects: Use Gemini Flash for speed
  • Complex architectures: Use GPT-4o for depth
  • API-heavy projects: Use Claude 3.5 Sonnet via OpenRouter
  • Data projects: Use models with strong analytical capabilities
A/B Testing approach:
  1. Generate with fast model first (Gemini Flash)
  2. If quality insufficient, regenerate with premium model (GPT-4o)
  3. Compare results and choose best approach for similar projects
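The A/B approach above can be expressed as a small fallback loop: try models from cheapest to most capable and stop at the first result that is good enough. In this sketch, `generate` and `scoreQuality` are caller-supplied placeholders (DeepWiki does not provide a quality score), and the threshold is arbitrary.

```javascript
// Try each model in order and stop once the quality score reaches minScore.
// Returns the best attempt seen, or null if the model list is empty.
async function generateWithFallback(generate, scoreQuality, models, minScore) {
  let best = null;
  for (const model of models) {
    const wiki = await generate(model);
    const score = scoreQuality(wiki);
    if (best === null || score > best.score) best = { model, wiki, score };
    if (score >= minScore) break; // good enough, skip the more expensive models
  }
  return best;
}

// Stub functions standing in for a real generation call and a real review.
const fakeGenerate = async (model) =>
  model === "gemini-2.0-flash" ? "Short overview." : "Detailed overview with architecture notes.";
const lengthScore = (wiki) => wiki.length; // placeholder quality metric

generateWithFallback(fakeGenerate, lengthScore, ["gemini-2.0-flash", "gpt-4o"], 30).then(
  (result) => console.log("Chosen model:", result.model)
);
```

Recording which model was chosen for each kind of repository covers step 3: it gives you a starting point for similar projects.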
Use the Ask feature to improve documentation:
  1. Generate initial wiki
  2. Ask specific questions about unclear sections
  3. Use Deep Research for complex topics
  4. Incorporate answers into understanding
  5. Regenerate sections with better context
Example workflow:
1. Generate wiki → Review architecture section
2. Ask: "Explain the database connection pooling implementation"
3. Get detailed answer → Understand missing context  
4. Regenerate with better understanding

Troubleshooting Generation Issues

Symptoms of incomplete generation:
  • Missing pages or sections
  • Truncated content
  • Error messages during generation
Solutions:
  1. Check API limits: Verify your AI provider has sufficient quota
  2. Reduce scope: Start with smaller directories or file sets
  3. Try different model: Some models handle large contexts better
  4. Check logs: Look for specific error messages in API logs
# Check generation logs
tail -f ./api/logs/application.log

# Look for specific errors like:
# - "Token limit exceeded"
# - "Repository access denied"  
# - "Model timeout"
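When the log is long, scanning it for the three messages listed above can be automated. The sketch below maps each message to a suggested fix. The log line format in the example and the function name are assumptions; only the three error strings come from the list above.

```javascript
// Error messages from the troubleshooting list, each paired with a hint.
const KNOWN_ERRORS = [
  { pattern: /token limit exceeded/i, hint: "Reduce scope or try a model that handles larger contexts" },
  { pattern: /repository access denied/i, hint: "Check the access token and its permissions" },
  { pattern: /model timeout/i, hint: "Use a faster model or raise the timeout" },
];

// Return one finding per matching line, with a 1-based line number.
function diagnoseLog(logText) {
  const findings = [];
  logText.split(/\r?\n/).forEach((line, index) => {
    for (const { pattern, hint } of KNOWN_ERRORS) {
      if (pattern.test(line)) findings.push({ line: index + 1, hint });
    }
  });
  return findings;
}

// Hypothetical log excerpt; the real log format may differ.
const sampleLog = [
  "INFO  Cloning repository",
  "ERROR Token limit exceeded while summarizing src/",
  "ERROR Model timeout after 120s",
].join("\n");
console.log(diagnoseLog(sampleLog));
```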
Symptoms of poor-quality documentation:
  • Generic or inaccurate descriptions
  • Missing technical details
  • Incorrect architecture analysis
Solutions:
  1. Improve repository documentation: Add README, comments, examples
  2. Use higher-quality models: Switch from Flash to GPT-4o
  3. Enable Deep Research: For complex analysis tasks
  4. Provide more context: Add configuration files, API specs
Before/After example:
Before: "This is a web application built with JavaScript"
After: "Express.js REST API with MongoDB integration, featuring JWT authentication, rate limiting, and comprehensive error handling middleware"
Symptoms of generation timeouts:
  • Process hangs or takes extremely long
  • Browser timeout errors
  • Partial results only
Solutions:
  1. Break into smaller chunks: Process subdirectories separately
  2. Use faster models: Gemini Flash series for speed
  3. Increase timeout limits: In API configuration
  4. Optimize repository: Remove large binary files, generated code
// Increase timeout in API calls
{
  "timeout": 600000,  // 10 minutes
  "chunk_size": "small",
  "skip_large_files": true
}
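For the "break into smaller chunks" suggestion, one simple approach is to group files by top-level directory and process each group on its own. This is a sketch under that assumption; the function name is hypothetical, and how each group is then submitted depends on your setup.

```javascript
// Group repository paths by their top-level directory.
// Files in the repository root are collected under ".".
function groupByTopLevelDir(paths) {
  const groups = new Map();
  for (const path of paths) {
    const slash = path.indexOf("/");
    const key = slash === -1 ? "." : path.slice(0, slash);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(path);
  }
  return groups;
}

const groups = groupByTopLevelDir([
  "src/server.js",
  "src/routes/users.js",
  "docs/setup.md",
  "README.md",
]);
for (const [dir, files] of groups) {
  console.log(`${dir}: ${files.length} file(s)`);
}
```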

Advanced Features

Multi-Language Support

DeepWiki automatically detects and optimizes for repository languages:
Automatic detection of:
  • Primary language (most files)
  • Secondary languages
  • Framework combinations
  • Build system integration
Smart documentation:
  • Language-specific setup instructions
  • Cross-language integration points
  • Build pipeline explanation
  • Dependency management per language
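The "primary language (most files)" rule can be illustrated by counting file extensions. The sketch below is a simplified stand-in, not DeepWiki's detection logic: the extension table is a small hypothetical sample, and real detection would also need to consider frameworks and build files.

```javascript
// Small illustrative extension table; a real one would be much larger.
const EXTENSION_LANGUAGES = {
  ".js": "JavaScript",
  ".jsx": "JavaScript",
  ".ts": "TypeScript",
  ".tsx": "TypeScript",
  ".py": "Python",
  ".go": "Go",
  ".rs": "Rust",
  ".java": "Java",
};

// Count files per language; the language with the most files is primary.
function detectLanguages(paths) {
  const counts = {};
  for (const p of paths) {
    const dot = p.lastIndexOf(".");
    if (dot === -1) continue; // no extension, e.g. "Makefile"
    const lang = EXTENSION_LANGUAGES[p.slice(dot).toLowerCase()];
    if (lang) counts[lang] = (counts[lang] || 0) + 1;
  }
  const ranked = Object.entries(counts)
    .sort((a, b) => b[1] - a[1])
    .map(([lang]) => lang);
  return { primary: ranked[0] ?? null, secondary: ranked.slice(1) };
}

console.log(detectLanguages(["api/main.py", "api/rag.py", "src/app/page.tsx", "README.md"]));
// { primary: 'Python', secondary: [ 'TypeScript' ] }
```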

Integration Workflows

1. CI/CD Integration

Automate wiki generation in your development pipeline:
# GitHub Actions example
name: Generate Documentation
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
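      - uses: actions/checkout@v4
      - name: Generate Wiki
        run: |
          # --fail makes the step fail on HTTP errors instead of passing silently
          curl --fail --silent --show-error -X POST "${{ secrets.DEEPWIKI_URL }}/wiki/generate" \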
            -H "Content-Type: application/json" \
            -d '{
              "repo_url": "${{ github.server_url }}/${{ github.repository }}",
              "model_provider": "google",
              "force_regenerate": true
            }'
2. Webhook Integration

Automatically update documentation on repository changes:
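// Webhook handler example (Express). generateWiki is a placeholder that you
// must implement, for example as a call to your DeepWiki API.
const express = require('express');
const app = express();
app.use(express.json()); // parse JSON webhook payloads into req.body

app.post('/webhook/repository', (req, res) => {
  const { repository, commits = [] } = req.body;

  // Check whether any commit added or modified source, README, or dependency files
  const isSignificant = (file) =>
    file.includes('src/') ||
    file.includes('README') ||
    file.includes('package.json');
  const significantFiles = commits.some(commit =>
    [...(commit.added || []), ...(commit.modified || [])].some(isSignificant)
  );

  if (significantFiles) {
    // Trigger documentation regeneration
    generateWiki({
      repo_url: repository.html_url,
      force_regenerate: true
    });
  }

  // Acknowledge the delivery so the sender does not time out and retry
  res.sendStatus(202);
});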

Next Steps