Security Guidelines

This document describes security practices for deploying and operating DeepWiki-Open in production. Following them helps protect API keys, access tokens, and repository data.

Overview

DeepWiki-Open processes source code repositories and requires access to various APIs and services. This guide covers API key and token management, authentication, data privacy, network security, private repository access, and vulnerability handling.

API Key and Token Management

Secure Storage

Use Environment Variables

Never hardcode API keys in your source code. Always use environment variables:

# .env file (never commit this to version control)
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
AZURE_OPENAI_API_KEY=your_azure_api_key

Add .env to your .gitignore file immediately after creating it.
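As a minimal sketch of reading these variables at startup, the snippet below fails fast when a required key is unset. The function name `load_api_keys` and the choice of which keys are mandatory are assumptions for illustration, not part of DeepWiki-Open.

```python
import os

# Hypothetical: which keys are mandatory depends on the providers you enable.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPENAI_API_KEY")


def load_api_keys(environ=os.environ, required=REQUIRED_KEYS):
    """Return the required keys from the environment, or raise if any is unset."""
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Report names only; never echo key values into logs or tracebacks.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

Passing the environment as a parameter keeps the check easy to test with a plain dictionary.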

Use Secret Management Systems

For production deployments, use dedicated secret management:

# AWS Secrets Manager
aws secretsmanager create-secret \
  --name deepwiki/api-keys \
  --secret-string file://.env

# Kubernetes Secrets
kubectl create secret generic deepwiki-secrets \
  --from-env-file=.env

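# HashiCorp Vault (@file input must be JSON, so pass the keys as key=value pairs)
vault kv put secret/deepwiki/api-keys \
  GOOGLE_API_KEY="$GOOGLE_API_KEY" \
  OPENAI_API_KEY="$OPENAI_API_KEY"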
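The AWS command above stores the raw contents of .env as a single secret string, so whatever reads the secret back has to parse it. The sketch below shows one way to do that, assuming simple KEY=value lines; `parse_env_secret` is a hypothetical helper, and fetching the secret itself (for example through an AWS SDK) is left out.

```python
def parse_env_secret(secret_text):
    """Parse KEY=value lines (the .env format above) into a dict."""
    values = {}
    for raw_line in secret_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line; ignore it
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

This handles only the basic format; multi-line values and shell-style escapes are out of scope for the sketch.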

Implement Key Rotation

Rotate API keys regularly and automate the rotation where possible:

# Example rotation check
import os
from datetime import datetime, timedelta

def rotate_api_keys(env_path='.env', max_age_days=90):
    # Approximate key age by the file's last-modified time
    # (accurate only if .env is rewritten whenever keys change)
    key_age = datetime.now() - datetime.fromtimestamp(
        os.path.getmtime(env_path)
    )

    if key_age > timedelta(days=max_age_days):
        # Trigger rotation process
        print("API keys need rotation")

Access Token Security

GitHub Tokens

# Minimum required permissions for GitHub tokens:
- repo (for private repositories)
- read:org (for organization repositories)
- read:user (for user information)

# Create fine-grained personal access tokens when possible
# Set expiration dates (maximum 90 days recommended)
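To check that a classic GitHub token carries no more than the scopes listed above, one option is to compare the `X-OAuth-Scopes` response header that the GitHub API returns against an allow-list. The sketch below works on the header value only; `excess_scopes` is a hypothetical helper, and it is an assumption that the token is a classic one, since fine-grained tokens use a different permission model.

```python
# Scopes taken from the list above.
ALLOWED_SCOPES = {"repo", "read:org", "read:user"}


def excess_scopes(scopes_header, allowed=ALLOWED_SCOPES):
    """Return scopes present in an X-OAuth-Scopes header value but not allowed."""
    granted = {s.strip() for s in scopes_header.split(",") if s.strip()}
    return sorted(granted - set(allowed))
```

A non-empty result suggests the token is broader than it needs to be and should be reissued with fewer scopes.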

Authentication and Authorization

Wiki Authentication

DeepWiki-Open supports optional authentication for the wiki interface:

# Enable authentication in environment variables
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=your_secure_auth_code

# The auth code should be:
# - At least 20 characters long
# - Randomly generated
# - Changed regularly

import secrets
import string

def generate_auth_code(length=32):
    # Letters and digits only: punctuation such as #, quotes, or $ can be
    # misread when the value is pasted into a .env file or a shell
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))

# Generate a secure code
auth_code = generate_auth_code()
print(f"DEEPWIKI_AUTH_CODE={auth_code}")
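When a submitted code is checked against the configured one, a constant-time comparison avoids leaking how many leading characters matched. The sketch below uses the standard library's `hmac.compare_digest`; `auth_code_matches` is a hypothetical helper, and how DeepWiki-Open itself performs the comparison is not shown in this guide.

```python
import hmac


def auth_code_matches(submitted, expected):
    """Compare two auth codes without leaking how many leading characters match."""
    if not expected:
        # An empty configured code must never authenticate anyone.
        return False
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```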

Role-Based Access Control (RBAC)

Implement RBAC for multi-user deployments:

# Example RBAC configuration
roles:
  admin:
    permissions:
      - wiki:create
      - wiki:read
      - wiki:update
      - wiki:delete
      - config:modify
  
  developer:
    permissions:
      - wiki:create
      - wiki:read
      - wiki:update
  
  viewer:
    permissions:
      - wiki:read
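A permission check against the configuration above could look like the following sketch. The role table mirrors the YAML example; `is_allowed` is a hypothetical helper, and loading the roles from a file is omitted to keep the example self-contained.

```python
# Mirrors the example RBAC configuration above.
ROLE_PERMISSIONS = {
    "admin": {"wiki:create", "wiki:read", "wiki:update", "wiki:delete", "config:modify"},
    "developer": {"wiki:create", "wiki:read", "wiki:update"},
    "viewer": {"wiki:read"},
}


def is_allowed(role, permission):
    """Return True only if the role exists and grants the permission (deny by default)."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Unknown roles fall through to an empty permission set, so a typo in a role name denies access instead of granting it.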

Data Privacy and Protection

Repository Data Handling

GDPR Compliance

For GDPR compliance, implement:

Data Minimization: Only process necessary files
Right to Erasure: Provide cache clearing endpoints
Data Portability: Export processed wiki data
Privacy by Design: Default to secure configurations

# Example GDPR compliance endpoints (sketch, assuming a FastAPI app)
from fastapi import FastAPI

app = FastAPI()

@app.delete("/api/user-data/{user_id}")
async def delete_user_data(user_id: str):
    """Implement right to erasure"""
    # Clear user's cached data
    # Remove from vector database
    # Delete processing history
    raise NotImplementedError

@app.get("/api/user-data/{user_id}/export")
async def export_user_data(user_id: str):
    """Implement data portability"""
    # Export all user-related data
    raise NotImplementedError
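Data minimization from the list above can be sketched as a filter that skips files unlikely to belong in a generated wiki. The patterns and the name `should_process` are assumptions chosen for illustration; adjust them to your own repositories.

```python
from fnmatch import fnmatch

# Hypothetical deny-list: files that commonly hold secrets and add nothing to a wiki.
EXCLUDED_PATTERNS = (".env", ".env.*", "*.pem", "*.key", "id_rsa*")


def should_process(path):
    """Return False for files whose name matches an excluded pattern."""
    filename = path.replace("\\", "/").rsplit("/", 1)[-1]
    return not any(fnmatch(filename, pattern) for pattern in EXCLUDED_PATTERNS)
```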

Network Security

Firewall Configuration

# Allow only necessary ports
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Frontend
sudo ufw allow 3000/tcp

# Backend API
sudo ufw allow 8001/tcp

# SSH (if needed)
sudo ufw allow 22/tcp

# Enable firewall
sudo ufw enable

Reverse Proxy Security

Use a reverse proxy for additional security:

# Nginx security headers
# limit_req_zone is only valid in the http context, so define it outside server {}
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline';" always;

    # Rate limiting (uses the zone defined above)
    limit_req zone=api burst=20 nodelay;

    # Proxy settings
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
    }
}
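If you also want a limit inside the application as a second layer, a token bucket is a common choice. The sketch below is an application-level analogue of the rate and burst values above, not a reproduction of Nginx's own algorithm; the class name and the injectable `clock` parameter are assumptions made for illustration, and per-client keying is omitted.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate=10.0, burst=20, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice you would keep one bucket per client address, which is the role `$binary_remote_addr` plays in the Nginx zone.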

Private Repository Security

Access Control

Private repository access requires careful security consideration.

Use Minimal Permissions

Create tokens with only necessary permissions:

Read-only access
Specific repository scope
Time-limited tokens

Implement Token Validation

import requests

def validate_repository_access(token: str, repo: str) -> bool:
    """Validate token has access to repository"""
    try:
        # Attempt to access repository
        response = requests.get(
            f"https://api.github.com/repos/{repo}",
            headers={"Authorization": f"token {token}"},
            timeout=10
        )
        return response.status_code == 200
    except requests.RequestException:
        # Network errors and timeouts count as "no access"
        return False
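Because the repository name is interpolated into the request URL, it is worth validating its shape before calling the API. The sketch below assumes GitHub-style `owner/name` slugs made of letters, digits, dots, underscores, and hyphens; `is_valid_repo_slug` is a hypothetical helper.

```python
import re

# Assumption: GitHub-style "owner/name" made of letters, digits, ".", "_" and "-".
REPO_SLUG = re.compile(r"[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+")


def is_valid_repo_slug(repo):
    """Reject values that could alter the request path, such as '../' or extra slashes."""
    if not REPO_SLUG.fullmatch(repo):
        return False
    owner, name = repo.split("/")
    # "." and ".." are path segments, not repository names.
    return owner not in (".", "..") and name not in (".", "..")
```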

Audit Access Logs

Maintain detailed logs of private repository access:

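import json
import logging
from datetime import datetime, timezone

def log_repository_access(user: str, repo: str, action: str, ip_address: str):
    # ip_address comes from the caller (for example, the web framework's
    # request object); datetime.utcnow() is deprecated, so use an aware UTC time
    logging.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "repository": repo,
        "action": action,
        "ip_address": ip_address
    }))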

Data Isolation

Ensure private repository data is isolated:

# Example data isolation
import hashlib
from pathlib import Path

class RepositoryCache:
    def get_cache_path(self, repo: str, user: str) -> Path:
        """Generate isolated cache path per user/repo"""
        # Hash both IDs so user-supplied text never becomes a path segment
        # (prevents directory traversal)
        user_hash = hashlib.sha256(user.encode()).hexdigest()[:12]
        repo_hash = hashlib.sha256(repo.encode()).hexdigest()[:12]

        return Path(f"./cache/{user_hash}/{repo_hash}/")
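As an extra guard before reading or deleting cached data, you can confirm that a path still resolves inside the cache root. This is a sketch only; `is_within_cache` is a hypothetical helper and the directory names in the usage are placeholders.

```python
from pathlib import Path


def is_within_cache(cache_root, candidate):
    """Return True if candidate resolves to cache_root or a path beneath it."""
    root = Path(cache_root).resolve()
    target = Path(candidate).resolve()
    return target == root or root in target.parents
```

Resolving both paths first collapses `..` segments and symlinks, so the comparison is made on the real locations.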

Vulnerability Management

Security Scanning

Dependencies

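# Python dependencies
pip install safety
safety check   # Safety 3.x replaces this command with `safety scan`

# JavaScript dependencies
npm audit
npm audit fix

# Docker images (`docker scan` has been retired; Docker Scout replaces it)
docker scout cves deepwiki-open:latest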

Vulnerability Reporting

Found a security vulnerability? Please report it responsibly.

Reporting Process

Do NOT create public GitHub issues for security vulnerabilities
Email security details to: security@deepwiki-open.org
Include:
- Vulnerability description
- Steps to reproduce
- Potential impact
- Suggested fixes (if any)

Response Timeline

24 hours: Initial acknowledgment
72 hours: Vulnerability assessment
7 days: Fix development and testing
14 days: Patch release and disclosure

Security Headers Checklist

Ensure all security headers are properly configured:

- Strict-Transport-Security: max-age=31536000; includeSubDomains
- X-Content-Type-Options: nosniff
- X-Frame-Options: SAMEORIGIN
- X-XSS-Protection: 1; mode=block
- Content-Security-Policy: default-src 'self'
- Referrer-Policy: strict-origin-when-cross-origin
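The checklist above can be verified against a live response. The sketch below takes a mapping of response headers (however you obtain it) and reports which checklist entries are absent; `missing_security_headers` is a hypothetical helper and it checks presence only, not the header values.

```python
# Header names taken from the checklist above.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "X-XSS-Protection",
    "Content-Security-Policy",
    "Referrer-Policy",
)


def missing_security_headers(response_headers):
    """Return checklist headers absent from a response (names compared case-insensitively)."""
    present = {name.lower() for name in response_headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```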

Security Updates and Maintenance

Update Schedule

Automated Security Updates

# GitHub Actions for automated updates
name: Security Updates
on:
  schedule:
    - cron: '0 0 * * 1'  # Weekly on Monday

jobs:
  update-dependencies:
    runs-on: ubuntu-latest
    steps:
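      - uses: actions/checkout@v4

      - name: Update Python dependencies
        run: |
          pip install pip-audit
          # pip-audit checks the active environment; install the project's
          # dependencies first, or pass -r with the project's requirements file
          pip-audit --fix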
      
      - name: Update Node dependencies
        run: |
          npm audit fix
          npm update
      
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          title: Automated Security Updates
          body: Automated dependency updates
          branch: automated-security-updates

Security Checklist

Before deploying to production, ensure:

- All API keys are stored in environment variables or secret management systems
- HTTPS is enabled for all endpoints
- Authentication is configured for sensitive operations
- Rate limiting is implemented
- Security headers are properly configured
- Logging and monitoring are enabled
- Regular backups are configured
- Incident response plan is documented
- Security scanning is automated
- Access controls are properly configured
