Security Guidelines

This document provides comprehensive security guidelines for deploying and operating DeepWiki-Open in production environments. Follow these best practices to ensure your deployment is secure and protects sensitive data.

Overview

DeepWiki-Open processes source code repositories and requires access to various APIs and services. This guide covers API key and token management, authentication, data privacy, network security, private repository access, and vulnerability handling.

API Key and Token Management

Secure Storage

1. Use Environment Variables

Never hardcode API keys in your source code. Always use environment variables:
# .env file (never commit this to version control)
GOOGLE_API_KEY=your_google_api_key
OPENAI_API_KEY=your_openai_api_key
OPENROUTER_API_KEY=your_openrouter_api_key
AZURE_OPENAI_API_KEY=your_azure_api_key
Add .env to your .gitignore file immediately after creating it.
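On the application side, a fail-fast check at startup makes a missing key obvious before any request is served. The following is a minimal sketch, not DeepWiki-Open's actual loader: the function name `load_api_keys` and the choice of which keys are required are assumptions for illustration.

```python
import os

# Variable names come from the .env example; which ones are mandatory
# is a hypothetical choice for this sketch.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "OPENAI_API_KEY")

def load_api_keys(required=REQUIRED_KEYS, environ=None):
    """Return the required keys from the environment, or raise if any is unset or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Report only the variable names, never the values.
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: env[name] for name in required}
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.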
2. Use Secret Management Systems

For production deployments, use dedicated secret management:
# AWS Secrets Manager
aws secretsmanager create-secret \
  --name deepwiki/api-keys \
  --secret-string file://.env

# Kubernetes Secrets
kubectl create secret generic deepwiki-secrets \
  --from-env-file=.env

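# HashiCorp Vault (the @file form expects JSON, so convert .env to JSON first;
# api-keys.json is a placeholder file name)
vault kv put secret/deepwiki/api-keys @api-keys.json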
3. Implement Key Rotation

Rotate API keys regularly and automate the age check:
# Example rotation check
import os
from datetime import datetime, timedelta

MAX_KEY_AGE = timedelta(days=90)

def rotate_api_keys(env_path='.env'):
    # Approximate key age by the last modification time of the .env file
    key_age = datetime.now() - datetime.fromtimestamp(
        os.path.getmtime(env_path)
    )

    if key_age > MAX_KEY_AGE:
        # Trigger your rotation process here
        print("API keys need rotation")
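File modification time is only a rough proxy for key age, because editing one key resets the clock for all of them. A slightly more precise variant tracks an issue date per key. This is a sketch under the assumption that you record those dates yourself; the function name and the `issued_at` mapping are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def keys_due_for_rotation(issued_at, max_age=timedelta(days=90), now=None):
    """Return the sorted names of keys whose issue date is older than max_age.

    issued_at maps a key name to a timezone-aware datetime of when it was issued.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, issued in issued_at.items() if now - issued > max_age
    )
```

The result can feed an alert or a scheduled job that calls the provider's own rotation mechanism.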

Access Token Security

# Minimum required permissions for GitHub tokens:
- repo (for private repositories)
- read:org (for organization repositories)
- read:user (for user information)

# Create fine-grained personal access tokens when possible
# Set expiration dates (maximum 90 days recommended)
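To confirm that a token carries no more than the scopes listed above, you can compare its granted scopes against an allow-list. The sketch below assumes a classic personal access token, for which GitHub returns the granted scopes in the `X-OAuth-Scopes` response header as a comma-separated string; the function name `excess_scopes` is hypothetical, and fine-grained tokens do not report scopes this way.

```python
# Allow-list taken from the minimum permissions listed above.
ALLOWED_SCOPES = frozenset({"repo", "read:org", "read:user"})

def excess_scopes(granted_header, allowed=ALLOWED_SCOPES):
    """Return the granted scopes that are not in the allow-list, sorted.

    granted_header is the raw value of the X-OAuth-Scopes header,
    for example "repo, read:org".
    """
    granted = {scope.strip() for scope in granted_header.split(",") if scope.strip()}
    return sorted(granted - set(allowed))
```

An empty result means the token stays within the allow-list; anything else is a reason to reissue it with narrower permissions.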

Authentication and Authorization

Wiki Authentication

DeepWiki-Open supports optional authentication for the wiki interface:
# Enable authentication in environment variables
DEEPWIKI_AUTH_MODE=true
DEEPWIKI_AUTH_CODE=your_secure_auth_code

# The auth code should be:
# - At least 20 characters long
# - Randomly generated
# - Changed regularly
# Generate a code with Python's secrets module
import secrets
import string

def generate_auth_code(length=32):
    # Letters and digits only: punctuation such as # or quotes can break .env parsing
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))

# Generate a secure code
auth_code = generate_auth_code()
print(f"DEEPWIKI_AUTH_CODE={auth_code}")
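When a submitted code is checked against the configured one, a constant-time comparison avoids leaking information through response timing. This is a minimal sketch of that check, not DeepWiki-Open's own verification logic; `auth_code_matches` is a hypothetical helper.

```python
import hmac

def auth_code_matches(submitted: str, expected: str) -> bool:
    """Compare two auth codes in constant time."""
    if not expected:
        # An unset or empty configured code must never authenticate anyone.
        return False
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```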

Role-Based Access Control (RBAC)

Implement RBAC for multi-user deployments:
# Example RBAC configuration
roles:
  admin:
    permissions:
      - wiki:create
      - wiki:read
      - wiki:update
      - wiki:delete
      - config:modify
  
  developer:
    permissions:
      - wiki:create
      - wiki:read
      - wiki:update
  
  viewer:
    permissions:
      - wiki:read
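A configuration like this still needs an enforcement point in the application. The sketch below mirrors the roles above in a Python mapping and shows a deny-by-default permission check; the `ROLES` structure and `is_allowed` function are assumptions for illustration, not an existing DeepWiki-Open API.

```python
# Same roles and permissions as the example configuration.
ROLES = {
    "admin": frozenset({
        "wiki:create", "wiki:read", "wiki:update", "wiki:delete", "config:modify",
    }),
    "developer": frozenset({"wiki:create", "wiki:read", "wiki:update"}),
    "viewer": frozenset({"wiki:read"}),
}

def is_allowed(role: str, permission: str, roles=ROLES) -> bool:
    """Return True only if the role exists and grants the permission."""
    # Unknown roles fall back to an empty permission set (deny by default).
    return permission in roles.get(role, frozenset())
```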

Data Privacy and Protection

Repository Data Handling

GDPR Compliance

For GDPR compliance, implement:
  1. Data Minimization: Only process necessary files
  2. Right to Erasure: Provide cache clearing endpoints
  3. Data Portability: Export processed wiki data
  4. Privacy by Design: Default to secure configurations
# Example GDPR compliance endpoints (assumes a FastAPI app; return values are placeholders)
from fastapi import FastAPI

app = FastAPI()

@app.delete("/api/user-data/{user_id}")
async def delete_user_data(user_id: str):
    """Implement right to erasure"""
    # Clear user's cached data
    # Remove from vector database
    # Delete processing history
    return {"status": "deleted", "user_id": user_id}

@app.get("/api/user-data/{user_id}/export")
async def export_user_data(user_id: str):
    """Implement data portability"""
    # Export all user-related data
    return {"user_id": user_id, "data": []}
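Data minimization can start with a filter that skips files likely to hold secrets before they are processed or embedded. The following is a rough sketch; the file names and suffixes in the deny-lists are illustrative assumptions, and `should_process` is a hypothetical helper rather than part of DeepWiki-Open.

```python
from pathlib import PurePosixPath

# Illustrative deny-lists; extend them for your own repositories.
SENSITIVE_NAMES = {".env", "id_rsa", "credentials.json"}
SENSITIVE_SUFFIXES = {".pem", ".key", ".p12"}

def should_process(path: str) -> bool:
    """Return False for files that commonly contain credentials."""
    p = PurePosixPath(path)
    if p.name in SENSITIVE_NAMES or p.name.startswith(".env."):
        return False
    return p.suffix.lower() not in SENSITIVE_SUFFIXES
```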

Network Security

Firewall Configuration

# Allow only necessary ports
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Frontend
sudo ufw allow 3000/tcp

# Backend API
sudo ufw allow 8001/tcp

# SSH (if needed)
sudo ufw allow 22/tcp

# Enable firewall
sudo ufw enable

Reverse Proxy Security

Use a reverse proxy for additional security:
# Nginx security headers

# Rate limiting zone (limit_req_zone is only valid in the http context)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline';" always;

    # Rate limiting
    limit_req zone=api burst=20 nodelay;

    # Proxy settings
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $host;
    }
}
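If you also want a limit inside the application, for example when the API is reachable without passing through the proxy, a token bucket gives a comparable policy to the 10 requests per second with a burst of 20 configured above. This is a sketch of the general technique, not Nginx's own algorithm and not an existing DeepWiki-Open component; the class name and parameters are assumptions.

```python
import time

class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate=10.0, burst=20, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available and report whether the request may proceed."""
        now = self.clock()
        # Refill according to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In practice you would keep one bucket per client address, which is what `$binary_remote_addr` achieves in the proxy configuration.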

Private Repository Security

Access Control

Private repository access requires careful security consideration.
1. Use Minimal Permissions

Create tokens with only necessary permissions:
  • Read-only access
  • Specific repository scope
  • Time-limited tokens
2. Implement Token Validation

import requests

def validate_repository_access(token: str, repo: str) -> bool:
    """Validate token has access to repository"""
    try:
        # Attempt to access repository metadata
        response = requests.get(
            f"https://api.github.com/repos/{repo}",
            headers={"Authorization": f"token {token}"},
            timeout=10,
        )
        return response.status_code == 200
    except requests.RequestException:
        # Network errors and timeouts count as "no access"
        return False
3. Audit Access Logs

Maintain detailed logs of private repository access:
import logging
from datetime import datetime, timezone

def log_repository_access(user: str, repo: str, action: str, ip_address: str):
    # ip_address should come from your web framework's request object
    logging.info({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "repository": repo,
        "action": action,
        "ip_address": ip_address
    })

Data Isolation

Ensure private repository data is isolated:
# Example data isolation
import hashlib
from pathlib import Path

class RepositoryCache:
    def get_cache_path(self, repo: str, user: str) -> Path:
        """Generate isolated cache path per user/repo"""
        # Hash both IDs so user-supplied strings never appear in the path
        # (prevents directory traversal)
        user_hash = hashlib.sha256(user.encode()).hexdigest()[:12]
        repo_hash = hashlib.sha256(repo.encode()).hexdigest()[:12]

        return Path(f"./cache/{user_hash}/{repo_hash}/")
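As a second line of defense, any path built from external input can be checked against the cache root after resolution. The sketch below is a generic containment check and an assumption about how you might wire it in; `is_within_cache` is a hypothetical helper.

```python
from pathlib import Path

def is_within_cache(candidate: str, cache_root: str = "./cache") -> bool:
    """Return True if candidate, resolved against cache_root, stays inside it."""
    root = Path(cache_root).resolve()
    # An absolute candidate replaces root in the join, so it fails the check below.
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
        return True
    except ValueError:
        # target lies outside the cache root (for example via "..").
        return False
```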

Vulnerability Management

Security Scanning

# Python dependencies
pip install safety
safety check

# JavaScript dependencies
npm audit
npm audit fix

# Docker images (docker scan has been retired; Docker Scout replaces it)
docker scout cves deepwiki-open:latest

Vulnerability Reporting

Found a security vulnerability? Please report it responsibly.

Reporting Process

  1. Do NOT create public GitHub issues for security vulnerabilities
  2. Email security details to: security@deepwiki-open.org
  3. Include:
    • Vulnerability description
    • Steps to reproduce
    • Potential impact
    • Suggested fixes (if any)

Response Timeline

  • 24 hours: Initial acknowledgment
  • 72 hours: Vulnerability assessment
  • 7 days: Fix development and testing
  • 14 days: Patch release and disclosure

Security Headers Checklist

Ensure all security headers are properly configured:
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Content-Security-Policy: default-src 'self'
Referrer-Policy: strict-origin-when-cross-origin
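The checklist can be turned into an automated check by comparing a response's headers against the required names. This sketch works on any mapping of header names to values, so it can be fed from whichever HTTP client you use; the function name `missing_security_headers` is hypothetical.

```python
# Header names from the checklist above.
REQUIRED_HEADERS = (
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "X-XSS-Protection",
    "Content-Security-Policy",
    "Referrer-Policy",
)

def missing_security_headers(headers) -> list:
    """Return the required header names absent from a response's headers."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```

This only verifies presence; checking the values (for example the `max-age` of `Strict-Transport-Security`) would need additional rules.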

Security Updates and Maintenance

Update Schedule

Automated Security Updates

# GitHub Actions for automated updates
name: Security Updates
on:
  schedule:
    - cron: '0 0 * * 1'  # Weekly on Monday

jobs:
  update-dependencies:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Update Python dependencies
        run: |
          pip install pip-audit
          pip-audit --fix
      
      - name: Update Node dependencies
        run: |
          npm audit fix
          npm update
      
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          title: Automated Security Updates
          body: Automated dependency updates
          branch: automated-security-updates

Security Checklist

Before deploying to production, ensure:
  • All API keys are stored in environment variables or secret management systems
  • HTTPS is enabled for all endpoints
  • Authentication is configured for sensitive operations
  • Rate limiting is implemented
  • Security headers are properly configured
  • Logging and monitoring are enabled
  • Regular backups are configured
  • Incident response plan is documented
  • Security scanning is automated
  • Access controls are properly configured

Additional Resources