AI Guardian

AI Guardian is the core security component of the Local AI Cyber Lab. It protects AI model interactions and provides access control and security monitoring.

Architecture Overview

graph TB
    subgraph Security_Layer["Security Layer"]
        validator["Prompt Validator"]
        access["Access Control"]
        monitor["Security Monitor"]
        policy["Policy Engine"]
    end

    subgraph Protection["Protection Systems"]
        injection["Injection Prevention"]
        ratelimit["Rate Limiting"]
        firewall["AI Firewall"]
        audit["Audit System"]
    end

    subgraph Integration["Service Integration"]
        models["Model Services"]
        analytics["Security Analytics"]
        alerts["Alert System"]
        logging["Audit Logging"]
    end

    Security_Layer --> Protection
    Protection --> Integration

    classDef critical fill:#f66,stroke:#333,stroke-width:2px
    classDef important fill:#ff9,stroke:#333,stroke-width:2px
    class validator,firewall critical
    class access,monitor important

Security Workflow

sequenceDiagram
    participant Client
    participant Guardian as AI Guardian
    participant Validator as Prompt Validator
    participant Policy as Policy Engine
    participant Model as AI Model
    participant Audit as Audit System

    Client->>Guardian: Request with Prompt
    Guardian->>Validator: Validate Prompt
    Validator->>Policy: Check Security Rules

    alt Invalid Request
        Policy-->>Guardian: Violation Detected
        Guardian-->>Client: Request Denied
        Guardian->>Audit: Log Security Event
    else Valid Request
        Policy-->>Guardian: Request Approved
        Guardian->>Model: Forward Request
        Model-->>Guardian: Model Response
        Guardian->>Validator: Validate Response
        Guardian-->>Client: Return Response
        Guardian->>Audit: Log Interaction
    end
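
The secure_model_request helper shown under Model Integration below implements this flow in code.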

Installation

AI Guardian is deployed automatically with the Local AI Cyber Lab. To update and restart the service manually:

# Update AI Guardian
docker-compose pull ai-guardian

# Start the service
docker-compose up -d ai-guardian

Configuration

Environment Variables

# .env file
AI_GUARDIAN_PORT=8000
AI_GUARDIAN_API_KEY=your-secure-key
AI_GUARDIAN_LOG_LEVEL=INFO
AI_GUARDIAN_POLICIES_PATH=/etc/ai-guardian/policies
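
Later snippets on this page reference a settings object. A minimal sketch of how it might be populated from these variables (the Settings class here is illustrative, not part of the shipped service):

import os

class Settings:
    """Loads AI Guardian configuration from the environment (.env above)."""

    def __init__(self) -> None:
        self.AI_GUARDIAN_PORT = int(os.environ.get("AI_GUARDIAN_PORT", "8000"))
        self.AI_GUARDIAN_API_KEY = os.environ["AI_GUARDIAN_API_KEY"]  # required
        self.AI_GUARDIAN_LOG_LEVEL = os.environ.get("AI_GUARDIAN_LOG_LEVEL", "INFO")
        self.AI_GUARDIAN_POLICIES_PATH = os.environ.get(
            "AI_GUARDIAN_POLICIES_PATH", "/etc/ai-guardian/policies")

settings = Settings()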

Security Policies

# policies/prompt-validation.yml
rules:
  - name: injection_prevention
    type: regex
    patterns:
      - '(?i)system\s*\('
      - '(?i)eval\s*\('
    action: block

  - name: sensitive_data
    type: pattern
    patterns:
      - '\b\d{16}\b'  # Credit card numbers
      - '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'  # Email addresses
    action: mask
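
A file like this can be applied with only a little code. The Policy class and loader below are a hedged sketch of what the policy engine might do internally (not the service's actual implementation); filter_content further down assumes this shape:

import re
import yaml  # PyYAML
from dataclasses import dataclass
from typing import List

@dataclass
class Policy:
    name: str
    patterns: List[str]
    action: str  # "block" or "mask"

    def matches(self, text: str) -> bool:
        # A rule matches when any of its patterns is found in the text
        return any(re.search(p, text) for p in self.patterns)

def load_policies(path: str) -> List[Policy]:
    # Parse the YAML rules shown above into Policy objects
    with open(path) as f:
        rules = yaml.safe_load(f)["rules"]
    return [Policy(r["name"], r["patterns"], r["action"]) for r in rules]

def apply_policies(text: str, policies: List[Policy]) -> str:
    for policy in policies:
        if not policy.matches(text):
            continue
        if policy.action == "block":
            raise ValueError(f"Blocked by policy: {policy.name}")
        if policy.action == "mask":
            # Redact every span matched by the rule's patterns
            for pattern in policy.patterns:
                text = re.sub(pattern, "[REDACTED]", text)
    return text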

Security Features

Prompt Validation

  1. Input Sanitization:

    import re

    def sanitize_prompt(prompt: str) -> str:
        # Remove potential command-injection patterns (case-insensitive)
        sanitized = re.sub(r'(?i)system\s*\(.*?\)', '', prompt)
        # Remove potential SQL-injection patterns
        sanitized = re.sub(r'(?i)(SELECT|INSERT|UPDATE|DELETE|DROP).*?;', '', sanitized)
        return sanitized
    

  2. Content Filtering:

    from typing import List, Tuple

    def filter_content(text: str, policies: List[Policy]) -> Tuple[bool, str]:
        # Policy is sketched under "Security Policies" above
        for policy in policies:
            if policy.matches(text):
                return False, f"Content violates policy: {policy.name}"
        return True, text
    

Access Control

  1. API Authentication (wired into an endpoint in the sketch after this list):

    import hmac

    def verify_api_key(api_key: str) -> bool:
        # Constant-time comparison prevents timing attacks
        return hmac.compare_digest(api_key, settings.AI_GUARDIAN_API_KEY)
    

  2. Rate Limiting:

    from fastapi import FastAPI, Request
    from slowapi import Limiter, _rate_limit_exceeded_handler
    from slowapi.errors import RateLimitExceeded
    from slowapi.util import get_remote_address

    limiter = Limiter(key_func=get_remote_address)
    app = FastAPI()
    app.state.limiter = limiter
    # Return HTTP 429 when a client exceeds its quota
    app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

    @app.post("/api/v1/validate")
    @limiter.limit("100/minute")
    async def validate_prompt(request: Request, prompt: str):
        # slowapi requires the Request argument to identify the caller
        # Validation logic
        pass
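
Putting the two together, the API-key check can be attached to the rate-limited endpoint as a FastAPI dependency. A sketch (the Bearer scheme matches the curl examples under Troubleshooting):

from fastapi import Depends, Header, HTTPException

async def require_api_key(authorization: str = Header(...)) -> None:
    # Expects "Authorization: Bearer <key>" and compares in constant time
    scheme, _, key = authorization.partition(" ")
    if scheme.lower() != "bearer" or not verify_api_key(key):
        raise HTTPException(status_code=401, detail="Invalid API key")

# Attach it to the route above:
#   @app.post("/api/v1/validate", dependencies=[Depends(require_api_key)])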
    

Monitoring and Logging

Security Events

import logging
from datetime import datetime

logger = logging.getLogger("ai_guardian")
# Illustrative set of event types treated as high severity
HIGH_SEVERITY_EVENTS = {"prompt_injection", "auth_failure"}

def log_security_event(event_type: str, details: dict):
    logger.info(f"Security event: {event_type}", extra={
        "event_type": event_type,
        "details": details,
        "timestamp": datetime.utcnow().isoformat(),
        "severity": "high" if event_type in HIGH_SEVERITY_EVENTS else "medium"
    })

Metrics Collection

# prometheus/config/ai-guardian.yml
scrape_configs:
  - job_name: 'ai-guardian'
    static_configs:
      - targets: ['ai-guardian:8000']
    metrics_path: '/metrics'
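
On the application side, an easy way to satisfy this scrape config is the official prometheus_client library; a minimal sketch (the counter name is illustrative):

from fastapi import FastAPI
from prometheus_client import Counter, make_asgi_app

# Example counter: call BLOCKED_PROMPTS.inc() whenever a prompt is rejected
BLOCKED_PROMPTS = Counter(
    "ai_guardian_blocked_prompts_total",
    "Prompts rejected by security policy")

app = FastAPI()
# Serve Prometheus metrics at /metrics, matching metrics_path above
app.mount("/metrics", make_asgi_app())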

Integration

Model Integration

async def secure_model_request(prompt: str, model: str) -> dict:
    # Validate prompt
    valid, message = await validate_prompt(prompt)
    if not valid:
        raise SecurityException(message)

    # Make model request
    response = await call_model_api(prompt, model)

    # Validate response
    valid, message = await validate_response(response)
    if not valid:
        raise SecurityException(message)

    return response
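
A hedged example of exposing this helper through an endpoint (the route, request model, and default model name are illustrative; SecurityException is the exception raised above):

from fastapi import HTTPException
from pydantic import BaseModel

class GenerateRequest(BaseModel):
    prompt: str
    model: str = "llama2"  # illustrative default

@app.post("/api/v1/generate")
async def generate(req: GenerateRequest):
    try:
        return await secure_model_request(req.prompt, req.model)
    except SecurityException as exc:
        # Surface policy violations as a client error
        raise HTTPException(status_code=400, detail=str(exc))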

Alert Integration

async def send_security_alert(alert_type: str, details: dict):
    await notifications.send_alert({
        "type": alert_type,
        "details": details,
        "timestamp": datetime.utcnow().isoformat(),
        "severity": get_severity(alert_type)
    })

Performance Optimization

Caching

from functools import lru_cache

@lru_cache(maxsize=1000)
def check_prompt_security(prompt: str) -> bool:
    return validate_prompt_patterns(prompt)

Parallel Processing

import asyncio
from typing import List

async def parallel_security_checks(prompt: str) -> List[bool]:
    # Run the independent checks concurrently
    tasks = [
        check_injection(prompt),
        check_sensitive_data(prompt),
        check_policy_compliance(prompt)
    ]
    return await asyncio.gather(*tasks)
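
The verdicts can then be reduced to a single decision; the prompt passes only when every check does:

async def is_safe(prompt: str) -> bool:
    # Passes only if every concurrent check returned True
    return all(await parallel_security_checks(prompt))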

Troubleshooting

Common Issues

  1. Authentication Failures:

    # Check API key configuration
    docker-compose exec ai-guardian env | grep AI_GUARDIAN
    
    # Verify API key in requests
    curl -v -H "Authorization: Bearer ${AI_GUARDIAN_API_KEY}" \
      http://localhost:8000/health
    

  2. Policy Issues:

    # Check policy configuration
    docker-compose exec ai-guardian cat /etc/ai-guardian/policies/default.yml
    
    # Test policy validation
    curl -X POST http://localhost:8000/api/v1/validate \
      -H "Authorization: Bearer ${AI_GUARDIAN_API_KEY}" \
      -H "Content-Type: application/json" \
      -d '{"prompt": "test prompt"}'
    

Additional Resources

  1. Security Policies
  2. Integration Guide
  3. Monitoring Guide
  4. Policy Development

Best Practices

  1. Regular Updates:

     - Keep policies updated
     - Monitor security events
     - Review access patterns
     - Update security rules

  2. Monitoring:

     - Set up alerts
     - Review logs regularly
     - Track metrics
     - Monitor performance

  3. Integration:

     - Use secure connections
     - Implement rate limiting
     - Validate all inputs
     - Log all events