AI Guardian¶
AI Guardian is the core security component of the Local AI Cyber Lab. It protects AI model interactions and provides access control, policy enforcement, and security monitoring.
Architecture Overview¶
graph TB
    subgraph Security_Layer["Security Layer"]
        validator["Prompt Validator"]
        access["Access Control"]
        monitor["Security Monitor"]
        policy["Policy Engine"]
    end
    subgraph Protection["Protection Systems"]
        injection["Injection Prevention"]
        ratelimit["Rate Limiting"]
        firewall["AI Firewall"]
        audit["Audit System"]
    end
    subgraph Integration["Service Integration"]
        models["Model Services"]
        analytics["Security Analytics"]
        alerts["Alert System"]
        logging["Audit Logging"]
    end
    Security_Layer --> Protection
    Protection --> Integration

    classDef critical fill:#f66,stroke:#333,stroke-width:2px
    classDef important fill:#ff9,stroke:#333,stroke-width:2px
    class validator,firewall critical
    class access,monitor important
Security Workflow¶
sequenceDiagram
    participant Client
    participant Guardian as AI Guardian
    participant Validator as Prompt Validator
    participant Policy as Policy Engine
    participant Model as AI Model
    participant Audit as Audit System

    Client->>Guardian: Request with Prompt
    Guardian->>Validator: Validate Prompt
    Validator->>Policy: Check Security Rules
    alt Invalid Request
        Policy-->>Guardian: Violation Detected
        Guardian-->>Client: Request Denied
        Guardian->>Audit: Log Security Event
    else Valid Request
        Policy-->>Guardian: Request Approved
        Guardian->>Model: Forward Request
        Model-->>Guardian: Model Response
        Guardian->>Validator: Validate Response
        Guardian-->>Client: Return Response
        Guardian->>Audit: Log Interaction
    end
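In code, the same flow looks roughly like the sketch below. It is illustrative only: validate_prompt and validate_response mirror the helpers used later on this page, log_security_event is the logging helper shown under Monitoring and Logging, and forward_to_model plus the returned status fields are placeholders for the guardian's internal interfaces.

# Illustrative sketch of the guardian-side flow from the diagram above:
# validate the prompt, enforce policy, forward or deny, and audit the outcome.
async def handle_request(prompt: str, model: str) -> dict:
    valid, message = await validate_prompt(prompt)                     # Prompt Validator + Policy Engine
    if not valid:
        log_security_event("policy_violation", {"reason": message})    # audit the denial
        return {"status": "denied", "reason": message}

    response = await forward_to_model(prompt, model)                   # hypothetical model-service call

    valid, message = await validate_response(response)                 # responses are validated too
    if not valid:
        log_security_event("response_blocked", {"reason": message})
        return {"status": "denied", "reason": message}

    log_security_event("interaction", {"model": model})                # audit the successful interaction
    return {"status": "ok", "response": response}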
Installation¶
AI Guardian is deployed automatically with the Local AI Cyber Lab. To update or start the service manually:
# Update AI Guardian
docker-compose pull ai-guardian
# Start the service
docker-compose up -d ai-guardian
Configuration¶
Environment Variables¶
# .env file
AI_GUARDIAN_PORT=8000
AI_GUARDIAN_API_KEY=your-secure-key
AI_GUARDIAN_LOG_LEVEL=INFO
AI_GUARDIAN_POLICIES_PATH=/etc/ai-guardian/policies
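A service reading this configuration would typically pull the values from the environment at startup. The snippet below is only a sketch of that pattern; the GuardianConfig name and its defaults are illustrative, not part of the shipped code.

import os
from dataclasses import dataclass

# Sketch only: load the settings from the environment variables listed above.
@dataclass
class GuardianConfig:
    port: int = int(os.getenv("AI_GUARDIAN_PORT", "8000"))
    api_key: str = os.getenv("AI_GUARDIAN_API_KEY", "")
    log_level: str = os.getenv("AI_GUARDIAN_LOG_LEVEL", "INFO")
    policies_path: str = os.getenv("AI_GUARDIAN_POLICIES_PATH", "/etc/ai-guardian/policies")

config = GuardianConfig()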
Security Policies¶
# policies/prompt-validation.yml
rules:
  - name: injection_prevention
    type: regex
    patterns:
      - '(?i)system\s*\('
      - '(?i)eval\s*\('
    action: block
  - name: sensitive_data
    type: pattern
    patterns:
      - '\b\d{16}\b'                                          # Credit card numbers
      - '\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'  # Email addresses
    action: mask
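To make the block/mask semantics concrete, here is a minimal sketch of how rules in this shape could be evaluated. It assumes PyYAML and is an illustration, not the policy engine's actual implementation.

import re
import yaml  # PyYAML

def apply_rules(prompt: str, policy_file: str) -> str:
    # Illustrative evaluation: 'block' rejects the prompt, 'mask' redacts matches.
    with open(policy_file) as f:
        rules = yaml.safe_load(f)["rules"]
    for rule in rules:
        for pattern in rule["patterns"]:
            if rule["action"] == "block" and re.search(pattern, prompt):
                raise ValueError(f"Prompt blocked by rule: {rule['name']}")
            if rule["action"] == "mask":
                prompt = re.sub(pattern, "[REDACTED]", prompt)
    return prompt

For example, apply_rules("contact me at user@example.com", "policies/prompt-validation.yml") would return the prompt with the email address replaced by [REDACTED], while a prompt containing eval( would be rejected outright.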
Security Features¶
Prompt Validation¶
- Input Sanitization: prompts are screened against blocking rules (such as the injection patterns above) before they are forwarded to a model.
- Content Filtering: content matching sensitive-data rules, such as credit card numbers and email addresses, is masked rather than passed through.
Access Control¶
- API Authentication: clients must present the API key configured in AI_GUARDIAN_API_KEY; requests without a valid key are rejected.
- Rate Limiting: per-client request rates are throttled to curb abuse (see the sketch after this list).
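As a rough illustration of these two controls together, the sketch below checks the key from AI_GUARDIAN_API_KEY and applies a fixed-window rate limit per client. The limit values and function names are placeholders, not the shipped implementation.

import os
import time
from collections import defaultdict

RATE_LIMIT = 60        # requests per window (placeholder value)
WINDOW_SECONDS = 60
_requests = defaultdict(list)

def authorize(client_id: str, api_key: str) -> bool:
    # Reject requests that fail the API-key check
    if api_key != os.getenv("AI_GUARDIAN_API_KEY"):
        return False
    # Fixed-window rate limiting per client
    now = time.time()
    recent = [t for t in _requests[client_id] if now - t < WINDOW_SECONDS]
    if len(recent) >= RATE_LIMIT:
        return False
    recent.append(now)
    _requests[client_id] = recent
    return True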
Monitoring and Logging¶
Security Events¶
import logging
from datetime import datetime

logger = logging.getLogger("ai_guardian")
# Example event types treated as high severity; adjust to the deployment.
HIGH_SEVERITY_EVENTS = {"injection_attempt", "authentication_failure"}

def log_security_event(event_type: str, details: dict):
    logger.info(f"Security event: {event_type}", extra={
        "event_type": event_type,
        "details": details,
        "timestamp": datetime.utcnow().isoformat(),
        "severity": "high" if event_type in HIGH_SEVERITY_EVENTS else "medium"
    })
Metrics Collection¶
# prometheus/config/ai-guardian.yml
scrape_configs:
  - job_name: 'ai-guardian'
    static_configs:
      - targets: ['ai-guardian:8000']
    metrics_path: '/metrics'
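On the service side, counters in this format can be exposed with the standard Python Prometheus client. The snippet below is a generic sketch of how the /metrics endpoint scraped above might be populated; the metric names are placeholders.

from prometheus_client import Counter, start_http_server

# Placeholder metric names; the real exporter defines its own.
requests_total = Counter("guardian_requests_total", "Requests processed")
blocked_prompts = Counter("guardian_blocked_prompts_total", "Prompts rejected by policy")

def record_request(blocked: bool):
    requests_total.inc()
    if blocked:
        blocked_prompts.inc()

# Serve /metrics on the port Prometheus scrapes (must match the config above).
start_http_server(8000)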
Integration¶
Model Integration¶
async def secure_model_request(prompt: str, model: str) -> dict:
    # Validate prompt
    valid, message = await validate_prompt(prompt)
    if not valid:
        raise SecurityException(message)

    # Make model request
    response = await call_model_api(prompt, model)

    # Validate response
    valid, message = await validate_response(response)
    if not valid:
        raise SecurityException(message)

    return response
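A caller might use it as shown below; the prompt and model name are placeholders, and SecurityException is whatever exception type the surrounding code defines.

import asyncio

async def main():
    try:
        # Placeholder prompt and model name
        result = await secure_model_request("Summarize today's alerts", model="llama3")
        print(result)
    except SecurityException as exc:
        # Denied requests surface here and should be logged as security events
        print(f"Request blocked: {exc}")

asyncio.run(main())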
Alert Integration¶
async def send_security_alert(alert_type: str, details: dict):
    await notifications.send_alert({
        "type": alert_type,
        "details": details,
        "timestamp": datetime.utcnow().isoformat(),
        "severity": get_severity(alert_type)
    })
Performance Optimization¶
Caching¶
from functools import lru_cache

@lru_cache(maxsize=1000)
def check_prompt_security(prompt: str) -> bool:
    return validate_prompt_patterns(prompt)
Parallel Processing¶
import asyncio
from typing import List

async def parallel_security_checks(prompt: str) -> List[bool]:
    # Run the individual checks concurrently and collect their results
    tasks = [
        check_injection(prompt),
        check_sensitive_data(prompt),
        check_policy_compliance(prompt)
    ]
    return await asyncio.gather(*tasks)
Troubleshooting¶
Common Issues¶
- Authentication Failures: confirm that the client is sending the key configured in AI_GUARDIAN_API_KEY and check the service logs for rejected requests.
- Policy Issues: verify that the files under AI_GUARDIAN_POLICIES_PATH are valid YAML and restart or reload ai-guardian after editing them.
Additional Resources¶
Best Practices¶
- Regular Updates:
- Keep policies updated
- Monitor security events
- Review access patterns
-
Update security rules
-
Monitoring:
- Set up alerts
- Review logs regularly
- Track metrics
-
Monitor performance
-
Integration:
- Use secure connections
- Implement rate limiting
- Validate all inputs
- Log all events