Skip to content

WaddleAI Integration Guide for Claude Code

Overview

WaddleAI is an AI proxy and management system that provides OpenAI-compatible APIs with advanced routing, security, and token management. This guide shows how to integrate WaddleAI with your applications and use it through Claude Code.

Quick Start for Applications

Using OpenAI-Compatible API

WaddleAI provides a fully compatible OpenAI API that can be used as a drop-in replacement:

import openai

# Configure client to use WaddleAI proxy
client = openai.OpenAI(
    api_key="wa-your-api-key-here",
    base_url="https://your-waddleai-proxy.com/v1"
)

# Use exactly like OpenAI API
response = client.chat.completions.create(
    model="gpt-4",  # Will be routed by WaddleAI
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)

Using with Different Languages

Node.js

import OpenAI from 'openai';

const openai = new OpenAI({
    apiKey: 'wa-your-api-key-here',
    baseURL: 'https://your-waddleai-proxy.com/v1'
});

const completion = await openai.chat.completions.create({
    messages: [{ role: 'user', content: 'Hello!' }],
    model: 'gpt-4',
});

console.log(completion.choices[0].message.content);

cURL

curl https://your-waddleai-proxy.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wa-your-api-key-here" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Management API for Administration

The management server provides comprehensive APIs for system administration:

Authentication

import requests

# Login to get JWT token
auth_response = requests.post(
    "https://your-waddleai-mgmt.com/auth/login",
    json={
        "username": "admin",
        "password": "your-password"
    }
)
token = auth_response.json()["access_token"]

headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

Token Usage Analytics

# Check WaddleAI token usage
usage = requests.get(
    "https://your-waddleai-mgmt.com/analytics/tokens/waddleai",
    headers=headers
).json()

print(f"Total WaddleAI tokens used: {usage['total_waddleai_tokens']}")
print(f"LLM breakdown: {usage['llm_breakdown']}")

# Check specific user usage
user_usage = requests.get(
    "https://your-waddleai-mgmt.com/analytics/tokens/123",
    headers=headers
).json()

Quota Management

# Update user quotas
quota_update = requests.post(
    "https://your-waddleai-mgmt.com/analytics/quotas/user123",
    headers=headers,
    json={
        "monthly_limit": 200000,
        "daily_limit": 20000
    }
)

# Check quota utilization
quotas = requests.get(
    "https://your-waddleai-mgmt.com/analytics/quotas",
    headers=headers
).json()

API Key Management

# Create new API key
new_key = requests.post(
    "https://your-waddleai-mgmt.com/api-keys",
    headers=headers,
    json={
        "name": "Production API Key",
        "expires_days": 90,
        "permissions": {
            "models": ["gpt-4", "claude-3-opus"],
            "rate_limit": 100
        }
    }
).json()

print(f"New API key: {new_key['api_key']}")

# List API keys
keys = requests.get(
    "https://your-waddleai-mgmt.com/api-keys",
    headers=headers
).json()

Organization Management

# Create new organization
org = requests.post(
    "https://your-waddleai-mgmt.com/orgs",
    headers=headers,
    json={
        "name": "Acme Corp",
        "description": "Corporate AI usage",
        "token_quota_monthly": 1000000,
        "token_quota_daily": 100000
    }
).json()

# List organizations (scope depends on role)
orgs = requests.get(
    "https://your-waddleai-mgmt.com/orgs",
    headers=headers
).json()

Role-Based API Usage

WaddleAI implements comprehensive role-based access control:

Admin

  • Full system access via management API
  • All endpoints and functionality available
  • Cross-organization visibility and control
# Admin can access all analytics
system_stats = requests.get(
    "https://your-waddleai-mgmt.com/analytics/system",
    headers=admin_headers
).json()

# Configure security policies
security_config = requests.post(
    "https://your-waddleai-mgmt.com/config/security",
    headers=admin_headers,
    json={
        "policy": "strict",
        "max_prompt_length": 10000,
        "block_injection": True
    }
)

Resource Manager

  • Organization-scoped quota management
  • User management within assigned organizations
  • Token limit control for assigned organizations
# Resource managers see only assigned organizations
my_orgs = requests.get(
    "https://your-waddleai-mgmt.com/orgs",
    headers=resource_mgr_headers
).json()

# Update quotas for assigned organization users
quota_update = requests.post(
    "https://your-waddleai-mgmt.com/analytics/quotas/user456",
    headers=resource_mgr_headers,
    json={"monthly_limit": 150000}
)

Reporter

  • Read-only analytics and reporting for assigned organizations
  • Usage trend analysis and reporting
  • Security incident reporting
# Reporters can generate detailed usage reports
report = requests.get(
    "https://your-waddleai-mgmt.com/analytics/orgs/123",
    headers=reporter_headers,
    params={
        "period": "monthly",
        "include_users": True,
        "format": "detailed"
    }
).json()

# Security threat analytics
security_report = requests.get(
    "https://your-waddleai-mgmt.com/analytics/security",
    headers=reporter_headers
).json()

User

  • OpenAI-compatible API access only
  • Personal API key management
  • Own usage statistics
# Users can check their own usage
my_usage = requests.get(
    "https://your-waddleai-proxy.com/api/usage",
    headers={"Authorization": f"Bearer {user_api_key}"}
).json()

# Check remaining quota
quota = requests.get(
    "https://your-waddleai-proxy.com/api/quota",
    headers={"Authorization": f"Bearer {user_api_key}"}
).json()

print(f"Daily remaining: {quota['daily']['remaining']}")
print(f"Monthly remaining: {quota['monthly']['remaining']}")

Dual Token System

WaddleAI uses a sophisticated dual token system for accurate billing and analytics:

WaddleAI Tokens

  • Normalized billing units across all LLM providers
  • Used for quota enforcement and cost calculation
  • Consistent pricing regardless of underlying LLM

LLM Tokens

  • Raw provider token counts (input/output)
  • Used for detailed analytics and optimization
  • Provider-specific insights and debugging
# Usage response includes both token types
{
    "usage": {
        "prompt_tokens": 100,      # Raw LLM input tokens
        "completion_tokens": 50,   # Raw LLM output tokens
        "total_tokens": 150,       # Total LLM tokens
        "waddleai_tokens": 15      # Normalized WaddleAI tokens
    }
}

# Detailed analytics show breakdown
{
    "total_waddleai_tokens": 1500,
    "llm_breakdown": {
        "openai_gpt4": {"input": 8000, "output": 4000},
        "anthropic_claude": {"input": 3000, "output": 1500},
        "ollama_llama2": {"input": 12000, "output": 8000}
    }
}

Advanced Features

Model Routing

# WaddleAI automatically routes based on your configuration
response = client.chat.completions.create(
    model="smart-router",  # Uses routing LLM to select best model
    messages=[{"role": "user", "content": "Complex reasoning task..."}]
)

# Force specific provider
response = client.chat.completions.create(
    model="ollama:llama2",  # Route to specific Ollama model
    messages=[{"role": "user", "content": "Local processing needed"}]
)

Memory Integration

# Enable conversation memory
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Remember my preferences"}],
    extra_headers={
        "X-WaddleAI-Memory": "user-session-123",
        "X-WaddleAI-Memory-Type": "conversation"
    }
)

Security Features

WaddleAI automatically scans all prompts for security threats:

  • Prompt injection detection
  • Jailbreak attempt prevention
  • Data extraction blocking
  • Credential harvesting protection

Security events are logged and can be monitored:

# Check recent security alerts (admin/reporter only)
alerts = requests.get(
    "https://your-waddleai-proxy.com/api/security/threats",
    headers=headers
).json()

Health Monitoring

Proxy Server Health

# Kubernetes-style health check
curl https://your-waddleai-proxy.com/healthz

# Detailed status
curl https://your-waddleai-proxy.com/api/status

Prometheus Metrics

# Get all metrics for monitoring
curl https://your-waddleai-proxy.com/metrics
curl https://your-waddleai-mgmt.com/metrics

Configuration Examples

Environment Variables

# Proxy Server
export PROXY_HOST=0.0.0.0
export PROXY_PORT=8000
export DATABASE_URL=postgresql://user:pass@localhost/waddleai
export JWT_SECRET=your-jwt-secret
export SECURITY_POLICY=balanced
export MAX_CONCURRENT_REQUESTS=100

# Management Server  
export MGMT_HOST=0.0.0.0
export MGMT_PORT=8001
export ADMIN_PASSWORD=secure-admin-password

Docker Compose

version: '3.8'
services:
  waddleai-proxy:
    build: ./proxy
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/waddleai
      - MANAGEMENT_SERVER_URL=http://waddleai-mgmt:8001
    depends_on:
      - db
      - waddleai-mgmt

  waddleai-mgmt:
    build: ./management
    ports:
      - "8001:8001"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/waddleai
    depends_on:
      - db

  db:
    image: postgres:15
    environment:
      - POSTGRES_DB=waddleai
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Error Handling

Common Error Codes

  • 401 - Invalid or expired API key/token
  • 403 - Insufficient permissions for operation
  • 429 - Rate limit or quota exceeded
  • 400 - Blocked by security scanning
  • 503 - Service temporarily unavailable

Example Error Response

{
    "error": {
        "type": "quota_exceeded",
        "message": "Daily token quota exceeded",
        "details": {
            "daily_used": 10000,
            "daily_limit": 10000,
            "monthly_used": 150000,
            "monthly_limit": 200000
        }
    }
}

Best Practices

API Key Security

  • Use environment variables for API keys
  • Rotate keys regularly
  • Use minimal required permissions
  • Monitor usage patterns

Performance Optimization

  • Implement connection pooling
  • Cache frequently used data
  • Monitor response times
  • Use appropriate models for tasks

Cost Management

  • Monitor WaddleAI token consumption
  • Set appropriate quotas
  • Use cheaper models when possible
  • Implement usage alerts

Support

For additional help: - Check the full documentation at /docs/ - Review troubleshooting guides - Monitor health endpoints - Check security logs for issues


This guide covers the core integration patterns for WaddleAI. For complete API documentation, see the full documentation site.