OpenAI-Compatible API Reference

WaddleAI provides a fully OpenAI-compatible API that can be used as a drop-in replacement for OpenAI's API. Every request also passes through WaddleAI's additional features: security scanning, token management, and routing.

Base URL

https://your-waddleai-proxy.com/v1

Authentication

Use your WaddleAI API key in the Authorization header:

Authorization: Bearer wa-your-api-key-here
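
Because the API is OpenAI-compatible, existing OpenAI SDKs should work once they are pointed at the proxy. A minimal sketch using the official openai Python package; the WADDLEAI_API_KEY environment variable name is illustrative, use whatever your deployment defines:

import os
from openai import OpenAI

# Point the standard OpenAI client at the WaddleAI proxy.
# WADDLEAI_API_KEY is an example variable name, not required by WaddleAI.
client = OpenAI(
    base_url="https://your-waddleai-proxy.com/v1",
    api_key=os.environ["WADDLEAI_API_KEY"],  # e.g. "wa-your-api-key-here"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)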

Chat Completions

POST /v1/chat/completions

Create a chat completion. The request and response formats are identical to OpenAI's API, with additional WaddleAI fields and headers.

Request

curl https://your-waddleai-proxy.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wa-your-api-key" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 150
  }'

Request Body Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model to use (e.g., "gpt-4", "claude-3-opus", "llama2") |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2) |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling parameter |
| frequency_penalty | number | No | Frequency penalty (-2 to 2) |
| presence_penalty | number | No | Presence penalty (-2 to 2) |
| stop | string/array | No | Stop sequences |
| stream | boolean | No | Whether to stream responses |

WaddleAI-Specific Headers

| Header | Description |
|---|---|
| X-WaddleAI-Route | Force routing to specific provider (e.g., "openai", "anthropic") |
| X-WaddleAI-Memory | Enable conversation memory with session ID |
| X-WaddleAI-Security | Override security policy ("strict", "balanced", "permissive") |
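
These headers can be attached per request. A sketch using the openai Python client configured in the Authentication section; the header values shown are illustrative:

# Route this request to Anthropic, enable memory, and relax the security policy.
# The session ID is an arbitrary example value.
response = client.chat.completions.create(
    model="claude-3-opus",
    messages=[{"role": "user", "content": "Hello"}],
    extra_headers={
        "X-WaddleAI-Route": "anthropic",
        "X-WaddleAI-Memory": "session-1234",
        "X-WaddleAI-Security": "balanced",
    },
)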

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 19,
    "total_tokens": 31,
    "waddleai_tokens": 8
  },
  "waddleai": {
    "provider": "openai",
    "model_used": "gpt-4",
    "security_passed": true,
    "routing_rule": "default",
    "cost_waddleai": 8,
    "cost_usd": 0.008
  }
}
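
The usage.waddleai_tokens and waddleai fields are WaddleAI extensions to the standard response. If you call the endpoint directly (for example with the requests library), they can be read straight from the JSON body; a sketch, again assuming a WADDLEAI_API_KEY environment variable:

import os
import requests

# Call the endpoint directly to inspect the WaddleAI extension fields.
resp = requests.post(
    "https://your-waddleai-proxy.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['WADDLEAI_API_KEY']}"},
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]},
)
data = resp.json()
meta = data["waddleai"]
print(f"provider={meta['provider']}, cost={meta['cost_waddleai']} WaddleAI tokens, "
      f"usage={data['usage']['waddleai_tokens']}")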

Error Responses

{
  "error": {
    "type": "quota_exceeded",
    "message": "Daily token quota exceeded",
    "code": "quota_exceeded",
    "details": {
      "daily_used": 10000,
      "daily_limit": 10000,
      "monthly_used": 50000,
      "monthly_limit": 100000
    }
  }
}

Streaming Responses

Set "stream": true to receive server-sent events:

curl https://your-waddleai-proxy.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wa-your-api-key" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Count to 5"}],
    "stream": true
  }'

Response:

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1699896916,"model":"gpt-4","choices":[{"index":0,"delta":{"role":"assistant","content":"1"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1699896916,"model":"gpt-4","choices":[{"index":0,"delta":{"content":", 2"},"finish_reason":null}]}

...

data: [DONE]
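
With an OpenAI SDK, the same stream can be consumed chunk by chunk. A sketch reusing the client configured in the Authentication section:

# Stream tokens as they arrive.
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Count to 5"}],
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. the final one) may carry no content delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()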

Models

GET /v1/models

List available models across all configured providers.

Request

curl https://your-waddleai-proxy.com/v1/models \
  -H "Authorization: Bearer wa-your-api-key"

Response

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1699896916,
      "owned_by": "openai",
      "provider": "openai",
      "capabilities": ["chat", "completion"],
      "context_length": 8192,
      "cost_per_waddleai_token": 0.001
    },
    {
      "id": "claude-3-opus",
      "object": "model", 
      "created": 1699896916,
      "owned_by": "anthropic",
      "provider": "anthropic",
      "capabilities": ["chat"],
      "context_length": 200000,
      "cost_per_waddleai_token": 0.0015
    },
    {
      "id": "llama2",
      "object": "model",
      "created": 1699896916,
      "owned_by": "meta",
      "provider": "ollama",
      "capabilities": ["chat", "completion"],
      "context_length": 4096,
      "cost_per_waddleai_token": 0.0001
    }
  ]
}
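
The provider, context_length, and cost_per_waddleai_token fields are WaddleAI additions to the standard model object. A sketch that fetches the list directly and groups model IDs by provider:

import os
import requests

resp = requests.get(
    "https://your-waddleai-proxy.com/v1/models",
    headers={"Authorization": f"Bearer {os.environ['WADDLEAI_API_KEY']}"},
)
models_by_provider = {}
for model in resp.json()["data"]:
    models_by_provider.setdefault(model["provider"], []).append(model["id"])
print(models_by_provider)
# e.g. {"openai": ["gpt-4"], "anthropic": ["claude-3-opus"], "ollama": ["llama2"]}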

Completions (Legacy)

POST /v1/completions

Generate text completions. This is a legacy endpoint; the chat completions endpoint is recommended for new integrations.

Request

curl https://your-waddleai-proxy.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wa-your-api-key" \
  -d '{
    "model": "gpt-3.5-turbo",
    "prompt": "Once upon a time",
    "max_tokens": 100,
    "temperature": 0.7
  }'

Response

{
  "id": "cmpl-abc123",
  "object": "text_completion",
  "created": 1699896916,
  "model": "gpt-3.5-turbo",
  "choices": [
    {
      "text": " there was a small village nestled in the mountains...",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 4,
    "completion_tokens": 100,
    "total_tokens": 104,
    "waddleai_tokens": 12
  }
}

Embeddings

POST /v1/embeddings

Create embeddings for text inputs (if supported by the target model).

Request

curl https://your-waddleai-proxy.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wa-your-api-key" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "The food was delicious and the waiter was friendly."
  }'

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8,
    "waddleai_tokens": 2
  }
}
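
The same request through the openai Python client configured in the Authentication section; a sketch:

# Create an embedding through the proxy.
embedding = client.embeddings.create(
    model="text-embedding-ada-002",
    input="The food was delicious and the waiter was friendly.",
)
vector = embedding.data[0].embedding
print(len(vector))  # dimensionality of the returned embedding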

WaddleAI Extensions

Usage Information

Get current usage and quota information:

GET /api/usage

curl https://your-waddleai-proxy.com/api/usage \
  -H "Authorization: Bearer wa-your-api-key"

Response:

{
  "total_waddleai_tokens": 1500,
  "total_llm_input_tokens": 8000,
  "total_llm_output_tokens": 4000,
  "total_requests": 45,
  "llm_breakdown": {
    "openai_gpt4": {"input": 5000, "output": 2500},
    "anthropic_claude": {"input": 2000, "output": 1000},
    "ollama_llama2": {"input": 1000, "output": 500}
  },
  "daily_usage": {
    "2024-01-15": {"waddleai_tokens": 500, "requests": 15},
    "2024-01-14": {"waddleai_tokens": 750, "requests": 20}
  }
}

GET /api/quota

curl https://your-waddleai-proxy.com/api/quota \
  -H "Authorization: Bearer wa-your-api-key"

Response:

{
  "quota_ok": true,
  "daily": {
    "used": 1200,
    "limit": 10000,
    "remaining": 8800,
    "ok": true
  },
  "monthly": {
    "used": 15000,
    "limit": 100000,
    "remaining": 85000,
    "ok": true
  }
}
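
A sketch that checks remaining quota before starting a large job; the 10% threshold and the run_batch() function are hypothetical:

import os
import requests

headers = {"Authorization": f"Bearer {os.environ['WADDLEAI_API_KEY']}"}
quota = requests.get("https://your-waddleai-proxy.com/api/quota", headers=headers).json()

# Skip the batch if less than 10% of the daily allowance remains (arbitrary threshold).
if quota["quota_ok"] and quota["daily"]["remaining"] > 0.1 * quota["daily"]["limit"]:
    run_batch()  # hypothetical function representing your workload
else:
    print("Daily quota nearly exhausted; deferring batch.")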

Security Alerts

Get recent security alerts (if you have appropriate permissions):

GET /api/security/threats

curl https://your-waddleai-proxy.com/api/security/threats \
  -H "Authorization: Bearer wa-your-api-key"

Response:

{
  "recent_threats": [
    {
      "timestamp": "2024-01-15T10:30:00Z",
      "threat_type": "prompt_injection",
      "severity": "high",
      "blocked": true,
      "description": "Detected instruction override attempt"
    }
  ],
  "stats": {
    "last_24h": {
      "total_threats": 3,
      "blocked": 3,
      "allowed": 0
    }
  }
}

Rate Limits

WaddleAI enforces multiple types of limits:

| Limit Type | Default | Description |
|---|---|---|
| Requests per minute | 60 | API calls per minute |
| Daily tokens | 10,000 | WaddleAI tokens per day |
| Monthly tokens | 100,000 | WaddleAI tokens per month |

Rate limit information is included in response headers:

X-RateLimit-Limit-RPM: 60
X-RateLimit-Remaining-RPM: 45
X-RateLimit-Reset-RPM: 1699896976
X-RateLimit-Limit-Daily: 10000
X-RateLimit-Remaining-Daily: 8800
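
These headers can be read from any response. A sketch using the requests library against the chat completions endpoint:

import os
import requests

resp = requests.post(
    "https://your-waddleai-proxy.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['WADDLEAI_API_KEY']}"},
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]},
)
# Inspect the rate-limit headers documented above.
print(resp.headers.get("X-RateLimit-Remaining-RPM"))
print(resp.headers.get("X-RateLimit-Remaining-Daily"))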

Error Codes

| Code | Type | Description |
|---|---|---|
| 400 | invalid_request | Invalid request format |
| 400 | security_blocked | Request blocked by security scanning |
| 401 | invalid_api_key | Invalid or expired API key |
| 403 | insufficient_permissions | Insufficient permissions |
| 429 | rate_limit_exceeded | Rate limit exceeded |
| 429 | quota_exceeded | Token quota exceeded |
| 500 | server_error | Internal server error |
| 502 | provider_error | Upstream LLM provider error |
| 503 | service_unavailable | Service temporarily unavailable |

Best Practices

Authentication

  • Store API keys securely in environment variables
  • Use different keys for different environments
  • Rotate keys regularly

Error Handling

import os

import openai
from openai import OpenAI

# Client configured for the WaddleAI proxy (see Authentication above).
# WADDLEAI_API_KEY is an example environment variable name.
client = OpenAI(
    base_url="https://your-waddleai-proxy.com/v1",
    api_key=os.environ["WADDLEAI_API_KEY"],
)

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except openai.RateLimitError as e:
    # Handle quota/rate limit exceeded (HTTP 429)
    print(f"Rate limited: {e}")
    # Implement exponential backoff before retrying (see the sketch below)
except openai.APIError as e:
    # Handle other API errors (security blocks, provider errors, ...)
    print(f"API error: {e}")

Performance

  • Use connection pooling for high-volume applications
  • Implement request caching where appropriate (see the sketch below)
  • Monitor usage patterns and optimize model selection
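
For example, identical prompts can be served from a small in-process cache; a sketch, only appropriate for deterministic, temperature-0 requests:

import json

_cache = {}

def cached_completion(client, model, messages):
    """Return a cached response for repeated identical requests."""
    key = (model, json.dumps(messages, sort_keys=True))
    if key not in _cache:
        _cache[key] = client.chat.completions.create(
            model=model, messages=messages, temperature=0
        )
    return _cache[key]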

Cost Optimization

  • Choose appropriate models for each task
  • Monitor WaddleAI token consumption
  • Use cheaper models for simple tasks
  • Implement usage budgets and alerts

For more advanced features, see the Claude Integration guide, which covers the Management API.