Introduction

ARES is a multi-provider LLM platform that routes requests across Groq, Anthropic, NVIDIA DeepSeek, and Ollama through a single API. It handles tool calling, retrieval-augmented generation (RAG), multi-step workflows, streaming, usage metering, and multi-tenant isolation out of the box, so you can focus on building your AI application instead of stitching together provider SDKs.

Key capabilities

Multi-provider LLM routing — Send requests to Groq, Anthropic, NVIDIA, or Ollama through one API. Switch models without changing your integration.
Tool calling — Define tools your agents can invoke. ARES manages the tool-call loop, execution, and response assembly.
Retrieval-augmented generation (RAG) — Ground LLM responses in your own data with built-in retrieval pipelines.
Workflows — Chain multiple agents and processing steps into deterministic, multi-step workflows.
Multi-tenant enterprise support — Tenant isolation, per-tenant agent configuration, API key scoping, and usage tracking at the tenant level.
Streaming — Server-Sent Events (SSE) streaming for real-time, token-by-token responses.
Usage metering — Track tokens, requests, and costs per tenant with built-in rate limiting and quota enforcement.
Skills — SKILL.md file discovery and loading via thulp-skill-files. Scope-based priority resolution (project > personal > plugin).
MCP integration — Bridge external MCP servers as agent-callable tools. Connect Eruka, Daedra, or any MCP-compatible service.
Loop detection — Sliding-window hash tracking with 3-tier escalation (warn, force alternative, halt) prevents agents from getting stuck in infinite loops.
Crash recovery — Checkpoint-based state serialization lets agents resume from the last saved state after failures.
Agent versioning — Version history, rollback, and emergency stop (kill switch) for all agent requests.
Research coordination — Deep research agent with configurable depth and max iterations for multi-step investigation tasks.
Deployment automation — Built-in deploy/rollback endpoints with service health monitoring and log streaming.
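
To make the loop detection capability above more concrete, here is a minimal sketch of sliding-window hash tracking with three escalation tiers. It is an illustration only, not the ARES implementation: the class name `LoopDetector`, the `Action` enum, the window size, and the repeat thresholds are all assumptions chosen for the example.

```python
import hashlib
import json
from collections import deque
from enum import Enum


class Action(Enum):
    OK = "ok"
    WARN = "warn"
    FORCE_ALTERNATIVE = "force_alternative"
    HALT = "halt"


class LoopDetector:
    """Tracks hashes of recent agent steps in a fixed-size sliding window."""

    def __init__(self, window_size=10, warn_at=2, force_at=3, halt_at=4):
        # All defaults here are illustrative assumptions, not ARES settings.
        self.window = deque(maxlen=window_size)
        self.warn_at = warn_at
        self.force_at = force_at
        self.halt_at = halt_at

    @staticmethod
    def _hash_step(tool_name, arguments):
        # Serialize with sorted keys so equal calls always hash identically.
        payload = json.dumps({"tool": tool_name, "args": arguments}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def record(self, tool_name, arguments):
        """Record one step and return the escalation tier it triggers."""
        digest = self._hash_step(tool_name, arguments)
        self.window.append(digest)  # the oldest hash drops out when full
        repeats = sum(1 for h in self.window if h == digest)
        if repeats >= self.halt_at:
            return Action.HALT
        if repeats >= self.force_at:
            return Action.FORCE_ALTERNATIVE
        if repeats >= self.warn_at:
            return Action.WARN
        return Action.OK


detector = LoopDetector()
actions = [detector.record("search", {"query": "ares docs"}) for _ in range(4)]
print([a.value for a in actions])
# prints ['ok', 'warn', 'force_alternative', 'halt']
```

Because the window is bounded, an identical call that recurs only after many different steps is no longer counted, which keeps legitimate retries from escalating.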

Who is ARES for?

Platform teams building internal AI infrastructure who need a reliable, multi-provider abstraction layer.
Enterprise clients who want managed AI agents with tenant isolation, usage visibility, and SLA guarantees.
Developers building AI applications who want a clean API without managing provider credentials, rate limits, and failover logic themselves.

Base URL

All API requests are made to:

http://localhost:3000
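
As a small sketch of how a client might keep the base URL in one place, the helper below joins it with an API path. The environment variable name `ARES_BASE_URL`, the `endpoint` helper, and the path `/v1/example` are hypothetical and used only for illustration; consult the API reference for the actual routes.

```python
import os
from urllib.parse import urljoin

# Default taken from this page; ARES_BASE_URL is a hypothetical variable name.
BASE_URL = os.environ.get("ARES_BASE_URL", "http://localhost:3000")


def endpoint(path: str) -> str:
    """Join the base URL and an API path, tolerating stray slashes."""
    return urljoin(BASE_URL.rstrip("/") + "/", path.lstrip("/"))


# "/v1/example" is a placeholder path, not a documented ARES route.
print(endpoint("/v1/example"))
```

Reading the base URL from configuration rather than hard-coding it makes it easier to point the same client at a local instance or a deployed one.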

Quick links

Resource	Description
Quickstart	Zero to first API call in 5 minutes
Authentication	API keys, JWT tokens, and admin auth
Models & Providers	Available models, tiers, and provider configuration
Changelog	Release history and breaking changes
