Stop AI stampedes before they start.
Token-based lease governor for AI calls — small enough to embed anywhere, strict enough to enforce real limits on concurrency, tokens, and spend.
Install
npm install throttleai
Govern
import { ThrottleAI } from 'throttleai';
const gov = new ThrottleAI({ rpm: 60, tpm: 100_000 });
await gov.acquire(estimatedTokens);
Wrap
import { withThrottle } from 'throttleai/adapters/openai';
const openai = withThrottle(new OpenAI(), gov);
Features
Governance that actually holds.
Lease-Based Flow
Callers acquire a lease before any call is made. No lease, no call. Stampedes are structurally impossible, not just unlikely.
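The acquire-then-call discipline boils down to a counting gate: a caller either holds a lease or waits for one. The sketch below is an illustrative stand-in for that pattern, not throttleai's internals; the `LeaseGate` class and its queueing strategy are assumptions made for the example.

```typescript
// Minimal lease gate: callers must hold a lease before doing any work.
// Illustrative sketch only, not throttleai's implementation.
class LeaseGate {
  private inFlight = 0;
  private waiters: Array<() => void> = [];

  constructor(private readonly limit: number) {}

  async acquire(): Promise<{ release: () => void }> {
    // Park the caller until a lease frees up: no lease, no call.
    while (this.inFlight >= this.limit) {
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.inFlight++;
    return {
      release: () => {
        this.inFlight--;
        this.waiters.shift()?.(); // wake the next parked caller, if any
      },
    };
  }
}
```

Because the call site cannot run without first resolving `acquire()`, overload is prevented by construction rather than by best-effort backoff.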
Token + Rate Aware
Tracks RPM, TPM, and concurrent request counts independently. Enforce all three, any two, or just one — your choice.
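Enforcing a subset of limits looks like passing only the fields you care about to the constructor. This fragment assumes that omitted limits are simply left unenforced, which is how the "any two, or just one" claim reads:

```typescript
import { ThrottleAI } from 'throttleai';

// Assumption: limits you omit are not enforced.
const spendGuard = new ThrottleAI({ tpm: 50_000 });              // token budget only
const burstGuard = new ThrottleAI({ rpm: 120, concurrency: 8 }); // rate + in-flight only
```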
Zero Dependencies
Pure TypeScript, ships as ESM + CJS, runs in Node 18+ or any fetch-capable runtime. Nothing to install but the package itself.
Adapters
Drop-in wrappers for the tools you already use.
Usage
Core governor
import { ThrottleAI } from 'throttleai';
const gov = new ThrottleAI({
rpm: 60, // max requests per minute
tpm: 100_000, // max tokens per minute
concurrency: 5, // max in-flight at once
});
// Acquire before every call
const lease = await gov.acquire(estimatedTokens);
const result = await myAICall();
lease.release(actualTokensUsed);
OpenAI adapter
import { withThrottle } from 'throttleai/adapters/openai';
const client = withThrottle(new OpenAI(), gov);
// Use exactly like the normal OpenAI client
const res = await client.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'user', content: 'Hello' }],
});
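Adapters of this kind generally reduce to one move: intercept each call, take a lease, then delegate. The sketch below shows that wrapper pattern in isolation; the `Governor` interface and `governed` helper are hypothetical names for the example, not throttleai's adapter code.

```typescript
// Hypothetical sketch of the adapter pattern: every call goes
// through a governor before it reaches the underlying function.
interface Governor {
  acquire(tokens: number): Promise<{ release: (used: number) => void }>;
}

function governed<A extends unknown[], R>(
  gov: Governor,
  estimate: number,
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A) => {
    const lease = await gov.acquire(estimate); // no lease, no call
    try {
      return await fn(...args);
    } finally {
      lease.release(estimate); // settle the lease even if the call throws
    }
  };
}
```

Wrapping at this boundary is what keeps the client "drop-in": the caller's signature and return type are unchanged, so existing code compiles as before.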