API Reference
Main entry point
Everything documented here is exported from @mcptoolshop/throttleai and is part of the stable public API. Breaking changes require a major version bump.
createGovernor(config): Governor
Factory function. Creates and returns a Governor instance with the provided configuration.

    import { createGovernor, presets } from "@mcptoolshop/throttleai";

    const gov = createGovernor(presets.balanced());

See Configuration for the full config reference.
governor.acquire(request): AcquireDecision
Request a lease from the governor. Returns synchronously with an immediate decision.
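Because acquire returns its decision synchronously, callers usually branch on granted straight away. The sketch below shows one way to do that. It is not taken from the library: the local DenyReason and AcquireDecision types are stand-ins that mirror the shapes documented in this section (the real exported types may carry more fields or more reasons), and describeDecision is a hypothetical helper.

```typescript
// Local stand-ins mirroring the documented decision shapes, so the sketch
// runs without the package installed. Assumed, not the library's own types.
type DenyReason = "concurrency" | "rate" | "budget" | "policy";

type AcquireDecision =
  | { granted: true; leaseId: string; expiresAt: number }
  | {
      granted: false;
      reason: DenyReason;
      retryAfterMs: number;
      recommendation: string;
    };

// Hypothetical helper: turn a decision into a one-line log message.
function describeDecision(decision: AcquireDecision): string {
  if (decision.granted) {
    return `granted lease ${decision.leaseId}`;
  }
  // The discriminated union narrows to the denied shape here.
  switch (decision.reason) {
    case "concurrency":
      return `busy, retry in ${decision.retryAfterMs} ms`;
    case "rate":
      return `rate limited, retry in ${decision.retryAfterMs} ms`;
    case "budget":
      return `token budget exhausted: ${decision.recommendation}`;
    case "policy":
      return `fairness cap reached: ${decision.recommendation}`;
  }
}
```

With the real package, the value passed in would be the result of gov.acquire(request); how each reason is handled is an application choice.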
Request shape
    interface AcquireRequest {
      actorId: string;          // identifies the caller
      action: string;           // what they are doing ("chat", "embed", etc.)
      priority?: "interactive" | "background"; // default: "interactive"
      estimate?: TokenEstimate; // token estimates and concurrency weight
      idempotencyKey?: string;  // deduplicate concurrent identical requests
    }

    interface TokenEstimate {
      promptTokens?: number;    // estimated prompt tokens
      maxOutputTokens?: number; // estimated max output tokens
      weight?: number;          // concurrency weight for this call (default: 1)
    }

Return shape
    // Granted
    {
      granted: true,
      leaseId: string,
      expiresAt: number, // Unix timestamp (ms)
    }

    // Denied
    {
      granted: false,
      reason: DenyReason,
      retryAfterMs: number,
      recommendation: string,  // human-readable suggestion
      limitsHint?: LimitsHint, // which limit was hit and current values
    }

Deny reasons
| Reason | Meaning |
|---|---|
| "concurrency" | All slots are in use (or interactive reserve blocks background) |
| "rate" | requestsPerMinute or tokensPerMinute limit reached |
| "budget" | Token budget exhausted |
| "policy" | Fairness soft-cap prevents this actor from acquiring more slots |
governor.release(leaseId, report?): void
Release a previously acquired lease. Always call this, even on errors.

    gov.release(decision.leaseId, {
      outcome: "success", // "success" | "error" | "timeout" | "cancelled"
      latencyMs: 1200,    // optional: actual call duration
      usage: { promptTokens: 300, outputTokens: 200 }, // optional: actual token consumption
    });

The report object is optional but strongly recommended. Adaptive tuning and stats collection use outcome and latency data to make better decisions.
Outcome values
| Outcome | When to use |
|---|---|
| "success" | The operation completed normally |
| "error" | The operation failed (5xx, exception, etc.) |
| "timeout" | The operation timed out |
| "cancelled" | The operation was cancelled by the caller |
In strict mode (strict: true), releasing an unknown or already-released lease throws an error. In normal mode, it is silently ignored.
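One way to honor "always call this, even on errors" is to release in a finally block. The sketch below assumes a structural GovernorLike interface that covers only the two documented methods, with simplified request and decision shapes; runUnderLease is a hypothetical helper, not part of the package.

```typescript
// Structural stand-ins for the parts of the API this sketch touches.
// These are assumptions that mirror the documented shapes.
type Outcome = "success" | "error" | "timeout" | "cancelled";
type Decision =
  | { granted: true; leaseId: string }
  | { granted: false; retryAfterMs: number };

interface GovernorLike {
  acquire(request: { actorId: string; action: string }): Decision;
  release(leaseId: string, report?: { outcome: Outcome; latencyMs?: number }): void;
}

// Hypothetical helper: run `fn` under a lease and always release it.
async function runUnderLease<T>(
  gov: GovernorLike,
  request: { actorId: string; action: string },
  fn: () => Promise<T>,
): Promise<T | undefined> {
  const decision = gov.acquire(request);
  if (!decision.granted) return undefined; // denied: there is no lease to release

  const startedAt = Date.now();
  let outcome: Outcome = "error"; // assume failure until fn resolves
  try {
    const value = await fn();
    outcome = "success";
    return value;
  } finally {
    // Runs whether fn resolved or threw, so the lease is not leaked.
    gov.release(decision.leaseId, { outcome, latencyMs: Date.now() - startedAt });
  }
}
```

This is roughly the bookkeeping that withLease is described as automating; the manual form is mainly useful when the release report needs custom fields.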
withLease(governor, request, fn, options?)
Execute an async function under a lease with automatic acquire and release. This is the recommended way to use ThrottleAI for most cases.

    import { withLease } from "@mcptoolshop/throttleai";

    const result = await withLease(gov, request, async () => {
      return await callMyModel();
    });

    if (result.granted) {
      console.log(result.result);
    } else {
      console.log("Denied:", result.decision.recommendation);
    }

Options
    interface WithLeaseOptions {
      strategy?: WithLeaseStrategy; // "deny" | "wait" | "wait-then-deny"
      maxWaitMs?: number;           // max total wait (default: 10_000)
      maxAttempts?: number;         // for "wait-then-deny" (default: 3)
      initialBackoffMs?: number;    // starting backoff (default: 250)
    }

Strategies

| Strategy | Behavior |
|---|---|
| "deny" | Fail immediately if denied. Default. |
| "wait" | Retry with exponential backoff until maxWaitMs. |
| "wait-then-deny" | Retry up to maxAttempts within maxWaitMs, then deny. |
Return shape
    // Granted
    { granted: true, result: T }

    // Denied
    { granted: false, decision: AcquireDecision }

governor.snapshot(): GovernorSnapshot
Returns a point-in-time view of the governor’s state. Useful for dashboards, health checks, and debugging.

    const snap = gov.snapshot();

Snapshot shape

    interface GovernorSnapshot {
      timestamp: number;      // when the snapshot was taken
      activeLeases: number;   // number of active leases
      concurrency: {          // null if concurrency not configured
        inFlightWeight: number;
        inFlightCount: number;
        max: number;
        effectiveMax: number; // may be lower when adaptive is active
        available: number;
      } | null;
      requestRate: {          // null if rate not configured
        current: number;
        limit: number;
      } | null;
      tokenRate: {            // null if token rate not configured
        current: number;
        limit: number;
      } | null;
      fairness: boolean;      // whether fairness is active
      adaptive: boolean;      // whether adaptive is active
      lastDeny: {             // most recent deny event, or null
        reason: DenyReason;
        timestamp: number;
        actorId?: string;
      } | null;
    }

formatEvent(event): string
One-line human-readable formatter for governor events.

    import { createGovernor, formatEvent } from "@mcptoolshop/throttleai";

    const gov = createGovernor({
      onEvent: (e) => console.log(formatEvent(e)),
    });
    // [deny] actor=user-1 action=chat reason=concurrency retryAfterMs=500

formatSnapshot(snap): string
Compact one-line formatter for snapshots.

    import { formatSnapshot } from "@mcptoolshop/throttleai";

    console.log(formatSnapshot(gov.snapshot()));
    // concurrency=3/5 rate=12/60 leases=3

Status getters
Convenience properties on the governor instance for quick checks without calling snapshot().

    gov.activeLeases            // number: active lease count
    gov.concurrencyActive       // number: in-flight weight
    gov.concurrencyAvailable    // number: remaining capacity
    gov.concurrencyEffectiveMax // number: effective concurrency limit (may differ from max when adaptive is active)
    gov.rateCount               // number: requests in current window
    gov.rateLimit               // number: configured request-rate limit
    gov.tokenRateCount          // number: tokens in current window
    gov.tokenRateLimit          // number: configured token-rate limit

governor.dispose(): void
Stops the TTL reaper interval. Call on application shutdown.

    process.on("SIGINT", () => {
      gov.dispose();
      process.exit(0);
    });

After dispose:

- acquire() still works. The governor does not shut down; it just stops sweeping expired leases.
- Expired leases will not be reaped. They sit until explicitly released or the governor is garbage-collected.
- dispose() is idempotent. Calling it twice is safe.

If you do not call dispose, the reaper setInterval keeps the Node.js process alive.
Utility functions
waitForRetry(decision): Promise<void>
Sleeps for the retryAfterMs duration from a deny decision. Useful for manual retry loops.
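A manual retry loop of the kind mentioned above might look like the following sketch. To keep it runnable without the package, sleepForRetry is a local stand-in for waitForRetry, AcquireOnly is an assumed structural interface with simplified request and decision shapes, and acquireWithRetries is a hypothetical helper.

```typescript
// Simplified stand-ins for the documented shapes (assumptions).
type Decision =
  | { granted: true; leaseId: string }
  | { granted: false; retryAfterMs: number };

interface AcquireOnly {
  acquire(request: { actorId: string; action: string }): Decision;
}

// Local stand-in for waitForRetry: sleep for the delay the deny suggested.
function sleepForRetry(decision: { retryAfterMs: number }): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, decision.retryAfterMs));
}

// Hypothetical manual loop: try up to maxAttempts times, sleeping between denials.
async function acquireWithRetries(
  gov: AcquireOnly,
  request: { actorId: string; action: string },
  maxAttempts: number,
): Promise<Decision> {
  for (let attempt = 1; ; attempt++) {
    const decision = gov.acquire(request);
    // Stop on a grant, or hand back the last denial once attempts run out.
    if (decision.granted || attempt >= maxAttempts) return decision;
    await sleepForRetry(decision);
  }
}
```

A granted decision returned from this loop still holds a lease, so the caller remains responsible for releasing it.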
retryAcquire(governor, request, options): Promise<AcquireDecision>
Retry acquire with backoff. Used internally by withLease but available for custom retry logic.
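To get a feel for how the documented defaults (initialBackoffMs 250, maxWaitMs 10_000) interact, the sketch below lists the delays a doubling backoff would use. It illustrates exponential backoff in general, not the library's internals: the doubling factor, the absence of jitter, the rule for stopping at maxWaitMs, and treating maxAttempts as a cap on the number of waits are all assumptions, and backoffSchedule is a hypothetical helper.

```typescript
// Hypothetical helper: list the delays a doubling backoff would sleep for,
// stopping before the cumulative wait would exceed maxWaitMs or once
// maxAttempts delays have been scheduled.
function backoffSchedule(
  initialBackoffMs = 250,
  maxWaitMs = 10_000,
  maxAttempts = Infinity,
): number[] {
  if (initialBackoffMs <= 0) return []; // guard against a loop that never advances

  const delays: number[] = [];
  let delay = initialBackoffMs;
  let total = 0;
  while (delays.length < maxAttempts && total + delay <= maxWaitMs) {
    delays.push(delay);
    total += delay;
    delay *= 2; // assumed growth factor; the library's actual factor may differ
  }
  return delays;
}

console.log(backoffSchedule());              // [250, 500, 1000, 2000, 4000]
console.log(backoffSchedule(250, 10_000, 3)); // [250, 500, 1000]
```

Under these assumptions the default budget allows five waits (7,750 ms in total) before the next doubling would overshoot maxWaitMs.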
createStatsCollector(): StatsCollector
Returns a zero-dependency stats collector that you can wire to onEvent:

    const stats = createStatsCollector();
    const gov = createGovernor({
      onEvent: stats.handler,
    });

    const summary = stats.snapshot();
    // { grants: 150, denies: 12, avgLatencyMs: 340, ... }

createTestClock(startMs): Clock
Creates a deterministic clock for testing. Inject it to control time in tests:

    import { createTestClock } from "@mcptoolshop/throttleai";

    const clock = createTestClock(100_000); // start at 100s

    // Advance time by 1 minute
    clock.advance(60_000);

Exported types
All types are exported from @mcptoolshop/throttleai:

- Config: GovernorConfig, ConcurrencyConfig, RateConfig, FairnessConfig, AdaptiveConfig
- Request/Response: AcquireRequest, AcquireDecision, ReleaseReport
- Enums: DenyReason, LeaseOutcome, Priority, TokenEstimate
- State: Constraints, LimitsHint, GovernorSnapshot
- Events: GovernorEvent, GovernorEventType, GovernorEventHandler
- WithLease: WithLeaseOptions, WithLeaseResult, WithLeaseStrategy
- Stats: StatsCollector, StatsSnapshot
- Clock: Clock