API Reference

Everything documented here is exported from @mcptoolshop/throttleai and is part of the stable public API. Breaking changes require a major version bump.

createGovernor(config)

Factory function. Creates and returns a Governor instance with the provided configuration.

import { createGovernor, presets } from "@mcptoolshop/throttleai";
const gov = createGovernor(presets.balanced());

See Configuration for the full config reference.

governor.acquire(request): AcquireDecision

Request a lease from the governor. Returns synchronously with an immediate decision.

interface AcquireRequest {
  actorId: string; // identifies the caller
  action: string; // what they are doing ("chat", "embed", etc.)
  priority?: "interactive" | "background"; // default: "interactive"
  estimate?: TokenEstimate; // token estimates and concurrency weight
  idempotencyKey?: string; // deduplicate concurrent identical requests
}

interface TokenEstimate {
  promptTokens?: number; // estimated prompt tokens
  maxOutputTokens?: number; // estimated max output tokens
  weight?: number; // concurrency weight for this call (default: 1)
}

// Granted
{
  granted: true,
  leaseId: string,
  expiresAt: number, // Unix timestamp (ms)
}

// Denied
{
  granted: false,
  reason: DenyReason,
  retryAfterMs: number,
  recommendation: string, // human-readable suggestion
  limitsHint?: LimitsHint, // which limit was hit and current values
}
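
Because acquire returns one of two shapes, callers typically branch on granted before touching any other field. The sketch below assumes the two shapes form a discriminated union on granted; the local types are stand-ins that mirror the fields listed above (limitsHint is omitted), and describeDecision is a hypothetical helper, not a library export.

```typescript
// Local stand-ins mirroring the documented decision shapes.
// In real code, import AcquireDecision and DenyReason from the package instead.
type DenyReason = "concurrency" | "rate" | "budget" | "policy";

type AcquireDecision =
  | { granted: true; leaseId: string; expiresAt: number }
  | {
      granted: false;
      reason: DenyReason;
      retryAfterMs: number;
      recommendation: string;
    };

// Hypothetical helper: turns a decision into a log-friendly string.
function describeDecision(decision: AcquireDecision): string {
  if (decision.granted) {
    // Narrowed to the granted shape: leaseId and expiresAt are available.
    return `lease ${decision.leaseId} expires at ${decision.expiresAt}`;
  }
  // Narrowed to the denied shape: reason and retryAfterMs are available.
  return `denied (${decision.reason}), retry in ${decision.retryAfterMs} ms`;
}
```

Checking granted first lets the compiler reject accidental reads of leaseId on a denied decision.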

| Reason | Meaning |
| --- | --- |
| "concurrency" | All slots are in use (or interactive reserve blocks background) |
| "rate" | requestsPerMinute or tokensPerMinute limit reached |
| "budget" | Token budget exhausted |
| "policy" | Fairness soft-cap prevents this actor from acquiring more slots |

governor.release(leaseId, report?)

Release a previously acquired lease. Always call this, even on errors.

gov.release(decision.leaseId, {
  outcome: "success", // "success" | "error" | "timeout" | "cancelled"
  latencyMs: 1200, // optional: actual call duration
  usage: { promptTokens: 300, outputTokens: 200 }, // optional: actual token consumption
});

The report object is optional but strongly recommended. Adaptive tuning and stats collection use outcome and latency data to make better decisions.

| Outcome | When to use |
| --- | --- |
| "success" | The operation completed normally |
| "error" | The operation failed (5xx, exception, etc.) |
| "timeout" | The operation timed out |
| "cancelled" | The operation was cancelled by the caller |

In strict mode (strict: true), releasing an unknown or already-released lease throws an error. In normal mode, it is silently ignored.
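
One way to make sure a lease is released on every path is a try/finally block around the guarded work. The sketch below is a minimal illustration, not the library's code: LeaseGovernor is a structural stand-in covering only the two methods used, runGuarded is a hypothetical helper, and the work is synchronous to keep the example short (the same shape applies with async/await).

```typescript
// Structural stand-in for the governor methods used here.
// The real Governor type comes from the package.
interface LeaseGovernor {
  acquire(request: { actorId: string; action: string }):
    | { granted: true; leaseId: string }
    | { granted: false; retryAfterMs: number };
  release(
    leaseId: string,
    report?: { outcome: "success" | "error"; latencyMs?: number },
  ): void;
}

// Hypothetical helper: runs `work` under a lease and always releases it.
function runGuarded<T>(
  gov: LeaseGovernor,
  actorId: string,
  work: () => T,
): T | undefined {
  const decision = gov.acquire({ actorId, action: "chat" });
  if (!decision.granted) return undefined; // denied: there is no lease to release

  const startedAt = Date.now();
  let outcome: "success" | "error" = "error"; // assume failure until work returns
  try {
    const value = work();
    outcome = "success";
    return value;
  } finally {
    // Runs on both the return path and the throw path.
    gov.release(decision.leaseId, { outcome, latencyMs: Date.now() - startedAt });
  }
}
```

If work throws, the error still propagates to the caller, but only after the lease has been released with outcome "error".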

withLease(governor, request, fn, options?)

Execute an async function under a lease with automatic acquire and release. This is the recommended way to use ThrottleAI for most cases.

import { withLease } from "@mcptoolshop/throttleai";
const result = await withLease(gov, request, async () => {
  return await callMyModel();
});
if (result.granted) {
  console.log(result.result);
} else {
  console.log("Denied:", result.decision.recommendation);
}

interface WithLeaseOptions {
  strategy?: WithLeaseStrategy; // "deny" | "wait" | "wait-then-deny"
  maxWaitMs?: number; // max total wait (default: 10_000)
  maxAttempts?: number; // for "wait-then-deny" (default: 3)
  initialBackoffMs?: number; // starting backoff (default: 250)
}

| Strategy | Behavior |
| --- | --- |
| "deny" | Fail immediately if denied. Default. |
| "wait" | Retry with exponential backoff until maxWaitMs. |
| "wait-then-deny" | Retry up to maxAttempts within maxWaitMs, then deny. |

// Granted
{ granted: true, result: T }
// Denied
{ granted: false, decision: AcquireDecision }
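
To get a feel for how initialBackoffMs, maxWaitMs, and maxAttempts interact, the sketch below lists the delays a retry loop might sleep. It is not the library's implementation: backoffSchedule is a hypothetical helper, the doubling factor is an assumption, and maxAttempts is treated here as a cap on the number of waits.

```typescript
// Hypothetical helper for reasoning about the retry options.
// The doubling factor is an assumption, not taken from the library.
function backoffSchedule(
  initialBackoffMs = 250,
  maxWaitMs = 10_000,
  maxAttempts = Infinity,
): number[] {
  if (initialBackoffMs <= 0) {
    // A non-positive delay would never exhaust the wait budget.
    throw new RangeError("initialBackoffMs must be positive");
  }
  const delays: number[] = [];
  let next = initialBackoffMs;
  let total = 0;
  // Keep adding waits while both the attempt cap and the time budget allow it.
  while (delays.length < maxAttempts && total + next <= maxWaitMs) {
    delays.push(next);
    total += next;
    next *= 2; // assumed exponential factor
  }
  return delays;
}
```

Under these assumptions the documented defaults give waits of 250, 500, 1000, 2000, and 4000 ms (7750 ms in total), because a sixth wait of 8000 ms would exceed the 10_000 ms budget. Capping at three attempts keeps only the first three waits.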

governor.snapshot()

Returns a point-in-time view of the governor’s state. Useful for dashboards, health checks, and debugging.

const snap = gov.snapshot();

interface GovernorSnapshot {
  timestamp: number; // when the snapshot was taken
  activeLeases: number; // number of active leases
  concurrency: { // null if concurrency not configured
    inFlightWeight: number;
    inFlightCount: number;
    max: number;
    effectiveMax: number; // may be lower when adaptive is active
    available: number;
  } | null;
  requestRate: { // null if rate not configured
    current: number;
    limit: number;
  } | null;
  tokenRate: { // null if token rate not configured
    current: number;
    limit: number;
  } | null;
  fairness: boolean; // whether fairness is active
  adaptive: boolean; // whether adaptive is active
  lastDeny: { // most recent deny event, or null
    reason: DenyReason;
    timestamp: number;
    actorId?: string;
  } | null;
}
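
Since each limiter section is null when that limiter is not configured, code that reads a snapshot needs a null check before using the numbers. The following sketch shows one possible health-check calculation; SnapshotSubset is a local stand-in for the fields used, and utilization is a hypothetical helper rather than a library export.

```typescript
// Local stand-in for the subset of GovernorSnapshot used here.
interface SnapshotSubset {
  concurrency: { inFlightWeight: number; effectiveMax: number } | null;
  requestRate: { current: number; limit: number } | null;
}

// Hypothetical helper: returns the highest usage ratio across the
// configured limiters, or null when neither limiter is configured.
function utilization(snap: SnapshotSubset): number | null {
  const ratios: number[] = [];
  if (snap.concurrency && snap.concurrency.effectiveMax > 0) {
    ratios.push(snap.concurrency.inFlightWeight / snap.concurrency.effectiveMax);
  }
  if (snap.requestRate && snap.requestRate.limit > 0) {
    ratios.push(snap.requestRate.current / snap.requestRate.limit);
  }
  return ratios.length > 0 ? Math.max(...ratios) : null;
}
```

A dashboard could, for example, flag the governor as saturated when this ratio approaches 1.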

formatEvent(event)

One-line human-readable formatter for governor events.

import { createGovernor, formatEvent } from "@mcptoolshop/throttleai";
const gov = createGovernor({
  onEvent: (e) => console.log(formatEvent(e)),
});
// [deny] actor=user-1 action=chat reason=concurrency retryAfterMs=500

formatSnapshot(snapshot)

Compact one-line formatter for snapshots.

import { formatSnapshot } from "@mcptoolshop/throttleai";
console.log(formatSnapshot(gov.snapshot()));
// concurrency=3/5 rate=12/60 leases=3

Convenience properties on the governor instance for quick checks without calling snapshot().

gov.activeLeases // number — active lease count
gov.concurrencyActive // number — in-flight weight
gov.concurrencyAvailable // number — remaining capacity
gov.concurrencyEffectiveMax // number — effective concurrency limit (may differ from max when adaptive is active)
gov.rateCount // number — requests in current window
gov.rateLimit // number — configured request-rate limit
gov.tokenRateCount // number — tokens in current window
gov.tokenRateLimit // number — configured token-rate limit

governor.dispose()

Stops the TTL reaper interval. Call on application shutdown.

process.on("SIGINT", () => {
  gov.dispose();
  process.exit(0);
});

After dispose:

  • acquire() still works. The governor does not shut down — it just stops sweeping expired leases.
  • Expired leases will not be reaped. They sit until explicitly released or the governor is garbage-collected.
  • dispose() is idempotent. Calling it twice is safe.

If you do not call dispose, the reaper setInterval keeps the Node.js process alive.

Sleeps for the retryAfterMs duration from a deny decision. Useful for manual retry loops.

retryAcquire(governor, request, options): Promise<AcquireDecision>

Retry acquire with backoff. Used internally by withLease but available for custom retry logic.

createStatsCollector()

Returns a zero-dependency stats collector that you can wire to onEvent:

import { createGovernor, createStatsCollector } from "@mcptoolshop/throttleai";
const stats = createStatsCollector();
const gov = createGovernor({
  onEvent: stats.handler,
});
const summary = stats.snapshot();
// { grants: 150, denies: 12, avgLatencyMs: 340, ... }

createTestClock()

Creates a deterministic clock for testing. Inject it to control time in tests:

import { createTestClock } from "@mcptoolshop/throttleai";
const clock = createTestClock(100_000); // start at 100s
// Advance time by 1 minute
clock.advance(60_000);
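
To illustrate why a controllable clock helps, the sketch below builds a tiny stand-in clock and uses it in a time-dependent check. It does not show the library's Clock interface: makeManualClock and isExpired are hypothetical, and the now() method name is an assumption (only advance appears in the example above).

```typescript
// Hypothetical stand-in showing the idea behind a deterministic clock.
// The now() method name is an assumption about the Clock interface.
function makeManualClock(startMs: number) {
  let current = startMs;
  return {
    now: (): number => current,
    advance: (ms: number): void => {
      current += ms; // time moves only when the test says so
    },
  };
}

// Example time-dependent check: has a lease with this expiresAt lapsed?
function isExpired(expiresAt: number, clock: { now(): number }): boolean {
  return clock.now() >= expiresAt;
}
```

Because time advances only on request, a test can step past an expiry boundary instantly instead of sleeping.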

Types

All types are exported from @mcptoolshop/throttleai:

  • Config: GovernorConfig, ConcurrencyConfig, RateConfig, FairnessConfig, AdaptiveConfig
  • Request/Response: AcquireRequest, AcquireDecision, ReleaseReport, TokenEstimate
  • Enums: DenyReason, LeaseOutcome, Priority
  • State: Constraints, LimitsHint, GovernorSnapshot
  • Events: GovernorEvent, GovernorEventType, GovernorEventHandler
  • WithLease: WithLeaseOptions, WithLeaseResult, WithLeaseStrategy
  • Stats: StatsCollector, StatsSnapshot
  • Clock: Clock