zae-limiter

A rate-limiting library backed by DynamoDB that uses the token bucket algorithm.
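
For readers new to the token bucket algorithm, here is a minimal in-memory sketch of the idea. It is not zae-limiter's implementation, which keeps its state in DynamoDB. The `TokenBucket` class, its parameters, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative in-memory token bucket (NOT zae-limiter's code)."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity                    # maximum tokens the bucket holds
        self.refill_per_second = refill_per_second  # steady-state refill rate
        self.clock = clock                          # injectable so tests can control time
        self.tokens = capacity                      # start with a full bucket
        self.updated_at = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now

    def try_consume(self, amount: float = 1.0) -> bool:
        # Succeed only if enough tokens are available after refilling.
        self._refill()
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


# A limit of 100 requests per minute, expressed as capacity and refill rate.
bucket = TokenBucket(capacity=100, refill_per_second=100 / 60)
allowed = bucket.try_consume()
```

Under this model, a burst can use up to `capacity` tokens at once, after which requests are admitted only as fast as the bucket refills.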

Installation

BASH
pip install zae-limiter
# or
conda install -c conda-forge zae-limiter

Usage

PYTHON
from zae_limiter import Repository, RateLimiter, SyncRepository, SyncRateLimiter, Limit

# Async: the `await` calls below must run inside a coroutine (e.g. under asyncio.run())
# Auto-provisions infrastructure if needed
repo = await Repository.open()
limiter = RateLimiter(repository=repo)

# Sync
sync_repo = SyncRepository.open()
sync_limiter = SyncRateLimiter(repository=sync_repo)

# Define default limits (can be overridden per-entity)
default_limits = [
    Limit.per_minute("rpm", 100),
    Limit.per_minute("tpm", 10_000),
]

async with limiter.acquire(
    entity_id="api-key-123",
    resource="gpt-4",
    limits=default_limits,  # Multiple limits in a single atomic transaction
    consume={"rpm": 1, "tpm": 500},  # Estimate tokens upfront
) as lease:
    response = await call_llm()
    # Reconcile actual usage (can go negative for post-hoc adjustment)
    await lease.adjust(tpm=response.usage.total_tokens - 500)
    # Tokens written to DynamoDB on enter | Rolled back on exception

# Hierarchical entities: create project with stored limits, then API key under it
await limiter.create_entity(entity_id="proj-1", name="Production")
await limiter.set_limits("proj-1", [Limit.per_minute("tpm", 100_000)])  # Project-level
await limiter.create_entity(entity_id="api-key-456", parent_id="proj-1", cascade=True)

# cascade is an entity property — acquire() auto-cascades to parent
with sync_limiter.acquire(
    entity_id="api-key-456",
    resource="gpt-4",
    limits=default_limits,
    consume={"rpm": 1, "tpm": 500},
):
    call_api()

# Multi-tenant: each tenant gets an isolated namespace
tenant_repo = await Repository.open("tenant-alpha")
tenant_limiter = RateLimiter(repository=tenant_repo)

# Cleanup (removes all data)
await repo.delete_stack()
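
The estimate-then-reconcile flow in the usage example can be shown in isolation with plain dictionaries. This is only a sketch of the arithmetic, assuming integer balances held in memory; `try_consume_all` and `reconcile` are hypothetical helpers, not part of the zae-limiter API, and the real library performs these updates in DynamoDB.

```python
def try_consume_all(balances: dict[str, int], consume: dict[str, int]) -> bool:
    """Deduct every requested amount, or none of them if any limit is short."""
    if any(balances[name] < amount for name, amount in consume.items()):
        return False  # all-or-nothing: leave every balance untouched
    for name, amount in consume.items():
        balances[name] -= amount
    return True


def reconcile(balances: dict[str, int], name: str, estimated: int, actual: int) -> int:
    """Apply the difference between actual and estimated usage.

    A positive delta charges extra tokens (the balance may go negative);
    a negative delta refunds tokens that were over-estimated.
    """
    delta = actual - estimated
    balances[name] -= delta
    return delta


balances = {"rpm": 100, "tpm": 10_000}

# Reserve one request and an estimated 500 tokens up front.
admitted = try_consume_all(balances, {"rpm": 1, "tpm": 500})

# The call actually used 620 tokens, so charge the extra 120.
reconcile(balances, "tpm", estimated=500, actual=620)
print(balances)  # {'rpm': 99, 'tpm': 9380}
```

Reserving an estimate before the call keeps concurrent callers from overshooting the limit, while the later adjustment keeps the balance accurate once real usage is known.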

Documentation

Full Documentation

Guide | Description
--- | ---
Getting Started | Installation, first deployment
Basic Usage | Rate limiting patterns, error handling
Hierarchical Limits | Parent/child entities, cascade mode
LLM Integration | Token estimation and reconciliation
CLI Reference | Deploy, status, delete commands
Multi-Tenant Guide | Namespace isolation, per-tenant IAM
Production Guide | Security, monitoring, cost

Production Deployment

The default deployment includes CloudWatch alarms and usage aggregation. For production, also enable point-in-time recovery and route alarms to an SNS topic:

BASH
zae-limiter deploy --name my-app --region us-east-1 \
    --pitr-recovery-days 7 \
    --alarm-sns-topic arn:aws:sns:us-east-1:123456789012:alerts

For security best practices, multi-region considerations, and cost estimation, see the Production Guide.

Contributing

BASH
git clone https://github.com/zeroae/zae-limiter.git && cd zae-limiter
uv sync --all-extras
pytest

See the Contributing Guide for development setup, testing, and architecture details.

License

MIT