How Usage & Budgets Work

Apogee tracks your AI usage based on the tokens processed by the models that power your requests. This page explains how budgets work, what affects your usage, and how to get the most out of your plan.

How Usage Is Measured

Every time you interact with Apogee - whether through the chat client, via MCP, or through a scheduled digest - the underlying AI model processes tokens. Tokens are small units of text that represent words, parts of words, or punctuation. A typical English sentence is roughly 15-25 tokens.

Your usage is calculated based on two factors:

Number of tokens processed - both the input (your question plus context) and the output (the AI's response)
Cost of the model used - more capable models cost more per token

Input and output tokens are metered separately because output tokens cost significantly more than input tokens on every model tier.

What Counts as Usage

Different activities consume different amounts of your token budget:

Short factual lookups (e.g., "What's the status of HR 1234?") typically use ~3,000-5,000 tokens total
Detailed analysis (e.g., "Compare these two bills and summarize the differences") might use ~10,000-20,000 tokens because the model processes more text
Research queries that search multiple data sources can use 20,000-50,000+ tokens across multiple internal tool calls - the agent may call several tools in sequence, each adding tokens
Follow-up questions in the same conversation include prior messages as context, so longer conversations gradually use more input tokens per message

Model Pricing

Apogee uses multiple AI models with different capabilities and costs. The model used depends on the task complexity and your chat preset selection:

Model tier	Relative token cost (Sonnet = reference)	When it's used
Haiku	~6x cheaper than Sonnet	Quick lookups, data retrieval, internal research tool calls
Sonnet	Mid-range (reference tier)	Most chat conversations, analysis, the "Smart" preset
Opus	~5x more than Sonnet	Complex reasoning, deep analysis (when available)

You don't need to choose a model manually - the system selects the right one based on the task. In the chat client, different presets (e.g., "Quick" vs. "Smart") use different default models, and the research tool uses Haiku for efficiency with automatic escalation to Sonnet for complex queries.

Practical example: A quick question like "Who sponsors HR 5678?" typically uses ~3,000-5,000 tokens (Haiku). A detailed research request like "Find all AI regulation bills this Congress and summarize their approaches" might use 30,000-50,000 tokens across multiple tool calls depending on how many sources are searched.
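
The relative costs in the table can be turned into a rough Sonnet-equivalent estimate. The sketch below is illustrative only: the weights are the approximate ratios from the table, the function name is hypothetical, and it ignores the input/output price difference, so it is not Apogee's actual metering formula.

```python
# Approximate per-token cost of each tier relative to Sonnet (taken from the table above).
TIER_WEIGHT = {
    "haiku": 1 / 6,   # ~6x cheaper than Sonnet
    "sonnet": 1.0,    # reference tier
    "opus": 5.0,      # ~5x more than Sonnet
}


def sonnet_equivalent(tokens: int, tier: str) -> float:
    """Convert raw tokens processed on a tier into Sonnet-equivalent tokens."""
    return tokens * TIER_WEIGHT[tier]


# A 5,000-token lookup and a 30,000-token research query, both served by Haiku.
print(round(sonnet_equivalent(5_000, "haiku")))
print(round(sonnet_equivalent(30_000, "haiku")))
```

Under these assumptions, a Haiku query counts for about one sixth of its raw token count against a Sonnet-equivalent budget.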

Where Usage Comes From

Your budget is shared across all the ways you interact with Apogee. Every source draws from the same session and weekly windows.

Chat Client

When you send a message in the chat client, the AI model processes your message and generates a response. Both the input tokens (your message plus conversation history) and output tokens (the AI's reply) are metered. If the AI calls research tools during a chat response, those internal tool calls are recorded separately as MCP usage, but they still draw from the same shared budget.

MCP Connections

If you connect to Apogee via MCP (through Claude Desktop, Cursor, or another MCP client), tool calls and AI processing count toward the same budget as the chat client. There's no separate allocation - all usage is pooled.

Digest Subscriptions

Digest subscriptions (scheduled monitoring alerts) consume budget when they run. Each digest execution triggers a research query on your behalf, using tokens to search data sources and generate a summary. Digests run on a schedule (daily or weekly) and draw from your regular budget windows.

If your budget is exhausted when a digest is scheduled to run, the digest is skipped for that cycle and you'll receive a notification. It will attempt to run again at the next scheduled time.

Tip: If you're on the Starter plan and use digests, keep in mind that each daily digest typically uses 5,000-20,000 tokens depending on query complexity. Over a 30-day month that adds up to roughly 150,000-600,000 tokens, so a single daily digest could consume a large share of the Starter plan's ~800K monthly token budget.

Budget Windows

Your plan's token budget is divided into time windows to ensure steady availability throughout the week. There are two windows:

Session Window (4 hours, rolling)

A continuously rolling 4-hour window that prevents burst usage from exhausting your entire weekly budget in a single sitting. Unlike the weekly window, the session window has no fixed reset time - it moves forward continuously. Each request ages out of the window exactly 4 hours after it was made, gradually freeing up session budget over time.
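
A rolling window of this kind can be sketched as a queue of timestamped usage events. This is a minimal illustration under stated assumptions, not Apogee's implementation: the class and method names are hypothetical, and usage is assumed to expire once it is exactly 4 hours old.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

SESSION_WINDOW = timedelta(hours=4)


class SessionWindow:
    """Hypothetical rolling 4-hour usage window."""

    def __init__(self, limit_tokens: int) -> None:
        self.limit_tokens = limit_tokens
        self._events = deque()  # (timestamp, tokens), oldest first

    def record(self, now: datetime, tokens: int) -> None:
        """Add a request's token usage at time `now`."""
        self._events.append((now, tokens))

    def used(self, now: datetime) -> int:
        """Tokens still inside the window at `now`; older events are dropped."""
        while self._events and now - self._events[0][0] >= SESSION_WINDOW:
            self._events.popleft()
        return sum(tokens for _, tokens in self._events)


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window = SessionWindow(limit_tokens=25_000)
window.record(start, 20_000)
window.record(start + timedelta(hours=2), 5_000)

print(window.used(start + timedelta(hours=3)))  # both requests still count
print(window.used(start + timedelta(hours=4)))  # the first request has aged out
```

Because each request expires on its own schedule, budget returns gradually rather than all at once.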

Weekly Window

Your full weekly token budget. Resets every Sunday at midnight UTC. This is the primary budget that determines how much AI you can use in a given week.
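
If you want to compute the next reset yourself, the sketch below shows one way to do it. It assumes "Sunday at midnight UTC" means 00:00 UTC at the start of Sunday; the function name is hypothetical, and the input must be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_reset(now: datetime) -> datetime:
    """Return the next Sunday 00:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= now:
        # Already at or past this Sunday's reset, so move to next week.
        candidate += timedelta(days=7)
    return candidate


wednesday = datetime(2024, 1, 3, 15, 30, tzinfo=timezone.utc)
print(next_weekly_reset(wednesday).isoformat())
```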

How Windows Interact

Both windows must have remaining budget for requests to go through. If either window reaches 100%, new requests are paused until that window has room again: the session window frees up as usage ages out, and the weekly window resets on Sunday.

Example: On the Starter plan with ~250K weekly tokens and ~25K session tokens (Sonnet equivalent):

You use ~25K tokens in an intensive 2-hour research session - session window hits 100%
New requests are paused, but the session window starts freeing up as your oldest usage ages past 4 hours
Your weekly budget still has ~225K tokens remaining

This design prevents a single heavy session from using up your entire week's budget while still allowing sustained usage over time.
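
The Starter example above can be expressed as a simple check that both windows have room. The type and function names here are hypothetical, and the sketch assumes a window counts as exhausted once usage reaches its limit.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    """Hypothetical snapshot of one budget window, in Sonnet-equivalent tokens."""
    used: int
    limit: int

    @property
    def exhausted(self) -> bool:
        return self.used >= self.limit


def can_send(session: WindowState, weekly: WindowState) -> bool:
    """A request goes through only if neither window is exhausted."""
    return not session.exhausted and not weekly.exhausted


# Starter plan after an intensive session that used ~25K tokens.
session = WindowState(used=25_000, limit=25_000)
weekly = WindowState(used=25_000, limit=250_000)

print(can_send(session, weekly))    # False: the session window is at 100%
print(weekly.limit - weekly.used)   # 225000 weekly tokens remain
```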

What Happens When You Hit a Limit

When a budget window reaches 100%:

New AI requests are paused - you'll see a "Limit reached" indicator in the chat header and a banner above the chat input
Your existing conversations and data are unaffected - you can still browse previous chats and access the dashboard
The window resets automatically - session windows free up within hours; the weekly window resets Sunday at midnight UTC
You can upgrade your plan for a higher budget - click the upgrade link in the limit indicator or visit the Billing page

Need more capacity?

If you regularly hit your limit, upgrade to a higher tier for a larger budget and unlimited briefings. There are no add-ons or overage caps to manage - you simply move up a plan from the Billing page.

Plans & Token Budgets

Each plan comes with a monthly token budget that translates to weekly and session allowances. Token amounts shown are Sonnet-equivalent - Haiku-heavy usage (like research tool calls) stretches roughly 6x further.

Plan	Monthly Token Budget	Weekly Tokens	Session Tokens (4hr)
Starter (free to start)	~800K tokens	~250K	~25K
Pro	~3M tokens	~1M	~100K
Teams & Enterprise	~17M tokens (shared)	~5M	~500K

Weekly and session budgets are derived as percentages of the monthly budget (approximately 30% for weekly, 3% for session). This means you can use your full monthly budget if spread evenly, while the windows prevent it from being consumed all at once.
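
As a rough check, the stated percentages can be applied to the monthly budgets from the table. The shares below are the approximate figures quoted above (30% and 3%), so the results land near, not exactly on, the rounded values in the table; the function name is hypothetical.

```python
MONTHLY_BUDGET = {
    "Starter": 800_000,
    "Pro": 3_000_000,
    "Teams & Enterprise": 17_000_000,
}

WEEKLY_PERCENT = 30   # approximate share of the monthly budget
SESSION_PERCENT = 3   # approximate share of the monthly budget


def derive_windows(monthly_tokens: int) -> dict:
    """Derive approximate weekly and session allowances from a monthly budget."""
    return {
        "weekly": monthly_tokens * WEEKLY_PERCENT // 100,
        "session": monthly_tokens * SESSION_PERCENT // 100,
    }


for plan, monthly in MONTHLY_BUDGET.items():
    print(plan, derive_windows(monthly))
```

For Starter this gives 240,000 weekly and 24,000 session tokens, which the table rounds to ~250K and ~25K.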

To put these numbers in context, here's roughly how many queries each plan supports per month:

Plan	Simple lookups (~5K tokens)	Research queries (~30K tokens)
Starter	~160/month	~25/month
Pro	~600/month	~100/month
Teams & Enterprise	~3,400/month (shared)	~550/month (shared)

These are estimates assuming Sonnet-equivalent token costs. Queries using the "Quick" preset (Haiku) stretch roughly 6x further.
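
The estimates in the table come from dividing the monthly budget by a typical query size. A minimal sketch of that arithmetic, using the Sonnet-equivalent figures above (the function name is hypothetical):

```python
def queries_per_month(monthly_tokens: int, tokens_per_query: int) -> int:
    """Whole number of queries of a given size that fit in a monthly budget."""
    return monthly_tokens // tokens_per_query


SIMPLE_LOOKUP = 5_000     # ~5K tokens per simple lookup
RESEARCH_QUERY = 30_000   # ~30K tokens per research query

for plan, monthly in [("Starter", 800_000), ("Pro", 3_000_000), ("Teams & Enterprise", 17_000_000)]:
    print(plan, queries_per_month(monthly, SIMPLE_LOOKUP), queries_per_month(monthly, RESEARCH_QUERY))
```

The research-query results (26 and 566 for Starter and Teams & Enterprise) are slightly above the table's ~25 and ~550, which are rounded down.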

Team plans share the budget across all team members. The larger budget means the team collectively has significantly more capacity, but very heavy individual usage can still affect the shared pool.

Tips for Maximizing Your Budget

Be specific in your questions

Vague questions like "Tell me about healthcare policy" require the AI to search broadly and generate long responses (often 20,000+ tokens). Specific questions like "What committees is the BIOSECURE Act assigned to?" use far fewer tokens (~3,000-5,000).

Use the right preset

If your chat client offers multiple presets (e.g., "Quick" and "Smart"), use "Quick" for simple lookups and save "Smart" for complex analysis that benefits from a more capable model.

Check your usage

Visit the Usage page to see your current budget consumption across both windows. The header indicator shows your usage percentage when it's above 50%.

Leverage tool results

When the AI retrieves data from a tool (like bill status or news articles), the retrieved data is included in context. Follow-up questions about the same data are cheaper than starting a new search.

Viewing Your Usage

You can check your current usage in several places:

Chat header - A usage indicator appears when you've used more than 50% of either window. It shows your overall usage percentage and turns red when approaching the limit.
Usage page - Visit chat.apog.ai/usage for a detailed breakdown of both time windows, including exact percentages and reset times.
Dashboard - Your dashboard usage page shows the same information if you prefer the web dashboard.
Billing page - Visit chat.apog.ai/billing to see your current plan, weekly budget summary, and upgrade options.

Upgrading Your Plan

If you regularly hit your usage limit, consider upgrading:

From Starter to Pro - ~4x the token budget, enough for daily professional use
From Pro to Teams - ~5x the token budget (shared), plus unlimited seats and collaborative features
Enterprise - custom token budget sized to your organization's needs

Visit the Billing page to compare plans and upgrade, or contact sales for Enterprise pricing.

Frequently Asked Questions

Does MCP usage count against my budget?

Yes. Whether you use the hosted chat client or connect via MCP (Claude Desktop, Cursor, etc.), tool calls and AI processing count toward the same budget.

Do data lookups cost anything?

Simple data retrieval (searching bills, fetching news) has minimal token cost. The bulk of usage comes from AI model processing - generating analysis, summaries, and research synthesis.

What happens to my budget if I don't use it?

Unused budget does not roll over. The weekly window resets every Sunday regardless of how much was used. This keeps pricing simple and predictable.

Do digest subscriptions count against my budget?

Yes. Each digest run triggers a research query that consumes tokens from your regular budget. If your budget is exhausted when a digest is scheduled, it will be skipped and you'll be notified. A typical digest uses 5,000-20,000 tokens per run depending on query complexity.

Why does a long conversation cost more over time?

Each message in a conversation includes the prior messages as context so the AI can maintain coherence. As the conversation grows, each new message sends more input tokens. Starting a new conversation resets this context and reduces per-message token usage.
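
The growth can be illustrated with a small model in which every new message resends the full history. This is a simplified sketch: the function name and token figures are hypothetical, and it ignores system prompts and tool results, which also add to context.

```python
def input_tokens_per_turn(turns):
    """Given (user_tokens, reply_tokens) per turn, return input tokens sent on each turn."""
    history = 0
    per_turn = []
    for user_tokens, reply_tokens in turns:
        # Input for this turn is all prior messages plus the new user message.
        per_turn.append(history + user_tokens)
        history += user_tokens + reply_tokens
    return per_turn


# Three turns, each with a 200-token question and an 800-token reply.
print(input_tokens_per_turn([(200, 800)] * 3))  # [200, 1200, 2200]
```

In this model the third question costs eleven times the input of the first, even though the question itself is the same size.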

Can I see exactly what used my budget?

The usage page shows your overall consumption percentage. Detailed per-request token breakdowns are not currently exposed in the UI, but usage is tracked internally and reflected in your window percentages in real time.