How Usage & Budgets Work
Apogee tracks your AI usage based on the tokens processed by the models that power your requests. This page explains how budgets work, what affects your usage, and how to get the most out of your plan.
How Usage Is Measured
Every time you interact with Apogee - whether through the chat client, via MCP, or through a scheduled digest - the underlying AI model processes tokens. Tokens are small units of text that represent words, parts of words, or punctuation. A typical English sentence is roughly 15-25 tokens.
Your usage is calculated based on two factors:
- Number of tokens processed - both the input (your question plus context) and the output (the AI's response)
- Cost of the model used - more capable models cost more per token
Input and output tokens are metered separately because output tokens cost significantly more than input tokens on every model tier.
What Counts as Usage
Different activities consume different amounts of your token budget:
- Short factual lookups (e.g., "What's the status of HR 1234?") typically use ~3,000-5,000 tokens total
- Detailed analysis (e.g., "Compare these two bills and summarize the differences") might use ~10,000-20,000 tokens because the model processes more text
- Research queries that search multiple data sources can use 20,000-50,000+ tokens across multiple internal tool calls - the agent may call several tools in sequence, each adding tokens
- Follow-up questions in the same conversation include prior messages as context, so longer conversations gradually use more input tokens per message
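The effect of conversation length on input tokens can be illustrated with a small sketch. It assumes that every earlier message and reply is resent as context on each turn and ignores system prompts and tool results; the function name and the token sizes are illustrative only, not Apogee's actual accounting.

```python
def input_tokens_per_turn(message_tokens, reply_tokens):
    """Return the input tokens sent on each turn when prior turns are resent as context.

    message_tokens[i] and reply_tokens[i] are hypothetical sizes for turn i.
    """
    history = 0          # tokens accumulated from earlier messages and replies
    per_turn = []
    for msg, reply in zip(message_tokens, reply_tokens):
        per_turn.append(history + msg)   # input = prior context + new message
        history += msg + reply           # both sides join the context for the next turn
    return per_turn


# Five turns, each a 200-token question with an 800-token answer (illustrative sizes).
turns = input_tokens_per_turn([200] * 5, [800] * 5)
print(turns)        # [200, 1200, 2200, 3200, 4200]
print(sum(turns))   # 11000
```

Under these assumptions the fifth message sends about 21 times as many input tokens as the first, even though each question is the same size.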
Model Pricing
Apogee uses multiple AI models with different capabilities and costs. The model used depends on the task complexity and your chat preset selection:
| Model tier | Relative token cost | When it's used |
|---|---|---|
| Haiku | ~1/6 the cost of Sonnet | Quick lookups, data retrieval, internal research tool calls |
| Sonnet | Mid-range (reference tier) | Most chat conversations, analysis, the "Smart" preset |
| Opus | ~5x more than Sonnet | Complex reasoning, deep analysis (when available) |
You don't need to choose a model manually - the system selects the right one based on the task. In the chat client, different presets (e.g., "Quick" vs. "Smart") use different default models, and the research tool uses Haiku for efficiency with automatic escalation to Sonnet for complex queries.
Practical example: A quick question like "Who sponsors HR 5678?" typically uses ~3,000-5,000 tokens (Haiku). A detailed research request like "Find all AI regulation bills this Congress and summarize their approaches" might use 30,000-50,000 tokens across multiple tool calls depending on how many sources are searched.
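As a rough sketch of how the tier ratios translate into Sonnet-equivalent tokens, the snippet below applies the approximate multipliers from the table. The names are hypothetical, and the sketch ignores the input/output price difference, so treat the results as ballpark figures rather than Apogee's actual metering formula.

```python
# Relative cost per token versus Sonnet, using the approximate ratios from the table.
TIER_MULTIPLIER = {
    "haiku": 1 / 6,   # about one sixth of Sonnet's cost
    "sonnet": 1.0,    # reference tier
    "opus": 5.0,      # about five times Sonnet's cost
}


def sonnet_equivalent(tokens, tier):
    """Convert raw tokens on a given tier into approximate Sonnet-equivalent tokens."""
    return tokens * TIER_MULTIPLIER[tier]


print(round(sonnet_equivalent(30_000, "haiku")))   # 5000
print(round(sonnet_equivalent(30_000, "sonnet")))  # 30000
print(round(sonnet_equivalent(30_000, "opus")))    # 150000
```

In other words, a 30,000-token research run handled entirely by Haiku would weigh on the budget about as much as a 5,000-token Sonnet lookup.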
Where Usage Comes From
Your budget is shared across all the ways you interact with Apogee. Every source draws from the same session and weekly windows.
Chat Client
When you send a message in the chat client, the AI model processes your message and generates a response. Both the input tokens (your message plus conversation history) and output tokens (the AI's reply) are metered. If the AI calls research tools during a chat response, those internal tool calls are metered separately as MCP usage, but they draw from the same budget.
MCP Connections
If you connect to Apogee via MCP (through Claude Desktop, Cursor, or another MCP client), tool calls and AI processing count toward the same budget as the chat client. There's no separate allocation - all usage is pooled.
Digest Subscriptions
Digest subscriptions (scheduled monitoring alerts) consume budget when they run. Each digest execution triggers a research query on your behalf, using tokens to search data sources and generate a summary. Digests run on a schedule (daily or weekly) and draw from your regular budget windows.
If your budget is exhausted when a digest is scheduled to run, the digest is skipped for that cycle and you'll receive a notification. It will attempt to run again at the next scheduled time.
Tip: If you're on the Starter plan and use digests, keep in mind that each daily digest typically uses 5,000-20,000 tokens depending on query complexity. Over a 30-day month, a single daily digest adds up to roughly 150K-600K tokens, a substantial share of the Starter plan's ~800K monthly token budget.
Budget Windows
Your plan's token budget is divided into time windows to ensure steady availability throughout the week. There are two windows:
Session Window (4 hours, rolling)
A continuously rolling 4-hour window that prevents burst usage from exhausting your entire weekly budget in a single sitting. Unlike the weekly window, the session window has no fixed reset time - it moves forward continuously. Each request ages out of the window exactly 4 hours after it was made, gradually freeing up session budget over time.
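One way to picture a rolling window is a queue of timestamped requests from which entries drop out once they are 4 hours old. The class below is a hypothetical illustration of that idea, not Apogee's implementation; it assumes requests are recorded in chronological order and uses the Starter session allowance as an example limit.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=4)


class SessionWindow:
    """Hypothetical rolling-window tracker; not Apogee's actual implementation."""

    def __init__(self, limit_tokens):
        self.limit_tokens = limit_tokens
        self.events = deque()   # (timestamp, tokens), oldest first

    def record(self, when, tokens):
        self.events.append((when, tokens))

    def used(self, now):
        # Drop requests made 4 hours ago or earlier; they have aged out.
        while self.events and now - self.events[0][0] >= WINDOW:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def remaining(self, now):
        return max(0, self.limit_tokens - self.used(now))


start = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
window = SessionWindow(limit_tokens=25_000)      # Starter session allowance
window.record(start, 15_000)
window.record(start + timedelta(hours=2), 10_000)

print(window.remaining(start + timedelta(hours=3)))  # 0
print(window.remaining(start + timedelta(hours=4)))  # 15000
print(window.remaining(start + timedelta(hours=6)))  # 25000
```

The session budget comes back in steps that mirror when the original requests were made, rather than all at once.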
Weekly Window
Your full weekly token budget. Resets every Sunday at midnight UTC. This is the primary budget that determines how much you can use Apogee's AI features in a given week.
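If you want to work out when the weekly window next resets, a calculation along these lines may help. It assumes "Sunday at midnight UTC" means 00:00 UTC at the start of Sunday, and the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_reset(now):
    """Return the next Sunday 00:00 UTC strictly after `now` (assumed reset instant).

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Monday is weekday() == 0 and Sunday is 6, so this yields 1..7 days ahead.
    days_ahead = 6 - midnight.weekday()
    if days_ahead <= 0:
        days_ahead += 7
    return midnight + timedelta(days=days_ahead)


# Wednesday 8 January 2025, 15:30 UTC
print(next_weekly_reset(datetime(2025, 1, 8, 15, 30, tzinfo=timezone.utc)))
# 2025-01-12 00:00:00+00:00
```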
How Windows Interact
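Both windows must have remaining budget for requests to go through. If either window reaches 100%, new requests are paused until that window has room again: the session window frees up as older usage ages out, and the weekly window resets on Sunday.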
Example: On the Starter plan with ~250K weekly tokens and ~25K session tokens (Sonnet equivalent):
- You use ~25K tokens in an intensive 2-hour research session - session window hits 100%
- New requests are paused, but the session window starts freeing up as your oldest usage ages past 4 hours
- Your weekly budget still has ~225K tokens remaining
This design prevents a single heavy session from using up your entire week's budget while still allowing sustained usage over time.
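The two-window rule can be expressed as a simple check. This is a minimal sketch with hypothetical names, using the Starter figures from the example above.

```python
def can_send(session_used, session_limit, weekly_used, weekly_limit):
    """Return (allowed, blocking_window) for a new request.

    A request is paused as soon as either window is at or above 100%.
    """
    if session_used >= session_limit:
        return False, "session"
    if weekly_used >= weekly_limit:
        return False, "weekly"
    return True, None


# Starter example: session window full, weekly window mostly unused.
print(can_send(25_000, 25_000, 25_000, 250_000))   # (False, 'session')
# Later, after the earliest usage has aged past 4 hours.
print(can_send(10_000, 25_000, 25_000, 250_000))   # (True, None)
```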
What Happens When You Hit a Limit
When a budget window reaches 100%:
- New AI requests are paused - you'll see a "Limit reached" indicator in the chat header and a banner above the chat input
- Your existing conversations and data are unaffected - you can still browse previous chats and access the dashboard
- The window resets automatically - session windows free up within hours; the weekly window resets Sunday at midnight UTC
- You can upgrade your plan for a higher budget - click the upgrade link in the limit indicator or visit the Billing page
Overage (Paid Plans)
If you're on a paid plan, you can enable overage to continue using Apogee beyond your weekly budget. When overage is enabled:
- Requests continue to be processed even after your weekly budget is exhausted
- Overage usage accumulates and is billed at the end of your billing cycle
- You can set a spending cap to limit how much overage you're willing to incur
- You can enable, disable, or adjust your overage cap from the Billing page
Overage is not available on the Starter (free) plan.
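The overage rules can be summarized as a decision function. The sketch below is an interpretation, not Apogee's billing logic: all parameter names are hypothetical, it assumes requests pause again once the spending cap is reached, and it leaves the session window out of the picture.

```python
def request_allowed(weekly_used, weekly_limit, paid_plan,
                    overage_enabled=False, overage_spent=0.0, spending_cap=None):
    """Decide whether a request proceeds once the weekly budget is considered.

    All parameter names are illustrative. overage_spent and spending_cap are in
    the same currency; spending_cap=None means no cap has been set.
    """
    if weekly_used < weekly_limit:
        return True                      # still inside the weekly budget
    if not (paid_plan and overage_enabled):
        return False                     # Starter, or overage switched off
    if spending_cap is not None and overage_spent >= spending_cap:
        return False                     # assumed: cap reached, so requests pause
    return True                          # billed as overage at end of cycle


# Individual plan, weekly budget exhausted, overage on with a cap of 20.
print(request_allowed(1_000_000, 1_000_000, paid_plan=True,
                      overage_enabled=True, overage_spent=5.0, spending_cap=20.0))   # True
print(request_allowed(1_000_000, 1_000_000, paid_plan=True,
                      overage_enabled=True, overage_spent=20.0, spending_cap=20.0))  # False
```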
Plans & Token Budgets
Each plan comes with a monthly token budget that translates to weekly and session allowances. Token amounts shown are Sonnet-equivalent - Haiku-heavy usage (like research tool calls) stretches roughly 6x further.
| Plan | Monthly Token Budget | Weekly Tokens | Session Tokens (4hr) |
|---|---|---|---|
| Starter (Free) | ~800K tokens | ~250K | ~25K |
| Individual | ~3M tokens | ~1M | ~100K |
| Team | ~17M tokens (shared) | ~5M | ~500K |
| Enterprise | Custom | Custom | Custom |
Weekly and session budgets are derived as percentages of the monthly budget (approximately 30% for weekly, 3% for session). This means you can use your full monthly budget if your usage is spread evenly, while the windows prevent it from being consumed all at once.
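The percentages can be checked against the plan table with a few lines of arithmetic. The constants below use the approximate 30% and 3% shares; the function name is illustrative, and the published table values are rounded, so the results differ slightly.

```python
WEEKLY_PERCENT = 30    # approximate share of the monthly budget
SESSION_PERCENT = 3    # approximate share of the monthly budget


def derive_windows(monthly_tokens):
    """Return (weekly, session) allowances derived from a monthly budget."""
    weekly = monthly_tokens * WEEKLY_PERCENT // 100
    session = monthly_tokens * SESSION_PERCENT // 100
    return weekly, session


for plan, monthly in [("Starter", 800_000), ("Individual", 3_000_000), ("Team", 17_000_000)]:
    weekly, session = derive_windows(monthly)
    print(f"{plan}: weekly ~{weekly:,}, session ~{session:,}")
# Starter: weekly ~240,000, session ~24,000
# Individual: weekly ~900,000, session ~90,000
# Team: weekly ~5,100,000, session ~510,000
```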
To put these numbers in context, here's roughly how many queries each plan supports per month:
| Plan | Simple lookups (~5K tokens) | Research queries (~30K tokens) |
|---|---|---|
| Starter | ~160/month | ~25/month |
| Individual | ~600/month | ~100/month |
| Team | ~3,400/month (shared) | ~550/month (shared) |
These are estimates assuming Sonnet-equivalent token costs. Queries using the "Quick" preset (Haiku) stretch roughly 6x further.
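The estimates in the table follow from dividing the monthly budget by a typical query size. Here is a minimal sketch of that arithmetic, with a hypothetical `tier_stretch` parameter standing in for the rough 6x Haiku factor.

```python
def queries_per_month(monthly_tokens, tokens_per_query, tier_stretch=1):
    """Estimate how many queries fit in a monthly budget.

    tier_stretch is how much further the budget goes on a cheaper tier
    (1 for Sonnet-equivalent, roughly 6 for Haiku).
    """
    return monthly_tokens * tier_stretch // tokens_per_query


print(queries_per_month(800_000, 5_000))                   # 160
print(queries_per_month(800_000, 30_000))                  # 26
print(queries_per_month(800_000, 5_000, tier_stretch=6))   # 960
```

Real workloads mix query types and tiers, so actual counts will fall somewhere between these figures.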
Team plans share the budget across all team members. The larger budget means the team collectively has significantly more capacity, but very heavy individual usage can still affect the shared pool.
Tips for Maximizing Your Budget
Be specific in your questions
Vague questions like "Tell me about healthcare policy" require the AI to search broadly and generate long responses (often 20,000+ tokens). Specific questions like "What committees is the BIOSECURE Act assigned to?" use far fewer tokens (~3,000-5,000).
Use the right preset
If your chat client offers multiple presets (e.g., "Quick" and "Smart"), use "Quick" for simple lookups and save "Smart" for complex analysis that benefits from a more capable model.
Check your usage
Visit the Usage page to see your current budget consumption across both windows. The header indicator shows your usage percentage once you've used more than 50% of any window.
Leverage tool results
When the AI retrieves data from a tool (like bill status or news articles), the retrieved data is included in context. Follow-up questions about the same data are cheaper than starting a new search.
Viewing Your Usage
You can check your current usage in several places:
- Chat header - A usage indicator appears when you've used more than 50% of any window. It shows your overall usage percentage and turns red when approaching the limit.
- Usage page - Visit chat.apog.ai/usage for a detailed breakdown of both time windows, including exact percentages and reset times.
- Dashboard - The usage page in the web dashboard shows the same information.
- Billing page - Visit chat.apog.ai/billing to see your current plan, weekly budget summary, and upgrade options.
Upgrading Your Plan
If you regularly hit your usage limit, consider upgrading:
- From Starter to Individual - ~4x the token budget, enough for daily professional use
- From Individual to Team - ~5x the token budget (shared), plus unlimited seats and collaborative features
- Enterprise - custom token budget sized to your organization's needs
Visit the Billing page to compare plans and upgrade, or contact sales for Enterprise pricing.
Frequently Asked Questions
Does MCP usage count against my budget?
Yes. Whether you use the hosted chat client or connect via MCP (Claude Desktop, Cursor, etc.), tool calls and AI processing count toward the same budget.
Do data lookups cost anything?
Simple data retrieval (searching bills, fetching news) has minimal token cost. The bulk of usage comes from AI model processing - generating analysis, summaries, and research synthesis.
What happens to my budget if I don't use it?
Unused budget does not roll over. The weekly window resets every Sunday regardless of how much was used. This keeps pricing simple and predictable.
Do digest subscriptions count against my budget?
Yes. Each digest run triggers a research query that consumes tokens from your regular budget. If your budget is exhausted when a digest is scheduled, it will be skipped and you'll be notified. A typical digest uses 5,000-20,000 tokens per run depending on query complexity.
Why does a long conversation cost more over time?
Each message in a conversation includes the prior messages as context so the AI can maintain coherence. As the conversation grows, each new message sends more input tokens. Starting a new conversation resets this context and reduces per-message token usage.
Can I see exactly what used my budget?
The usage page shows your overall consumption percentage. Detailed per-request token breakdowns are not currently exposed in the UI, but usage is tracked internally and reflected in your window percentages in real time.