The honest answer to "how much server does my AI agent need?" is: less than you think — right up until it suddenly isn't. The trick is knowing which side of that line your workload sits on. So instead of guessing, here's what each kind of agent actually uses.
The one question that decides everything
Does the model run on your server, or does your agent call a model over an API?
If your agent talks to a hosted LLM (the common case), the smart, expensive part happens on someone else's hardware. Your box just runs an orchestration loop: take input, call the API, parse the result, maybe hit a database, repeat. That's light: the process spends most of its time waiting on the network, and a 1 GB box does it without breaking a sweat.
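To make the loop concrete, here is a minimal sketch in Python. The names `run_agent` and `fake_model` are hypothetical, and `fake_model` is a stub standing in for whatever hosted LLM client you actually use; the list stands in for a database.

```python
import json
from typing import Callable, Iterable


def run_agent(inputs: Iterable[str], call_model: Callable[[str], str], store: list) -> None:
    """Orchestration loop: take input, call the model, parse the result, store it."""
    for user_input in inputs:
        raw = call_model(user_input)      # network call; the heavy compute is remote
        try:
            result = json.loads(raw)      # parse the model's reply
        except json.JSONDecodeError:
            result = {"text": raw}        # fall back to treating the reply as plain text
        store.append(result)              # stand-in for a database write


def fake_model(prompt: str) -> str:
    # Hypothetical stub standing in for a hosted LLM API call.
    return json.dumps({"text": f"echo: {prompt}"})


store: list = []
run_agent(["hello", "status?"], fake_model, store)
```

Nothing in the loop holds more than one request and one response in memory at a time, which is why the footprint stays small.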
If you run the model locally, everything changes: now RAM is dominated by the model weights, and you'll want every core you can get. Most people don't need this. The ones who do usually know exactly why (privacy, no rate limits, offline). If that's you, we wrote a separate guide on self-hosting a local LLM.
Sizing by workload
Real numbers, the kind you can act on:
| Workload | RAM | vCPU | Disk | Notes |
|---|---|---|---|---|
| Chat / assistant agent (API-backed) | 1 GB | 2 | 15 GB | The loop is tiny; the model is remote |
| Scraper / data agent | 2 GB | 2 | 25 GB | Headroom for parsing + scraped data |
| Trading bot | 1–2 GB | 2 | 15–25 GB | Latency matters more than size — see the trading bot guide |
| Several agents in parallel | 4 GB | 4 | 35 GB | Each agent is cheap; concurrency adds up |
| Local LLM, 3B–7B (quantized) | 4–6 GB | 4–6 | 25–45 GB | CPU-only; generates at roughly reading speed, not suited to bulk jobs |
The pattern: API-backed agents are tiny; local models are the only thing that's heavy.
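If you would rather measure than trust a table, a rough check is to read the peak memory of the agent process after it has done some representative work. The sketch below uses Python's standard `resource` module, which is Unix-only; the `peak_rss_mb` helper and the stand-in workload are illustrative, not part of any particular agent framework.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Peak resident memory of this process so far, in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


# Run a representative chunk of the agent's work first, then read the peak.
data = [str(i) for i in range(100_000)]  # stand-in workload
print(f"peak RSS: {peak_rss_mb():.0f} MB")
```

Leave headroom above the measured peak for the operating system and anything else running on the box.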
Where a CPU box stops
Being straight about the ceiling: our plans top out at 6 GB RAM and 6 cores, CPU-only — no GPU. That covers everything in the table above, including a 7B local model for personal use. What it does not cover: 13B+ models, high-throughput local inference, or anything that genuinely needs a GPU. If that's your workload, a CPU VPS — ours or anyone's — is the wrong tool, and it's better to know that now than after you've deployed.
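A back-of-the-envelope check shows where that line falls. The sketch below estimates the RAM taken by model weights as parameters times bits per weight, then adds a flat allowance for the runtime, context cache, and OS. The 1 GB allowance is an assumption, and real quantized formats use somewhat more than their nominal bit width, so treat the result as a lower bound.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate RAM for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int,
         ram_gb: float = 6.0, overhead_gb: float = 1.0) -> bool:
    """True if weights plus an assumed flat overhead fit in ram_gb.

    ram_gb defaults to the 6 GB plan ceiling; overhead_gb is a rough guess.
    """
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= ram_gb


print(weights_gb(7, 4), fits(7, 4))    # 7B at 4-bit
print(weights_gb(13, 4), fits(13, 4))  # 13B at 4-bit
print(weights_gb(7, 16), fits(7, 16))  # 7B unquantized (16-bit)
```

Under these assumptions a 4-bit 7B model needs about 3.5 GB for weights and fits, while a 4-bit 13B model needs about 6.5 GB for weights alone and does not.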
For the large majority of agents, the ones that call an API, none of this is a problem. They're happy on the smallest box.
A practical rule of thumb
- Just an agent that calls an API? Start at 1 GB / 2 cores. You can resize later.
- Scraping or storing data? 2 GB and a bigger disk.
- Running several agents, or a small local model? 4–6 GB and 4+ cores.
- Needs a GPU? Different category — don't force it onto CPU.
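The same rule of thumb can be written as a small function. This is only a sketch of the list above: the `Size` type and `recommend` function are hypothetical names, and for the 4–6 GB case it returns the low end of the range as a starting point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Size:
    ram_gb: int
    vcpu: int


def recommend(needs_gpu: bool = False, local_model: bool = False,
              agents: int = 1, stores_data: bool = False) -> Optional[Size]:
    """Starting size for an agent box, or None if a CPU box is the wrong tool."""
    if needs_gpu:
        return None              # different category; don't force it onto CPU
    if local_model or agents > 1:
        return Size(4, 4)        # low end of 4-6 GB and 4+ cores
    if stores_data:
        return Size(2, 2)        # plus a bigger disk
    return Size(1, 2)            # plain API-backed agent


print(recommend())
print(recommend(stores_data=True))
print(recommend(agents=3))
print(recommend(needs_gpu=True))
```

The checks run from the heaviest requirement down, so a scraper that also runs a local model gets the larger size.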
Don't over-provision out of nervousness. Agents are lightweight by nature; the cost of a too-small box is one resize, while the cost of a too-big box is paying for idle RAM every month.
When you've picked a size, an agent can rent the box itself over MCP, or you can order it in a minute on the site. Either way, start small — you'll almost certainly need less than you expected.