You launched a t3a.medium with "2 vCPUs" but you're not getting 2 CPUs. Here's what you're actually paying for.
The Misconception
You go to the AWS console, launch a t3a.medium, and see this in the spec:
| Spec | Value |
|---|---|
| vCPUs | 2 |
| Memory | 4 GiB |
| Price | ~$0.047/hr |
Most engineers assume they're getting 2 full CPU cores, always available, for $0.047/hr. That's not what's happening.
What the "T" in T3 Means
AWS has several instance families:
| Family | Type | CPU Model | Example |
|---|---|---|---|
| T3/T3a/T4g | Burstable | Shared, credit-based | t3a.medium |
| M5/M6i/M7i | General purpose | Dedicated | m5.large |
| C5/C6i/C7i | Compute optimized | Dedicated | c5.large |
The "T" stands for burstable. When you buy a T-series instance, you're not buying dedicated CPU cores. You're buying a fraction of a CPU with the ability to temporarily use more.
A t3a.medium gives you 20% of each vCPU as a baseline — meaning you can continuously use 0.4 vCPUs (20% x 2). The other 80% is available on-demand, but only if you have CPU credits to spend.
Why is it cheaper?
This is the deal AWS offers: because most workloads don't use 100% CPU all the time, AWS can pack ~5 burstable instances onto the same physical hardware that would serve 1 dedicated instance. You get a discount; AWS gets better hardware utilization.
t3a.medium: 2 vCPUs (burstable, 20% baseline) → ~$0.047/hr
m5.large: 2 vCPUs (dedicated, 100% always) → ~$0.096/hr
The m5.large costs 2x more because those CPUs are reserved for you, always.
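Another way to compare the two is price per *sustained* vCPU-hour: divide the hourly price by the vCPUs you can actually use continuously. A quick sketch using the approximate prices above (on-demand rates vary by region):

```python
# Price per sustained vCPU-hour. A t3a.medium can only sustain its
# baseline (20% x 2 = 0.4 vCPUs); an m5.large sustains all 2 vCPUs.

def cost_per_sustained_vcpu(price_per_hr: float, sustained_vcpus: float) -> float:
    return price_per_hr / sustained_vcpus

t3a_medium = cost_per_sustained_vcpu(0.047, 0.4)   # burstable baseline
m5_large   = cost_per_sustained_vcpu(0.096, 2.0)   # dedicated

print(f"t3a.medium: ${t3a_medium:.4f} per sustained vCPU-hr")  # $0.1175
print(f"m5.large:   ${m5_large:.4f} per sustained vCPU-hr")    # $0.0480
```

Measured per guaranteed vCPU, the "cheap" burstable instance is more than twice the price. It only wins if your workload genuinely stays near or below the baseline.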
How CPU Credits Work
The credit system is how AWS meters your burst usage.
The Basic Math
1 CPU credit = 1 vCPU running at 100% for 1 minute
A t3a.medium:
- Earns: 24 credits per hour (12 per vCPU x 2 vCPUs)
- Baseline: 20% per vCPU (this is what 24 credits/hr translates to)
- Maximum balance: 576 credits (can bank up to 24 hours worth)
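These numbers follow directly from the definition of a credit (one vCPU-minute at 100%): a baseline fraction b across n vCPUs earns b × n × 60 credits per hour, and the bank caps at 24 hours of earnings. A minimal sketch:

```python
# Derive the earn rate and maximum bank from the baseline definition.
# 1 credit = 1 vCPU at 100% for 1 minute.

def credit_profile(baseline: float, vcpus: int):
    earn_per_hour = baseline * vcpus * 60   # credits earned per hour
    max_balance = earn_per_hour * 24        # bank caps at 24 hours' worth
    return earn_per_hour, max_balance

earn, cap = credit_profile(baseline=0.20, vcpus=2)  # t3a.medium
print(earn, cap)  # 24.0 576.0
```

The same formula reproduces the whole T3a family, e.g. a t3a.xlarge (40% baseline, 4 vCPUs) earns 96 credits/hr with a 2304-credit cap.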
A Real-World Example
Say you're running a Kubernetes node with a service that normally uses 0.01 CPU (1% of one core). That's well under the 0.4 baseline:
Earning: 24 credits/hour
Spending: ~0.6 credits/hour (0.01 vCPU × 60 minutes = 0.6 credits)
Net: +23.4 credits/hour accumulating
Max balance: 576 credits
Your credit balance slowly fills up over 24 hours. Life is good.
Now imagine a traffic spike hits and the node needs full CPU:
Hour 1: 2.0 vCPUs used (100%) → spends 120 credits, earns 24, net −96
Hour 2: 2.0 vCPUs used (100%) → spends 120 credits, earns 24, net −96
Hour 3: 2.0 vCPUs used (100%) → spends 120 credits, earns 24, net −96
Hour 4: 2.0 vCPUs used (100%) → spends 120 credits, earns 24, net −96
...
After 6 hours at full CPU (576 ÷ 96) → credits exhausted
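Note that the instance keeps earning its 24 credits/hr even while bursting, so the net drain at 100% CPU is 120 − 24 = 96 credits per hour. You can simulate the balance hour by hour:

```python
# Hour-by-hour credit balance for a t3a.medium pinned at 100% CPU.
EARN_PER_HOUR = 24.0       # baseline earnings (20% x 2 vCPUs x 60 min)
SPEND_AT_FULL = 2 * 60.0   # 2 vCPUs at 100% = 120 credits/hour

balance = 576.0            # start from a full bank
hours = 0
while balance > 0:
    balance += EARN_PER_HOUR - SPEND_AT_FULL   # net -96 per hour
    hours += 1

print(hours)  # 6 -> a full bank lasts 6 hours of sustained 100% CPU
```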
The Prepaid Data Plan Analogy
Think of CPU credits like a prepaid mobile data plan:
- You get 1 GB/day at 4G speed (full CPU)
- After 1 GB is used up, you're throttled to 2G speed (20% baseline)
- You can still use the internet, but everything is painfully slow
- Next day, your quota starts accumulating again (the one difference: CPU credits accrue continuously rather than resetting daily)
What Happens When Credits Hit Zero
This is where things get serious.
With credits: 2.0 vCPUs available at full speed
Without credits: 2.0 vCPUs CAPPED at 20% → effectively 0.4 vCPUs
The AWS hypervisor literally limits how many CPU cycles your instance can execute. Your instance still shows 2 vCPUs, but each one can only do 20% of the work.
Impact on Kubernetes
On a Kubernetes node throttled to 0.4 vCPUs, everything competes for scraps:
kubelet → needs CPU for heartbeats every 10s
kube-proxy → needs CPU for network rules
containerd → container runtime
OS processes → systemd, journald, etc.
Your application → the thing you actually care about
If the kubelet can't send a heartbeat to the API server within 40 seconds (the default node-monitor-grace-period), the API server marks the node as NodeNotReady and starts evicting pods. Your application goes down — not because it was using too much CPU, but because the underlying node was throttled.
T3 Unlimited Mode
AWS offers a way out: T3 Unlimited mode.
```bash
# Check current mode
aws ec2 describe-instance-credit-specifications \
  --instance-ids <instance-id>

# Enable unlimited mode
aws ec2 modify-instance-credit-specification \
  --instance-credit-specification InstanceId=<id>,CpuCredits=unlimited
```
With Unlimited mode:
- Your instance never gets throttled
- When credits are exhausted, you keep bursting at full speed
- You pay a small surcharge for "surplus credits" (~$0.05 per vCPU-hour on t3a)
When Unlimited Mode Costs Extra
You only pay extra when:
- Your earned credits are exhausted, AND
- You're using more than baseline (20%)
If your average usage is below 20%, Unlimited mode costs nothing extra — you earn enough credits to cover the occasional burst.
Average 10% usage: Free — credits cover all bursts
Average 20% usage: Free — exactly at baseline
Average 50% usage: Extra cost — 0.6 surplus vCPU-hrs/hr × $0.05 ≈ $0.03/hr
Average 100% usage: Expensive — just use a dedicated instance
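The surplus charge is a simple formula: once earned credits are gone, you pay for every vCPU-hour of usage above the baseline. A sketch, assuming the ~$0.05/vCPU-hr t3a rate quoted above:

```python
# Steady-state hourly surplus charge in Unlimited mode, assuming earned
# credits are already exhausted. At or below baseline there is no charge.
def surplus_cost_per_hour(avg_util: float, baseline: float = 0.20,
                          vcpus: int = 2, rate: float = 0.05) -> float:
    surplus_vcpu_hours = max(0.0, avg_util - baseline) * vcpus
    return round(surplus_vcpu_hours * rate, 4)

print(surplus_cost_per_hour(0.10))  # 0.0  -> credits cover it
print(surplus_cost_per_hour(0.50))  # 0.03 -> 0.6 surplus vCPU-hr x $0.05
print(surplus_cost_per_hour(1.00))  # 0.08 -> nearly doubles the $0.047 base price
```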
Credit Balance: How to Check and What to Look For
Via CloudWatch
```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUCreditBalance \
  --dimensions Name=InstanceId,Value=<instance-id> \
  --start-time $(date -u -d '6 hours ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 --statistics Average
```
Key Metrics to Monitor
| Metric | What It Means | Alert When |
|---|---|---|
| CPUCreditBalance | Earned credits remaining | Drops below 50 |
| CPUSurplusCreditBalance | Surplus credits used (Unlimited mode) | Consistently above 0 |
| CPUSurplusCreditsCharged | Surplus credits you're paying for | Unexpected charges |
| CPUCreditUsage | Credits spent in the period | Sustained high usage |
Reading the Credit Balance
576 credits → Full (24 hours of baseline earned)
200 credits → Healthy — some bursting happening
50 credits → Warning — approaching exhaustion
0 credits → Standard mode: THROTTLED / Unlimited mode: paying surplus
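These thresholds are easy to turn into an alerting rule. A minimal sketch mirroring the table above (the 50-credit warning level is this post's suggestion, not an AWS default):

```python
# Map a CPUCreditBalance reading to a status for alerting.
def credit_status(balance: float, max_balance: float = 576.0,
                  warn_below: float = 50.0) -> str:
    if balance <= 0:
        return "exhausted"   # Standard: throttled / Unlimited: paying surplus
    if balance <= warn_below:
        return "warning"     # approaching exhaustion
    if balance >= max_balance:
        return "full"        # 24 hours of baseline banked
    return "healthy"         # some bursting happening

for b in (576, 200, 50, 0):
    print(b, credit_status(b))
```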
Instance Comparison: When to Use What
| Scenario | Recommended | Why |
|---|---|---|
| Dev/staging environments | t3a.medium | Low baseline usage, cost-effective |
| Kubernetes worker nodes (production) | m5.large or m6i.large | Predictable performance, no throttling risk |
| CI/CD build agents | t3a.xlarge with Unlimited | Burst during builds, idle otherwise |
| Databases | m5/r5 series | Never throttle a database |
| Batch processing | c5/c6i series | Sustained compute needs dedicated CPU |
| Steady workloads that can't tolerate throttling | m5.large over t3a.medium | Same vCPU count, guaranteed performance, ~2x the cost |
The Hidden Cost of Burstable
A t3a.medium at $0.047/hr seems cheaper than an m5.large at $0.096/hr. But consider:
- When a t3a node gets throttled and your pod gets evicted, what's the cost of that downtime?
- When you spend 3 hours debugging why a pod keeps dying, what's the engineering cost?
- If you enable Unlimited and burst frequently, the surplus charges can approach dedicated instance pricing anyway
For production Kubernetes nodes, the small extra cost of dedicated instances often pays for itself in reliability and reduced debugging time.
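You can also compute where the burstable discount disappears entirely. Using the approximate prices quoted in this post, solve for the average utilization at which a t3a.medium in Unlimited mode costs the same as an m5.large:

```python
# Break-even average utilization: t3a.medium (Unlimited) vs m5.large.
#   t3a cost = 0.047 + (u - 0.20) * 2 vCPUs * $0.05   (for u above baseline)
#   m5 cost  = 0.096
# Solve 0.047 + (u - 0.2) * 0.10 = 0.096 for u.

T3A_PRICE, M5_PRICE = 0.047, 0.096
BASELINE, VCPUS, SURPLUS_RATE = 0.20, 2, 0.05

break_even = BASELINE + (M5_PRICE - T3A_PRICE) / (VCPUS * SURPLUS_RATE)
print(f"{break_even:.0%}")  # 69%
```

Above roughly 69% sustained utilization, the "cheap" burstable instance already costs as much as the dedicated one, without the guaranteed performance.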
Quick Reference: T3/T3a Instance Family
| Instance | vCPUs | RAM | Baseline/vCPU | Credits/hr | Max Balance | Price/hr (Mumbai) |
|---|---|---|---|---|---|---|
| t3a.micro | 2 | 1 GiB | 10% | 12 | 288 | ~$0.012 |
| t3a.small | 2 | 2 GiB | 20% | 24 | 576 | ~$0.024 |
| t3a.medium | 2 | 4 GiB | 20% | 24 | 576 | ~$0.047 |
| t3a.large | 2 | 8 GiB | 30% | 36 | 864 | ~$0.075 |
| t3a.xlarge | 4 | 16 GiB | 40% | 96 | 2304 | ~$0.150 |
Note: Baseline percentages are per vCPU. A t3a.medium with 20% baseline on 2 vCPUs gives you 0.4 vCPUs of sustained compute.
Key Takeaways
T-series instances are not dedicated compute. The "2 vCPUs" you see is the burst ceiling, not the sustained capacity. Your sustained capacity is the baseline percentage.
CPU credit exhaustion causes throttling, not failure. Your instance doesn't stop — it slows down. This is often worse than a crash because it causes cascading timeouts and hard-to-diagnose performance issues.
Enable Unlimited mode on all production T-series instances. There's no reason to risk throttling in production. The surplus cost is minimal for occasional bursts.
If you consistently need more than baseline, switch to a dedicated instance. T-series instances are designed for workloads that are mostly idle with occasional spikes — not for sustained high CPU usage.
Monitor CPUCreditBalance in CloudWatch. Set up alerts before credits hit zero so you can react proactively.
This post is part of a series on debugging Kubernetes pod terminations. Read the full incident story: Why Your Kubernetes Pod Keeps Getting Killed — And It's Not an OOMKill