Query vs Scan in DynamoDB
Query reads a single partition by partition key (optionally narrowing on the
sort key); Scan reads the whole table and filters afterwards. They look
similar in the API but they bill — and scale — completely differently.
When should I use Query vs Scan in DynamoDB?
Use Query whenever you can name the partition you need: it reads one partition and bills only for the items that match the key condition. Reach for Scan only for one-off exports or tiny tables; it reads every item and bills the whole table before any FilterExpression runs. On real data, Query wins.
- Query is targeted: you pay for the items in the matched partition.
- Scan is exhaustive: you pay to read every item, then throw most away with a
FilterExpression that runs after the read is metered.
On a table of any real size, a Scan with a filter is the classic "why is my
bill huge and my latency worse than RDS" footgun.
Side by side
| | Query | Scan |
|---|---|---|
| Reads | One partition (by PK) | Every item in the table |
| Billed capacity | Items matched in the partition | Whole table, before filtering |
| FilterExpression | Applied after the read; still billed for the read | Same; filtering never cuts cost |
| Latency | Flat as the table grows | Grows with table size |
| Pagination | 1 MB/page → LastEvaluatedKey | 1 MB/page; parallelisable |
| Use it for | Known access patterns | One-off exports, tiny config tables |
The key trap: a FilterExpression runs after DynamoDB meters the read, on
both operations. A Scan that "returns 10 rows" can bill for reading a million —
filtering is a convenience, never a cost control.
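To put rough numbers on that trap, here is a minimal sketch of the read-unit arithmetic. It assumes the standard metering rule (data read is summed and rounded up to the next 4 KB, and eventually consistent reads cost half), and it ignores per-page rounding, so treat the result as an estimate. The table size, item size, and the `estimate_read_units` helper are hypothetical.

```python
import math

RCU_BYTES = 4 * 1024  # one read unit covers up to 4 KB of data read


def estimate_read_units(items_read, avg_item_bytes, strongly_consistent=False):
    """Rough read units for one Query or Scan that reads `items_read` items.

    Metering follows the data read, not what a FilterExpression lets
    through, so `items_read` is the pre-filter count.
    """
    total_bytes = items_read * avg_item_bytes
    units = math.ceil(total_bytes / RCU_BYTES)
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return units if strongly_consistent else units / 2


# Hypothetical table: 1,000,000 items of about 500 bytes; one user owns 20 of them.
scan_units = estimate_read_units(1_000_000, 500)
query_units = estimate_read_units(20, 500)
print(f"Scan: {scan_units} read units, Query: {query_units} read units")
```

Adding a FilterExpression to either call changes neither number, because `items_read` counts what DynamoDB reads rather than what it returns.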
Use Query
Query PK = "USER#42" AND SK begins_with "ORDER#"
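The pseudo-query above could be written against the low-level boto3 client roughly as follows. This is a sketch under assumptions: the table name `app-table` and both helper functions are hypothetical, and `client` is expected to be something like `boto3.client("dynamodb")`. The loop follows `LastEvaluatedKey` so that results larger than one 1 MB page are not silently truncated.

```python
def build_order_query(user_id, table_name="app-table"):
    """Request parameters for: PK = USER#<id> AND SK begins_with ORDER#."""
    return {
        "TableName": table_name,  # hypothetical table name
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :sk)",
        # Placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": {"S": f"USER#{user_id}"},
            ":sk": {"S": "ORDER#"},
        },
    }


def query_all(client, params):
    """Yield every matching item, following LastEvaluatedKey across pages."""
    params = dict(params)  # copy so the caller's dict is not mutated
    while True:
        page = client.query(**params)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        # Resume the next page where the previous one stopped.
        params["ExclusiveStartKey"] = last_key
```

With boto3 installed and credentials configured, `list(query_all(boto3.client("dynamodb"), build_order_query(42)))` would return all of that user's order items.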
If you find yourself reaching for Scan to answer a common access pattern, that
is a modelling signal: add a Global Secondary Index so the
pattern becomes a Query.
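As an illustration, suppose the recurring pattern is "all orders with a given status". Everything named here is an assumption for the sketch: an index called `GSI1` whose partition key attribute `GSI1PK` holds values such as `STATUS#SHIPPED`, and the table name `app-table`. Once such an index exists, the former Scan becomes a Query with an `IndexName`.

```python
def build_status_query(status, table_name="app-table", index_name="GSI1"):
    """Request parameters for querying a hypothetical GSI by order status."""
    return {
        "TableName": table_name,   # hypothetical table name
        "IndexName": index_name,   # hypothetical index name
        "KeyConditionExpression": "#gpk = :gpk",
        "ExpressionAttributeNames": {"#gpk": "GSI1PK"},  # assumed index key
        "ExpressionAttributeValues": {":gpk": {"S": f"STATUS#{status}"}},
    }


params = build_status_query("SHIPPED")
print(params["IndexName"], params["ExpressionAttributeValues"][":gpk"]["S"])
```

The resulting dict can be passed to a boto3 client as `client.query(**params)`; the read is then limited to the index partition for that status.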
The choice comes down to one question — can you name the partition you need?
If the key is known, you Query; if not, add a GSI that turns it into a key, and
fall back to Scan only when no key fits.
When Scan is fine
One-off exports, tiny config tables, and background jobs that page through the
whole table deliberately. Use Segment/TotalSegments to split a Scan across
workers (a parallel scan) when you genuinely must read everything.
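A minimal sketch of that Segment/TotalSegments split, assuming a boto3 low-level client that is shared across threads and a hypothetical table name supplied by the caller. Each worker pages through its own segment; the helper names and the default worker count are illustrative, not tuned values.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(client, table_name, segment, total_segments):
    """Read one segment of the table to the end, page by page."""
    params = {
        "TableName": table_name,
        "Segment": segment,
        "TotalSegments": total_segments,
    }
    items = []
    while True:
        page = client.scan(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Continue this segment from where the last page stopped.
        params["ExclusiveStartKey"] = last_key


def parallel_scan(client, table_name, total_segments=4):
    """Run one worker thread per segment and concatenate the results."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, s, total_segments)
            for s in range(total_segments)
        ]
        return [item for f in futures for item in f.result()]
```

Splitting the work shortens wall-clock time only. The Scan still reads, and bills for, every item in the table, and it consumes that capacity faster.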
A reflexive SELECT * FROM table over DynamoDB is the same anti-pattern in
PartiQL clothing — it compiles to a Scan. When you really do need cross-item
analytics (a GROUP BY, a JOIN, an aggregate), DynoTable's SQL Workbench runs
them client-side over a bounded result set instead of hammering the table.
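For the PartiQL case, a statement whose WHERE clause pins the partition key avoids the full-table read. The sketch below only builds the arguments for boto3's `execute_statement`; the table name and the helper are hypothetical, and the PK/SK attribute names follow the earlier example.

```python
def build_orders_statement(user_id, table_name="app-table"):
    """Arguments for a PartiQL SELECT that names the partition key."""
    return {
        # Equality on the partition key keeps this from becoming a table scan.
        "Statement": (
            f'SELECT * FROM "{table_name}" '
            'WHERE "PK" = ? AND begins_with("SK", ?)'
        ),
        "Parameters": [{"S": f"USER#{user_id}"}, {"S": "ORDER#"}],
    }


stmt = build_orders_statement(42)
print(stmt["Statement"])
```

These arguments would be passed as `client.execute_statement(**stmt)`. Dropping the WHERE clause from the same statement brings back the full-table read.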
Estimate the read units either pattern will cost with the capacity calculator, and try DynoTable to run and inspect these queries against your own tables.