Query vs Scan in DynamoDB

Query reads a single partition, selected by partition key (optionally narrowed by a sort key condition); Scan reads the whole table and filters afterwards. They look similar in the API, but they bill and scale completely differently.

When should I use Query vs Scan in DynamoDB?

Use Query whenever you can name the partition you need: it reads one partition and bills only for the items that match the key condition. Reach for Scan only for one-off exports or tiny tables; it reads every item and bills for the whole table before any FilterExpression runs. On real data, Query wins.

Query is targeted: you pay only for the items that match the key condition within one partition.
Scan is exhaustive: you pay to read every item, then throw most away with a FilterExpression that runs after the read is metered.

On a table of any real size, a Scan with a filter is the classic "why is my bill huge and my latency worse than RDS" footgun.

Side by side

	Query	Scan
Reads	One partition (by PK)	Every item in the table
Billed capacity	Items matching the key condition	Whole table, before filtering
`FilterExpression`	Applied after the read — still billed for the read	Same — filtering never cuts cost
Latency	Flat as the table grows	Grows with table size
Pagination	1 MB/page → `LastEvaluatedKey`	1 MB/page; parallelisable
Use it for	Known access patterns	One-off exports, tiny config tables

The key trap: a FilterExpression runs after DynamoDB meters the read, on both operations. A Scan that "returns 10 rows" can bill for reading a million — filtering is a convenience, never a cost control.
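To put rough numbers on that trap, here is a back-of-the-envelope sketch of the read-capacity arithmetic. It relies on DynamoDB's published rule that a Query or Scan sums the size of the items it reads, rounds up to the next 4 KB, and charges 1 read unit per 4 KB for strongly consistent reads or 0.5 for eventually consistent ones. The item size, table size and partition size are made-up illustration values, 4 KB is taken as 4096 bytes, and the rounding is applied once rather than per 1 MB page, so treat the result as an estimate only.

```python
import math


def read_capacity_units(total_bytes_read: int, strongly_consistent: bool = False) -> float:
    """Estimate the read units consumed by reading `total_bytes_read` bytes.

    The size of everything read is rounded up to the next 4 KB block. Each block
    costs 1 unit (strongly consistent) or 0.5 units (eventually consistent).
    Any FilterExpression is ignored on purpose: it runs after metering.
    """
    blocks = math.ceil(total_bytes_read / 4096)
    return blocks * (1.0 if strongly_consistent else 0.5)


# Hypothetical workload, for illustration only.
ITEM_BYTES = 500            # assumed average item size
TABLE_ITEMS = 1_000_000     # assumed items in the whole table
PARTITION_ITEMS = 40        # assumed items matching the key condition

scan_rcu = read_capacity_units(ITEM_BYTES * TABLE_ITEMS)
query_rcu = read_capacity_units(ITEM_BYTES * PARTITION_ITEMS)

print(f"Scan:  ~{scan_rcu} read units, however few rows the filter keeps")
print(f"Query: ~{query_rcu} read units")
```

Under these assumptions the Scan costs tens of thousands of read units and the Query costs a handful, and adding a filter to either leaves the number unchanged.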

Use Query

Query  PK = "USER#42"  AND  begins_with(SK, "ORDER#")

If you find yourself reaching for Scan to answer a common access pattern, that is a modelling signal: add a Global Secondary Index so the pattern becomes a Query.
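As a minimal sketch, the key condition above could be issued through a boto3 low-level DynamoDB client as follows. The client is passed in rather than created here, so the function does not need AWS credentials to import. The table name, the `PK`/`SK` attribute names and the `GSI1` index in the usage note are assumptions for illustration; substitute your own schema. Passing `index_name` sends the same request against a Global Secondary Index, provided you also pass that index's key attribute names.

```python
from typing import Any, Dict, Iterator, Optional


def query_items(
    client: Any,
    table_name: str,
    pk: str,
    sk_prefix: str,
    pk_attr: str = "PK",
    sk_attr: str = "SK",
    index_name: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield every item in one partition whose sort key starts with `sk_prefix`.

    `client` is expected to behave like boto3.client("dynamodb").
    """
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": pk_attr, "#sk": sk_attr},
        "ExpressionAttributeValues": {":pk": {"S": pk}, ":prefix": {"S": sk_prefix}},
    }
    if index_name is not None:
        kwargs["IndexName"] = index_name  # query a GSI instead of the base table

    while True:
        page = client.query(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages
        # Each page holds at most 1 MB; resume where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Usage might look like `list(query_items(boto3.client("dynamodb"), "app-table", "USER#42", "ORDER#"))`, or the same call with `pk_attr="GSI1PK", sk_attr="GSI1SK", index_name="GSI1"` once such an index exists.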

The choice comes down to one question — can you name the partition you need?

If the key is known, Query; if it is not, add a GSI so that the lookup has a key to Query on, and fall back to Scan only when no key fits.

When Scan is fine

One-off exports, tiny config tables, and background jobs that page through the whole table deliberately. Use Segment/TotalSegments to split a Scan across workers (a parallel scan) when you genuinely must read everything.
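A segmented Scan could be sketched like this, with one worker thread per segment, each paging through its own slice using `LastEvaluatedKey`. As before, the client is passed in and is assumed to behave like a boto3 low-level DynamoDB client (which boto3 documents as thread safe); the table name and the segment count of 4 are arbitrary illustration values. The function collects everything in memory, which suits a small export but not a very large table.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def scan_segment(client: Any, table_name: str, segment: int, total_segments: int) -> List[Dict[str, Any]]:
    """Read every page of one segment of a parallel Scan."""
    kwargs: Dict[str, Any] = {
        "TableName": table_name,
        "Segment": segment,                # zero-based index of this worker's slice
        "TotalSegments": total_segments,   # must be the same for every worker
    }
    items: List[Dict[str, Any]] = []
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(client: Any, table_name: str, total_segments: int = 4) -> List[Dict[str, Any]]:
    """Scan the whole table using `total_segments` worker threads."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, segment, total_segments)
            for segment in range(total_segments)
        ]
        # Results are concatenated in segment order; there is no global ordering.
        return [item for future in futures for item in future.result()]
```

Splitting the Scan shortens wall-clock time but not the bill: every item is still read and metered once, and the workers draw on the table's read capacity concurrently.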

A reflexive SELECT * FROM table over DynamoDB is the same anti-pattern in PartiQL clothing — it compiles to a Scan. When you really do need cross-item analytics (a GROUP BY, a JOIN, an aggregate), DynoTable's SQL Workbench runs them client-side over a bounded result set instead of hammering the table.

Estimate the read units either pattern will cost with the capacity calculator, and try DynoTable to run and inspect these queries against your own tables.