How a DynamoDB GSI Is Stored Internally

A global secondary index (GSI) is not a pointer back into your table. It is a separate, internally managed table, with its own partitions, its own key schema, and its own capacity, that DynamoDB keeps in sync by copying writes into it asynchronously.

If you come from SQL, you expect an index to be a B-tree bolted onto the same physical table and updated inside the same transaction. A GSI breaks both of those assumptions, and almost every GSI surprise traces back to that one fact.

How is a DynamoDB GSI stored?

A DynamoDB GSI is stored as a separate, internally managed table, with its own partitions, key schema, and capacity, not as a pointer into the base table. DynamoDB copies each write into the index asynchronously, storing only the GSI keys, the base table keys, and any projected attributes.

  • A GSI is its own table. It has a fully independent partition space keyed by the GSI's partition key, not the base table's.
  • Writes replicate asynchronously. Your write commits to the base table first, then DynamoDB fans it out to each GSI on a background path.
  • Only projected attributes are stored. The index holds the GSI keys, the base keys, plus whatever attributes you projected — nothing else.
  • The GSI key need not be unique. Multiple base items can share one GSI partition/sort key; the base primary key is the tiebreaker that keeps them distinct.

Start with one base item

Take a SaaS audit log. Every privileged action in a workspace becomes an immutable event. The base table, WorkspaceEvents, is keyed so all of a workspace's events live in one partition, ordered by time:

WorkspaceEvents (base table)
EventPK    | EventSK                 | actorId | verb         | targetRef
WS#orbit-9 | TS#2026-06-23T14:02:11Z | USR#kp  | ROLE_GRANTED | USR#mara

EventPK = "WS#orbit-9" partitions by workspace; EventSK is an ISO timestamp so a Query returns one workspace's events in chronological order. That serves "show me this workspace's timeline" perfectly.

It serves nothing else. You can't ask "what did USR#kp do across every workspace?" — actorId isn't a key, so the only way to answer it on the base table is a full Scan. That is the access pattern a GSI exists to add.

Add a GSI and watch a second table appear

Define a GSI, ByActor, that re-partitions the same events by who performed them:

ByActor (GSI)
GSI1PK = actorId   ("USR#kp")
GSI1SK = EventSK   ("TS#2026-06-23T14:02:11Z")
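
As a rough sketch, the definition above could be expressed as request parameters in the shape that boto3's DynamoDB client `create_table` call accepts. The billing mode and the choice to project only `verb` are assumptions made for illustration; the table, index, and key names come from the example.

```python
# Parameters for a CreateTable request that defines the base table and ByActor.
# Assumptions: on-demand billing, and an INCLUDE projection carrying only "verb".
create_table_params = {
    "TableName": "WorkspaceEvents",
    "BillingMode": "PAY_PER_REQUEST",
    # Only attributes used as keys (base or index) are declared here.
    "AttributeDefinitions": [
        {"AttributeName": "EventPK", "AttributeType": "S"},
        {"AttributeName": "EventSK", "AttributeType": "S"},
        {"AttributeName": "actorId", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "EventPK", "KeyType": "HASH"},
        {"AttributeName": "EventSK", "KeyType": "RANGE"},
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "ByActor",
            # The index re-partitions the same events by actor.
            "KeySchema": [
                {"AttributeName": "actorId", "KeyType": "HASH"},
                {"AttributeName": "EventSK", "KeyType": "RANGE"},
            ],
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["verb"],
            },
        }
    ],
}

# With boto3 installed and credentials configured, this would be sent as:
#   boto3.client("dynamodb").create_table(**create_table_params)
```

Note that the index keys are ordinary attributes of the base item; the GSI only changes which of them act as partition and sort key.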

DynamoDB now maintains a second physical structure. The same logical event is stored twice — once in the base table's WS#orbit-9 partition, and again in the GSI's USR#kp partition:

ByActor (GSI): its own partition space
GSI1PK | GSI1SK                  | EventPK    | EventSK                 | verb
USR#kp | TS#2026-06-23T14:02:11Z | WS#orbit-9 | TS#2026-06-23T14:02:11Z | ROLE_GRANTED

Note what rode along: the base table's keys (EventPK, EventSK) are stored in every GSI item automatically. That's how a GSI hit can point you back to the full item — and why a KEYS_ONLY index still costs storage.

What actually lives in the GSI

The index does not copy the whole item. Each GSI entry holds exactly three things, and you control only the third:

Stored in the GSI        | Where it comes from                  | Optional?
GSI partition + sort key | The attributes you named as GSI keys | No
Base table key(s)        | Copied from every base item          | No
Projected attributes     | Your Projection choice               | Yes

Projection is KEYS_ONLY, INCLUDE (a named list), or ALL. A Query on the GSI can only return attributes that are in the index.

Ask for one that isn't projected and DynamoDB does not transparently fetch it — you get nothing back for that field. (AWS GSI docs)

That's the relational trap reversed: SQL would join back to the heap for the missing column. A GSI never does. The projection is the whole contract.
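
To make the three projection types concrete, here is a toy model of which attributes end up in an index entry. It is a pure-Python sketch, not DynamoDB's implementation; `gsi_entry` and its parameters are hypothetical names introduced for illustration.

```python
def gsi_entry(item, base_keys, gsi_keys, projection_type, include=()):
    """Return the attributes a GSI would hold for one base item (toy model)."""
    # An item that lacks a GSI key attribute is not written to the index.
    if any(key not in item for key in gsi_keys):
        return None
    # GSI keys and base table keys are always stored.
    kept = set(gsi_keys) | set(base_keys)
    if projection_type == "INCLUDE":
        kept |= set(include)
    elif projection_type == "ALL":
        kept = set(item)
    elif projection_type != "KEYS_ONLY":
        raise ValueError(f"unknown projection type: {projection_type}")
    return {name: value for name, value in item.items() if name in kept}


event = {
    "EventPK": "WS#orbit-9",
    "EventSK": "TS#2026-06-23T14:02:11Z",
    "actorId": "USR#kp",
    "verb": "ROLE_GRANTED",
    "targetRef": "USR#mara",
}
base_keys = ("EventPK", "EventSK")
gsi_keys = ("actorId", "EventSK")

keys_only = gsi_entry(event, base_keys, gsi_keys, "KEYS_ONLY")
with_verb = gsi_entry(event, base_keys, gsi_keys, "INCLUDE", include=("verb",))
everything = gsi_entry(event, base_keys, gsi_keys, "ALL")
```

In this model `with_verb` carries `verb` but not `targetRef`, which mirrors the ByActor table shown earlier.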

How a write reaches the index

The replication is the part that breaks SQL intuition hardest. A base write and its index update are not one atomic operation.

When you PutItem, DynamoDB durably commits to the base table, acknowledges your write, and then propagates the change onto a background path that updates each GSI. The acknowledgment does not wait for the index.

Here's the order of events for our audit write, top to bottom:

1. PutItem: WS#orbit-9 event
2. Commit to base partition
3. 200 OK to caller
4. Async path: extract GSI keys
5. Route to ByActor partition USR#kp
6. Write projected attributes

The caller gets its 200 OK at step three, before steps four through six finish — so a Query on ByActor in the gap can miss a brand-new event.

That asynchrony is by design, not a defect: it's the lineage of the 2007 Amazon Dynamo paper, which chose availability over synchronous consistency. The full consequences live in why a GSI is eventually consistent.
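
The gap between acknowledgment and index update can be modeled with a small in-memory simulation. This is only a sketch of the ordering described above: `ToyTable`, its queue, and the explicit `replicate` step are hypothetical stand-ins for DynamoDB's internal background path.

```python
from collections import deque


class ToyTable:
    """In-memory stand-in for a base table with one ByActor index."""

    def __init__(self):
        self.base = {}          # keyed by (EventPK, EventSK)
        self.by_actor = {}      # keyed by (actorId, EventSK, EventPK)
        self.pending = deque()  # writes acknowledged but not yet indexed

    def put_item(self, item):
        # Commit to the base table and acknowledge; the index is not touched yet.
        self.base[(item["EventPK"], item["EventSK"])] = item
        self.pending.append(item)
        return 200

    def replicate(self):
        # Background path: extract GSI keys, route, write projected attributes.
        while self.pending:
            item = self.pending.popleft()
            index_key = (item["actorId"], item["EventSK"], item["EventPK"])
            self.by_actor[index_key] = {
                name: item[name]
                for name in ("actorId", "EventSK", "EventPK", "verb")
            }

    def query_by_actor(self, actor_id):
        return [e for k, e in sorted(self.by_actor.items()) if k[0] == actor_id]


table = ToyTable()
status = table.put_item({
    "EventPK": "WS#orbit-9",
    "EventSK": "TS#2026-06-23T14:02:11Z",
    "actorId": "USR#kp",
    "verb": "ROLE_GRANTED",
    "targetRef": "USR#mara",
})
before = table.query_by_actor("USR#kp")  # index has not caught up yet
table.replicate()
after = table.query_by_actor("USR#kp")   # the event is now visible
```

The write is acknowledged while `before` is still empty; only after the background step runs does the index return the event.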

The GSI key is not a unique key

In SQL, a secondary index is non-unique by default, and uniqueness is a constraint you opt into. A GSI offers no such option: it has no uniqueness guarantee, ever.

Two audit events from the same actor at timestamps that collide would share the same GSI1PK and GSI1SK. DynamoDB stores both — it disambiguates them internally by the base table's primary key, which is always carried along.

So a GSI Query for one actor at one instant can legitimately return several items. If you assumed one-row-per-key the way a SQL unique index would give you, that's the footgun.
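
A short sketch shows why assuming one item per GSI key loses data. The second event (`WS#lumen-4`, `KEY_ROTATED`) is invented for illustration, and the dictionaries are a toy model of the index rather than its real storage format.

```python
events = [
    {"EventPK": "WS#orbit-9", "EventSK": "TS#2026-06-23T14:02:11Z",
     "actorId": "USR#kp", "verb": "ROLE_GRANTED"},
    # Hypothetical second event: same actor, same instant, different workspace.
    {"EventPK": "WS#lumen-4", "EventSK": "TS#2026-06-23T14:02:11Z",
     "actorId": "USR#kp", "verb": "KEY_ROTATED"},
]

# Wrong mental model: the GSI key alone identifies an entry.
naive = {(e["actorId"], e["EventSK"]): e for e in events}

# Closer model: the base primary key is carried along and breaks the tie.
index = {}
for e in events:
    gsi_key = (e["actorId"], e["EventSK"])
    base_key = (e["EventPK"], e["EventSK"])
    index[gsi_key + base_key] = e

wanted = ("USR#kp", "TS#2026-06-23T14:02:11Z")
hits = [e for (pk, sk, *_), e in index.items() if (pk, sk) == wanted]
```

Here `naive` silently keeps only one of the two events, while `hits` contains both, so code that reads a GSI should always iterate over the result list.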

When you query the index, the DynamoDB Expression Builder writes the KeyConditionExpression with names and values escaped correctly — e.g. matching one actor since a cutoff:

KeyConditionExpression: "#a = :actor AND #ts > :since"
ExpressionAttributeNames:  { "#a": "actorId", "#ts": "EventSK" }
ExpressionAttributeValues: {
  ":actor": { "S": "USR#kp" },
  ":since": { "S": "TS#2026-06-01T00:00:00Z" }
}
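
For context, the expression above would typically travel inside a full Query request that also names the table and the index. The following sketch builds those parameters in the shape boto3's low-level client `query` call accepts; `undefined_placeholders` is a hypothetical helper added here as a sanity check, not part of any AWS SDK.

```python
import re

query_params = {
    "TableName": "WorkspaceEvents",
    "IndexName": "ByActor",  # without this, the Query targets the base table
    "KeyConditionExpression": "#a = :actor AND #ts > :since",
    "ExpressionAttributeNames": {"#a": "actorId", "#ts": "EventSK"},
    "ExpressionAttributeValues": {
        ":actor": {"S": "USR#kp"},
        ":since": {"S": "TS#2026-06-01T00:00:00Z"},
    },
}


def undefined_placeholders(params):
    """Return placeholders used in the key condition but never defined."""
    expression = params["KeyConditionExpression"]
    names = set(re.findall(r"#\w+", expression))
    values = set(re.findall(r":\w+", expression))
    missing_names = names - set(params["ExpressionAttributeNames"])
    missing_values = values - set(params["ExpressionAttributeValues"])
    return missing_names | missing_values


missing = undefined_placeholders(query_params)

# With boto3 installed and credentials configured, this would be sent as:
#   boto3.client("dynamodb").query(**query_params)
```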

Capacity lives with the index, not the table

Because the GSI is its own table, it has its own read and write capacity, billed and throttled separately from the base table. A read off ByActor consumes the GSI's read units, never the table's.

The reverse coupling is the one that bites: every base-table write also writes the index, and if the GSI can't absorb that, it back-pressures the base write. That mechanism gets its own guide — when a GSI throttles base-table writes.

This is also why a GSI's partition key matters as much as the base table's. A low-cardinality GSI key clumps writes onto one index partition even when base writes are perfectly spread — a hot partition you created by re-keying.
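
The effect of a low-cardinality key can be estimated before deploying by counting writes per key value. The sketch below uses synthetic data and a hypothetical index keyed on `verb`; counts per key value are only a proxy for partition load, since DynamoDB hashes key values onto physical partitions internally.

```python
from collections import Counter

# Synthetic workload: 1000 events spread evenly over 50 workspaces,
# where 9 out of 10 events share the same verb.
events = [
    {
        "EventPK": f"WS#{i % 50}",
        "verb": "ROLE_GRANTED" if i % 10 else "ROLE_REVOKED",
    }
    for i in range(1000)
]

base_load = Counter(e["EventPK"] for e in events)  # writes per base key value
gsi_load = Counter(e["verb"] for e in events)      # writes per GSI key value

base_hottest_share = max(base_load.values()) / len(events)
gsi_hottest_share = max(gsi_load.values()) / len(events)
```

In this synthetic workload the busiest base key takes 2% of the writes while the busiest index key takes 90%, which is the clumping described above.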

Pitfalls and next steps

  • Don't expect un-projected attributes back. A GSI Query returns only what the index stores. If you need the full item, project it or fetch it from the base table by the carried-along keys.
  • Don't treat a GSI key as unique. Plan for a Query to return more than one item per key; the base primary key is the only real identity.
  • Don't read a GSI right after the write that fed it. The async path means the index may not show your write yet — read the base table when you need read-your-own-writes.
  • Size the GSI's capacity deliberately. It is independent on reads and a hidden dependency on writes.
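
When a GSI hit lacks an attribute you need, the carried-along base keys are enough to fetch the full item. This sketch builds parameters in the shape boto3's low-level client `get_item` call accepts; `base_get_params` is a hypothetical helper, and it sets `ConsistentRead` because strongly consistent reads are available on the base table but not on a GSI.

```python
def base_get_params(gsi_hit, table_name="WorkspaceEvents"):
    """Build GetItem parameters from the base keys stored in a GSI item."""
    return {
        "TableName": table_name,
        "Key": {
            "EventPK": gsi_hit["EventPK"],
            "EventSK": gsi_hit["EventSK"],
        },
        # Strong consistency is an option on the base table only.
        "ConsistentRead": True,
    }


# One item as a ByActor Query might return it (low-level attribute format).
gsi_hit = {
    "actorId": {"S": "USR#kp"},
    "EventSK": {"S": "TS#2026-06-23T14:02:11Z"},
    "EventPK": {"S": "WS#orbit-9"},
    "verb": {"S": "ROLE_GRANTED"},
}
params = base_get_params(gsi_hit)

# With boto3 installed and credentials configured, this would be sent as:
#   boto3.client("dynamodb").get_item(**params)
```

The extra read costs base-table capacity, so projecting a frequently needed attribute into the index may be the cheaper choice.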

The whole game is choosing key shapes that serve your patterns — single-table design overloads one GSI across many of them; GSI vs LSI covers when a local index fits instead.

Build and preview your GSI KeyConditionExpression in the DynamoDB Expression Builder, then try DynoTable to inspect an index's projected attributes and watch writes replicate into the GSI on your own tables.