A look at DynamoDB Keys


Coming primarily from a MySQL background, I realized today that choosing the right DynamoDB data structure takes some thought. Here are some things I wish had been clearer before I started:

The situation:

For this example, let us imagine that we are creating a DynamoDB table which will hold Tweets from different users.
We should be able to:

  1. Loop through the Tweets of a specific user in chronological order.
  2. Get a specific Tweet from a user.

Figure: Partition Key (image from the AWS blog).

What is the Primary Key?

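A Primary Key uniquely identifies a single item in a table. DynamoDB supports two kinds: a simple Primary Key made up of just a Partition Key, and a composite Primary Key that combines a Partition Key with a Sort Key. Our Tweets table uses the composite kind.
So, make sure you understand this: Primary Key = Partition Key + Sort Key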
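To make this concrete, here is a rough sketch of what such a table definition could look like. The table name `Tweets` and the attribute names `username` and `created_at` are my own assumptions for this example. The dictionary follows the parameter shape of DynamoDB's `CreateTable` operation, but it is only built here, not sent to AWS.

```python
# Hypothetical table definition for the Tweets example.
# The table name and attribute names are assumptions made for illustration.
tweets_table = {
    "TableName": "Tweets",
    "AttributeDefinitions": [
        {"AttributeName": "username", "AttributeType": "S"},    # string
        {"AttributeName": "created_at", "AttributeType": "S"},  # ISO 8601 timestamp string
    ],
    "KeySchema": [
        {"AttributeName": "username", "KeyType": "HASH"},     # Partition Key
        {"AttributeName": "created_at", "KeyType": "RANGE"},  # Sort Key
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def primary_key_attributes(table_definition):
    """Return the attribute names that together form the Primary Key."""
    return [entry["AttributeName"] for entry in table_definition["KeySchema"]]
```

With boto3 installed and credentials configured, a dictionary like this could be passed along as `boto3.client("dynamodb").create_table(**tweets_table)`.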

What is a Partition Key?

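Under the hood, DynamoDB hashes the Partition Key value and uses the result to decide which partition (a unit of physical storage) holds the item. Spreading items across partitions is what keeps reads and writes fast as the table grows, and it works best when the Partition Key has high cardinality, meaning many distinct values. In our case, the Partition Key can be the Twitter user's username. The Partition Key is required for every Query operation. Without one, the only way to read is a Scan, which walks through the whole table.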
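The idea of mapping a Partition Key value to a partition can be sketched in a few lines. This is only a toy model: DynamoDB uses its own internal hash function and manages the number of partitions itself, so the MD5 stand-in, the fixed partition count, and the function name `partition_for` are all assumptions made for illustration.

```python
import hashlib

NUM_PARTITIONS = 4  # assumption: the real partition count is managed by DynamoDB


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a Partition Key value to a partition index.

    MD5 is only a stand-in; DynamoDB's actual hash function is internal.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# The same username always lands on the same partition, while many
# different usernames tend to spread across all of them.
for username in ["alice", "bob", "carol", "dave"]:
    print(username, "-> partition", partition_for(username))
```

This also hints at why a key with few distinct values is a poor choice: everything would pile up on a handful of partitions.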

What is a Sort Key?

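The Sort Key identifies a specific item within a partition. For instance, we need a way to distinguish between different Tweets of a specific user. We can achieve this by letting our Sort Key be a timestamp of when the Tweet was added. Items that share a Partition Key are stored in Sort Key order, which is what lets us read a user's Tweets chronologically. In a Query, the Sort Key condition is optional: leave it out to get all of a user's Tweets, or add one to narrow the result to a time range.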
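Here is a small in-memory model of that behaviour, which helped me picture it. It is not how DynamoDB is implemented; the class `TweetStore` and its methods are hypothetical, and timestamps are assumed to be ISO 8601 strings in the same timezone so that string order matches chronological order.

```python
from collections import defaultdict


class TweetStore:
    """Toy in-memory stand-in for a table keyed by (username, created_at)."""

    def __init__(self):
        # One dict of {sort_key: text} per Partition Key value.
        self._partitions = defaultdict(dict)

    def put(self, username, created_at, text):
        # Same Partition Key + same Sort Key replaces the previous item.
        self._partitions[username][created_at] = text

    def query(self, username, start=None, end=None):
        """Return one user's tweets in Sort Key order, optionally within [start, end]."""
        items = self._partitions.get(username, {})
        return [
            (created_at, text)
            for created_at, text in sorted(items.items())
            if (start is None or created_at >= start)
            and (end is None or created_at <= end)
        ]


store = TweetStore()
store.put("alice", "2020-01-02T09:00:00Z", "second")
store.put("alice", "2020-01-01T09:00:00Z", "first")
store.put("bob", "2020-01-01T12:00:00Z", "hello")

# The Partition Key is always required; the Sort Key range is optional.
print([text for _, text in store.query("alice")])  # ['first', 'second']
print(store.query("alice", start="2020-01-02T00:00:00Z"))
```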

What is the takeaway?

Notice that we can produce a unique Primary Key by keeping the same Partition Key (username) but varying the Sort Key (timestamp). A user can Tweet at two different times of the day: both items share the username as their Partition Key, but each has its own timestamp as its Sort Key, so the combination stays unique. The one caveat is that the timestamp needs enough precision that the same user never produces two Tweets with the same value, since a second write with an identical Primary Key would replace the first.
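Going back to the two access patterns from the start, this is roughly how I would expect the requests to look. The sketch only builds the parameter dictionaries for DynamoDB's `Query` and `GetItem` operations (in the low-level, typed attribute format) without calling AWS. The table name `Tweets`, the attribute names `username` and `created_at`, and both helper functions are assumptions made for this example.

```python
TABLE_NAME = "Tweets"  # assumption: hypothetical table name


def tweets_by_user_params(username):
    """Parameters for a Query: all Tweets of one user, oldest first."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "#u = :u",
        "ExpressionAttributeNames": {"#u": "username"},
        "ExpressionAttributeValues": {":u": {"S": username}},
        "ScanIndexForward": True,  # ascending Sort Key order
    }


def single_tweet_params(username, created_at):
    """Parameters for a GetItem: one Tweet, addressed by its full Primary Key."""
    return {
        "TableName": TABLE_NAME,
        "Key": {
            "username": {"S": username},      # Partition Key
            "created_at": {"S": created_at},  # Sort Key
        },
    }


print(tweets_by_user_params("alice"))
print(single_tweet_params("alice", "2020-01-01T09:00:00Z"))
```

With boto3, these could be passed as `client.query(**tweets_by_user_params("alice"))` and `client.get_item(**single_tweet_params(...))`, where `client` is `boto3.client("dynamodb")`.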

The key takeaway for me was: think your data through. Ask yourself: how will the data be accessed? What access patterns will I have? How do I need to reach specific records?

Further reading:
Choosing the Right DynamoDB Partition Key
Understand Access Patterns for Time Series Data

 