DynamoDB: An Inside Look Into NoSQL – Part 5

In Part 4, we talked about partitioning and replication in detail. We introduced consistent hashing, virtual nodes and the concept of coordinator nodes and preference list. In this article, we will discuss Data Versioning.

Data Versioning

Eventual consistency, introduced by DynamoDB, allows for the updates to be pushed to all storage nodes asynchronously. A put operation returns before the update is pushed to all replicas, which results in scenarios where a subsequent get operation may return a value that does not reflect the latest changes. Depending on network partitions and server outages, not all replicas might have the latest updates even after an extended period of time.

But there are certain requirements that do not need latest updates and can still tolerate certain inconsistencies. One such requirement is “Add To Cart” where put operation should always succeed and a get operation returning an old object is still tolerable. If a user makes a change to an earlier version of the shopping cart object, that change is still meaningful and should be preserved at all costs. However, the currently unavailable latest state of the shopping cart can have its own version of updates which should also be preserved. It is evident that data versioning has to be implemented to handle such scenarios.

In order to achieve such guarantees, DynamoDB treats the result of every modification as a new and immutable version of data. This allows for multiple versions of an object to be present in the system at the same time. If the newer version subsumes the earlier one, then the system can automatically determine the authoritative version (syntactic reconciliation). However, in some cases the versions conflict and the client has to manually perform the reconciliation (semantic reconciliation).

Data Versioning (Credit)

DynamoDB uses vector clocks in order to capture causality between multiple versions of an object. A vector clock is a list of (node, counter) pairs. One vector clock is associated with one version of every object. By analyzing the vector clocks, you can find out if the versions have a causal ordering or are on parallel branches. When a client wishes to perform an update, it must specify which version it is updating. This version can be got from an earlier get operation.

Version evolution of an object (Credit)

Let’s understand how vector clocks works: A client writes a new object. Node Sx handles this write and creates a vector clock [(Sx, 1)] for the object D1. If the client now updates the object and node Sx again handles the request, we get a new object D2 and its vector clock [(Sx, 2)]. The client updates the object again and this time node Sy handles the request leading to object D3 with the vector clock [(Sx, 2), (Sy, 1)]. When a different client tries to update the object after reading D2, the new vector clock entry for object D4 is [(Sx, 2), (Sz, 1)] where Sz is the node that handled the request. Now when a new write request is issued by the client, it sees that there are already D3 and D4 objects. If node Sx is handling the request, it performs the reconciliation process and the new data object D5 is created with the vector clock [(Sx, 3), (Sy, 1), (Sz, 1)].

One possible issue with vector clocks is that its size may grow rapidly if multiple servers coordinate the writes to a particular object. However, this issue is unlikely since in production only the nodes from the preference list handle the operation. It is still desirable to limit the size of vector clock. DynamoDB implements the following clock truncation scheme: A timestamp, which indicates the last time that node updated an item, is stored along with (node, counter) pair. When the number of (mode, counter) pairs reaches a threshold (say 15), the oldest pair is removed from the clock.

In the next article, we will look into how to Handle Failures.

Article authored by Vijay Olety.

X-Post from CloudAcademy.com

Advertisements
This entry was posted in Technical and tagged , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s