Introduction to Distributed System Design Patterns
Designing a distributed system is not as straightforward as designing a non-distributed one: we have to take communication, coordination, and synchronization into account. There are a number of common patterns that can be applied when building distributed systems. In this article, we will talk about three of them.
Quorum
Quorum is the minimum number of servers that must perform an operation successfully before the operation as a whole can be declared successful. If fewer servers than this threshold succeed, the operation is treated as failed. How do we choose the threshold for a quorum?
A common choice is a simple majority: quorum = floor(N / 2) + 1, where N is the number of servers. For example, if we have 5 replicated Redis instances and we write to all of them but only get a success response from 3, a majority of the instances acknowledged the write, so we can still treat the operation as successful.
- If there are only 2 nodes in a cluster, both must be available to have a majority.
- If there are 5 nodes in a cluster, 3 must be available to have a majority.
- If there are 7 nodes in a cluster, 4 must be available to have a majority.
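In code, this majority rule can be sketched roughly as follows (the function names are illustrative, not taken from any particular library):

```python
def quorum_size(n: int) -> int:
    """Majority quorum for a cluster of n nodes: floor(n/2) + 1."""
    return n // 2 + 1

def operation_succeeded(total_nodes: int, acks: int) -> bool:
    """Treat the operation as successful only if a majority acknowledged it."""
    return acks >= quorum_size(total_nodes)

# 5 replicated instances, 3 successful responses -> majority reached
print(quorum_size(5))             # 3
print(operation_succeeded(5, 3))  # True
print(operation_succeeded(5, 2))  # False
```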
Use cases:
- In Cassandra, to ensure data consistency, each write request can be configured to be successful only if the data has been written to a majority of the replica nodes.
- Redis Sentinel uses a quorum to detect failures: a master is only marked as failed when enough Sentinels agree that it is unreachable.
Consistent Hashing
According to Wikipedia, a hash function is any function that can be used to map data of arbitrary size to fixed-size values. Hash functions are one-way: the output cannot be converted back into the original value.
What is the relation between hashing and distributed systems? Suppose we have multiple instances of a service; to distribute requests across all of them, we use a load balancer, which routes requests from different clients to the available instances. Hashing is one of the common methods used for load balancing. Partitioning, the process of splitting data into chunks and spreading them across multiple servers, also uses hashing to distribute the data.
The naive approach to distributing data is to hash a specific key to a number, then find the target server by applying a modulo to that number.
Now imagine applying this approach to distributed storage: when we add or remove a storage node, the modulo changes, so almost all keys map to a different server and we have to redistribute the data across the nodes. We don't want to do that, since it has a performance and data-correctness impact.
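A rough sketch of the problem (the key set and server counts are made up for illustration):

```python
import hashlib

def server_for(key: str, num_servers: int) -> int:
    """Naive placement: hash the key and take it modulo the number of servers."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

keys = [f"user:{i}" for i in range(10_000)]

before = {k: server_for(k, 4) for k in keys}  # 4 servers
after = {k: server_for(k, 5) for k in keys}   # add a 5th server

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys must be remapped")  # roughly 80%
```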
Here comes consistent hashing to the rescue. Consistent hashing ensures that only a small amount of data needs to be moved when we add or remove an instance. It arranges our instances on a ring, and each segment of the ring is responsible for a certain range of hash values.
In the picture above, each segment covers a range of 5 hash values: for example, the S1 instance is responsible only for hash values 1 to 5, and S4 only for values 16 to 20.
Suppose S3 is removed from the ring. Its hash range is then handled by the next server clockwise, so the hash ring looks as below.
S2 is now responsible for the data from 6 to 15, so it takes more load than S1 and S4; the distribution is no longer balanced across the servers. This problem can be solved by using virtual nodes.
Building on the previous example, we create 3 virtual nodes for each server and distribute them randomly across the ring. Instead of mapping a hash to a real server, we map it to a virtual node.
Now when the S3 server goes down, we remove all of its virtual nodes, and their hash ranges are handled by the next virtual node clockwise.
To ensure high availability, we can also replicate each piece of data on N nodes, where N is the replication factor.
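A toy sketch of such a ring with virtual nodes is shown below; it is illustrative only, and production systems such as Cassandra or Riak use far more refined implementations:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: each physical server owns several virtual nodes."""

    def __init__(self, servers, vnodes_per_server=3):
        self.ring = {}       # ring position -> physical server
        self.positions = []  # sorted ring positions
        for server in servers:
            for i in range(vnodes_per_server):
                pos = self._hash(f"{server}#vnode{i}")
                self.ring[pos] = server
                bisect.insort(self.positions, pos)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_server(self, key: str) -> str:
        """Walk clockwise from the key's position to the next virtual node."""
        pos = self._hash(key)
        idx = bisect.bisect_right(self.positions, pos) % len(self.positions)
        return self.ring[self.positions[idx]]

    def remove_server(self, server: str) -> None:
        """Removing a server drops all of its virtual nodes; affected keys
        fall through to the next virtual node clockwise."""
        for pos in [p for p, s in self.ring.items() if s == server]:
            del self.ring[pos]
            self.positions.remove(pos)

ring = HashRing(["S1", "S2", "S3", "S4"])
print(ring.get_server("user:42"))   # one of the four servers
ring.remove_server("S3")
print(ring.get_server("user:42"))   # unchanged unless the key was on S3
```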
Use cases:
- Dynamo and Cassandra use consistent hashing to distribute their data across nodes.
- Riak uses consistent hashing to organize its data storage and replication.
Write Ahead Log
Write-Ahead Log (WAL) is a mechanism for ensuring the durability of data processed by our system. With a WAL, the system logs the complete action before making the modification to the data itself.
There are some advantages to using a WAL:
- WAL provides more concurrency as readers do not block writers and a writer does not block readers.
- WAL is significantly faster in most scenarios.
- Since the WAL is appended sequentially, we can replay the data state easily. For example, when the server crashes, we can read the log and restore the data to its last state, as sketched after this list.
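A bare-bones illustration of the mechanism (the file name and record format here are made up for the example, not taken from any real system):

```python
import json
import os

WAL_PATH = "data.wal"  # illustrative file name

def apply(state: dict, entry: dict) -> None:
    """Apply a logged operation to the in-memory state."""
    if entry["op"] == "set":
        state[entry["key"]] = entry["value"]
    elif entry["op"] == "delete":
        state.pop(entry["key"], None)

def write(state: dict, entry: dict) -> None:
    """Write-ahead: append the full operation to the log, flush it, then modify the data."""
    with open(WAL_PATH, "a") as wal:
        wal.write(json.dumps(entry) + "\n")
        wal.flush()
        os.fsync(wal.fileno())
    apply(state, entry)

def recover() -> dict:
    """After a crash, replay the log sequentially to rebuild the last known state."""
    state = {}
    if os.path.exists(WAL_PATH):
        with open(WAL_PATH) as wal:
            for line in wal:
                apply(state, json.loads(line))
    return state

store = recover()
write(store, {"op": "set", "key": "balance", "value": 100})
print(recover())  # {'balance': 100} even if the process restarts
```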
Use cases:
- The WAL in Grafana Loki records incoming data and stores it on the local file system in order to guarantee persistence of acknowledged data in the event of a process crash.
- Kafka's storage implementation follows a structure similar to the commit logs used in databases.
References
- https://redis.io/topics/partitioning
- https://www.cs.princeton.edu/courses/archive/fall09/cos518/papers/chash.pdf
- https://martinfowler.com/articles/patterns-of-distributed-systems/quorum.html
- https://www.educative.io/courses/grokking-adv-system-design-intvw
- https://martinfowler.com/articles/patterns-of-distributed-systems/wal.html
- https://www.sqlite.org/wal.html