Introduction
Dynago is my hobby project to build a minimal implementation of the distributed system described in the paper "Dynamo: Amazon's Highly Available Key-value Store".
This is the first in a series of blog posts in which I describe how I go about building it and post updates on the progress.
This is my way of "#buildinpublic".
The github repository is at https://github.com/mdkhanga/dynago
Why am I doing it?
The primary goal is learning. Learning not just for me but also for others who may read this blog and review the code.
A secondary goal is to build something useful. Perhaps I may use it somewhere, or someone else might.
It is possible that parts of the code might be reusable in another project.
Why a Dynamo clone?
The Dynamo paper is the first paper that got me interested in distributed systems. It has been my desire to build something like that for a long time.
In the real world, DynamoDB is Amazon's highly available key/value database. Apache Cassandra is also a project inspired by this paper.
Tech stack
I picked Go as the programming language. Why Go? I have already done a couple of complex projects in Java (a B+ tree and the Raft protocol), so I was not interested in doing this one in Java. I am not yet that familiar with Rust, and C/C++ felt like going back in time. So this is an opportunity to do a complex project in Go.
For network communication / RPC, I use gRPC instead of programming directly to sockets. My Raft implementation is based on sockets, and I wanted to try gRPC. If it does not work out, I can switch to sockets.
The storage engine is TBD. Early versions will be in-memory. In the future, RocksDB or another embeddable store is a possibility.
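To give a feel for what an in-memory first version could look like, here is a minimal sketch in Go. The MemStore type and its Get and Put methods are names I made up for illustration; they are not taken from the dynago repository, and the real storage layer may look quite different.

```go
package main

import "sync"

// MemStore is a hypothetical in-memory key-value store.
// The name and methods are illustrative assumptions, not dynago's actual code.
type MemStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

// NewMemStore returns an empty store ready for use.
func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string][]byte)}
}

// Put stores a copy of value under key, replacing any previous value.
func (s *MemStore) Put(key string, value []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = append([]byte(nil), value...)
}

// Get returns a copy of the stored value and whether the key was present.
func (s *MemStore) Get(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	if !ok {
		return nil, false
	}
	return append([]byte(nil), v...), true
}
```

The read-write mutex lets many request handlers read concurrently while writes stay exclusive. Copying values in and out keeps callers from mutating stored data by accident. To try it, call NewMemStore, Put and Get from a main function.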
Some technologies we might learn here:
- Go programming
- Distributed systems
- Databases
- Network programming
- Programming servers from scratch
- Concurrency
- High availability
Implementation plan
1. Cluster membership
Be able to start multiple servers that connect and form a cluster. Use gossip to detect servers joining and leaving the cluster.
2. Client API
Simple Get and Put API
Replicate to all nodes.
3. Partitioning using consistent hashing
4. Partitioning with replicas
5. Quorum-based reads / writes
6. Versioning
7. Hinted handoff
...... and so on
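As a preview of steps 3 and 4 in the plan above, here is a rough sketch of a consistent-hash ring in Go. Everything in it is an assumption for illustration: the Ring type, the NewRing and Lookup functions, the use of FNV-1a as the hash (the Dynamo paper uses MD5), and the number of virtual nodes. It is not how dynago does it, since that part is not written yet.

```go
package main

import (
	"hash/fnv"
	"sort"
	"strconv"
)

// Ring is a hypothetical consistent-hash ring with virtual nodes.
// Names and hash choice are illustrative assumptions only.
type Ring struct {
	points []uint32          // sorted hash positions on the ring
	owner  map[uint32]string // ring position -> node name
}

// hashKey maps a string to a position on the ring using FNV-1a.
func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// NewRing places vnodes virtual points on the ring for each node.
func NewRing(nodes []string, vnodes int) *Ring {
	r := &Ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			p := hashKey(n + "#" + strconv.Itoa(i))
			if _, taken := r.owner[p]; taken {
				continue // rare hash collision: keep the first owner
			}
			r.owner[p] = n
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Lookup walks clockwise from the key's position and returns up to n
// distinct nodes: the first is the primary, the rest are replicas.
func (r *Ring) Lookup(key string, n int) []string {
	if len(r.points) == 0 {
		return nil
	}
	h := hashKey(key)
	start := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	seen := make(map[string]bool)
	var out []string
	for i := 0; i < len(r.points) && len(out) < n; i++ {
		node := r.owner[r.points[(start+i)%len(r.points)]]
		if !seen[node] {
			seen[node] = true
			out = append(out, node)
		}
	}
	return out
}
```

For example, NewRing([]string{"a", "b", "c"}, 8).Lookup("user:42", 2) returns two distinct node names, and the same key always maps to the same nodes as long as the membership does not change. Skipping repeated nodes while walking the ring is what keeps replicas of a key on different servers.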
Conclusion
I am open to suggestions and discussion. If you know more, I am happy to learn from you.
If you like what I am working on, please follow me on twitter/X or LinkedIn.