Distributed Data Guide

This article covers the basics of some modern distributed services and databases. It is by no means comprehensive.

Distributed systems generally fall into two main categories:

  • Shared Disk/Data: Nodes in these systems share data resources with each other. The nodes therefore depend on each other to satisfy requests, which can complicate scaling. Shared systems can also suffer downtime caused by a single point of failure (SPOF). An example is a DB cluster with a single master node and no standby.

  • Shared Nothing: Each update request is satisfied by a single node, independently of other nodes (for the most part). A cluster can thus potentially scale like a blob of nodes, allowing for easier growth and maintenance. This design also eliminates the SPOF. Examples of shared nothing systems are the OpenStack cloud platform and ScyllaDB.
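
The shared nothing idea, where each update is handled by exactly one node, can be illustrated with a small sketch. This is a simplified illustration, not how any particular product works: the Node and Cluster classes and the owner_index helper are hypothetical names, and plain hash-modulo routing is assumed for brevity. Production systems typically use more elaborate schemes such as consistent hashing, plus replication.

```python
import hashlib


class Node:
    """A hypothetical node that owns a private slice of the data."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # local storage, never shared with other nodes


def owner_index(key, node_count):
    """Map a key to exactly one node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


class Cluster:
    """Routes every request to the single node that owns the key."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def _owner(self, key):
        return self.nodes[owner_index(key, len(self.nodes))]

    def put(self, key, value):
        node = self._owner(key)
        node.store[key] = value  # only the owning node is touched
        return node.name

    def get(self, key):
        return self._owner(key).store.get(key)


cluster = Cluster(["node-a", "node-b", "node-c"])
owner = cluster.put("user:1", "alice")
print(owner, cluster.get("user:1"))
```

Because each key maps to one owner, a write never has to coordinate with the other nodes. The trade-off in this naive version is that changing the node count remaps most keys, which is one reason real systems prefer consistent hashing.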

Distributed File Systems

A distributed filesystem is a data storage system that has distributed resources for resilience and performance.

In particular, we will discuss the following distributed filesystems:

Distributed databases:

Supporting Actors:

These tools are used to enhance and enable data distribution:

Additional Information