Distributed systems and computer networks

Introduction and overview
- Metrics and order of magnitude estimations
CDN
DNS
Big storage: file systems
- GFS (Google File System)
- HDFS (Hadoop Distributed File System)
Big storage: databases
- NoSQL databases
- Transaction processing
  - Two-phase/three-phase commit protocol
  - Paxos
Computation
- MapReduce
- Unique ID generation
Distributed systems
Protocols
- Layer 4 protocols
  - UDP
  - TCP
- Layer 7 protocols
  - HTTP
    - History
  - WebSocket
CAP theorem
Load balancing
- Layer 4 load balancing
- Layer 7 load balancing
Blockchain
Distributed graphs

Introduction and overview

🔗

S.Kozlovski. A thorough introduction to distributed systems – Overload 149 (2019)

🎥

T.Berglund. Distributed systems in one lesson – Devoxx Poland (2017)
D.Malan. Scalability – Harvard CS75: Building dynamic websites (2012)
J.Dean. Building software systems at Google and lessons learned – Stanford (2010)

Metrics and order of magnitude estimations

🔗

S.Ignatchenko. The importance of back-of-envelope estimates – Overload 137 (2017)

CDN

🔗

Content delivery network – Wikipedia

🎥

S.Keshav. DNS and CDN – CS 436: Distributed Computer Systems (2013)
A.Bergman. What is a CDN and why developers should care about using one – GOTO (2016)

📄

J.Dilley et al. Globally distributed content delivery – IEEE Internet Computing 6, 50 (2002)

DNS

🔗

Domain Name System – Wikipedia
DNS technical reference – Microsoft Docs (2017)

🎥

S.Keshav. DNS and CDN – CS 436: Distributed Computer Systems (2013)

Big storage: file systems

GFS (Google File System)

🔗

GFS FAQ – MIT 6.824: Distributed systems (2020)

🎥

R.Morris. GFS (Notes) – MIT 6.824: Distributed systems (2020)

📄

K.McKusick, S.Quinlan. GFS: Evolution on fast-forward – ACM Queue 7 (2009)
S.Ghemawat, H.Gobioff, S.-T. Leung. The Google file system – ACM SIGOPS Operating Systems Review 37, 29 (2003)

HDFS (Hadoop Distributed File System)

📄

K.Shvachko, H.Kuang, S.Radia, R.Chansler. The Hadoop distributed file system – IEEE Symposium on Mass Storage Systems and Technologies (2010)

Big storage: databases

NoSQL databases

🔗

F.Gessert. NoSQL databases: A survey and decision guidance (2016)

🎥

M.Fowler. Introduction to NoSQL – GOTO (2012)

Google BigTable

🔗

Bigtable – Wikipedia

🎥

J.Dean. BigTable: A distributed structured storage system – CSE Colloquia (2005)

📄

F.Chang et al. BigTable: A distributed storage system for structured data – ACM Transactions on Computer Systems 26, 4 (2008)

Redis

🔗

Redis documentation
S.Sanfilippo. Redis persistence demystified (2012)

Memcached

🔗

Memcached

🎥

R.Nishtala. Scaling Memcache at Facebook – NSDI (2013)

Transaction processing

🔗

🎥

R.Barrett. Transactions across datacenters – Google I/O (2009)

Two-phase/three-phase commit protocol

🔗

Two-phase commit protocol – Wikipedia

❔

How does three-phase commit avoid blocking? – Stack Overflow

Paxos

🔗

Paxos – Wikipedia

🎥

H.Howard. Paxos agreement – Computerphile (2016)
C.Colohan. Paxos simplified – Distributed Systems Design (2017)

❔

Paxos vs two phase commit – Stack Overflow

Computation

MapReduce

🔗

H.Robinson. The elephant was a trojan horse: On the death of Map-Reduce at Google (2014)

🎥

B.Brumitt. MapReduce used on large data sets – Seattle Conference on Scalability (2007)
S.Ghemawat, J.Dean, J.Zhao, M.Austern. Google MapReduce by Google scientists – Google Technology RoundTable (2008)
J.Tedesco. MapReduce: Simplified data processing on large clusters (2012)

📄

J.Dean, S.Ghemawat. MapReduce: Simplified data processing on large clusters – Communications of the ACM 51 (2008)

Unique ID generation

🔗

Sharding & IDs at Instagram – Instagram Engineering (2012)
Twitter Snowflake (retired)

🎥

C.Colohan. Unique ID – Distributed Systems Design (2019)

Distributed systems

Autocomplete

🔗

W.Wahed, T.Han, J.Shenk. Building Prefixy

Facebook

🎥

H.Fisk. Large-scale low-latency storage for the social network – Data@Scale (2013)

Google Maps

🎥

Search

Google Search

📄

S.Brin, L.Page. The anatomy of a large-scale hypertextual Web search engine – Computer Networks and ISDN Systems 30, 107 (1998)
L.A.Barroso, J.Dean, U.Hölzle. Web search for a planet: The Google cluster architecture – IEEE Micro 23, 22 (2003)

Elasticsearch

🔗

C.Gormley, Z.Tong. Elasticsearch: The definitive guide (2014–2015)

Distributed search query execution

🔗

Sec.: Distributed search execution – C.Gormley, Z.Tong. Elasticsearch: The definitive guide (2014–2015)

Stack Overflow

🔗

How we do app caching – 2019 edition – Stack Overflow blog (2019)
The hardware – 2016 edition – Stack Overflow blog (2016)
The architecture – 2016 edition – Stack Overflow blog (2016)
A technical deconstruction – Stack Overflow blog (2016)
What it takes to run Stack Overflow – Stack Overflow blog (2013)
Stack Exchange Data explorer

🎥

O.Coster. Stack Overflow behind the scenes – how it’s made – Codemotion (2017)
M.Cecconi. High performance architecture of Stack Overflow – code.talks (2016)
M.Cecconi. High performance architecture of Stack Overflow – JSConf.Asia (2016)
M.Cecconi. The architecture of Stack Overflow – Infoshare (2014)
M.Cecconi. The architecture of Stack Overflow – code.talks (2013)
M.Cecconi. The architecture of Stack Overflow – Dev Day (2013)
S.Hanselman. StackExchange – MIX11 (2011)

Twitter

🔗

Tutorial: Design and implementation of a simple Twitter clone using PHP and the Redis key-value store

YouTube

🎥

C.Do. YouTube scalability – Seattle Conference on Scalability (2007)

Instagram

🔗

What powers Instagram: Hundreds of instances, dozens of technologies – Instagram Engineering (2011)

🎥

L.Guo. Scaling Instagram infrastructure – QCon (2017)
P.Hunt. How Instagram.com works – OSCON (2014)
R.Branson. Messaging at scale at Instagram (Async tasks at Instagram) – PyCon (2013)

Protocols

Layer 4 protocols

🔗

S.Ignatchenko. Once again on TCP vs UDP – Overload 130 (2015)
S.Ignatchenko. TCP/IP explained. A bit – Overload 115 (2013)

UDP

📖

Ch. 11: UDP transport – P.L.Dordal. An introduction to computer networks

TCP

🔗

Transmission Control Protocol – Wikipedia

❔

Why do we need a 3-way handshake? Why not just 2-way? – Network Engineering

📖

Ch. 12: TCP transport – P.L.Dordal. An introduction to computer networks

Layer 7 protocols

HTTP

📖

Sec.: HTTP – I.Grigorik. High performance browser networking

History

🔗

Evolution of HTTP – MDN

WebSocket

🔗

WebSocket – Wikipedia
When to use a HTTP call instead of a WebSocket (or HTTP 2.0) – Windows Apps Team (2016)

❔

Does HTTP/2 make websockets obsolete? – Stack Overflow
WebSockets protocol vs HTTP – Stack Overflow

CAP theorem

It is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: consistency (every read receives the most recent write or an error), availability (every request receives a non-error response, without the guarantee that it contains the most recent write), and partition tolerance (the system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes).
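
The trade-off can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: two replicas, a boolean `partitioned` flag standing in for the network, and hypothetical names (`Replica`, `ToyStore`, the `"CP"`/`"AP"` modes) that do not come from any real data store. During a partition, the CP variant returns an error rather than risk a stale answer, while the AP variant always answers but may return an old value.

```python
class Replica:
    """One node of a toy replicated register (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class ToyStore:
    """Two replicas with a switchable network partition between them."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write rather than let replicas diverge.
                raise RuntimeError("unavailable: cannot reach peer replica")
            # Availability first: accept the write locally; the peer is now stale.
            replica.value = value
            return
        # No partition: the write reaches both replicas.
        replica.value = value
        other.value = value

    def read(self, replica):
        if self.partitioned and self.mode == "CP":
            # The peer may hold a newer write that cannot be seen from here.
            raise RuntimeError("unavailable: cannot confirm latest value")
        return replica.value


if __name__ == "__main__":
    ap = ToyStore("AP")
    ap.write(ap.a, 1)
    ap.partitioned = True
    ap.write(ap.a, 2)
    print(ap.read(ap.b))  # prints 1: a non-error but stale response
```

Real systems are less binary than this model (for example, the majority side of a partition can often keep serving requests), so the sketch should be read as a picture of the definition, not of any particular database.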

🔗

H.Robinson. The CAP FAQ (2013)

🎥

C.Colohan. Lec. 16: The CAP theorem – Distributed Systems Design (2019)

Load balancing

Layer 4 load balancing

Layer 4 load balancing operates at the transport layer (Layer 4), which delivers messages without regard to their content: a load-balancing decision is based only on the source and destination IP addresses and ports recorded in the packet header. Because it never inspects the payload, it requires less computation than Layer 7 load balancing, but CPU and memory are now sufficiently fast and cheap that this performance advantage has become negligible or irrelevant in most situations.
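
A minimal sketch of such a decision, assuming a simple hash-based policy: the connection's address and port 4-tuple is hashed and the result selects a backend, so every packet of one connection lands on the same server. The backend addresses and the function name `pick_backend_l4` are hypothetical, and real load balancers may use other policies (round robin, least connections, consistent hashing).

```python
import hashlib

# Hypothetical backend addresses, used only for illustration.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_backend_l4(src_ip, src_port, dst_ip, dst_port, backends=BACKENDS):
    """Choose a backend from the transport-level 4-tuple only.

    The payload is never inspected: the same (source, destination) pair
    always maps to the same backend as long as the backend list is unchanged.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


if __name__ == "__main__":
    print(pick_backend_l4("203.0.113.7", 51234, "198.51.100.1", 443))
```

A stable hash (here SHA-256) is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would break the mapping across restarts.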

🔗

What is layer 4 load balancing?

Layer 7 load balancing

Layer 7 load balancing operates at the application layer (Layer 7), which deals with the actual content of each message: a load-balancing decision is based on the content of the message, such as a URL or cookie.
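
As a rough illustration of content-based routing, the sketch below picks a server pool from the URL path and then honours a session cookie if it names a server in that pool. The pool names, the path prefixes, the `backend` cookie, and the functions `pick_pool` and `pick_backend_l7` are all assumptions made for this example, not the behaviour of any particular load balancer.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit

# Hypothetical server pools, keyed by the kind of content they serve.
POOLS = {
    "static": ["static-1", "static-2"],
    "api": ["api-1", "api-2"],
    "web": ["web-1", "web-2"],
}


def pick_pool(url):
    """Route on the URL path, which only a Layer 7 balancer can see."""
    path = urlsplit(url).path
    if path.startswith("/static/"):
        return "static"
    if path.startswith("/api/"):
        return "api"
    return "web"


def pick_backend_l7(url, cookie_header=""):
    """Pick a server: sticky if the 'backend' cookie names one in the pool."""
    pool = POOLS[pick_pool(url)]
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    pinned = cookie["backend"].value if "backend" in cookie else None
    if pinned in pool:
        return pinned
    # Simplistic fallback for the sketch: always the first server in the pool.
    return pool[0]


if __name__ == "__main__":
    print(pick_backend_l7("https://example.com/api/users", "backend=api-2"))
```

A production balancer would replace the first-server fallback with a real policy such as round robin or least connections; the point here is only that the decision depends on message content.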

🔗

What is layer 7 load balancing?

Blockchain

🎥

C.Colohan. What is a blockchain – Distributed Systems Design (2018)
