- Introduction and overview
- CDN
- DNS
- Big storage: file systems
- Big storage: databases
- Computation
- Distributed systems
- Protocols
- CAP theorem
- Load balancing
- Blockchain
- Distributed graphs
🔗
- S.Kozlovski. A thorough introduction to distributed systems – Overload 149 (2019)
🎥
- T.Berglund. Distributed systems in one lesson – Devoxx Poland (2017)
- D.Malan. Scalability – Harvard CS75: Building dynamic websites (2012)
- J.Dean. Building software systems at Google and lessons learned – Stanford (2010)
🔗
- S.Ignatchenko. The importance of back-of-envelope estimates – Overload 137 (2017)
🔗
- Content delivery network – Wikipedia
🎥
- S.Keshav. DNS and CDN – CS 436: Distributed Computer Systems (2013)
- A.Bergman. What is a CDN and why developers should care about using one – GOTO (2016)
📄
- J.Dilley et al. Globally distributed content delivery – IEEE Internet Computing 6, 50 (2002)
🔗
- Domain Name System – Wikipedia
- DNS technical reference – Microsoft Docs (2017)
🎥
- S.Keshav. DNS and CDN – CS 436: Distributed Computer Systems (2013)
🔗
- GFS FAQ – MIT 6.824: Distributed systems (2020)
🎥
- R.Morris. GFS (Notes) – MIT 6.824: Distributed systems (2020)
📄
- K.McKusick, S.Quinlan. GFS: Evolution on fast-forward – ACM Queue 7 (2009)
- S.Ghemawat, H.Gobioff, S.-T. Leung. The Google file system – ACM SIGOPS Operating Systems Review 37, 29 (2003)
📄
- K.Shvachko, H.Kuang, S.Radia, R.Chansler. The Hadoop distributed file system – IEEE Symposium on Mass Storage Systems and Technologies (2010)
🔗
- F.Gessert. NoSQL databases: A survey and decision guidance (2016)
🎥
- M.Fowler. Introduction to NoSQL – GOTO (2012)
🔗
- Bigtable – Wikipedia
🎥
- J.Dean. BigTable: A distributed structured storage system – CSE Colloquia (2005)
📄
- F.Chang et al. BigTable: A distributed storage system for structured data – ACM Transactions on Computer Systems 26, 4 (2008)
🔗
- Redis documentation
- S.Sanfilippo. Redis persistence demystified (2012)
🔗
🎥
- R.Nishtala. Scaling Memcache at Facebook – NSDI (2013)
🔗
🎥
- R.Barrett. Transactions across datacenters – Google I/O (2009)
🔗
- Two-phase commit protocol – Wikipedia
❔
- How does three-phase commit avoid blocking? – Stack Overflow
🔗
- Paxos – Wikipedia
🎥
- H.Howard. Paxos agreement – Computerphile (2016)
- C.Colohan. Paxos simplified – Distributed Systems Design (2017)
❔
- Paxos vs two phase commit – Stack Overflow
🔗
- H.Robinson. The elephant was a trojan horse: On the death of Map-Reduce at Google (2014)
🎥
- B.Brumitt. MapReduce used on large data sets – Seattle Conference on Scalability (2007)
- S.Ghemawat, J.Dean, J.Zhao, M.Austern. Google MapReduce by Google scientists – Google Technology RoundTable (2008)
- J.Tedesco. MapReduce: Simplified data processing on large clusters (2012)
📄
- J.Dean, S.Ghemawat. MapReduce: Simplified data processing on large clusters – Communications of the ACM 51 (2008)
🔗
- Sharding & IDs at Instagram – Instagram Engineering (2012)
- Twitter Snowflake (retired)
🎥
- C.Colohan. Unique ID – Distributed Systems Design (2019)
🔗
- W.Wahed, T.Han, J.Shenk. Building Prefixy
🎥
- H.Fisk. Large-scale low-latency storage for the social network – Data@Scale (2013)
🎥
📄
- S.Brin, L.Page. The anatomy of a large-scale hypertextual Web search engine – Computer Networks and ISDN Systems 30, 107 (1998)
- L.A.Barroso, J.Dean, U.Hölzle. Web search for a planet: The Google cluster architecture – IEEE Micro 23, 22 (2003)
🔗
- C.Gormley, Z.Tong. Elasticsearch: The definitive guide (2014–2015)
🔗
- Sec.: Distributed search execution – C.Gormley, Z.Tong. Elasticsearch: The definitive guide (2014–2015)
🔗
- How we do app caching – 2019 edition – Stack Overflow blog (2019)
- The hardware – 2016 edition – Stack Overflow blog (2016)
- The architecture – 2016 edition – Stack Overflow blog (2016)
- A technical deconstruction – Stack Overflow blog (2016)
- What it takes to run Stack Overflow – Stack Overflow blog (2013)
- Stack Exchange Data explorer
🎥
- O.Coster. Stack Overflow behind the scenes – how it’s made – Codemotion (2017)
- M.Cecconi. High performance architecture of Stack Overflow – code.talks (2016)
- M.Cecconi. High performance architecture of Stack Overflow – JSConf.Asia (2016)
- M.Cecconi. The architecture of Stack Overflow – Infoshare (2014)
- M.Cecconi. The architecture of Stack Overflow – code.talks (2013)
- M.Cecconi. The architecture of Stack Overflow – Dev Day (2013)
- S.Hanselman. StackExchange – MIX11 (2011)
🔗
🎥
- C.Do. YouTube scalability – Seattle Conference on Scalability (2007)
🔗
- What powers Instagram: Hundreds of instances, dozens of technologies – Instagram Engineering (2011)
🎥
- L.Guo. Scaling Instagram infrastructure – QCon (2017)
- P.Hunt. How Instagram.com works – OSCON (2014)
- R.Branson. Messaging at scale at Instagram (Async tasks at Instagram) – PyCon (2013)
🔗
- S.Ignatchenko. Once again on TCP vs UDP – Overload 130 (2015)
- S.Ignatchenko. TCP/IP explained. A bit – Overload 115 (2013)
📖
- Ch. 11: UDP transport – P.L.Dordal. An introduction to computer networks
🔗
- Transmission Control Protocol – Wikipedia
❔
- Why do we need a 3-way handshake? Why not just 2-way? – Network Engineering
📖
- Ch. 12: TCP transport – P.L.Dordal. An introduction to computer networks
📖
- Sec.: HTTP – I.Grigorik. High performance browser networking
🔗
- Evolution of HTTP – MDN
🔗
- WebSocket – Wikipedia
- When to use a HTTP call instead of a WebSocket (or HTTP 2.0) – Windows Apps Team (2016)
❔
- Does HTTP/2 make websockets obsolete? – Stack Overflow
- WebSockets protocol vs HTTP – Stack Overflow
It is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: consistency (every read receives the most recent write or an error), availability (every request receives a non-error response, without the guarantee that it contains the most recent write), and partition tolerance (the system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes).
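The trade-off can be illustrated with a toy model. The sketch below is not a real data store: the `TwoReplicaStore` class, its `mode` switch, and the node names `"a"` and `"b"` are all hypothetical, chosen only to show how a system behaves once a partition forces it to pick between consistency and availability.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to stay consistent."""


class TwoReplicaStore:
    """Toy store with two replicas, "a" and "b" (hypothetical names).

    mode="CP": during a partition every request fails with an error,
               so no client ever observes a stale value.
    mode="AP": during a partition every request is served from the local
               replica, so reads may return a stale value.
    """

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = {"a": None, "b": None}
        self.partitioned = False  # True while the replicas cannot talk

    def write(self, node: str, value: object) -> None:
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            for name in self.values:
                self.values[name] = value
        elif self.mode == "CP":
            # Cannot replicate, so refuse rather than diverge.
            raise Unavailable("cannot replicate during a partition")
        else:
            # AP: accept the write locally; the other replica falls behind.
            self.values[node] = value

    def read(self, node: str) -> object:
        if self.partitioned and self.mode == "CP":
            # The local copy might be stale, so answer with an error.
            raise Unavailable("cannot confirm the latest write")
        return self.values[node]


ap = TwoReplicaStore("AP")
ap.write("a", 1)
ap.partitioned = True
ap.write("a", 2)
print(ap.read("a"), ap.read("b"))  # prints "2 1": replica "b" is stale
```

In `"CP"` mode the same sequence raises `Unavailable` at the second write, which is the consistency-over-availability choice.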
🔗
- H.Robinson. The CAP FAQ (2013)
🎥
- C.Colohan. Lec. 16: The CAP theorem – Distributed Systems Design (2019)
Layer 4 load balancing operates at the transport layer, which delivers messages without regard to their content: a load-balancing decision is based only on the source and destination IP addresses and ports recorded in the packet header. Nowadays, CPU and memory are sufficiently fast and cheap that the performance advantage of Layer 4 load balancing has become negligible or irrelevant in most situations.
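A minimal sketch of a Layer 4 decision, assuming a fixed list of backends and a hash of the connection's address and port fields. The `Packet` type, the `pick_backend_l4` function, and the hashing scheme are illustrative assumptions, not how any particular load balancer is implemented.

```python
import hashlib
from typing import NamedTuple, Sequence


class Packet(NamedTuple):
    """Hypothetical view of a packet: header fields plus opaque payload."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes  # a Layer 4 balancer never looks at this


def pick_backend_l4(packet: Packet, backends: Sequence[str]) -> str:
    """Choose a backend from the source/destination addresses and ports only."""
    key = (
        f"{packet.src_ip}:{packet.src_port}->"
        f"{packet.dst_ip}:{packet.dst_port}"
    )
    digest = hashlib.sha256(key.encode()).digest()
    # Use the first 8 bytes of the digest as an integer and map it to a backend.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]
first = Packet("203.0.113.7", 51000, "198.51.100.1", 443, b"GET /a")
second = Packet("203.0.113.7", 51000, "198.51.100.1", 443, b"POST /b")
# Same connection, different content: both packets go to the same backend.
assert pick_backend_l4(first, backends) == pick_backend_l4(second, backends)
```

Because only header fields feed the hash, all packets of one connection reach the same backend regardless of what they carry.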
🔗
Layer 7 load balancing operates at the high-level application layer, which deals with the actual content of each message: a load-balancing decision is based on the content of the message, such as a URL or cookie.
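A minimal sketch of a Layer 7 decision, assuming the balancer has already parsed the HTTP request. The `pick_backend_l7` function, the prefix-keyed `pools` mapping, and the `session_id` cookie name are hypothetical choices for illustration. The URL selects a pool of backends, and the cookie keeps one session on the same backend.

```python
import zlib
from http.cookies import SimpleCookie
from typing import Mapping, Sequence


def pick_backend_l7(
    path: str,
    cookie_header: str,
    pools: Mapping[str, Sequence[str]],
) -> str:
    """Choose a backend from the request's URL path and cookies.

    `pools` maps URL prefixes to backend lists and is assumed to
    contain the catch-all prefix "/".
    """
    # 1. Route by URL: the longest matching prefix selects the pool.
    prefix = max((p for p in pools if path.startswith(p)), key=len)
    pool = pools[prefix]

    # 2. Route by cookie: hash the session id so that one session
    #    always lands on the same backend within the pool.
    session = SimpleCookie(cookie_header).get("session_id")
    if session is None:
        return pool[0]  # no session yet: fall back to the first backend
    return pool[zlib.crc32(session.value.encode()) % len(pool)]


pools = {
    "/": ["web-1", "web-2"],
    "/api/": ["api-1", "api-2", "api-3"],
    "/static/": ["static-1"],
}
print(pick_backend_l7("/static/logo.png", "", pools))  # prints "static-1"
```

Unlike the Layer 4 case, two requests on the same connection can be sent to different pools here, because the decision depends on what each request asks for.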
🔗
🎥
- C.Colohan. What is a blockchain – Distributed Systems Design (2018)