madd86 / awesome-system-design
- Tuesday, August 11, 2020 at 00:22:24
A curated list of awesome system design material
A curated list of awesome System Designing articles, videos and resources for distributed computing, AKA Big Data.
Whether you're preparing for an interview or you want to design a distributed/microservice oriented application, this list will definitely help you achieve that.
Attention: Stars on GitHub do not reflect usage or popularity for every item listed here.
Inspired By Awesome-BigData
Started By Gabriel Leon de Mattos
System Design Primer - [109k
System Design Interview Questions - Concepts you should know - A curated list of topics to introduce you to system design.
Grokking the System Design Interview - [Paid
System Design in Software Development - Basic article on the topics of system design and architecture.
System Design - Introductory interview preparation resources.
Design Pattern for Distributed Systems - Article talking about some patterns as well as some technologies to be considered.
Distributed Computing - Wikipedia article broadening the view of distributed system design.
Fallacies of Distributed Computing - Wikipedia article introducing the fallacies of distributed computing and their effects.
Fallacies of Distributed Computing Explained - In depth explanation of the fallacies mentioned above.
CAP Theorem - IBM Article about CAP Theorem, Microservices and NoSQL DBs.
Pattern: Microservice Architecture - Good article talking about Microservice architecture as well as its drawbacks.
Taxonomy of Distributed Systems - 11-page lecture classifying distributed systems and explaining why we need them.
Top 10 Secure Coding Practices - Brief article on good practices for code security.
Scalable Web Architecture and Distributed Systems - Good article about distributed systems as well as some of the potential tools.
Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services - [Paid
Designing Data Intensive Applications - [Paid
Building Microservices - [Free
Monolith to Microservices - [Free
A collection of videos based on distributed systems.
Gaurav Sen - System Design Series - Good resource for people who want to learn more about system design, introduces the topic in a very easy to understand way.
Tech Dummies - System Design Series - Another introduction to system design.
Mock System Design Interview at Google - Overview of what a system design interview looks like, shown through a flawed answer that comes close to meeting the requirements. The key thing here is how the interaction with the interviewer goes.
Google Preparation Guide - A quick video explaining how they interview.
System Design Interview - YouTube channel focussed on content specific to system design interviews, with detailed explanation of a variety of problems.
Intro to Architecture and System Design Interviews - A YouTube video with Jackson Gabbard with good info about system design interviews.
System Design Introduction for Interview - Tushar's intro to System Design.
Distributed Systems - An introductory course in distributed systems by Chris Colohan. He earned a PhD from Carnegie Mellon, then spent 10 years working at Google building distributed systems.
MariaDB - MariaDB is a fork of MySQL server.
MySQL - Widely used relational database.
PostgreSQL - Relational database that has been gaining popularity.
SQLite - Another widely used database that is built into all mobile phones and most computers.
SQL Server - Widely used relational database.
Apache Ignite - [3.3k
Couchbase - Inspired by memcached, adding features such as replication and persistence.
Oracle Coherence - [126
Memcached - [10.2k
Redis - [44k
Apple FoundationDB - [10k
Cosmos DB - Microsoft's globally distributed, multi-model database service. Elastically and independently scale throughput and storage. SQL, MongoDB, Cassandra, Tables, Gremlin, and Spark APIs.
CouchDB - [4.6k
MongoDB - One of the most popular general-purpose 'NoSQL' databases.
RethinkDB - [23.8k
ElasticSearch - [49.9k
Cosmos DB - Microsoft's globally distributed, multi-model database service. Elastically and independently scale throughput and storage. SQL, MongoDB, Cassandra, Tables, Gremlin, and Spark APIs.
Amazon DynamoDB - Key-Value and Document database, highly performant, scalable and secure.
Google Bigtable - Scalable and performant 'NoSQL' database for large analytical and operational workloads.
Cassandra - Facebook-born project that is very fast and easily scalable, with the option to set the consistency level for each operation.
Scylla - [4.9k
HBase - [3.6k
Cosmos DB - Microsoft's globally distributed, multi-model database service. Elastically and independently scale throughput and storage. SQL, MongoDB, Cassandra, Tables, Gremlin, and Spark APIs.
Amazon Neptune - Fast, reliable and fully managed graph database service.
ArangoDB - [10k
Neo4j - [7.9k
Cosmos DB - Microsoft's globally distributed, multi-model database service. Elastically and independently scale throughput and storage. SQL, MongoDB, Cassandra, Tables, Gremlin, and Spark APIs.
HDFS - The Hadoop Distributed File System is a widely popular choice among its big data competitors, providing high-throughput access.
Lustre - File system for computer clusters.
CephFS - Unified, distributed storage system.
GlusterFS - Scale-out NAS file system.
MooseFS - POSIX-compliant distributed file system.
XtreemFS - Fault tolerant file system.
Apache Samza - Build stateful applications that process data in real time from multiple sources, including Kafka. Easy and inexpensive multi-subscriber model, can eliminate backpressure and has reliable persistency with low latency.
Apache Flink - Based on the concept of streams and transformations. Uses Maven, handles batch tasks as data streams with finite boundaries. Low latency, high throughput.
Amazon Kinesis Streams - Durable, scalable, real-time service. Collects gigabytes of data per second from hundreds of thousands of sources, including database event streams, website clickstreams, financial transactions, etc.
Azure Stream Analytics - Real-time analytics service that is designed for mission-critical workloads.
Amazon MQ - Managed message broker service from Amazon, built on open-source brokers such as Apache ActiveMQ.
Apache ActiveMQ - A multi-protocol, Java-based messaging server.
Apache Kafka - Widely popular message broker with low latency for data streaming.
RabbitMQ - Widely popular lightweight message broker written in Erlang that also supports multiple messaging protocols.
IronMQ - Very fast and highly scalable message broker (not open source).
Apache Pulsar - Created by Yahoo; also highly scalable, with low latency, geo-replication and multi-tenancy.
Kestrel - Written in Scala and speaks the memcached protocol. It works much like Kafka.
Azure Service Bus - A fully managed enterprise integration message broker.
SeeSaw - [5.1k
HAProxy - Widely popular option that provides high availability, proxying, and TCP/HTTP load balancing. Used by Reddit, Imgur, MaxCDN, GitHub, Airbnb.
Zevenet - Supports L3, L4 and L7. Easy install with a Docker repo. Supports advanced health-check monitoring.
Neutrino - Used by eBay, built with Scala and Netty. Supports round-robin and least-connection algorithms.
Nginx - Wait, isn't Nginx a web server? Yes, but the open-source edition also supports a basic level of content switching and request routing. The Plus edition adds advanced load-balancing features, WAF, monitoring, etc.
F5 - Robust hardware load balancer option, supporting multiple protocols (IP, TCP, FTP, UDP, HTTP).
TP-Link - Cheaper alternative that works as a load balancer.
Barracuda - One of the top choices for load balancing when it comes to in-house servers. Top security measures built in, comprehensive reports and monitoring outbound traffic for data loss prevention.
Amazon Elastic Load Balancing - Popular choice for Amazon customers, supports Lambda functions, highly scalable.
Google Load Balancing - Popular choice for Google customers, comes with an auto-scaling feature, very fast, has an integrated CDN.
Cloudflare Load Balancing - Scalable load balancing by Cloudflare, featuring fast failover and a dashboard.
DigitalOcean Load Balancing - If you're a DigitalOcean customer, this is a good option: very cheap, regionally available, scalable, and easy to deploy among your other droplets.
Azure Load Balancing - Popular choice for Microsoft's Azure customers. Supports internal and external traffic, IPv6, monitoring and the standard set of load balancing features.
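
Several of the load balancers above mention round-robin and least-connection selection. As a rough illustration only, here is a minimal Python sketch of both strategies. The class and method names (`RoundRobinBalancer`, `LeastConnectionBalancer`, `pick`, `acquire`, `release`) are hypothetical and are not taken from any of the listed projects.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed, repeating order (hypothetical sketch)."""

    def __init__(self, backends):
        self._rotation = cycle(list(backends))

    def pick(self):
        # Each call returns the next backend in the rotation.
        return next(self._rotation)


class LeastConnectionBalancer:
    """Picks the backend with the fewest open connections (hypothetical sketch)."""

    def __init__(self, backends):
        # Track the number of open connections per backend.
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        # min() returns the first backend among ties, in insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a connection to the backend is closed.
        self.active[backend] -= 1
```

Round-robin ignores how busy each backend is, so it suits short, uniform requests. Least-connection tracks open connections and tends to cope better when request durations vary widely.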
Sqoop - Efficiently transfer data between Hadoop and structured datastores such as relational databases.
Flume - Distributed, highly available and efficient in collecting, aggregating and moving large amounts of log data.
Apache Kafka - Widely popular message broker with low latency for data streaming.
Gin - [40.6k
Phoenix - [15.5k
Express.js - [49.6k
Rails - [46.2k
Play Framework - [11.6k
Flask - [51.6k
Django REST - [18.4k
ASP.NET Core MVC - A rich framework for building web apps and APIs using the Model-View-Controller design pattern in C# or F#. Number 6 on TechEmpower Composite Benchmarks for web frameworks.