The original video comes from AWS “My Architect”; it introduces how Slack builds its service on top of AWS.
The video is easy to follow and takes only about 5 minutes, and it made me wonder: could we use it to infer how to build a Slack-like service from scratch?
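The original video explains how Slack is built on AWS services, but much of its reasoning is well suited to thinking about “how to build a Slack-like architecture”. I encourage everyone to watch the original video: it is only 05:43 long, but it contains many details that are worth thinking through slowly.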
How to build a Slack-like service from scratch?
Slack is an “online messaging service”, so if we want to build something like Slack, our choices might be:
- WebRTC (not only online messaging, but also video/audio streaming)
- Or we could build something using just WebSocket
Here, users connect to the server over WebSocket and communicate with each other through it.
So, the basic architecture is very simple.
user -> websocket -> server <- websocket <- user
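The diagram above can be sketched as a tiny in-memory relay. This is only a minimal sketch under assumptions: `RelayServer`, `Send`, and `demo_relay` are hypothetical names, and each WebSocket connection is modeled as a plain callable instead of a real socket. A real service would plug in an actual WebSocket library here.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for one WebSocket connection:
# a callable that pushes one text frame to a connected client.
Send = Callable[[str], None]


class RelayServer:
    """In-memory model of the server in the middle of the diagram."""

    def __init__(self) -> None:
        self._connections: Dict[str, Send] = {}

    def connect(self, user: str, send: Send) -> None:
        # Register the user's connection so messages can be pushed to it.
        self._connections[user] = send

    def disconnect(self, user: str) -> None:
        self._connections.pop(user, None)

    def publish(self, sender: str, text: str) -> int:
        """Forward a message to every connected user except the sender."""
        delivered = 0
        for user, send in self._connections.items():
            if user != sender:
                send(f"{sender}: {text}")
                delivered += 1
        return delivered


def demo_relay() -> List[str]:
    """Connect two users and return what the second one receives."""
    inbox_bob: List[str] = []
    server = RelayServer()
    server.connect("alice", lambda frame: None)  # alice's inbox is ignored here
    server.connect("bob", inbox_bob.append)
    server.publish("alice", "hello")
    return inbox_bob
```

Calling `demo_relay()` shows the whole round trip: alice publishes, the server forwards, and bob's connection receives the frame.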
Challenge 1: More users
When lots of users start using your Slack-like system, you need to put HAProxy in front of your servers to load-balance the WebSocket connections.
user -> websocket -> HAPROXY -> server(s) <- HAPROXY <- websocket <- user
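To make the load-balancing step concrete, here is a small sketch of a least-connections policy, which tends to suit long-lived connections such as WebSockets. It only models the selection logic in Python. It is not HAProxy configuration, and `LeastConnBalancer` and the server names are hypothetical.

```python
from typing import Dict, Iterable


class LeastConnBalancer:
    """Pick the backend server that currently holds the fewest open connections."""

    def __init__(self, servers: Iterable[str]) -> None:
        # Number of open connections per backend server.
        self.active: Dict[str, int] = {name: 0 for name in servers}

    def acquire(self) -> str:
        """Choose a server for a new connection and count it as open."""
        # min() returns the first minimum, so ties go to the earliest-listed server.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Record that a connection to the given server was closed."""
        if self.active.get(server, 0) > 0:
            self.active[server] -= 1
```

Because a WebSocket stays open for a long time, counting open connections spreads users more evenly than simply rotating through the servers on each new request.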
Challenge 2: Securing your transmission
You need a way to protect the traffic between your servers. For that, you might use IPsec to secure server-to-server transmission.
user -> websocket -> HAPROXY -> server <- IPSEC -> server <- HAPROXY <- websocket <- user
Challenge 3: Reducing the huge number of requests that put heavy load on the servers
Although HAProxy load-balances the WebSocket requests, too many users still produce a huge number of database connections and access requests on the servers.
This brings the following problems (challenges):
- Starting a connection takes too long and is expensive
- The client data footprint becomes large for mobile users
- Reconnection storms are resource-intensive
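To get a feel for why a reconnection storm is resource-intensive, here is a rough back-of-the-envelope sketch. The function `storm_load` is hypothetical, and the numbers in the usage note are made up for illustration. They are not Slack's real figures.

```python
def storm_load(clients: int, payload_bytes: int, window_seconds: float) -> float:
    """Average bytes per second the servers must send when every client
    re-downloads its full bootstrap payload within the given time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return clients * payload_bytes / window_seconds
```

For example, with the made-up numbers of 10,000 clients, a 2 MB bootstrap payload each, and everyone reconnecting within 10 seconds, `storm_load(10_000, 2_000_000, 10.0)` comes out to 2 GB per second. The load grows linearly with both the number of clients and the size of the bootstrap payload, which is why shrinking that payload matters.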
How can we optimize this? Slack has its own edge-cache system called “Flannel” (an application-level edge cache).
“Flannel” provides the following features:
- Keeps a persistent connection with the server but provides a slimmed-down version of the client-side data to the client.
- So the client doesn’t need to handle the huge connection bootstrap data; it works with a small amount of data from Flannel instead.
- Lets the client side load data lazily.
- Reduces connection waiting time and data size for mobile users.
Example: the client connects to Flannel to get “auto-complete” results when mentioning someone with “at” (@).
user -> websocket -> HAPROXY -> Flannel -> server <- IPSEC -> server <- Flannel <- HAPROXY <- websocket <- user
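The idea of an application-level edge cache can be sketched as follows. This is only an illustration of the features listed above (slim bootstrap, lazy loading, “@” auto-complete), not Flannel's real design. `EdgeCache` and its methods are hypothetical names.

```python
from typing import Dict, List


class EdgeCache:
    """Hypothetical application-level edge cache between clients and the backend."""

    def __init__(self, users: Dict[str, dict], channels: List[str]) -> None:
        # Full workspace state, assumed to be kept fresh over the cache's
        # own persistent connection to the backend servers.
        self._users = users
        self._channels = channels

    def bootstrap(self) -> dict:
        """Slimmed-down payload for a connecting client: no user profiles."""
        return {"channels": list(self._channels), "user_count": len(self._users)}

    def autocomplete(self, prefix: str, limit: int = 5) -> List[str]:
        """Lazy lookup: names are fetched only when the user types '@prefix'."""
        prefix = prefix.lower()
        matches = sorted(n for n in self._users if n.lower().startswith(prefix))
        return matches[:limit]

    def profile(self, name: str) -> dict:
        """Fetch one full user profile on demand."""
        return self._users[name]
```

The client starts with only the small `bootstrap()` payload and asks the cache for names or profiles when it actually needs them, so a mobile user never has to download the whole user list up front.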
Challenge 4: Large pre-load message
When a client logs in after a long time away, it may need to go through the full bootstrap login process. It also has to retrieve all the up-to-date information to keep its Slack channel data current.
In that case, the messaging server needs something like a CDN in front of it. We could use AWS EC2 here to reduce pre-load latency and connection load.
To make EC2 easy to reach, AWS Route 53 provides a scalable, highly available DNS service.
This article just tries to build a Slack-like service from scratch, and then improve it each time a new challenge comes up.
It is good practice for me to think through the whole architecture of a real-time messaging service. Hope you enjoy it.