Good architecture should last you a long time.
Just look at how the Romans built everything.
This is my attempt to recreate the Instagram architecture from 2011. How has this architecture endured over the last decade?
Instagram’s North Star
Keep it very simple
Don’t re-invent the wheel
Go with proven and solid technologies when you can
According to the original post published on Medium, these three principles governed their choice of tech stack.
This is crucial given the fact that Instagram had only 15 employees but managed to scale to millions of users.
Keeping it simple is also a motto of Steve Jobs, Michelangelo, and other great creative minds.
OS / Hosting
You could read their in-depth analysis of different Linux versions. Basically, through empirical observation, they found Ubuntu Linux 11.04 (“Natty Narwhal”) to be the most reliable.
This is often the problem when a very skilled academic joins an engineering role. Conventional PhD thinking is to find out the reason why and to sit on a problem for several years. In the real world, you simply figure out what works and keep it working.
Load Balancing
They had Amazon’s Elastic Load Balancer, with 3 NGINX instances. Nothing to see here. It still works today.
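To make the idea of spreading requests across three instances concrete, here is a minimal sketch of round-robin distribution. Round-robin is an assumption on my part for illustration; the post does not say which policy the load balancer applied, and the backend names are hypothetical.

```python
from itertools import cycle

# Hypothetical names standing in for the three NGINX instances.
BACKENDS = ["nginx-1", "nginx-2", "nginx-3"]


def make_round_robin(backends):
    """Return a function that hands out backends in a fixed rotation."""
    rotation = cycle(backends)
    return lambda: next(rotation)


pick_backend = make_round_robin(BACKENDS)

# Six incoming requests get spread evenly over the three backends.
assignments = [pick_backend() for _ in range(6)]
print(assignments)
```

Each backend ends up with the same share of requests, which is the property that matters here. A real load balancer also handles health checks and failed instances, which this sketch ignores.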
Application Servers
They are like the pillars of the Roman era buildings.
Again, pretty standard Amazon High-CPU Extra-Large machines, 25 instances.
They used a package called Gunicorn as their WSGI (Web Server Gateway Interface) server. This project is still being maintained today, according to the GitHub repo.
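For readers who have not met WSGI before, this is roughly what the interface looks like: a callable that receives the request environment and a `start_response` function. It is a minimal sketch under my own assumptions; the names `app` and `hello.py` are illustrative and have nothing to do with Instagram's actual code, which ran Django behind Gunicorn.

```python
# Minimal WSGI application (PEP 3333). `app` is a hypothetical name.
def app(environ, start_response):
    """Return a plain-text greeting for any request."""
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Send the status line and headers, then return the body as an
    # iterable of byte strings.
    start_response("200 OK", headers)
    return [body]
```

If this were saved as `hello.py`, running `gunicorn hello:app` would serve it, and `--workers` would set the number of worker processes. Django exposes a callable with the same shape, which is why the two fit together.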
Data storage
The post didn’t spell this out explicitly. It looks like they used PostgreSQL to store user data and other forms of metadata, and AWS S3 buckets to store the raw photos.
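The split between metadata and raw photos can be sketched as follows. This is only an illustration of the pattern: an in-memory SQLite database stands in for PostgreSQL, a plain dict stands in for an S3 bucket, and the table layout and function names are all hypothetical.

```python
import sqlite3
import uuid

# Stand-ins: SQLite plays the role of PostgreSQL, a dict plays the role of S3.
object_store = {}
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE photos ("
    "id INTEGER PRIMARY KEY, user TEXT, caption TEXT, object_key TEXT)"
)


def save_photo(user, caption, image_bytes):
    """Put the raw bytes in the object store and the metadata in the database."""
    key = f"photos/{uuid.uuid4().hex}.jpg"
    object_store[key] = image_bytes
    cur = db.execute(
        "INSERT INTO photos (user, caption, object_key) VALUES (?, ?, ?)",
        (user, caption, key),
    )
    db.commit()
    return cur.lastrowid


def load_photo(photo_id):
    """Look up the metadata row first, then fetch the bytes by object key."""
    row = db.execute(
        "SELECT user, caption, object_key FROM photos WHERE id = ?",
        (photo_id,),
    ).fetchone()
    if row is None:
        return None
    user, caption, key = row
    return {"user": user, "caption": caption, "image": object_store[key]}
```

The database row stays small and queryable because it only holds a key pointing at the photo, while the bulky bytes live in storage built for large objects.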
The engineers at Instagram mentioned that they found Christophe Pettus’s blog very helpful when navigating Django, PostgreSQL and Pgbouncer.
Tools used | Exists in 2023?
Django | Yes, alive and well
PostgreSQL | Yes, alive and well
PgBouncer | Yes, alive and well
AWS S3 | Yes, alive and well
Memcached | Yes, alive and well
They had mentioned other packages in the post. Many belong to the AWS family, so I’d assume those ones are all alive and well.
Task Queue & Push Notifications
Instagram used Gearman as the task queue. Its GitHub repo was last updated 7 days ago.
Their push notification tool, pyapns, was last maintained in 2017.
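The task-queue pattern itself is easy to sketch: the web request enqueues a job and returns, and a separate worker picks it up later. The example below uses Python's standard `queue` and `threading` modules as an in-process stand-in. It is not Gearman's API (Gearman is a separate job server that clients and workers talk to over the network), and the task names are hypothetical.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    """Consume tasks in order until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        name, payload = task
        # A real worker would call the push notification service here.
        results.append(f"{name} done for {payload}")
        tasks.task_done()


thread = threading.Thread(target=worker)
thread.start()

# The web request only enqueues the work and moves on.
tasks.put(("send_push", "user-42"))
tasks.put(("send_push", "user-43"))
tasks.put(None)  # sentinel: tell the worker to stop

thread.join()
print(results)
```

The point of the design is that slow work, such as sending a notification, never blocks the response to the user.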
Monitoring
Instagram engineering mentioned many packages that I’m not aware of. All the links went to Tumblr, so that’s not helping.
The software I can recognize - Sentry and PagerDuty - certainly still exists.
Obligatory DIY Diagram
The post didn’t have an architecture diagram to accompany the text. I read and reread the post several times and drew this out myself.
I welcome anyone to point out my errors.
The blog post [1] that I’m referencing only focused on the backend services, databases, and DevOps aspects of Instagram. For some reason, the engineers who wrote this post decided to link to Tumblr pages. Did folks used to publish software packages on Tumblr? I’m very confused here. Some links are lost, so I’m not able to draw a conclusion on the durability of those design choices.
Overall, the only major failure would be the notification service that they relied on. It’s likely that they could find many replacements for it, should they keep the architecture design.
Are Instagram engineers like the Roman architects?
[1] The OG post from 2011: https://instagram-engineering.com/what-powers-instagram-hundreds-of-instances-dozens-of-technologies-adf2e22da2ad