Three years at the Wikimedia Foundation (2022–2025) taught me that the systems serving roughly half the planet once a month are mostly unglamorous PHP, careful caching, and operational culture — not the architecture-astronaut stuff conference talks reward. This post is an earnest love letter to boring code at scale, from someone who spent those years inside one of the clearest working examples of it.
What Wikipedia is, operationally
Wikipedia is not one site. It’s hundreds of language editions, running on a shared codebase (MediaWiki), backed by a heavily-tuned MySQL fleet, fronted by Varnish and a custom cache layer, served across multiple global PoPs. The engineering org is smaller than most outsiders guess. The donor-funded foundation has a tight budget, a deep bench of volunteer contributors, and a culture that has been compounding for twenty-plus years.
You could draw an architecture diagram for it in five minutes. You could not build it in less than a decade.
What working on it felt like
I’ll say the honest thing first: going into Wikimedia, I expected to work on a system that looked like it belonged in a museum. The reality was closer to a finely-tuned instrument whose tuning is the entire reason it works. The code itself is neither the oldest nor the newest I’ve touched. What’s unique is the accumulated operational wisdom baked into every decision about how the code runs.
Concretely:
- Caching is first-class. Every feature starts with “how does this cache?” — not as a follow-up performance concern, but as a gating design question. If a feature degrades cacheability, that’s a design conversation, not a performance tuning pass.
- MySQL is a tuned instrument. Query plans are reviewed. Indexes are earned. “Let’s just add an index” is not a thing — indexes have ongoing cost at this scale, and the conversation is about whether the cost is justified by the query volume, not whether the index makes the query faster. That’s a level of discipline I hadn’t seen at smaller companies.
- Deployments are rehearsed. The train process — how code ships to production each week — has been refined for over a decade. It’s a published document. Everyone knows what happens when. There are no surprise deploys.
- Observability is deeply internalised. When an incident happens, the muscle memory for which dashboard to open, which log source to query, which on-call to page, was built over years. It’s written down. It’s drilled. New engineers learn it as a rite, not as a reference.
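The "how does this cache?" question can be made concrete with a cache-aside sketch. The names below are mine, not MediaWiki's actual cache API (MediaWiki has its own cache helpers with a different shape); the point is that cacheability is fundamentally a keying question — can the feature's output be derived purely from inputs that appear in the cache key?

```python
import time

class Cache:
    """Toy in-memory cache standing in for a real cache tier (e.g. Memcached)."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires < time.monotonic():
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)

def get_with_set(cache, key, ttl, compute):
    """Cache-aside read: serve the cached value, or compute and backfill."""
    value = cache.get(key)
    if value is None:
        value = compute()  # slow path: parse, query, render...
        cache.set(key, value, ttl)
    return value

# A feature degrades cacheability when it mixes inputs into the output that
# aren't in the key -- e.g. anything user-specific in a shared page render
# fragments the key space and tanks the hit rate.
render_count = 0
def render_page():
    global render_count
    render_count += 1
    return "<html>rendered</html>"

cache = Cache()
get_with_set(cache, "page:Main_Page", ttl=300, compute=render_page)
get_with_set(cache, "page:Main_Page", ttl=300, compute=render_page)
# two reads, one render: the second read is served from cache
```

A design review at this scale asks whether that second read stays a cache hit after the feature ships, not how fast the first render is.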
None of those four things are architectural. They’re operational habits that have compounded. That’s the actual moat.
What I worked on
I won’t overstate my footprint. In three years I contributed to improvements around performance, reliability, and observability — mostly backend, mostly in areas where my prior experience with high-throughput PHP and MySQL transferred cleanly. I shipped smaller infrastructure-adjacent changes, participated in architectural reviews for systems bigger than my individual authorship, and contributed to standards work that rolled out across the engineering org.
Two patterns from that time that I’ll carry forever:
1. Architecture reviews as organisational glue. Wikimedia runs a structured architecture-review process. A proposal gets written, circulated, discussed, revised, and eventually approved. The review is slow — sometimes weeks — and the slowness is the feature. It forces proposals to mature, and it gives many contributors (not just staff engineers) a voice on the direction. At smaller companies I’ve seen similar attempts rot into rubber-stamping. At Wikimedia it worked because the culture expected real engagement — and that engagement is the piece most design-review cultures are missing.
2. Treating volunteer contributors as first-class engineers. Wikimedia’s codebase has a volunteer contributor base that dwarfs the paid staff. Every design decision, every API change, every code pattern has to be legible and welcoming to people who don’t have standups with you. This is a ridiculously healthy constraint on engineering decisions. Most of the pathologies of internal-only codebases — clever in-jokes, unwritten conventions, magical abstractions — don’t survive when the audience is the entire world.
What this taught me about scale
Going in, I thought “Wikipedia scale” meant the hard problem was the traffic volume. It isn’t. The hard problem is the half-million people who edit the content, the twenty-year-accumulated content graph itself, and the operational discipline required to let all of that keep working while still shipping features.
The traffic is served by Varnish. That part is solved.
The editing experience, the content workflow, the multi-lingual routing, the abuse-prevention system, the community tools, the deprecation handling, the API contract with hundreds of third-party tools — those are the problems that consume engineering attention. And none of them are made easier by ditching the boring-PHP codebase. They’d be made worse, because you’d lose the accumulated wisdom along with the codebase.
Why I left (and what I took)
I left Wikimedia in April 2025 to do more hands-on work on AI-assisted engineering and local-first agent systems — which, in a twist I appreciate, is the exact kind of boring-discipline problem Wikimedia trained me to think about well. An agent control plane (Fulcrum) is not a glamorous system architecturally. It’s a state machine, a log, a memory layer, and an orchestrator. It’s going to get interesting the way Wikipedia’s stack is interesting: in the operational discipline, not the architecture.
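To illustrate what I mean by "a state machine, a log" being the dull-but-load-bearing core — this is not Fulcrum's actual design, just a generic sketch with hypothetical states and names of my own choosing:

```python
from dataclasses import dataclass, field

# Hypothetical task states and legal transitions -- illustrative only.
# The boring discipline is all here: explicit states, an explicit table
# of allowed moves, and an append-only log of every change.
TRANSITIONS = {
    "queued":  {"running"},
    "running": {"waiting", "done", "failed"},
    "waiting": {"running", "failed"},
}

@dataclass
class Task:
    name: str
    state: str = "queued"
    log: list = field(default_factory=list)  # append-only audit trail

    def transition(self, new_state: str) -> None:
        allowed = TRANSITIONS.get(self.state, set())
        if new_state not in allowed:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.log.append((self.state, new_state))
        self.state = new_state

task = Task("summarise-thread")
task.transition("running")
task.transition("done")
# task.log now records the full history: queued -> running -> done
```

Nothing in that sketch is clever, which is the point: when an agent misbehaves, you want a log you can replay and a transition table that refuses impossible states, not an exciting architecture.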
The biggest thing I took from Wikimedia is a conviction that most “boring” systems are boring because someone earned the boredom. Someone spent years paying down the interesting parts. Someone wrote the runbook. Someone argued for the slow architecture review.
If I could give one piece of advice to every engineer I work with for the rest of my career: be suspicious of exciting systems. Ask who’s paying for the excitement. Usually it’s the on-call rotation, and the bill always comes due.