Will Larson, the author of “An Elegant Puzzle”, which I’m reading now, tells an interesting story in his book. He recalls how he found himself in a tough spot after a major incident at his company. One of the engineers, burnt out from long hours and alert fatigue, made some mistakes that led to a bad outage. Will understood it wasn’t entirely the engineer’s fault, though. The company had no process that gave the disk space alerts enough context to convey their severity. They also hadn’t set up their central database in a way that could handle large spikes in traffic.

Will felt terrible for his engineer, knowing how brutal the on-call schedule was. Engineers’ phones would buzz nonstop through their 12-hour overnight shifts, leaving them totally drained. He knew it wasn’t fair to pin all the blame on one person. Plus, team morale would take a huge hit if the engineer were fired. People were frustrated, and the last thing they needed to see was their colleague losing his job after making a mistake under stress and without the right process.

The Junior Dev Who Wiped the Production Database

Unfortunately, the story Will told in his book is quite common. Often, engineers take the blame for problems because managers can’t see the forest for the trees and don’t address the root causes of the issues.

A more dramatic example is a Reddit post in which a junior developer describes a serious mistake he accidentally made on his first day at a new job. He was given instructions for setting up the local development environment, which involved creating a test database instance using the provided credentials. The catch is that the credentials in the instructions were real production credentials instead of the test ones that should have been there.

As you have probably guessed, when the junior dev ran the tests, the test suite wiped all the data from the production database. The CTO blamed the junior dev for the severe data loss and told him to leave. As a cherry on top, the junior dev said panic set in because the backups were not restoring correctly.

The problem here wasn’t the developer’s actions. The engineering team clearly had a more systemic problem, and was likely stressed out and utterly overwhelmed. Otherwise, they wouldn’t have put production credentials in a guide for junior devs, and they would have set up a proper backup routine.

So, what can you do as an engineering leader to prevent these situations?

Say No to Protect Your Team

Saying “no” is an engineering manager’s most challenging yet vital skill. Your team has limited bandwidth, so taking on too much work leads to rushed delivery, technical debt, and burnout. If stakeholders demand a complex new feature in an unrealistic timeframe, you must firmly push back.

Explain the engineering effort required and the risks of cramming it in. Make clear your priority is releasing quality products sustainably. This prevents your team from cutting corners and working extreme hours to hit arbitrary deadlines. Protecting your team should be the priority.

Communicate Your Team’s Constraints

Part of protecting your team is making sure other groups understand your constraints. Many non-engineers don’t realize the effort involved in developing quality software and systems. They may see engineers as magic workers who can whip up any feature overnight. It’s your job to educate them.

Explain carefully what your team can and cannot deliver within a specific timeframe. If you need more headcount to hit throughput targets, make that case. Conversely, if you are at capacity, ensure other departments know you can only take on more with tradeoffs. Transparency about team constraints, backed by data, helps set realistic expectations across the company.

When Pressured to Overcommit, Provide a Compelling Explanation

Even when constraints are explicit, you’ll likely face pressure. The CEO may push for an earlier launch date. The product team may promise users features you can’t build yet. In these cases, you must hold your ground.

Counter with a realistic timeline and explain the risks of overpromising. Share examples of when rushed delivery led to technical debt, unstable products, and customer issues. Note how much time it takes to do things right.

Help them understand why taking on less is better — for the product, engineers, and company. Be calm and rational, not combative. You can often get stakeholders to see reason with compelling data and examples.

Focus on What Prevents You from Solving the Core Constraints

When your team has too much on their plate, take time to identify the core constraint. Often, your team needs more engineers, specifically senior engineers. Or, critical roles like product managers or UX designers could be understaffed. Your workflows or tools may make engineers waste time instead of focusing on delivering value.

Determine where the biggest bottleneck is by looking at data, talking to engineers, and observing how work gets done. Then, devise solutions tailored to address that constraint directly. Adding headcount or shipping features more slowly may offer relief, but dig into root causes for longer-term solutions.

Try to Solve the Constraints

There are really only two ways to solve your team’s constraints:

1. Add resources by moving people within your organization or hiring. This includes training more junior team members.

2. Prioritize what your team works on and hope to address everything else sometime later. Often this means saying no to additional work and cutting scope.

In either case, you need to be more realistic about the situation. Consider both short-term relief and longer-term capacity planning. Adding headcount takes time, so you may need an interim solution anyway. With constraints mapped thoroughly, you can plan to balance the team’s health with business needs.

Develop Guiding Principles

To help the team prioritize and balance competing demands, develop guiding principles and specify quotas for immediate and long-term work. For example, allocate 20% of the time to tech debt and platform investment, 50% to current project delivery based on priority, and 30% to unplanned work and minor enhancements.

These quotas distribute effort appropriately between urgent requests, ongoing product work, and engineering health. Complement percentages with principles like focusing on high customer value features first. Establish these guidelines together with engineers and stakeholders to get buy-in.

Update them periodically as needs and your team’s capacity change. With deliberate principles guiding decisions, your team can focus on delivering what matters most. Less urgent asks will wait their turn, which protects engineers from distraction and lets them work steadily on a healthy mix of short-term execution and long-term improvement.
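To make the quotas concrete, it can help to translate the percentages into engineer-days for a given planning period. The following is only a minimal sketch in Python: the 20/50/30 split comes from the example above, while the `Quota` class, the `allocate` function, and the team size and sprint length are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    name: str
    share: float  # fraction of total capacity, between 0 and 1


# Example split from the text; replace with whatever your team agrees on.
QUOTAS = [
    Quota("tech debt and platform investment", 0.20),
    Quota("current project delivery", 0.50),
    Quota("unplanned work and minor enhancements", 0.30),
]


def allocate(capacity_days: float, quotas: list[Quota]) -> dict[str, float]:
    """Split capacity (in engineer-days) across buckets by quota share."""
    total = sum(q.share for q in quotas)
    # Shares that do not add up to 100% usually signal a planning mistake.
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"quota shares must sum to 1.0, got {total}")
    return {q.name: round(capacity_days * q.share, 1) for q in quotas}


if __name__ == "__main__":
    # Hypothetical team: 5 engineers, 10 working days per sprint.
    for bucket, days in allocate(5 * 10, QUOTAS).items():
        print(f"{bucket}: {days} engineer-days")
```

Under these assumptions, a 50 engineer-day sprint leaves 10 days for tech debt, 25 for project delivery, and 15 for unplanned work. Seeing the numbers this way makes it easier to tell a stakeholder when a bucket is already full.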

The Importance of Systems Thinking

Blaming individuals is shortsighted — mistakes happen within complex systems shaped by many forces. Managers must identify root causes like unrealistic schedules, poor training, and flawed processes. This helps prevent recurrences rather than just blaming people after the fact.

Beware of hindsight bias: the tendency to assume one person is at fault when many factors are at play. An incident should motivate fixing systems, not punishing people.

The path forward isn’t quick fixes or scapegoating — it’s aligning incentives, improving communication, and investing in people. We need a learning culture focused on collective improvement, not a blame culture stuck in crises.

Systems thinking trumps blind finger-pointing. By stepping back and taking a holistic view, we can work together to improve things.


Originally published on Medium.com