Leading Through Storms: How To Handle Crises As An Engineering Manager

The year is 2016. A developer unpublishes a small piece of JavaScript code, left-pad, from the npm package repository.

The Internet explodes.

11 lines of code. That’s it. That’s all it took to trigger a ripple effect of system failures across the globe.

Despite its simplicity, left-pad was widely used and many systems depended on it to function. Who would’ve thought such a small thing could cause such a huge mess?

Everyone faces crises at some point. You can have large teams working around the clock to prevent issues, but outages still happen. You’ll know from the global news coverage that even some of the biggest platforms like TikTok, Instagram, and Google experience them.

That’s the thing with our industry: it can go from calm to chaos in the blink of an eye.

While not every crisis is as dramatic, there’s no doubt you’ll encounter them in some form or another – if you haven’t already. As an engineering manager (EM), the constant flux of unexpected situations can be exciting and keep you on your toes. Other times, it’s just plain stressful.

Throughout your career, you'll face intense moments where the things that can go wrong will go wrong. Ultimately, how you navigate these difficulties shapes the kind of leader you are.

Storms you’ll encounter

Factors like human error, hardware failures, or network issues are hard – if not impossible – to remove from the equation when you’re trying to make a system completely reliable. Here are a few ways crises can manifest (and key lessons you can take from them).

Human error

Take Equifax’s case, for example. In 2017, the company suffered a data breach that exposed the personal details of millions of people, including Social Security numbers, birth dates, and credit card information.

Equifax was hacked because of an unpatched Apache Struts vulnerability. A fix had been available for two months, but no one had applied it. This is a case of human error and an EM failure.

There are lessons here. And they go beyond just “patch your software.”

Security is not a checklist item: It’s an active engineering responsibility.
Technical debt can become a crisis: Equifax’s systems were outdated and full of dependencies. Security updates weren’t just missed – they were hard to implement. An EM needs to balance feature development with keeping infrastructure healthy.
Incident response plans are crucial: Equifax didn’t detect the breach for months and hackers had free access to their systems the entire time. Your team needs a tested, well-documented plan for handling security incidents.

As an EM, it’s dangerous to assume someone else is handling security. It’s on you to ensure your team has a process for staying ahead. A missed update is all it takes to trigger a crisis.
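
One way to make "a process for staying ahead" concrete is to track how long each known fix has gone unapplied. The sketch below is only an illustration: the `PendingPatch` record, the `overdue_patches` helper, the severity labels, and the day limits are all assumptions made for this example, not a description of how Equifax or any particular tool works.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PendingPatch:
    component: str       # hypothetical component name
    fix_released: date   # date the vendor published the fix
    severity: str        # "critical", "high", "medium", or "low"


# Assumed policy: maximum days a fix may stay unapplied, by severity.
MAX_DAYS_UNPATCHED = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def overdue_patches(patches, today):
    """Return (patch, days_open) pairs that exceed the policy, oldest first."""
    overdue = []
    for patch in patches:
        days_open = (today - patch.fix_released).days
        if days_open > MAX_DAYS_UNPATCHED[patch.severity]:
            overdue.append((patch, days_open))
    return sorted(overdue, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative data only, not a record of any real system.
    backlog = [
        PendingPatch("web-framework", date(2024, 1, 10), "critical"),
        PendingPatch("image-library", date(2024, 2, 20), "low"),
    ]
    for patch, days_open in overdue_patches(backlog, today=date(2024, 3, 10)):
        print(f"{patch.component}: {patch.severity} fix unapplied for {days_open} days")
```

A report like this, reviewed weekly, turns "someone should patch that" into a visible list with an owner. The thresholds matter less than the habit of looking at them.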

In addition to human mistakes, there are other factors, like hardware failures, that can trigger crises.

Hardware failures

In 2017, a fatal flaw came to light in Intel’s Atom C2000 processors. These chips powered routers, NAS systems, and other networking gear, and affected devices could fail – permanently.

Companies worldwide suffered. Their hardware relied on these chips. This failure was a wake-up call: unseen risks may sit in critical infrastructure.

Key lessons:

Dependencies matter: You don’t control every component in your system. One faulty chip can kill an entire product line.
Monitor constantly: Spotting issues early lets you respond with more control.
Vendors aren’t invincible: Even industry giants fail. Trust, but verify.

As an EM, you may focus on software. But you need to understand the hardware behind it – and have a plan for when it fails.

Hacker attacks

Hacker attacks hit fast and hard. They exploit weak spots in software, systems, or even human behavior. If you’ve been through one, you know the damage it might cause: downtime, legal trouble, and lost trust.

What you need to remember:

Security never stops: it’s an ongoing process, not a one-time fix
Think like an attacker: hackers are incredibly creative – your defenses should be, too
Again, monitor constantly: this cannot be stressed enough

Assume breaches will happen. Build fast detection and response plans. Security isn’t just IT’s job – everyone must protect data, systems, and the company.
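
To give a feel for what "fast detection" can look like, here is a minimal sketch that flags a source once it produces too many failed logins in a short window. The `FailedLoginDetector` class, its default thresholds, and the choice of failed logins as the signal are all assumptions for illustration. Real detection usually combines many signals and lives in dedicated tooling.

```python
from collections import defaultdict, deque


class FailedLoginDetector:
    """Flag a source that fails to log in too often within a time window."""

    def __init__(self, max_failures=5, window_seconds=60):
        # Both defaults are assumptions; tune them to your own traffic.
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # source -> recent failure timestamps

    def record_failure(self, source, timestamp):
        """Record a failed login; return True if the source should be flagged.

        Assumes timestamps (in seconds) arrive in non-decreasing order.
        """
        recent = self._failures[source]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.max_failures


if __name__ == "__main__":
    detector = FailedLoginDetector(max_failures=3, window_seconds=10)
    # Hypothetical source address from the documentation range.
    for second in (0, 1, 2):
        if detector.record_failure("203.0.113.7", second):
            print(f"Flag 203.0.113.7 at t={second}s")
```

The point is not this specific rule. It is that detection is something you build and test ahead of time, so the first signal of an attack reaches a person in minutes rather than months.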

The ripple effect

Incidents like these don’t just impact systems. When a crisis hits, you’re looking at:

Damage to company reputation
Financial losses
Crisis management and damage control
And increased pressure to improve security protocols and employee training

The real impact lasts long after the crisis ends. It forces companies into reactive mode and strains resources. But with the right plan, you can stay ahead and handle crises with (more) control.

Navigating crises

First, try to calm down. You can’t do anything useful when you’re panicking. Then, develop a strategy.

The following steps can help you handle whatever crisis comes your way.

1 - Assess the situation

Take a step back and look at the bigger picture. What exactly has gone wrong?

Whatever you’re facing, sit down and avoid jumping to conclusions.

Collect input from key stakeholders
Analyze logs or reports
Verify any assumptions before taking action

Misinformation and rushed decisions make crises worse. Having a clear understanding before proceeding leads to better decision-making.

2 - Understand what’s really happening

Consider a situation in which a key member is leaving the team right when you need them. Have they made the decision yet? Are they actually leaving? Maybe they’re telling you this so you can do something.

Dig deeper into the root cause of the problem, whatever it may be. Was it a system failure, human error, or something else? Look for anomalies or triggers that may have contributed to the situation:

Did a recent change introduce new vulnerabilities?
Were best practices overlooked?
Was key information not communicated?
Or were any external factors involved?

Finding the cause of the issue prevents future problems and helps manage the current crisis.

3 - Understand why it is happening

Let’s return to the situation of a key team member leaving. Is it money-related? Is there room for a counteroffer? Weigh the budget against the cost of being short-staffed, and consider the time it takes to hire and train a new person. There may be a chance to avoid a looming crisis.

Whatever situation you’re facing, reflect on the reasons behind the problem:

Were warning signs missed?
Are you not being proactive enough?
Are there knowledge gaps?
Are the systems or people involved overburdened?
Or is there a pattern?

Understanding the “why” will help you address the problem’s cause, not just its symptoms.

4 - Break the problem into smaller parts

Complex problems with many dependencies are overwhelming.

Imagine you’re facing a critical system failure. Your team is panicking and upper management has its eyes on you. What do you do? Follow the previous three steps, then look at the smaller parts of the issue. For example:

Database: look at integrity and connectivity
APIs: are they all functioning and responding as they should?
Frontend: is the interface rendering correctly?

When you’re facing a crisis, sometimes it can feel like you can’t see the woods for the trees. Breaking the problem into manageable pieces makes it easier to digest.

Start by:

Identifying each issue
Prioritizing the most pressing ones
Delegating what you can
And beginning to solve them

Doing this prevents confusion and ensures that you progress step by step rather than getting stuck in chaos.
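
The identify, prioritize, and delegate steps above can be sketched as a small triage helper. Everything here is an assumption made for illustration: the `Issue` fields, the severity scale (1 is most urgent), and the owner names are hypothetical, and in practice this usually lives in your incident tracker rather than a script.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Issue:
    component: str  # e.g. "database", "api", "frontend"
    summary: str
    severity: int   # 1 = most urgent; the scale is an assumption
    owner: str      # person the issue is delegated to


def triage(issues):
    """Group issues by owner, with each owner's list sorted most urgent first."""
    plan = defaultdict(list)
    for issue in sorted(issues, key=lambda i: i.severity):
        plan[issue.owner].append(issue)
    return dict(plan)


if __name__ == "__main__":
    # Hypothetical incident, split into its smaller parts.
    issues = [
        Issue("frontend", "Dashboard renders blank", 3, "Sam"),
        Issue("database", "Replica connections failing", 1, "Priya"),
        Issue("api", "Checkout endpoint timing out", 2, "Sam"),
    ]
    for owner, assigned in triage(issues).items():
        print(owner)
        for issue in assigned:
            print(f"  [{issue.severity}] {issue.component}: {issue.summary}")
```

Writing the list down, even this crudely, gives everyone a shared picture of what is broken, what comes first, and who has it.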

5 - Isolate

Once you’ve broken the problem into smaller components, isolate each part. Take a security breach, for example. Isolating might mean:

Taking systems offline
Limiting access
And implementing temporary safeguards while working on a resolution

These steps allow you to focus on one aspect at a time, preventing the situation from becoming “too much.”

Here, you can delegate certain components to different team members. Each person will be able to focus directly on their issue and not get caught up in the panic of the bigger crisis. With everyone (calmly) working in parallel, you’re more likely to reach a resolution sooner.

6 - Practice accountability

Acknowledge if the mistake is yours and communicate it properly. Doing so:

Builds trust
Shows how you lead
And demonstrates integrity and transparency

It also sets a good example for the team and encourages a culture of responsibility. And you’ll feel better than if you carried a lie or omission.

Often, problems in our work lives (or personal lives!) come from ignoring something or sweeping it under the rug. If we confront problems, tell each other what we think, and stay candid and honest, everything gets much easier. And it’s okay to make mistakes – everybody does.

Steering through shifting emotions

When you’re facing crises, it’s easy to feel the weight of the world on your shoulders because:

As an EM, you’re responsible for the situation
Your team is facing a high degree of anxiety
There could be serious ramifications
And you could lose clients

Moments of crisis will really put you to the test. You’ll feel anxious to the point where it clouds your thoughts. It might even feel like the end.

You know you should remain calm – or at least appear calm – because your anxiety can spread to your team. But you’ll probably come home and start thinking about worst-case scenarios, such as:

Losing your job
Your team losing their jobs
Losing your credibility
And having to find another job (maybe you’re not such a bad delivery driver after all?)

The good news is, most of the time, it’s not the end. You can outgrow a moment of crisis. Dealing with setbacks ultimately makes you feel more confident. And you’ll figure it out when you start addressing the situation proactively, with a clear mind. The same is true for your team.

Calming a storm in your team

During a crisis, people’s emotions will run wild – and that’s normal. Some will panic, some will get anxious, and others might just freeze. As their EM, you need to stay calm and focused. It’s your job to show them that whatever situation you’re going through is not the end of the world – even if it feels like it.

Some people start imagining worst-case scenarios. They may fear losing their job or the company going bankrupt. Reassure them that there’s always a solution and a way forward.

Others might use humor as a coping mechanism. That’s okay, too, but make sure they know it’s a way to release tension, not to avoid responsibility.

Each person reacts differently. Over time, you’ll learn how to address their emotions individually. For some, all it takes is a calm, reassuring word. For others, you might need to guide them through practical steps. The goal is to help them move from emotional panic to action. Then, you can start to get back on track.

The short version: chart a course to calm waters

As an EM, you'll face emergencies and crises. Some chaos is an inevitable part of the role. At some point, you’ll likely be tested by:

System outages
Human errors
Hardware failures
Security breaches
And myriad other problems

Emotions usually run high during crises – both yours and your team’s. You might feel anxious, but if you project calm and lead with accountability, it encourages your team to do the same. You can do this by:

Staying calm
Assessing the problem
Understanding what actually happened, and why
Breaking it down into smaller steps
Isolating the issues
And acknowledging responsibility when needed

Even in the most challenging moments, remind yourself and your team that most crises are solvable. There is a way out. Overcoming them will increase your confidence and strengthen your team’s resilience.

Originally published on Medium.com

Content in this blog post by Alex Ponomarev is licensed under CC BY 4.0.