
A practitioner’s reflection on what really makes recovery work
Over the years, I’ve been involved in more than one IT initiative that drifted into trouble. Different organisations, different technologies, different pressures, but the patterns were surprisingly consistent. What ultimately determined success or failure was rarely a single technical decision. It was the combination of technical discipline and organisational maturity that made recovery possible.
This post distils the most important lessons I’ve learned from recovering enterprise-scale IT projects. It is deliberately technology-agnostic: the specifics change, but the lessons don’t.
Most technical failures start as organisational ones
When a project starts failing, the first instinct is often to blame the technology. The platform is unstable, the architecture is wrong, the tooling isn’t mature. In my experience, those are symptoms, not root causes.
The real problems usually appear earlier:
- Unclear ownership and decision rights
- Conflicting stakeholder expectations
- Overloaded teams afraid to say “this won’t work”
- Governance that exists on paper but not in practice
By the time technical issues explode, the organisational foundations have often already eroded. Recovery starts by acknowledging that reality—without blame.
If you try to fix the technology without fixing how decisions are made, you’re just buying time.
Stabilisation beats optimisation every time
One of the hardest lessons to apply under pressure is this: don’t improve, stabilise.
When a project is in distress, people naturally want to:
- Add features to regain stakeholder confidence
- Redesign the architecture to “do it properly”
- Introduce new tools or frameworks
That almost always makes things worse.
Successful recoveries I’ve seen all started with the same move:
- Freeze scope
- Reduce change
- Focus on making the current system predictable
Only once the environment stopped shaking could meaningful improvements happen.
A stable but imperfect system is infinitely more valuable than a perfect design that never lands.
Transparency is uncomfortable but non-negotiable
Project recovery exposes uncomfortable truths:
- Plans that were never realistic
- Risks that were known but ignored
- Progress reports that were optimistic at best
The turning point in every successful recovery I’ve been part of was radical transparency:
- Honest status reporting
- Clear articulation of what will not be delivered
- Explicit trade-offs between time, scope, risk, and cost
This often feels risky, especially in hierarchical organisations, but without it trust never resets.
You can’t recover stakeholder confidence with optimism, only with credibility.
Governance must get closer, not heavier
A failing project doesn’t need more bureaucracy; it needs better governance.
What worked best was:
- Fewer forums, but held more frequently
- Decision-makers in the room, not observers
- Clear escalation paths and fast decisions
In recovery mode, governance shifts from oversight to active steering. Leaders who stayed engaged, not just informed, made a measurable difference.
Good governance accelerates recovery when it removes ambiguity, not when it adds process.
Culture determines how fast you can turn the ship
Two projects can have identical problems and completely different outcomes. The difference is almost always cultural.
Recovery is dramatically slower when:
- Teams are afraid of being blamed
- Admitting problems is seen as weakness
- People optimise for self-protection instead of outcomes
Conversely, recovery accelerates when:
- Leaders create psychological safety
- Problems are surfaced early and openly
- The team rallies around shared success, not individual survival
No recovery plan survives contact with a broken culture. Psychological safety is not a nice-to-have in recovery; it is a delivery dependency.
Security and risk don’t disappear during recovery
Under pressure, there’s a temptation to “fix it now and secure it later”. That shortcut always comes back with interest.
What worked better:
- Treating security and risk as recovery constraints, not obstacles
- Involving risk and compliance early, not as a final gate
- Making conscious, documented risk decisions instead of implicit ones
Recovered projects that ignored this inevitably re-entered crisis mode later, just under a different name. Recovery that creates new risk is just delayed failure.
Small wins rebuild momentum faster than grand plans
One of the most effective recovery techniques I’ve seen is engineering early, visible wins:
- A stable release after months of chaos
- A single process that suddenly works end-to-end
- A painful manual step finally automated
These moments matter. They restore belief, both inside the team and among stakeholders, that recovery is real. Momentum is rebuilt incrementally, not announced in slide decks.
Recovery isn’t finished when the project is back on track
Some of the worst relapses I’ve seen happened after recovery was declared successful.
Why?
- Temporary governance was rolled back too fast
- Old behaviours quietly returned
- Lessons were discussed but never institutionalised
The strongest organisations treated recovery as a learning event, not just a rescue:
- They updated delivery standards
- They clarified roles permanently
- They adjusted how risk and reality were discussed
If the organisation doesn’t change, the next project will fail the same way, just faster.
Recovering an IT project is never just a technical exercise. It’s a stress test of leadership, culture, and organisational honesty.
The most important lesson I’ve learned is this:
Projects don’t recover because plans get better. They recover because behaviours change.
Technology matters, but people, structure, and trust matter more.