Keep Tier-One Applications Out of Virtual Environments

RedFox@infosec.pub · 6 months ago

superkret@feddit.org · 6 months ago

I work for a newspaper. It has been published without fail every single day since 1945 (when my country was still basically just rubble, deservedly).
So even when all our systems are encrypted by ransomware, the newspaper MUST BE ABLE TO BE PRINTED as a matter of principle.
We run all our systems virtualized, because everything else would be unmaintainable and it’s a 24/7 operation.

But we also have a copy of the most essential systems running on bare metal, completely air-gapped from everything else, including the internet.
Even I as the admin can’t access them remotely in any way. If I want to, I have to walk over to another building.

In case of a ransomware attack, the core team meets in a room with only internal wifi, and is given emergency laptops from storage with our software preinstalled. They produce the files for the paper, save them on a USB stick, and deliver that to the printing press.

umami_wasabi@lemmy.ml · 5 months ago

How do you keep the air-gapped system in sync?

superkret@feddit.org · 5 months ago

We don’t. It’s a separate, simplified system that only lets the core team members access the layout, editing, and typesetting software that is locally installed on the bare metal servers.
In emergency mode, they get written articles and images from the reporters via otherwise unused, remotely hosted email addresses, and as a second backup, Signal.
They build the pages from that, send them to the printers, and the paper is printed old-school using photographic plates.

umami_wasabi@lemmy.ml · 5 months ago

That’s a very high degree of BCDR planning, and quite costly I assume.

superkret@feddit.org · 5 months ago

It’s less than the cost of our cybersecurity insurance, which will probably drop us on a technicality when the day comes.
And it’s not entirely an economic decision. The paper is in its third generation of family ownership, historically relevant as one of the oldest papers in the country, and absolutely no one wants to be the one in charge when it doesn’t print for the first time ever.

RedFox@infosec.pub · 6 months ago

Seems like your org has taken resilience and response planning seriously. I like it.

superkret@feddit.org · 6 months ago

Another newspaper in our region was unprepared and got ransomwared. They’re still not back to normal, over a year later.
After that, our IT basically got a blank check from the executives to do whatever is necessary.

RedFox@infosec.pub · 6 months ago

Blank check

Funny how that seems to often be the case. They need to see the consequences, not just be warned. An ‘I told you so’ moment…

superkret@feddit.org · 6 months ago

I’m just glad they got to see the consequences in another company.
Their senior IT admin had a heart attack a month after the ransomware attack.

0x0@programming.dev · 6 months ago

save them on a USB stick

…which is also kept with the air-gapped system and tossed once used, I assume…

superkret@feddit.org · 6 months ago

There are several for redundancy, in their original packaging, locked in a safe, and replaced yearly.

Im_old@lemmy.world · 6 months ago

That article is SO wrong. You don’t run one instance of a tier-1 application. And the instances are in separate DCs, on separate networks, and the firewall rules allow only application traffic. Management (RDP/SSH) is from another network, through bastion servers. At the very least you have daily/monthly/yearly (yes, yearly) backups. And you take snapshots before patching/app upgrades. Or you even move to containers, with bare hypervisors deployed in minutes via netinstall, configured via Ansible. You got infected? Too bad, reinstall and redeploy. There will be downtime, but nothing horrible. The DBs/storage are another matter of course, but that’s why you have synchronous and asynchronous replicas, read-only replicas, offsites, etc. But for the love of all you hold dear, don’t run stuff on bare metal because “what if the hypervisor gets infected”. Consider the attack vector and work around that.
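
To make the daily/monthly/yearly backup idea concrete, here is a minimal Python sketch of a retention rule. It is only an illustration: the function name `backups_to_keep` and the default counts (7 daily, 12 monthly, 5 yearly) are assumptions, not anything stated in the thread, and a real backup tool will have its own policy syntax.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, daily=7, monthly=12, yearly=5):
    """Return the set of backup dates to retain.

    Hypothetical policy: keep the newest `daily` backups, plus the oldest
    backup of each of the latest `monthly` months and `yearly` years.
    """
    ordered = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set(ordered[:daily])

    # Walking newest -> oldest, the last value written per key is the
    # oldest backup in that month or year.
    per_month = {}
    per_year = {}
    for d in ordered:
        per_month[(d.year, d.month)] = d
        per_year[d.year] = d

    for key in sorted(per_month, reverse=True)[:monthly]:
        keep.add(per_month[key])
    for key in sorted(per_year, reverse=True)[:yearly]:
        keep.add(per_year[key])
    return keep


if __name__ == "__main__":
    # One backup per day from 2023-01-01 through 2024-03-10.
    start = date(2023, 1, 1)
    history = [start + timedelta(days=i) for i in range(435)]
    for d in sorted(backups_to_keep(history, daily=3, monthly=2, yearly=2)):
        print(d.isoformat())
```

Anything not in the returned set is a candidate for pruning. The yearly copies are what let you go back past an infection that sat unnoticed for months.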

RedFox@infosec.pub · 5 months ago

Good comments.

Do you think there’s still a lot of traditional or legacy thinking in IT departments?

Containers aren’t new, neither is the idea of infrastructure as code, but the ability to redeploy a major application stack or even significant chunks of the enterprise with automation and the restoration of data is newer.

Im_old@lemmy.world · 5 months ago

There is so much old and creaky stuff lying around, and people have no idea what it does. Beige boxes in a cabinet where, when we had to decommission them, the only way to find out what they did was the scream test: turn it off and see who screams!

Or even stuff that was deployed as IaC by an engineer who then left, so it got managed by “clickOps” and the documentation was never updated.

When people talk about the Tier1 systems they often forget the peripheral stuff required to make them work. Sure, the super mega shiny ERP system is clustered, with FT and DR, backups off site etc. But it talks to the rest of the world through an internal SMTP server running on a Linux box under the stairs connected to a single consumer-grade switch (I’ve seen this. Dust bunnies were almost sentient lol).

Everyone wants the new shiny stuff but nobody wants to take care of the old stuff.

Or they say “oh we need a new VM quickly, we’ll install the old way and then migrate to a container in the cloud”. And guess what, it never happens.

solrize@lemmy.world · 6 months ago

Most everything everywhere is virtual these days, even when the host hardware is single tenant. Companies running hosted applications on bare metal are rare. I run personal stuff that way because Proxmox was too much hassle, but a more serious user would have just dealt with it.

floofloof@lemmy.ca · 6 months ago

Most organizations will avoid patching due to the downtime alone, instead using other mitigations to avoid exploitation.

If you can’t patch because of downtime, maybe you are cheaping out too much on redundancy?
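
A rough Python sketch of why redundancy removes the downtime excuse: with spare capacity, nodes can be patched in batches while enough of them stay in service. The function name `rolling_patch_batches` and the `min_in_service` parameter are made up for illustration; this plans the batches only and does not talk to any real orchestration tool.

```python
def rolling_patch_batches(nodes, min_in_service):
    """Split nodes into patch batches that keep min_in_service nodes up.

    Raises ValueError when there is no spare node, which is the case
    where patching really does mean downtime.
    """
    spare = len(nodes) - min_in_service
    if spare < 1:
        raise ValueError("no spare capacity: patching requires downtime")
    # Take at most `spare` nodes out of service at a time.
    return [nodes[i:i + spare] for i in range(0, len(nodes), spare)]


if __name__ == "__main__":
    cluster = ["app1", "app2", "app3", "app4"]
    for batch in rolling_patch_batches(cluster, min_in_service=3):
        print("patch and reboot:", batch)
```

With four nodes and three needed for normal load, each batch is a single node. With only one node the function raises, which is the situation the quoted sentence describes.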

RedFox@infosec.pub · 6 months ago

Yeah, that’s pretty risky for this point in time.

I guess the MBA people look at total cost of revenue/reputation loss for things like ransomware recovery, restoration of backups vs the cost of making their IT systems resilient?

Personally, I don’t think so (in many cases) or they’d spend more money on planning/resilience.

PiJiNWiNg@sh.itjust.works · 5 months ago

That immediately stuck out to me as well, what a lame excuse not to patch. I’ve been in IT for a while now, and I’ve never worked in any shop that would let that slide.