It’s the wild, wild, west out there in cyberspace, except the feral camels that once roamed Texas are the hackers, and they’re roaming beyond borders and through firewalls on the daily.
At present, cyber threat intelligence gathering is a mish-mash of intrusion detection system logs, port scans, IP addresses, information sharing platforms, Twitter feeds and traditional write-ups. There is no one consistent language used across these platforms to refer to attacks, techniques or procedures, and there’s no single source of data. Much like post-truth America, you’ve got to look in all the right places to piece together the whole story, and even then it’s hard to know if you’ve put the puzzle together the way it was intended. This means that while there’s massive complexity in trying to understand the path an attacker has taken, there’s also huge potential in leveraging the data, or bits (pun intended) of evidence, a hacker leaves behind.
Information Gathering and the Penetration Tester
Penetration testers, who are my focus here, do much of their work of figuring out attack paths and new ways to penetrate based on historical data or tried and true ways to compromise a system or application. They might listen to a few podcasts, keep an eye on social media, follow a hacking news website and sign up to a mailing list, but all of this is hugely labour intensive and no one person has the hours in the day to keep on top of, let alone be well versed in, all the latest attacks. The dream, of course, is to have a program or Artificial Intelligence learn the tactics, techniques and procedures of hackers out in the wild, bring it all back into a nice table where all the data is the same data type, turn it into a visualisation with a gorgeous dashboard and then teach the team new attacks on the fly as they happen in real time. This dream, as wondrous as it sounds, is hanging above the Magic Faraway Tree and is yet to be written down and sold as a four-set, gold-embossed collection. What we do have, and I’m focusing here on open source data and software, are many tools and data sets that can bring us just that little bit closer to a rousing monologue that could change the history of how we prevent cyber-attacks in the future.
Big Data Big Complexity
For data analysts, one of the problems with data on the internet is that it comes in many forms, with many definitions and no one universal dictionary to look up in order to know for sure what a word or a phrase means. Structured Threat Information Expression, or STIX (which was developed under the sponsorship of the United States Department of Homeland Security and is used here in Australia by our own Cyber Security Centre), was created to address this issue. It’s a useful start on standardising the way we talk about cyber threat intelligence so that we are all, in fact, having the same conversation in the same language.
Some platforms, like MISP, a Malware Information Sharing Platform created by Christophe Vandeplas while he was working for the Belgian Defence Department, allow users to export the Indicators of Compromise (IOCs) that they and others share on the platform in the STIX format. This actively aids the development of a threat intelligence language that we can use to talk back to one another and to share with the various systems we all use. MISP itself is an interesting platform, with the public instance boasting more than 1000 organisational users from across the globe, including big players like Google, Apple and our own Federal Police. It’s great at gathering threat feeds that are readily usable for other machines to digest but, like every feed I’ve found to date, it tells only one part of the story of an attack or attempted attack.
To tell the whole story, human research, interpretation and reasoning are needed, along with further data and frameworks, in order to map, or make sense of, what actually happened blow by blow. This is where MITRE’s ATT&CK Framework comes in. ATT&CK describes why an action was performed and the technique used to do it, detail that is often missing from publicly released reports or write-ups that gloss over the specifics of an attack.
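To make the STIX side of this a little more concrete, here is a minimal sketch of what "the same language" can look like for a couple of raw observations. It assumes the STIX 2.1 indicator shape and uses only the Python standard library; the `PATTERN_TEMPLATES` mapping, the `to_stix_indicator` helper and the sample values are my own hypothetical names and data for illustration, not anything MISP or the standard's tooling provides.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical mapping from a loose "kind" label to a STIX pattern template.
# Values are inserted as-is, so anything containing a quote would need escaping.
PATTERN_TEMPLATES = {
    "ip": "[ipv4-addr:value = '{value}']",
    "domain": "[domain-name:value = '{value}']",
}


def to_stix_indicator(kind, value, now=None):
    """Wrap one raw observation in a STIX 2.1 style indicator dictionary."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": stamp,
        "modified": stamp,
        "valid_from": stamp,
        "pattern_type": "stix",
        "pattern": PATTERN_TEMPLATES[kind].format(value=value),
    }


# Made-up observations, as they might arrive from a log line or a port scan.
raw_observations = [("ip", "198.51.100.7"), ("domain", "example.com")]

bundle = {
    "type": "bundle",
    "id": f"bundle--{uuid.uuid4()}",
    "objects": [to_stix_indicator(kind, value) for kind, value in raw_observations],
}

print(json.dumps(bundle, indent=2))
```

Once every observation has the same fields, regardless of which feed it came from, it can sit in the same table as everything else, which is the whole point of agreeing on a language.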
MITRE have even produced a STIX version of ATT&CK so you can output the data in a standardised format.
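As a rough sketch of why that STIX version is handy: in the ATT&CK STIX data, techniques appear as `attack-pattern` objects, with the tactic (the why) recorded under `kill_chain_phases` and the technique ID under `external_references`. The snippet below groups techniques by tactic using only the standard library. The `SAMPLE_BUNDLE` is a hand-trimmed illustration rather than MITRE's real file (the intrusion set in it is invented), and `techniques_by_tactic` is a hypothetical helper name.

```python
import json

# Hand-trimmed, illustrative bundle in the shape of ATT&CK's STIX data.
SAMPLE_BUNDLE = json.loads("""
{
  "type": "bundle",
  "id": "bundle--00000000-0000-4000-8000-000000000000",
  "objects": [
    {
      "type": "attack-pattern",
      "name": "Phishing",
      "kill_chain_phases": [
        {"kill_chain_name": "mitre-attack", "phase_name": "initial-access"}
      ],
      "external_references": [
        {"source_name": "mitre-attack", "external_id": "T1566"}
      ]
    },
    {
      "type": "attack-pattern",
      "name": "Command and Scripting Interpreter",
      "kill_chain_phases": [
        {"kill_chain_name": "mitre-attack", "phase_name": "execution"}
      ],
      "external_references": [
        {"source_name": "mitre-attack", "external_id": "T1059"}
      ]
    },
    {"type": "intrusion-set", "name": "Example Group (invented)"}
  ]
}
""")


def techniques_by_tactic(bundle):
    """Group (technique ID, name) pairs under each ATT&CK tactic."""
    table = {}
    for obj in bundle.get("objects", []):
        if obj.get("type") != "attack-pattern":
            continue  # skip groups, software, relationships and so on
        attack_id = next(
            (
                ref.get("external_id")
                for ref in obj.get("external_references", [])
                if ref.get("source_name") == "mitre-attack"
            ),
            None,
        )
        for phase in obj.get("kill_chain_phases", []):
            if phase.get("kill_chain_name") == "mitre-attack":
                table.setdefault(phase["phase_name"], []).append(
                    (attack_id, obj["name"])
                )
    return table


for tactic, techniques in sorted(techniques_by_tactic(SAMPLE_BUNDLE).items()):
    print(tactic, techniques)
```

A technique can belong to more than one tactic, so the same ID may show up under several keys when this is run against a fuller data set.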
So Many Data Types So Little Time
The future of cyber analytics is now and I am excitedly working towards making the internet a more hospitable place. I would love to hear from you if you are too.
Originally published by the Australian Cyber Security Magazine.