Firmware Border Binaries and Multi-Binary Taint Analysis
Have you heard of ‘border binaries’ before? I hadn’t before this week!
In general, much of software security is framed in terms of sources and sinks: untrusted data, input, influence, etc. enters from a known source, and eventually reaches a vulnerable sink. When thinking of a single binary, this concept can be fairly straightforward:
If a binary takes input via argv that eventually ends up in a system() call and leads to command execution, argv is the source, and system() is the sink.
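To make the single-binary case concrete, here is a minimal Python sketch of the same pattern. The scenario above reads like a C program, so treat this as an analogy: sys.argv stands in for argv and os.system() for system(). The function names build_ping_command and build_ping_command_quoted are hypothetical, and nothing is actually executed, so the injection is visible in the command string itself.

```python
import shlex
import sys


def build_ping_command(host: str) -> str:
    """Vulnerable: attacker-controlled text is pasted straight into a shell command."""
    return "ping -c 1 " + host


def build_ping_command_quoted(host: str) -> str:
    """Safer: shlex.quote makes the shell treat the input as one literal argument."""
    return "ping -c 1 " + shlex.quote(host)


if __name__ == "__main__":
    # Source: the first command-line argument.
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    # Sink (deliberately not called here): os.system(build_ping_command(host))
    # would run whatever follows a ';' in the input as a second command.
    print(build_ping_command(host))
    print(build_ping_command_quoted(host))
```

Passing something like "127.0.0.1; id" shows the difference: the first string contains a second shell command, while the quoted version keeps the whole input as a single argument to ping.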
But what about when thinking of an entire embedded system, rather than just a single binary?
This is where the concept of border binaries comes into play!
What are Border Binaries?
Border binaries are the files on a system that ingest input into the system. This could be in the form of direct user input, packets being received, or even drivers for peripheral devices that read externally-controlled data. From here, this data is passed inwards into the device - it may be written to a database, placed on the filesystem, stored in RAM, etc.
In this scenario, the eventual “sink” may be in a completely different binary on the system. Perhaps you used a router’s web server to write an exploit to a temp file, and the exploit isn’t triggered until an automated cronjob on the system finally interacts with the file at midnight. This is a perfect example of a multi-binary source and sink combination!
In these scenarios, identifying the ‘border binaries’ on an embedded system is an important part of the threat modeling process. You want to understand where user-controlled or potentially-malicious data first enters an embedded system, and eventually, what that data can touch!
Multi-Binary Taint Analysis
These concepts are why I’ve recently been diving down the rabbit hole of multi-binary taint analysis! The idea is to determine what areas of a system ingest potentially-dangerous information, and then determine what other binaries may interact with that now-tainted information.
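As a rough illustration of that idea, the sketch below treats each binary as a node that writes to and reads from shared channels (files, NVRAM keys, and so on), then walks outward from the border binaries. Everything here is a made-up assumption for illustration: the binary names, the channels, and the FLOWS table itself, which in practice would have to be recovered by static analysis of each binary. It is not how any of the published tools work internally.

```python
from collections import deque

# Hypothetical model: the shared channels each binary writes and reads.
FLOWS = {
    "httpd":       {"writes": {"/tmp/upload.cfg"}, "reads": set()},
    "cron_backup": {"writes": {"nvram:backup_path"}, "reads": {"/tmp/upload.cfg"}},
    "backupd":     {"writes": set(), "reads": {"nvram:backup_path"}},
    "ledd":        {"writes": set(), "reads": {"/etc/led.conf"}},
}


def tainted_binaries(flows, border):
    """Map every binary reachable from a border binary to the path that taints it."""
    paths = {name: [name] for name in border}
    queue = deque(border)
    while queue:
        current = queue.popleft()
        for channel in sorted(flows[current]["writes"]):
            for other, io in flows.items():
                # A binary becomes tainted when it reads a channel that an
                # already-tainted binary writes.
                if other not in paths and channel in io["reads"]:
                    paths[other] = paths[current] + [channel, other]
                    queue.append(other)
    return paths


if __name__ == "__main__":
    for name, path in tainted_binaries(FLOWS, ["httpd"]).items():
        print(name, "<-", " -> ".join(path))
```

In this toy model, data entering through httpd can reach backupd two hops later, while ledd never touches a tainted channel and can be deprioritized.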
When analyzing a complex firmware image from a completely black-box perspective, threat modeling can often be a major chore. “What the heck are all of these random system services? Which ones actually ingest data, and which are just managing random internal processes?” The ability to quickly determine which binaries on a given system are responsible for ‘border’-type tasks can narrow your scope considerably when reverse engineering or looking for vulnerabilities, without requiring that you manually investigate every single binary on a system!
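One crude way to get a first list of candidates is to scan an extracted root filesystem for ELF files that mention input-related libc function names. The sketch below is only that: a byte-string heuristic, assuming the firmware has already been unpacked to a directory. The INPUT_SYMBOLS list and the function name find_border_candidates are my own assumptions, and a real tool would parse the dynamic symbol table properly instead of searching raw bytes.

```python
import os

# Assumed indicators: libc functions that often mean "reads outside data".
INPUT_SYMBOLS = [b"recv", b"recvfrom", b"accept", b"listen", b"getenv"]
ELF_MAGIC = b"\x7fELF"


def find_border_candidates(rootfs):
    """Map each ELF file under rootfs to the indicator names found in its bytes."""
    candidates = {}
    for dirpath, _dirs, files in os.walk(rootfs):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue
            if not data.startswith(ELF_MAGIC):
                continue
            # Symbol names sit NUL-terminated in string tables, so require a NUL
            # on both sides to avoid matching "recv" inside "recvfrom".
            hits = [s.decode() for s in INPUT_SYMBOLS if b"\x00" + s + b"\x00" in data]
            if hits:
                candidates[os.path.relpath(path, rootfs)] = hits
    return candidates


if __name__ == "__main__":
    import sys
    for path, hits in sorted(find_border_candidates(sys.argv[1]).items()):
        print(path, ", ".join(hits))
```

This will miss statically linked or stripped binaries and will flag plenty of false positives, but it is a cheap way to shrink the list before opening anything in a disassembler.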
This is a concept I find extremely interesting, and something I’m hoping to eventually write a tool to assist with. But for now, all I have is reading material!
My Favorite Papers
I’ve included a few of my favorite papers on the topic below. If you come across any other interesting resources on this topic, send them my way!
Karonte: Detecting Insecure Multi-binary Interactions in Embedded Firmware
Operation Mango: Scalable Discovery of Taint-Style Vulnerabilities in Binary Firmware Services
MIRAGE: Multi-Binary Image Risk Assessment with Attack Graph Employment
Identifying Multi-Binary Vulnerabilities in Embedded Firmware at Scale
A Vulnerability Scanning Method for Web Services in Embedded Firmware
DTaint: Detecting the Taint-Style Vulnerability in Embedded Device Firmware