Encrypted-at-Rest Virtual File-System in Go

One incredibly powerful feature of the Go language that hasn’t seen a lot of use is its native support for virtual file-systems. Most C-style code is written using the standard file operations that all work on passing an integer “file descriptor” around. To re-purpose that code to use anything other than the OS-provided file would involve changing every file operation call in all of the source and libraries involved.

A “file” in Go is just another set of interfaces that you can replace.

The cool thing about that is that all Go code ever written to do operations on files will work just as well if you pass it some other thing you made that implements the file-system interfaces. All you have to do is replace the standard file-system implementations you’re used to, mostly provided by the os package, with your new virtual file-system, and all the ReadFile, OpenFile, etc. function calls will stay the same.

By using a couple of existing Go libraries that do the heavy lifting, it’s possible to implement a simple read-write virtual file-system that will be encrypted and compressed on disk in only a couple pages of code, as you’re about to see.

The two libraries we’ll use are:

rainycape’s VFS library – The VFS provided by the standard library is read-only. This library adds the ability to write to the virtual file system, and also already supports importing the file-system from several different compressed archive formats. We will simply extend this to add encryption, but as of this writing, rainycape’s VFS library is the only working implementation of a Go VFS with write implemented, so it will be the likely starting point for any other Go VFS project you might attempt.

bencrypt – An encryption utility package we will use just for the AES encrypt/decrypt functions.

The first of our two pages of code is a function that simply AES decrypts a compressed file with a given key and passes the decrypted compressed file into the rainycape VFS library, which provides a full read-write implementation of a virtual filesystem decompressed into memory from a compressed file.


The second page of code is to write your changes back from memory to the disk. This does the inverse of the OpenAndDecrypt operation, it just zips the file-system back up to an archive and then AES encrypts it with the given key.

The final piece of code we’ll need to get around is a simple function that will AES encrypt any file. We can then make our starting file-system as a compressed archive and use this function to AES encrypt it to be loaded by OpenAndDecrypt.

That VFS interface returned by OpenAndDecrypt can be used just like the regular os and ioutil interfaces you’re used to getting from the standard library… and it will be easy to retro-fit your existing code to use an interface that could be a VFS.

Here’s a quick example of how you could encrypt a Zip file and then open it up as if it were the os or ioutil packages (because they both implement the same interface).

Now, before you go typing all this into your favorite IDE, I should point out that all of these functions have already been added to the bencrypt library, so you can just call them directly from there.

IMPORTANT WARNING

Also, be aware that this simple method will not encrypt your files once they have been loaded into system memory. They could be read out unencrypted by a debugger, saved in a swap file, or a tool like Volatility.

Final Thoughts

To retro-fit your existing code that uses the os package, you’ll need a defined interface that you can use that could be bound either to os or to vfs. I would suggest stealing this one from the go-vfs package as a starting point.

The other kind of odd thing about this, is that it’s almost entirely a side-effect of how Go handles interfaces. Membership in interfaces is decided at runtime, based on whether or not the required functions exist in a type to satisfy the interface. This lets you define an abstract interface for a file-system after the fact that matches what os already does, and create a virtual file-system that implements that interface as well. Your existing code just needs to be repointed to your new interface instead of to os and everything else will work, even though your interface was written way later than the os package. The same trick could allow replacing or hooking other lateral aspects of standard library functions, that might yield similarly interesting results.

PT_NOTE to PT_LOAD Injection in ELF

We’ve recently checked in a new injection method for ELF executables into binjection, based on the PT_NOTE segment. This method of infection is both really handy (read: easy to implement) and has a high success rate on ET_EXEC binaries without breaking the binary.

Understanding the techniques of binary injection is useful to both the penetration tester and the reverse engineer who is analyzing files for malware. For the pentester, having a common file such as /bin/ls execute arbitrary code each time it is run is useful in establishing and maintaining persistence. A reverse engineer who notices that an ELF binary has 3 PT_LOAD segments and no PT_NOTE segment can safely assume that the binary at the very least is suspect, and should be evaluated further.

The breakdown of the algorithm

Most ET_EXEC binaries (but not golang! [1]) have a PT_NOTE segment that can be entirely overwritten without affecting program execution. The same segment can also be transformed into a loadable segment, where executable instructions can be placed.

We will go through the algorithm next, step by step. If you need a primer on the basics of ELF files and their headers, read this article before proceeding. When you’re ready, open the binjection source code and follow along.


1. Open the ELF file to be injected:


2. Save the original entry point, e_entry:


3. Parse the program header table, looking for the PT_NOTE segment:


4. Convert the PT_NOTE segment to a PT_LOAD segment:


5. Change the memory protections for this segment to allow executable instructions:


6. Change the entry point address to an area that will not conflict with the original program execution.

We will use 0xc000000. Pick an address that will be sufficiently far enough beyond the end of the original program’s end so that when loaded it does not overlap. 0xc000000 will work on most executables, but to be safe you should check before choosing.


7. Adjust the size on disk and virtual memory size to account for the size of the injected code:


8. Point the offset of our converted segment to the end of the original binary, where we will store the new code:


9. Patch the end of the code with instructions to jump to the original entry point:


10. Add our injected code to the end of the file:


11. Write the file back to disk, over the original file:


-= We’re done! =-

Notes on system and interpreter differences of PT_NOTE

The ELF ABI specification doesn’t describe a way in which this segment must be used, therefore it is inevitable that the researcher will find binaries that use the NOTE segment in unique ways. These are sometimes system specific
(BSD varieties have a lot of possible notes), and sometimes original language specific, as in Go binaries.

Final Thoughts

PT_NOTE injection of ET_EXEC ELF binaries is a solid and tested method for the pentester. It is fast and relatively easy to implement, and will work on a high percentage of Linux binaries without modification to the algorithm.

It is probably the easiest of the injection algorithms to start out with, for those new to this area.

Understanding this method of injection is valuable to the budding reverse engineer, an easy red flag to look for in the first of many injection methods that you will encounter over time.

ELF injection is an art-form, and to people on either side of the aisle who enjoy this type of binary analysis and manipulation, the myriad of tricks to implement and detect these methods provide a vast playground with nearly
endless possibilities.

I hope that this article will be useful to those attempting to begin and advance their work in this area.

All that we are is the result of what we have thought.
— Buddhist quote

[1] golang binaries do a unique thing with the PT_NOTE segment. It contains data that must be present during execution, therefore this injection method will not work on them… more on this to follow…

Mach-O Universal / Fat Binaries

This is part two on our series on Mach-O executables, where in the first part we did a deep dive on the Mach-O file format. During the Apple-Intel transition, when they were moving from PowerPC to Intel x86_64 architecture, Apple came out with the Mach-O Universal Executable, a play on existing Fat binaries. This is one of the more interesting binaries types I’ve ever encountered. The Fat / Universal binary format acts as a wrapper for multiple Mach-O binaries of different architectures and CPU types. Meaning these binaries can run on multiple different architecture CPUs. Already having the ability to parse and write the Mach-O file structures, parsing the additional Fat headers is actually pretty simple. We’ve added both parsing and writing support for this file in our binject/debug library. Originally this was to include functionality for such 3rd party signing attacks as described in this set of posts. But legacy file types often contain many exploitable properties, and this file type may even be coming back as Apple is rumored to be moving to ARM CPUs from the Intel CPUs in the future, so let’s dig into the file format and learn some more.

Universal Binary Overview
A high level overview of a Fat / Universal Mach-O file is actually pretty simple, as the Fat headers simply act as a wrapper for the multiple Mach-O files contained within. While these can hold any number of Mach-O files within, on MacOS they are often found with two, being used to bridge between CPU transitions. Below we dive into what these Fat headers actually contain, but short to say it’s not too much. Here is an example of a Fat file structure that contains two Mach-O structures within one Universal file:

An example of the Fat / Universal Mach-O binary format

Universal Header
The Fat Mach-O file header starts with some special magic bytes, then moves on to a 4 byte arch size variable; the arch size is the length of the next dynamically sized section, the arch headers array, with one descriptor for each Mach-O structure contained within the Fat binary. Each arch header contains the same 5 properties, each 4 bytes in length. Each arch header contains the cpu and subcpu fields, which specify the architecture and cpu type the corresponding Mach-O file is for. When a Universal binary is loaded into memory this is what is checked to see what Mach-O file contained within should be loaded. Following that is the arch offset, arch size, and arch alignment, all of which specify where the corresponding Mach-O structure is located and how large it is.

N Mach-O Structures
Following the Fat binary header, there is a Mach-O structure for each of the corresponding arch headers. Each Mach-O file starts at it’s specified offset with large pads of 0 bytes between the headers and each respective Mach-O file. The Mach-O files themselves follow the exact structure outlined in the first post on Mach-Os, except for a slight difference between the reserved bytes (the i386 arch Mach-O binaries don’t seem to have the four reserved 0-bytes after the header). Simply put, these are normal Mach-O files.

Conclusion
This is an interesting file format that is still around on modern MacOS versions, and may even be coming back if Apple makes the move from Intel CPUs to ARM CPUs in the future. Ultimately, this is a pretty simple file format, but I think there is a lot of room for abuse here which we plan to experiment with in our Binjection library! Stop by soon for more interesting low level file format posts!

ahhh
Latest posts by ahhh (see all)

So You Want To Be A Mach-O Man?

Probably one of the least understood executable file formats is Mach-O, short for the Mach Object File Format, this was originally designed for NexTSTEP systems and later adopted by MacOS. When researching this file format there were a lot of documents that were misleading and the documents we did find were pretty old, prompting us to write this post. We’ve mirrored two docs (one and two) we found particularly helpful, although a little outdated. This post will also not cover Universal binaries, otherwise known as Fat Mach-O binaries, which can hold multiple Mach-O objects for different CPU architectures.

Tooling
While there are limited tools for investigating Mach-O files, there are some important ones you will want in your toolbox. One of the most useful tools you can use for parsing and visualizing a Mach-O file is MachOExplorer. otool is also a very useful native tool when investigating these files. Finally, bintriage is an interface to our custom debug library and our command line utility for investigating these files.

Intro / Background
The Mach-O file format on MacOS started with NexTSTEP CPUs targeting CMU’s Mach kernel design and Steve Jobs winning his way back into Apple’s good graces. The merger of the computing models, known as the Apple-Intel transition, gave way to multi-architecture universal file formats, which will be the subject of a follow up post. Out of those came the general Mach-O, intel 64 bit executable binary format for MacOS. That Mach-O executable file is the modern executable on MacOS (inside all Apps), and will be subject of our deep dive today.

High-Level Overview of the File Structure
Mach-O files themselves are very easy to parse, as almost every field is 4 bytes or a multiple of 4 bytes in length. Only larger abstract structures within the file format have offsets and lengths which must be observed, and even fewer data structures have required alignments. If I were to draw a diagram of the file structure based on my interactions and understanding of critical components, it would look like:

High level view of the Mach-O file structure

Mach-O Header
The header is extremely important, it helps provide data such as the magic bytes, the cpu type, and the subcpu type, which indicate the architecture and exact model cpu the binary is for. There is also the type, a 4 byte field that says whether this is an object file, a dynamic library (dylib), or an executable Mach-O file. Next is the ncmd and cmdsz fields, indicating the number of load commands and total size of the load commands. Finally there is the flags field, which is 4 bytes of bitwise flags specifying special linker options. At the end of the header is a reserved 4 byte space.

Example of a Mach-O file header

Load Commands
Probably the most important part of the Mach-O file structure are the load commands. Our debug library parses all of the critical structures required to rebuild a Mach-O file, however there are still several load commands we don’t parse but rather copy directly over. Each load command structure varies and needs to be parsed differently based on the type of load commands. Some load commands are self contained, telling the loader how to load dynamic libraries within the load command itself, where as other load commands reference other structures of data contained within the file, such as sections and tables that get loaded into segments. These are what truly represent how the file is mapped into virtual memory.

Example of some Mach-O load commands

Segments with Sections
Several segments are loaded after and corresponding to the load commands, with their respective sections. This is where a lot of the machine executable code, variables, and pointers to various functions from program code come from. Sections from the _TEXT segment and _DATA segment will almost always be present, while other segments and sections may be optional, depending on the various libraries they call and how they are compiled.

Example of some Mach-O sections

Optional Segments
Some segments are only present with certain Mach-O binary types, based on if the load commands are in place or not. An example of an optional segment would be a _DWARF segment and sections, which are used in debugging. We will likely write more on DWARF sections in a later post as there is often similar optional debug data present in ELF files.

Example of an optional _DWARF segment in a Mach-O file

LinkEdit Segment
The _LINKEDIT segment is the final segment in most Mach-O binaries and contains a number of critical components, such as the dynamic loader info, the symbol table, the string table, the dynamic symbol table, the digital signature, and potentially even more properties. This is one of the most important segments and is unfortunately left out of a lot of documentation, but make no mistake, this segment is critical for the binaries to run.

Example of some of the data in the LinkEdit Segment

Conclusion
Those are the major parts of the Mach-O file format as I understood them and worked with them to recreate working executable files. Stay tuned for a follow up post covering Fat / Universal Mach-O binaries, which act as a wrapper for multiple Mach-O binaries of different architectures. I hope this information is useful to people, and if there are questions or discussions lets talk about them in the comments and forums! If I missed anything major please let me know, and pull requests are definitely welcome on our Binject projects!

ahhh
Latest posts by ahhh (see all)

Introducing Symbol Crash

Welcome to the Symbol Crash repository of write-ups related to binary formats, injections, signing, and our group’s various projects. This all started as a revamp of the Backdoor Factory techniques and port to Go, so BDF functionality can be used as a shared library. It has since blossomed into something much more, a wellspring of cool research and a deep technical community. We recruited a number of passionate computer science and information security professionals along the way and decided to form this group to document our work. We also wanted to give back some of the neat things we were discovering and document some of the harder edge cases we came across in the process.

Most of our current projects live in the Binject organization on GitHub. We formed this blog mostly to discus the nuances of these projects and our lessons learned from these deep dives.

Binject
13 repositories, 91 followers.

We’ve divided the projects up into several libraries. These are as follows, with a short description of each:

debug
We have forked the debug/ folder from the standard library, to take direct control of the debug/elf, debug/macho, and debug/pe binary format parsers. To these parsers, we have added the ability to also generate executable files from the parsed intermediate data structures. This lets us load a file with debug parsers, make changes by interacting with the parser structures, and then write those changes back out to a new file.

shellcode
This library collects useful shellcode fragments and transforms for all platforms, similar to the functionality provided by msfvenom or internally in BDF, except as a Go library.

binjection
The Binjection library/utility actually performs an injection of shellcode into a binary. It automatically determines the binary type and format, and calls into the debug and shellcode libraries to actually perform the injection. Binjection includes multiple different injection techniques for each binary format.

bintriage
This utility is used as an extension of the debug library to provide more insight and debug information around specific binaries. It provides a verbose interface to enumerate and compare all of the features we parse in the binject/debug library.

forger
This is an experimental library to play with various binary specific code signing attacks.

We also plan to write a lot about low level file formats such as Elf, PE, Mach-O formats in the coming months, so def stop by and follow the blog for those updates. Finally, we are always looking for new members who want to join us on this journey of bits and documentation 🙂 If this resonates with you please reach out or comment.

ahhh
Latest posts by ahhh (see all)