When reverse engineering binaries, we may want, at some point, to share the recovered
information with others. The DWARF format, originally designed to hold debug
information associated with the original source code, is also well-suited for
storing reverse-engineered information such as structures and function names.
This blog post introduces a new API in LIEF extended to create DWARF files.
It also introduces two plugins for Ghidra and BinaryNinja to export binary analysis
into DWARF.
Creating DWARF with LIEF (extended)
LIEF extended now provides a comprehensive API to create DWARF files.
This API is available in Python, Rust, and C++ and it looks like this:
import lief

elf = lief.ELF.parse("./libd5A7BCF0524B8.so")

editor: lief.dwarf.Editor = lief.dwarf.Editor.from_binary(elf)
unit: lief.dwarf.editor.CompilationUnit = editor.create_compilation_unit()
unit.set_producer("Generated by LIEF (LLVM backend)")

func: lief.dwarf.editor.Function = unit.create_function("vm_set_register")
func.set_address(0x1400023)

editor.write("libd5A7BCF0524B8.dwarf")
Under the hood, LIEF uses LLVM’s DWARF backend to generate the final DWARF file.
In contrast to LLVM’s low-level API, LIEF provides an abstraction that hides
the implementation details of the DWARF format.
For instance, to describe a function that contains a stack variable
at stack offset 8, we can use the following API:
func: lief.dwarf.editor.Function = unit.create_function("vm_set_register")

var: lief.dwarf.editor.Variable = func.create_stack_variable("my_stack_variable")
var.set_stack_offset(8)
This code generates the following DWARF:
0x0000000c: DW_TAG_compile_unit
  DW_AT_producer ("Generated by LIEF (LLVM backend)")

0x00000011: DW_TAG_subprogram
  DW_AT_name ("vm_set_register")
  DW_AT_entry_pc (0x0000000001400023)

0x0000001e: DW_TAG_variable
  DW_AT_name ("my_stack_variable")
  DW_AT_location (DW_OP_fbreg -8)
Defining the DW_AT_location for a stack variable is not as simple as it sounds.
It requires building a DWARF expression. If you are curious about
the actual implementation, you can check this GitHub Gist.
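To give an idea of what such an expression looks like at the byte level, here is a minimal Python sketch that encodes `DW_OP_fbreg <offset>`. It is not LIEF's implementation: the helper names `encode_sleb128` and `fbreg_expression` are made up for illustration. It only relies on two facts from the DWARF specification: `DW_OP_fbreg` has opcode `0x91`, and it takes a single SLEB128 operand that is interpreted relative to the `DW_AT_frame_base` of the enclosing function.

```python
DW_OP_FBREG = 0x91  # DW_OP_fbreg opcode, as defined by the DWARF specification


def encode_sleb128(value: int) -> bytes:
    """Encode a signed integer as SLEB128 (illustrative helper, not a LIEF API)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: negative values stay negative
        sign_bit_set = bool(byte & 0x40)
        # Stop once the remaining bits are pure sign extension of this byte.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # continuation bit: more bytes follow


def fbreg_expression(offset: int) -> bytes:
    """Build the raw bytes of the DWARF expression `DW_OP_fbreg <offset>`."""
    return bytes([DW_OP_FBREG]) + encode_sleb128(offset)


# The expression shown in the dump above: DW_OP_fbreg -8
print(fbreg_expression(-8).hex())  # prints "9178"
```

The expression itself is short, but the attribute is only meaningful if the enclosing function also describes its frame base, which is part of what makes the real implementation more involved.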
Summary
LIEF exposes a high-level API to create DWARF, built on top of LLVM’s low-level API.
DWARF and Reverse Engineering
Reverse engineering tools typically use their own format to store information about
analyzed binaries, such as *.idb and *.bndb. Most of these tools are not compatible
with each other, except Binary Ninja, which has support for loading IDB files (Migrating from IDA).
For Ghidra, importing IDA databases is a non-goal (cf. issue #2921).
One alternative is to export binary information using BinExport or quokka,
but many tools lack support for importing the exported data.
In contrast, Binary Ninja, Ghidra, and IDA all have built-in support for loading
DWARF information, whether it is embedded in the binary or provided as an external file.
The DWARF format is primarily designed to hold information about the original
source code, and the purpose of reverse engineering is to recover the semantics
of that source code from the binary.
Therefore, we could use DWARF as a shared reverse-engineering format to
export types, functions, and variables from reverse-engineered binaries.
It’s worth noting that DWARF is compatible with PE binaries, even though this
is not the default format for storing debug information on Windows.
PE / DWARF
If you compile a Windows executable with clang[-cl] and with the flags -g -gdwarf-5,
the final PE will contain DWARF information along with an external .pdb.
Currently, Binary Ninja is the only tool with a built-in plugin that can generate
a DWARF file from a BinaryView representation. However, it lacks the ability to
export stack-based variables, which can be crucial information.
The next section introduces two plugins for Ghidra and BinaryNinja to generate DWARF
from these tools.
BinaryNinja & Ghidra Plugins
To provide some background on this feature, I initially developed the BinaryNinja DWARF exporter plugin
for my own needs, before the Vector35 team released an official plugin in BinaryNinja 3.5.
I use this plugin in my reverse engineering workflow to
symbolize QBDI traces from DWARF information:
1. I statically reverse-engineer the binary
2. I generate a DWARF file
3. I trace the binary with QBDI, which uses the DWARF file to symbolize:
   - Stack accesses (hence the need for stack variables in the DWARF)
   - Function calls and their parameters
   - Static variable accesses
4. goto 1.
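The symbolization step of this loop can be sketched in a few lines of Python. This is not the actual QBDI integration: the function table below is hypothetical data standing in for what would be read from the DWARF file (only `vm_set_register` and its address come from the example earlier in this post; the sizes and `vm_dispatch` are invented), and `symbolize` is an illustrative helper.

```python
import bisect
from typing import NamedTuple, Optional


class FunctionRange(NamedTuple):
    start: int
    size: int
    name: str


# Hypothetical function table; in practice it would be built from the
# DW_TAG_subprogram entries of the exported DWARF file.
FUNCTIONS = sorted([
    FunctionRange(0x1400023, 0x40, "vm_set_register"),  # size is invented
    FunctionRange(0x1400100, 0x80, "vm_dispatch"),      # invented entry
])
STARTS = [f.start for f in FUNCTIONS]


def symbolize(address: int) -> Optional[str]:
    """Map a traced address to `name+offset`, or None if it hits no known function."""
    # Find the last function whose start address is <= address.
    idx = bisect.bisect_right(STARTS, address) - 1
    if idx < 0:
        return None
    func = FUNCTIONS[idx]
    if address >= func.start + func.size:
        return None  # the address falls in a gap between known functions
    return f"{func.name}+{address - func.start:#x}"


for addr in (0x1400023, 0x1400030, 0x1400200):
    print(hex(addr), "->", symbolize(addr))
```

Stack and static variable accesses can be handled the same way, with the lookup keyed on the frame offset or the data address instead of the code address.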
For instance, I used this process to reverse engineer the DroidGuard VM
a few years ago.
I’ll take this blog post as an opportunity to share the DWARF associated with
my reverse engineering of the VM libd5A7BCF0524B8.so:
As mentioned earlier, this functionality has been integrated into BinaryNinja since version 3.5,
so I’ll focus more on the Ghidra plugin. For those interested in more details about the
BinaryNinja plugin, you can visit this page: LIEF – BinaryNinja
The Ghidra plugin allows us to export Ghidra’s Program information into a DWARF file.
This can be done from the Project Manager interface by selecting the DWARF format
in the export section:
You can also use this plugin from the CodeBrowser tool, by left-clicking on
the LIEF menu and selecting Export as DWARF:
The plugin is primarily written in Java (using the JNI) and you can also generate a
DWARF file from a headless Java script:
import java.io.File;

import ghidra.app.script.GhidraScript;
import lief.ghidra.core.dwarf.export.Manager;
import lief.ghidra.core.NativeBridge;

public class LiefDwarfExportScript extends GhidraScript {
  @Override
  protected void run() throws Exception {
    // Load the native LIEF library before using the exporter
    NativeBridge.init();
    Manager manager = new Manager(currentProgram);
    File output = new File("/home/romain/output.dwarf");
    manager.export(output);
  }
}
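Such a script is typically launched through Ghidra's `analyzeHeadless` launcher. The Python sketch below only assembles the command line. Every path and the project name are placeholders, and it assumes that the plugin is installed and that the program has already been imported and analyzed in the project. The `-process`, `-noanalysis`, `-scriptPath` and `-postScript` options are standard `analyzeHeadless` options.

```python
import shlex
from pathlib import Path


def build_headless_command(
    ghidra_root: Path,
    project_dir: Path,
    project_name: str,
    program_name: str,
    script_dir: Path,
    script_name: str = "LiefDwarfExportScript.java",
) -> list[str]:
    """Assemble an analyzeHeadless invocation that runs the export script."""
    return [
        str(ghidra_root / "support" / "analyzeHeadless"),
        str(project_dir),
        project_name,
        "-process", program_name,       # reuse a program already in the project
        "-noanalysis",                  # keep the existing analysis untouched
        "-scriptPath", str(script_dir),
        "-postScript", script_name,
    ]


# Placeholder values, to be adapted to the local setup.
cmd = build_headless_command(
    ghidra_root=Path("/opt/ghidra"),
    project_dir=Path("/tmp/ghidra-projects"),
    project_name="droidguard",
    program_name="libd5A7BCF0524B8.so",
    script_dir=Path("/tmp/ghidra-scripts"),
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point to a real Ghidra installation and project.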
IDA Support
I do not plan to support IDA for this functionality. However, if there is strong
demand for it, feel free to reach out to me. You can also create your own IDA
script using the Python or C++ API.
Last Word
This DWARF export functionality is still in early development, so I cannot
guarantee it is free of bugs. Additionally, the current version does not export
comments, but I plan to support this feature in the future.
The source code for the plugins is here:
Thank you for using LIEF.
Romain.