As you know, I’m a crazy person, and I have done a lot of different computer science research projects, like developing my own RISC-based CPU and implementing my own x64-based operating system. Over the last weeks and months, I have spent a lot of time doing low-level research on the Linux kernel. This work helps me better understand how Linux works and how SQL Server works on Linux (and on Docker), and it also helps me further improve my own operating system implementation.
The Windows Subsystem for Linux (WSL) is a great help here, because it gives me a fully working Linux distribution that is seamlessly integrated into Windows itself. One of my major interests was to understand the whole compilation and linking process (static and dynamic) of Linux binaries. For that reason, I had to fully understand the ELF file format, which is used on Linux for executable files (like the PE file format on Windows systems). Linux provides a lot of different tools for analyzing the structure of ELF files, mainly readelf and objdump. The “problem” with these tools is that they are text-based command-line tools. Sometimes you also have to take a piece of information from one tool and perform further analysis with it in another tool.
Because I’m a huge believer in and a regular user of Visual Studio Code, I wanted to have that functionality directly available within Visual Studio Code. Unfortunately, I didn’t find any extension on the Visual Studio Code Marketplace that provides that feature set. Therefore, I decided to build my own extension, which I can use to analyze binary files directly within Visual Studio Code.
After two to three weeks of work (and learning TypeScript!), I released the first version of my extension to the Visual Studio Code Marketplace yesterday evening. Implementing the extension wasn’t that hard, because the ELF file format is very well documented. But trust me, there are some obscure scenarios that you must think about when parsing the file format.
The great thing about the ELF file format is that you can store any kind of CPU instructions inside the file. As you know, Linux is used on so many different devices these days, and these devices are not always based on Intel/AMD CPU architectures. The ELF file format doesn’t care about such restrictions. Therefore, I had to decide up front which CPU architectures the extension supports, because each CPU architecture is handled in a different way when parsing an ELF file. The extension currently supports the following CPU architectures:
- X86_64: Intel 64-bit
- AARCH64: ARM 64-bit
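To give an idea of how an ELF file announces its CPU architecture, here is a minimal sketch in TypeScript. It is not taken from the extension's source code: the function name `detectArchitecture` and the `Architecture` type are hypothetical. The only facts it relies on come from the ELF specification: the magic bytes, the byte-order flag at `e_ident[5]`, the 16-bit `e_machine` field at offset 18, and the machine codes 62 (x86-64) and 183 (AArch64).

```typescript
// e_machine values defined by the ELF specification.
const EM_X86_64 = 62;
const EM_AARCH64 = 183;

type Architecture = "X86_64" | "AARCH64" | "unsupported";

// Hypothetical helper for illustration only, not the extension's real API.
function detectArchitecture(bytes: Uint8Array): Architecture {
  // e_ident (16 bytes) + e_type (2 bytes) + e_machine (2 bytes)
  if (bytes.length < 20) {
    throw new Error("Too short to be an ELF file");
  }
  // Every ELF file starts with the magic bytes 0x7F 'E' 'L' 'F'.
  if (bytes[0] !== 0x7f || bytes[1] !== 0x45 || bytes[2] !== 0x4c || bytes[3] !== 0x46) {
    throw new Error("Not an ELF file");
  }
  // e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
  const littleEndian = bytes[5] === 1;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // e_machine sits at offset 18 in both 32-bit and 64-bit ELF headers.
  const machine = view.getUint16(18, littleEndian);
  switch (machine) {
    case EM_X86_64:
      return "X86_64";
    case EM_AARCH64:
      return "AARCH64";
    default:
      return "unsupported";
  }
}
```

In Node.js, the bytes could come from `fs.readFileSync(path)`, since a `Buffer` is also a `Uint8Array`.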
The extension currently parses the following information from ELF binary files:
- ELF Header
- Sections
- Segments
- Symbol Tables
- Relocation Entries
- String Tables
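Everything in this list is reached through the ELF header, so here is a rough sketch of how the 64-byte ELF64 header could be read in TypeScript. Again, this is an illustration and not the extension's actual code: `Elf64Header`, `readU64`, and `parseElf64Header` are names I made up for this example. The field offsets follow the ELF64 layout from the specification. As a simplification, 64-bit values are returned as plain numbers, so the sketch throws for values above 2^53 - 1.

```typescript
// A subset of the ELF64 header fields (hypothetical type for illustration).
interface Elf64Header {
  type: number;                 // e_type: 1 = relocatable, 2 = executable, 3 = shared object
  machine: number;              // e_machine: target CPU architecture
  entry: number;                // e_entry: virtual address of the entry point
  programHeaderOffset: number;  // e_phoff: file offset of the segment table
  sectionHeaderOffset: number;  // e_shoff: file offset of the section table
  programHeaderCount: number;   // e_phnum: number of segments
  sectionHeaderCount: number;   // e_shnum: number of sections
  sectionNameIndex: number;     // e_shstrndx: section holding the section names
}

// Reads an unsigned 64-bit value as a number; throws if it exceeds 2^53 - 1.
function readU64(view: DataView, offset: number, littleEndian: boolean): number {
  const first = view.getUint32(offset, littleEndian);
  const second = view.getUint32(offset + 4, littleEndian);
  const low = littleEndian ? first : second;
  const high = littleEndian ? second : first;
  if (high > 0x1fffff) {
    throw new Error("64-bit value exceeds Number.MAX_SAFE_INTEGER");
  }
  return high * 0x100000000 + low;
}

function parseElf64Header(bytes: Uint8Array): Elf64Header {
  if (bytes.length < 64) {
    throw new Error("Too short for an ELF64 header");
  }
  if (bytes[0] !== 0x7f || bytes[1] !== 0x45 || bytes[2] !== 0x4c || bytes[3] !== 0x46) {
    throw new Error("Not an ELF file");
  }
  // e_ident[4] (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
  if (bytes[4] !== 2) {
    throw new Error("Only 64-bit ELF files are handled in this sketch");
  }
  // e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
  const littleEndian = bytes[5] === 1;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    type: view.getUint16(16, littleEndian),
    machine: view.getUint16(18, littleEndian),
    entry: readU64(view, 24, littleEndian),
    programHeaderOffset: readU64(view, 32, littleEndian),
    sectionHeaderOffset: readU64(view, 40, littleEndian),
    programHeaderCount: view.getUint16(56, littleEndian),
    sectionHeaderCount: view.getUint16(60, littleEndian),
    sectionNameIndex: view.getUint16(62, littleEndian),
  };
}
```

With `sectionHeaderOffset` and `sectionHeaderCount` in hand, a parser can walk the section table, and `sectionNameIndex` points to the string table that holds the section names. The values should match what `readelf -h` prints for the same file.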
You can find the extension directly within Visual Studio Code by searching for “vscode-binutils” or “sqlpassion”, or on the Visual Studio Code Marketplace. Download it, give it a try, and tell me how you like it and which features you want to see in future releases. The following additional features are on my roadmap:
- Support for Windows PE files
- Support for macOS Mach-O files
- Support for 32-bit ELF files
- C++ Name Demangling
- Disassembling the .text sections (OMG???)
- …
The whole source code of the extension is also available on GitHub under the MIT license.
Thanks for your time,
-Klaus