Endpoint Detection and Response (EDR) Systems are NOT Enough

In a time full of ransomware and Advanced Persistent Threat (APT) incidents, detecting these attacker groups has become increasingly important. Some years ago, the best tools and techniques for security incident detection and response consisted of a SIEM system fed with logs from IPS/IDS systems, proxies, firewalls, AV software and so on. In recent years, a component that is in my personal opinion increasingly relevant has been added: “Endpoint Detection and Response” (EDR) systems and/or features. The features of those EDR systems include live monitoring of endpoints, data analysis, threat detection and blocking as well as threat-hunting capabilities. In both penetration tests and red-team engagements, these systems can make it difficult to use public offensive security tooling, as it is more often detected and blocked. However, these systems have a weakness which allows attackers to bypass the protection. In this blog post I’m gonna summarize all EDR bypass methods I have found so far. The list of tools/techniques may not be exhaustive, but it should give a good overview and, if necessary, a better understanding of how to use them.

We must move away from implementation of Endpoint Detection and Response (EDR) Systems and start seriously investing in a Zero Trust Architecture; in which no one single entity can truly be trusted. This post will not discuss such mitigation measures, as that is another blog post waiting to happen.

Introduction

All of you who follow the Offensive-Security community will have come across the terms Userland hooking, Syscalls, P/Invoke, D/Invoke and so on again and again over the last two years. I myself came across several blog posts and tools which I didn’t fully understand. I sometimes had the feeling that I needed to build up my knowledge from scratch. As I did not need those “new” techniques in many cases, I postponed studying these topics for some months.

Due to the increasing number of security incidents, more and more companies are building up a Security Operations Center (SOC) or Computer Emergency Response Team (CERT). Another term is the “Cyber Defense Center”. The main purpose of these units is to analyse emerging security incidents and to identify and block potential attackers. EDR systems are increasingly being deployed and used for analysis here in addition to the SIEM. As a result, the EDR bypass topics have become more and more relevant for us Offensive-Security guys. Long story short: I now had to dig into those topics myself to be able to reproduce and use the public techniques. And I thought the best way to motivate myself was to write a blog post about the topic. The tools and techniques themselves are much older than the references I’m gonna refer to in this post. They were already actively used by malware in the wild before. This blog post is a summary of the tools/techniques I found in public. I highly recommend that you read all the other blog posts linked here. They contain way more information and background knowledge. Before we dive into the main topic, we have to take a look at some Windows Operating System architecture basics as well as a small part about assembler code. Feel free to skip that part.

Assembler code

If you are writing a program, regardless of the programming language, you will most likely use a compiler to build the program from the corresponding source code. The source code is basically translated to Machine Language, which is in the very end binary code like 01010011 00110011 01100011 01110101 01110010 00110011, which can be directly executed by a CPU:

Some compilers, like gcc for example, produce assembler code before translating to Machine Code. Assembler instructions have a 1-to-1 mapping to Machine Code. So this is the closest you get to Machine Code, and it looks for example like this:

By disassembling a program with IDA Pro or Ghidra you can also get assembler code back from an already compiled program.

Windows OS architecture

Programmers typically don’t want to reinvent the wheel, so basic functions are imported from existing libraries. For example, printf() is declared in the header stdio.h and implemented in the C standard library. Windows developers likewise use an application programming interface (API) that can be imported into a program. The so-called Win32 API is documented and consists of several library files (DLL files) located in the C:\windows\system32\ folder, for example kernel32.dll, User32.dll and so on:

NTDLL.dll is not part of the Win32 API and is not officially documented.

User-mode / Kernel-mode

The Windows OS has two different privilege levels, which were implemented to protect the Operating System from, for example, crashes caused by installed applications. All applications installed on a Windows system run in the so-called User-mode. The kernel and device drivers run in the so-called Kernel-mode. Applications in User-mode cannot access or manipulate memory in Kernel-mode. On 64-bit Windows, Kernel Patch Protection prevents AV/EDR vendors from patching kernel code and structures, so they place their API call monitoring in User-mode instead. The very last instance in User-mode are the Native API functions from NTDLL.dll. When one of these functions issues a system call, the CPU switches to Kernel-mode, where this User-mode monitoring no longer sees anything. The NTDLL.dll functions that perform this transition are called Syscalls.

Why should I care?

Where do we, as simulated attackers, need or use the Windows API? If we, for example, want to write specific bytes, such as shellcode, into a process we can import WriteProcessMemory from the file kernel32.dll with the following C#-Code snippet:

[DllImport("kernel32.dll", SetLastError = true)]
static extern bool WriteProcessMemory(IntPtr hProcess, IntPtr lpBaseAddress, byte[] lpBuffer, uint nSize, out UIntPtr lpNumberOfBytesWritten);

One example of how to write shellcode into a remote process using kernel32.dll functions can be found here.

Another thing most of us make heavy use of are PE-Loaders. In most situations we like to stay in memory with our implants as long as possible, to not leave any traces on disk and for AV-Evasion. So Mimikatz or any other tooling written in C has to be loaded from memory, which is done via PE-Loaders. PowerSploit’s Invoke-ReflectivePEInjection or Casey Smith’s C# PE-Loader make heavy use of Windows API functions like CreateRemoteThread, GetProcAddress and CreateThread from kernel32.dll.

Last but not least – depending on which Command & Control framework you are using – most of them use Windows API functions for their modules.

But the functions included in the Win32 API files like kernel32.dll, User32.dll and so on do not talk to the kernel themselves; they are instead mapped to other functions from the native API in NTDLL.dll. For example, WriteProcessMemory from kernel32.dll resolves to NtProtectVirtualMemory -> NtWriteVirtualMemory -> NtProtectVirtualMemory from NTDLL.dll. The first Syscall, NtProtectVirtualMemory, sets new permissions on the target memory and makes it writable, the second one, NtWriteVirtualMemory, actually writes the bytes, and the third call restores the old permissions.

NTDLL.dll, the Native API, is therefore the last instance in front of the operating system.
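The call sequence described above can be pictured with a toy model. This is only a sketch in plain Python: the class Region and the lower-case function names are invented for illustration, nothing here touches the real Windows API, and the real WriteProcessMemory does more than this.

```python
# Toy model only: plain Python objects stand in for process memory and for the
# NTDLL.dll functions named in the text. Nothing here calls the real Windows API.

class Region:
    """A simulated memory region with a protection string such as "R" or "RW"."""
    def __init__(self, size, protection="R"):
        self.data = bytearray(size)
        self.protection = protection

calls = []  # records the order of the simulated native calls

def nt_protect_virtual_memory(region, new_protection):
    """Stand-in for NtProtectVirtualMemory: swap the protection, return the old one."""
    calls.append("NtProtectVirtualMemory")
    old = region.protection
    region.protection = new_protection
    return old

def nt_write_virtual_memory(region, offset, payload):
    """Stand-in for NtWriteVirtualMemory: fails if the region is not writable."""
    calls.append("NtWriteVirtualMemory")
    if "W" not in region.protection:
        raise PermissionError("region is not writable")
    region.data[offset:offset + len(payload)] = payload

def write_process_memory(region, offset, payload):
    """Simulated kernel32.dll wrapper: make writable, write, restore."""
    old = nt_protect_virtual_memory(region, "RW")
    try:
        nt_write_virtual_memory(region, offset, payload)
    finally:
        nt_protect_virtual_memory(region, old)

region = Region(8)
write_process_memory(region, 2, b"AB")
print(calls)
print(region.protection, region.data.hex())
```

The point of the model is that one documented Win32 call fans out into several native calls, and each of those is a place where monitoring can sit.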

Userland Hooking

As the NTDLL.dll functions are the last instance that can be monitored for suspicious activity from attackers or malware, AV/EDR vendors typically do exactly that. They inject a custom DLL file into every new process. You can find the DLL files that AV/EDR vendors loaded into a process with, for example, Sysinternals procexp64.exe. You need to enable Show Lower Pane in the View menu and afterwards select the option to show loaded DLLs:

After selecting your preferred process you will see the loaded DLL files in the Lower Pane view section. In this case we see the DLL files loaded by McAfee AV for a cmd.exe:

Powershell.exe has much more injected DLLs from McAfee, most likely because it’s monitored for many more use-cases.

As you can see, there are three DLL-files injected by McAfee and one is called “Thin Hook Environment” – most likely the DLL that monitors Windows API calls.

So, these loaded DLL files monitor the process into which they are injected for specific Windows API calls. In my last blog posts I wrote about AV-Evasion in the form of signature changes, encryption and decryption at runtime and so on. If we encrypt our shellcode and decrypt it at runtime to write it into a remote process, we can call WriteProcessMemory, which under the hood calls NtWriteVirtualMemory at some point. One possible goal of an AV/EDR vendor is to see what exactly an attacker loads into memory at runtime. So they can monitor NtWriteVirtualMemory calls. But how is this “monitoring” done?

If a program uses a function like NtWriteVirtualMemory from ntdll.dll, a copy of ntdll.dll is mapped into the memory of the process. The AV/EDR vendors typically manipulate this in-memory copy and add their own code to specific functions, like NtWriteVirtualMemory. When the function is called by the program, the AV/EDR’s additional code is executed first, which in the case of NtWriteVirtualMemory for example analyses the bytes that are about to be written into the remote process. By using this technique, they can see the cleartext shellcode bytes, because they are already decrypted at this moment. The AV/EDR vendors’ technique of embedding their own code in memory by patching API functions is called Userland-Hooking.
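To make the idea of a patched in-memory copy more concrete, here is a minimal Python sketch that looks at the first bytes of a function and reports whether they still look like a clean x64 syscall stub. The pattern 4C 8B D1 B8 (mov r10, rcx followed by mov eax, number) is what unhooked Nt* stubs typically start with on 64-bit Windows 10. The function name looks_hooked and the sample bytes are made up for illustration, and real EDR hooks may look different.

```python
# Typical first four bytes of an unhooked x64 ntdll syscall stub:
#   4C 8B D1    mov r10, rcx
#   B8 ...      mov eax, <syscall number>
CLEAN_PROLOGUE = bytes.fromhex("4C8BD1B8")

def looks_hooked(stub: bytes) -> bool:
    """Return True if the stub does not start with the expected clean prologue."""
    return not stub.startswith(CLEAN_PROLOGUE)

# Illustrative sample data (simplified, not dumped from a real system):
clean_stub = bytes.fromhex("4C8BD1B8180000000F05C3")
hooked_stub = bytes.fromhex("E9FBFF0F00CCCCCC0F05C3")  # starts with E9 = JMP rel32

print(looks_hooked(clean_stub))   # False
print(looks_hooked(hooked_stub))  # True
```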

By loading a custom Invoke-Mimikatz version, like I did in my second blog post Bypass AMSI by manual modification part II, with Defender enabled on a system, the in-memory scanner catches Mimikatz in memory after decryption and PE-loading. If you take a look at the code again, the decryption is done first, and the PE-Loader runs afterwards. We know now that the PE-Loader makes several potentially suspicious Windows API calls. Those calls trigger the in-memory scanner. So avoiding the calls will result in no memory scan at all.

The Userland-Hooking bypass techniques made public so far that I’m aware of are removing the hook somehow, re-patching it in memory, patching the AV/EDR’s DLL, or avoiding the hooked Windows API functions altogether by using direct Syscalls.

Patching the patch

There were blog posts by @SpecialHoang and MDsec in the beginning of 2019 explaining how to bypass AV/EDR software by patching the patch:

If your implant or tool loads some functions from kernel32.dll or NTDLL.dll, a copy of the library file is loaded into memory. The AV/EDR vendors typically patch some of the functions in the in-memory copy and place a JMP assembler instruction at the beginning of the code to redirect the Windows API function to some inspecting code from the AV/EDR software itself. So before the real Windows API function code is called, an analysis is done. If this analysis finds no suspicious/malicious behaviour and returns a clean result, the original Windows API function is called afterwards. If something malicious is found, the Windows API call is blocked or the process is killed. I stole a nice picture from ired.team, which may help to understand the process:

Both blog posts focus on bypassing the EDR software CylancePROTECT and build PoC code for this specific software. By patching out the additional JMP instruction in the manipulated in-memory NTDLL.dll, the analysis code of Cylance is never executed at all. Therefore no detection or blocking can take place:

One disadvantage of this technique is that you may have to change the patch for every AV/EDR vendor. It is not very likely that they all place an additional JMP instruction in front of the same functions at the same point. They will most likely hook different functions and maybe use another location for their patch. If you already know which AV/EDR solution is in place in your target environment, you can use this technique and you will be fine bypassing the protection by patching the patch.

I also found a repo containing PDF files with AV/EDR vendors and their corresponding hooked Windows API functions. Take a look at it here if you’re interested:
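Where a hook leads differs per vendor, so it helps to be able to read the redirect itself. The sketch below assumes the hook is the common 5-byte relative jump (opcode E9 followed by a signed 32-bit little-endian offset), whose target is the address of the instruction plus 5 plus the offset. The function name and the addresses are hypothetical, and vendors may use other jump encodings.

```python
import struct

def jmp_rel32_target(address: int, code: bytes):
    """Return the jump target if code starts with E9 rel32, otherwise None."""
    if len(code) < 5 or code[0] != 0xE9:
        return None
    (rel,) = struct.unpack_from("<i", code, 1)      # signed little-endian offset
    return (address + 5 + rel) & 0xFFFFFFFFFFFFFFFF  # relative to the next instruction

# Hypothetical example: a function at 0x7FFA10001000 whose first bytes were
# replaced with E9 FB FF 0F 00.
target = jmp_rel32_target(0x7FFA10001000, bytes.fromhex("E9FBFF0F00"))
print(hex(target))  # 0x7ffa10101000
```

Comparing such a target against the address ranges of the loaded modules shows which injected DLL a hook belongs to.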

https://github.com/D3VI5H4/Antivirus-Artifacts

Outflank’s Dumpert and direct system calls

Outflanknl released a tool called Dumpert together with a blog post on June 19, 2019, in which they explain the use of direct system calls to bypass Userland-Hooking. I will not cover all details from the blog post but only sum up the most important facts needed to understand this topic. The goal of the technique used here is to not load any functions from ntdll.dll at runtime, but instead to call them directly with the corresponding assembler code. By disassembling the ntdll.dll file it’s possible to get the assembler code for every single function it contains.
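What the disassembly gives you per function is mostly one number. As a rough sketch, assuming the usual x64 stub layout (mov r10, rcx; mov eax, number; syscall), the syscall number can be read from bytes 4 to 7 of a clean stub. The sample bytes below are a simplified stub written by hand for this example, not a dump from a specific Windows build.

```python
import struct

def syscall_number(stub: bytes):
    """Return the syscall number from a clean x64 stub, or None if the pattern does not match."""
    # 4C 8B D1 = mov r10, rcx ; B8 imm32 = mov eax, imm32
    if len(stub) >= 8 and stub[:4] == bytes.fromhex("4C8BD1B8"):
        return struct.unpack_from("<I", stub, 4)[0]
    return None

# Simplified example stub: mov r10, rcx; mov eax, 0x18; syscall; ret
stub = bytes.fromhex("4C8BD1B8180000000F05C3")
print(hex(syscall_number(stub)))  # 0x18
```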

One problem here is that the assembler code differs at some points between Windows OS versions and sometimes even between service pack/build numbers. Google Project Zero did some research about the differences, so they can be looked up on the linked website. By embedding the assembler code for all OS versions it’s possible to check for the underlying operating system at runtime and choose the correct assembler code for the needed Windows API function. Assembler code can be embedded in C projects via Visual Studio by using ASM files. The Dumpert project therefore uses an ASM file, which contains all necessary Windows API functions in assembler code for each Windows version:

https://github.com/outflanknl/Dumpert/blob/master/Dumpert-DLL/Outflank-Dumpert-DLL/Syscalls.asm

To use this technique you need to know the exact NTDLL.dll functions needed for your project and extract the corresponding assembler code for them via disassembling. Afterwards you need to build an ASM-file containing all different offsets for different Windows OS-Versions. Sounds complicated.
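Conceptually, the per-version part boils down to a lookup table keyed by Windows build. The following sketch only illustrates that idea: NtExampleFunction and the numbers in the table are invented placeholders, not real syscall numbers, and Dumpert itself does this in assembler rather than in Python.

```python
# Placeholder values only: "NtExampleFunction" and its numbers are invented for
# this sketch and are not real syscall numbers.
SYSCALL_TABLE = {
    "NtExampleFunction": {
        10240: 0x100,  # hypothetical number on build 10240
        19041: 0x105,  # hypothetical number on build 19041
    },
}

def resolve_syscall(name: str, build: int) -> int:
    """Return the embedded syscall number for this function and Windows build."""
    try:
        return SYSCALL_TABLE[name][build]
    except KeyError:
        raise LookupError(f"no embedded syscall number for {name} on build {build}") from None

print(hex(resolve_syscall("NtExampleFunction", 19041)))  # 0x105
```

A build that is missing from the table makes the lookup fail, which is exactly why such a binary has to be rebuilt when a new Windows version appears.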

Using this technique also has some disadvantages:

Your binary will not work anymore whenever a newer Windows version is released. That’s because the assembler code for each function has to be changed again. So you need to build a new implant/tool version whenever changes are released by Microsoft.
Disassembling all the Windows API functions is a lot of effort and needs a lot of time/work.

But using this technique enables us to bypass Userland-Hooking in general. The technique is independent of the vendor in place. None of them will see any Windows API function imports or calls at all. No function imports -> no patch/hook by the AV/EDR software -> stealth/bypass.

SysWhispers

With the release of the tool SysWhispers it became much easier to create custom ASM files with the corresponding C header files. The manual overhead of disassembling ntdll.dll is gone. Building the ASM and header file became straightforward by executing a single Python script:

~1 Month ago SysWhispers2 was released, which reduces the size of ASM-files and makes use of randomized function name hashes on each generation. The first version will be deprecated in the future so you should use the supported version 2.
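The idea of randomized function name hashes can be sketched in a few lines. This is not SysWhispers2’s actual hash function, just a simple seeded 32-bit hash made up for illustration: with a new random seed on every generation run, the same function name turns into a different constant each time.

```python
import random

MASK = 0xFFFFFFFF

def name_hash(name: str, seed: int) -> int:
    """Hypothetical seeded 32-bit hash: rotate right by 8 bits, then add each byte."""
    h = seed & MASK
    for byte in name.encode("ascii"):
        h = ((h >> 8) | (h << 24)) & MASK  # rotate right by 8
        h = (h + byte) & MASK
    return h

seed = random.randrange(1, 1 << 32)  # a new seed for every generation run
for name in ("NtAllocateVirtualMemory", "NtWriteVirtualMemory"):
    print(f"{name} -> 0x{name_hash(name, seed):08X}")
```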

Dumpert, Syswhispers and Syswhispers2 currently only support x64 Syscalls. If you need x86 Syscalls, there is SysWhispers2_x86 just released on Github.

If you don’t want to write your tooling/implant in C, you can also get your hands dirty with NimlineWhispers, which builds the ASM file and the header file for Nim code. @ajpc500 also wrote a good blog post about how to use NimlineWhispers for Shellcode Injection via Nim. Check the blog post out here. I played with the Nim syscall Shellcode Injection PoCs myself and it works like a charm! Be aware that using the default NTDLL.dll function names will result in a binary containing them in cleartext, visible via any hex editor:

Talking with @IKalendarov about NimlineWhispers, he found that Windows Defender with Cloud protection enabled executes the Shellcode successfully but throws an alert stating Defensive Evasion detected afterwards:

I found that this detection can easily be bypassed by renaming the Windows API functions in the ASM file and of course also in the shellcode injection code. NtAllocateVirtualMemory becomes NtAVM, for example, and so on. If your shellcode itself or the code behind it contains any Windows API function imports, this can be detected again. So the shellcode loader and the shellcode itself should use Syscalls to stay undetected by Userland-Hooks.
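Whether the default function names are still visible in a compiled binary can be checked with a few lines of Python instead of a hex editor. This is a rough sketch: the regular expression is my own guess at what a Native API name looks like (Nt or Zw followed by a CamelCase word), and the sample bytes stand in for a real file.

```python
import re

# Rough pattern for Native API style names such as NtWriteVirtualMemory.
NATIVE_NAME = re.compile(rb"(?:Nt|Zw)[A-Z][a-z]+[A-Za-z]*")

def find_native_names(data: bytes):
    """Return the sorted, unique Nt*/Zw* style names found in the given bytes."""
    return sorted({match.decode("ascii") for match in NATIVE_NAME.findall(data)})

# Stand-in for the content of a compiled binary; for a real file use
# open(path, "rb").read() instead.
sample = b"\x00\x01NtAllocateVirtualMemory\x00\xffNtWriteVirtualMemory\x00NtAVM\x00"
print(find_native_names(sample))
```

In this sample the two default names are reported, while the shortened name NtAVM does not match the pattern anymore.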

P/Invoke to D/Invoke

@TheRealWover released a C# library called D/Invoke. It was first added to SharpSploit, but later on TheWover released a NuGet package ready for import in any Visual Studio project here. There is also a corresponding blog post from June 2020. If you are mostly coding in C#, this is actually the easiest way for you to go for Userland-Hooking bypasses. I’m just gonna pick small parts out of TheWover’s post, as this blog post would explode if I explained everything. If you are new to this topic, his blog post may be a little bit too “heavy”. I didn’t understand half of it when reading it the first time. @Jean_Maes_1994 released a blog post which sums up all techniques used via D/Invoke here. The resulting PoC code DInvisibleRegistry can be used to look up different D/Invoke implementation methods and is in my opinion really useful and understandable.

P/Invoke is basically the default way of statically importing API calls from a Windows library file. The WriteProcessMemory import from kernel32.dll shown above is the P/Invoke approach. With this method, your calls go through the in-memory copies of Windows library files like NTDLL.dll, which AV/EDR systems are able to patch.

D/Invoke, in comparison to P/Invoke, loads a Windows API function manually at runtime and calls the function using a pointer to its location in memory. At the time of writing, the manual loading of a library file at runtime is not detected by AV/EDR hooks, so they don’t patch the freshly imported functions, which stay original without hook/patch.

There are three different methods to avoid Userland-Hooking via D/Invoke:

Manual Mapping – this method loads a full copy of the target library file into memory. Any functions can be exported from it afterwards.

DInvoke.Data.PE.PE_MANUAL_MAP mappedDLL = new DInvoke.Data.PE.PE_MANUAL_MAP();
mappedDLL = DInvoke.ManualMap.Map.MapModuleToMemory(@"C:\Windows\System32\ntdll.dll");

OverloadMapping – in addition to Manual Mapping the payload stored in memory is backed by a legitimate file on disk. So the payload will appear to be executed from a legitimate, validly signed DLL on disk.

DInvoke.Data.PE.PE_MANUAL_MAP mappedDLL = DInvoke.ManualMap.Overload.OverloadModule(@"C:\Windows\System32\ntdll.dll");

Syscalls – using this technique not the whole target library is mapped to memory but only a specified function is extracted from it. This method therefore offers more stealth than Manual Mapping.

IntPtr pAllocateSysCall = DInvoke.DynamicInvoke.Generic.GetSyscallStub("NtAllocateVirtualMemory");
NtAllocateVirtualMemory fSyscallAllocateMemory = (NtAllocateVirtualMemory)Marshal.GetDelegateForFunctionPointer(pAllocateSysCall, typeof(NtAllocateVirtualMemory));

For each of the three methods you also need to create unmanaged delegates for every Windows API function in your code. I won’t cover the whole process here as you can just read the linked blog posts from @TheRealWover or @Jean_Maes_1994.

Initially I planned to show how to port a P/Invoke CreateRemoteThread C# shellcode injection PoC into a D/Invoke Syscall version. I was fiddling around with all the NTDLL.dll functions needed, like NtOpenProcess, NtAllocateVirtualMemory, NtWriteVirtualMemory and NtCreateThreadEx, but was unfortunately not able to get my shellcode execution working. This was because I had never used those NTDLL.dll functions before and struggled hard with the questions “which value should be placed in which function argument” and “which kernel32.dll function resolves to which ntdll.dll function”, and had a brainfuck many evenings trying to get this to work. In parallel I confronted the awesome @_RastaMouse with all my questions about it. It took only a few days and he published a whole blog post covering exactly this topic:

https://offensivedefence.co.uk/posts/dinvoke-syscalls/

So there is no need to show this a second time ¯\_(ツ)_/¯. I got my PoC working with the information from his blog post. Just read it yourself.

NTDLL.dll unhooking in C++ or Nim

We learned that AV/EDR systems hook specific functions of NTDLL.dll to place their own analysis code in them. There is a nice and short article on ired.team which explains how to map a fresh copy of NTDLL.dll from disk into memory and copy the .text section of the fresh copy over the .text section of the hooked copy in memory, so that the hook is undone by overwriting it:

C++ PoC code for the unhooking process as well as a step-by-step guide is also included. Go ahead and read it if you haven’t so far.
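The core of the unhooking process is a plain copy of one section from the clean image over the hooked one. The sketch below shows only that copy step on two small byte arrays. The images, the section address 0x10 and the size 0x0B are made up for this example; the real PoC reads these values from the PE section headers and has to change memory protections first.

```python
def restore_section(hooked_image: bytearray, clean_image: bytes,
                    virtual_address: int, virtual_size: int) -> int:
    """Overwrite one section of hooked_image with the clean bytes.

    Returns the number of bytes that differed before the copy.
    """
    start, end = virtual_address, virtual_address + virtual_size
    changed = sum(a != b for a, b in zip(hooked_image[start:end], clean_image[start:end]))
    hooked_image[start:end] = clean_image[start:end]
    return changed

# Made-up 32-byte "images": 16 header bytes, an 11-byte code section, padding.
clean = bytes(0x10) + bytes.fromhex("4C8BD1B8180000000F05C3") + bytes(5)
hooked = bytearray(clean)
hooked[0x10:0x15] = bytes.fromhex("E9FBFF0F00")  # simulated 5-byte JMP hook

print(restore_section(hooked, clean, 0x10, 0x0B))  # 5
print(bytes(hooked) == clean)                      # True
```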

Again – if someone is not that familiar with C/C++ coding – I recently played with OffensiveNim and the OffensiveNim repo contains a template named clr_host_cpp_embed_bin.nim in which we can embed pure C++ code. We can take this template and embed the C++ PoC from the ired.team website into it and we have a working NTDLL.dll unhooking PoC in Nim:


when not defined(cpp):
    {.error: "Must be compiled in cpp mode"}
# Stolen from https://www.ired.team/offensive-security/defense-evasion/how-to-unhook-a-dll-using-c++

{.emit: """
#include <Windows.h>
#include <winternl.h>
#include <psapi.h>
#include <cstring>

// Overwrites the .text section of the loaded (possibly hooked) ntdll.dll
// with the .text section of a fresh copy mapped from disk.
// Returns 1 on success and 0 if the fresh copy could not be mapped.
int test()
{
    HANDLE process = GetCurrentProcess();
    MODULEINFO mi = {};
    HMODULE ntdllModule = GetModuleHandleA("ntdll.dll");

    // Base address of the ntdll.dll image already loaded into this process
    GetModuleInformation(process, ntdllModule, &mi, sizeof(mi));
    LPVOID ntdllBase = (LPVOID)mi.lpBaseOfDll;

    // Map a fresh copy from disk as an image (SEC_IMAGE), so that its
    // sections are laid out at their virtual addresses
    HANDLE ntdllFile = CreateFileA("c:\\windows\\system32\\ntdll.dll", GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
    if (ntdllFile == INVALID_HANDLE_VALUE) return 0;
    HANDLE ntdllMapping = CreateFileMapping(ntdllFile, NULL, PAGE_READONLY | SEC_IMAGE, 0, 0, NULL);
    if (ntdllMapping == NULL) { CloseHandle(ntdllFile); return 0; }
    LPVOID ntdllMappingAddress = MapViewOfFile(ntdllMapping, FILE_MAP_READ, 0, 0, 0);
    if (ntdllMappingAddress == NULL) { CloseHandle(ntdllMapping); CloseHandle(ntdllFile); return 0; }

    PIMAGE_DOS_HEADER hookedDosHeader = (PIMAGE_DOS_HEADER)ntdllBase;
    PIMAGE_NT_HEADERS hookedNtHeader = (PIMAGE_NT_HEADERS)((DWORD_PTR)ntdllBase + hookedDosHeader->e_lfanew);

    for (WORD i = 0; i < hookedNtHeader->FileHeader.NumberOfSections; i++) {
        PIMAGE_SECTION_HEADER hookedSectionHeader = (PIMAGE_SECTION_HEADER)((DWORD_PTR)IMAGE_FIRST_SECTION(hookedNtHeader) + ((DWORD_PTR)IMAGE_SIZEOF_SECTION_HEADER * i));

        // Only the code section is restored
        if (!strncmp((const char*)hookedSectionHeader->Name, ".text", IMAGE_SIZEOF_SHORT_NAME)) {
            DWORD oldProtection = 0;
            // Make the hooked section writable, copy the clean bytes over it, then restore the protection
            bool isProtected = VirtualProtect((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)hookedSectionHeader->VirtualAddress), hookedSectionHeader->Misc.VirtualSize, PAGE_EXECUTE_READWRITE, &oldProtection);
            memcpy((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)hookedSectionHeader->VirtualAddress), (LPVOID)((DWORD_PTR)ntdllMappingAddress + (DWORD_PTR)hookedSectionHeader->VirtualAddress), hookedSectionHeader->Misc.VirtualSize);
            isProtected = VirtualProtect((LPVOID)((DWORD_PTR)ntdllBase + (DWORD_PTR)hookedSectionHeader->VirtualAddress), hookedSectionHeader->Misc.VirtualSize, oldProtection, &oldProtection);
        }
    }

    // GetCurrentProcess() returns a pseudo handle and GetModuleHandleA() does not
    // take a reference, so neither of them is closed or freed here
    UnmapViewOfFile(ntdllMappingAddress);
    CloseHandle(ntdllMapping);
    CloseHandle(ntdllFile);

    return 1;
}
""".}
proc unhook(): int
    {.importcpp: "test", nodecl.}
when isMainModule:
    var result = unhook()
    echo "[*] Assembly executed: ", bool(result)
    # Code that runs from here on no longer passes through the hooked NTDLL.dll functions

If you are looking for a language independent solution of unhooking NTDLL.dll I can recommend Shellycoat shellcode.

By injecting this shellcode first, which can be done in any language, the same process of replacing the .text section of the hooked NTDLL.dll is done. After injecting Shellycoat you can inject your implant code, which will not get detected by hooks anymore. Slaeryan also covers different methods of how to unhook NTDLL.dll in the repo with pros and cons, which is worth reading.

SharpBlock – Patching the Entrypoint

@EthicalChaos had a new approach on bypassing EDR systems. This is explained in two blog posts, Lets create an EDR and bypass it part I and Lets create an EDR and bypass it part II – also from June 2020 – with the resulting tool SharpBlock.

SharpBlock uses a different approach compared to the others before. It creates a new process and uses the Windows Debug API to listen for LOAD_DLL_DEBUG_EVENT events. SharpBlock waits for the EDR’s DLL to be loaded via the debug API and patches the Entrypoint of this newly injected DLL so that it just returns TRUE instead of doing anything else. The target DLL will therefore do nothing and exit -> no hooks/patches again.

SharpBlock enables us to specify a target DLL’s file name or description to patch its Entrypoint. Playing with SharpBlock for this blog post, I tried blocking McAfee’s EpMPThe.dll with the following command:

SharpBlock.exe -d "McAfee Endpoint Thin Hook Environment" --disable-bypass-amsi -e "C:\Windows\System32\cmd.exe" --disable-bypass-etw --disable-header-patch -w

This resulted in the following behaviour:

I asked @EthicalChaos about a possible reason for this failed block and he told me that this is most likely the first protection mechanism against SharpBlock. In particular line 123 failed to execute, which is the WriteProcessMemory call. As described above, WriteProcessMemory resolves to NtProtectVirtualMemory and NtWriteVirtualMemory from NTDLL.dll, and McAfee seems to block processes from changing its hooking DLL’s memory protection to RWX via NtProtectVirtualMemory or from writing into it via NtWriteVirtualMemory. So SharpBlock itself was hooked by EpMPThe.dll and not able to patch the McAfee hooking DLL because of a Userland-Hook. This blog post is about Userland-Hooking bypass methods, and one way of doing that is using direct Syscalls instead of API imports, right?

Using D/Invoke’s method GetSyscallStub, @EthicalChaos changed the WriteProcessMemory call to direct Syscalls in another branch. In this branch NtProtectVirtualMemory and NtWriteVirtualMemory are called directly without a hook, so that SharpBlock patches McAfee’s hooking DLL successfully again:

And – tada – the DLL is not loaded anymore:

Hell’s Gate VX technique

@am0nsec and @smelly__vx released another technique of using direct Syscalls for Shellcode execution. They released PoC code written in C as well as a PoC written in .NET Core.

As far as I understood from skimming the official paper, “only” the method of retrieving the correct Syscall for functions from NTDLL.dll or other library files is different. So they are not extracted from the file directly. But I have to admit this paper is written in “heavy” language, so for people like me who are not really deep in this subject it’s hard to understand. I’m gonna read it again in some months and maybe I’ll understand the approach better. Sometimes waiting and reading other posts/papers is the key.

Conclusion

This post has gotten way bigger than I planned. It’s by far the one where I had to spend the most hours of research to understand the topic myself. But it was worth it. To those readers who were already familiar with the topic: I hope you were not bored. On the other hand, you wouldn’t be reading this last part now if you were bored. 🙂

I went through all the different tools/techniques made public in recent years for bypassing Userland-Hooking. As today’s AV/EDR solutions tend to monitor the last instance of User-Land, which is NTDLL.dll, they patch its library functions in memory and put their own code into them for runtime analysis of potentially malicious code. We can undo this patch by loading a fresh copy of NTDLL.dll and overwriting the hooked functions, we can patch out the hook ourselves, or we can use direct system calls via different techniques. We found PoC code in different languages, so that C/C++ implants, C# implants and Nim implants are covered with bypass code. I hope that I was able to explain this topic in a more or less “light” manner so that people without background knowledge learned something.

Feedback and additions or corrections are strongly encouraged. You can reach me via the above channels as always.