Note: this write-up stays disciplined about what is directly supported by the recovered loader, unpacked payload, helper scripts, and HLIL. Where the sample clearly supports a behaviour, I say so. Where the capability would require recovered modules or live traffic to prove, I leave it open.
I - What we are trying to understand
I came across this sample by starting with the loader first. I already knew the loader was malware from the outset; I had pulled it from a repository of malicious samples. What I wanted to understand was what it actually unpacked and how much of the real implant logic was already present in the recovered stage.
I did not chase the live server-side modules or try to recover every follow-on capability the operator might have delivered later. That was not really the point of this pass.
What interested me was that most of the important machinery is already here.
Even without chasing the server-delivered modules, this sample still exposes:
- the loader-to-payload chain
- the unpacking logic
- the runtime bootstrap
- the import hashing
- the HTTP transport
- the cookie-based C2 wrapping
- the task and module execution framework
So this is not a complete write-up of every possible malicious capability in the broader intrusion set. It is a write-up of the parts that are already present and legible in the recovered artefacts.
What we actually have is a staged chain:
- a Windows PE loader
- an encrypted blob embedded in `.data`
- a decrypted raw shellcode-style payload
- a second stage that resolves APIs manually, builds an HTTP C2 channel, and executes structured task blobs
So the practical reversing question is not just "is this ChChes?".
It is:
What is the loader actually unpacking, what is the payload actually doing once it starts, and how do we write small Python tooling that proves each step instead of glossing over it?
Based on the URL pattern, tasking model, and overall structure, this lines up well with ChChes / APT10-linked reporting from JPCERT/CC, LAC, and MITRE ATT&CK. But the useful part of the exercise is not the family label. The useful part is making the loader-to-payload chain concrete enough that you can reproduce the analysis yourself, while staying honest about what this recovered stage does and does not prove.
II - The workflow at a glance
The whole reversing path looks like this:

- recover the encrypted blob, key, and transform from the loader
- reproduce the unpacking in a standalone Python script
- load the recovered raw payload at its proper base and entry
- resolve the ROR7 import hashes with a second script
- read the bootstrap, transport, and tasking logic in HLIL
That sequence matters.
If you start from the raw payload without understanding how it was recovered, you miss why the base address and entry matter.
If you look at the HLIL before resolving the hashes, you end up staring at a forest of constants.
If you assign capabilities too early, you end up claiming more than the sample actually proves.
III - What the loader is actually doing
The key loader function in HLIL is sub_408fd0.
Even before naming anything, the structure is recognisable:
```
0040902b  int32_t var_38 = 0xe6fa
00409041  int32_t* ebx = sub_4092cd()
0040904c  sub_409f00(ebx, 0x424000, 0xe6fa)
0040905f  sub_4017c0(0x434828, ebx, 0xe6fa)
0040906e  int32_t* dwSize = sub_401cf0(esi, ebx, 0xe6fa)
00409080  int32_t eax_6 = VirtualAlloc(
              lpAddress: nullptr,
              dwSize,
              flAllocationType: MEM_COMMIT | MEM_RESERVE,
              flProtect: PAGE_EXECUTE_READWRITE)
00409086  *arg1 = eax_6
```

This tells us a few things immediately:
- there is a concrete blob size: `0xe6fa`
- bytes are copied from a concrete VA: `0x424000`
- some transform is applied before the output size is derived
- the output is intended to be executed, not just stored, because the loader allocates RWX memory with `VirtualAlloc`
That is the point where writing a Python script becomes the right move.
Not because "automation is nice", but because the loader has already shown us that the packed data and the unpacking path are concrete enough to reproduce.
IV - Why write the unpacker in Python
When you already have HLIL, it is tempting to keep reversing inside the disassembler and just inspect memory live.
That works for a quick sanity check, but it is weaker than writing the transformation down.
A Python unpacker gives you three things:
- repeatability
- a way to validate your reading of the loader
- a clean payload artefact that you can reload in tooling without depending on the original process state
The mental process behind unpack_stub.py is straightforward:
- identify where the blob lives
- identify where the key lives
- work out how the loader converts VA to raw bytes
- reproduce the decryption
- reproduce the container unpacking
- verify that the output behaves like the next stage you expect
That is the important teaching point. You are not writing "a malware script". You are writing an executable statement of your reverse-engineering hypothesis.
V - Writing the unpacker step by step
The first problem is mundane but critical: Binary Ninja IL text is not the source of truth for bytes. The PE file is.
So the script starts by reading the original sample with pefile and converting virtual addresses into file offsets:
```python
import pefile


def va_to_file_offset(pe: pefile.PE, va: int) -> int:
    rva = va - pe.OPTIONAL_HEADER.ImageBase

    for sec in pe.sections:
        start = sec.VirtualAddress
        end = start + max(sec.Misc_VirtualSize, sec.SizeOfRawData)

        if start <= rva < end:
            return sec.PointerToRawData + (rva - start)

    raise ValueError(f"VA {va:#x} / RVA {rva:#x} not inside any section")
```

This is the first real engineering decision in the script.
Why write this helper first?
Because once the loader HLIL says "copy bytes from 0x424000" and "read the key from 0x4326fc", you need a byte-accurate way to answer those requests from the PE on disk.
After that, the script can read the embedded blob and AES key directly:
```python
BLOB_VA = 0x424000
BLOB_SIZE = 0xE6FA

AES_KEY_VA = 0x4326FC
AES_KEY_SIZE = 16

blob = read_va(pe, raw, BLOB_VA, BLOB_SIZE)
aes_key = read_va(pe, raw, AES_KEY_VA, AES_KEY_SIZE)
```

At this point the script is not doing anything magical. It is just turning hardcoded observations from the loader into concrete bytes.
VI - Reproducing the decryption logic
The loader clearly transforms the blob before allocating the decoded result. The reconstruction that matched the recovered artefact layout here was AES-ECB over the aligned main body.
The concrete values used by the unpacker were:
```
Encrypted blob VA:   0x424000
Encrypted blob size: 0xe6fa
AES key VA:          0x4326fc
AES key:             21 a1 95 06 2f af 32 a6 ab f7 15 8f 09 cf 4f 3f
```

The corresponding Python is small:
```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def aes_ecb_decrypt_blocks(data: bytes, key: bytes) -> bytes:
    full_len = (len(data) // 16) * 16
    tail = data[full_len:]

    cipher = Cipher(algorithms.AES(key), modes.ECB())
    dec = cipher.decryptor()

    return dec.update(data[:full_len]) + dec.finalize() + tail
```

There are two useful lessons in that function.
First, do not overbuild. If the sample is using a simple block transform, mirror the observed behaviour exactly.
Second, preserve the tail if the blob length is not block-aligned. That reflects the practical reality of reversing real loader code: your script should mimic what the sample does, not what an ideal crypto wrapper would do in a clean-room design.
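The align-and-carry pattern is worth isolating. The sketch below exercises it with a toy byte-inverting transform standing in for AES, purely to show the tail handling without the crypto dependency; nothing here models the sample's actual cipher:

```python
def transform_aligned(data: bytes, block_transform, block_size: int = 16) -> bytes:
    # Apply the transform only to the block-aligned body; pass any
    # trailing partial block through untouched, as the loader does.
    full_len = (len(data) // block_size) * block_size
    return block_transform(data[:full_len]) + data[full_len:]


def invert(chunk: bytes) -> bytes:
    # Toy stand-in for the real block decryption.
    return bytes(b ^ 0xFF for b in chunk)
```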
Once decrypted, the container header becomes readable:
```python
mode, key_len, key_off, out_len, payload_off = struct.unpack_from("<5I", aes_plain, 0)
```

That header tells the unpacker how to interpret the next stage:

`mode`, `key_len`, `key_off`, `out_len`, `payload_off`
This is one of the key transitions in the workflow. Before this point, we are dealing with an encrypted loader blob. After this point, we are dealing with a structured container whose fields we can reason about directly.
VII - Reproducing the container unpacking
The next step in unpack_stub.py is the container-specific decode:
```python
def unpack_container(aes_plain: bytes) -> tuple[bytes, dict]:
    mode, key_len, key_off, out_len, payload_off = struct.unpack_from("<5I", aes_plain, 0)

    if mode != 1:
        raise ValueError(f"Unexpected container mode: {mode:#x}")

    xor_key = aes_plain[key_off:key_off + key_len]
    payload = aes_plain[payload_off:payload_off + out_len]

    final_xor_byte = xor_key[-1]
    unpacked = bytes(b ^ final_xor_byte for b in payload)

    meta = {
        "mode": mode,
        "key_len": key_len,
        "key_off": key_off,
        "out_len": out_len,
        "payload_off": payload_off,
    }
    return unpacked, meta
```

This is the second major engineering idea in the script.
The goal is not just "get output bytes". The goal is to capture the assumptions the reverse-engineering depends on:
- `mode` should be what we observed
- the key material should not be truncated
- the payload length should match the advertised output length
- the final XOR stage should match the loader's real behaviour
That is why the script returns both the unpacked bytes and metadata. When you are validating a reversing hypothesis, introspection matters.
The recovered output is not a PE. It is a raw shellcode-style payload stage.
That result is important because it tells us we have probably decoded the right thing. If the output had looked like random data or a broken PE, the script would have been wrong or incomplete.
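One cheap way to gain that kind of confidence in a decode is a synthetic round-trip: build a container that obeys the header layout above, push it through the same logic, and check the output. The version below restates the decode inline, returning only the unpacked bytes; every value is invented for the test, not taken from the sample:

```python
import struct


def unpack_container_bytes(aes_plain: bytes) -> bytes:
    # Same decode as the script's unpack_container, bytes only.
    mode, key_len, key_off, out_len, payload_off = struct.unpack_from("<5I", aes_plain, 0)
    if mode != 1:
        raise ValueError(f"Unexpected container mode: {mode:#x}")
    xor_key = aes_plain[key_off:key_off + key_len]
    payload = aes_plain[payload_off:payload_off + out_len]
    return bytes(b ^ xor_key[-1] for b in payload)


# Synthetic container: 20-byte header, 4-byte XOR key, then the XORed payload.
plain = b"synthetic next stage"
key = b"\x01\x02\x03\xaa"                     # invented; only the last byte is used
header = struct.pack("<5I", 1, len(key), 20, len(plain), 24)
container = header + key + bytes(b ^ key[-1] for b in plain)
assert unpack_container_bytes(container) == plain
```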
VIII - The unpacking workflow as a graph
The loader-to-payload script logic can be drawn very simply:

```
read blob + key from PE -> AES-ECB decrypt aligned body -> parse container header
    -> XOR payload with final key byte -> raw shellcode-style stage
```
This is exactly the sort of malware workflow that benefits from a small script.
Each box is concrete.
Each edge corresponds to an observation from the loader.
And once you can run it end-to-end, the reverse-engineering stops being an informal story and becomes a reproducible pipeline.
IX - What comes out of the loader
Once unpacked, the output is not another PE. It is a raw shellcode-style stage.
That changes how you load it in your tooling and how you interpret the next analysis steps. If you expect imports, sections, and a friendly PE entry point, you will waste time.
The recovered entry chain was:

```
sub_2a1fec -> sub_2a1ce0 -> sub_2a2065 -> sub_2afd60
```

And the entry itself is visible in HLIL:

```
002a1fec  void shellcode_entry() __noreturn

002a1ff0      shellcode_main_init()
002a2009      struct runtime_ctx* eax_1 = init_api_table(&g_ctx)
```

This is the next good teaching point.
When a raw stage enters a small setup routine and then immediately starts building a runtime context, you are not at the payload's real mission logic yet. You are looking at bootstrap code preparing the environment.
X - What the export resolver is actually doing
The payload does not rely on the normal import table for the interesting functionality. Instead, it walks module exports and resolves them by a ROR7-based hash.
The HLIL is clear enough to show the whole idea:
```
002a1f17  if (*ebx == 0x5a4d)
002a1f31      if (*(eax_1 + ebx) == 0x4550 && *(eax_1 + ebx + 0x7c) != 0)
...
002a1f68  char* edx_3 = *(esi_2 + (edx_1 << 2)) + ebx
...
002a1f79  int32_t ebx_3 = ror.d(var_8_1, 7) + sx.d(eax_2.b)
...
002a1f91  if (var_8_1 == arg2)
002a1fbd      return *(ecx_4 + (zx.d(*(edi_2 + (var_c_1 << 1))) << 2)) + ebx
```

This tells us:
- the function validates `MZ` and `PE`
- it walks the export name table
- it computes a rotate-right-7 plus additive hash over the export name bytes
- it compares the result to the caller's constant
- it returns the resolved export VA on a match
This is one of those cases where HLIL really earns its keep. We are not vaguely feeling that "some hash loop exists". We can read the control-flow well enough to reimplement it faithfully.
XI - Why write a second Python script for the hashes
Once the resolver is understood, the next bottleneck is readability.
The payload is full of constants like 0xbbafdf85, 0x0c917432, and 0x04a7c4c8. Until you resolve them, init_api_table is just an ocean of opaque numbers.
This is the reason for resolve_hashes.py.
Again, the process matters more than the file itself:
- derive the hash algorithm from HLIL
- reproduce it exactly
- enumerate exports from likely DLLs
- match concrete hashes back to API names
- feed those names back into the reverse-engineering
The core reimplementation is tiny:
```python
def ror32(x: int, n: int) -> int:
    x &= 0xFFFFFFFF
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF


def api_hash(name: bytes) -> int:
    h = 0
    for b in name:
        h = (ror32(h, 7) + b) & 0xFFFFFFFF
    return h
```

This is exactly what we want from a helper script.
It takes one local reversing insight and turns it into leverage across the rest of the sample.
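To see that leverage concretely, here is the matching step reduced to its core: hash every candidate export name and look the constants up. The hash functions are restated so the snippet runs standalone; the export list is a stand-in for real DLL enumeration (which the actual script would do with pefile), and the check uses a constant-to-name pair the sample itself confirms:

```python
def ror32(x: int, n: int) -> int:
    x &= 0xFFFFFFFF
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF


def api_hash(name: bytes) -> int:
    h = 0
    for b in name:
        h = (ror32(h, 7) + b) & 0xFFFFFFFF
    return h


def match_hashes(targets, exports):
    """Map hash constants back to dll!export names whose ROR7 hash matches."""
    hits = {}
    for dll, names in exports.items():
        for name in names:
            h = api_hash(name.encode("ascii"))
            if h in targets:
                hits[h] = f"{dll}!{name}"
    return hits


# Stand-in export list; real tooling would enumerate exports from the DLLs.
hits = match_hashes({0x0C917432}, {"kernel32.dll": ["LoadLibraryA", "Sleep"]})
```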
XII - What the hash script gives us back
Once the hashes are resolved, the bootstrap stops looking mysterious.
A few examples are enough to make the point:
```
0xbbafdf85 -> kernel32.dll!GetProcAddress
0x0c917432 -> kernel32.dll!LoadLibraryA
0x5d0fb57d -> kernel32.dll!SetErrorMode
0x04a7c4c8 -> winhttp.dll!WinHttpGetIEProxyConfigForCurrentUser
0x8f5ef202 -> winhttp.dll!WinHttpGetProxyForUrl
0x491b47ec -> wininet.dll!InternetInitializeAutoProxyDll, jsproxy.dll!InternetInitializeAutoProxyDll
0x1a7d2670 -> advapi32.dll!CredEnumerateW
```

That immediately changes what we can say about the payload.
It is not just hiding LoadLibraryA and GetProcAddress to be annoying. It is building a substantial runtime around:
- HTTP communications
- proxy discovery and PAC / WPAD handling
- host identity
- task execution
- proxy authentication compatibility
This is also where disciplined reading matters. The presence of CredEnumerateW does not automatically prove generic credential theft as an operator goal. In this sample, the surrounding resolved APIs support a narrower and better-grounded interpretation: the implant appears designed to operate reliably in enterprise proxy environments.
XIII - What we are actually seeing during bootstrap
Once the hash constants are readable, init_api_table becomes much easier to explain.
In HLIL we can see the payload recover GetProcAddress, then LoadLibraryA, then load additional DLLs:
```
002a20f5  void* GetProcAddress = resolve_export_by_ror7_hash(field_38, __saved_ecx_2, var_10c_2)
002a20fd  g_ctx->pGetProcAddress = GetProcAddress
...
002a210d  resolve_export_by_ror7_hash(g_ctx->field_38, 0xc917432, GetProcAddress)
002a2115  g_ctx->pLoadLibraryA = eax_10
...
002a2211  __builtin_strcpy(dest: &ebp_1[-0x20], src: "winhttp.dll")
...
002a22e2  ebp_1[-1] = pLoadLibraryA(&ebp_1[-0xb])
002a233e  ebp_1[-3] = g_ctx->pLoadLibraryA(&ebp_1[-0x20])
```

From there the bootstrap flow is straightforward: resolve GetProcAddress and LoadLibraryA by hash, use them to load additional DLLs such as winhttp.dll, then populate the runtime context with the rest of the hashed imports.
That is not the shape of a tiny single-purpose downloader. It is the shape of a backdoor building itself a runtime for real-world operation.
XIV - What the C2 channel is actually doing
The hardcoded URL is visible directly in the HLIL export:
```
hxxp://zebra[.]wthelpdesk[.]com/%r.htm
```

The payload is also explicit about the transport wrapper it wants to use:

```
002adf30  int32_t encode_c2_message_into_cookie(int32_t arg1, int32_t arg2, int32_t arg3)
...
002adf9b  __builtin_strncpy(dest: &var_12c, src: "Cookie", count: 7)
```

And elsewhere:

```
002a67d8  __builtin_strcpy(dest: &var_24, src: "Cookie:")
002a7702  __builtin_strcpy(dest: &var_40, src: "Set-Cookie:")
```

That is a much stronger statement than "it communicates over HTTP".
It tells us the payload deliberately packages outbound data into cookie headers so the traffic blends into ordinary-looking web requests more easily.
The message builder also gives us protocol classes we can observe directly:
```
002ade47  if (arg2 != 0x400)
002ade89      if (arg2 == 0x401)
002ade91          *(0(result) + result - 1) = 0x42
...
002adeb3      if (arg2 == 0x406)
002aded7          *(0(result) + result - 1) = 0x43
...
002adf1d      *(0(result) + result - 1) = 0x44
002ade47  else
002ade49      int32_t eax_6 = build_initial_host_profile()
002ade57      *(0(result) + result - 1) = 0x41
```

From that, we can map:
- `0x400` / marker `A`: initial registration with host profile
- `0x401` / marker `B`: polling / check-in
- `0x406` / marker `C`: result upload
- marker `D`: another control or status path reached through the remaining branch in this message builder
Again, this is the kind of detail that is worth teaching carefully. We are not just naming a function and projecting behaviour onto it. We can see the message-type branch, the marker assignment, and the initial host-profile path in the HLIL itself.
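Those branch observations can be captured as a tiny model. This is a reading of the HLIL, not recovered code; the marker bytes 0x41 to 0x44 are ASCII 'A' to 'D':

```python
# Message type -> marker byte, as read from the message-builder branches.
MARKERS = {0x400: 0x41, 0x401: 0x42, 0x406: 0x43}


def marker_for(msg_type: int) -> int:
    # Types without an explicit branch fall through to marker 'D' (0x44),
    # matching the remaining path in the HLIL.
    return MARKERS.get(msg_type, 0x44)
```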
XV - How the payload and C2 actually communicate
At a high level, the payload does not just open an HTTP connection and dump obvious tasking data into the body. It appears to build an internal message format first, then wrap that message for HTTP transport inside cookie headers.
The visible path looks like this:

```
build_c2_message(...) -> encode_c2_message_into_cookie(...) -> HTTP request to the hardcoded URL
```
The interesting part is that the payload seems to separate logical protocol messages from transport encoding.
In other words:
- `build_c2_message(...)` appears to construct the implant's real application-layer message
- `encode_c2_message_into_cookie(...)` appears to take that message and turn it into something safe to carry in a `Cookie` header
- the HTTP layer then sends that wrapped data to the C2 URL
That is important because it tells us the cookie is not the protocol itself. The cookie is the carrier.
The protocol appears to have at least four message classes:
- initial registration
- periodic polling
- result upload
- an additional control or status path
And we can see those classes being built before the cookie transport step:
```
002ae47b  int32_t eax_1 = build_c2_message(var_34, 0x400, nullptr)
002ae48f  int32_t eax_4 = encode_c2_message_into_cookie(eax_1, esi[0x19](eax_1), 1)

002ae6b3  int32_t eax = build_c2_message(*0x1c, 0x401, nullptr)
002ae6c7  int32_t eax_2 = encode_c2_message_into_cookie(eax, 0(eax), 1)

002aeb14  int32_t eax_9 = build_c2_message(&var_20, 0x406, 0x48)
```

That gives us a reasonably clean model for the wire behaviour.
On first contact, the implant likely builds a registration message containing host profiling data and sends that via the cookie path.
After that, it appears to send periodic poll messages to ask for work.
When task execution finishes, it appears to package the result into a result message and send that back through the same transport wrapper.
So the traffic model looks like this:

```
register (A, 0x400) -> poll for work (B, 0x401) -> execute tasking -> upload results (C, 0x406) -> keep polling
```
There are also signs that the server side may use cookie-related response handling as well, which is why Set-Cookie: appears in the payload alongside Cookie:. The careful reading is that Set-Cookie: strongly suggests cookie-aware response parsing, while the strongest directly supported conclusion is the outbound use of Cookie as a C2 transport container.
This is a useful design choice for the operator.
Normal-looking HTTP requests with cookie headers are less conspicuous than bespoke plaintext tasking fields, and the payload can still preserve its own internal protocol structure behind that wrapper.
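To make the carrier idea concrete without over-claiming: the sketch below packs an opaque message into a Cookie header and recovers it. The cookie name and the base64 encoding are assumptions chosen purely for illustration; the sample's real cookie encoding is not reconstructed in this write-up:

```python
import base64


def wrap_in_cookie(message: bytes, name: str = "SESSIONID") -> str:
    # Illustrative carrier only: the implant's actual cookie name and
    # encoding scheme are not established here.
    return f"Cookie: {name}={base64.b64encode(message).decode('ascii')}"


def unwrap_cookie(header: str) -> bytes:
    # Inverse view, roughly what a cookie-aware server side would do.
    value = header.split("=", 1)[1]
    return base64.b64decode(value)
```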
XVI - What the tasking model actually looks like
The recovered payload is best understood as a modular HTTP backdoor, not a monolithic stealer.
The reason is in the execution path. Inbound work is handled as structured blobs with validation, compatibility checks, checksums, worker-thread execution, caching, and explicit status reporting.
A useful HLIL slice is this one:
```
002aa422  if (*(edx_5 + ecx_4 + 0x24) == ecx_5)
002aa43e      esi_3 = sub_2a9690(edx_5 + ecx_4 + 0x10, esi_1, &var_14)
...
002aa45e  if (esi_3[0xa] == esi_3[6] + esi_3[5] + esi_3[1] + *esi_3)
002aa498      var_8 = sub_2a9050(esi_3, *ebx, &var_c)
```

Even without perfect names, that is not the shape of:
receive command string -> run command -> send stdout
It is the shape of:
- parse a wrapped task or module format
- validate structure
- verify integrity
- perform compatibility or activation checks
- unpack or materialise the module
- run it via a worker execution path
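The integrity step in that list is directly visible in the HLIL slice above: dword 0xa of the parsed structure must equal the sum of dwords 0, 1, 5, and 6. Modelled in Python (the field meanings beyond the check itself are not recovered, and the header values below are invented to exercise it):

```python
def task_checksum_ok(header: list) -> bool:
    # header[0xa] must equal header[0] + header[1] + header[5] + header[6],
    # reduced mod 2**32 as the 32-bit adds in the payload would be.
    expected = (header[0] + header[1] + header[5] + header[6]) & 0xFFFFFFFF
    return header[0xA] == expected


# Invented header values purely to exercise the check.
hdr = [0] * 11
hdr[0], hdr[1], hdr[5], hdr[6] = 0x10, 0x20, 0x30, 0x40
hdr[0xA] = 0xA0
```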
The diagnostic strings support the same reading:
1"Invalid data received!"
2"Module is not found!"
3"Activation is required!"
4"Incompatible module received!"
5"Checksum error!"
6"Result is empty!"Those strings tell us the operator expects module failures to happen in specific ways and wants detailed feedback.
That is a framework mindset, not a one-shot payload mindset.
XVII - What is proven and what is not
From this recovered payload stage, we can prove:
- staged loader-to-payload execution
- embedded blob extraction and decryption
- raw shellcode-style second-stage recovery
- manual import resolution using export hashing
- HTTP C2
- cookie-based transport
- proxy-aware communications
- structured task or module execution
- cached module handling
- result upload with diagnostic status strings
What we cannot honestly claim from this payload alone is:
- confirmed interactive shell functionality
- confirmed screenshot capture
- confirmed keylogging
- confirmed standalone file theft
- confirmed credential exfiltration as a primary mission
Those may exist in modules delivered later. This recovered stage does not prove them on its own.
XVIII - Summary
The cleanest description of this sample is:
ChChes here appears to be a staged, modular HTTP implant that recovers a raw shellcode-style payload from an embedded encrypted blob, bootstraps itself with ROR7-based API resolution, communicates through cookie-encoded HTTP traffic, receives structured task or module blobs from C2, executes them in worker threads, caches reusable modules, and reports results back to the operator.
From a reversing perspective, the broader lesson is just as useful as the family identification.
Split the problem into loader, bootstrap, transport, and tasking.
Then, when one of those stages is concrete enough, write the smallest Python script that forces your understanding to become testable.
That is what turned this sample from "interesting malware with opaque constants" into a legible workflow from embedded blob to operator tasking.