In the previous post, we built the baseline version of control-flow flattening (CFF): split a function into blocks, route the blocks through a dispatcher, and use a state variable to decide which block executes next.
This post assumes you already understand that baseline model. We are going to focus on the next layer:
- How the current project hardens the baseline flattener.
- What the generated flattened.c actually looks like.
- What the flattened binary looks like in Binary Ninja or objdump.
- How to deobfuscate it from the code that is actually present.
The disassembly discussed here is from flattened_test, built from the flattened.c shown below.
I - Where the baseline CFF transform breaks
The baseline transform from the previous post looked roughly like this:

    while (state != EXIT_STATE) {
        switch (state) {
        case 1:
            /* original block A */
            state = 2;
            break;
        case 2:
            /* original block B */
            state = 3;
            break;
        case 3:
            return;
        }
    }

That is structurally flattened, but it is still easy to reverse. Every payload block writes a clear next-state constant, and every block returns to the dispatcher.
A deobfuscator can usually recover the graph by:
- Finding the dispatcher.
- Collecting the state constants used by the dispatcher.
- Finding blocks that write those constants.
- Replacing state-machine routing with direct control-flow edges.
The baseline graph looks like this:
The hardening in this project is designed to make those four recovery steps less obvious.
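As a rough illustration of those four recovery steps, the sketch below models the baseline state machine as a plain dictionary and rebuilds direct edges from it. The `BLOCKS` table and the `recover_edges` helper are hypothetical stand-ins for what a real tool would extract from a binary; they are not part of this project.

```python
# Hypothetical model of the baseline flattened function:
# dispatcher state constant -> (block label, next-state constants it writes).
BLOCKS = {
    1: ("block_A", [2]),
    2: ("block_B", [3]),
    3: ("return", []),  # terminal block, writes no next state
}


def recover_edges(blocks):
    """Replace state-machine routing with direct block-to-block edges."""
    edges = []
    for _state, (label, next_states) in blocks.items():
        for nxt in next_states:
            # Resolve the written constant through the dispatcher table.
            target_label = blocks[nxt][0]
            edges.append((label, target_label))
    return edges


print(recover_edges(BLOCKS))
```

On the baseline transform, each block writes one plain constant, so the lookup is a single dictionary access. The hardening layers below target exactly that simplicity.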
II - The original function
The original target is intentionally tiny:

    #include <stdio.h>

    void check_password(int input) {
        if (input == 1337) {
            printf("Access Granted!\n");
        } else {
            printf("Access Denied.\n");
        }
    }

    int main() {
        check_password(1337);
        return 0;
    }

The original control flow is simply:
The goal of the flattener is to hide that tiny branch behind a state machine.
III - Encoding the state space
The first hardening layer removes small sequential state values. The flattener uses this helper:

    STATE_KEY = 0x5A5A5A5A
    STATE_MULT = 1103515245
    STATE_ADD = 12345
    STATE_MASK = 0x7fffffff

    def encode_state(value):
        """Sparse affine/XOR encoding for logical CFF state IDs."""
        return (((value * STATE_MULT) + STATE_ADD) ^ STATE_KEY) & STATE_MASK

The logical states are still small inside the Python transformation, but the generated C receives only encoded values.
The important mappings are:

    logical 0  -> 0x5a5a6a63  exit state
    logical 1  -> 0x1b9c24fc  entry state
    logical 2  -> 0x59d69749  success payload
    logical 3  -> 0x1f0941da  failure payload
    logical 4  -> 0x5d4333b7  real password condition
    logical 5  -> 0x1285e200  cleanup state
    logical 7  -> 0x16360f6e  guard/pre-dispatch state
    logical 11 -> 0x09dfd4b2  true staging state
    logical 12 -> 0x4f11870f  false staging state
    logical 13 -> 0x0d487198  post-payload state
    logical 41 -> 0x5298e5f4  bogus guard alternative
    logical 77 -> 0x12ff9d58  bogus/post opaque alternative
    logical 99 -> 0x35eade3a  bogus state

So instead of seeing state = 2, the binary contains values such as 0x59d69749 and 0x1f0941da.
This does not make the states secret. A reverse engineer can still collect them from the dispatcher. But it removes the obvious ordering and makes the state machine look less like a classroom CFF example.
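To make the encoding concrete, the following sketch recomputes encoded values from the constants given above and prints them in both hex and decimal (the decimal form is what appears in the generated C). The `LOGICAL_IDS` list is just the set of logical IDs mentioned in this post, and the uniqueness check is an illustrative addition rather than something the flattener is known to run.

```python
STATE_KEY = 0x5A5A5A5A
STATE_MULT = 1103515245
STATE_ADD = 12345
STATE_MASK = 0x7FFFFFFF


def encode_state(value):
    """Same affine/XOR encoding as the flattener helper shown earlier."""
    return (((value * STATE_MULT) + STATE_ADD) ^ STATE_KEY) & STATE_MASK


# Logical IDs that appear in this post.
LOGICAL_IDS = [0, 1, 2, 3, 4, 5, 7, 11, 12, 13, 41, 77, 99]

table = {logical: encode_state(logical) for logical in LOGICAL_IDS}
for logical, encoded in table.items():
    print(f"logical {logical:>2} -> {encoded:#010x} ({encoded})")

# Distinct logical IDs must map to distinct encoded states.
assert len(set(table.values())) == len(table)
```

Because the multiplier is odd, the affine step is a bijection modulo 2^31, so distinct small logical IDs cannot collide after masking.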
IV - Hiding state assignments with opaque arithmetic
The second hardening layer wraps state assignments in an input-dependent expression:

    def opaque_encoded_state(value):
        """Emit encoded(value) with a harmless input-dependent opaque term."""
        return b("^", encoded_state_const(value), b("*", i("input"), c(0)))

That emits C like this:

    state = 0x1b9c24fc ^ (input * 0);

Mathematically, input * 0 is always 0, and x ^ 0 is always x. So the runtime value is unchanged.
Structurally, though, the assignment now appears to depend on function input:
This matters most against simple static scripts. A script looking only for direct state = constant assignments may miss or misclassify these assignments at the C level.
In the binary, GCC simplifies some of these expressions into direct immediate stores anyway (the entry-state store at 0x1158 in section VIII is one example). Source-level obfuscation is filtered through compiler code generation, so the binary is the source of truth.
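A small sketch of the source-level effect: a naive matcher that only accepts `state = <constant>;` misses the opaque form, while a slightly more tolerant one recovers the same constant. Both regular expressions and the `extract_state` helper are illustrative assumptions, not code from the project.

```python
import re

# The same assignment in baseline form and in the hardened, opaque form.
PLAIN = "state = 0x1b9c24fc;"
OPAQUE = "state = 0x1b9c24fc ^ (input * 0);"

# Naive matcher: only accepts `state = <constant>;`.
NAIVE = re.compile(r"state\s*=\s*(0x[0-9a-fA-F]+|\d+)\s*;")

# Tolerant matcher: also accepts a trailing `^ (input * 0)` term,
# which is always zero and can be dropped.
TOLERANT = re.compile(
    r"state\s*=\s*(0x[0-9a-fA-F]+|\d+)"
    r"(?:\s*\^\s*\(\s*input\s*\*\s*0\s*\))?\s*;"
)


def extract_state(pattern, line):
    """Return the state constant if the whole line matches, else None."""
    m = pattern.fullmatch(line.strip())
    return int(m.group(1), 0) if m else None


# The opaque term never changes the value, whatever the input is.
assert all((0x1B9C24FC ^ (i * 0)) == 0x1B9C24FC for i in range(-1000, 1000))

print(extract_state(NAIVE, PLAIN))      # constant found
print(extract_state(NAIVE, OPAQUE))     # None: the naive matcher misses it
print(extract_state(TOLERANT, OPAQUE))  # constant found again
```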
V - Adding a guard variable
The flattener also introduces a guard variable:

    GUARD_KEY = 0x1357

    def create_guard_var():
        return create_int_decl("guard", b("^", i("input"), c(GUARD_KEY)))

In the generated C, guard starts as:

    int guard = input ^ 4951;

Then the flattened function uses guard checks that are true during normal execution, such as:

    if (guard == (input ^ 4951)) {
        state = encoded_state_4;
    } else {
        state = encoded_state_41;
    }

The normal path goes to logical state 4, which is the real password check. The alternate path goes to logical state 41, which is structurally present but not reached during normal execution.
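The guard check can be modelled in a few lines to see why the alternate edge is dead. This sketch assumes that neither guard nor input is modified between the initialisation and the check, which holds for the generated code shown in this post; `guard_branch` is a hypothetical helper.

```python
GUARD_KEY = 0x1357  # 4951 in the generated C


def guard_branch(input_value):
    """Model logical state 7 and return the logical state it selects."""
    guard = input_value ^ GUARD_KEY         # int guard = input ^ 4951;
    if guard == (input_value ^ GUARD_KEY):  # recomputed from the same input
        return 4   # real password condition
    return 41      # bogus guard alternative


# The predicate compares a value with itself, so state 41 is never chosen.
assert all(guard_branch(v) == 4 for v in range(-70000, 70000))
print(guard_branch(1337), guard_branch(0))
```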
The purpose is to add another data-dependent-looking branch before the real condition:
VI - Separating the real condition from the payloads
The original program directly connects the condition to the two payloads:
The hardened flattener deliberately breaks that direct relationship.
It converts the original if condition into a state-selection expression:

    condition_logic = create_state_assign_expr(
        c_ast.TernaryOp(
            original_if.cond,
            opaque_encoded_state(11),
            opaque_encoded_state(12)
        )
    )

The generated C is:

    state = (input == 1337) ?
        (165663922 ^ (input * 0)) :
        (1326548751 ^ (input * 0));

Those constants are the encoded staging states:

    logical 11 -> 0x09dfd4b2 -> true staging
    logical 12 -> 0x4f11870f -> false staging

The staging states then route to the real payload states:

    logical 11 -> logical 2 -> Access Granted
    logical 12 -> logical 3 -> Access Denied

Graphically, the hardened route is:
This is why the condition block in Binary Ninja does not jump directly to the strings. It writes one of two encoded staging states, returns to the dispatcher, and the dispatcher eventually reaches the payload.
VII - The generated C state machine
After rewriting check_password, the important part of flattened.c is this shape:

    void check_password(int input)
    {
        int state = 463217916 ^ (input * 0);
        int guard = input ^ 4951;
        while (state != 1515874915)
        {
            switch (state)
            {
            case 463217916:
                /* logical state 1: opaque entry gate */
                state = 372641646 ^ (input * 0);
                break;

            case 372641646:
                /* logical state 7: guard check */
                if (guard == (input ^ 4951))
                    state = 1564685239 ^ (input * 0);
                else
                    state = 1385752052 ^ (input * 0);
                break;

            case 1564685239:
                /* logical state 4: real password condition */
                state = (input == 1337)
                    ? (165663922 ^ (input * 0))
                    : (1326548751 ^ (input * 0));
                break;

            case 165663922:
                state = 1507235657 ^ (input * 0);
                break;

            case 1326548751:
                state = 520700378 ^ (input * 0);
                break;

            case 1507235657:
                printf("Access Granted!\n");
                state = 222851480 ^ (input * 0);
                break;

            case 520700378:
                printf("Access Denied.\n");
                state = 222851480 ^ (input * 0);
                break;
            }
        }
    }

The full file includes cleanup and bogus states too, but this is the core path.
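To see how many dispatcher round trips sit in front of each payload, the excerpt can be transcribed into a small Python trace. This is only a model of the cases shown above: the `trace` helper is hypothetical, and it stops at the first state whose body is not part of the excerpt (post-payload, cleanup, and bogus states).

```python
# Encoded state constants copied from the excerpt above.
ENTRY, GUARD_CHECK, CONDITION = 463217916, 372641646, 1564685239
TRUE_STAGE, FALSE_STAGE = 165663922, 1326548751
SUCCESS, FAILURE = 1507235657, 520700378
BOGUS_GUARD, POST_PAYLOAD, EXIT = 1385752052, 222851480, 1515874915


def trace(input_value, max_steps=32):
    """Return (visited states, printed lines) for the excerpted cases only."""
    state = ENTRY
    guard = input_value ^ 4951
    visited, output = [], []
    for _ in range(max_steps):
        if state == EXIT:
            break
        visited.append(state)
        if state == ENTRY:
            state = GUARD_CHECK
        elif state == GUARD_CHECK:
            state = CONDITION if guard == (input_value ^ 4951) else BOGUS_GUARD
        elif state == CONDITION:
            state = TRUE_STAGE if input_value == 1337 else FALSE_STAGE
        elif state == TRUE_STAGE:
            state = SUCCESS
        elif state == FALSE_STAGE:
            state = FAILURE
        elif state == SUCCESS:
            output.append("Access Granted!")
            state = POST_PAYLOAD
        elif state == FAILURE:
            output.append("Access Denied.")
            state = POST_PAYLOAD
        else:
            # Body not shown in the excerpt, so the model stops here.
            break
    return visited, output


for value in (1337, 42):
    states, lines = trace(value)
    print(value, [hex(s) for s in states], lines)
```

In this model, four states (entry, guard, condition, staging) are visited before either payload runs, compared with a single branch in the original function.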
VIII - The binary state machine
In flattened_test, the state variable lives on the stack at [rbp-0x8].
The function starts by initialising the state and guard:

    1149: endbr64
    114d: push rbp
    114e: mov rbp,rsp
    1151: sub rsp,0x20
    1155: mov DWORD PTR [rbp-0x14],edi
    1158: mov DWORD PTR [rbp-0x8],0x1b9c24fc
    115f: mov eax,DWORD PTR [rbp-0x14]
    1162: xor eax,0x1357
    1167: mov DWORD PTR [rbp-0x4],eax
    116a: jmp 1389

The stack slots are:

    [rbp-0x14] -> input
    [rbp-0x8]  -> encoded state
    [rbp-0x4]  -> guard

The dispatcher is split into two pieces:
- The loop latch at 0x1389, which checks whether state equals the exit state 0x5a5a6a63.
- The switch comparison tree starting at 0x116f, which compares [rbp-0x8] against encoded state constants.
The latch looks like this:

    1389: cmp DWORD PTR [rbp-0x8],0x5a5a6a63
    1390: jne 116f
    1398: leave
    1399: ret

The switch dispatcher starts like this:

    116f: cmp DWORD PTR [rbp-0x8],0x5d4333b7
    1176: je 12b7
    117c: cmp DWORD PTR [rbp-0x8],0x5d4333b7
    1183: jg 1389
    1189: cmp DWORD PTR [rbp-0x8],0x59d69749
    1190: je 12ec

This is not a neat switch jump table. It is a comparison tree, but it is still a dispatcher: it maps encoded state values to case bodies.
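Collecting the state constants from such a comparison tree can be sketched as a text scan over the disassembly. The example below assumes the listing is in the simplified `addr: mnemonic operands` form used in this post (real objdump output also carries raw opcode bytes unless they are suppressed), and `collect_state_constants` is a hypothetical helper, not the project's Miasm-based code.

```python
import re

# Dispatcher lines copied from the listings above: the loop latch plus
# the first compares of the switch tree.
LISTING = """\
1389: cmp DWORD PTR [rbp-0x8],0x5a5a6a63
1390: jne 116f
116f: cmp DWORD PTR [rbp-0x8],0x5d4333b7
1176: je 12b7
117c: cmp DWORD PTR [rbp-0x8],0x5d4333b7
1183: jg 1389
1189: cmp DWORD PTR [rbp-0x8],0x59d69749
1190: je 12ec
"""

STATE_SLOT = "[rbp-0x8]"  # stack slot of the state variable in flattened_test
CMP_RE = re.compile(
    r"^\s*([0-9a-f]+):\s+cmp\s+DWORD PTR "
    + re.escape(STATE_SLOT)
    + r",(0x[0-9a-f]+)\s*$"
)


def collect_state_constants(listing):
    """Return the distinct immediates compared against the state slot."""
    constants = []
    for line in listing.splitlines():
        m = CMP_RE.match(line)
        if m:
            value = int(m.group(2), 16)
            if value not in constants:
                constants.append(value)
    return constants


print([hex(c) for c in collect_state_constants(LISTING)])
```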
IX - The condition block is a branch diamond
The real password check is a normal branch diamond:

    12b7: cmp DWORD PTR [rbp-0x14],0x539
    12be: jne 12c7
    12c0: mov eax,0x09dfd4b2
    12c5: jmp 12cc
    12c7: mov eax,0x4f11870f
    12cc: mov DWORD PTR [rbp-0x8],eax
    12cf: jmp 1389

That block means:

    if (input == 1337)
        state = 0x09dfd4b2;
    else
        state = 0x4f11870f;

The two values are not payload states. They are staging states:

    0x09dfd4b2 -> logical state 11 -> true staging
    0x4f11870f -> logical state 12 -> false staging

The Miasm patcher targets this branch-diamond shape:

    cmp / jcc / mov true_state / jmp join / mov false_state / store state / jmp dispatcher

X - Recovering the state-to-target mapping
Once we know the dispatcher is a state comparison tree, the next job is to map encoded states to case bodies.
The Miasm patcher does this by emulating the dispatcher comparison tree from the loop latch for each collected encoded state. The recovered mapping in flattened_test is:

    0x09dfd4b2 -> 0x12d4  true staging block
    0x0d487198 -> 0x131f  post-payload check
    0x1285e200 -> 0x133e  cleanup-to-exit state
    0x12ff9d58 -> 0x1366  bogus/post opaque alternative
    0x16360f6e -> 0x1292  guard/pre-dispatch check
    0x1b9c24fc -> 0x1286  entry gate
    0x1f0941da -> 0x1307  failure payload
    0x35eade3a -> 0x1378  bogus state
    0x4f11870f -> 0x12e0  false staging block
    0x5298e5f4 -> 0x1347  bogus guard alternative
    0x59d69749 -> 0x12ec  success payload
    0x5a5a6a63 -> 0x1398  exit path
    0x5d4333b7 -> 0x12b7  real password condition

The two staging blocks are tiny:

    12d4: mov DWORD PTR [rbp-0x8],0x59d69749
    12db: jmp 1389

    12e0: mov DWORD PTR [rbp-0x8],0x1f0941da
    12e7: jmp 1389

So the real condition flow is:
The important point is that the encoded state values are only labels. Once the dispatcher mapping is known, 0x09dfd4b2 and 0x4f11870f are no longer mysterious. They are simply edges to 0x12d4 and 0x12e0.
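The two pieces, the branch diamond and the dispatcher map, can be combined in a short text-level sketch. It parses the condition block as listed in section IX, extracts the two selected states, and resolves them through the relevant map entries. The `parse` and `match_diamond` helpers are hypothetical and only handle the exact `jne` shape shown here; the project's patcher works on Miasm's representation rather than on text.

```python
# Condition block as listed in section IX.
CONDITION_BLOCK = """\
12b7: cmp DWORD PTR [rbp-0x14],0x539
12be: jne 12c7
12c0: mov eax,0x09dfd4b2
12c5: jmp 12cc
12c7: mov eax,0x4f11870f
12cc: mov DWORD PTR [rbp-0x8],eax
12cf: jmp 1389
"""

# Relevant entries of the recovered dispatcher map (encoded state -> body).
DISPATCH = {
    0x09DFD4B2: 0x12D4,  # true staging block
    0x4F11870F: 0x12E0,  # false staging block
    0x59D69749: 0x12EC,  # success payload
    0x1F0941DA: 0x1307,  # failure payload
}


def parse(listing):
    """Split 'addr: mnemonic operands' lines into (addr, mnemonic, operands)."""
    rows = []
    for line in listing.splitlines():
        addr, rest = line.split(":", 1)
        mnemonic, _, operands = rest.strip().partition(" ")
        rows.append((int(addr, 16), mnemonic, operands.strip()))
    return rows


def match_diamond(rows):
    """Match cmp/jne/mov/jmp/mov/store/jmp; return (taken, fallthrough) states."""
    shape = ["cmp", "jne", "mov", "jmp", "mov", "mov", "jmp"]
    if [m for _, m, _ in rows] != shape:
        return None
    fallthrough = int(rows[2][2].split(",")[1], 16)  # jne not taken: equal
    taken = int(rows[4][2].split(",")[1], 16)        # jne taken: not equal
    return taken, fallthrough


taken_state, fallthrough_state = match_diamond(parse(CONDITION_BLOCK))
print(f"input != 1337 -> {DISPATCH[taken_state]:#x}")
print(f"input == 1337 -> {DISPATCH[fallthrough_state]:#x}")
```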
XI - Deobfuscating this binary
For this binary, the deobfuscation workflow is:
The branch-diamond selector gives us:

    input == 1337 -> state 0x09dfd4b2
    input != 1337 -> state 0x4f11870f

The dispatcher map gives us:

    0x09dfd4b2 -> 0x12d4
    0x4f11870f -> 0x12e0

Therefore the condition block can be rewritten conceptually as:

    cmp DWORD PTR [rbp-0x14],0x539
    jne 0x12e0
    jmp 0x12d4

That patch bypasses the state write and dispatcher round trip for the real password decision. It still lands on the existing staging blocks, which is conservative and preserves the rest of the function structure.

The current Miasm patcher implements exactly that conservative patch. It copies the original compare, replaces the state-selection diamond with a short conditional jump to 0x12e0 and a short unconditional jump to 0x12d4, then pads the unused bytes with NOPs:

    cmp DWORD PTR [rbp-0x14],0x539
    jne 0x12e0
    jmp 0x12d4
    nop
    ...

A more aggressive deobfuscator could collapse the staging states too:

    0x12d4 -> writes state 0x59d69749 -> dispatcher -> 0x12ec success payload
    0x12e0 -> writes state 0x1f0941da -> dispatcher -> 0x1307 failure payload

That means the condition could also be patched directly to the payloads:

    cmp DWORD PTR [rbp-0x14],0x539
    jne 0x1307
    jmp 0x12ec

The conservative staging-target patch is easier to justify mechanically, and it is the patch this project currently applies. The payload-target patch would give a cleaner graph, but it is not what miasm_deflatten_patcher.py emits.
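For a feel of what such a patch looks like at the byte level, here is a standalone sketch that assembles both variants by hand. It is not the Miasm patcher: `build_patch` and `rel8` are hypothetical helpers, the compare bytes are my own encoding of the instruction (a real patcher copies them from the binary), and the patched region is assumed to be the whole diamond from 0x12b7 up to, but not including, 0x12d4.

```python
def rel8(source_end, target):
    """Signed 8-bit displacement measured from the end of a short jump."""
    delta = target - source_end
    if not -128 <= delta <= 127:
        raise ValueError("target out of range for a short jump")
    return delta & 0xFF


def build_patch(cmp_bytes, start, end, not_equal_target, equal_target):
    """Emit cmp / jne not_equal / jmp equal, NOP-padded to fill [start, end)."""
    patch = bytearray(cmp_bytes)
    addr = start + len(patch)
    patch += bytes([0x75, rel8(addr + 2, not_equal_target)])  # jne rel8
    addr += 2
    patch += bytes([0xEB, rel8(addr + 2, equal_target)])      # jmp rel8
    if len(patch) > end - start:
        raise ValueError("patch does not fit in the original block")
    patch += b"\x90" * (end - start - len(patch))             # nop padding
    return bytes(patch)


# Assumed encoding of `cmp DWORD PTR [rbp-0x14],0x539` (7 bytes).
CMP = bytes.fromhex("817dec39050000")

conservative = build_patch(CMP, 0x12B7, 0x12D4, 0x12E0, 0x12D4)
aggressive = build_patch(CMP, 0x12B7, 0x12D4, 0x1307, 0x12EC)
print(conservative.hex())
print(aggressive.hex())
```

Both variants fit in short jumps because every target is within 127 bytes of the jump. A patcher for larger functions would need a fallback to near jumps, which this sketch simply rejects with an exception.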
XII - What the patch removes
Before patching, the condition executes like this:
After patching the condition directly to staging blocks:
If a separate, more aggressive pass also collapsed staging, the flow could become:
That is the core deobfuscation win: the original semantic branch becomes visible again.
XIII - Reading the bogus states correctly
The project also inserts bogus or normally unreachable states. These are not random accidents; they are part of the hardening layer.
Examples:

    logical 41 -> 0x5298e5f4
    logical 77 -> 0x12ff9d58
    logical 99 -> 0x35eade3a

They appear in the dispatcher and have real case bodies, but the normal path does not need them.

For example, state 99 checks whether the state is equal to its own encoded value:

    case 904584762:
        if (state == 904584762)
            state = 1515874915 ^ (input * 0);
        else
            state = 463217916 ^ (input * 0);
        break;

Since the dispatcher only enters that case when state == 904584762, the true branch is the meaningful branch. The else branch is structurally present but not feasible through the dispatcher.
This is a useful reminder: do not treat every visible edge as semantically equal. Some edges are decoys created by opaque predicates or self-checks.
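One way to encode that reminder in a tool is to evaluate self-checks using the fact that the state variable is known on entry to a case. The sketch below is a minimal illustration with a hypothetical `feasible_successors` helper; it assumes nothing in the case body modifies state before the comparison.

```python
def feasible_successors(case_value, compared_value, then_state, else_state):
    """Successors of a case body shaped like `if (state == K) A else B`.

    On entry to `case case_value:` the state variable equals case_value,
    so the comparison against compared_value has a known outcome.
    """
    return [then_state] if case_value == compared_value else [else_state]


BOGUS = 904584762   # logical 99, 0x35eade3a
EXIT = 1515874915   # logical 0, 0x5a5a6a63
ENTRY = 463217916   # logical 1, 0x1b9c24fc

# Only the edge to the exit state survives; the edge back to entry is a decoy.
print(feasible_successors(BOGUS, BOGUS, EXIT, ENTRY))
```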
XIV - Summary
The project hardens the baseline CFF transform by:
- Encoding logical states with a sparse affine/XOR transform.
- Wrapping state assignments in input-dependent opaque arithmetic.
- Adding a guard variable and bogus states.
- Routing the real password condition through staging states before the payloads.
In the binary, the important facts are:
- The state lives at [rbp-0x8].
- The exit state is 0x5a5a6a63.
- The loop latch is at 0x1389.
- The switch comparison tree starts at 0x116f.
- The real password condition is at 0x12b7.
- The real condition is a branch diamond.
- The condition selects staging states 0x09dfd4b2 and 0x4f11870f.
- Those staging states resolve to 0x12d4 and 0x12e0, and then to the success and failure payload states.
The useful deobfuscation model is: find where the real condition selects encoded states, resolve those states through the dispatcher, and patch direct branches to the resolved targets.
That gives us back the original semantic shape: