A black-box approach against obfuscated regular expressions using Pin

After reading the “Regular Expressions Obfuscation Under the Microscope” blogpost by Axel “0vercl0k” Souchet less than two weeks ago, I decided to try the approach detailed by Jonathan Salwan in his blogpost “A binary analysis, count me if you can“.

[NOTE: You probably want to read the two blogposts mentioned above before continuing reading.]

In the latter post, Jonathan explains how to find the right password in a crackme program with a black-box approach, by instrumenting the program with Intel’s Pin, a Dynamic Binary Instrumentation (DBI) framework. The basic idea of the approach is that (depending on how the target program was coded), you can analyze the execution flow of the program by counting the number of instructions executed when processing different input strings, and based on that instruction count, determine whether the program is accepting or not the input characters. Quoting Jonathan, this approach works for programs that rely “on a simple string-compare function that returns as soon as the function sees a difference” between your input string and the correct string; that is, it works for programs in which “as long as the comparison is correct, the number of instruction executed increases“.

Part 1 – Simple, non-obfuscated regular expression

Let’s take a look at the first C program that accepts the regular expression “Hi-[0-9]{4}”, taken from the “Regular Expressions Obfuscation Under the Microscope” post:


(fsm_example.c)

#include "stdafx.h"
#include <stdio.h>
#include <string.h>

unsigned char checkinput(char* s)
{
    unsigned int state = 0, i = 0;
    do
    {
        switch(state)
        {
            case 0:
            {
                if(*s == 'H')
                    state = 1;

                break;
            }

            case 1:
            {
                if(*s == 'i')
                    state = 2;
                else
                    return 0;

                break;
            }

            case 2:
            {
                if(*s == '-')
                    state = 3;
                else
                    return 0;

                break;
            }

			case 3: case 4: case 5: case 6:
            {
                if(*s >= '0' && *s <= '9')
                    state++;
                else
                    return 0;

                break;
            }

            case 7:
                return 1;
        }
    } while(*s++);

    return 0;
}

int main(int argc, char *argv[])
{
    if(argc != 2)
    {
        printf("./fsm <string>\n");
        return 0;
    }

    if(checkinput(argv[1]))
        printf("Good boy.\n");
    else
        printf("Bad boy.\n");

    return 1;
}

This is the graph of the checkinput() function at the binary level:

regex1

Now let’s assume that we don’t know what strings are accepted by the checkinput() function, so we want to find it out by using the Pin-based black-box approach detailed above.

After some testing, I realized that counting the whole number of instructions executed by the program, as done by Jonathan, wasn’t working for me. So, instead of that, I decided to narrow down the instrumentation and just count the number of executed basic blocks belonging to the checkinput() function. In my compiled binary, the checkinput() function starts at 0x401000, and ends at 0x401000. So I made this Pintool, based on the tools/SimpleExamples/icount.cpp example, which counts the number of executed basic blocks which are located within the checkinput() start/end addresses:

#include "pin.H"
#include <iostream>
#include <cstring>

/* ===================================================================== */
/* Global Variables */
/* ===================================================================== */

UINT64 bbl_count = 0;
ADDRINT CHECKINPUT_START_ADDRESS = 0x401000;	//Start address of the checkinput() function
ADDRINT CHECKINPUT_END_ADDRESS   = 0x4010b9;	//End address of the checkinput() function


bool isCheckinputAddress(BBL bbl){
	ADDRINT address = BBL_Address(bbl);
	return ((address >= CHECKINPUT_START_ADDRESS) && (address <= CHECKINPUT_END_ADDRESS));
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr <<
        "This tool prints out the number of basic blocks belonging to the checkinput() function executed to stderr.\n"
        "\n";

    cerr << KNOB_BASE::StringKnobSummary();
    cerr << endl;
    return -1;
}

/* ===================================================================== */
VOID docount()
{
    bbl_count++;
}
/* ===================================================================== */

VOID Trace(TRACE trace, VOID *v)
{
    // Visit every basic block in the trace
    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
    {
		if (isCheckinputAddress(bbl)){
			// Insert a call to docount() before every basic block
			BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
		}
    }
}

/* ===================================================================== */
VOID Fini(INT32 code, VOID *v)
{
    cerr <<  "Count " << bbl_count  << endl;
    
}
/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char *argv[])
{
    if( PIN_Init(argc,argv) )
    {
        return Usage();
    }
    
    TRACE_AddInstrumentFunction(Trace, 0);
    PIN_AddFiniFunction(Fini, 0);

    // Never returns
    PIN_StartProgram();
    
    return 0;
}

I’ve also coded a Python script that bruteforces the fs_example.exe binary by instrumenting it with the Pintool shown above:

import os
import sys
import string
import time
import random

def main():
    start_time = time.time()
    
    pwd  = "_______"
    base = 0
    off  = 0x00
    sav  = 0x00

    alphabet = string.ascii_uppercase + string.ascii_lowercase + string.digits + '-'

    while base < len(alphabet):
        pwd = pwd[:off] + alphabet[base] + pwd[off+1:]
        cmd = 'pin.bat -t icount.dll -- fsm_example.exe "%s"' % pwd
        input, output = os.popen4(cmd)
		#Grab the number of executed basic blocks from the Pintool's output
        count = int(output.read().split()[-1])

        if sav == 0:
            sav = count

        print "Tried '%s', basic block count: %d\t\t[%d]" % (pwd, count, count-sav)

		#If the difference between the current count of executed basic blocks and the previous count
		#is greather than a threshold (5), assume that the current char was accepted as valid.
		#Sometimes the difference can be negative, that's why I'm using abs().
        if abs(count-sav) >= 5:
            print ">>>> Guessed char: %s" % alphabet[base]
            off += 1
            if off >= len(pwd):
                break
            base = 0
            sav = 0

        base += 1
        sav = count

    end_time = time.time()
    print "The password is %s [Took %s seconds]" %(pwd, end_time - start_time)
    sys.exit(0)


if __name__ == '__main__':
    main()

Note that for the sake of simplicity I’ve hardcoded the length of valid inputs (7 chars) when defining the “pwd” variable; there are some ways to guess it when it’s unknown, as explained in Jonathan’s post.

So, the interesting thing about this instrumentation-driven bruteforce is that we can discover one valid character at a time, instead of having to guess the full string at once!. So, instead of having to try len(alphabet) ** 7 == 63 ** 7 == 3938980639167 combinations (worst case when doing the most basic bruteforce), using this approach we just need to try len(alphabet) * 7 == 63 * 7 == 441 combinations in the worst case.

This is an excerpt from the output of the Python script, guessing the password in 175 seconds:

Tried 'A______', basic block count: 41		[0]
Tried 'B______', basic block count: 41		[0]
Tried 'C______', basic block count: 41		[0]
Tried 'D______', basic block count: 41		[0]
Tried 'E______', basic block count: 41		[0]
Tried 'F______', basic block count: 41		[0]
Tried 'G______', basic block count: 41		[0]
Tried 'H______', basic block count: 10		[-31]
>>>> Guessed char: H
[...]
Tried 'Hi-00c_', basic block count: 37          [0]
Tried 'Hi-00d_', basic block count: 37          [0]
Tried 'Hi-00e_', basic block count: 37          [0]
Tried 'Hi-00f_', basic block count: 37          [0]
Tried 'Hi-00g_', basic block count: 37          [0]
Tried 'Hi-00h_', basic block count: 37          [0]
Tried 'Hi-00i_', basic block count: 37          [0]
Tried 'Hi-00j_', basic block count: 37          [0]
Tried 'Hi-00k_', basic block count: 37          [0]
Tried 'Hi-00l_', basic block count: 37          [0]
Tried 'Hi-00m_', basic block count: 37          [0]
Tried 'Hi-00n_', basic block count: 37          [0]
Tried 'Hi-00o_', basic block count: 37          [0]
Tried 'Hi-00p_', basic block count: 37          [0]
Tried 'Hi-00q_', basic block count: 37          [0]
Tried 'Hi-00r_', basic block count: 37          [0]
Tried 'Hi-00s_', basic block count: 37          [0]
Tried 'Hi-00t_', basic block count: 37          [0]
Tried 'Hi-00u_', basic block count: 37          [0]
Tried 'Hi-00v_', basic block count: 37          [0]
Tried 'Hi-00w_', basic block count: 37          [0]
Tried 'Hi-00x_', basic block count: 37          [0]
Tried 'Hi-00y_', basic block count: 37          [0]
Tried 'Hi-00z_', basic block count: 37          [0]
Tried 'Hi-000_', basic block count: 44          [7]
>>>> Guessed char: 0
Tried 'Hi-000A', basic block count: 44          [0]
Tried 'Hi-000B', basic block count: 44          [0]
Tried 'Hi-000C', basic block count: 44          [0]
Tried 'Hi-000D', basic block count: 44          [0]
Tried 'Hi-000E', basic block count: 44          [0]
Tried 'Hi-000F', basic block count: 44          [0]
Tried 'Hi-000G', basic block count: 44          [0]
Tried 'Hi-000H', basic block count: 44          [0]
Tried 'Hi-000I', basic block count: 44          [0]
Tried 'Hi-000J', basic block count: 44          [0]
Tried 'Hi-000K', basic block count: 44          [0]
Tried 'Hi-000L', basic block count: 44          [0]
Tried 'Hi-000M', basic block count: 44          [0]
Tried 'Hi-000N', basic block count: 44          [0]
Tried 'Hi-000O', basic block count: 44          [0]
Tried 'Hi-000P', basic block count: 44          [0]
Tried 'Hi-000Q', basic block count: 44          [0]
Tried 'Hi-000R', basic block count: 44          [0]
Tried 'Hi-000S', basic block count: 44          [0]
Tried 'Hi-000T', basic block count: 44          [0]
Tried 'Hi-000U', basic block count: 44          [0]
Tried 'Hi-000V', basic block count: 44          [0]
Tried 'Hi-000W', basic block count: 44          [0]
Tried 'Hi-000X', basic block count: 44          [0]
Tried 'Hi-000Y', basic block count: 44          [0]
Tried 'Hi-000Z', basic block count: 44          [0]
Tried 'Hi-000a', basic block count: 44          [0]
Tried 'Hi-000b', basic block count: 44          [0]
Tried 'Hi-000c', basic block count: 44          [0]
Tried 'Hi-000d', basic block count: 44          [0]
Tried 'Hi-000e', basic block count: 44          [0]
Tried 'Hi-000f', basic block count: 44          [0]
Tried 'Hi-000g', basic block count: 44          [0]
Tried 'Hi-000h', basic block count: 44          [0]
Tried 'Hi-000i', basic block count: 44          [0]
Tried 'Hi-000j', basic block count: 44          [0]
Tried 'Hi-000k', basic block count: 44          [0]
Tried 'Hi-000l', basic block count: 44          [0]
Tried 'Hi-000m', basic block count: 44          [0]
Tried 'Hi-000n', basic block count: 44          [0]
Tried 'Hi-000o', basic block count: 44          [0]
Tried 'Hi-000p', basic block count: 44          [0]
Tried 'Hi-000q', basic block count: 44          [0]
Tried 'Hi-000r', basic block count: 44          [0]
Tried 'Hi-000s', basic block count: 44          [0]
Tried 'Hi-000t', basic block count: 44          [0]
Tried 'Hi-000u', basic block count: 44          [0]
Tried 'Hi-000v', basic block count: 44          [0]
Tried 'Hi-000w', basic block count: 44          [0]
Tried 'Hi-000x', basic block count: 44          [0]
Tried 'Hi-000y', basic block count: 44          [0]
Tried 'Hi-000z', basic block count: 44          [0]
Tried 'Hi-0000', basic block count: 49          [5]
>>>> Guessed char: 0
The password is Hi-0000 [Took 175.642999887 seconds]

Part 2 – Obfuscated regular expression

In the “Regular Expressions Obfuscation Under the Microscope” post, Axel Souchet also made an obfuscated version of the C program that accepts the regular expression “Hi-[0-9]{4}”.

In this case, the graph of the checkinput() function looks pretty harder than the original version:

regex2

After some tries to solve this complicated version with the very same approach described in Section 1, I had to go a little less black-box. This obfuscated version includes (fake) state transitions for non-valid input, so even non-valid input chars may increase the count of executed basic blocks, rendering the approach described in Section 1 impractical.

Then I thought that, instead of counting the number of executed basic blocks (which is misleading in this obfuscted program) I could try to register attempts to read more chars from the input string (that should mean that the checkinput() function has validated the current char and is going to process the following character).

After quickly browsing the checkinput() function in the binary, I found that a lot of paths end up here:incp

That “INC P” (INC ECX) instruction at address 0x40267d just makes ECX point to the next character in the input string; if the next char is not null (as checked in the next basic block at address 0x40267e), the execution flow jumps back to the beginning of the loop that processes the input string.

Having that in mind, I coded another Pintool that instruments that “INC P” (INC ECX) instruction at address 0x40267d shown above and counts how many times it is executed:

#include "pin.H"
#include <iostream>
#include <cstring>

/* ===================================================================== */
/* Global Variables */
/* ===================================================================== */

UINT64 ins_count = 0;
ADDRINT INC_P_ADDRESS = 0x40267d;		//Address of INC P (INC ECX)


//Determines if an instruction address is the same as the "INC P" (INC ECX) instruction of our interest
bool is_inc_p_address(INS ins){
	return INS_Address(ins) == INC_P_ADDRESS;
}


/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr <<
        "This tool prints out how many times the INC P (INC ECX) instruction was executed .\n"
        "\n";

    cerr << KNOB_BASE::StringKnobSummary();
    cerr << endl;
    return -1;
}

/* ===================================================================== */

VOID docount()
{
    ins_count++;
}

/* ===================================================================== */

VOID Instruction(INS ins, VOID *v)
{
		if (is_inc_p_address(ins)){
			// Insert a call to docount() before the INC P instruction
			INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
		}
}

/* ===================================================================== */

VOID Fini(INT32 code, VOID *v)
{
    cerr <<  "Count " << ins_count  << endl;
    
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char *argv[])
{
    if( PIN_Init(argc,argv) )
    {
        return Usage();
    }
    

    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);

    // Never returns
    PIN_StartProgram();
    
    return 0;
}

I have coded another Python script that bruteforces the regex2.exe binary by instrumenting it with this second Pintool. Note that this script is a bit more complicated than the first one: it will recursively get a list of candidate chars for each password position (some non-valid chars do hit the INC P instruction, so we have a bunch of possible chars for each position), up to a user-defined maximum length. When it reaches the maximum position, it will make a bit of dumb-bruteforce for the last character; we are trying to avoid dumb-bruteforce, but here we are doing it just for the last char, while having the previous chars being serious candidates according to the data obtained during instrumentation.

import os
import sys
import string
import random
import time

#Execute the binary and check if the output says 'Good boy'
def is_solution_valid(pwd):
    cmd = 'regex2.exe "%s"' % pwd
    input, output = os.popen4(cmd)
    is_valid = 'Good boy' in output.read()
    print "\n[Checking password %s: %s]\n" % (pwd, is_valid)
    return is_valid


#Returns a list of candidate chars at a given offset
def get_candidates(pwd, off, alphabet):
        previous = 0
        candidates = []
        for c in alphabet:
            pwd = pwd[:off] + c + pwd[off+1:]
            cmd = r'pin.bat -t icount_inc_p.dll -- regex2.exe "%s"' %(pwd)
            input, output = os.popen4(cmd)
            count = int(output.read().split()[-1])

            if previous == 0:
                previous = count

            print "Tried '%s', inc p count: %d\t\t[%d]" % (pwd, count, count-previous)

            #When a char is possibly valid, the number of times the "INC P" instruction is executed
            #is greater than the current offset being tested (because the checkinput() function grabs one more char)
            if count > off:
                print ">>>> Candidate char: %s" % c
                candidates.append(c)

            previous = count
        return candidates



def solve(off, pwd, max_offset):

    #End of recursion, we are trying to guess the last char.
    #Either we guess the full password and we finish, or we go back in the recursion process
    if off == max_offset:
        #Here we just blindy bruteforce every possible value for the last char;
        #That's not that much, 63 tries at most, and it's the simplest/fastest thing to do at this point.
        for c in alphabet:
            pwd2 = pwd[:off] + c + pwd[off+1:]
            if is_solution_valid(pwd2):
                end_time = time.time()
                print "Found it (exit 1)! Password: %s. Elapsed time: %s seconds." % (pwd2, end_time - start_time)
                sys.exit(0)
        return
    else:
        candidates = get_candidates(pwd, off, alphabet)
        print "get_candidates at offset %d: %s" % (off, candidates)
        #raw_input()
        for c in candidates:
            pwd = pwd[:off] + c + pwd[off+1:]
            if is_solution_valid(pwd):
                end_time = time.time()
                print "Found it (exit 2)!! Password: %s. Elapsed time: %s seconds." % (pwd, end_time - start_time)
                sys.exit(0)
            else:
                solve(off + 1, pwd, max_offset)


if __name__ == '__main__':
    alphabet = list(string.ascii_lowercase + string.ascii_uppercase + string.digits + '-')
    random.shuffle(alphabet)
    start_time = time.time()
    solve(0, '_______', 6)

This is an excerpt from the output of the second Python script, guessing the password in 239 seconds:

Tried 's______', inc p count: 0		[0]
Tried 'n______', inc p count: 0		[0]
Tried 'H______', inc p count: 1		[0]
>>>> Candidate char: H
Tried 'q______', inc p count: 0		[-1]
Tried 'V______', inc p count: 0		[0]
Tried 'm______', inc p count: 0		[0]
Tried '0______', inc p count: 0		[0]
Tried 'a______', inc p count: 0		[0]
Tried 'v______', inc p count: 0		[0]
Tried 'F______', inc p count: 0		[0]
Tried 'k______', inc p count: 0		[0]
Tried 'I______', inc p count: 0		[0]
Tried 'b______', inc p count: 0		[0]
Tried 'P______', inc p count: 0		[0]
Tried 'L______', inc p count: 1		[0]
>>>> Candidate char: L
[...]
Tried 'HmOoPx_', inc p count: 5		[0]
Tried 'HmOoPA_', inc p count: 6		[1]
>>>> Candidate char: A
Tried 'HmOoP4_', inc p count: 5		[-1]
Tried 'HmOoPS_', inc p count: 5		[0]
Tried 'HmOoPW_', inc p count: 5		[0]
Tried 'HmOoPN_', inc p count: 6		[1]
>>>> Candidate char: N
Tried 'HmOoPe_', inc p count: 5		[-1]
Tried 'HmOoPt_', inc p count: 5		[0]
Tried 'HmOoP3_', inc p count: 5		[0]
Tried 'HmOoPE_', inc p count: 6		[1]
>>>> Candidate char: E
Tried 'HmOoP6_', inc p count: 5		[-1]
Tried 'HmOoP9_', inc p count: 5		[0]
Tried 'HmOoPd_', inc p count: 5		[0]
Tried 'HmOoPQ_', inc p count: 5		[0]
Tried 'HmOoPG_', inc p count: 5		[0]
get_candidates at offset 5: ['H', 'A', 'N', 'E']

[Checking password HmOoPH_: False]
[Checking password HmOoPHs: False]
[Checking password HmOoPHn: False]
[Checking password HmOoPHH: False]
[...]
[Checking password HmOoPNp: False]
[Checking password HmOoPN5: False]
[Checking password HmOoPNc: True]

Found it (exit 1)! Password: HmOoPNc. Elapsed time: 239.9150002 seconds.

Wait, it says that the valid password is “HmOoPNc”! That string doesn’t even match the regex “Hi-[0-9]{4}”! It turns out that the obfuscated program is a bit buggy and is accepting some strings that do not match the regular expression, and we found these unexpected solutions with the Pintool + Python script :). If you test the regex2.exe program with the “HmOoPNc” input you will see that it’s validated:

C:\pintool>regex2.exe HmOoPNc
Good boy.

Other unexpected solutions include the following strings: ‘HPAUfdc’, ‘ciScWtS’, ‘c5gJ2yc’, ‘ LVS5nec’, ‘c5gLXcS’, ‘c5g4PQc’, ‘HygJtWS’, ‘HijXpdc’, ‘LV-4885’, ‘cB6m-dc’.

I hope you liked the post, any comments are welcome! 🙂

Advertisements

2 thoughts on “A black-box approach against obfuscated regular expressions using Pin

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s