In part 1 of keeping strings real, strings were chased around in a disassembler to provide insight into the functionality of a piece of malware. Part two investigates the instance where there seem to be no recognizable strings in the target at all.
When doing a quick skim of a malware file (in this case, ldr.exe md5:007571544614a7646e750a51ccaf2e9e), sometimes you encounter data that looks similar to the below image. In this particular instance, it seems that these strings are encrypted.
Encrypted strings make quick analysis very difficult. Fortunately, there are a couple of options for getting past this small roadblock. The string can be observed as it is passed around between functions, or a debugger can be used to halt execution when the target string data is accessed.
For this sample, the above strings were followed from the initial cross reference to a function that is self-contained. It takes arguments, executes some code, and returns. The code has a loop, operates on individual bytes (reads from pointers into register halves), and performs a few additions. Is this a possible candidate for a string decryption routine? You betcha.
After walking through the function once in a debugger, it becomes obvious that this function decrypts the string to a different buffer. This is an excellent first step, but there are a massive number of strings in this file. It would be less than desirable to execute this function in a debugger and make note of the result for each and every encrypted string present. There has to be a faster and more elegant way to figure these out. Now what?
Enter Cryptanalysis. This particular function is not very large or complicated, so determining the algorithm used to reveal the strings should not take an unreasonable amount of effort. After determining the algorithm, it is possible to write a program or script to accept the encrypted string data and output the decrypted string.
Below is what the reversed function looks like.
This function accepts what looks like a null-terminated Pascal string. The first byte of the string contains the length (0 to 255), followed by the ciphered string data, then a zero to indicate the end of the string.
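To make that layout concrete, here is a minimal Python sketch that pulls the ciphered bytes out of such a buffer. The function name `split_length_prefixed` is hypothetical, and the sketch assumes the length byte is stored unencrypted and counts only the ciphered bytes (not the trailing zero); neither detail is confirmed by the sample.

```python
def split_length_prefixed(blob: bytes) -> bytes:
    """Return the ciphered payload of a length-prefixed, zero-terminated buffer.

    Assumed layout (hypothetical): [length][length ciphered bytes][0x00]
    """
    if not blob:
        raise ValueError("empty buffer")

    length = blob[0]                  # first byte: payload length, 0..255
    payload = blob[1:1 + length]      # the ciphered string data
    if len(payload) != length:
        raise ValueError("buffer is shorter than its length byte claims")

    # The byte right after the payload should be the zero terminator.
    if len(blob) <= 1 + length or blob[1 + length] != 0:
        raise ValueError("missing zero terminator")

    return payload


if __name__ == "__main__":
    sample = bytes([3, 0xA7, 0xA6, 0xA5, 0x00])
    print(split_length_prefixed(sample).hex())  # a7a6a5
```

Validating the terminator is optional, but it is a cheap way to catch a selection that started at the wrong address.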
The next step is to add the cipher key value to the first encrypted character in the string. This key value starts at 186 (or 0xBA in hexadecimal). On each loop pass, the key is increased by 2 and added to the next character in the string.
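The loop described above can be sketched in a few lines of Python. This is an illustration rather than the malware's exact code: the name `decipher` and its parameters are hypothetical, and the key is assumed to live in a single byte, so it wraps at 256 just like the data.

```python
def decipher(payload: bytes, key: int = 0xBA, step: int = 2) -> bytes:
    """Add a rolling key to each ciphered byte, modulo 256.

    key  -- starting key value (186 / 0xBA per the reversed function)
    step -- amount added to the key after each byte (2 per the reversed function)
    """
    out = bytearray()
    for b in payload:
        out.append((b + key) & 0xFF)   # byte-sized addition wraps at 256
        key = (key + step) & 0xFF      # assumption: the key is byte-sized too
    return bytes(out)


if __name__ == "__main__":
    # 0xA7 + 0xBA, 0xA6 + 0xBC, 0xA5 + 0xBE, each taken modulo 256
    print(decipher(bytes([0xA7, 0xA6, 0xA5])))  # b'abc'
```

Whether the key itself is masked to a byte does not change the output, since only the low 8 bits of each sum are kept.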
For instance, the character ‘a’ is represented by the number 97 (0x61). To encrypt this initial data based on the algorithm above, we would subtract 186 (0xBA). To decrypt it later, 186 (0xBA) is added to the encrypted data.
The result of this 97 – 186 subtraction is 167 (0xA7): the true difference, -89, wraps around to -89 + 256 = 167. This math looks funny, but it works this way when working with individual bytes and their associated range of 0 to 255 (unsigned).
This behavior is due to the wrap-around effect of fixed-width integer arithmetic (an integer overflow or, for subtraction, an underflow). To see this in action on Windows, open Calculator (calc.exe), change the view to scientific mode, change the number system from decimal (Dec) to hexadecimal (Hex), and finally change the size from “Qword” to “Byte.” Now you can type in 61 minus BA and the result is A7 (167).
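The same wrap-around can be reproduced outside of Calculator. Python integers do not overflow on their own, so this small sketch masks each result to 8 bits to imitate a byte-sized register; the helper names `sub_byte` and `add_byte` are made up for illustration.

```python
def sub_byte(a: int, b: int) -> int:
    """Subtract b from a as an unsigned 8-bit value (wraps below 0)."""
    return (a - b) & 0xFF


def add_byte(a: int, b: int) -> int:
    """Add a and b as unsigned 8-bit values (wraps above 255)."""
    return (a + b) & 0xFF


if __name__ == "__main__":
    enc = sub_byte(0x61, 0xBA)        # 'a' minus the initial key
    print(hex(enc))                   # 0xa7
    print(chr(add_byte(enc, 0xBA)))   # a
```

Because both operations are taken modulo 256, adding the key always undoes subtracting it, which is why the decryption loop only needs additions.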
Keeping the above math in mind, the algorithm can now be re-implemented using IDA’s built-in scripting language (IDC). The script needs to be passed the source string data, extract a byte, add the key to the byte, store the result, add 2 to the key, and repeat this process until all bytes in the string have been processed.
The Byte() function will be used to extract the byte from the “address” of the string’s beginning found in IDA’s disassembler window. The Message() function will display the deciphered byte in the message window, and the PatchByte() function will modify the representation of the byte inside of the disassembler window. (Note: PatchByte() can be commented out to prevent the script from actually modifying any data; the script will then simply print the result in the message window.)
The script representation of this algorithm reconstruction is found in the image below, and the IDC script itself can be downloaded from our PC Tools ThreatFire forum, where you can log in and scroll down the thread for 186plus2_decipher.zip:
Now it is time for some fun. An encrypted string is selected in IDA for decoding and the script is launched. The result: