We previously reported on a Microsoft Word vulnerability tracked as CVE-2012-0158; see our earlier report for details. Following up on a VirusTotal result, this article analyzes a malware sample that exploits CVE-2012-0158. Last month, the following email caught my attention:
Figure 1. E-mail in the inbox
This email was sent from an unknown address (**firstname.lastname@example.org) and carried a .DOC attachment (Lịch-các-ngày-lễ.doc). Because we suspected that the attachment contained a keylogger, a type of malware, we uploaded it to VirusTotal (https://www.virustotal.com) to check whether it was infected. As expected, 28 of 57 antivirus engines, including Kaspersky, Bitdefender, and ESET-NOD32, identified it as a CVE-2012-0158 exploit. This exploit was attributed to an author named Trần Duy Linh.
I started by working out how to find the shellcode contained in the .DOC file. I used Frank Boldewin's OfficeMalScanner toolkit to scan the file. The result showed that an OLE2 Compound Format object was embedded in it.
Figure 2. Result returned by the OfficeMalScanner toolkit
We then scanned the embedded OLE2 file with the OfficeMalScanner toolkit:
Figure 3. Result of scanning the OLE file with the OfficeMalScanner toolkit.
The scan of the OLE file did not detect anything malicious. Therefore, we decided to find the shellcode by hand.
We used 010 Editor to analyze the .DOC file. Because this file is not like the RTF file analyzed earlier, we searched for a NOP string (90 90 90 90), with which shellcode often starts. The search returned 2 offsets where a NOP string began. I was particularly interested in offset 0x6DD0.
Figure 4. Signs of shellcode in .DOC file.
Just before the NOP sled, we noticed the 4-byte value 0x27583C30 (little-endian), the address of a JMP ESP opcode located in MSCOMCTL.OCX on Windows XP SP3. The byte string following the NOP sled also stood out: it matched the opcodes of familiar instructions (PUSHAD and JMP [offset]).
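The manual search described above can also be scripted. The following is a minimal sketch, not the procedure actually used in this analysis: the helper names `find_nop_runs` and `dword_before` are hypothetical, and the demo buffer is synthetic rather than taken from the real sample. It lists runs of 0x90 bytes and reads the little-endian 32-bit value stored directly in front of each run, so that a candidate return address such as 0x27583C30 stands out.

```python
import struct

# Address quoted in the article (JMP ESP in MSCOMCTL.OCX on Windows XP SP3).
JMP_ESP_ADDR = 0x27583C30


def find_nop_runs(data: bytes, min_len: int = 4):
    """Return (offset, length) for every run of at least min_len 0x90 bytes."""
    runs = []
    i = 0
    while i < len(data):
        if data[i] == 0x90:
            start = i
            while i < len(data) and data[i] == 0x90:
                i += 1
            if i - start >= min_len:
                runs.append((start, i - start))
        else:
            i += 1
    return runs


def dword_before(data: bytes, offset: int):
    """Read the little-endian 32-bit value stored just before offset, or None."""
    if offset < 4:
        return None
    return struct.unpack_from("<I", data, offset - 4)[0]


# Synthetic buffer for illustration only: padding, address, NOP sled, PUSHAD (0x60).
demo = b"\x00" * 8 + struct.pack("<I", JMP_ESP_ADDR) + b"\x90" * 16 + b"\x60"
for start, length in find_nop_runs(demo):
    print(hex(start), length, hex(dword_before(demo, start)))
```

On a real sample, `data` would be the raw bytes of the .DOC file. Any hit still needs manual confirmation, because runs of 0x90 also occur in ordinary data.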
To be sure, we disassembled the hex code starting at 0x6E00 with an online disassembler. The code began with PUSHAD, and its first 0x1F bytes decoded the following 0x167 bytes by XOR with 0xCC.
Figure 5. The shellcode decodes itself by XOR with 0xCC
We extracted the 0x167 bytes starting at offset 0x6E1F into a .bin file and used FileInsight to XOR them with 0xCC.
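The same extraction and decoding step can be reproduced with a few lines of Python instead of a hex editor. This is a sketch under assumptions: the helper names `carve` and `xor_decode` are made up for illustration, and the offset and size in the trailing comment are simply the values quoted above.

```python
def carve(data: bytes, offset: int, size: int) -> bytes:
    """Return exactly size bytes starting at offset, or raise if out of range."""
    chunk = data[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("requested range extends past the end of the data")
    return chunk


def xor_decode(data: bytes, key: int = 0xCC) -> bytes:
    """XOR every byte with a single-byte key (0xCC in this sample)."""
    return bytes(b ^ key for b in data)


# Round-trip demo on a harmless string: XOR with the same key is its own inverse.
encoded = xor_decode(b"kernel32")
print(xor_decode(encoded))

# With the real file, the values from the article would be used like this:
#   raw = open("sample.doc", "rb").read()          # hypothetical file name
#   decoded = xor_decode(carve(raw, 0x6E1F, 0x167))
```

Because single-byte XOR is symmetric, the same function both encodes and decodes, which is why the round trip on a test string is a useful sanity check before touching the sample.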
After loading the decoded bytes into IDA Pro and pressing the "C" key several times to convert them to code, we obtained the following result:
Figure 6. The "kernel32" string stored on the stack.
As a result, we could confirm the following about the shellcode:
- Starting position: offset 0x6E00 of the .DOC file.
- Size of shellcode: 0x187
- The shellcode decodes itself: its first 0x1F bytes XOR the rest with 0xCC.
We then began a more thorough analysis of the shellcode. There were two ways to analyze it dynamically:
- Method 1: Extract the shellcode by hand, then use a tool to convert it into an .exe file or write a program that jumps into the shellcode.
- Method 2: Change one byte in the .DOC file to the value 0xCC.
We chose the second method and changed one of the 0x90 bytes (NOP) to 0xCC. We then debugged the .DOC file by loading Microsoft Office 2007 SP3 into IDA Pro on a virtual machine running Windows XP SP3. The debugger stopped at the byte we had changed to 0xCC.
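The patch itself is a one-byte change: 0xCC is the x86 INT3 opcode, so a debugger breaks as soon as execution reaches it. A minimal sketch of a guarded patch follows. The function name `patch_byte` is hypothetical, and which NOP inside the sled to replace is left to the analyst; the article does not state the exact offset that was modified.

```python
def patch_byte(data: bytes, offset: int, expected: int = 0x90, new: int = 0xCC) -> bytes:
    """Return a copy of data with one byte replaced.

    Refuses to patch unless the byte at offset currently equals `expected`,
    so a wrong offset cannot silently corrupt the file.
    """
    if not 0 <= offset < len(data):
        raise IndexError("offset is outside the data")
    if data[offset] != expected:
        raise ValueError(
            f"byte at {offset:#x} is {data[offset]:#04x}, expected {expected:#04x}"
        )
    patched = bytearray(data)
    patched[offset] = new
    return bytes(patched)


# Demo on a synthetic buffer: two NOPs followed by PUSHAD (0x60).
print(patch_byte(b"\x90\x90\x60", 1).hex())
```

Writing the result to a copy of the document, rather than overwriting the original, keeps an unmodified sample available for later comparison.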
We set a breakpoint at the RET instruction of the decryption routine and continued debugging. The shellcode decrypted itself with XOR and started its main job.
First, the shellcode parsed the PEB to get the base address of kernel32.dll.
After getting the base address of kernel32.dll, the shellcode used a decryption function to find the addresses of 6 APIs whose names were stored as encrypted values, as follows:
The decryption function performed the following tasks:
- Locating the ENT (Export Name Table) of kernel32.dll.
- Iterating over the exported APIs and computing an encoded value from each API name with the algorithm shown in the listing below.
- Returning the address of the function whose encoded name matches the input value.
This is the assembly code of the decryption function:
Stack:00121CDA decrypt_:                        ; CODE XREF: Stack:00121D2Dp
Stack:00121CDA     pusha
Stack:00121CDB     mov ebp, [esp+24h]            ; module base address
Stack:00121CDF     mov eax, [ebp+3Ch]            ; offset of the PE header
Stack:00121CE2     mov edx, [ebp+eax+78h]        ; RVA of the export directory
Stack:00121CE6     add edx, ebp
Stack:00121CE8     mov ecx, [edx+18h]            ; NumberOfNames
Stack:00121CEB     mov ebx, [edx+20h]            ; AddressOfNames (Export Name Table, ENT)
Stack:00121CEE     add ebx, ebp
Stack:00121CF0 loc_121CF0:                      ; CODE XREF: Stack:00121D0Dj
Stack:00121CF0     jecxz short loc_121D28        ; no names left: give up
Stack:00121CF2     dec ecx
Stack:00121CF3     mov esi, [ebx+ecx*4]
Stack:00121CF6     add esi, ebp                  ; esi -> current API name
Stack:00121CF8     xor edi, edi
Stack:00121CFA     xor eax, eax
Stack:00121CFC     cld
Stack:00121CFD loc_121CFD:                      ; CODE XREF: Stack:00121D07j
Stack:00121CFD     lodsb
Stack:00121CFE     test al, al
Stack:00121D00     jz short loc_121D09
Stack:00121D02     rol edi, 13h
Stack:00121D05     add edi, eax
Stack:00121D07     jmp short loc_121CFD
Stack:00121D09 ; ---------------------------------------------------------------------------
Stack:00121D09 loc_121D09:                      ; CODE XREF: Stack:00121D00j
Stack:00121D09     cmp edi, [esp+28h]            ; compare with the encoded value passed in
Stack:00121D0D     jnz short loc_121CF0
Stack:00121D0F     mov ebx, [edx+24h]            ; AddressOfNameOrdinals
Stack:00121D12     add ebx, ebp
Stack:00121D14     mov cx, [ebx+ecx*2]
Stack:00121D18     mov eax, edx
Stack:00121D1A     mov ebx, [eax+1Ch]            ; AddressOfFunctions
Stack:00121D1D     add ebx, ebp
Stack:00121D1F     mov eax, [ebx+ecx*4]
Stack:00121D22     add eax, ebp
Stack:00121D24     mov [esp+1Ch], eax            ; return the API address in the saved EAX
Stack:00121D28 loc_121D28:                      ; CODE XREF: Stack:loc_121CF0j
Stack:00121D28     popa
Stack:00121D29     retn
Stack:00121D2A ; ---------------------------------------------------------------------------
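The inner loop of the listing (lodsb, rol edi, 13h, add edi, eax) can be re-implemented in Python to compute the value for any candidate API name and match it against the constants found in the shellcode. The sketch below is an illustration, not the author's tooling: the names `rol32`, `name_hash`, and `resolve` are hypothetical, and the candidate list contains only the three API names that appear later in this article.

```python
def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def name_hash(name: bytes) -> int:
    """Mirror the loop: for each byte, rotate EDI left by 0x13, then add the byte."""
    edi = 0
    for byte in name:
        edi = rol32(edi, 0x13)
        edi = (edi + byte) & 0xFFFFFFFF
    return edi


def resolve(target: int, names):
    """Return the first name whose value equals target, scanning from the last
    entry to the first as the assembly does (dec ecx), or None if none match."""
    for name in reversed(names):
        if name_hash(name) == target:
            return name
    return None


candidates = [b"GlobalAlloc", b"ReadFile", b"SetFilePointer"]
for api in candidates:
    print(api.decode(), hex(name_hash(api)))
```

Rotating a 32-bit register left by 0x13 (19) bits is the same operation as rotating it right by 13 bits. Feeding the export names of kernel32.dll through `name_hash` and comparing the results with the values pushed before each call is one way to recover the API names without stepping through the debugger.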
We set a breakpoint after the decryption function and kept tracing, which yielded the 6 APIs in turn:
Because the API names are stored only in encoded form, the shellcode is harder to analyze.
After resolving these API addresses, the shellcode allocated memory and retrieved the handle of kernel32. We wondered why the author called the decryption function repeatedly to resolve API addresses before the shellcode allocated memory and read data from the .DOC file.
debug016:00350157 loc_350157:                   ; CODE XREF: debug016:0035014Aj
debug016:00350157     push 0                     ; dwMoveMethod = FILE_BEGIN
debug016:00350159     push 0                     ; lpDistanceToMoveHigh = NULL
debug016:0035015B     push dword ptr [ebp-18h]   ; offset 0x1A830
debug016:0035015E     push dword ptr [ebp-4]     ; hFile
debug016:00350161     call dword ptr [ebp-40h]   ; call SetFilePointer
debug016:00350164     push dword ptr [ebp-14h]   ; size = 0x7B2
debug016:00350167     push 40h
debug016:00350169     call dword ptr [ebp-34h]   ; call GlobalAlloc
debug016:0035016C     mov [ebp-0Ch], eax         ; buffer of 0x7B2 bytes
debug016:0035016F     push 0
debug016:00350171     lea eax, [ebp-1Ch]
debug016:00350174     push eax
debug016:00350175     push dword ptr [ebp-14h]   ; size = 0x7B2
debug016:00350178     push dword ptr [ebp-0Ch]   ; buffer
debug016:0035017B     push dword ptr [ebp-4]     ; hFile
debug016:0035017E     call dword ptr [ebp-3Ch]   ; call ReadFile
debug016:00350181     mov eax, [ebp-0Ch]
debug016:00350184 JMP_To_Dropper_:              ; jump to the dropper read from the .DOC
debug016:00350184     jmp eax
During the analysis, we found that the shellcode executed a second piece of code stored in the .DOC file (we call it shellcode2): it copied 0x7B2 bytes from offset 0x1A830 of the file into the allocated buffer and then jumped straight into that buffer.
Because we ran into errors when debugging shellcode2 in IDA Pro, we extracted shellcode2 with 010 Editor and, for convenience, named it the "Dropper". We saved the Dropper's hex code to a .BIN file and started analyzing it.
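The manual extraction with a hex editor has a straightforward scripted counterpart that mirrors the SetFilePointer and ReadFile calls in the listing above. This is a sketch only: the function name `read_stage` is hypothetical, the demo runs on an in-memory stand-in rather than the real document, and the offset and size in the trailing comment are the values reported in this article.

```python
import io


def read_stage(stream, offset: int, size: int) -> bytes:
    """Seek to offset from the start of the stream and read exactly size bytes."""
    stream.seek(offset)            # counterpart of SetFilePointer(..., FILE_BEGIN)
    payload = stream.read(size)    # counterpart of ReadFile
    if len(payload) != size:
        raise ValueError("stream is shorter than offset + size")
    return payload


# In-memory stand-in for the document: 16 padding bytes, a marker, more padding.
fake_doc = io.BytesIO(b"\x00" * 0x10 + b"STAGE2" + b"\x00" * 4)
print(read_stage(fake_doc, 0x10, 6))

# With the real file, the values from the article would be used like this:
#   with open("sample.doc", "rb") as f:            # hypothetical file name
#       dropper = read_stage(f, 0x1A830, 0x7B2)
#   with open("dropper.bin", "wb") as out:         # hypothetical output name
#       out.write(dropper)
```

The length check matters in practice: a truncated sample would otherwise yield a short buffer that disassembles into misleading code.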