Blog

Win32 API Shellcode Hash Algorithm

1. A Modest Proposal

Daylight Saving Time

Allegedly, the purpose of Daylight Saving Time is to save energy by manipulating a unit of measurement.

Mileage Saving Time

I have a similar proposal for how to save on gasoline usage. If we redefine the mile to be 4,800 feet during Summer — when people drive the most. Then everyone will drive 10% more miles per gallon of gas. So for example, during the winter, if your car gets 30MPG, then during Mileage Saving Time, you’d be getting 33MPG!

(Actually, it’s more like redefining the distance between San Francisco, and Sacramento from 90 miles to 80 miles. That way the two cities are closer together, reducing the amount of time and energy spent traveling between them.)

2. Something Technical

Simple Hash Function(s)

I occasionally spend time reverse engineering shellcode used in various attacks. And, someday, should you find yourself in a similar situation, the following information might be useful…

The Last Stage of Delerium research group, back in 2002, published a technique for doing Win32 API RVA lookups using only the hash of a string — the name of the API function — rather than storing, and performing a full compare on the very long string. (Which some shellcode still does anyway.)

In their paper, they proposed the use of a simple shift and add style integer hash. (Just like the ones taught in a CompSci 101 class.) I don’t believe that this hash has a more proper name. In the paper, and accompanying [hard to find] PoC code, they used h=((h<<5)|(h>>27))+c to calculate the hash. (c being each letter of the string, and h the running total.)

Reference:

http://lsd-pl.net/projects/winasm-1.0.1.pdf (§2.1 Page 9)

http://lsd-pl.net/projects/winasm-1.0.1.tar.gz
(wasn’t linked to on the site, hard to find)
lsd-pl.net is down at the moment, let me know if you really need a copy of these files and can’t find them elsewhere. (‘Cause I found copies.)

Matt Miller’s 2003 paper Understanding Windows Shellcode §3.3.1 http://nologin.org/Downloads/Papers/win32-shellcode.pdf [PDF] uses almost exactly the same technique, except that the bit shift length is different in his examples. It’s essentially: h=((h<<19)|(h>>13))+c

For some reason — probably widespread code availability and copy pasta — this is the only hash algorithm ever used in any shellcode I’ve seen. Little things like this make it possible to construct a phylogenetic tree of shellcode evolution/authorship over time.

Rationale

I got tired of asking, Wait, what was 0xEC0E4E8E again? So, I made up a table of every hash, for most of the common Win32 API functions. Hopefully this gets indexed by Google, as I have a habit of using that to search, rather then remembering where I left my list.

A Perl script to generate that table

#!/usr/bin/perl -w
use strict;
sub ror {
my $number = shift;
my $bitshift = shift;
return ($number >> $bitshift) | ($number << (32 - $bitshift));
}
sub hash {
my @name = unpack("C*",shift);
my $hash = 0;
for(my $i=0; $i<=$#name; $i++) {
$hash = ror($hash, 0xD) + $name[$i];
}
return $hash;
}
while(<>) {
chomp;
printf ("%08x\t%s\n", hash($_),$_); #modify this line if you want SQL or HTML, &c.
}

Except from Last Stage of Delerium’s PoC

This is the code that almost nobody ever saw, and is never used by malware.

    xor   eax,eax                          ; index
i3: mov   esi,[4*eax+ebx]                  ; ptr to symbol name
add   esi,edx                          ; rva2va
push  ebx                              ; hash: h=((h<<5)|(h>>27))+c
push  eax
xor   ebx,ebx
i4: xor   eax,eax                          ; hash loop
lodsb
rol   ebx,5
add   ebx,eax
cmp   eax,0
jnz   i4
ror   ebx,5
cmp   ebx,ecx                          ; hash compare
pop   eax
pop   ebx
je    i5                               ; same: symbol found
inc   eax
jmp   i3                               ; different: go to the next symbol

Or in actual bytes…


00000985  53                push ebx
00000986  50                push eax
00000987  33DB              xor ebx,ebx ; hash loop
00000989  33C0              xor eax,eax
0000098B  AC                lodsb
0000098C  C1C305            rol ebx,0x5
0000098F  03D8              add ebx,eax
00000991  83F800            cmp eax,byte +0x0
00000994  75F3              jnz 0x989
00000996  C1CB05            ror ebx,0x5
00000999  3BD9              cmp ebx,ecx ; hash compare
0000099B  58                pop eax
0000099C  5B                pop ebx
0000099D  7403              jz 0x9a2
0000099F  40                inc eax
000009A0  EBDE              jmp short 0x980

The Table (For Google’s Sake)

This is just to make is easy to Google for any of these DWORDs. (As big and little endian, DWORDs and BYTEs in various formats.



Julia Wolf @ FireEye Malware Intelligence Lab

Questions/Comments to research [@] fireeye [.] com



2 thoughts on “Win32 API Shellcode Hash Algorithm

  1. Nice one on the Mileage Savings Time. I really detest DST.
    Most times, it seems, ‘normal’ people just don’t get it when technical people (like Ben Franklin) are joking. When those ‘normal’ people happen to be legislators… bad things can happen.
    Great analysis, as well - I always enjoy your technical dissections!
    Keep up the good work…

Comments are closed.