Blog

Technical details of Srizbi’s domain generation algorithm

This post will dive into the algorithm by which Srizbi decides which domain name to contact on a given day

For this analysis, I [Julia] reverse engineered the following sample:

MD5:8adb642389b8bf999e2017d731edcb00 bot.sys (Linked on Mon Dec 10 03:57:34 2007 — if the timestamps can be trusted.)

Which is not only unpacked, but the author left the debugging symbols in. The original name for this project is revealed by: c:\reactor3\client\Release\client.pdb

The bot (Srizbi) is loaded into the NT Kernel as a device driver,
and then mucks around with the pointers for several kernel functions, like
ZwEnumerateKey, ZwOpenKey, ZwQueryValueKey, ZwCreateFile, ZwLoadDriver,
etc. These are all fairly common rootkit techniques, which probably
deserve a post of their own. This post is about the pseudo-random DNS
name generation.

When Srizbi can’t contact 208.72.169.22 or 208.72.169.136 or … — its
primary C&C servers (all hosted at McColo, but you know this story by
now), after ten attempts, it switches to a backup server — four backup
servers in fact, with randomly generated names. It then cycles through
these destinations forever, until something tells it what it wants to
hear.

These backup DNS names were not registered, which made everyone ask:
Who would code a backup DNS name into a bot, and then fail to register
that name?
I speculated that it may have been an unknown feature of the
code base that the bot author used, assuming that they didn’t write that
much of the code, or even read any of it.

Upon examining the binary in IDA, the mechanism became obvious.

There is a call to KeQuerySystemTime, and then RtlTimeToTimeFields to
get the current year, month, and day. This is done in a loop four times
with a call to the function at the bottom of this post. The date is swizzled around in various ways, such as if there are more than 12 months, the quotient is converted to years, and
leap-years are calculated, and other transforms like that. The returned day count is rounded down to the nearest third day. (i.e. Julian days
which are even multiples of three.)

An array of eight bytes (only the low nibble is used) is created from
every other byte from the XOR of the current day, and the magic number
in this specific sample, 0x5BE741E3.

The third byte of the day, is XOR’d over all eight bytes in this
array, and each byte is multiplied by the current iteration of the far
outside loop – the one driving the creation of four DNS names – mod fifteen.

This leaves you with eight bytes, with each byte in the range 0 to 14
inclusive. Each byte is used as an index into the string:

"qwertyuiopasdfghjklzxcvbnm" — of which only the first 15 letters can
ever be used. This is why all of the DNS names never use the letter h,
or z, or m, etc.

As the function is linear, if you have the four DNS names, or
even just the first one, and the date it was generated, you can crank through this function
backwards to get the magic number, which varies between most samples. We identified 55 unique magic numbers throughout the hundreds of samples we analyzed.

q w e r t y u i o p a s d f g h j k l z x c v b n m
0 1 2 3 4 5 6 7 8 9 a b c d e - - - - - - - - - - -

So, given a name like yrytdyip.com, the array of eight bytes that
gives you that string is, y=5, r=3, y=5, t=4, d=12, y=5, i=7, p=9, or
0x5354c579 when you put it all together.

The first name the bot spits out has been multiplied by one, so it’s the
exact same thing that’s in memory after all of the above. The next one is
multiplied by two, and then the modulus is taken on each byte to keep
their values below fifteen. The third one is multiplied by three, mod 15.
And the fourth multiplied by four, mod 15.

So, for example:

5*2 = 10%15 ≡ 10
5*3 = 15%15 ≡ 0
5*4 = 20%15 ≡ 5
12*2 = 24%15 ≡ 9
12*3 = 36%15 ≡ 6
12*4 = 48%15 ≡ 3

So the Multiplication table looks like this:

0 1 2 3 4 5 6 7 8 9 a b c d e
0 2 4 6 8 a c e 1 3 5 7 9 b d
0 3 6 9 c 0 3 6 9 c 0 3 6 9 c
0 4 8 c 1 5 9 d 2 6 a e 3 7 b

Here are some real-world examples:

yrytdyip.com  5354c579  2008-11-13  (times one)
auaopagr.com  a6a89ae3  2008-11-13  (times two)
qpqduqud.com  090c606c  2008-11-13  (times three)
ydywryfu.com  5c5135d6  2008-11-13  (times four)
ererseqg.com  2323b20e  2008-11-17  (times one)
tutuitqf.com  4646740d  2008-11-17  (etc.)
upupruqd.com  6969360c  2008-11-17
ododgoqs.com  8c8ce80b  2008-11-17

These examples are from the same bot, but they are more than three days
apart, so the Julian date has rolled over:

Nov 13, 2008 ≡ 318 = (106*3)
Nov 16, 2008 ≡ 321 = (107*3)
Nov 19, 2008 ≡ 324 = (108*3)

XORing two of these DNS names together, reveals the byte from the current
date, which is repeatedly used: (The two days shown XOR’s here, obviously.)

0x2323b20e ^ 0x5354c579 = 0x70777777 (That’s [EBP+day+3])

So, on Nov 19, all of the bots in this Botnet
incremented to the next set of DNS names. They do this every three
days, and would have for the rest of forever, until someone (either related to the Botnet or not) told the bots what to do. The details for this are in a previous post.

This is the function that does most of this work. It’s only somewhat
crudely commented right now, but it should be evident what is happening.


; Attributes: bp-based frame
build_dns_name proc near
var_something= dword ptr -0Ch
var_random= dword ptr -4
arg_year= dword ptr  8
arg_month= dword ptr  0Ch
arg_day= dword ptr  10h
arg_that_new_buffer= dword ptr  14h
arg_iteration= dword ptr  18h
; register ecx: (null)
push    ebp
mov     ebp, esp
sub     esp, 0Ch
push    ebx             ; still 0
push    esi
push    edi
push    [ebp+arg_day]
mov     [ebp+var_random], 5BE741E3h ; Mystery number for XORs
push    [ebp+arg_month]
push    [ebp+arg_year]
call    normalize_date  ; Not entirely certain yet what this returns
push    3
xor     edx, edx
pop     ecx
mov     ebx, eax        ; Julian date or maybe (years since 1970)+(day-1)
div     ecx
lea     ecx, [ebp+var_something]
sub     ebx, edx        ; days mod 3
mov     [ebp+arg_day], ebx ; every three days
xor     esi, esi
first_loop:
mov     al, byte ptr [ebp+esi+arg_day]
xor     al, byte ptr [ebp+esi+var_random]
mov     dl, al
shr     dl, 4
mov     [ecx], dl
inc     ecx
and     al, 0Fh
mov     [ecx], al
inc     ecx
inc     esi
cmp     esi, 4
jl      short first_loop
mov     byte ptr [ebp+arg_day+3], bl ; days mod 3
mov     ebx, [ebp+arg_iteration] ; range 0:3
and     byte ptr [ebp+arg_day+3], 0Fh
xor     esi, esi
inc     ebx             ; range 1:4
second_loop:
movzx   eax, byte ptr [ebp+arg_day+3]
lea     ecx, [ebp+esi+var_something]
movzx   edx, byte ptr [ecx] ; 8 bytes of stuff from above
xor     eax, edx
imul    eax, ebx        ; dns name loop iteration
push    15
cdq                     ; extend sign of EAX info EDX
pop     edi
idiv    edi             ; mod 15
inc     esi
cmp     esi, 8
mov     [ecx], dl       ; remainder mod 15
jl      short second_loop
mov     eax, [ebp+arg_that_new_buffer]
xor     ecx, ecx
third_loop:
movzx   edx, byte ptr [ebp+ecx+var_something]
mov     dl, byte ptr ds:aQwertyuiopasdfghjklzxcvbnm[edx] ; "qwertyuiopasdfghjklzxcvbnm"
mov     [eax], dl
inc     eax
inc     ecx
cmp     ecx, 8
jl      short third_loop
mov     byte ptr [eax], 0
dec     eax
loc_11FCB:
mov     cl, [eax+1]
inc     eax
test    cl, cl
jnz     short loc_11FCB
mov     edi, eax
mov     esi, offset a_com ; ".com"
movsd
movsb
pop     edi
pop     esi
pop     ebx
leave
retn    14h
build_dns_name endp

Julia Wolf (edits by Alex Lanstein) @ FireEye Malware Intelligence Lab

Questions/Comments to research [@] fireeye [.] com

13 thoughts on “Technical details of Srizbi’s domain generation algorithm

  1. Oh my god … what an inventive system these guys used ! And from recent news, I understand it actually worked (even for a short while). Does this mean the bad guys can more or less pick one of the new names for the current date and register that name, to setup the new c&c server ?
    but, does it not mean they have to keep registering new names at a very high rate, I presume not all bots are online all the time and they dont want to miss out bots that are only online this week but not the next and come back after that ? Im not sure I understand the whole thing ….

  2. Allen,
    to put it in simpler terms, from what i understand the bot takes the current date, runs it through the function as describes and spits out a few domain names for that day, then the next day comes along and it has 4 more it can use. This cycle continues forever, and makes it possible to reconnect with any bot even if it takes awhile for the bot to find a registered domain by the hacker, without the hacker having to sit on domains and servers.
    I believe this blog talked about there being 4 domains hardcoded into the bot. However, I dont think that they knew that the domains would change every day, because the bot has never ran the function due to the CnC servers always being online. I think everyone just blew off the domains thing thinking it was just a dumb mistake of the hacker coding domains in then not registering them.
    Because the main CnC server has never been down before, no one really cared too much to study the rest of the bot that wasn’t being utilized. Only now that the servers are down did they find that the bot generates new domains everyday.
    when you said “Does this mean the bad guys can more or less pick one of the new names for the current date and register that name, to setup the new c&c server ?” you are exactly correct. even if security groups kept buying the domains everyday before the hacker could, eventually they would run out of money and the bot would be regained by the hacker. Once the connection to the CnC is reestablished the bot no longer needs to keep generating the domains.
    And to answer your second question about always having to register new domains. They know they probably wont get every bot like this, but the vast majority will be picked up. and if the hacker wanted to he could easily just buy a few more domains at random times and pickup the rest.
    this backup plan is pretty ingenious if you ask me, as long as the hacker is patient he/she/they can get the botnet back under control no matter what happens.

  3. Why couldnt one of the domains be registered and then uninstall instructions be sent to infected PC’s by a security group?

  4. Couldn’t the hosts just use the same algorithm and blacklist all those names generated from it … forever? I’m sure it wouldn’t affect the rest of the world if the domain name is gibberish to begin with.

  5. compuboy04: as I understand it, the bots need to see a private key from the C&C servers before accepting instructions, and from what I recall, it’s a 1024 byte key, which nobody has cracked yet.

  6. Rather than trying to break the PKI, as long as messages can be sent to the bot, there’s a chance that there are exploitable bugs in the bot’s code. Has analysis been done to identify possible buffer overflows in the bot, and if present, exploit them?
    Also, is it possible to track down where the domain registrations are coming from? It’s pretty easy to cover up a single registration, but I can only imagine it would be harder to cover up (potentially) many registrations for domains that are known in advance to all participants.

  7. An even simpler solution to prevent this botnet from being revived is for Verisign to block registration of these domain names!

  8. Sven - I don’t understand why the private key is important, somebody has the recipient (in this case the bot). Encryption is only important to secure the communication channel.

  9. This may be a naive idea, but would it not make sense to try and decode the communications between the individual bots and the Commanders then create your own C&C server and register it with as many names as you can that this algorithm will generate, for say 01 FEB 2009 - 07 FEB 2009. Then force another shutdown of the current C&C servers at the end of JAN.
    Then those of us who would rather see an end to this botnet would have effectively performed a technological coup d’etat and gathered control of it ourselves. Even if there is no shutdown or uninstall function built into the code base (a truly malicious coder wouldn’t have one) with a command hierarchy whose only order is “stand your ground” the spam would still stop.
    Again, I’ve never looked into this code or the communications between the clients and servers, so I don’t know how feasible this idea truly is. But you must admit, it at least sound good.
    -R

  10. The best way to fix the problem is to monitor the DNS queries and then have ISPs contact the infected customers to clean their computers through their walled garden infrastructure.
    Is there code available that will generate the DNS names?

Comments are closed.