This post will dive into the algorithm by which Srizbi decides which domain name to contact on a given day
For this analysis, I [Julia] reverse engineered the following sample:
MD5:8adb642389b8bf999e2017d731edcb00 bot.sys (Linked on Mon Dec 10 03:57:34 2007 — if the timestamps can be trusted.)
Which is not only unpacked, but the author left the debugging symbols in. The original name for this project is revealed by: c:\reactor3\client\Release\client.pdb
The bot (Srizbi
) is loaded into the NT Kernel as a device driver,
and then mucks around with the pointers for several kernel functions, like
ZwEnumerateKey, ZwOpenKey, ZwQueryValueKey, ZwCreateFile, ZwLoadDriver,
etc. These are all fairly common rootkit techniques, which probably
deserve a post of their own. This post is about the pseudo-random DNS
name generation.
When Srizbi can’t contact 208.72.169.22 or 208.72.169.136 or … — its
primary C&C servers (all hosted at McColo, but you know this story by
now), after ten attempts, it switches to a backup server — four backup
servers in fact, with randomly generated names. It then cycles through
these destinations forever, until something tells it what it wants to
hear.
These backup DNS names were not registered, which made everyone ask:
Who would code a backup DNS name into a bot, and then fail to register
I speculated that it may have been an unknown feature of the
that name?
code base that the bot author used, assuming that they didn’t write that
much of the code, or even read any of it.
Upon examining the binary in IDA, the mechanism became obvious.
There is a call to KeQuerySystemTime, and then RtlTimeToTimeFields to
get the current year, month, and day. This is done in a loop four times
with a call to the function at the bottom of this post. The date is swizzled around in various ways, such as if there are more than 12 months, the quotient is converted to years, and
leap-years are calculated, and other transforms like that. The returned day count is rounded down to the nearest third day. (i.e. Julian days
which are even multiples of three.)
An array of eight bytes (only the low nibble is used) is created from
every other byte from the XOR of the current day, and the magic number
in this specific sample, 0x5BE741E3.
The third byte of the day, is XOR’d over all eight bytes in this
array, and each byte is multiplied by the current iteration of the far
outside loop – the one driving the creation of four DNS names – mod fifteen.
This leaves you with eight bytes, with each byte in the range 0 to 14
inclusive. Each byte is used as an index into the string:
"qwertyuiopasdfghjklzxcvbnm" — of which only the first 15 letters can
ever be used. This is why all of the DNS names never use the letter h
,
or z
, or m
, etc.
As the function is linear, if you have the four DNS names, or
even just the first one, and the date it was generated, you can crank through this function
backwards to get the magic number, which varies between most samples. We identified 55 unique magic numbers throughout the hundreds of samples we analyzed.
q w e r t y u i o p a s d f g h j k l z x c v b n m 0 1 2 3 4 5 6 7 8 9 a b c d e - - - - - - - - - - -
So, given a name like yrytdyip.com
, the array of eight bytes that
gives you that string is, y=5, r=3, y=5, t=4, d=12, y=5, i=7, p=9, or
0x5354c579 when you put it all together.
The first name the bot spits out has been multiplied by one, so it’s the
exact same thing that’s in memory after all of the above. The next one is
multiplied by two, and then the modulus is taken on each byte to keep
their values below fifteen. The third one is multiplied by three, mod 15.
And the fourth multiplied by four, mod 15.
So, for example:
5*2 = 10%15 ≡ 10 5*3 = 15%15 ≡ 0 5*4 = 20%15 ≡ 5 12*2 = 24%15 ≡ 9 12*3 = 36%15 ≡ 6 12*4 = 48%15 ≡ 3
So the Multiplication table looks like this:
0 1 2 3 4 5 6 7 8 9 a b c d e 0 2 4 6 8 a c e 1 3 5 7 9 b d 0 3 6 9 c 0 3 6 9 c 0 3 6 9 c 0 4 8 c 1 5 9 d 2 6 a e 3 7 b
Here are some real-world examples:
yrytdyip.com 5354c579 2008-11-13 (times one) auaopagr.com a6a89ae3 2008-11-13 (times two) qpqduqud.com 090c606c 2008-11-13 (times three) ydywryfu.com 5c5135d6 2008-11-13 (times four) ererseqg.com 2323b20e 2008-11-17 (times one) tutuitqf.com 4646740d 2008-11-17 (etc.) upupruqd.com 6969360c 2008-11-17 ododgoqs.com 8c8ce80b 2008-11-17
These examples are from the same bot, but they are more than three days
apart, so the Julian date has rolled over:
Nov 13, 2008 ≡ 318 = (106*3) Nov 16, 2008 ≡ 321 = (107*3) Nov 19, 2008 ≡ 324 = (108*3)
XORing two of these DNS names together, reveals the byte from the current
date, which is repeatedly used: (The two days shown XOR’s here, obviously.)
0x2323b20e ^ 0x5354c579 = 0x70777777 (That’s [EBP+day+3])
So, on Nov 19, all of the bots in this Botnet
incremented to the next set of DNS names. They do this every three
days, and would have for the rest of forever, until someone (either related to the Botnet or not) told the bots what to do. The details for this are in a previous post.
This is the function that does most of this work. It’s only somewhat
crudely commented right now, but it should be evident what is happening.
; Attributes: bp-based frame build_dns_name proc near var_something= dword ptr -0Ch var_random= dword ptr -4 arg_year= dword ptr 8 arg_month= dword ptr 0Ch arg_day= dword ptr 10h arg_that_new_buffer= dword ptr 14h arg_iteration= dword ptr 18h ; register ecx: (null) push ebp mov ebp, esp sub esp, 0Ch push ebx ; still 0 push esi push edi push [ebp+arg_day] mov [ebp+var_random], 5BE741E3h ; Mystery number for XORs push [ebp+arg_month] push [ebp+arg_year] call normalize_date ; Not entirely certain yet what this returns push 3 xor edx, edx pop ecx mov ebx, eax ; Julian date or maybe (years since 1970)+(day-1) div ecx lea ecx, [ebp+var_something] sub ebx, edx ; days mod 3 mov [ebp+arg_day], ebx ; every three days xor esi, esi first_loop: mov al, byte ptr [ebp+esi+arg_day] xor al, byte ptr [ebp+esi+var_random] mov dl, al shr dl, 4 mov [ecx], dl inc ecx and al, 0Fh mov [ecx], al inc ecx inc esi cmp esi, 4 jl short first_loop mov byte ptr [ebp+arg_day+3], bl ; days mod 3 mov ebx, [ebp+arg_iteration] ; range 0:3 and byte ptr [ebp+arg_day+3], 0Fh xor esi, esi inc ebx ; range 1:4 second_loop: movzx eax, byte ptr [ebp+arg_day+3] lea ecx, [ebp+esi+var_something] movzx edx, byte ptr [ecx] ; 8 bytes of stuff from above xor eax, edx imul eax, ebx ; dns name loop iteration push 15 cdq ; extend sign of EAX info EDX pop edi idiv edi ; mod 15 inc esi cmp esi, 8 mov [ecx], dl ; remainder mod 15 jl short second_loop mov eax, [ebp+arg_that_new_buffer] xor ecx, ecx third_loop: movzx edx, byte ptr [ebp+ecx+var_something] mov dl, byte ptr ds:aQwertyuiopasdfghjklzxcvbnm[edx] ; "qwertyuiopasdfghjklzxcvbnm" mov [eax], dl inc eax inc ecx cmp ecx, 8 jl short third_loop mov byte ptr [eax], 0 dec eax loc_11FCB: mov cl, [eax+1] inc eax test cl, cl jnz short loc_11FCB mov edi, eax mov esi, offset a_com ; ".com" movsd movsb pop edi pop esi pop ebx leave retn 14h build_dns_name endp
Julia Wolf (edits by Alex Lanstein) @ FireEye Malware Intelligence Lab
Questions/Comments to research [@] fireeye [.] com






Oh my god … what an inventive system these guys used ! And from recent news, I understand it actually worked (even for a short while). Does this mean the bad guys can more or less pick one of the new names for the current date and register that name, to setup the new c&c server ?
but, does it not mean they have to keep registering new names at a very high rate, I presume not all bots are online all the time and they dont want to miss out bots that are only online this week but not the next and come back after that ? Im not sure I understand the whole thing ….
Allen,
to put it in simpler terms, from what i understand the bot takes the current date, runs it through the function as describes and spits out a few domain names for that day, then the next day comes along and it has 4 more it can use. This cycle continues forever, and makes it possible to reconnect with any bot even if it takes awhile for the bot to find a registered domain by the hacker, without the hacker having to sit on domains and servers.
I believe this blog talked about there being 4 domains hardcoded into the bot. However, I dont think that they knew that the domains would change every day, because the bot has never ran the function due to the CnC servers always being online. I think everyone just blew off the domains thing thinking it was just a dumb mistake of the hacker coding domains in then not registering them.
Because the main CnC server has never been down before, no one really cared too much to study the rest of the bot that wasn’t being utilized. Only now that the servers are down did they find that the bot generates new domains everyday.
when you said “Does this mean the bad guys can more or less pick one of the new names for the current date and register that name, to setup the new c&c server ?” you are exactly correct. even if security groups kept buying the domains everyday before the hacker could, eventually they would run out of money and the bot would be regained by the hacker. Once the connection to the CnC is reestablished the bot no longer needs to keep generating the domains.
And to answer your second question about always having to register new domains. They know they probably wont get every bot like this, but the vast majority will be picked up. and if the hacker wanted to he could easily just buy a few more domains at random times and pickup the rest.
this backup plan is pretty ingenious if you ask me, as long as the hacker is patient he/she/they can get the botnet back under control no matter what happens.
If they had been really clever, they would’ve just used some pki magic.
Why couldnt one of the domains be registered and then uninstall instructions be sent to infected PC’s by a security group?
Couldn’t the hosts just use the same algorithm and blacklist all those names generated from it … forever? I’m sure it wouldn’t affect the rest of the world if the domain name is gibberish to begin with.
compuboy04: as I understand it, the bots need to see a private key from the C&C servers before accepting instructions, and from what I recall, it’s a 1024 byte key, which nobody has cracked yet.
Rather than trying to break the PKI, as long as messages can be sent to the bot, there’s a chance that there are exploitable bugs in the bot’s code. Has analysis been done to identify possible buffer overflows in the bot, and if present, exploit them?
Also, is it possible to track down where the domain registrations are coming from? It’s pretty easy to cover up a single registration, but I can only imagine it would be harder to cover up (potentially) many registrations for domains that are known in advance to all participants.
An even simpler solution to prevent this botnet from being revived is for Verisign to block registration of these domain names!
Sven - I don’t understand why the private key is important, somebody has the recipient (in this case the bot). Encryption is only important to secure the communication channel.
Wow Julia!
This is awesome research.
I analyzed two variants of the Srizbi malware in a VM environment, my published report can be found at
http://dash.za.net/Sources/Docs/Srizbi.pdf
Keep the good work up!
All the best!
Sincerely,
Dash Shendy
This may be a naive idea, but would it not make sense to try and decode the communications between the individual bots and the Commanders then create your own C&C server and register it with as many names as you can that this algorithm will generate, for say 01 FEB 2009 - 07 FEB 2009. Then force another shutdown of the current C&C servers at the end of JAN.
Then those of us who would rather see an end to this botnet would have effectively performed a technological coup d’etat and gathered control of it ourselves. Even if there is no shutdown or uninstall function built into the code base (a truly malicious coder wouldn’t have one) with a command hierarchy whose only order is “stand your ground” the spam would still stop.
Again, I’ve never looked into this code or the communications between the clients and servers, so I don’t know how feasible this idea truly is. But you must admit, it at least sound good.
-R
The best way to fix the problem is to monitor the DNS queries and then have ISPs contact the infected customers to clean their computers through their walled garden infrastructure.
Is there code available that will generate the DNS names?
thanks Aha! so that’s how they do it, those little rats