A Closer Look at the Twitter-Controlled Botnet (Part 1)
Update 2: mirrored malware links taken down. This is not something in my control.
Part 1 of this post will cover getting the malware, decoding it and scanning it. If I have time, Part 2 will be some disassembly & debugging (both static and dynamic).
The domains found hosting malware have been notified (Ubuntu, rifers.org). The malware has been taken down from these sites in order to prevent further propagation, but is offered below in a password protected archive for the reader to practice on.
I wasn’t aware of Jose Nazario’s post concerning this topic while I was conducting this research; I had only been exposed to the Wired Threat Level article prior to researching. So while I present some of the same information as Jose, this duplication of information only came to my attention afterwords.
If you’ve read Jose’s post, this post may still be worth the read for several reasons:
- Jose and I differed on some of the tools & techniques used.
- I attempt to offer a more detailed description of my methods/logic as a pseudo-tutorial.
- I mirror all the necessary info so the readers can do this themselves.
- There’s a quick discussion on some malware I found hosted at ubuntu.com (Jose probably saw it too but didn’t mention it) as well as a possible lead to a very sloppy botnet master.
Getting the Malware:
I was reading some feeds on Friday (Aug 14th) and came across Wired’s article on outsourcing botnet C&C (command & control) to Twitter. What caught my eye wasn’t so much the article itself but the screenshot accompanying the article. Many times when major outlets report on botnet/worms/viruses/etc, crucial details are left out either intentionally (to protect the innocent) or accidentally. This was not the case with this article.
I immediately recognized the tweets in the above screenshot as being base64 encoded. Furthermore, all of the posts started with the same 18 characters, indicating to me that these are not encrypted nor obfuscated beyond the simple base64 encoding. Perhaps the botnet herders are using Robin Wood’s KreiosC2 for nefarious purposes? This is evidence for a fairly unsophisticated botnet herder.
I transcribed the messages captured in the screenshot and decoded them in order from most recent to least recent. Some contained what appeared to be multiple links (redirections valid as of Aug 14th, 2009):
http://bit.ly/3RwAN (dead link)
There’s several interesting items here, in no particular order:
- It appears as though Debian is better at proactively moderating these type of posts than Ubuntu is (all the Debian links were dead when I tried them but the Ubuntu link worked fine). In Ubuntu’s defense, however, the offending links were killed within an hour of me notifying them.
- Payloads are being pushed out in rapid succession to both the C&C venues (Twitter, Jaiku, Tumblr, etc) and the payload hosting sites, indicating that this process has been automated. Automated payload deployment was determined by looking at some of the URLs linked in the Twitter screenshot:
It can be deduced from these URLs that malware was uploaded to rifers.org in a short enough time period to warrant consecutive numbers. Furthermore, it is clear that whoever controlled the Twitter C&C made these uploads as well, judging by the upd4t3 handle present across services.
- All the Twitter posts that included two redirect URLs appear to have a nonsense link as the second URL. If anyone has a theory as to the purpose of these secondary links, please leave a comment or shoot me an email @ my[remove_this].firstname.lastname@example.org
- The botnet herder’s name is Rafael? I took another look at the malware hosted at Ubuntu and removed the ‘plain/‘:
Decoding the Malware:
Get the base64 samples (password: infected) (LINK REDACTED).
Turning these base64 strings into something meaningful was more involved than simply decoding them. Still, the first step was to decode them. For that I wrote a little Python script. (I’m new to Python and figured this would be a simple exercise. It was.)
# decodes base64 files
# (C) 2009 Paul Makowski. GPLv2.
# usage: python /b64_decode.py (encoded_file) (output_file)
encodedFile = sys.argv
outputFile = sys.argv
encodedFileHndl = open(encodedFile,”r”)
outputFileHndl = open(outputFile, “w”)
After decoding the malware I now had 5 files and named them after their URLs: 9506, 9507, 9508, 9509 & 252515.
I ran an md5 on all of them (I used OS X… it would be md5sum in Linux):
$ md5 *.base64
MD5 (252515.base64) = a5f84f74cf9aa832355d5cd558cbfca6
MD5 (9506.base64) = 7743eac81be2b803093a6277323f17cb
MD5 (9507.base64) = a5f84f74cf9aa832355d5cd558cbfca6
MD5 (9508.base64) = a5051a6e5365bdc4dd8267e62d3e2902
MD5 (9509.base64) = 1a81e69e65b75f8b9e72e94c6f86a52b
As you can see, payloads 9507 from rifers.org and 252515 from ubuntu.com are actually the same payload. (Yes I know about md5 collisions…but there’s very little point to messing with the hashes in this scenario.)
So now we’ve narrowed down the available payloads to 4: 9506 through 9509. I named these 9506.bin through 9509.bin (since at this point I didn’t know their true filetype).
Making Sense of the Malware:
The first thing I tried after I de-base64′ed the payloads was to take a look at them with a hex editor. Being on OS X, I used Hex Fiend (if I were on Windows, I’d use WinHex; Linux I’d use hexedit):
- This is not a Windows executable; this is a .zip file. I determined this by the magic number at the beginning of the file (seen above). PK means zip; MZ (or ZM) means Windows PE. file verified these findings:
$ file 950*.bin
9506.bin: Zip archive data, at least v2.0 to extract
9507_252515.bin: Zip archive data, at least v2.0 to extract
9508.bin: Zip archive data, at least v2.0 to extract
9509.bin: Zip archive data, at least v2.0 to extract
- There’s a file called gbpm.dll inside the archive. At the bottom of the binary (not shown), is another string that reads gbpm.exe. This also turned out to be a file in the archive.
All of the other payloads appeared the same way under a hex editor. I renamed them all from *.bin to *.zip and unzipped them.
Now I had four folders, each containing a unique gdpm.dll and gdpm.exe. I renamed all the gdpm.exes to gdpm.livemalware so I wouldn’t accidentally execute them on my Windows box.
I checked the md5s to see if any were duplicates:
$ md5 950*/*.dll && md5 950*/*.livemalware
MD5 (9506/gbpm.dll) = 0dc041988367e4ca0faa1f119c748efb
MD5 (9507_252515/gbpm.dll) = 6cd9ee23dedf7c6a53668a7c4f830d78
MD5 (9508/gbpm.dll) = 1a1b3c05470ea788a86c4b9ed5c9b28f
MD5 (9509/gbpm.dll) = b15df1614d09ebb7b15d04ce914ee05f
MD5 (9506/gbpm.livemalware) = 4c537d461490ac998256c6deca11eeb4
MD5 (9507_252515/gbpm.livemalware) = 359ca7a025c3fe3cb7f60a3dd8ff4478
MD5 (9508/gbpm.livemalware) = b3a7f3145dc93e8721a4078f5e32fb44
MD5 (9509/gbpm.livemalware) = 08b05a33c6a989cc9c3f0a0918afa943
None were the same – I have 4 different pairs of malware samples :)
I uploaded the files to Virustotal to see if any were recognized. AV detection was poor to say the least (not that I’m surprised):
9506/gbpm.dll (4/41 antivirus detection) (new file)
9506/gbpm.exe (11/39 antivirus detection) (new file)
9507_252515/gbpm.dll (4/41 antivirus detection) (new file)
9507_252515/gbpm.exe (13/39 antivirus detection)
9508/gbpm.dll (5/41 antivirus detection)
9508/gbpm.exe (13/39 antivirus detection)
9509/gbpm.dll (6/41 antivirus detection)
9509/gbpm.exe (8/41 antivirus detection)
The files marked new file had not been seen by Virustotal previously. All .dlls had a fairly low detection rate. That combined with the fact that some had not been seen by Virustotal previously reminds me of PandaLabs recent press release on viruses only being useful for 24 hours.
So what kind of malware do we have anyways? Virustotal points toward Eldorado or Svelta for some files. Jose says in his post that these aren’t the botnet control agents, but are additional feature-adding payloads. Perhaps this means keyloggers, DDoS tools, etc?
In Part 2 of this post, I will delve into dissecting the malware and making sense of what it does. Hopefully manual analysis will yield more information than AV signatures…