Using Microscanner in a CI/CD Pipeline

On 16th May 2018 I attended the London Gophers meetup at Improbable. Amongst the speakers was Liz Rice, a Software Engineer and Technology Evangelist who works for Aqua Security. Liz gave an amazing talk about writing a debugger from scratch, which I thoroughly enjoyed. At the end of her talk, she gave a little plug for a new free project coming out of her company called Microscanner, and that piqued my interest, prompting me to write this blog post.

What is Microscanner?

Microscanner is a ‘tool that scans container images for package vulnerabilities’. All you need to do is register an email address in return for a token, then use that token alongside the supplied binary inside your Dockerfile to run the command that checks for vulnerabilities. If this command fails, the docker build fails too, meaning that you never release docker images that have severe vulnerabilities.

CI/CD

What really interested me about this tool is that it can be added to a CI/CD pipeline. I’m currently writing our CI/CD pipeline at Antidote and this seemed like a great step to add.

The recommended way of running Microscanner is to add the binary to the Dockerfile and execute it.

ADD https://get.aquasec.com/microscanner /
RUN chmod +x /microscanner
ARG MICROSCANNER_TOKEN
RUN /microscanner ${MICROSCANNER_TOKEN} [--continue-on-failure]

What I wanted to find out was whether I could avoid adding this snippet to every Dockerfile, and instead have a separate Dockerfile do the hard work while the ‘main’ Dockerfile stays unchanged.

Of course, this is possible by making the new Dockerfile inherit from the built docker image.

FROM <your docker image>
ADD https://get.aquasec.com/microscanner /
RUN chmod +x /microscanner
ARG MICROSCANNER_TOKEN
RUN /microscanner ${MICROSCANNER_TOKEN}

Given that the docker image <your docker image> would have just been built in your CI pipeline, the image should already exist locally. This means that only four new layers need to be built on top of it, so this step should be fast.

But how do we insert <your docker image> into the Dockerfile? Keeping a separate Dockerfile on disk for every image is not very practical, so let’s have a look at how to generate it in memory with the Python Docker SDK.

from io import BytesIO
import json

import docker

from .settings import MICROSCANNER_TOKEN  # your microscanner token

# this template has changed slightly to allow ease of parsing the results
DOCKERFILE_TEMPLATE = """
FROM {}
ADD https://get.aquasec.com/microscanner /
RUN chmod +x /microscanner
ARG MICROSCANNER_TOKEN
RUN /microscanner ${{MICROSCANNER_TOKEN}} --continue-on-failure > /scan.json
RUN cat /scan.json
"""


def scan_image(image):
    """Scan docker image `image` for vulnerabilities"""
    # get docker client
    client = docker.from_env()

    # variables for output capture
    output = ''

    # run docker build command
    for line in client.images.build(
            # build from this Dockerfile
            fileobj=BytesIO(DOCKERFILE_TEMPLATE.format(image).encode('utf-8')),
            # tag the image with this
            tag='{}-scan'.format(image),
            # don't use the cache as we want microscanner to run every time
            nocache=True,
            pull=False,  # don't pull the FROM image
            buildargs={  # build args to pass
                'MICROSCANNER_TOKEN': MICROSCANNER_TOKEN,
            },
            stream=True,  # stream command output
            decode=True,  # decode command output
        ):
        if 'stream' in line:
            stream = line['stream'].strip()
            output += stream  # capture output
            print(stream)  # print output to screen

    # parse output to get vulnerabilities (this definitely needs improving...)
    last_command_output = output.split('RUN cat /scan.json')[1]
    inner_json = last_command_output.partition('{')[2].rpartition('}')[0]
    vuln_str = '{' + inner_json + '}'
    
    # get dict from output
    vulns = json.loads(vuln_str)

    # process vulns...

With this snippet, you can scan your docker image for vulnerabilities without having to rewrite the Dockerfile for each of your services. This step can run in your pipeline after your image has been built, and probably after you have tested it.
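The parsing at the end of scan_image is the part flagged as needing improvement, since it relies on the first ‘{’ and the last ‘}’ in the remaining build output. One possible alternative, sketched below, is to let the standard library's json.JSONDecoder.raw_decode find where the JSON object ends. The helper name extract_json_object and its marker parameter are my own inventions for illustration, and the sketch assumes the build output has been concatenated the same way as in the snippet above.

```python
import json


def extract_json_object(text, marker='RUN cat /scan.json'):
    """Return the first JSON object found in `text` after `marker`.

    `extract_json_object` and `marker` are hypothetical names used for
    this sketch; they are not part of Microscanner or the Docker SDK.
    """
    # only look at the output that follows the `cat` build step
    _, found, tail = text.partition(marker)
    if not found:
        raise ValueError('marker {!r} not found in build output'.format(marker))

    decoder = json.JSONDecoder()
    start = tail.find('{')
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text comes after it
            obj, _end = decoder.raw_decode(tail, start)
            return obj
        except json.JSONDecodeError:
            # this brace did not begin valid JSON, so try the next one
            start = tail.find('{', start + 1)
    raise ValueError('no JSON object found after marker')
```

Inside scan_image, the three parsing lines could then be replaced with vulns = extract_json_object(output). Trailing build output that happens to contain a ‘}’ no longer affects the result, because parsing stops at the end of the first complete object.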

Given that you also have the results of the vulnerability scan accessible to you in code, you can integrate even more with your CI pipeline. There are many different ways that you can take this:

stopping the pipeline from continuing if there are severe vulnerabilities
alerting about new vulnerabilities
creating vulnerability reports
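As a rough illustration of the first idea, here is a minimal sketch of a gate that stops the pipeline when the scan reports vulnerabilities at or above a chosen severity. It assumes the parsed results contain a ‘vulnerability_summary’ mapping of severity names to counts; that key, the severity names and the function names are assumptions on my part, so check them against the JSON your own scan produces.

```python
import sys

# severities ordered from least to most serious (assumed names)
SEVERITIES = ['negligible', 'low', 'medium', 'high']


def blocking_count(vulns, threshold='high'):
    """Count vulnerabilities at or above `threshold` severity.

    Assumes `vulns['vulnerability_summary']` maps severity names to
    counts; this layout is an assumption, not a documented contract.
    """
    summary = vulns.get('vulnerability_summary', {})
    # every severity from the threshold upwards blocks the pipeline
    blocking = SEVERITIES[SEVERITIES.index(threshold):]
    return sum(int(summary.get(level, 0)) for level in blocking)


def gate(vulns, threshold='high'):
    """Exit with a non-zero status if blocking vulnerabilities exist."""
    count = blocking_count(vulns, threshold)
    if count:
        print('{} vulnerabilities at or above {!r}'.format(count, threshold))
        # a non-zero exit code makes most CI systems mark the step as failed
        sys.exit(1)
```

Calling gate(vulns) after the scan would fail the step on any high-severity finding, while gate(vulns, threshold='medium') would be stricter.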

I love finding little tools like this that can easily integrate into a CI/CD pipeline. Thanks to Liz Rice for mentioning this, and Aqua Security for building Microscanner.