Posted under misc by Sander with tag(s) python job

January 2016, while applying for a freelance position over at Pine Security I received the following coding assesment:

Hoi Sander,

Bij deze ontvang je het assessment voor je sollicitatie. Je hebt 2 uur de tijd om het assessment te doen. De door jou uitgewerkte oplossing dient om 14:00 naar ons terug te worden gestuurd. Het is de bedoeling dat je zo ver mogelijk komt met een goede oplossing/aanpak voor het probleem.

In de bijlagen vind je de beschrijving van het assessment en een (SQLite) database dump die je kunt gebruiken om je output (queries) te verifiëren.

Nog even 2 aandachtspunten:

  • De code voor het assessment moet in python worden geschreven
  • Het is niet nodig om een koppeling met de database te maken; er hoeven alleen queries te worden gegenereerd. Je kunt deze uiteraard ook met de hand testen.

Succes!

Assignment

Build a query builder library in Python. This library should generate SQL queries that can be used to query the attached data set. First implement basic CRUD queries that work on single tables, then extend the library with more advanced queries like selects and updates over multiple tables, search/filter functionality et cetera.

The library doesn’t have to run the queries themselves, but should return the composed queries.

For writing the library, make no use of other libraries other than a standard database access library (e.g. sqlite) if you wish to access the database from the library. The goal is to see how you tackle the problem yourself; hence minimal external code inclusion. Please write the library using Python.

You have 2 hours for this assignment. It is not to be expected that you are able to implement all the functionality in that time frame, but we are curious to see how far you get.

Click here to download the attached database

Explanation of the attached database:

  • The DB is a dump of a small SQLite database. The easiest way to work with the data is to import it into sqlite again (sqlite3 task.db < assessment.dbdump.txt)
  • The database resembles a small webshop: products, orders and contact details for both billing and shipping.
  • Many fields are to be used as foreign keys: these are most of the int fields.
  • The order table contains order information. The products that are part of an order, are in orderitem. The amount field denotes how many of this product is in the order, the itemprice is the price at which the item was purchased (per item).

Again, the code you will write should be a library that returns SQL queries, not a program that executes them. To give you an idea, it should be able to return the following queries, in order of importance:

SELECT `id`, `product`, `amount`, `itemprice`, `order` FROM "orderitem";
SELECT `id`, `product`, `amount`, `itemprice`, `order` FROM "orderitem" WHERE `id` = 3;
INSERT INTO "product" (`name`, `description`, `price`, `url`) VALUES (Doctor Who Adipose Stress Toy, Adorable squishy baby Adipose, 2499, 'http://www.thinkgeek.com/product/e5ed/');
UPDATE "product" SET `price` = 3499 WHERE `id` = 1;
DELETE FROM "product" WHERE `id` = 3;
SELECT * FROM "orderitem" WHERE `order` = 1 AND `amount` > 2;
SELECT "order", SUM(`itemprice` * `amount`) FROM `orderitem` GROUP BY `order`;
SELECT "order".* FROM `order` LEFT JOIN "orderstatus" ON "orderstatus".`id` = `order`.`status` WHERE "orderstatus".`name` = 'pending';
SELECT SUM("orderitem".`itemprice` * "orderitem".`amount`) AS `price` FROM "order" INNER JOIN "orderitem" ON "orderitem".`order` = "order".`id` INNER JOIN "orderstatus" ON "orderstatus".`id` = "order".`status` WHERE "order".`id` = 1 AND "orderstatus".`name` = 'accepted' GROUP BY "order".`id`;

You may decide on the library interface yourself. For instance, whether all columns should be passed to the library, whether tables should be pre-registered et cetera.

Good luck!

TL;DR

Build a SQL query builder given our schema in 2 hours. Implement CRUD, JOINs, SUM() and GROUP BY.

My entry

I was able to implement CRUD before the time was up. In retrospect I would have done some things differently, but nonetheless I felt confident about the following snippet:

#!/usr/bin/python2
import json,sys


def validate_fields(f):
    def wrapper(self, *args, **kwargs):
        for k in sum([list(args), kwargs.keys()], []):
            if not isinstance(k, dict) and not k in self.fields:
                raise Exception('Unknown field \"%s\" for table \"%s\"' % (k, self.tablename))

        return f(self, *args, **kwargs)
    return wrapper


class Shop(object):
    raw = False
    tablename = None
    fields = None

    def _serialize(self, v):
        # TO-DO: support datetime
        return json.dumps(v)

    def _output(self, sql):
        if self.raw:
            sys.stdout.write(sql)
        else:
            print sql

    def _where(self, d):
        sql = ''
        condition = ' WHERE'

        for k, v in d.iteritems():
            assignment = '='

            for cond_oper in '>=', '<=', '<>', '>', '<':
                if isinstance(v, str) and v.startswith(cond_oper):
                    assignment = cond_oper
                    v = v.replace(cond_oper, '', 1)

                    if cond_oper.startswith('<') or cond_oper.startswith('>'):
                        try:
                            v = int(v)
                        except:
                            raise Exception('Conditional operator \"%s\" not compatible with non-integer value' % cond_oper)

                    break

            sql += '%s \"%s\" %s %s' % (condition, k, assignment, self._serialize(v))
            condition = ' AND'

        return sql

    def select(self, *args, **kwargs):
        sql = """SELECT %s FROM \"%s\"""" % (','.join(args), self.tablename)

        if kwargs:
            sql += self._where(kwargs)

        self._output('%s;' % sql)

    def create(self, **kwargs):
        keys, values = zip(*kwargs.items())
        values = [self._serialize(z) for z in values]

        self._output("""INSERT INTO \"%s\"(%s) VALUES(%s);""" % (
            self.tablename,
            ','.join(keys),
            ','.join(values)
        ))

    def delete(self, *args, **kwargs):
        sql = """DELETE FROM \"%s\"""" % self.tablename

        if kwargs:
            sql += self._where(kwargs)

        self._output('%s;' % sql)

    def update(self, *args, **kwargs):
        set_key = args[0].keys()[0]
        set_val = args[0][set_key]

        sql = """UPDATE \"%s\" SET %s=%s""" % (
            self.tablename,
            set_key,
            self._serialize(set_val)
        )

        if kwargs:
            sql += self._where(kwargs)

        self._output(sql)


class address(Shop):
    tablename = 'address'
    fields = ['id', 'name', 'street', 'zipcode', 'housenumber', 'city', 'country', 'phone', 'email']

    def __init__(self):
        super(address, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(address, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(address, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(address, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(address, self).update(*args, **kwargs)


class order(Shop):
    tablename = 'order'
    fields = ['id', 'date', 'status', 'customer', 'recipient']

    def __init__(self):
        super(order, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(order, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(order, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(order, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(order, self).update(*args, **kwargs)


class orderitem(Shop):
    tablename = 'orderitem'
    fields = ['id', 'product', 'amount', 'itemprice', 'order']

    def __init__(self):
        super(orderitem, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(orderitem, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(orderitem, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(orderitem, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(orderitem, self).update(*args, **kwargs)


class orderstatus(Shop):
    tablename = 'orderstatus'
    fields = ['id', 'name']

    def __init__(self):
        super(orderstatus, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(orderstatus, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(orderstatus, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(orderstatus, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(orderstatus, self).update(*args, **kwargs)


class product(Shop):
    tablename = 'product'
    fields = ['id', 'name', 'description', 'price', 'url']

    def __init__(self):
        super(product, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(product, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(product, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(product, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(product, self).update(*args, **kwargs)


product().create(id=1, name='name', description='desc', price=5, url='http://test.nl')

product().read('id')
product().read('id', url='http://google.com', price=5)
product().read('id', price='>5')  # product().read('id', price=GT(5)) is netter, E_NOTIME

product().delete()
product().delete(id=">5")

product().update({'price': 3}) # product().filter(id=5).update(price=3) is netter, E_NOTIME
product().update({'price': 7}, id=2)
product().update({'price': 7}, name='foo', id=3)


# exceptions
# product().delete(idd=">5") # unknown field 'idd'
# product().read('id', price='>err') # illegal condition

Running the program yields the following output:

INSERT INTO "product"(url,price,id,description,name) VALUES("http://test.nl",5,1,"desc","name");
SELECT id FROM "product";
SELECT id FROM "product" WHERE "url" = "http://google.com" AND "price" = 5;
SELECT id FROM "product" WHERE "price" > 5;
DELETE FROM "product";
DELETE FROM "product" WHERE "id" > 5;
UPDATE "product" SET price=3
UPDATE "product" SET price=7 WHERE "id" = 2
UPDATE "product" SET price=7 WHERE "name" = "foo" AND "id" = 3

Apart from using double quotes instead of single quotes, the output is valid SQL.

Result

I had passed the assessment. They tested it with several queries, including SQL injection.

However, after receiving word that while I did pass the test, I did not get any follow up so weirdly enough I did not actually get the job. Hmmmz… too bad :-(