| misc

In January of 2016 I received a coding assessment while applying for a freelance position with a security company.

Nowadays I am reluctant to do these type of assessments, as I feel that there is enough code and projects online that showcase my abilities. Back then I gave it a shot anyway. I received the following email:

Build a SQL query builder in 2 hours

Implement: CRUD, JOINs, SUM() and GROUP BY.

Build a query builder library in Python. This library should generate SQL queries that can be used to query the attached data set. First implement basic CRUD queries that work on single tables, then extend the library with more advanced queries like selects and updates over multiple tables, search/filter functionality et cetera.

The library doesn’t have to run the queries themselves, but should return the composed queries.

For writing the library, make no use of other libraries other than a standard database access library (e.g. sqlite) if you wish to access the database from the library. The goal is to see how you tackle the problem yourself; hence minimal external code inclusion. Please write the library using Python.

You have 2 hours for this assignment. It is not to be expected that you are able to implement all the functionality in that time frame, but we are curious to see how far you get.

  • The DB is a dump of a small SQLite database. The easiest way to work with the data is to import it into sqlite again (sqlite3 task.db < assessment.dbdump.txt)
  • The database resembles a small webshop: products, orders and contact details for both billing and shipping.
  • Many fields are to be used as foreign keys: these are most of the int fields.
  • The order table contains order information. The products that are part of an order, are in orderitem. The amount field denotes how many of this product is in the order, the itemprice is the price at which the item was purchased (per item).

Again, the code you will write should be a library that returns SQL queries, not a program that executes them. To give you an idea, it should be able to return the following queries, in order of importance:

SELECT `id`, `product`, `amount`, `itemprice`, `order` FROM "orderitem";
SELECT `id`, `product`, `amount`, `itemprice`, `order` FROM "orderitem" WHERE `id` = 3;
INSERT INTO "product" (`name`, `description`, `price`, `url`) VALUES (Doctor Who Adipose Stress Toy, Adorable squishy baby Adipose, 2499, 'http://www.thinkgeek.com/product/e5ed/');
UPDATE "product" SET `price` = 3499 WHERE `id` = 1;
DELETE FROM "product" WHERE `id` = 3;
SELECT * FROM "orderitem" WHERE `order` = 1 AND `amount` > 2;
SELECT "order", SUM(`itemprice` * `amount`) FROM `orderitem` GROUP BY `order`;
SELECT "order".* FROM `order` LEFT JOIN "orderstatus" ON "orderstatus".`id` = `order`.`status` WHERE "orderstatus".`name` = 'pending';
SELECT SUM("orderitem".`itemprice` * "orderitem".`amount`) AS `price` FROM "order" INNER JOIN "orderitem" ON "orderitem".`order` = "order".`id` INNER JOIN "orderstatus" ON "orderstatus".`id` = "order".`status` WHERE "order".`id` = 1 AND "orderstatus".`name` = 'accepted' GROUP BY "order".`id`;

You may decide on the library interface yourself. For instance, whether all columns should be passed to the library, whether tables should be pre-registered et cetera. Good luck!

My entry

I was able to implement CRUD before the time was up. In retrospect I would have done some things differently, but nonetheless I felt confident about the following snippet:

#!/usr/bin/python2
import json,sys


def validate_fields(f):
    def wrapper(self, *args, **kwargs):
        for k in sum([list(args), kwargs.keys()], []):
            if not isinstance(k, dict) and not k in self.fields:
                raise Exception('Unknown field \"%s\" for table \"%s\"' % (k, self.tablename))

        return f(self, *args, **kwargs)
    return wrapper


class Shop(object):
    raw = False
    tablename = None
    fields = None

    def _serialize(self, v):
        # TO-DO: support datetime
        return json.dumps(v)

    def _output(self, sql):
        if self.raw:
            sys.stdout.write(sql)
        else:
            print sql

    def _where(self, d):
        sql = ''
        condition = ' WHERE'

        for k, v in d.iteritems():
            assignment = '='

            for cond_oper in '>=', '<=', '<>', '>', '<':
                if isinstance(v, str) and v.startswith(cond_oper):
                    assignment = cond_oper
                    v = v.replace(cond_oper, '', 1)

                    if cond_oper.startswith('<') or cond_oper.startswith('>'):
                        try:
                            v = int(v)
                        except:
                            raise Exception('Conditional operator \"%s\" not compatible with non-integer value' % cond_oper)

                    break

            sql += '%s \"%s\" %s %s' % (condition, k, assignment, self._serialize(v))
            condition = ' AND'

        return sql

    def select(self, *args, **kwargs):
        sql = """SELECT %s FROM \"%s\"""" % (','.join(args), self.tablename)

        if kwargs:
            sql += self._where(kwargs)

        self._output('%s;' % sql)

    def create(self, **kwargs):
        keys, values = zip(*kwargs.items())
        values = [self._serialize(z) for z in values]

        self._output("""INSERT INTO \"%s\"(%s) VALUES(%s);""" % (
            self.tablename,
            ','.join(keys),
            ','.join(values)
        ))

    def delete(self, *args, **kwargs):
        sql = """DELETE FROM \"%s\"""" % self.tablename

        if kwargs:
            sql += self._where(kwargs)

        self._output('%s;' % sql)

    def update(self, *args, **kwargs):
        set_key = args[0].keys()[0]
        set_val = args[0][set_key]

        sql = """UPDATE \"%s\" SET %s=%s""" % (
            self.tablename,
            set_key,
            self._serialize(set_val)
        )

        if kwargs:
            sql += self._where(kwargs)

        self._output(sql)


class address(Shop):
    tablename = 'address'
    fields = ['id', 'name', 'street', 'zipcode', 'housenumber', 'city', 'country', 'phone', 'email']

    def __init__(self):
        super(address, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(address, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(address, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(address, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(address, self).update(*args, **kwargs)


class order(Shop):
    tablename = 'order'
    fields = ['id', 'date', 'status', 'customer', 'recipient']

    def __init__(self):
        super(order, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(order, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(order, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(order, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(order, self).update(*args, **kwargs)


class orderitem(Shop):
    tablename = 'orderitem'
    fields = ['id', 'product', 'amount', 'itemprice', 'order']

    def __init__(self):
        super(orderitem, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(orderitem, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(orderitem, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(orderitem, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(orderitem, self).update(*args, **kwargs)


class orderstatus(Shop):
    tablename = 'orderstatus'
    fields = ['id', 'name']

    def __init__(self):
        super(orderstatus, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(orderstatus, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(orderstatus, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(orderstatus, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(orderstatus, self).update(*args, **kwargs)


class product(Shop):
    tablename = 'product'
    fields = ['id', 'name', 'description', 'price', 'url']

    def __init__(self):
        super(product, self).__init__()

    @validate_fields
    def read(self, *args, **kwargs):
        super(product, self).select(*args, **kwargs)

    @validate_fields
    def create(self, **kwargs):
        super(product, self).create(**kwargs)

    @validate_fields
    def delete(self, *args, **kwargs):
        super(product, self).delete(*args, **kwargs)

    @validate_fields
    def update(self, *args, **kwargs):
        super(product, self).update(*args, **kwargs)


product().create(id=1, name='name', description='desc', price=5, url='http://test.nl')

product().read('id')
product().read('id', url='http://google.com', price=5)
product().read('id', price='>5')  # product().read('id', price=GT(5)) is netter, E_NOTIME

product().delete()
product().delete(id=">5")

product().update({'price': 3}) # product().filter(id=5).update(price=3) is netter, E_NOTIME
product().update({'price': 7}, id=2)
product().update({'price': 7}, name='foo', id=3)


# exceptions
# product().delete(idd=">5") # unknown field 'idd'
# product().read('id', price='>err') # illegal condition

Running the program yields the following output:

INSERT INTO "product"(url,price,id,description,name) VALUES("http://test.nl",5,1,"desc","name");
SELECT id FROM "product";
SELECT id FROM "product" WHERE "url" = "http://google.com" AND "price" = 5;
SELECT id FROM "product" WHERE "price" > 5;
DELETE FROM "product";
DELETE FROM "product" WHERE "id" > 5;
UPDATE "product" SET price=3
UPDATE "product" SET price=7 WHERE "id" = 2
UPDATE "product" SET price=7 WHERE "name" = "foo" AND "id" = 3

Apart from using double quotes instead of single quotes, the output is valid SQL.

Result

I had passed the assessment. They tested it with several queries, including SQL injection.

However as I have previously mentioned - I am against these type of assessments, as there is too much time (and stress) involved without pay. I’d rather not work for free.