UNB/ CS/ David Bremner/ teaching/ cs2613/ assignments/ CS2613 Assignment 5

Overview

This assignment is based on the material covered in Lab 14, Lab 15, and Lab 16.

The goal of the assignment is to develop a simple query language that lets the user select rows and columns from a CSV File, in effect treating it like database.

Make sure you commit and push all your work using coursegit before 16:30 on Friday November 17.

General Instructions

Reading CSV Files

We will use the builtin Python CSV module to read CSV files.

def read_csv(filename):
    '''Read a CSV file, return list of rows'''

    import csv
    with open(filename,'rt',newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [ row for row in reader ]

Save the following as "~/cs2613/assignments/A5/test1.csv"; we will use it several tests. You should also construct your own example CSV files and corresponding tests.

name,   age,    eye colour
Bob,    5,      blue
Mary,   27,     brown
Vij,    54,     green

Here is a test to give you the idea of the returned data structure from read_csv.

def test_read_csv():
    assert read_csv('test1.csv') == [['name', 'age', 'eye colour'],
                                     ['Bob', '5', 'blue'],
                                     ['Mary', '27', 'brown'],
                                     ['Vij', '54', 'green']]

Query

Write a function query that wraps it's three arguments in a dict and passes the following test.

def test_query():
    assert query('==', 'age', 1) == { 'op': '==',
                                      'left': 'age',
                                      'right': 1 }

Parsing Headers

The first row most in most CSV files consists of column labels. We will use this to help the user access columns by name rather than by counting columns.

Write a function header_map that builds a dictionary from labels to column numbers.

table = read_csv('test1.csv')

def test_header_map_1():
    hmap = header_map(table[0])
    assert hmap == { 'name': 0, 'age': 1, 'eye colour': 2 }

Matching rows

We are going to write a simple query languge where each query is produced by a call to the function query above. with the op argument as one of ==, <=, and >=. In the initial version, left and right are numbers or strings. Strings are interpreted as follows: if they are column labels, retrieve the value in that column; otherwise treat it as a literal string. With this in mind, write a function check_row that takes a row in dictionary form, and checks if it matches a query dict.

def test_check_row():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, query('==', 'age', 5))
    assert not check_row(row, query('==', 'eye colour', 5))
    assert check_row(row, query('==', 'eye colour', 'blue'))
    assert check_row(row, query('>=', 'age', 4))
    assert check_row(row, query('<=','age', 1000))

Extending the query language

Extend check_row so that it supports operations AND and OR. For these cases both left and right operands must be queries. Hint: this should only be a few more lines of code.

def test_check_row_logical():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row,
                     query('OR',
                           query('==', 'age', 5),
                           query('==','eye colour', 5)))
    assert not check_row(row,
                     query('AND',
                           query('==', 'age', 5),
                           query('==','eye colour', 5)))

Filtering tables

Use your previously developed functions to implement a function filter_table that selects certain rows of the table according to a query. You might want a helper function to convert rows to dict form compatible with check_row.

def test_filter_table1():
    assert filter_table(table,query('>=', 'age', 0)) == [['name', 'age', 'eye colour'],
                                                         ['Bob', '5', 'blue'],
                                                         ['Mary', '27', 'brown'],
                                                         ['Vij', '54', 'green']]

    assert filter_table(table,
                        query('<=','age', 27)) == [['name', 'age', 'eye colour'],
                                                   ['Bob', '5', 'blue'],
                                                   ['Mary', '27', 'brown']]

    assert filter_table(table,
                        query('==', 'eye colour', 'brown')) == [['name', 'age', 'eye colour'],
                                                                ['Mary', '27', 'brown']]

    assert filter_table(table,
                        query('==','name', 'Vij')) == [['name', 'age', 'eye colour'],
                                                       ['Vij', '54', 'green']]


def test_filter_table2():
    assert filter_table(table,
                        query('AND',
                              query('>=', 'age', 0),
                              query('>=', 'age','27'))) == [['name', 'age', 'eye colour'],
                                                           ['Mary', '27', 'brown'],
                                                           ['Vij', '54', 'green']]


    assert filter_table(table,
                        query('AND',
                              query('<=', 'age', 27),
                              query('>=', 'age','27'))) == [['name', 'age', 'eye colour'],
                                                            ['Mary', '27', 'brown']]

    assert filter_table(table,
                        query('OR',
                              query('==', 'eye colour', 'brown'),
                              query('==', 'name', 'Vij'))) == [['name', 'age', 'eye colour'],
                                                               ['Mary', '27', 'brown'],
                                                               ['Vij', '54', 'green']]