This feed contains pages with tag "python".

# Before the lab

• Try to do as many of these questions as you can.
• Bring your questions.

# Racket questions

## Hash tables, recursion

Write a Racket function `list->hash` that, given a list, returns a hash table which maps the number `i` to the list element in the `i`th position (starting from position `1`). For full marks, your function should

• pass the following tests,
• be tail recursive, and
• not use mutation (no functions ending in '!').
```(module+ test
(require rackunit)
(define hash-table (list->hash (list "a" "b" "c") (hash) 1))
(check-equal? (hash-ref hash-table 1) "a")
(check-equal? (hash-ref hash-table 2) "b")
(check-equal? (hash-ref hash-table 3) "c"))
```

## Tail recursion + lists I

Complete the Racket function `sum-pos-helper` to tail-recursively sum the positive elements of a list of integers.

```#lang racket
(define (sum-pos . nums)
(define (sum-pos-helper nums acc)
(cond
[(empty? nums)   ]
[(positive? (first nums)) ]
[else  ]))
(sum-pos-helper nums 0))

(module+ test
(require rackunit)
(check-equal? (sum-pos) 0)
(check-equal? (sum-pos 1) 1)
(check-equal? (sum-pos 1 0 -1) 1)
(check-equal? (apply sum-pos (range -9 10)) 45))
```

## Tail recursion + lists II

Write a tail-recursive function `sixes-and-sevens` that keeps all multiples of 6 or 7. It should in particular pass the following test.

``````(module+ test
(require rackunit)
(check-equal? (sixes-and-sevens '(1 6 7 12)) '(6 7 12)))``````

You may use `reverse` in your solution, but for full marks do not use `append`.

### Testing

Copy your solution to the previous question and add at least 3 tests for your `sixes-and-sevens` function. For full marks you should test 3 logically different things, and document what you are testing.

## Higher order functions

Write the Racket function `binmap` to apply the passed binary operator (function) to the two given lists. Your function should use no other list functions other than `first`, `rest`, `cons`, and `empty?` (in particular don’t use the builtin function `map`). Your function should pass the following tests.

```(module+ test
(require rackunit)

(check-equal? (binmap + '(1 2 3) '(4 5 6)) '(5 7 9))
(check-equal? (binmap * '(1 2 3) '(4 5 6)) '(4 10 18))

(check-equal? (binmap string-append '("hello" "world ")
'(" mom" "travel"))
'("hello mom" "world travel"))

(check-equal? (binmap + '(1 2 3) '(4 5 6 7)) '(5 7 9))
(check-equal? (binmap + '(1 2 3 4) '(4 5 6)) '(5 7 9)))
```

## More Tail recursion

Complete the Racket function `helper` to re-implement the function `binmap` from the previous question tail-recursively. Provide appropriate tests for `binmap2`. Your function `helper` should use no list functions other than `first`, `rest`, `cons`, and `empty?`.

```(define (binmap2 op list1 list2)
(define (helper lst1 lst2 acc) #|function body goes here|#)
(reverse (helper list1 list2 '())))
```

# JavaScript Questions

## Recursion, classes

In this question, the `Expr` JavaScript class represents simple expressions using multiplication and addition, e.g. `(6*9)+(3*4)`. Write the `eval` method for `Expr`. Your method should compute the value of the expression (a number), and should pass the following `jasmine` tests

```let Expr=require("../expr.js").Expr;

describe("expr",
function() {
let six_plus_nine = new Expr('+', 6, 9);
let six_times_nine = new Expr('*', 6, 9);
it("addition",
function() {
expect(six_plus_nine.eval()).toBe(15);
});
it("multiplication",
function() {
expect(six_times_nine.eval()).toBe(54);
});
it("compound",
function() {
expect(new Expr('+', six_times_nine,
six_plus_nine).eval()).toBe(69);
});});
```

## Strings, Higher order functions

Complete the following JavaScript function to concatenate strings. For full marks, use the `reduce` method.

```function join(lst) {

}
exports.join = join;
```

Your function should pass the following tests.

```join=require("./join.js").join;
describe("join", function () {
it("empty", function () { expect(join([])).toEqual(""); });
it("single", function () {
expect(join(["holidays"])).toEqual("holidays");
});
it("several", function () {
expect(join(["happy", " ", "holidays"])).toEqual("happy holidays");
});
});
```

## More strings and higher order functions

Use a higher-order array method (i.e. no loops or recursion) to implement a JavaScript function `revapp` that appends its string arguments in reverse order. Your code should pass the following test.

```revapp=require("../revapp.js").revapp;
describe("revapp", function () {
it("letters", function () {
expect(revapp("a","b","c","d")).toEqual("dcba");
})});
```

## JSON, classes

Write a JavaScript class called `Student` that supports writing and reading objects from disk. Your class should pass the following Jasmine test. The first argument to the constructor is the name, and the second the id number of the student.

```let Student = require("../student.js").Student;

describe("json", function () {
let bob = new Student("bob",42);

it("roundtrip", function() {
let filename = "test-roundtrip.json";
bob.write(filename);
expect(Student.read(filename)).toEqual(bob);
})});
```

You may use the following module (extended from Lab 11)

```let fs = require("fs");

function read_json_file(filename) {
return JSON.parse(fs.readFileSync(filename));
}

function write_json_file(obj,filename) {
fs.writeFileSync(filename,JSON.stringify(obj));
}

exports.read_json_file=read_json_file;
exports.write_json_file=write_json_file;
```

## More classes

Write a class `Person` that passes the following `jasmine` tests. You need to write a constructor and a method `birthday`.

```let person=require("../person.js");
let Person=person.Person;
let People=person.People;

describe("person",
function () {
let bob=new Person("bob", 42);

it("constructor",
function () {
expect(bob.name).toEqual("bob");
expect(bob.age).toEqual(42);
});
it("birthday does not mutate",
function (){
let newbob = bob.birthday();
expect(bob.age).toEqual(42);
expect(newbob.age).toEqual(43);
});

});
```

Write a second class `People` that passes the following `jasmine` tests. You need to write a constructor and a `length` attribute.

```describe("people",
function() {
let people=new People("ancestry.json");
it("read all records",
function () {
expect(people.length).toEqual(10);
});
it("hash table",
function() {
console.log(people);
expect(people["Clara Aernoudts"]).toEqual(new Person("Clara Aernoudts",95));
});
});
```

You will need the file `ancestry.json` with the following content

```[
{"name": "Carolus Haverbeke", "sex": "m", "born": 1832, "died": 1905, "father": "Carel Haverbeke", "mother": "Maria van Brussel"},
{"name": "Emma de Milliano", "sex": "f", "born": 1876, "died": 1956, "father": "Petrus de Milliano", "mother": "Sophia van Damme"},
{"name": "Maria de Rycke", "sex": "f", "born": 1683, "died": 1724, "father": "Frederik de Rycke", "mother": "Laurentia van Vlaenderen"},
{"name": "Jan van Brussel", "sex": "m", "born": 1714, "died": 1748, "father": "Jacobus van Brussel", "mother": "Joanna van Rooten"},
{"name": "Philibert Haverbeke", "sex": "m", "born": 1907, "died": 1997, "father": "Emile Haverbeke", "mother": "Emma de Milliano"},
{"name": "Jan Frans van Brussel", "sex": "m", "born": 1761, "died": 1833, "father": "Jacobus Bernardus van Brussel", "mother":null},
{"name": "Pauwels van Haverbeke", "sex": "m", "born": 1535, "died": 1582, "father": "N. van Haverbeke", "mother":null},
{"name": "Clara Aernoudts", "sex": "f", "born": 1918, "died": 2012, "father": "Henry Aernoudts", "mother": "Sidonie Coene"},
{"name": "Emile Haverbeke", "sex": "m", "born": 1877, "died": 1968, "father": "Carolus Haverbeke", "mother": "Maria Sturm"},
{"name": "Lieven de Causmaecker", "sex": "m", "born": 1696, "died": 1724, "father": "Carel de Causmaecker", "mother": "Joanna Claes"}
]
```

# Python questions

## Generator

Write a Python generator `powergen(n)` that yields the sequence 1, n, n², n³, n⁴, … on successive calls to `next`.

```def powergen(n):

def test_powergen():
    gen = powergen(2)
    first = next(gen)
    second = next(gen)
    third = next(gen)
    assert (first == 1)
    assert (second == 2)
    assert (third == 4)
```
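As a reminder of how generators behave, here is a small sketch using a different sequence. The names `evens`, `first_two` and `next_three` are made up for this illustration and are not part of the question.

```python
from itertools import islice


def evens():
    """Yield 0, 2, 4, ... forever (illustrative only)."""
    n = 0
    while True:
        yield n
        n += 2


gen = evens()
# Each call to next() runs the body only as far as the next yield.
first_two = [next(gen), next(gen)]
# islice takes a bounded number of items from an infinite generator.
next_three = list(islice(gen, 3))
print(first_two, next_three)
```

Because the generator never terminates on its own, anything that consumes it (a loop, `list`, a comprehension) needs some bound on how many values it takes.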

### List comprehension

Use a list comprehension to complete the following test for `powergen`.

```def test_powergen_list():
    gen = powergen(3)
    threes =
    assert(threes == [1, 3, 9, 27, 81, 243, 729, 2187, 6561])
```

## FizzBuzz Iterator

Complete the `__next__` method to return `'FizzBuzz'`, `'Fizz'`, or `'Buzz'` for iterations divisible by both 3 and 5, by 3 only, and by 5 only, respectively. Otherwise `__next__` should return the iteration number.

```
class FizzBuzz:
    def __init__(self, max=100):
        self.n=1; self.max=max

    def __next__(self):

def test_fizzbuzz_next():
    fb=FizzBuzz(15)
    assert (list(fb) == [1,2,'Fizz',4,'Buzz','Fizz',7,8,'Fizz',
                         'Buzz', 11, 'Fizz', 13, 14,'FizzBuzz'])
```

### FizzBuzz Iterator part II

Complete the `__iter__` method for the `FizzBuzz` class.

```def __iter__(self):

def test_fizzbuzz_iter():
    fb=FizzBuzz(100)
    first = list(fb); second = list(fb)
    assert (first == second)
```

## ArithSeq iterator

Write a python iterator class `ArithSeq` where `ArithSeq(first, step, max)` creates a sequence of integers starting at `first`, going up by `step`, and stopping at the last value in the sequence no larger than `max`. Your class should pass the following tests.

```from arithseq import ArithSeq

def test_evens():
    assert [ x for x in ArithSeq(0,2,10) ] == [0,2,4,6,8,10]

def test_odds():
    assert [ x for x in ArithSeq(1,2,10) ] == [1,3,5,7,9]
```

### Testing

Write two more tests for `ArithSeq`. Explain what you are testing, and why that case is not covered by the given tests.

## ArithSeq function

Write a python function `arith_seq` using a list comprehension that passes the following tests (i.e. produces the same sequences as the class `ArithSeq`).

```from arithseq2 import arith_seq

def test_evens():
    assert arith_seq(0,2,10) == [0,2,4,6,8,10]

def test_odds():
    assert arith_seq(1,2,10) == [1,3,5,7,9]
```

# Octave Questions

## Image processing

Complete the Octave function `nbrcount` that counts for each element of a zero/one matrix, how many of the neighbours are one. For full marks your solution should be vectorized.

```function out = nbrcount(img)

endfunction

%!test
%! A=       [1,0,0;
%!           0,0,0;
%!           0,0,1;
%!           1,0,0];
%! counts = [0,1,0; 1,2,1; 1,2,0; 0,2,1]
%! assert(nbrcount(A) == counts)
```

### Isolated

Use the `nbrcount` function from the previous question to define a function `isolated` that finds all the isolated ones (those ones whose neighbours are all zero). For full marks your solution should be vectorized.

```function out = isolated(img)

endfunction

%!test
%! A= [1,0,0; 0,0,0; 0,0,1; 1,0,0];
%! assert(isolated(A) == A)

%!test
%! A=[1,0,0;
%!    0,0,0;
%!    0,1,1;
%!    1,0,1];
%! assert(isolated(A) == [1,0,0; 0,0,0; 0,0,0;0,0,0])
```

## Normalize

Complete the vectorized Octave function `normalize` according to the given comment and tests. You do not need to copy the usage comment.

```##usage: matrix = normalize(raw, maxes)
##
## raw - raw scores, one row per student.
## maxes - maximum possible for that column
##
## Output is a matrix one row per student, with ratios
function out=normalize(raw, maxes)

endfunction
%!test
%! #    journal,assgn,  midterm,final
%! mxs=[260,    60,     20,     60];
%! nrm = [0.9, 0.9, 0.9, 0.9; 0.75, 0.75, 0.75, 0.75; 1, 0, 1, 0];
%! raw=[234, 54, 18, 54; 195, 45, 15, 45; 260, 0, 20, 0];

%! assert(normalize(raw, mxs), nrm, eps);
```

### percent

Use the function `normalize` from the previous question to complete the following vectorized Octave function to calculate final percentages for students in a class. You do not need to copy the usage comment.

```##usage: scores = percent(raw, maxes, weights)
##
## raw - raw scores, one row per student.
## maxes - maximum possible for that column
## weights - weight for that column in percent
##
## Output is a column vector of a percent for each student.
function out=percent(raw, maxes, weights)

end

%!test
%! #    journal,assgn,  midterm,final
%! mxs=[260,    60,     20,     60];
%! wgt=[20,     30,     20,     30];
%! raw=[234,    54,     18,     54;
%!      195,    45,     15,     45;
%!      260,    0,      20,     0;
%!      0,      60,     0,      60;
%!      200,    40,     17,     33];
%! assert(percent(raw, mxs, wgt), [90;75;40;60;68.88], .01);
```

## Check diet

Write an Octave function `checkdiet` that checks if a proposed diet meets the minimum daily requirements. Your function should pass the given tests. For full marks your function should be fully vectorized (i.e. no loops).

```## usage: passes = checkdiet(TABLE, MINS, DIET)
##
## Check if DIET passes the min daily requirements
##   TABLE(i,j) is the amount of nutrient i in food j
##   MINS(i) = minimum amount of nutrient i required
##   DIET(j) = amount of food j in proposed diet
function yesno = checkdiet(table, mins, diet)
end
%!test
%! assert(checkdiet(eye(3),ones(3,1),ones(3,1)) == 1)
%!test
%! assert(checkdiet(eye(3),zeros(3,1),[1;0;0]) == 1)
%!test
%! assert(checkdiet(eye(3),[1;0;0],zeros(3,1)) == 0)
%!test
%! assert(checkdiet(eye(3),[1;0;0],0.5*ones(3,1)) == 0)
%!test
%! assert(checkdiet(ones(3,3),[1;0;0],0.5*ones(3,1)) == 1)
```

### testing

Identify two weaknesses of the given test suite for `checkdiet` and add two new tests.

## mincol

Write the Octave function `mincol` that finds the position of the smallest element in each row of a matrix. Your function should pass the given test. For full marks your function should be fully vectorized (i.e. no loops).

```## usage: given a matrix of integers, for each row
## return the column containing the minimum value
function out = mincol(data)

endfunction
%!test
%! A = [1,2,3;
%!     3,2,4;
%!     1,2,0;
%!     5,3,4;
%!     3,2,1;
%!     -10,0,0];
%! assert (mincol(A) == [1;2;3;2;3;1]);
```
Posted Wed 09 Dec 2020 08:30:00 AM

Due: 2020-11-20 at 20:11

# Overview

This assignment is based on the material covered in Lab 16 and Lab 17.

The goal of the assignment is to develop an (extremely) simple text based accounting program. This is inspired by Ledger, a command line accounting tool. It is also a very simplified version of a Double-entry accounting system. A simple example of our language is as follows:

```open cash 100
open expenses 0
open equity 0
transfer cash expenses 50
transfer equity cash 100
balance cash
balance expenses
```

We expect output similar to "cash=150, expenses=50". As you can see, every transaction has two accounts (hence the name Double-entry). Here 'equity' and 'expenses' don't correspond to bank accounts, but nonetheless help us track our finances.
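The arithmetic behind that expected output can be checked with a plain dictionary. This is only a sketch of the bookkeeping idea, using whole dollars and a hypothetical helper `transfer`; it is not the scanner-based program you are asked to write.

```python
def transfer(accounts, source, target, amount):
    """Move amount from source to target; the overall total is unchanged."""
    accounts[source] -= amount
    accounts[target] += amount


# Opening balances from the example above.
accounts = {"cash": 100, "expenses": 0, "equity": 0}
transfer(accounts, "cash", "expenses", 50)
transfer(accounts, "equity", "cash", 100)
print(accounts["cash"], accounts["expenses"])
```

Every transfer subtracts from one account exactly what it adds to another, so the sum over all accounts stays at its opening value.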

# General Instructions

• Every non-test function should have a docstring
• Use list and dictionary comprehensions as much as reasonable.
• Your code should pass all of the given tests, plus some of your own with different data.
• Reference PEP8 for coding style and formatting.
• For a detailed marking scheme, see the rubric

# Main program

In this assignment you are provided with the main program, and are asked to write the `scan` generator and the `Type` class. The latter should be a relatively trivial `enum` class.

```from scanner import scan, Type

def _next(scanner,wanted):
    token=next(scanner)
    if (token.type!=wanted):
        raise ValueError(token.value)
    return token

def ledger(string):
    accounts = {}
    scanner = scan(string)
    for token in scanner:
        if token.type == Type.BALANCE:
            next_token = _next(scanner,Type.IDENT)
            balance = 0
            account_name = next_token.value
            if account_name in accounts:
                balance = accounts[account_name]
            yield (account_name,balance)
        elif token.type == Type.OPEN:
            account_token = _next(scanner, Type.IDENT)
            val_token = _next(scanner, Type.CURRENCY)
            accounts[account_token.value]=val_token.value
        elif token.type == Type.TRANSFER:
            from_tok = _next(scanner, Type.IDENT)
            to_tok = _next(scanner,Type.IDENT)
            val_tok = _next(scanner, Type.CURRENCY)
            if from_tok.value in accounts and to_tok.value in accounts:
                accounts[from_tok.value] -= val_tok.value
                accounts[to_tok.value] += val_tok.value
        else:
            raise ValueError("Unexpected token {:s}".format(token.type.name))
```

In general there are 3 kinds of "statements" in our input format.

```BALANCE IDENT
OPEN IDENT CURRENCY
TRANSFER IDENT IDENT CURRENCY
```

When your scanner is complete, the following tests should pass. As usual you are responsible for adding any additional tests to ensure complete test coverage.

```from ledger import ledger

def test_empty():
    assert list(ledger("")) == []

def test_balance():
    assert list(ledger('''
balance cash
balance stock
''')) == [("cash",0),("stock",0)]

def test_open():
    assert list(ledger('''
open cash 100
balance cash
''')) == [("cash",10000)]

def test_transfer():
    assert list(ledger('''
open cash 100
open expenses 0
transfer cash expenses 50
balance cash
balance expenses
''')) == [("cash",5000),("expenses",5000)]
```

# Scanner

In order to get experience with iterators and regular expressions, our language will be token based rather than line based. Because of this we need to write a scanner (sometimes called a lexer or tokenizer). You might find it instructive to look at the tokenizer example from the python documentation, but note that this assignment makes several different design choices from that code.

## Using Enum

Write a subclass of Enum that passes the following test. This will be used by the scanner to classify the tokens (rather than e.g. having a different subclass for each token).

```def test_enum():
    '''check that defined enum type matches assignment'''

    assert sorted([ member.name for member in Type ]) == ["BALANCE", "CURRENCY", "IDENT", "OPEN", "TRANSFER"]
```
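If you have not used `enum` before, the following sketch shows the features the test relies on: iterating over the members of an `Enum` subclass and reading each member's `name`. The class `Colour` is a made-up example, not the `Type` class you need to write.

```python
from enum import Enum, auto


class Colour(Enum):
    """Illustrative enum with automatically assigned values."""
    RED = auto()
    GREEN = auto()
    BLUE = auto()


# Iterating over the class yields its members in definition order.
names = sorted(member.name for member in Colour)
# Members can also be looked up by name, and compared by identity.
same = Colour["RED"] is Colour.RED
print(names, same)
```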

## Token Class

Our scanner will return objects rather than strings. Create a token class with two properties `type` and `value`. Add a `__str__` method to pass the following tests. Notice that internally currency is represented by an integer number of cents, but "pretty printed" as dollars and cents.

```def test_token():
    token=Token(Type.IDENT,"hello")
    assert token.type==Type.IDENT
    assert token.value=="hello"

def test_str():
    id=Token(Type.IDENT,"hello")
    assert str(id) == '[IDENT: hello]'
    money=Token(Type.CURRENCY,10042)
    assert str(money) == '[CURRENCY: 100.42]'
```
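One possible way to pretty print an integer number of cents is sketched below. The helper name `format_cents` and the choice to handle the sign separately are assumptions made for this illustration; your `__str__` method may be organized differently.

```python
def format_cents(cents):
    """Format an integer number of cents as dollars and cents."""
    sign = "-" if cents < 0 else ""
    # divmod on the absolute value avoids surprises with negative amounts.
    dollars, remainder = divmod(abs(cents), 100)
    return "{}{:d}.{:02d}".format(sign, dollars, remainder)


print(format_cents(10042), format_cents(-12345), format_cents(5))
```

The `{:02d}` format pads the cents to two digits, so 5 cents prints as `0.05` rather than `0.5`.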

### Equality testing

In our language keywords are case insensitive, but user defined identifiers are case sensitive. For either case we store the text as the user typed it in the value property. Currency equality is just numbers. Add an `__eq__` method to your `Token` class that makes the following tests pass. This method is mainly used in testing the scanner.

```def test_equal_ident():
    assert Token(Type.IDENT,"Bob") == Token(Type.IDENT,"Bob")
    assert Token(Type.IDENT,"Bob") != Token(Type.IDENT,"bOb")

def test_equal_keywords():
    assert Token(Type.TRANSFER,"transfer") != Token(Type.OPEN,"open")
    assert Token(Type.OPEN,"OPEN") == Token(Type.OPEN,"open")
    assert Token(Type.BALANCE,"BALANCE") == Token(Type.BALANCE,"balance")

def test_equal_currency():
    assert Token(Type.CURRENCY,1000) == Token(Type.CURRENCY,1000)
    assert Token(Type.CURRENCY,1000) != Token(Type.CURRENCY,1001)
```

## `scan` generator

The real work of the scanner is in the `scan` function. Write the `scan` function as a generator (see L17).
Your `scan` function should use regular expressions to match the various kinds of tokens and `yield` appropriate `Token` objects. Note that you can pass `re.IGNORECASE` to various methods to match case insensitively.

```def test_scan_keywords():
    scanner=scan('''TrAnsFer transfer
OPEN BALANCE balance''')
    toks = [Token(Type.TRANSFER,"TrAnsFer"), Token(Type.TRANSFER,"transfer"),
            Token(Type.OPEN,"OPEN"),
            Token(Type.BALANCE,"BALANCE"),  Token(Type.BALANCE,"balance")]

    assert [tok for tok in scanner] == toks

def test_scan_identifiers():
    scanner=scan("equity cash end_of_the_world_fund")
    assert list(scanner) == [Token(Type.IDENT,"equity"),
                             Token(Type.IDENT,"cash"),
                             Token(Type.IDENT,"end_of_the_world_fund")]
```
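The following sketch shows the effect of `re.IGNORECASE` on a compiled pattern. The pattern `KEYWORD` and the helper `is_keyword` are hypothetical names for this illustration; the structure of your own scanner is up to you.

```python
import re

# Compile once; the flag makes every letter match in either case.
KEYWORD = re.compile(r"open|balance|transfer", re.IGNORECASE)


def is_keyword(word):
    """Return True if the whole word is one of the keywords."""
    return KEYWORD.fullmatch(word) is not None


# The matched text keeps the case the user typed.
typed = KEYWORD.fullmatch("TrAnsFer").group(0)
print(is_keyword("OPEN"), is_keyword("opening"), typed)
```

Note that `fullmatch` requires the entire string to match, which is why `opening` is rejected even though it starts with a keyword.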

### Scanning numbers

As mentioned above, currency is stored as integers. That means you need to match the dollars and cents separately (e.g. with separate groups in the corresponding regular expression) and combine the results.

```def test_scan_currency():
    scanner=scan("100 100.00 100.42 -123.45")
    assert list(scanner) == [Token(Type.CURRENCY,10000),
                             Token(Type.CURRENCY,10000),
                             Token(Type.CURRENCY,10042),
                             Token(Type.CURRENCY,-12345)]
```
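Here is one way the separate groups might be combined, as a sketch only. The pattern `AMOUNT`, the helper `to_cents`, and the decision to capture the sign in its own group are assumptions for this example rather than requirements of the assignment.

```python
import re

# Groups: optional sign, dollars, optional two-digit cents.
AMOUNT = re.compile(r"(-?)(\d+)(?:\.(\d\d))?")


def to_cents(text):
    """Convert text such as '100', '100.42' or '-123.45' to integer cents."""
    match = AMOUNT.fullmatch(text)
    if match is None:
        raise ValueError(text)
    sign, dollars, cents = match.groups()
    # The cents group is None when there is no decimal part.
    total = int(dollars) * 100 + int(cents or 0)
    return -total if sign else total


print([to_cents(s) for s in ["100", "100.00", "100.42", "-123.45"]])
```

Capturing the sign separately matters for amounts like `-0.50`, where `int("-0")` would lose the sign.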

### Handling errors

In case of unmatched text, you should raise `ValueError`, with the unmatched text as a parameter.

```def test_scan_bad():
    scanner=scan("&crash")
    with pytest.raises(ValueError, match="&crash"):
        next(scanner)
```
Posted Fri 20 Nov 2020 08:11:00 PM Tags: /tags/python

# Counter, generators, and classes

Time: 15 minutes
Activity: Demo

Recall our generator based counter from L17.

```def make_counter(x):
    print('entering make_counter')
    while True:
        yield x
        print('incrementing x')
        x = x + 1
```

In Lab 17 we almost simulated the generator behaviour with a closure, except that `next(counter)` was replaced by `counter()`. We can provide a compatible interface by using a python class.

```class Counter:
    "Simulation of generator using only __next__ and __init__"
    def __init__(self,x):
        self.x = x
        self.first = x

    def __next__(self):

        self.x = self.x + 1
        return self.x - 1

print('first')
counter = Counter(100)
print('second')
print(next(counter))
print('third')
print(next(counter))
print('last')
```
• Observe that the implementation is closer to the closure based version of L17 than the original generator version.
• It's a recurring theme in Python to have some syntax/builtin-function tied to defining a special `__foo__` method.
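A few more examples of that theme are sketched below, using a made-up class `Deck`: the builtin `len`, indexing with `[]`, and the `in` operator each look for a corresponding special method.

```python
class Deck:
    """Illustrative container that supports len(), indexing and 'in'."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):              # called by len(deck)
        return len(self.cards)

    def __getitem__(self, index):   # called by deck[index]
        return self.cards[index]

    def __contains__(self, card):   # called by card in deck
        return card in self.cards


deck = Deck(["ace", "king", "queen"])
print(len(deck), deck[0], "king" in deck)
```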

# Fibonacci, again

Time: 35 minutes
Activity: individual

Save the generator based Fibonacci example as `~/fcshome/cs2613/L18/fibgen.py`

Save the following in `~/fcshome/cs2613/L18/fib.py`

```#!/usr/bin/python3
class Fib:
    def __init__(self,max):
        self.max = max
        self.a = 0
        self.b = 1

    def __next__(self):
        if self.a < self.max:

        else:
            raise StopIteration
```

Complete the definition of `__next__` so that the following test passes. Hint: consider the closure based version from Lab 17.

```from fib import Fib
from fibgen import fibgen

def test_fib_list():

    genfibs=list(fibgen(100))
    fibber=Fib(100)

    fibs=[]
    while True:
        try:
            fibs.append(next(fibber))
        except StopIteration:
            break

    assert genfibs == fibs
```

We may wonder why this odd looking while loop is used to build `fibs` rather than some kind of list comprehension. The following simpler version currently fails:

```def test_fib_list_2():
    genfibs=list(fibgen(100))
    classfibs=list(Fib(100))
    assert genfibs==classfibs
```

This failure is due to the fact that we haven't really built an iterator yet. Remember Python works by duck-typing, which means that an iterator is something that provides the right methods.

Add the following method to your `Fib` class. Observe the test above now passes.

```    def __iter__(self):
        return self
```
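The same effect can be seen on a smaller example. In this sketch (the classes `Countdown` and `IterableCountdown` are invented for illustration), `list` rejects an object that only has `__next__`, and accepts it once `__iter__` is present.

```python
class Countdown:
    """Has __next__ but no __iter__, so it is not yet iterable."""

    def __init__(self, start):
        self.n = start

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


try:
    list(Countdown(3))
    outcome = "iterable"
except TypeError:
    # list() asks for an iterator via __iter__, which is missing.
    outcome = "not iterable"


class IterableCountdown(Countdown):
    def __iter__(self):
        return self


values = list(IterableCountdown(3))
print(outcome, values)
```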

In addition to signalling to the Python runtime that an object is an iterator, the `__iter__` method serves to restart the traversal of our "virtual list". If we want iterators to act as lists, then the following should really print the same list of Fibonacci numbers twice:

```if __name__ == '__main__':
    fibber = Fib(100)
    for n in fibber:
        print(n)
    for n in fibber:
        print(n)
```

Since it doesn't, let's formalize that as a test. Modify the `__init__` and (especially) the `__iter__` method so that the following test passes.

```def test_fib_restart():
    fibobj = Fib(100)
    list1 = list(fibobj)
    list2 = list(fibobj)
    assert list1 == list2
```
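The difference between a restartable traversal and a one-shot one can be observed with builtin types alone. This sketch only uses a list and the iterator returned by `iter`; the variable names are arbitrary.

```python
numbers = [1, 2, 3]

# A list hands out a fresh iterator on every traversal,
# so it can be walked any number of times.
first_pass = list(numbers)
second_pass = list(numbers)

# A list iterator returns itself from __iter__ and keeps its position,
# so a second traversal finds nothing left.
one_shot = iter(numbers)
drained_once = list(one_shot)
drained_twice = list(one_shot)
print(first_pass, second_pass, drained_once, drained_twice)
```

An `__iter__` that simply returns `self` without touching any state behaves like `one_shot` here, which is the behaviour the test above rules out.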

# Object copying and equality

Time: 25 min
Activity: individual

Save the following skeleton as `~/fcshome/cs2613/labs/L18/expr.py`.

```class Expr:
    def __init__(self,op,left,right):
        pass

    def __eq__(self, other):
        """Overrides the default implementation"""
        pass
```

Replace the definitions of `__init__` and `__eq__` so that the following tests pass. You likely want either the `vars` builtin function or the `__dict__` attribute.

```from expr import Expr
from copy import deepcopy

six_plus_nine = Expr('+', 6, 9)
six_times_nine = Expr('*', 6, 9)
compound1 =  Expr('+', six_times_nine, six_plus_nine)
compound2 =  Expr('*', six_times_nine, compound1)
compound3 =  Expr('+', compound2, 3)

def test_equality():
    assert six_plus_nine == deepcopy(six_plus_nine)
    assert compound1 == deepcopy(compound1)
    assert compound2 == deepcopy(compound2)
    assert compound3 == deepcopy(compound3)
```
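To see what `vars` and `deepcopy` give you, consider this sketch with a made-up class `Point` that deliberately does not define `__eq__`.

```python
from copy import deepcopy


class Point:
    """Illustrative class with no __eq__ of its own."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, [2, 3])
q = deepcopy(p)

# vars(obj) returns the instance's attribute dictionary (obj.__dict__).
attributes = vars(p)
# Without __eq__, == falls back to identity, so the copy is "not equal".
default_equal = (p == q)
# The attribute dictionaries, however, compare equal element by element.
same_attributes = (vars(p) == vars(q))
print(attributes, default_equal, same_attributes)
```

Dictionary comparison uses `==` on the values, so if the values are themselves objects with a suitable `__eq__`, the comparison recurses into them.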

# Arithmetic

Time: 25 min
Activity: individual

Replace the `eval` method definition in your `Expr` class so that the following additional test passes; a simple `if/elif/else` block to test the operator should suffice.

```def test_basic():
    assert six_plus_nine.eval() == 15
    assert six_times_nine.eval() == 54
```

Use deepcopy to update `test_basic` to make sure that eval does not modify the object `self`.

If the following test does not already pass, update your `eval` method so that it does.

```def test_recur():
    assert compound1.eval() == 69
    assert compound2.eval() == 3726
    assert compound3.eval() == 3729
```

Add similar checking for mutation to `test_recur`.

Posted Wed 18 Nov 2020 08:30:00 AM

# Overview

This assignment is based on the material covered in Lab 14 and (mainly) Lab 15.

The goal of the assignment is to develop a simple query language that lets the user select rows and columns from a CSV file, in effect treating it like a database.

• Make sure you commit and push all your work using coursegit before 16:30 on Friday November 6.

# General Instructions

• Every non-test function should have a docstring
• Feel free to add docstrings for tests if you think they need explanation
• Use list and dictionary comprehensions as much as reasonable.
• Your code should pass all of the given tests, plus some of your own with different data. If you want, you can use some of the sample data from the US Government College Scorecard. I've selected some of the data into smaller files:

2015-1000.csv.gz, 2013-100.csv.gz, 2014-100.csv.gz, 2013-1000.csv.gz, 2014-1000.csv.gz, 2015-100.csv.gz

• A marking rubric is available.
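As a quick refresher on the comprehension syntax mentioned above, here is a sketch on unrelated data. The variable names are arbitrary.

```python
words = ["name", "age", "eye colour"]

# Dictionary comprehension: key and value expressions separated by a colon.
lengths = {word: len(word) for word in words}

# List comprehension with a filter condition.
long_words = [word for word in words if len(word) > 3]
print(lengths, long_words)
```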

# Reading CSV Files

We will use the builtin Python CSV module to read CSV files.

```def read_csv(filename):
    '''Read a CSV file, return list of rows'''

    import csv
    with open(filename,'rt',newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [ row for row in reader ]
```
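The `skipinitialspace=True` option is what lets us pad columns with spaces in our test files. The sketch below shows the difference on an in-memory string, using `io.StringIO` in place of a real file.

```python
import csv
import io

text = "name,   age\nBob,    5\n"

# Default behaviour: spaces after the delimiter are kept as part of the field.
plain = list(csv.reader(io.StringIO(text)))

# With skipinitialspace, whitespace right after each comma is dropped.
stripped = list(csv.reader(io.StringIO(text), skipinitialspace=True))
print(plain)
print(stripped)
```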

Save the following as "~/fcshome/assignments/A5/test1.csv"; we will use it in several tests. You should also construct your own example CSV files and corresponding tests.

```name,   age,    eye colour
Bob,    5,      blue
Mary,   27,     brown
Vij,    54,     green
```

Here is a test to give you the idea of the returned data structure from `read_csv`.

```def test_read_csv():
    assert read_csv('test1.csv') == [['name', 'age', 'eye colour'],
                                     ['Bob', '5', 'blue'],
                                     ['Mary', '27', 'brown'],
                                     ['Vij', '54', 'green']]
```

# Parsing Headers

The first row in most CSV files consists of column labels. We will use this to help the user access columns by name rather than by counting columns.

Write a function `header_map` that builds a dictionary from labels to column numbers.

```table = read_csv('test1.csv')

def test_header_map_1():
    hmap = header_map(table[0])
    assert hmap == { 'name': 0, 'age': 1, 'eye colour': 2 }
```

# Selecting columns

Use your implementation of `header_map` to write a function `select` that creates a new table with some of the columns of the given table.

```def test_select_1():
    assert select(table,{'name','eye colour'}) == [['name', 'eye colour'],
                                                   ['Bob',  'blue'],
                                                   ['Mary', 'brown'],
                                                   ['Vij',  'green']]
```

# Transforming rows into dictionaries

Sometimes it's more convenient to work with rows of the table as dictionaries, rather than passing around the map of column labels everywhere. Write a function `row2dict` that takes the output from `header_map`, and a row, and returns a dictionary representing that row (column order is lost here, but that will be ok in our application).

```def test_row2dict():
    hmap = header_map(table[0])
    assert row2dict(hmap, table[1]) == {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
```

# Matching rows

We are going to write a simple query language where each query is a 3-tuple `(left, op, right)`, and `op` is one of `=`, `<`, and `>`. In the initial version, `left` and `right` are numbers or strings. Strings are interpreted as follows: if they are column labels, retrieve the value in that column; otherwise treat them as literal strings. With this in mind, write a function `check_row` that takes a row in dictionary form, and checks if it matches a query tuple.

```def test_check_row():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, ('age', '=', 5))
    assert not check_row(row, ('eye colour', '=', 5))
    assert check_row(row, ('eye colour', '=', 'blue'))
    assert check_row(row, ('age', '>', 4))
    assert check_row(row, ('age', '<', 1000))
```
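The rule for interpreting strings can be sketched as a small helper. The name `resolve` is hypothetical, and this covers only the operand lookup, not the comparison itself.

```python
def resolve(row, operand):
    """Return the row's value if operand is a column label, else the operand."""
    if isinstance(operand, str) and operand in row:
        return row[operand]
    return operand


row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
looked_up = resolve(row, 'age')     # a column label: value from the row
literal = resolve(row, 'blue')      # not a label: literal string
number = resolve(row, 5)            # numbers pass through unchanged
print(looked_up, literal, number)
```

Notice that the value looked up for `age` is the string `'5'`, not the number `5`. In Python `'5' == 5` is false and `'5' > 4` raises `TypeError`, so the tests above imply some conversion is needed before comparing a column value with a number.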

# Extending the query language

Extend `check_row` so that it supports operations `AND` and `OR`. For these cases both left and right operands must be queries. Hint: this should only be a few more lines of code.

```def test_check_row_logical():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, (('age', '=', 5),'OR',('eye colour', '=', 5)))
    assert not check_row(row, (('age', '=', 5),'AND',('eye colour', '=', 5)))
```

# Filtering tables

Use your previously developed functions to implement a function `filter_table` that selects certain rows of the table according to a query.

```def test_filter_table1():
    assert filter_table(table,('age', '>', 0)) == [['name', 'age', 'eye colour'],
        ['Bob', '5', 'blue'],
        ['Mary', '27', 'brown'],
        ['Vij', '54', 'green']]

    assert filter_table(table,('age', '<', 28)) == [['name', 'age', 'eye colour'],
        ['Bob', '5', 'blue'],
        ['Mary', '27', 'brown']]

    assert filter_table(table,('eye colour', '=', 'brown')) == [['name', 'age', 'eye colour'],
        ['Mary', '27', 'brown']]

    assert filter_table(table,('name', '=', 'Vij')) == [['name', 'age', 'eye colour'],
        ['Vij', '54', 'green']]

def test_filter_table2():
    assert filter_table(table,(('age', '>', 0),'AND',('age','>','26'))) == [['name', 'age', 'eye colour'],
        ['Mary', '27', 'brown'],
        ['Vij', '54', 'green']]

    assert filter_table(table,(('age', '<', 28),'AND',('age','>','26'))) == [['name', 'age', 'eye colour'],
        ['Mary', '27', 'brown']]

    assert filter_table(table,(('eye colour', '=', 'brown'),
        'OR',
        ('name','=','Vij'))) == [['name', 'age', 'eye colour'],
        ['Mary', '27', 'brown'],
        ['Vij', '54', 'green']]
```
Posted Fri 06 Nov 2020 04:30:00 PM Tags: /tags/python

# Discussion

Time: 10 minutes
• Test coverage reports can be generated with

``````  pytest-3 --cov=mymodule --cov-report=term-missing
``````

where mymodule is the name of the module in question

• Any other questions about A4?

# Splitting strings

Time: 20 minutes
Activity: Individual

Our first approach to parsing CSV files uses builtin string methods. Use the split method, the splitlines method, and a list comprehension to implement the function `split_csv` (spoiler: this is a bad CSV parser, missing many special cases). Be careful of introducing extra space when you copy and paste.

```from parse_csv import split_csv

test_string_1 = """OPEID,INSTNM,TUITIONFEE_OUT
02503400,Amridge University,6900
00100700,Central Alabama Community College,7770
01218200,Chattahoochee Valley Community College,7830
00101500,Enterprise State Community College,7770
00106000,James H Faulkner State Community College,7770
00101700,Gadsden State Community College,5976
00101800,George C Wallace State Community College-Dothan,7710
"""

table1 = [['OPEID', 'INSTNM', 'TUITIONFEE_OUT'],
['02503400', 'Amridge University', '6900'],
['00100700', 'Central Alabama Community College', '7770'],
['01218200', 'Chattahoochee Valley Community College', '7830'],
['00101500', 'Enterprise State Community College', '7770'],
['00106000', 'James H Faulkner State Community College', '7770'],
['00101700', 'Gadsden State Community College', '5976'],
['00101800', 'George C Wallace State Community College-Dothan', '7710']]

def test_split_1():
    assert split_csv(test_string_1) == table1
```
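As a reminder of how the two string methods behave, here is a small sketch on made-up data (the input string is hypothetical, and this is not the `split_csv` solution):

```python
# Hypothetical two-line input, unrelated to the lab data.
text = "a,b,c\n1,2,3\n"

# splitlines() breaks on line boundaries and drops the trailing newline,
# so there is no empty final element.
lines = text.splitlines()

# split(",") cuts a single line at every comma.
fields = lines[0].split(",")

# By contrast, split("\n") keeps an empty string after the last newline.
naive = text.split("\n")
```

The difference between `lines` and `naive` is one reason to prefer `splitlines` when the input ends with a newline.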

# Regular Expressions

Time
10 minutes
Activity
Demo / Group discussion

To get familiar with regular expressions, we follow the street address Case Study.

Try the following evaluations in a python REPL.

```>>> '100 NORTH MAIN ROAD'.replace('ROAD', 'RD.')
>>> s = '100 NORTH BROAD ROAD'
>>> s.replace('ROAD', 'RD.')
# oops
>>> s[:-4] + s[-4:].replace('ROAD', 'RD.')
# ugh, that code
>>> import re
>>> re.sub('ROAD$', 'RD.', s)
# what dark magic is this?
```

Regular expressions are a domain-specific language that allows us to specify complicated string operations. In practice, the simple `$` we used above is not enough.

```>>> s = '100 BROAD'
>>> re.sub('ROAD$', 'RD.', s)
# New regex feature \b.
>>> re.sub('\\bROAD$', 'RD.', s)
# Raw strings reduce \ overload
>>> re.sub(r'\bROAD$', 'RD.', s)
# Our new regex is too "narrow"
>>> s = '100 BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD$', 'RD.', s)
>>> re.sub(r'\bROAD\b', 'RD.', s)
```
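For reference, the same experiments can be collected into a small script. This is only a sketch; the variable names are ours.

```python
import re

# Without an anchor, every occurrence of ROAD is replaced, even inside BROAD.
unanchored = re.sub(r'ROAD', 'RD.', '100 NORTH BROAD ROAD')

# $ only matches at the end of the string.
end_only = re.sub(r'ROAD$', 'RD.', '100 NORTH BROAD ROAD')

# \b requires a word boundary, so BROAD is left alone
# even when ROAD is not at the end of the string.
whole_word = re.sub(r'\bROAD\b', 'RD.', '100 BROAD ROAD APT. 3')

print(unanchored)
print(end_only)
print(whole_word)
```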

In the next part we will need to use a few fancier features.

```import re
rex=re.compile(r'([^0-9]+)')
for match in rex.findall('113abba999bjorn78910101benny888331dancing34234queen'):
    print(match)
```

# Stripping Quotes

Time
20 minutes
Activity
Individual

In general, entries of CSV files can have quotes, but these are not considered part of the content. In particular, a correct version of `split_csv` should pass the following test.

```test_string_2 = '''OPEID,INSTNM,TUITIONFEE_OUT
02503400,"Amridge University",6900
00100700,"Central Alabama Community College",7770
01218200,"Chattahoochee Valley Community College",7830
00101500,"Enterprise State Community College",7770
00106000,"James H Faulkner State Community College",7770
00101700,"Gadsden State Community College",5976
00101800,"George C Wallace State Community College-Dothan",7710
'''

def test_split_2():
    assert split_csv(test_string_2) == table1
```

Ours doesn't yet, so let's try to fix that using regular expressions.

• Fill in the regex in `strip_quotes` so that it passes the following test
```    def test_strip_quotes():
        assert strip_quotes('"hello"') == 'hello'
        assert strip_quotes('hello') == 'hello'
```
• Here is a skeleton for `strip_quotes`:
```    def strip_quotes(string):
        strip_regex = re.compile(               )
        search = strip_regex.search(string)
        if search:
            return search.group(1)
        else:
            return None
```
• You'll want to refer to regular expression features
• The use of `group(1)` means your regex solution should have exactly one set of `(…)` with a regex matching the non-quoted part.
• You can say something is optional by using `…?`, any number of repetitions with `…*`
• A character not in a given set can be matched with `[^…]`
• Once you have a working `strip_quotes`, use it in `parse_csv` in order to make the test above pass.
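The regex features listed above can be tried out on an unrelated problem first. The following is a minimal sketch; the names `sign_regex` and `first_word` are ours and are not part of the lab files.

```python
import re

# '?' makes the preceding item optional; '(...)' captures a group.
# Here an optional leading minus sign is skipped and the digits captured.
sign_regex = re.compile(r'^-?([0-9]*)$')
digits = sign_regex.search('-42').group(1)

# '[^ ]' matches any character that is not a space; '*' repeats it.
first_word = re.compile(r'^([^ ]*)').search('hello world').group(1)

# search returns None when the pattern cannot match at all.
no_match = sign_regex.search('4-2')
```

Note that `group(1)` refers to the first (here the only) parenthesized group, while `group(0)` would be the whole match.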

# Handling quoted commas

Time
30 minutes
Activity
Individual

It turns out one of the main reasons for supporting quotes is to handle quoted commas. The function `split_row_3` is intended to split rows with exactly 3 columns.

```def test_split_row_3():
    assert split_row_3('00101800,"George C Wallace State Community College, Dothan",7710') == \
        ['00101800', 'George C Wallace State Community College, Dothan', '7710']
```
• Read the discussion on verbose regular expressions

• Complete the definition of `split_row_3`. You'll want to figure out a regular expression that matches either a quoted or an unquoted column, and then repeat that 3 times. "Or" in regular expressions is implemented with `|`

• You will want to use `[^…]` once for each case; in one case for excluding `"` and in the other for excluding `,`.

```def split_row_3(string):
    split_regex=re.compile(
        r'''^   # start
        ("     "|     )     # column
        ,
        ("     "|     )     # column
        ,
        ("     "|     )     # column
        $''', re.VERBOSE)
    search = split_regex.search(string)
    if search:
        return [ strip_quotes(col) for col in search.groups() ]
    else:
        return None
```
• Use your `split_row_3` function in `split_csv` to pass the following test
```test_string_3 = '''OPEID,INSTNM,TUITIONFEE_OUT
02503400,"Amridge University",6900
00100700,"Central Alabama Community College",7770
01218200,"Chattahoochee Valley Community College",7830
00101500,"Enterprise State Community College",7770
00106000,"James H Faulkner State Community College",7770
00101700,"Gadsden State Community College",5976
00101800,"George C Wallace State Community College, Dothan",7710
'''

table2 = [['OPEID', 'INSTNM', 'TUITIONFEE_OUT'],
['02503400', 'Amridge University', '6900'],
['00100700', 'Central Alabama Community College', '7770'],
['01218200', 'Chattahoochee Valley Community College', '7830'],
['00101500', 'Enterprise State Community College', '7770'],
['00106000', 'James H Faulkner State Community College', '7770'],
['00101700', 'Gadsden State Community College', '5976'],
['00101800', 'George C Wallace State Community College, Dothan', '7710']]

def test_split_3():
    '''Check handling of quoted commas'''
    assert split_csv(test_string_3) == table2
```
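To see verbose regular expressions and `|` working together on a smaller problem, here is a sketch that parses a made-up `key=value` syntax. The function `parse_pair` and the syntax itself are hypothetical, chosen only for illustration.

```python
import re

# A hypothetical "key=value" matcher, where the value is either
# all digits or all lowercase letters.
pair_regex = re.compile(
    r'''^
    ([a-z]+)            # key
    =
    ([0-9]+|[a-z]+)     # value: digits OR letters
    $''', re.VERBOSE)

def parse_pair(string):
    """Return [key, value], or None if the string does not match."""
    match = pair_regex.search(string)
    return list(match.groups()) if match else None
```

With `re.VERBOSE`, whitespace and `#` comments inside the pattern are ignored, which is what makes the one-piece-per-line layout possible.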

# Parsing more columns

Time
20 minutes
Activity
Individual

Use your column matching regex, along with the `findall` method to match any number of columns. Call your new function `split_row`.

```def test_split_row():
    assert split_row('00101800,"George C Wallace State Community College, Dothan",7710,",,,"') == \
        ['00101800', 'George C Wallace State Community College, Dothan', '7710',',,,']
```
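One detail of `findall` that is easy to miss: when the pattern contains groups, it returns the group contents rather than the whole match. A small sketch on made-up input (the patterns here are illustrative, not the column regex):

```python
import re

# A word is either wrapped in <...> or is a plain run of letters.
word_regex = re.compile(r'(<[^>]*>|[a-z]+)')

# With exactly one group, findall returns a list of strings.
words = word_regex.findall('alpha <beta gamma> delta')

# With two or more groups, findall returns a list of tuples instead.
pairs = re.findall(r'([a-z]+)=([0-9]+)', 'a=1 b=22')
```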

Use your new function in place of `split_row_3` so that the following test (and all previous tests) pass.

```test_string_4=\
'''OPEID,INSTNM,PCIP52,TUITIONFEE_OUT
00103800,Snead State Community College,0.0811,7830
00573400,H Councill Trenholm State Community College,0.0338,7524
00573300,"Bevill, State, Community College",0.0451,7800
00884300,Alaska Bible College,0,9300
00107100,Arizona Western College,0.0425,9530
00107200,"Cochise County Community College, District",0.0169,6000
'''

table3=[
['OPEID', 'INSTNM', 'PCIP52', 'TUITIONFEE_OUT'],
['00103800', 'Snead State Community College', '0.0811', '7830'],
['00573400', 'H Councill Trenholm State Community College', '0.0338', '7524'],
['00573300', 'Bevill, State, Community College', '0.0451', '7800'],
['00884300', 'Alaska Bible College', '0', '9300'],
['00107100', 'Arizona Western College', '0.0425', '9530'],
['00107200', 'Cochise County Community College, District', '0.0169', '6000']]

def test_split_4():
    assert split_csv(test_string_4) == table3
```
Posted Wed 04 Nov 2020 08:30:00 AM Tags: /tags/python

# Discussion

Time
10 minutes
Activity
Discussion / Demo
• Questions about A4
• Time permitting, demo of `pudb3`, command line python debugger.

# Globbing and List comprehensions

Time
20 minutes
Activity
individual

List comprehensions can be seen as a special kind of `for` loop. Construct a list comprehension equivalent to the given `for` loop.

```#!/usr/bin/python3
import glob
import os
new_dir = os.path.expanduser("~/fcshome/cs2613/labs/test")

python_files_for = []

for file in glob.glob("*.py"):
    python_files_for.append(os.path.join(new_dir,file))

python_files_comp = ____________________________________________________________
```

Here is a test to make sure your two constructions are really equivalent; the use of `sorted` is probably unneeded here, but it means we don't depend on the order returned by `glob` being consistent. Put the following in `~/fcshome/cs2613/labs/L16/test_globex.py`.

```#!/usr/bin/python3
import globex

def test_for():
    assert sorted(globex.python_files_for) == sorted(globex.python_files_comp)
```

In fact list comprehensions are really closer to a convenient syntax for `map`, which you may remember from Racket and JavaScript. Python also has `map` and `lambda`, although these are considered less idiomatic than using list comprehensions. Fill in the body of the `lambda` (should be similar or identical to your list comprehension expression).

```python_files_map = map(lambda file: __________________________, glob.glob("*.py"))
```

The following test should pass

```def test_map():
    assert sorted(globex.python_files_comp) == sorted(globex.python_files_map)
```
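One difference worth knowing about: in Python 3, `map` returns a lazy iterator rather than a list. The following sketch uses made-up data to show the consequence.

```python
names = ["ant", "bee", "cat"]

# A list comprehension builds the list directly.
upper_comp = [name.upper() for name in names]

# map applies the lambda lazily and returns an iterator, not a list.
upper_map = map(lambda name: name.upper(), names)

# Consuming the iterator (here with list) yields the same elements,
# but a map object can only be consumed once.
upper_list = list(upper_map)
leftover = list(upper_map)   # already exhausted, so this is empty
```

This matters if more than one test (or more than one call to `sorted`) reads the same `map` object; wrapping it in `list(...)` avoids the surprise.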

# Dictionary Comprehensions

Time
20 minutes
Activity
Individual

Dictionary Comprehensions are quite similar to list comprehensions, except that they use

``````{ key: val for ...}
``````
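For example, the following sketch (with made-up data) maps each word to its length, first with a comprehension and then with the equivalent loop:

```python
words = ["apple", "fig", "cherry"]

# Dictionary comprehension: one key/value pair per element.
lengths = {word: len(word) for word in words}

# The equivalent for loop.
lengths_for = {}
for word in words:
    lengths_for[word] = len(word)
```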

Create a file `~/fcshome/cs2613/labs/L16/list2dict.py` with a function `list2dict` that transforms a list into a dictionary indexed by integers. Your function should use a dictionary comprehension and pass the following tests. One approach uses the Python builtin `range`. It may help to write it first using a `for` loop.

```#!/usr/bin/python3

from list2dict import list2dict

def test_empty():
    assert list2dict([]) == {}

def test_abc():
    dictionary=list2dict(["a", "b", "c"])
    assert dictionary == {1: 'a', 2: 'b', 3: 'c'}
```

# Filtered List Comprehensions

Time
25 minutes
Activity
individual

Looking at the discussion of list comprehensions, we can see that it is possible to filter the list of values used in the list comprehension with an `if` clause. Use this syntax to re-implement the function `drop-divisible` from A1. Notice that the implementation of `sieve_with` is not suitable for a list comprehension because of the update of `lst` on every iteration (in Racket this could be done without mutation by tail recursion or `for/fold`). Python does have a `reduce` function (in the `functools` module), but most Python programmers will prefer the 2 line loop given here.

```#!/usr/bin/python3

from math import sqrt,ceil

def drop_divisible(n,lst):
    return __________________________________

def sieve_with(candidates, lst):
    for c in candidates:
        lst=drop_divisible(c,lst)
    return lst

def sieve(n):
    return sieve_with(range(2,ceil(sqrt(n))+1), range(2,n))
```

Your implementation should pass the following tests.

```from sieve import drop_divisible, sieve

def test_drop_divisible():
    assert drop_divisible(3, [2, 3, 4, 5, 6, 7, 8, 9, 10]) == [2, 3, 4, 5, 7, 8, 10]

def test_sieve():
    assert sieve(10) == [2, 3, 5, 7]
```
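To illustrate the two ideas from this section on an unrelated problem, here is a sketch with made-up data: an `if` clause that filters a comprehension, and `functools.reduce` standing in for a loop that updates an accumulator.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# The trailing `if` keeps only the elements that satisfy the condition.
evens = [n for n in numbers if n % 2 == 0]

# A loop that updates an accumulator on every iteration...
total_loop = 0
for n in numbers:
    total_loop = total_loop + n

# ...can be written with reduce, which threads the accumulator through
# a two-argument function, starting from the initial value 0.
total_reduce = reduce(lambda acc, n: acc + n, numbers, 0)
```

The loop version is usually considered easier to read, which is why the lab keeps the loop in `sieve_with`.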

# Using `format`

Time
25 minutes
Activity
individual

Like JavaScript, Python supports a simple way of constructing output using the overloaded operator `+`. Python also supports a more powerful format method (similar to Racket's format function) for combining values into a formatted output string. Use the `format` method and a list comprehension to write an equivalent value into `strings_format`.

```import os,glob

strings_plus = []
for p in glob.glob("*.py"):
    size=os.stat(p).st_size
    strings_plus.append(p + "\t" + str(size))

strings_format = __________________________________________________
```

Your code should pass the following test.

```import formatex

def test_equality():
    assert sorted(formatex.strings_plus) == sorted(formatex.strings_format)
```
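As a sketch of what the `format` method offers beyond `+`, the following uses made-up (name, score) pairs rather than file sizes:

```python
# Hypothetical (name, score) pairs standing in for real data.
scores = [("alice", 93), ("bob", 78)]

# Positional fields are filled in order; format converts the int for us,
# so no explicit str() call is needed.
lines = ["{}\t{}".format(name, score) for name, score in scores]

# Fields can also be named and given a format specification:
# right-align the name in 6 columns, zero-pad the score to width 5
# with one decimal place.
padded = "{name:>6}: {score:05.1f}".format(name="bob", score=78)
```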
Posted Mon 02 Nov 2020 07:30:00 AM

# A first example

Time
15 min
Activity
individual
• Background for this section is DiP3 1.1

• Download humansize.py and save it as `~/fcshome/cs2613/labs/L14/humansize.py`

• Run it from the command line

``````  $ python3 humansize.py
``````
• Load it into `vscodium`, and run it in the debugger.

• Switch to the bottom prompt of the debugger window and run

``````  > from humansize import approximate_size
  > approximate_size(10000)
  > approximate_size(10000,False)
``````
• Repeat the previous step by running in a terminal

``````  $ python3
``````

and then typing in the same lines.

# Pytest

Time
20 min
Activity
individual

In this part of the course we will be using pytest to write unit tests.

• Download an initial test file and save it as `~/fcshome/cs2613/labs/L14/test_humansize.py`

• Open a terminal and run

``````  $ pytest-3 test_humansize.py
``````
• Convert each working example in DiP3 1.2.1 into a test.

• Figure out how to run the tests from within `vscodium`. To get started:

• Ctrl-shift-P to bring up the command prompt
• type "Python: Configure Tests"

# Modules

Time
20 min
Activity
individual
• Create a new file `~/fcshome/cs2613/labs/L14/client.py`

• Import the module `humansize` in `client.py`

• Define a new function `approximate_size` that calls the function `approximate_size` from the `humansize` module, but has a default of `False` for the parameter `a_kilobyte_is_1024_bytes`

• Observe that the code in `humansize.py` guarded by `if __name__ == '__main__'` does not run when imported into `client.py` (you might see echoes of Racket submodules from Lab 5). Create a similar block in `client.py`, and run it from the command line.
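The pattern of wrapping a function with a different default, plus the `__main__` guard, can be sketched in a single file as follows. The file name `greet.py` and both functions are hypothetical; they are not part of `humansize.py`.

```python
# Sketch of a module file, here called greet.py (a hypothetical name).

def greet(name, shout=False):
    """Return a greeting, optionally in upper case."""
    message = "hello " + name
    return message.upper() if shout else message

def loud_greet(name, shout=True):
    """Wrapper around greet with a different default for shout."""
    return greet(name, shout)

if __name__ == '__main__':
    # Runs for `python3 greet.py`, but not for `import greet`.
    print(loud_greet("world"))
```

When the file is imported, `__name__` is the module name (`greet`) rather than `'__main__'`, so the guarded block is skipped.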

# More testing, docstrings

Time
15 min
Activity
individual
• Create `test_client.py` by copying and modifying (if necessary) `test_humansize.py`

• From the command line, run

``````  $ pytest-3
``````
• Add a test to `test_client.py` to ensure that your new function has a docstring

• Make sure your new function has a docstring (i.e. that the test you just added passes) and that it makes sense.
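A function's docstring is available at run time as its `__doc__` attribute, which is what makes such a test possible. A minimal sketch, using a hypothetical function `double` rather than anything from `client.py`:

```python
def double(x):
    """Return twice the argument."""
    return 2 * x

def test_double_has_docstring():
    # __doc__ is None when a function has no docstring.
    assert double.__doc__ is not None
    assert double.__doc__.strip() != ""
```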

# Indentation

Time
20 min
Activity
individual
• One initially surprising aspect of Python is its use of indentation to define blocks

• Start a new file `~/fcshome/cs2613/labs/L14/fizzbuzz.py`. Add the following code, and fix the indentation so that it runs

```    for i in range(1,101):
if (i%3 == 0 and i%5 == 0):
print("FizzBuzz")
elif (i%5==0):
print("Buzz")
else:
print(i)
```
• Add the missing case for "Fizz" to this program. You might want to refer to L08 for the FizzBuzz problem definition

# Exceptions

Time
20 min
Activity
individual
• Python throws exceptions when dividing by zero, but suppose we are nostalgic for the laid-back JavaScript way of handling/ignoring errors.

• Start with the following code as `~/fcshome/cs2613/labs/L14/divisive.py`.

```    def fraction(a,b):
        return a/b;
```
• Add a `try/except` block to `fraction` so that the following test-suite (saved as `test_divisive.py`) passes
```    from divisive import fraction

    def test_fraction_int():
        assert fraction(4,2) == 2;

    def test_fraction_NaN():
        assert fraction(4,0) == 'NaN';
```
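The general shape of a `try/except` block can be seen in the following sketch, which deals with a different exception (`ValueError` from `int`). The function `to_int` is hypothetical and is not part of the lab files.

```python
def to_int(text):
    """Convert text to an int, or return None if it is not a number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings like "abc".
        return None
```

Naming the exception class after `except` means other, unexpected errors still propagate; the exception raised by division by zero is `ZeroDivisionError`.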
Posted Wed 28 Oct 2020 08:30:00 AM

# Introduction

Debian is currently collecting buildinfo files, but they are not very conveniently searchable. Eventually Chris Lamb's buildinfo.debian.net may solve this problem, but in the meantime, I decided to see how practical indexing the full set of buildinfo files is with sqlite.

# Hack

1. First you need a copy of the buildinfo files. This is currently about 2.6G, and unfortunately you need to be a Debian developer to fetch it.

`````` $ rsync -avz mirror.ftp-master.debian.org:/srv/ftp-master.debian.org/buildinfo .
``````
2. Indexing takes about 15 minutes on my 5 year old machine (with an SSD). If you index all dependencies, you get a database of about 4G, probably because of my natural genius for database design. Restricting to debhelper and dh-elpa, it's about 17M.

`````` $ python3 index.py
``````

You need at least `python3-debian` installed.

3. Now you can do queries like

`````` $ sqlite3 depends.sqlite "select * from depends where depend='dh-elpa' and depend_version<='0106'"
``````

where 0106 is some ad hoc normalization of 1.6.
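The actual normalization used by `index.py` is not shown here. One scheme that is consistent with mapping 1.6 to 0106 is to zero-pad each dotted component, so that plain string comparison agrees with numeric order. A sketch under that assumption (the helper `normalize_version` is hypothetical):

```python
def normalize_version(version, width=2):
    """Zero-pad each dotted component, e.g. '1.6' -> '0106'.

    Hypothetical helper: an assumption about the scheme, not the
    code from index.py.
    """
    return "".join(part.zfill(width) for part in version.split("."))

# Compared as raw strings, '1.10' sorts before '1.6';
# after padding, the order matches the numeric one.
raw_order = "1.10" < "1.6"
padded_order = normalize_version("1.10") > normalize_version("1.6")
```

A scheme like this breaks down for components longer than `width` digits or for non-numeric suffixes, which fits the description of the hackery as fragile.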

# Conclusions

The version number hackery is pretty fragile, but good enough for my current purposes. A more serious limitation is that I don't currently have a nice (and you see how generous my definition of nice is) way of limiting to builds currently available e.g. in Debian unstable.

Posted Sat 02 Sep 2017 05:41:00 PM Tags: /tags/python

I could not find any nice examples of using the vobject class to filter an icalendar file. Here is what I got to work. I'm sure there is a nicer way. This strips all of the valarm subevents (reminders) from an icalendar file.

``````import vobject
import sys

cal=vobject.readOne(sys.stdin)

for ev in cal.vevent_list:
    # Python 3: dict.has_key() and the print statement no longer exist
    if 'valarm' in ev.contents:
        del ev.contents['valarm']

print(cal.serialize())
``````
Posted Sun 01 Jun 2008 12:00:00 AM Tags: /tags/python