This feed contains pages with tag "python".
Python Background
Octave Background
Most of the Octave material here is covered in the GNU Octave Beginner's Guide, in particular
- GOBG Chapter 2
- GOBG Chapter 4
- GOBG Chapter 5
For some parts, we will refer to the GNU Octave Manual
We'll refer to the online text by Robert Beezer for linear algebra background. We can happily ignore the stuff about complex numbers.
Before the lab
One of the main features of Octave we will discuss is vectorization. To understand it, we need some background material on Linear Algebra. If you've taken a linear algebra course recently, this should be easy, otherwise you will probably need to review
Vector Operations, particularly
- Addition
- Subtraction
- Scalar multiplication
-
- Addition
- Subtraction
- Scalar multiplication
If you don't have time before this lab, make sure you review any needed linear algebra before the next lab.
Counter, generators, and classes
- Time
- 15 minutes
- Activity
- Demo
Recall our generator based counter from L17.
def make_counter(x):
    print('entering make_counter')
    while True:
        yield x
        print('incrementing x')
        x = x + 1
In Lab 17 we almost simulated the generator behaviour with a closure, except that next(counter) was replaced by counter(). We can provide a compatible interface by using a Python class.
class Counter:
    "Simulation of generator using only __next__ and __init__"
    def __init__(self,x):
        self.x = x
        self.first = x
    def __next__(self):
        self.x = self.x + 1
        return self.x - 1
print('first')
counter = Counter(100)
print('second')
print(next(counter))
print('third')
print(next(counter))
print('last')
- Observe that the implementation is closer to the closure based version of L17 than the original generator version.
- It's a recurring theme in Python to have some syntax/builtin-function tied to defining a special __foo__ method.
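As a small illustration of this theme (the Playlist class and its contents are invented for this sketch and are not part of the lab), the builtin len and the indexing syntax obj[i] are tied to the special methods __len__ and __getitem__ in the same way that next is tied to __next__:

```python
class Playlist:
    """Hypothetical container, used only to illustrate special methods."""
    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        # called by the builtin len()
        return len(self.songs)

    def __getitem__(self, index):
        # called by the indexing syntax playlist[index]
        return self.songs[index]

playlist = Playlist(['waterloo', 'fernando'])
print(len(playlist))
print(playlist[1])
```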
Fibonacci, again
- Time
- 35 minutes
- Activity
- individual
Save the generator based Fibonacci example as ~/fcshome/cs2613/L18/fibgen.py
Save the following in ~/fcshome/cs2613/L18/fib.py
#!/usr/bin/python3
class Fib:
    def __init__(self,max):
        self.max = max
        self.a = 0
        self.b = 1
    def __next__(self):
        if self.a < self.max:
        else:
            raise StopIteration
Complete the definition of __next__ so that the following test passes.
Hint: Consider following the closure based version from Lab 17.
from fib import Fib
from fibgen import fibgen
def test_fib_list():
    genfibs=list(fibgen(100))
    fibber=Fib(100)
    fibs=[]
    while True:
        try:
            fibs.append(next(fibber))
        except StopIteration:
            break
    assert genfibs == fibs
We may wonder why this odd looking while loop is used to build fibs rather than some kind of list comprehension. The following simpler version currently fails:
def test_fib_list_2():
    genfibs=list(fibgen(100))
    classfibs=list(Fib(100))
    assert genfibs==classfibs
This failure is due to the fact that we haven't really built an iterator yet. Remember Python works by duck-typing, which means that an iterator is something that provides the right methods.
Add the following method to your Fib class. Observe the test above now passes.
    def __iter__(self):
        return self
In addition to signalling to the Python runtime that some object is an iterator, the __iter__ method serves to restart the traversal of our "virtual list".
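To make the duck-typing point concrete, here is a minimal sketch using a hypothetical Countdown class (not part of the lab code). Its __iter__ only returns self and does not reset anything, so list() accepts the object but a second traversal finds it already exhausted:

```python
class Countdown:
    """Hypothetical iterator counting down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # returning self tells Python this object is its own iterator
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

countdown = Countdown(3)
first_pass = list(countdown)   # consumes the iterator
second_pass = list(countdown)  # nothing left: __iter__ did not restart it
print(first_pass, second_pass)
```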
If we want iterators to act as lists, then the following should really print the same list of Fibonacci numbers twice:
if __name__ == '__main__':
    fibber = Fib(100)
    for n in fibber:
        print(n)
    for n in fibber:
        print(n)
Since it doesn't, let's formalize that as a test. Modify the __init__ and (especially) the __iter__ method so that the following test passes.
def test_fib_restart():
    fibobj = Fib(100)
    list1 = list(fibobj)
    list2 = list(fibobj)
    assert list1 == list2
Running Octave
There is a GUI accessible from Activities -> GNU Octave, or by running from the command line
% octave --gui
There is also a REPL accessible from the command line by running
% octave
To sanity check your Octave setup, run the following plots
>> surf(peaks)
>> contourf(peaks)
Fibonacci
- Time
- 15 minutes
- Activity
- Demo/Group programming.
Let's dive in to programming in Octave with a straightforward port of our Python Fibonacci function. Save the following in ~/fcshome/cs2613/labs/L19/fib.m. It turns out to be important that the function name is the same as the file name.
function ret = fib(n)
  a = 0;
  b = 1;
  for i=0:n
  endfor
endfunction
%!assert (fib(0) == 0);
%!assert (fib(1) == 1);
%!assert (fib(2) == 1);
%!assert (fib(3) == 2);
%!assert (fib(4) == 3);
%!assert (fib(5) == 5);
We can avoid the need for a temporary variable by calling the deal function, but it's not clear that would be faster. If you have time, try it.
Note the %!assert lines. These are unit tests that can be run with
>> test fib
The syntax for %!assert is a bit fussy, in particular the parentheses are needed around the logical test.
Fibonacci as matrix multiplication
- Time
- 20 minutes
- Activity
- individual
The following is a well known identity about the Fibonacci numbers F(i).
[ 1, 1;
1, 0 ]^n = [ F(n+1), F(n);
F(n), F(n-1) ]
Since matrix exponentiation is built in to Octave, this is particularly easy to implement in Octave.
Save the following as ~/fcshome/cs2613/labs/L19/fibmat.m, and fill in the two matrix operations needed to complete the algorithm.
function ret = fibmat(n)
A = [1,1; 1,0];
endfunction
%!assert (fibmat(0) == 0);
%!assert (fibmat(1) == 1);
%!assert (fibmat(2) == 1);
%!assert (fibmat(3) == 2);
%!assert (fibmat(4) == 3);
%!assert (fibmat(5) == 5);
%!assert (fibmat(6) == 8);
%!assert (fibmat(25) == 75025);
Performance comparison
- Time
- 10 minutes
- Activity
- Demo / discussion
We can expect the second Fibonacci implementation to be faster for two distinct reasons:
- It's possible to compute matrix powers rather quickly (O(log n) compared to O(n)), and since the fast algorithm is also simple, we can hope that Octave implements it. Since the source to Octave is available, we could actually check this.
- Octave is interpreted, so loops are generally slower than matrix operations (which can be done in a single call to an optimized library). This general strategy is called vectorization, and applies in a variety of languages, usually for numerical computations. In particular most PC hardware supports some kind of hardware vector facility.
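The O(log n) claim in the first reason refers to repeated squaring. The following is a sketch in Python, not a description of what Octave actually does internally; the helper names mat_mult, mat_pow and fib_by_power are invented for this illustration. It raises the 2x2 matrix from the identity above to the n-th power using a number of multiplications proportional to log n:

```python
def mat_mult(X, Y):
    """Multiply two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a*e + b*g, a*f + b*h),
            (c*e + d*g, c*f + d*h))

def mat_pow(A, n):
    """Compute A to the power n by repeated squaring."""
    result = ((1, 0), (0, 1))          # 2x2 identity matrix
    while n > 0:
        if n % 2 == 1:                 # lowest bit of n is set
            result = mat_mult(result, A)
        A = mat_mult(A, A)             # square for the next bit
        n //= 2
    return result

def fib_by_power(n):
    """Read F(n) from the top-right entry of [1,1;1,0]^n."""
    return mat_pow(((1, 1), (1, 0)), n)[0][1]

print([fib_by_power(i) for i in range(7)])
```

The loop runs once per binary digit of n, which is where the logarithm comes from.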
Of course, the first rule of performance tuning is to carefully test any proposed improvement. The following code gives an extensible way to run simple timing tests, in a manner analogous to the Python timeit method, whose name it borrows.
# Based on an example from the Julia microbenchmark suite.
function timeit(func, argument, reps)
  times = zeros(reps, 1);
  for i=1:reps
    tic(); func(argument); times(i) = toc();
  end
  times = sort(times);
  fprintf ('%s\tmedian=%.3fms mean=%.3fms total=%.3fms\n',func2str(func), median(times)*1000,
           mean(times)*1000, sum(times)*1000);
endfunction
What are the new features of Octave used in this sample code?
- tic, toc, from GOBG8, GO 36.1
- Function Handles
- what else?
We can either use timeit from the Octave command line, or build a little utility function like
function bench
  timeit(@fib, 42, 100000)
  timeit(@fibmat, 42, 100000)
endfunction
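For comparison, the same timing idea can be sketched in Python. This is a hand-rolled analogue of the Octave function above, not the standard timeit module; the name time_calls and the returned dictionary are choices made for this example:

```python
import statistics
import time

def time_calls(func, argument, reps):
    """Call func(argument) reps times; return timing summary in seconds."""
    times = []
    for _ in range(reps):
        start = time.perf_counter()
        func(argument)
        times.append(time.perf_counter() - start)
    return {"median": statistics.median(times),
            "mean": statistics.mean(times),
            "total": sum(times)}

summary = time_calls(sorted, list(range(1000)), 100)
print("median={:.3f}ms mean={:.3f}ms total={:.3f}ms".format(
    summary["median"] * 1000, summary["mean"] * 1000, summary["total"] * 1000))
```

Passing the function itself as an argument plays the role of the Octave function handle @fib.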
Overview
This assignment is based on the material covered in Lab 14 and (particularly) Lab 15.
The goal of the assignment is to develop a simple query language that lets the user select rows and columns from a CSV file, in effect treating it like a database.
General Instructions
- Every non-test function should have a docstring
- Feel free to add docstrings for tests if you think they need explanation
- Use list and dictionary comprehensions as much as reasonable.
Your code should pass all of the given tests, plus some of your own with different data. If you want, you can use some of the sample data from the US Government College Scorecard. I've selected some of the data into smaller files:
- 2013-1000.csv.gz
- 2014-1000.csv.gz
- 2014-100.csv.gz
- 2013-100.csv.gz
- 2015-1000.csv.gz
- 2015-100.csv.gz
- A marking rubric is available.
Reading CSV Files
We will use the builtin Python CSV module to read CSV files.
def read_csv(filename):
    '''Read a CSV file, return list of rows'''
    import csv
    with open(filename,'rt',newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        return [ row for row in reader ]
Save the following as "~/fcshome/assignments/A5/test1.csv"; we will use it in several tests. You should also construct your own example CSV files and corresponding tests.
name, age, eye colour
Bob, 5, blue
Mary, 27, brown
Vij, 54, green
Here is a test to give you the idea of the returned data structure from read_csv.
def test_read_csv():
    assert read_csv('test1.csv') == [['name', 'age', 'eye colour'],
                                     ['Bob', '5', 'blue'],
                                     ['Mary', '27', 'brown'],
                                     ['Vij', '54', 'green']]
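The skipinitialspace=True option matters for files like test1.csv, where each comma is followed by a space. The sketch below shows the difference using an in-memory file (io.StringIO) rather than a file on disk; the variable names are chosen for this example only:

```python
import csv
import io

text = "name, age, eye colour\nBob, 5, blue\n"

# default behaviour: the space after each comma is kept in the field
default_rows = list(csv.reader(io.StringIO(text, newline='')))

# skipinitialspace=True drops whitespace immediately after the delimiter
stripped_rows = list(csv.reader(io.StringIO(text, newline=''),
                                skipinitialspace=True))

print(default_rows)
print(stripped_rows)
```

Note that every field comes back as a string, including the ages.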
Parsing Headers
The first row in most CSV files consists of column labels. We will use this to help the user access columns by name rather than by counting columns.
Write a function header_map that builds a dictionary from labels to column numbers.
table = read_csv('test1.csv')
def test_header_map_1():
    hmap = header_map(table[0])
    assert hmap == { 'name': 0, 'age': 1, 'eye colour': 2 }
Selecting columns
Use your implementation of header_map to write a function select that creates a new table with some of the columns of the given table.
def test_select_1():
    assert select(table,{'name','eye colour'}) == [['name', 'eye colour'],
                                                   ['Bob', 'blue'],
                                                   ['Mary', 'brown'],
                                                   ['Vij', 'green']]
Transforming rows into dictionaries
Sometimes it's more convenient to work with rows of the table as dictionaries, rather than passing around the map of column labels everywhere. Write a function row2dict that takes the output from header_map, and a row, and returns a dictionary representing that row (column order is lost here, but that will be ok in our application).
def test_row2dict():
    hmap = header_map(table[0])
    assert row2dict(hmap, table[1]) == {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
Matching rows
We are going to write a simple query language where each query is a 3-tuple (left, op, right), and op is one of ==, <=, and >=. In the initial version, left and right are numbers or strings. Strings are interpreted as follows: if they are column labels, retrieve the value in that column; otherwise treat it as a literal string. With this in mind, write a function check_row that takes a row in dictionary form, and checks if it matches a query tuple.
def test_check_row():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, ('age', '==', 5))
    assert not check_row(row, ('eye colour', '==', 5))
    assert check_row(row, ('eye colour', '==', 'blue'))
    assert check_row(row, ('age', '>=', 4))
    assert check_row(row, ('age', '<=', 1000))
Extending the query language
Extend check_row so that it supports operations AND and OR. For these cases both left and right operands must be queries. Hint: this should only be a few more lines of code.
def test_check_row_logical():
    row = {'name': 'Bob', 'age': '5', 'eye colour': 'blue'}
    assert check_row(row, (('age', '==', 5),'OR',('eye colour', '==', 5)))
    assert not check_row(row, (('age', '==', 5),'AND',('eye colour', '==', 5)))
Filtering tables
Use your previously developed functions to implement a function filter_table that selects certain rows of the table according to a query.
def test_filter_table1():
    assert filter_table(table,('age', '>=', 0)) == [['name', 'age', 'eye colour'],
                                                    ['Bob', '5', 'blue'],
                                                    ['Mary', '27', 'brown'],
                                                    ['Vij', '54', 'green']]
    assert filter_table(table,('age', '<=', 27)) == [['name', 'age', 'eye colour'],
                                                     ['Bob', '5', 'blue'],
                                                     ['Mary', '27', 'brown']]
    assert filter_table(table,('eye colour', '==', 'brown')) == [['name', 'age', 'eye colour'],
                                                                 ['Mary', '27', 'brown']]
    assert filter_table(table,('name', '==', 'Vij')) == [['name', 'age', 'eye colour'],
                                                         ['Vij', '54', 'green']]
def test_filter_table2():
    assert filter_table(table,(('age', '>=', 0),'AND',('age','>=','27'))) == [['name', 'age', 'eye colour'],
                                                                              ['Mary', '27', 'brown'],
                                                                              ['Vij', '54', 'green']]
    assert filter_table(table,(('age', '<=', 27),'AND',('age','>=','27'))) == [['name', 'age', 'eye colour'],
                                                                               ['Mary', '27', 'brown']]
    assert filter_table(table,(('eye colour', '==', 'brown'), 'OR', ('name','==','Vij'))) == [['name', 'age', 'eye colour'],
                                                                                              ['Mary', '27', 'brown'],
                                                                                              ['Vij', '54', 'green']]
Before the lab
Background
Discussion
- Time
- 5 minutes
- Any questions about A5?
Regular Expressions
- Time
- 15 minutes
- Activity
- Demo / Group discussion
To get familiar with regular expressions, we follow the street address Case Study.
Try the following evaluations in a python REPL.
>>> '100 NORTH MAIN ROAD'.replace('ROAD', 'RD.')
>>> s = '100 NORTH BROAD ROAD'
>>> s.replace('ROAD', 'RD.')
# oops
>>> s[:-4] + s[-4:].replace('ROAD', 'RD.')
# ugh, that code
>>> import re
>>> re.sub('ROAD$', 'RD.', s)
# what dark magic is this?
Regular expressions are a domain specific language that allows us to specify complicated string operations. In practice, the simple $ we used above is not enough.
>>> s = '100 BROAD'
>>> re.sub('ROAD$', 'RD.', s)
# New regex feature \b.
>>> re.sub('\\bROAD$', 'RD.', s)
# Raw strings reduce \ overload
>>> re.sub(r'\bROAD$', 'RD.', s)
# Our new regex is too "narrow"
>>> s = '100 BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD$', 'RD.', s)
>>> re.sub(r'\bROAD\b', 'RD.', s)
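For reference, the REPL session above can be replayed as a script. The variable names here are invented for the example, and the comments summarize what each step demonstrates:

```python
import re

s = '100 NORTH BROAD ROAD'
naive = s.replace('ROAD', 'RD.')         # also rewrites the ROAD inside BROAD
anchored = re.sub('ROAD$', 'RD.', s)     # $ only matches at the end of the string

short = '100 BROAD'
too_eager = re.sub('ROAD$', 'RD.', short)       # BROAD still ends in ROAD
whole_word = re.sub(r'\bROAD$', 'RD.', short)   # \b requires a word boundary

apartment = '100 BROAD ROAD APT. 3'
too_narrow = re.sub(r'\bROAD$', 'RD.', apartment)    # ROAD is not at the end
either_side = re.sub(r'\bROAD\b', 'RD.', apartment)  # whole word, anywhere

for value in (naive, anchored, too_eager, whole_word, too_narrow, either_side):
    print(value)
```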
In the next part we will need to use a few fancier features.
import re
rex=re.compile(r'([^0-9]+)')
for match in rex.findall('113abba999bjorn78910101benny888331dancing34234queen'):
    print(match)
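The two features used here are the negated character class [^0-9] (any character that is not a digit) and the + repetition (one or more). A short sketch comparing it with the complementary pattern, using variable names chosen for this example:

```python
import re

text = '113abba999bjorn78910101benny888331dancing34234queen'

# one or more characters that are NOT digits
words = re.findall(r'([^0-9]+)', text)
# one or more characters that ARE digits
numbers = re.findall(r'[0-9]+', text)

print(words)
print(numbers)
```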
Stripping Quotes
- Time
- 25 minutes
- Activity
- Individual
parse_csv.py:
def split_csv(string):
    return [ row.split(",") for row in string.splitlines() ]
from parse_csv import split_csv
test_string_1 = """OPEID,INSTNM,TUITIONFEE_OUT
02503400,Amridge University,6900
00100700,Central Alabama Community College,7770
01218200,Chattahoochee Valley Community College,7830
00101500,Enterprise State Community College,7770
00106000,James H Faulkner State Community College,7770
00101700,Gadsden State Community College,5976
00101800,George C Wallace State Community College-Dothan,7710
"""
table1 = [['OPEID', 'INSTNM', 'TUITIONFEE_OUT'],
['02503400', 'Amridge University', '6900'],
['00100700', 'Central Alabama Community College', '7770'],
['01218200', 'Chattahoochee Valley Community College', '7830'],
['00101500', 'Enterprise State Community College', '7770'],
['00106000', 'James H Faulkner State Community College', '7770'],
['00101700', 'Gadsden State Community College', '5976'],
['00101800', 'George C Wallace State Community College-Dothan', '7710']]
def test_split_1():
    assert split_csv(test_string_1) == table1
In general entries of CSV files can have quotes, but these are not considered part of the content. In particular a correct version of split_csv should pass the following test.
test_string_2 = '''OPEID,INSTNM,TUITIONFEE_OUT
02503400,"Amridge University",6900
00100700,"Central Alabama Community College",7770
01218200,"Chattahoochee Valley Community College",7830
00101500,"Enterprise State Community College",7770
00106000,"James H Faulkner State Community College",7770
00101700,"Gadsden State Community College",5976
00101800,"George C Wallace State Community College-Dothan",7710
'''
def test_split_2():
    assert split_csv(test_string_2) == table1
Ours doesn't yet, so let's try to fix that using regular expressions.
- Fill in the regex in strip_quotes so that it passes the following test
def test_strip_quotes():
    assert strip_quotes('"hello"') == 'hello'
    assert strip_quotes('hello') == 'hello'
- Here is a skeleton for strip_quotes:
def strip_quotes(string):
    strip_regex = re.compile( )
    search = strip_regex.search(string)
    if search:
        return search.group(1)
    else:
        return None
- You'll want to refer to regular expression features
- The use of group(1) means your regex solution should have exactly one set of (…) with a regex matching the non-quoted part.
- You can say something is optional by using …?, any number of repetitions with …*
- A character not in a given set can be matched with [^…]
- Once you have a working strip_quotes, use it in parse_csv in order to make the test above pass.
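The three regex features mentioned in the hints can be tried out separately before combining them. The patterns below are unrelated toy examples (spelling variants and word extraction), chosen for this sketch so as not to give away the exercise:

```python
import re

# ? makes the preceding item optional: matches both spellings
optional = re.compile(r'colou?r')
# * allows any number of repetitions, including zero
repeated = re.compile(r'go*gle')
# [^ ] matches one character that is not a space;
# the parentheses capture the matched text as group 1
first_word = re.compile(r'([^ ]*)')

spellings = [bool(optional.search(w)) for w in ('color', 'colour', 'colr')]
zero_os = repeated.search('ggle').group(0)
many_os = repeated.search('gooooogle').group(0)
word = first_word.search('first second').group(1)

print(spellings, zero_os, many_os, word)
```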
Handling quoted commas
- Time
- 30 minutes
- Activity
- Individual
It turns out one of the main reasons for supporting quotes is to handle quoted commas.
The function split_row_3 is intended to split rows with exactly 3 columns.
def test_split_row_3():
    assert split_row_3('00101800,"George C Wallace State Community College, Dothan",7710') == \
        ['00101800', 'George C Wallace State Community College, Dothan', '7710']
Read the discussion on verbose regular expressions.
Complete the definition of split_row_3. You'll want to figure out a regular expression that matches either a quoted or an unquoted column, and then repeat that 3 times. "Or" in regular expressions is implemented with |
You will want to use [^…] once for each case; in one case for excluding " and in the other for excluding ,.
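As a warm-up for verbose regular expressions and alternation, here is a sketch on an unrelated toy format, key=value pairs where the value is either a number or a lowercase word. The format and the name pair_regex are invented for this example:

```python
import re

pair_regex = re.compile(
    r'''^                # start of string
    ([a-z]+)             # key: lowercase letters
    =
    ([0-9]+|[a-z]+)      # value: either digits or a lowercase word
    $''', re.VERBOSE)

numeric = pair_regex.search('age=27')
wordy = pair_regex.search('eye=blue')
mixed = pair_regex.search('age=2x')   # neither alternative matches "2x"

print(numeric.groups(), wordy.groups(), mixed)
```

With re.VERBOSE, whitespace and # comments inside the pattern are ignored, which is what allows the pattern to be laid out one piece per line.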
def split_row_3(string):
    split_regex=re.compile(
        r'''^ # start
        (" "| ) # column
        ,
        (" "| ) # column
        ,
        (" "| ) # column
        $''', re.VERBOSE)
    search = split_regex.search(string)
    if search:
        return [ strip_quotes(col) for col in search.groups() ]
    else:
        return None
- Use your split_row_3 function in split_csv to pass the following test
test_string_3 = '''OPEID,INSTNM,TUITIONFEE_OUT
02503400,"Amridge University",6900
00100700,"Central Alabama Community College",7770
01218200,"Chattahoochee Valley Community College",7830
00101500,"Enterprise State Community College",7770
00106000,"James H Faulkner State Community College",7770
00101700,"Gadsden State Community College",5976
00101800,"George C Wallace State Community College, Dothan",7710
'''
table2 = [['OPEID', 'INSTNM', 'TUITIONFEE_OUT'],
['02503400', 'Amridge University', '6900'],
['00100700', 'Central Alabama Community College', '7770'],
['01218200', 'Chattahoochee Valley Community College', '7830'],
['00101500', 'Enterprise State Community College', '7770'],
['00106000', 'James H Faulkner State Community College', '7770'],
['00101700', 'Gadsden State Community College', '5976'],
['00101800', 'George C Wallace State Community College, Dothan', '7710']]
def test_split_3():
    '''Check handling of quoted commas'''
    assert split_csv(test_string_3) == table2
Parsing more columns
- Time
- 20 minutes
- Activity
- Individual
Use your column matching regex, along with the findall method to match any number of columns. Call your new function split_row.
def test_split_row():
    assert split_row('00101800,"George C Wallace State Community College, Dothan",7710,",,,"') == \
        ['00101800', 'George C Wallace State Community College, Dothan', '7710',',,,']
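One detail of findall worth knowing before starting: what it returns depends on how many groups the pattern contains. The sketch below uses a toy pattern (digits or letters) chosen for this example, not the column regex from the lab:

```python
import re

text = '12ab34'

# two groups: findall returns one tuple per match,
# with an empty string for the alternative that did not match
two_groups = re.findall(r'([0-9]+)|([a-z]+)', text)

# one group around the whole alternation: findall returns plain strings
one_group = re.findall(r'([0-9]+|[a-z]+)', text)

print(two_groups)
print(one_group)
```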
Use your new function in place of split_row_3 so that the following test (and all previous tests) pass
test_string_4=\
'''OPEID,INSTNM,PCIP52,TUITIONFEE_OUT
00103800,Snead State Community College,0.0811,7830
00573400,H Councill Trenholm State Community College,0.0338,7524
00573300,"Bevill, State, Community College",0.0451,7800
00884300,Alaska Bible College,0,9300
00107100,Arizona Western College,0.0425,9530
00107200,"Cochise County Community College, District",0.0169,6000
'''
table3=[
['OPEID', 'INSTNM', 'PCIP52', 'TUITIONFEE_OUT'],
['00103800', 'Snead State Community College', '0.0811', '7830'],
['00573400', 'H Councill Trenholm State Community College', '0.0338', '7524'],
['00573300', 'Bevill, State, Community College', '0.0451', '7800'],
['00884300', 'Alaska Bible College', '0', '9300'],
['00107100', 'Arizona Western College', '0.0425', '9530'],
['00107200', 'Cochise County Community College, District', '0.0169', '6000']]
def test_split_4():
    assert split_csv(test_string_4) == table3
Before the Lab
Background
Discussion
- Time
- 10 minutes
- Activity
- Discussion
Globbing and List comprehensions
- Time
- 20 minutes
- Activity
- individual
List comprehensions can be seen as a special kind of for loop. Construct an equivalent list comprehension to the given for loop.
#!/usr/bin/python3
import glob
import os
new_dir = os.path.expanduser("~/fcshome/cs2613/labs/test")
python_files_for = []
for file in glob.glob("*.py"):
    python_files_for.append(os.path.join(new_dir,file))
python_files_comp = ____________________________________________________________
Here is a test to make sure your two constructions are really equivalent; the use of sorted is probably unneeded here, but this way we don't need to depend on the order returned by glob being consistent. Put the following in ~/fcshome/cs2613/labs/L15/test_globex.py.
#!/usr/bin/python3
import globex
def test_for():
    assert sorted(globex.python_files_for) == sorted(globex.python_files_comp)
In fact list comprehensions are really closer to a convenient syntax for map, which you may remember from Racket and JavaScript. Python also has map and lambda, although these are considered less idiomatic than using list comprehensions. Fill in the body of the lambda (should be similar or identical to your list comprehension expression).
python_files_map = map(lambda file: __________________________, glob.glob("*.py"))
The following test should pass
def test_map():
    assert sorted(globex.python_files_comp) == sorted(globex.python_files_map)
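The relationship between the three styles can be seen on a toy list. The list of words and the variable names below are made up for this sketch:

```python
words = ['abba', 'bjorn', 'benny']

# explicit for loop
upper_for = []
for word in words:
    upper_for.append(word.upper())

# equivalent list comprehension
upper_comp = [word.upper() for word in words]

# map returns a lazy iterator, so wrap it in list() to compare directly
upper_map = list(map(lambda word: word.upper(), words))

print(upper_for == upper_comp == upper_map)
```

Because map is lazy, its result can only be traversed once; sorted in the test above consumes it.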
Dictionary Comprehensions
- Time
- 20 minutes
- Activity
- Individual
Dictionary comprehensions are quite similar to list comprehensions, except that they use
{ key: val for ...}
Create a file ~/fcshome/cs2613/labs/L15/list2dict.py with a function list2dict that transforms a list into a dictionary indexed by integers. Your function should use a dictionary comprehension and pass the following tests. One approach uses the Python builtin range. It may help to write it first using a for loop.
#!/usr/bin/python3
from list2dict import list2dict
def test_empty():
    assert list2dict([]) == {}
def test_abc():
    dictionary=list2dict(["a", "b", "c"])
    assert dictionary == {1: 'a', 2: 'b', 3: 'c'}
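To see the { key: val for ...} syntax in action on a different problem (the word list here is invented for illustration), the following builds one dictionary from a list and a second one by swapping the keys and values of the first:

```python
words = ['go', 'rust', 'python']

# map each word to its length
lengths = {word: len(word) for word in words}

# build a new dictionary from an existing one, swapping keys and values
by_length = {length: word for word, length in lengths.items()}

print(lengths)
print(by_length)
```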
Filtered List Comprehensions
- Time
- 25 minutes
- Activity
- individual
Looking at the discussion of list comprehensions, we can see that it is possible to filter the list of values used in the list comprehension with an if clause. Use this syntax to re-implement the function drop-divisible from A1.
Notice that the implementation of sieve_with is not suitable for a list comprehension because of the update of lst on every iteration (in Racket this could be done without mutation by tail recursion or for/fold). Python does have a reduce function (in the functools module), but most Python programmers will prefer the 2 line loop given here.
#!/usr/bin/python3
from math import sqrt,ceil
def drop_divisible(n,lst):
    return __________________________________
def sieve_with(candidates, lst):
    for c in candidates:
        lst=drop_divisible(c,lst)
    return lst
def sieve(n):
    return sieve_with(range(2,ceil(sqrt(n))+1), range(2,n))
Your implementation should pass the following tests.
from sieve import drop_divisible, sieve
def test_drop_divisible():
    assert drop_divisible(3, [2, 3, 4, 5, 6, 7, 8, 9, 10]) == [2, 3, 4, 5, 7, 8, 10]
def test_sieve():
    assert sieve(10)== [2, 3, 5, 7]
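For reference, here is the general shape of a filtered comprehension and of functools.reduce on unrelated toy data. The conditions and variable names are chosen for this sketch and are not the drop_divisible condition:

```python
from functools import reduce

numbers = list(range(10))

# keep only the values for which the if clause is true
evens = [n for n in numbers if n % 2 == 0]
long_words = [w for w in ['a', 'list', 'of', 'words'] if len(w) > 2]

# reduce threads an accumulator through the list, like a fold
total = reduce(lambda acc, n: acc + n, evens, 0)

print(evens, long_words, total)
```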
Using format
- Time
- 25 minutes
- Activity
- individual
Like JavaScript, Python supports a simple way of constructing output using the overloaded operator +. Python also supports a more powerful format method (similar to Racket's format function) for combining values into a formatted output string. Use the format method and a list comprehension to write an equivalent value into strings_format.
import os,glob
strings_plus = []
for p in glob.glob("*.py"):
    size=os.stat(p).st_size
    strings_plus.append(p + "\t" + str(size))
strings_format = __________________________________________________
Your code should pass the following test.
import formatex
def test_equality():
    assert sorted(formatex.strings_plus) == sorted(formatex.strings_format)
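A few examples of the format method on made-up values, as a sketch of the syntax: positional placeholders, named placeholders, and format specifications after a colon.

```python
# {} placeholders are filled in order; non-string values are converted
sentence = "{} is {} years old".format('Bob', 5)

# named placeholders, with a thousands separator for the number
labelled = "{name}: {size:,}".format(name='a.py', size=1234567)

# width 8, right aligned, 2 digits after the decimal point
padded = "{:>8.2f}".format(3.14159)

print(sentence)
print(labelled)
print(repr(padded))
```

Unlike +, format converts numbers to strings for us, so no explicit str() call is needed.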
Before the Lab
Discussion
- Time
- 5 min
- Activity
- Group Discussion
- Questions about the quiz?
- Questions about A4
A first example
- Time
- 10 min
- Activity
- Demo
Background for this section is DiP3 1.1
Download humansize.py and save it as ~/fcshome/cs2613/labs/L14/humansize.py
Run it from the command line
$ python humansize.py
Run the debugger
$ pudb3 humansize.py
Use Ctrl-X to switch to the bottom debugger window (Command line) and run
>>> from humansize import approximate_size
>>> approximate_size(10000)
>>> approximate_size(10000,False)
Repeat the previous step by running in a terminal
$ python
and then typing in the same lines.
Pytest
- Time
- 20 min
- Activity
- individual
In this part of the course we will be using pytest to write unit tests.
Download an initial test file and save it as ~/fcshome/cs2613/labs/L14/test_humansize.py
Open a terminal and run
$ pytest-3 test_humansize.py
Convert each working example in DiP3 1.2.1 into a test.
Test coverage reports can be generated with
$ pytest-3 --cov=mymodule --cov-report=term-missing
where mymodule is the name of the module in question. Most likely you will need to add more tests for complete coverage. You can do this after the lab.
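As a reminder of the general shape of a pytest test file, here is a sketch with a hypothetical function celsius_to_fahrenheit (not part of humansize.py). pytest collects functions whose names start with test_ and reports a failure when a plain assert is false:

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

def test_freezing():
    assert celsius_to_fahrenheit(0) == 32

def test_boiling():
    assert celsius_to_fahrenheit(100) == 212
```

In a real project the function would live in its own module and the test file would import it, as test_humansize.py does.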
Modules
- Time
- 15 min
- Activity
- individual
Create a new file ~/fcshome/cs2613/labs/L14/client.py
Import the module humansize in client.py
Define a new function approximate_size that calls the function approximate_size from the humansize module, but has a default of False for the parameter a_kilobyte_is_1024_bytes
Observe that the code in humansize.py guarded by if __name__ == '__main__' does not run when imported into client.py (you might see echoes of Racket submodules from Lab 5). Create a similar block in client.py, and run it from the command line.
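The guard works because Python sets __name__ to '__main__' only in the file that is run directly; in an imported module it is the module's name. A minimal sketch, with a greet function invented for the example:

```python
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name

if __name__ == '__main__':
    # runs only when this file is executed directly, not when imported
    print(greet("world"))
```

Saved as a file and run with python, this prints the greeting; imported from another module, it only defines greet.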
More testing, docstrings
- Time
- 15 min
- Activity
- individual
Create test_client.py by copying and modifying (if necessary) test_humansize.py
From the command line, run
$ pytest-3
Add a test to test_client.py to ensure that your new function has a docstring
Make sure your new function has a docstring (i.e. that the test you just added passes) and that it makes sense.
Indentation
- Time
- 15 min
- Activity
- individual
One initially surprising aspect of Python is its use of indentation to define blocks.
Start a new file ~/fcshome/cs2613/labs/L14/fizzbuzz.py. Add the following code, and fix the indentation so that it runs
for i in range(1,101):
if (i%3 == 0 and i%5 == 0):
print("FizzBuzz")
elif (i%5==0):
print("Buzz")
else:
print(i)
Exceptions
- Time
- 20 min
- Activity
- individual
Python throws exceptions when dividing by zero. Suppose we want to return NaN, JavaScript style.
Start with the following code as ~/fcshome/cs2613/labs/L14/divisive.py.
def fraction(a,b):
    return a/b
- Add a try/except block to fraction so that the following test-suite (saved as test_divisive.py) passes
from divisive import fraction
import math
def test_fraction_int():
    assert fraction(4,2) == 2
def test_fraction_NaN():
    assert math.isnan(fraction(4,0))
Hint: you can use float('nan') or math.nan to generate a NaN
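The general shape of a try/except block, shown on a different exception so as not to give away the exercise (the parse_int helper is invented for this sketch):

```python
def parse_int(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings like "forty-two"
        return None

print(parse_int("42"), parse_int("forty-two"))
```

Naming the exception type after except keeps unrelated errors from being silently swallowed.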
On your own
Look up the definition of the FizzBuzz problem and add the missing case for "Fizz" to the program discussed above.
Add enough tests to get complete test coverage for humansize.py
Introduction
Debian is currently collecting buildinfo files, but they are not very conveniently searchable. Eventually Chris Lamb's buildinfo.debian.net may solve this problem, but in the meantime, I decided to see how practical indexing the full set of buildinfo files is with sqlite.
Hack
First you need a copy of the buildinfo files. This is currently about 2.6G, and unfortunately you need to be a debian developer to fetch it.
$ rsync -avz mirror.ftp-master.debian.org:/srv/ftp-master.debian.org/buildinfo .
Indexing takes about 15 minutes on my 5 year old machine (with an SSD). If you index all dependencies, you get a database of about 4G, probably because of my natural genius for database design. Restricting to debhelper and dh-elpa, it's about 17M.
$ python3 index.py
You need at least python3-debian installed.
Now you can do queries like
$ sqlite3 depends.sqlite "select * from depends where depend='dh-elpa' and depend_version<='0106'"
where 0106 is some adhoc normalization of 1.6
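The index.py script is not shown here, so the following is only a guess at what such a normalization might look like: zero-padding each dot-separated component so that plain string comparison (as in the SQL query above) agrees with numeric ordering. The function name and the fixed width of 2 are assumptions made for this sketch, and it ignores epochs, Debian revisions and non-numeric components entirely.

```python
def normalize_version(version, width=2):
    """Zero-pad each dot-separated numeric component to a fixed width.

    Hypothetical helper; not taken from index.py.
    """
    return "".join(part.zfill(width) for part in version.split("."))

print(normalize_version("1.6"))
print(normalize_version("1.10") > normalize_version("1.6"))
```

Without the padding, "1.10" sorts before "1.6" as a string, which is presumably the kind of problem the normalization is working around.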
Conclusions
The version number hackery is pretty fragile, but good enough for my current purposes. A more serious limitation is that I don't currently have a nice (and you see how generous my definition of nice is) way of limiting to builds currently available e.g. in Debian unstable.
I could not find any nice examples of using the vobject class to filter an icalendar file. Here is what I got to work. I'm sure there is a nicer way. This strips all of the valarm subevents (reminders) from an icalendar file.
import vobject
import sys
cal=vobject.readOne(sys.stdin)
for ev in cal.vevent_list:
    # dict.has_key() no longer exists in Python 3; use "in" instead
    if 'valarm' in ev.contents:
        del ev.contents['valarm']
print(cal.serialize())