## Introduction

See the general rules about assignments in this course.

**DUE**: 2018-02-28, 09:15 in the assignment box.

## Questions

### 1 Huffman coding

From an exercise in DPV.

Consider a discrete probability distribution on symbols `1`

to `n`

with probabilities `p₁ … pₙ`

where each `pᵢ`

is an integer power of
`1/2`

. Suppose a long sequence `S`

of samples is drawn from the
distribution such that symbol `i`

occurs `pᵢ|S|`

times. Prove that if
a Huffman code based on the probabilities `pᵢ`

is used, the total
number of bits to encode S is

```
|S|∑ { pᵢ log (1/pᵢ) | 1 ≤ i ≤ n }
```

**Hint** Show that the two smallest probabilities must be the same; or
for part marks assume this holds.

### 2 MST

Prove that Kruskals algorithm (below) is correct by using the
*Cut Property* of minimum spanning trees.

```
∀u ∈ V makeset(u)
X ← {}
sort edges by weight
for (u, v) ∈ E do
if find(u) ≠ find(v) then
X ← X ∪ {(u, v)}
union(u,v)
end if
end for
```

### 3 Disjoint-Sets

(Ex 5.12 in DPV) Suppose you implement the disjoint-sets data
structure using union-by-rank but not path compression. Give a
sequence of m union and find operations on n elements that take
`Ω(m log n)`

time.