4 Structures

8.1

4 Structures

\(\newcommand{\la}{\lambda} \newcommand{\Ga}{\Gamma} \newcommand{\ra}{\rightarrow} \newcommand{\Ra}{\Rightarrow} \newcommand{\La}{\Leftarrow} \newcommand{\tl}{\triangleleft} \newcommand{\ir}[3]{\displaystyle\frac{#2}{#3}~{\textstyle #1}}\)

Racket provides a general mechanism called structures to create new datatypes. In the teaching languages, structures primarily permit the "bundling" of several values into one. In many situations, data is naturally grouped, and most programming languages provide some mechanism to do this. (Structures have additional features in full Racket.) Although this may seem like a small addition, it considerably increases the expressivity and flexibility of our programs.

4.1 Syntax and semantics

As a modest initial example, consider the representation of points on the two-dimensional plane. Mathematically, these are written with two numerical coordinates, separated by a comma and surrounded by parentheses, as in the example \((3,4)\). In Racket, we can create a new type of structure to hold two coordinates with the following expression.

(define-struct point (x y))

This does not create any structures directly, but sets up the ability to create point structures with two fields named x and y. In effect, it creates the support functions for a new data type. Those functions can then be used to create and work with instances of the new type.

The define-struct expression above provides functions to create a point structure holding two values, to test whether something is a point structure, and to access the values within a point structure.

We create a point with the constructor function make-point (defined for us by the define-struct expression above) applied to two numeric values.

(define p (make-point 3 4))

We access a value bundled in a point with the accessor functions, again defined for us by define-struct. There is one function for each field. The name of the function is derived from the name of the structure and the name of the field, with a hyphen in between.

(point-x p) \(\Rightarrow\) 3 (point-y p) \(\Rightarrow\) "4"

The type predicate point? (defined for us) tests for point-ness.

(point? p) \(\Rightarrow\) #true (point? 17) \(\Rightarrow\) #false ]

An expression such as (make-point 8 1) is considered a value. This expression will not be rewritten by the Stepper or our semantic rules. The expression (make-point (+ 4 4) (- 3 2)) would be rewritten to (eventually) yield (make-point 8 1).

Here are the rewrite rules needed, specialized to point structures. We’ll discuss the general rules shortly, but you can probably figure them out.

(point-x (make-point vx vy)) \(\Rightarrow\) vx

(point-y (make-point vx vy)) \(\Rightarrow\) vy

(point? (make-point vx vy)) \(\Rightarrow\) #true

(point? v) \(\Rightarrow\) #false for other v

Functions can consume and produce structures.

; distance: Point Point -> Number

; computes the Euclidean distance

; between p1 and p2

; Example:

(check-expect (distance (make-point 1 1)

(make-point 4 5))

(define (distance p1 p2)

(sqrt (+ (sqr (- (point-x p1) (point-x p2)))

(sqr (- (point-y p1) (point-y p2))))))

; scale: Point Number -> Point

; scales the point by the given factor

; Example:

(check-expect (scale (make-point 3 4) 0.5)

(make-point 1.5 2))

(define (scale p factor)

(make-point (* factor (point-x p))

(* factor (point-y p))))

Here is a trace of a use of scale.

(scale (make-point 3 4) 0.5)
\(\Rightarrow\) (make-point (* 0.5 (point-x (make-point 3 4))) (* 0.5 (point-y (make-point 3 4))))
\(\Rightarrow\) (make-point (* 0.5 3) (* 0.5 (point-y (make-point 3 4))))
\(\Rightarrow\) (make-point 1.5 (* 0.5 (point-y (make-point 3 4))))
\(\Rightarrow\) (make-point 1.5 (* 0.5 4))
\(\Rightarrow\) (make-point 1.5 2)

Although we have been using numbers as coordinates, make-point can accept any two Racket values. This is true of structures in general; they do not impose any restrictions on their arguments.

(distance (make-point 'Spider 'Man)

(make-point 'Peter 'Parker))

This causes a run-time error only when the subtraction function in the body of distance tries to subtract 'Peter from 'Spider.

We address this with a data definition, a comment in the program that makes clear what we consider proper.

; A Point is a (make-point n m), where n and m are Numbers.

For the very small programs we write for ourselves initially, this might suffice, but for larger programs or for code that will be used by others, we might want more safeguards against accidental or deliberate misuse. For example, we could write a "guarded" version of the make-point constructor which raises an error on misuse.

(define (g-point m n)

(cond

[(and (number? m) (number? n)) (make-point m n)]

[else (error 'g-point "Points can contain only numbers")]))

We’ll use the data definition to introduce another element of the design recipe. A template is a code skeleton based on a data definition, to aid in writing the body of functions that consume that type of data. From the data definition for points, ,we can see that if p is a (make-point n m) then n is (point-x p), and m is (point-y p). We put these into a template for functions that consume point structures.

; my-point-fn : Point -> ?

(define (my-point-fn p)

... (point-x p) ...

... (point-y p) ...)

The template is available to us as a starting point when writing specific functions. While this may seem unnecessary for points, the idea will be useful when our use of structures becomes more complex.

Let’s nail down the general semantics of structures. Consider a structure s with \(n\) fields.

(define-struct s (f1 ... fn))

This introduces a constructor function make-s, accessor functions s-f1 through s-fn, and a type predicate s?, with the following rewrite rules:

(make-s v1 ... vn) is a value.

(s-fi (make-s v1 ... vi ... vn)) \(\Rightarrow\) vi for all i.

(s? (make-s v1 ... vn)) \(\Rightarrow\) true

(s? v) \(\Rightarrow\) false for other v.

Exercise 8: Using the clock struct defined by:

(define-struct clock (hours mins secs))

to represent 24-hour time (midnight is 00:00:00, one second before is 23:59:59), write the function tick that consumes a clock struct c and produces the clock struct representing the time one second after the time represented by c.

As an example, (tick (make-clock 23 59 59)) (representing the time 23:59:59) should evaluate to (make-clock 0 0 0) (representing the time 00:00:00).

Strive to make your solution brief and readable. There are many different solutions to this question, so as a challenge, try to think of two different ways to solve it. \(\blacksquare\)

Exercise 9: The set of numbers \(a+c\sqrt{b}\), where \(b\) is a natural number greater than 1 without repeated prime factors, and \(a\) and \(c\) are rational numbers, forms a quadratic field. Such numbers can be represented exactly in Racket (without using inexact numbers or the sqrt function) as (make-qfe a c b) using the following definition:

(define-struct qfe (rat irrat base))

Write the function add which consumes two qfe structs and produces the qfe struct representing their sum. Also write sub, mult, and div, each of which consume two qfe structs and produce one qfe struct, and which implement subtraction, multiplication, and division, respectively.

As an example, the number \(1-\sqrt{5}\) is represented by (make-qfe 1 -1 5), and the result of multiplying this number by itself, \(6-2\sqrt{5}\), is represented by (make-qfe 6 -2 5). Thus (mult (make-qfe 1 -1 5) (make-qfe 1 -1 5)) should produce (make-qfe 6 -2 5).

These operations require the bases of the two operands to be the same. Your code should signal an error using the predefined error function, which you can read about in the documentation for Beginning Student (in the "Misc" section). Similarly, your code should test for and signal an error in the case of attempted division by zero. You do not need to check that the base has no repeated prime factors (but this is an interesting task to think about). \(\blacksquare\)

4.2 Simulating natural numbers

Natural numbers are provided by Racket as part of the built-in number system. But if they were not provided, we could simulate them using structures. We will spend some time exploring this idea. This exercise gives us insight into the foundations of mathematics and into the close relationship between mathematics and computer science.

We will choose to use the following structure definitions:

(define-struct Z ())

(define-struct S (pred))

The Z structure has no fields. We can make one with the expression (make-Z). The S structure has one field. We will use that field to hold either a Z structure or another S structure. It may not have occurred to you to put a structure inside another of the same type, but this idea is key to being able to handle unbounded data.

You may be used to the natural numbers starting at 1. But in computer science, and in formal logic and the foundations of mathematics, the natural numbers start at 0. If we don’t make this choice, we have to add special cases for 0, and everything gets messier.

We will choose the following interpretations. The Racket expression (make-Z) represents the natural number \(0\). What an S structure represents is a bit more complicated. "S" in this case stands for "successor". The successor of the natural number \(0\) is the natural number \(1\).

The Racket expression (make-S (make-Z)) is our representation of the successor of \(0\), namely \(1\). Similarly, (make-S (make-S (make-Z))) represents \(2\), because \(2\) is the successor of \(1\).

In general, if e is the representation of the natural number \(n\), then (make-S e) is the representation of the successor of \(n\), namely the natural number \(n+1\).

We will call our representation of natural numbers Nat.

Data definition: a Nat is either (make-Z) or it is (make-S p), where p is a Nat.

This data definition is different from the one we had for Point. It is self-referential: it contains a reference to the thing being defined. In computer science, we call this a recursive definition. It is not a circular definition, just as our grammar rule for expressions was not circular. Part of the definition, referring to (make-Z), is not self-referential. The self-referential part refers to how to create a new Nat value from an already-justified Nat value.

The predecessor of a natural number \(n\) is \(n-1\). Zero has no predecessor. How might we write a Racket function to compute the predecessor of a Nat?

The contract will be

; predecessor : Nat -> Nat

Since a Nat is either a Z or an S structure, this suggests the following skeleton:

(define (pred nat)

(cond

[(Z? nat) ...]

[(S? nat) ...]))

Filling in the ellipses is straightforward.

(define (pred nat)

(cond

[(Z? nat) (error "can't apply pred to Z")]

[(S? nat) (S-pred nat)]))

(check-error (pred (make-Z)) "can't apply pred to Z")

(check-expect (pred (make-S (make-Z)))

(make-Z))

(check-expect (pred (make-S (make-S (make-Z))))

(make-S (make-Z)))

You can now see why pred is the name of the field in a S structure. It holds the representation of the predecessor of the number that the whole S structure represents (provided that it satisfies the definition for Nat).

The previous example suggests the following template for functions that consume a Nat. It combines the idea of the Point template (list the accessor functions) with a cond for the two cases in the Nat definition.

(define (my-nat-fn nat)

(cond

[(Z? nat) ...]

[(S? nat) ... (S-pred nat) ...]))

This will work for some functions, but the next example shows that in some cases, it would help to be a little more precise. Let’s try to work out addition for Nats.

; plus : Nat Nat -> Nat

The template only mentions one parameter. We can add a second one.

(define (plus nat1 nat2)

(cond

[(Z? nat1) ... nat2 ...]

[(S? nat1) ... (S-pred nat1) ... nat2 ...]))

Filling in the first case is easy.

(define (plus nat1 nat2)

(cond

[(Z? nat1) nat2]

[(S? nat1) ... (S-pred nat1) ... nat2 ...]))

The second case seems harder.

(define (plus nat1 nat2)

(cond

[(Z? nat1) nat2]

[(S? nat1) ... (S-pred nat1) ... nat2 ...]))

In the case where nat1 is an S structure, what can we do with its predecessor and nat2?

We can add them together and compute the successor of the result.

(define (plus nat1 nat2)

(cond

[(Z? nat1) nat2]

[(S? nat1) (make-S (plus (S-pred nat1) nat2))]))

This is a recursive function: it uses itself in its own body. Like the recursive definition of Nat, it is not circular. Traces will give us confidence that it works. Here is the trace of adding the Nat representations of zero and zero, which should result in the Nat representation of zero.

(plus (make-Z) (make-Z))
\(\Rightarrow\)

1	Introduction
2	Starting out with Racket
3	Syntax and semantics
4	Structures
5	Lists
6	Functional Abstraction
7	Efficient Representations
8	Trees
9	Generative Recursion
10	Interpreters

4.1	Syntax and semantics
4.2	Simulating natural numbers
4.3	Correctness

4 Structures

4.1 Syntax and semantics

4.2 Simulating natural numbers

4.3 Correctness