UNB/ CS/ David Bremner/ David Bremner's Blog
2D array C CGTA DAG ILP JavaScript LP ObjectContaining PostgreSQL SAT absolute value ae aljazeera alleged-humour amarok ampl android anonymous function application argyll array arrays assignment asymptotics asynchronous audio backtracking backup beamer bibutils binary file binding bipartite bisect bitmap blogs blorg brag broadcasting buildinfo business byte code caff cartesian cell array censorship cheney-mta class classification closures collector colorhug colour colour management column generation combinatorics combinators commonjs modules comprehensions const continuations convolution cpan cplusplus cs2613 cs2613 review cs3383 cs3383 lecture cs4613 curry dantzig data structure de Bruijn debian deepcopy definitions diet digikam divide and conquer djvu docstrings drracket duality duplicity dynamic memory allocation dynamic multithreaded dynamic programming ebnf emacs email encryption enumeration environments es2015 ethics eval example exceptions expressions fibonacci file find first class functions flang flow fork-join forms free free-list frog fun function functions garbage collection gdb generational generator generators geometry git git-annex glpk glpsol gmpl gpg graph graphics greedy hack haha hardware hash-table haskell health records higher-order-functions highlight horn-clause huffman ical ikiwiki image processing immutability immutable data structures include file incremental integer program intellectual property internet remembers issue tracking iterator jasmine journal json jvm knn labs lambda latex lazy lenovo lexer lexical scope life linear programming linearization linked list linux list lists literate programming logrotate longest common subsequence m4a macros makefile manners maps mark and sweep marketing match matching matrix media memoization metaevaluators methods modules mongolia mps mst multiple compilation units music mutability mutation mutator networking news nodejs modules norm notmuch numeric error object equality objects octave open-source open content opencl opencourseware optimization org-mode oz packaging parser-tools parsing pathname pdf pdftk perl photo photography pictures pim plai plai-gc2 plait planet plot pointers policy politics preprocessor privacy product programming languages properties pushmi pytest python python glob python os quilt quoting racket rackunit ragg range rant recursion regex regression regular expressions repl reshaping review rfc2822 rss sbuild scheduling scheme security separation set shlibs shortest path simplex slashdot slideshow source-highlight spam sql sqlite ssh stack-smash statistics strings struct submodule substitution superfish svn syntax syntax rule tail-recursion tail recursion teaching test test coverage testing tests threshold timer topgit topological sort tutorial type type checking type inference type rules types unification union unit-test unit tests university university computing valgrind varargin vcs-pkg vector vectorization wanderlust web applications whinge whistleblower with working-directory x61 xml xorg yak-shaving

Welcome to my blog. Have a look at the most recent posts below, or browse the tag cloud on the right. An archive of all posts is also available.


So apparently there's this pandemic thing, which means I'm teaching "Alternate Delivery" courses now. These are just like online courses, except possibly more synchronous, definitely less polished, and the tuition money doesn't go to the College of Extended Learning. I figure I'll need to manage share videos, and our learning management system, in the immortal words of Marie Kondo, does not bring me joy. This has caused me to revisit the problem of sharing large files in an ikiwiki based site (like the one you are reading).

My goto solution for large file management is git-annex. The last time I looked at this (a decade ago or so?), I was blocked by git-annex using symlinks and ikiwiki ignoring them for security related reasons. Since then two things changed which made things relatively easy.

  1. I started using the rsync_command ikiwiki option to deploy my site.

  2. git-annex went through several design iterations for allowing non-symlink access to large files.


In my ikiwiki config

    # attempt to hardlink source files? (optimisation for large files)
    hardlink => 1,

In my ikiwiki git repo

$ git annex init
$ git annex add foo.jpg
$ git commit -m'add big photo'
$ git annex adjust --unlock                 # look ikiwiki, no symlinks
$ ikiwiki --setup ~/.config/ikiwiki/client  # rebuild my local copy, for review
$ ikiwiki --setup /home/bremner/.config/ikiwiki/rsync.setup --refresh  # deploy

You can see the result at photo

Posted Sat 18 Jul 2020 09:09:00 AM Tags:

I have lately been using org-mode literate programming to generate example code and beamer slides from the same source. I hit a wall trying to re-use functions in multiple files, so I came up with the following hack. Thanks 'ngz' on #emacs and Charles Berry on the org-mode list for suggestions and discussion.

(defun db-extract-tangle-includes ()
  (goto-char (point-min))
  (let ((case-fold-search t)
        (retval nil))
    (while (re-search-forward "^#[+]TANGLE_INCLUDE:" nil t)
      (let ((element (org-element-at-point)))
        (when (eq (org-element-type element) 'keyword)
          (push (org-element-property :value element) retval))))

(defun db-ob-tangle-hook ()
  (let ((includes (db-extract-tangle-includes)))
    (mapc #'org-babel-lob-ingest includes)))

(add-hook 'org-babel-pre-tangle-hook #'db-ob-tangle-hook t)

Use involves something like the following in your org-file.

#+SETUPFILE: presentation-settings.org
#+SETUPFILE: tangle-settings.org
#+TANGLE_INCLUDE: lecture21.org
#+TITLE: GC V: Mark & Sweep with free list

For batch export with make, I do something like

%.tangle-stamp: %.org
    emacs --batch --quick  -l org  -l ${HOME}/.emacs.d/org-settings.el --eval "(org-babel-tangle-file \"$<\")"
    touch $@


I previously posted about my extremely quick-and-dirty buildinfo database using buildinfo-sqlite. This year at DebConf, I re-implimented this using PostgreSQL backend, added into some new features.

There is already buildinfo and buildinfos. I was informed I need to think up a name that clearly distinguishes from those two. Thus I give you builtin-pho.

There's a README for how to set up a local database. You'll need 12GB of disk space for the buildinfo files and another 4GB for the database (pro tip: you might want to change the location of your PostgreSQL data_directory, depending on how roomy your /var is)

Demo 1: find things build against old / buggy Build-Depends

select distinct p.source,p.version,d.version, b.path
      binary_packages p, builds b, depends d
      p.suite='sid' and b.source=p.source and
      b.arch_all and p.arch = 'all'
      and p.version = b.version
      and d.id=b.id and d.depend='dh-elpa'
      and d.version < debversion '1.16'

Demo 2: find packages in sid without buildinfo files

select distinct p.source,p.version
      binary_packages p
        select p.source,p.version
from binary_packages p, builds b
      and p.version=b.version
      and ( (b.arch_all and p.arch='all') or
            (b.arch_amd64 and p.arch='amd64') )


Work in progress by an SQL newbie.

Posted Tue 23 Jul 2019 12:00:00 AM Tags:

1 Background

Apparently motivated by recent phishing attacks against @unb.ca addresses, UNB's Integrated Technology Services unit (ITS) recently started adding banners to the body of email messages. Despite (cough) several requests, they have been unable and/or unwilling to let people opt out of this. Recently ITS has reduced the size of banner; this does not change the substance of what is discussed here. In this blog post I'll try to document some of the reasons this reduces the utility of my UNB email account.

2 What do I know about email?

I have been using email since 1985 1. I have administered my own UNIX-like systems since the mid 1990s. I am a Debian Developer 2. Debian is a mid-sized organization (there are more Debian Developers than UNB faculty members) that functions mainly via email (including discussions and a bug tracker). I maintain a mail user agent (informally, an email client) called notmuch 3. I administer my own (non-UNB) email server. I have spent many hours reading RFCs 4. In summary, my perspective might be different than an enterprise email adminstrator, but I do know something about the nuts and bolts of how email works.

3 What's wrong with a helpful message?

3.1 It's a banner ad.

I don't browse the web without an ad-blocker and I don't watch TV with advertising in it. Apparently the main source of advertising in my life is a service provided by my employer. Some readers will probably dispute my description of a warning label inserted by an email provider as "advertising". Note that is information inserted by a third party to promote their own (well intentioned) agenda, and inserted in an intentionally attention grabbing way. Advertisements from charities are still advertisements. Preventing phishing attacks is important, but so are an almost countless number of priorities of other units of the University. For better or worse those units are not so far able to insert messages into my email. As a thought experiment, imagine inserting a banner into every PDF file stored on UNB servers reminding people of the fiscal year end.

3.2 It makes us look unprofessional.

Because the banner is contained in the body of email messages, it almost inevitably ends up in replies. This lets funding agencies, industrial partners, and potential graduate students know that we consider them as potentially hostile entities. Suggesting that people should edit their replies is not really an acceptable answer, since it suggests that it is acceptable to download the work of maintaining the previous level of functionality onto each user of the system.

3.3 It doesn't help me

I have an archive of 61270 email messages received since 2003. Of these 26215 claim to be from a unb.ca address 5. So historically about 42% of the mail to arrive at my UNB mailbox is internal 6. This means that warnings will occur in the majority of messages I receive. I think the onus is on the proposer to show that a warning that occurs in the large majority of messages will have any useful effect.

3.4 It disrupts my collaboration with open-source projects

Part of my job is to collaborate with various open source projects. A prominent example is Eclipse OMR 7, the technological driver for a collaboration with IBM that has brought millions of dollars of graduate student funding to UNB. Git is now the dominant version control system for open source projects, and one popular way of using git is via git-send-email 8

Adding a banner breaks the delivery of patches by this method. In the a previous experiment I did about a month ago, it "only" caused the banner to end up in the git commit message. Those of you familiar with software developement will know that this is roughly the equivalent of walking out of the bathroom with toilet paper stuck to your shoe. You'd rather avoid it, but it's not fatal. The current implementation breaks things completely by quoted-printable re-encoding the message. In particular '=' gets transformed to '=3D' like the following

-+    gunichar *decoded=g_utf8_to_ucs4_fast (utf8_str, -1, NULL);
-+    const gunichar *p = decoded;
++    gunichar *decoded=3Dg_utf8_to_ucs4_fast (utf8_str, -1, NULL);

I'm not currently sure if this is a bug in git or some kind of failure in the re-encoding. It would likely require an investment of several hours of time to localize that.

3.5 It interferes with the use of cryptography.

Unlike many people, I don't generally read my email on a phone. This means that I don't rely on the previews that are apparently disrupted by the presence of a warning banner. On the other hand I do send and receive OpenPGP signed and encrypted messages. The effects of the banner on both signed and encrypted messages is similar, so I'll stick to discussing signed messages here. There are two main ways of signing a message. The older method, still unfortunately required for some situations is called "inline PGP". The signed region is re-encoded, which causes gpg to issue a warning about a buggy MTA 9, namely gpg: quoted printable character in armor - probably a buggy MTA has been used. This is not exactly confidence inspiring. The more robust and modern standard is PGP/MIME. Here the insertion of a banner does not provoke warnings from the cryptography software, but it does make it much harder to read the message (before and after screenshots are given below). Perhaps more importantly it changes the message from one which is entirely signed or encrypted 10, to one which is partially signed or encrypted. Such messages were instrumental in the EFAIL exploit 11 and will probably soon be rejected by modern email clients.


Figure 1: Intended view of PGP/MIME signed message


Figure 2: View with added banner



On Multics, when I was a high school student


IETF Requests for Comments, which define most of the standards used by email systems.


possibly overcounting some spam as UNB originating email


In case it's not obvious dear reader, communicating with the world outside UNB is part of my job.


Some important projects function exclusively that way. See https://git-send-email.io/ for more information.


Mail Transfer Agent

Author: David Bremner

Created: 2019-05-22 Wed 17:04


Posted Wed 22 May 2019 12:00:00 AM Tags:



  • NMUed cdargs
  • NMUed silversearcher-ag-el
  • Uploaded the partially unbundled emacs-goodies-el to Debian unstable
  • packaged and uploaded graphviz-dot-mode


  • packaged and uploaded boxquote-el
  • uploaded apache-mode-el
  • Closed bugs in graphviz-dot-mode that were fixed by the new version.
  • filed lintian bug about empty source package fields


  • packaged and uploaded emacs-session
  • worked on sponsoring tabbar-el


  • uploaded dh-make-elpa



Wrote patch series to fix bug noticed by seanw while (seanw was) working working on a package inspired by policy workflow.


  • Finished reviewing a patch series from dkg about protected headers.


  • Helped sean w find right config option for his bug report

  • Reviewed change proposal from aminb, suggested some issues to watch out for.


  • Add test for threading issue.



  • uploaded nullmailer backport


  • add "envelopefilter" feature to remotes in nullmailer-ssh




  • Forwarded #704527 to https://rt.cpan.org/Ticket/Display.html?id=125914


  • Uploaded libemail-abstract-perl to fix Vcs-* urls
  • Updated debhelper compat and Standards-Version for libemail-thread-perl
  • Uploaded libemail-thread-perl


  • fixed RC bug #904727 (blocking for perl transition)

Policy and procedures


  • seconded #459427


  • seconded #813471
  • seconded #628515


  • read and discussed draft of salvaging policy with Tobi


  • Discussed policy bug about short form License and License-Grant


  • worked with Tobi on salvaging proposal
Posted Sun 29 Jul 2018 01:58:00 PM Tags:


Debian is currently collecting buildinfo but they are not very conveniently searchable. Eventually Chris Lamb's buildinfo.debian.net may solve this problem, but in the mean time, I decided to see how practical indexing the full set of buildinfo files is with sqlite.


  1. First you need a copy of the buildinfo files. This is currently about 2.6G, and unfortunately you need to be a debian developer to fetch it.

     $ rsync -avz mirror.ftp-master.debian.org:/srv/ftp-master.debian.org/buildinfo .
  2. Indexing takes about 15 minutes on my 5 year old machine (with an SSD). If you index all dependencies, you get a database of about 4G, probably because of my natural genius for database design. Restricting to debhelper and dh-elpa, it's about 17M.

     $ python3 index.py

    You need at least python3-debian installed

  3. Now you can do queries like

     $ sqlite3 depends.sqlite "select * from depends where depend='dh-elpa' and depend_version<='0106'"

    where 0106 is some adhoc normalization of 1.6


The version number hackery is pretty fragile, but good enough for my current purposes. A more serious limitation is that I don't currently have a nice (and you see how generous my definition of nice is) way of limiting to builds currently available e.g. in Debian unstable.

Posted Sat 02 Sep 2017 05:41:00 PM Tags:

Today I received some marketing bumpf from my employer, which on top of being a charming way to spend money while cutting my unit budget, is frankly an embarassment from the point of view of security and privacy.

Every link in this supposed communication from UNB is a link to a third party site, with the host name consisting mainly of digits. When we receive large scale phishing attacks every week so, training people to ignore funny looking urls doesn't seem like a great idea. All of these URLs contain tracking cookies, presumably so that Eloqua can sell UNB information about the mail reading habits of its employees and alumni.

It finishes with the following text.

UNB occasionally sends out important announcements to the UNB community. To unsubscribe from these emails, please click
here <http://s1961286906.t.en25.com/e/cu?s=1961286906&elqc=11&elq=my_cookie_deleted>.
To unsubscribe from all future UNB emails, please click here
Privacy Statement
UNB, the UNB Advancement Office and third party host Eloqua/Oracle are committed to protecting the personal information of
all UNB Alumni. The information collected will be used for the purposes of promoting and supporting UNB events, activities,
and endeavours and will be accessible to UNB Advancement database administrators. Connection to third party host is via
Secure Socket Layer (SSL) technology. For more information on the protection of personal information at UNB please consult
the University Secretariat, University of New Brunswick, PO Box 4400, Fredericton, NB, E3B 5A3 www.unb.ca/secretariat (506)

Can you spot the lie? Of course I mean the technical error about about http and https. What kind of cynic do you take me for?

Never mind what the government said
They're either lying or they've been misled...

Bruce Coburn, Burn, 1986

Posted Mon 14 Mar 2016 10:07:00 PM Tags:

Today I was wondering about converting a pdf made from scan of a book into djvu, hopefully to reduce the size, without too much loss of quality. My initial experiments with pdf2djvu were a bit discouraging, so I invested some time building gsdjvu in order to be able to run djvudigital.

Watching the messages from djvudigital I realized that the reason it was achieving so much better compression was that it was using black and white for the foreground layer by default. I also figured out that the default 300dpi looks crappy since my source document is apparently 600dpi.

I then went back an compared djvudigital to pdf2djvu a bit more carefully. My not-very-scientific conclusions:

  • monochrome at higher resolution is better than coloured foreground
  • higher resolution and (a little) lossy beats lower resolution
  • at the same resolution, djvudigital gives nicer output, but at the same bit rate, comparable results are achievable with pdf2djvu.

Perhaps most compellingly, the output from pdf2djvu has sensible metadata and is searchable in evince. Even with the --words option, the output from djvudigital is not. This is possibly related to the error messages like

Can't build /Identity.Unicode /CIDDecoding resource. See gs_ciddc.ps .

It could well be my fault, because building gsdjvu involved guessing at corrections for several errors.

  • comparing GS_VERSION to 900 doesn't work well, when GS_VERSION is a 5 digit number. GS_REVISION seems to be what's wanted there.

  • extra declaration of struct timeval deleted

  • -lz added to command to build mkromfs

Some of these issues have to do with building software from 2009 (the instructions suggestion building with ghostscript 8.64) in a modern toolchain; others I'm not sure. There was an upload of gsdjvu in February of 2015, somewhat to my surprise. AT&T has more or less crippled the project by licensing it under the CPL, which means binaries are not distributable, hence motivation to fix all the rough edges is minimal.

Version kilobytes per page position in figure
Original PDF 80.9 top
pdf2djvu --dpi=450 92.0 not shown
pdf2djvu --monochrome --dpi=450 27.5 second from top
pdf2djvu --monochrome --dpi=600 --loss-level=50 21.3 second from bottom
djvudigital --dpi=450 29.4 bottom


Posted Tue 29 Dec 2015 12:57:00 PM Tags:

After a mildly ridiculous amount of effort I made a bootable-usb key.

I then layered a bash script on top of a perl script on top of gpg. What could possibly go wrong?



 keys=$(gpg --with-colons  $infile | sed -n 's/^pub//p' | cut -f5 -d: )

 gpg --homedir $HOME/.caff/gnupghome --import $infile

 caff -R -m no "${keys[*]}"

 today=$(date +"%Y-%m-%d")
 for key in ${keys[*]}; do
     (cd $HOME/.caff/keys/;   tar rvf "$output" $today/$key.mail*)

The idea is that keys are exported to files on a networked host, the files are processed on an offline host, and the resulting tarball of mail messages sneakernetted back to the connected host.

Posted Wed 23 Dec 2015 09:20:00 AM Tags:

Umm. Somehow I thought this would be easier than learning about live-build. Probably I was wrong. There are probably many better tutorials on the web.

Two useful observations: zeroing the key can eliminate mysterious grub errors, and systemd-nspawn is pretty handy. One thing that should have been obvious, but wasn't to me is that it's easier to install grub onto a device outside of any container.

Find device

 $ dmesg

Count sectors

 # fdisk -l /dev/sdx

Assume that every command after here is dangerous.

Zero it out. This is overkill for a fresh key, but fixed a problem with reusing a stick that had a previous live distro installed on it.

 # dd if=/dev/zero of=/dev/sdx bs=1048576 count=$count

Where $count is calculated by dividing the sector count by 2048.

Make file system. There are lots of options. I eventually used parted

 # parted

 (parted) mklabel msdos
 (parted) mkpart primary ext2 1 -1
 (parted) set 1 boot on
 (parted) quit

Make a file system

 # mkfs.ext2 /dev/sdx1
 # mount /dev/sdx1 /mnt

Install the base system

 # debootstrap --variant=minbase jessie /mnt http://httpredir.debian.org/debian/

Install grub (no chroot needed)

 # grub-install --boot-directory /mnt/boot /dev/sdx1

Set a root password

 # chroot /mnt
 # passwd root
 # exit

create up fstab

# blkid -p /dev/sdc1 | cut -f2 -d' ' > /mnt/etc/fstab

Now edit to fix syntax, tell ext2, etc...

Now switch to system-nspawn, to avoid boring bind mounting, etc..

# systemd-nspawn -b -D /mnt

login to the container, install linux-base, linux-image-amd64, grub-pc

EDIT: fixed block size of dd, based on suggestion of tg. EDIT2: fixed count to match block size

Posted Mon 21 Dec 2015 08:43:00 AM Tags:

This wiki is powered by ikiwiki.