Wed, 22 Oct 2014 09:21:39 GMT
Last night I mostly patched my local copy of less to build and link against the PCRE regular expression library.
I've wanted to do that for a while, and reading Raymond Chen's blog post last night made me try it out.
The patch was small and pretty neat, and I'm familiar with GNU less, having patched it in the past. But it doesn't contain tests.
Test cases are hard. Many programs, such as less, are used interactively, which makes writing a scaffold hard. Other programs suffer from a similar fate - I'm not sure how you'd even test a web browser such as Firefox these days - mangleme would catch some things, eventually, but the interactive stuff? No clue.
In the past MySQL had a free set of test cases, but my memory is that Oracle locked them up. SQLite is famous for its decent test coverage. But off the top of my head I can't think of other things.
As a topical example there don't seem to be decent test-cases for either bash or openssl. If it compiles it works, more or less.
I did start writing some HTTP-server test cases a while back, but that was just to automate security attacks. e.g. Firing requests like:
GET /../../../etc/passwd HTTP/1.0
GET //....//....//....//etc/passwd HTTP/1.0
(It's amazing how many projects and products bundle toy HTTP-server components that aren't decent HTTP-servers.)
I could imagine that being vaguely useful, especially because it is testing the protocol-handling rather than a project-specific codebase.
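A minimal sketch of what such a protocol-level check might look like, in Python. This is not the code written at the time; the helper names (`build_request`, `looks_like_passwd`, `probe`) and the "does it look like /etc/passwd" heuristic are assumptions made up for illustration.

```python
import socket

# The two traversal paths quoted above; a real suite would carry many more.
TRAVERSAL_PATHS = [
    "/../../../etc/passwd",
    "//....//....//....//etc/passwd",
]


def build_request(path):
    """Return a raw HTTP/1.0 request for the given path, as bytes."""
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def looks_like_passwd(response):
    """Crude heuristic (an assumption): a root entry with uid/gid 0 leaked."""
    return b"root:" in response and b":0:0:" in response


def probe(host, port, path, timeout=5.0):
    """Send one request and report whether the reply resembles /etc/passwd."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return looks_like_passwd(b"".join(chunks))
```

Something like `any(probe("127.0.0.1", 8080, p) for p in TRAVERSAL_PATHS)` would then flag a vulnerable server, assuming one is listening on that (hypothetical) address.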
Anyway, I'm thinking writing test cases for things is good, but struggling to think of a decent place to start. The project has to be:
- Open source.
- Widely used - to make it a useful contribution.
- Not written in some fancy language.
- Open to receiving submissions.
Comments welcome; but better yet why not think about the test-coverage of any of your own packages and projects...?
Tags: less, misc, testing.
Sat, 18 Oct 2014 23:03:06 GMT
Yesterday I received a small rush of SPAM mails, all of which were 419 scams, and all of them sent by "Mrs Elizabeth PETERSEN".
It struck me that I can't think of ever receiving a legitimate mail from a "Mrs XXX [YYY]", but I was too busy to check.
Today I've done so. Of the 38,553 emails I've received during the month of October 2014 I've got a hell of a lot of mails with a From address including a "Mrs" prefix:
"Mrs.Clanzo Amaki" <email@example.com>
"Mrs Sarah Mamadou"<firstname.lastname@example.org>
"Mrs Abia Abrahim" <email@example.com>
"Mrs. Josie Wilson" <firstname.lastname@example.org>
"Mrs. Theresa Luis"<email@example.com>
There are thousands more. Not a single one of them was legitimate.
I have one false-positive when repeating the search for a Mr-prefix. I have one friend who has set his sender-address to "Mr Bob Smith", which always reads weirdly to me, but every single other email with a Mr-prefix was SPAM.
I'm not going to use this in any way, since I'm happy with my mail-filtering setup, but it was an interesting observation.
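For the curious, a check like this could be sketched in a few lines of Python. This is only an illustration, not how the count above was produced; the function names and the exact regular expression are assumptions.

```python
import re
from email.utils import parseaddr

# Matches display names starting with "Mrs" or "Mrs.", but not "Mr".
MRS_PREFIX = re.compile(r"^Mrs\b", re.IGNORECASE)


def has_mrs_prefix(from_header):
    """True if the display name of a From header starts with 'Mrs'."""
    name, _address = parseaddr(from_header)
    return bool(MRS_PREFIX.match(name.strip()))


def count_mrs(from_headers):
    """Count the From headers whose display name carries the prefix."""
    return sum(1 for header in from_headers if has_mrs_prefix(header))
```

Feeding it every From header in a mailbox gives the number of "Mrs" senders; swapping the pattern for `^Mr\b` repeats the experiment for the Mr-prefix.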
Names are funny. My wife changed her surname post-marriage, but that was done largely on the basis that introducing herself as "Doctor Kemp" was simpler than "Doctor Foreign-Name"; she'd certainly never, ever introduce herself as Mrs Kemp.
Trivia: In Finnish the word for "Man" and "Husband" is the same (mies), but the word for "Woman" (nainen) is different than the word for "Wife" (vaimo).
Tags: email, names, random, spam.
Wed, 8 Oct 2014 19:03:34 GMT
Before our recent trip to Poland I took the time to create my own e-book, containing the names/addresses of people to whom we wanted to send postcards.
Authoring ebooks is simple, and this was a useful use. (Ordinarily I'd have my contacts on my phone, but I deliberately left it at home ..)
I did mean to copy and paste some notes from Wikipedia about transport, tourist destinations, etc, into a brief guide. But I forgot.
In other news the toy virtual machine I hacked together got a decent series of updates, allowing you to embed it and add your own custom opcode(s) easily. That was neat, and fell out naturally from the switch to using function-pointers for the opcode implementation.
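The function-pointer idea can be sketched roughly as below. This is a Python illustration rather than the virtual machine's real C code; the `VM` class, the handler names and the opcode numbers (0x00, 0xCD) are all invented for the example.

```python
class VM:
    """Toy interpreter that dispatches through a table of handlers."""

    def __init__(self):
        self.ip = 0            # instruction pointer
        self.running = True
        self.output = []
        # One slot per possible opcode byte; unknown opcodes stop the VM.
        self.handlers = [VM.op_unknown] * 256
        self.handlers[0x00] = VM.op_exit   # hypothetical opcode number

    def op_unknown(self, code):
        self.running = False

    def op_exit(self, code):
        self.running = False

    def run(self, code):
        self.ip = 0
        self.running = True
        while self.running and self.ip < len(code):
            # Look the handler up in the table; it advances self.ip itself.
            self.handlers[code[self.ip]](self, code)


def op_hello(vm, code):
    """A custom opcode supplied by the embedding program."""
    vm.output.append("hello")
    vm.ip += 1
```

Adding a custom opcode is then a single assignment, e.g. `vm.handlers[0xCD] = op_hello`, with no change to the interpreter loop.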
Tags: kindle, random, simple-vm, travel.
Sun, 5 Oct 2014 08:34:30 GMT
Before I forget I had meant to write about a toy virtual machine which I've been playing with.
It is register-based with ten registers, each of which can hold either a string or int, and there are enough instructions to make it fun to use.
I didn't go overboard and write a complete grammar, or a real compiler, but I did do enough that you can compile and execute obvious programs.
First compile from the source to the bytecodes:
$ ./compiler examples/loop.in
Mmm bytecodes are fun:
$ xxd ./examples/loop.raw
0000000: 3001 1943 6f75 6e74 696e 6720 6672 6f6d 0..Counting from
0000010: 2074 656e 2074 6f20 7a65 726f 3101 0101 ten to zero1...
0000020: 0a00 0102 0100 2201 0102 0201 1226 0030 ......"......&.0
0000030: 0104 446f 6e65 3101 00 ..Done1..
Now the compiled program can be executed:
$ ./simple-vm ./examples/loop.raw
[stdout] register R01 = Counting from ten to zero
[stdout] register R01 = 9 [Hex:0009]
[stdout] register R01 = 8 [Hex:0008]
[stdout] register R01 = 7 [Hex:0007]
[stdout] register R01 = 6 [Hex:0006]
[stdout] register R01 = 5 [Hex:0005]
[stdout] register R01 = 4 [Hex:0004]
[stdout] register R01 = 3 [Hex:0003]
[stdout] register R01 = 2 [Hex:0002]
[stdout] register R01 = 1 [Hex:0001]
[stdout] register R01 = 0 [Hex:0000]
[stdout] register R01 = Done
There could be more operations added, but I'm pleased with the general behaviour, and embedding is trivial. The only two things that make this even remotely interesting are:
- Most toy virtual machines don't cope with labels and jumps. This does.
  - Even though it was a real pain to go patching up the offsets.
  - Having labels be callable before they're defined is pretty mandatory in practice.
- Most toy virtual machines don't allow integers and strings to be stored in registers.
  - Now I've done that I'm not 100% sure it's a good idea.
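Patching up the offsets for labels that are used before they're defined is usually done by emitting a placeholder and fixing it afterwards. A minimal Python sketch of that technique follows; it is not the real compiler, and the instruction names, opcode values and two-byte little-endian addresses are assumptions chosen for the example.

```python
OP_JMP = 0x10   # hypothetical opcode values
OP_NOP = 0x50


def assemble(program):
    """Assemble ("label", name), ("jmp", name) and ("nop",) tuples to bytes."""
    code = bytearray()
    labels = {}    # label name -> byte offset where it was defined
    fixups = []    # (offset of placeholder, label name) to patch later

    for instr in program:
        op = instr[0]
        if op == "label":
            labels[instr[1]] = len(code)
        elif op == "jmp":
            code.append(OP_JMP)
            # The target may not be known yet: remember where to patch.
            fixups.append((len(code), instr[1]))
            code += b"\x00\x00"
        elif op == "nop":
            code.append(OP_NOP)
        else:
            raise ValueError(f"unknown instruction: {op}")

    # Every label is known by now, so fill in the placeholders.
    for offset, name in fixups:
        if name not in labels:
            raise KeyError(f"undefined label: {name}")
        code[offset:offset + 2] = labels[name].to_bytes(2, "little")
    return bytes(code)
```

Because the fix-ups are resolved only after the whole program has been read, a jump to a label defined further down works exactly like a jump backwards.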
Anyway that concludes today's computer-fun.
Tags: random, simple-vm.
Sat, 4 Oct 2014 12:20:45 GMT
We returned safely from Kraków, despite a somewhat turbulent flight home.
There were many pictures taken, but thus far I've only posted a random night-time shot. Perhaps more will appear in the future.
In other news I've just made a new release of the chronicle blog compiler, so 5.0.7 should shortly appear on CPAN.
The release contains a bunch of minor fixes, and some new facilities relating to templates.
It seems likely that in the future there will be the ability to create "static pages" along with the blog-entries, tag-clouds & etc. The suggestion was raised on the github issue tracker and as a proof of concept I hacked up a solution which works entirely via the chronicle plugin-system, proving that the new development work wasn't a waste of time - especially when combined with the significant speedups in the new codebase.
(ObRandom: Mailed the Debian package-maintainer to see if there was interest in changing. Also mailed a couple of people I know who are using the old code to see if they had comments on the new code, or had any compatibility issues. No replies from either, yet. *shrugs*)
Tags: chronicle, travel.
Fri, 26 Sep 2014 17:20:04 GMT
Next week my wife and I shall be mostly visiting Poland, and spending a week in Kraków.
It has been a while since I've had a non-Helsinki-based holiday, so I'm looking forward to the trip.
In other news I've been rationalising DNS entries and domain names recently; all being well this zone should be served by Amazon shortly, subject to the usual combination of TTLs and resolution-puns.
Thu, 25 Sep 2014 19:11:19 GMT
Much has already been written about the recent bash security problem, allocated the CVE identifier CVE-2014-6271, so I'm not even going to touch it.
It did remind me to double-check my systems to make sure that I didn't have any packages installed that I didn't need though, because obviously having fewer packages installed and fewer services running reduces the potential attack surface.
I had noticed in the past I had python installed and just thought "Oh, yeah, I must have python utilities running". It turns out though that on 16 out of 19 servers I control I had python installed solely for the
So I hacked up a horrible replacement for lsb_release in pure shell, and then became cruel:
~ # dpkg --purge python python-minimal python2.7 python2.7-minimal lsb-release
That horrible replacement is horrible because it defers detection of all the names/numbers to /etc/os-release, which wasn't present in earlier versions of Debian. Happily all my Debian GNU/Linux hosts run Wheezy or later, so it all works out.
So that left three hosts that had a legitimate use for Python:
- My mail-host runs
- So I purged it.
- I replaced it with isync.
- My host-machine runs KVM guests, via
qemu-kvm depends on Python solely for the script
- I'm not pleased about that but will tolerate it for now.
- The final host was my ex-mercurial host.
- Since I've switched to git I just removed the package.
So now 1/19 hosts has Python installed. I'm not averse to the language, but given that I don't personally develop in it very often (read "once or twice in the past year") and by accident I had no python-scripts installed I see no reason to keep it on the off-chance.
My biggest surprise of the day was that now that we can use dash as our default shell we still can't purge bash, since it is marked as Essential. Perhaps in the future.
Tags: kvm, offlineimap, python.
Tue, 23 Sep 2014 20:42:56 GMT
I (grudgingly) use the Calibre e-book management software to handle my collection of books, and copy them over to my kindle-toy.
One thing that has always bothered me was the fact that when books are imported their ratings are too. If I receive a small sample of ebooks from a friend their ratings are added to my collections.
I've always regarded ratings as things personal to me, rather than attributes of a book itself; as my tastes might not match yours, and vice-versa.
On that basis the last time I was importing a small number of books and getting annoyed at having to manually reset all the imported ratings I decided to do something about it. I started hacking and put together a simple Calibre plugin to automatically zero ratings when books are imported to the collection (i.e. set the rating to be zero).
Sadly this work wasn't painless, despite the small size, as an unfortunate bug in Calibre meant my plugin method wasn't called. Happily Kovid Goyal helped me work through the problem, and he committed a fix that will be in the next Calibre release. For the moment I'm using today's git-snapshot and it works well.
Similarly I've recently started using extended file attributes to store metadata on my desktop system. Unfortunately the GNU findutils package doesn't allow you to do the obvious thing:
$ find ~/foo -xattr user.comment
There are several xattr patches floating around, but I had to bundle my own in debian/patches to get support for finding files that have particular attribute names.
Maybe one day extended attributes will be taken seriously. (cp, etc will preserve them. I'm hazy on the compatibility with tar, but most things seem to be working.)
Tags: calibre, findutils, python, wikipedia.
Wed, 17 Sep 2014 17:23:08 GMT
Assuming this post shows up then I'll have successfully migrated from Chronicle to a temporary replacement.
Chronicle is awesome, and despite a lack of activity recently it is not dead. (No activity because it continued to do everything I needed for my blog.)
Unfortunately though there is a problem with chronicle: it suffers from a bit of a performance problem which has gradually become more and more vexing as the number of entries I have has grown.
When chronicle runs:
- It reads each post into a complex data-structure.
- Then it walks this multiple times.
- Finally it outputs a whole bunch of posts.
In the general case you rebuild a blog because you've made an entry, or received a new comment. There is some code which tries to use memcached for caching, but in general chronicle just isn't fast and it is certainly memory-bound if you have a couple of thousand entries.
Currently my test data-set contains 2000 entries and to rebuild that from a clean start takes around 4 minutes, which is pretty horrific.
So what is the alternative? What if you could parse each post once, add it to an SQLite database, and then use that for writing your output pages? Instead of the complex data-structure in-RAM and the need to parse a zillion files you'd have a standard/simple SQL structure you could use to build a tag-cloud, an archive, & etc. If you store the contents of the parsed-blog, along with the mtime of the source file you can update it if the entry is changed in the future, as I sometimes make typos which I only spot once I've run make steve on my blog sources.
Not surprisingly the newer code is significantly faster if you have 2000+ posts. If you've imported the posts into SQLite the most recent entries are updated in 3 seconds. If you're starting cold, parsing each entry, inserting it into SQLite, and then generating the blog from scratch the build time is still less than 10 seconds.
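The parse-once idea can be sketched as follows. The real code is Perl; this Python version is only an illustration, and the table layout, the `parse_post` format (first line is the title) and the function names are assumptions.

```python
import os
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the SQLite cache of parsed posts."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "file TEXT PRIMARY KEY, mtime REAL, title TEXT, body TEXT)"
    )
    return db


def parse_post(path):
    """Hypothetical post format: the first line is the title, the rest the body."""
    with open(path, encoding="utf-8") as fh:
        title = fh.readline().strip()
        body = fh.read()
    return title, body


def refresh(db, paths):
    """Re-parse only the files whose mtime differs from the cached value.

    Returns the number of files that actually had to be parsed.
    """
    parsed = 0
    for path in paths:
        mtime = os.path.getmtime(path)
        row = db.execute(
            "SELECT mtime FROM posts WHERE file = ?", (path,)
        ).fetchone()
        if row is not None and row[0] == mtime:
            continue   # unchanged since it was last imported
        title, body = parse_post(path)
        db.execute(
            "INSERT OR REPLACE INTO posts (file, mtime, title, body) "
            "VALUES (?, ?, ?, ?)",
            (path, mtime, title, body),
        )
        parsed += 1
    db.commit()
    return parsed
```

Tag-clouds, archives and the index page then become plain SQL queries against the `posts` table rather than walks over an in-memory structure.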
The downside is that I've removed features, obviously nothing that I use myself. Most notably the calendar view is gone, as is the ability to use date-based URLs. Less seriously there is only a single theme, which is what is used upon this site.
In conclusion I wrote something last night which is a stepping stone between the current
chronicle2 which will appear in due course.
PS. This entry was written in markdown, just because I wanted to be sure it worked.
Tags: chronicle, markdown.
Tue, 16 Sep 2014 19:42:11 GMT
Personally I believe that any application packaged for Debian should not phone home, attempt to download plugins over HTTP at run-time, or update itself.
On that basis I've filed #761828.
As a project we have guidelines for what constitutes a "serious" bug, which generally boil down to a package containing a security issue, causing data-loss, or being unusable.
I'd like to propose that these kinds of tracking "things" are equally bad. If consensus could be reached that would be a good thing for the freedom of our users.
(Ooops I slipped into "us", "our user", I'm just an outsider looking in. Mostly.)
Tags: debian, misc.