Entries tagged "software".

So I failed at writing some clustered code in Perl

Until this time next month I'll be posting code-based discussions only.

Recently I've been wanting to explore creating clustered services, because clusters are definitely things I use professionally.

My initial attempt was to write an auto-clustering version of memcached, because that's a useful tool. Writing the core of the service took an hour or so:

  • Simple KeyVal.pm implementation.
  • Give it the obvious methods get, set, delete.
  • Make it more interesting by creating an append-only log.
  • The logfile will be replayed for clustering.

At the point I was done the following code worked:

use strict;
use warnings;
use KeyVal;

# Create an object, and set some values
my $obj = KeyVal->new( logfile => "/tmp/foo.log" );
$obj->incr( "steve" );
$obj->incr( "steve" );

print $obj->get( "steve" );    # prints 2.

# Now replay the append-only log
my $replay = KeyVal->new( logfile => "/tmp/foo.log" );
$replay->replay();

print $replay->get( "steve" ); # prints 2.

In the first case we used the primitives to increment a value twice, and then fetch it. In the second case we used the logfile the first object created to replay all prior transactions, then output the value.
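The idea can be sketched in a few lines. This is a minimal illustration in Python rather than the original Perl, and it is not the real KeyVal.pm: the JSON-lines log format, the `log=False` flag and the `demo` helper are all assumptions made for the example.

```python
import json
import os
import tempfile
import time


class KeyVal:
    """In-memory key/value store that records every change in an append-only log."""

    def __init__(self, logfile):
        self.logfile = logfile
        self.data = {}

    def _log(self, op, key, value=None):
        # Assumed format: one JSON record per line (timestamp, operation, key, value).
        with open(self.logfile, "a") as fh:
            fh.write(json.dumps([time.time(), op, key, value]) + "\n")

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, log=True):
        self.data[key] = value
        if log:
            self._log("set", key, value)

    def delete(self, key, log=True):
        self.data.pop(key, None)
        if log:
            self._log("delete", key)

    def incr(self, key, log=True):
        self.data[key] = self.data.get(key, 0) + 1
        if log:
            self._log("incr", key)

    def replay(self):
        # Re-apply every logged operation, without logging it a second time.
        with open(self.logfile) as fh:
            for line in fh:
                _when, op, key, value = json.loads(line)
                if op == "set":
                    self.set(key, value, log=False)
                elif op == "delete":
                    self.delete(key, log=False)
                elif op == "incr":
                    self.incr(key, log=False)


def demo():
    # Mirrors the Perl snippet: increment twice, then replay into a fresh object.
    path = os.path.join(tempfile.mkdtemp(), "foo.log")
    obj = KeyVal(path)
    obj.incr("steve")
    obj.incr("steve")
    replayed = KeyVal(path)
    replayed.replay()
    return obj.get("steve"), replayed.get("steve")
```

Calling `demo()` gives the same value from both the original and the replayed object, which is the property the clustering would rely on.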

Neat. The next step was to make it work over a network. Trivial.

Finally I wanted to autodetect peers, and deploy replication. Each host would send out regular messages along the lines of "Do you have updates made since $time?". Any that did would replay the logfile from the given unixtime offset.
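As a rough sketch of the "updates made since $time" exchange, again in Python and under assumptions: the log line format (`unixtime operation key`) and both function names are invented for illustration, and the real protocol may differ.

```python
def updates_since(log_lines, since):
    """Return the logged operations whose timestamp is strictly after `since`."""
    updates = []
    for line in log_lines:
        # Assumed line format: "<unixtime> <operation> <key>"
        stamp, op, key = line.split(" ", 2)
        if float(stamp) > since:
            updates.append((float(stamp), op, key))
    return updates


def apply_updates(store, updates):
    """Replay the given operations against a plain dict acting as the local store."""
    for _stamp, op, key in updates:
        if op == "incr":
            store[key] = store.get(key, 0) + 1
        elif op == "delete":
            store.pop(key, None)
    return store


LOG = [
    "100 incr steve",
    "200 incr steve",
    "300 incr bob",
]

# A peer that last synced at time 150 asks for anything newer.
missing = updates_since(LOG, 150)
local = apply_updates({"steve": 1}, missing)
```

One caveat with this design: an `incr` replayed twice is counted twice, so each peer has to remember the timestamp it last synced to.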

However here I ran into problems. Peer discovery was supposed to be basic, and I figured I'd write something that did leader election by magic. Unfortunately Perl's threading code is .. unpleasant:

  • I wanted to store all known-peers in a singleton.
  • Then I wanted to create threads that would announce and receive updates.

This failed. Majorly. Because you cannot launch the implementation of a class-method as a thread. Equally you cannot make a variable which is "complex" shared across threads.

I wrote some demo code which works without packages and a shared singleton:

The Ruby version, by contrast, is much more OO and neater. Meh.

I've now shelved the project.

My next, big, task was to make the network service utterly memcached compatible. That would have been fiddly, but not impossible. Right now I just use a simple line-based network protocol.

I suspect I could have got what I wanted using EventMachine, or similar, but that's a path I've not yet explored, and I'm happy enough with that decision.

 

Some software releases to change the topic.

Now it is time for me to go silent for a while, and not talk about jobs, unemployment, or puppies.

This past week has also been full of software releases. Some of the public ones include:

Lumail - My console mail client, with integrated Lua scripting

After three months of slow work I've issued a new release today. This release features several bugfixes for dealing with malformed MIME messages, and similar fun.

The core set of Lua primitives hasn't changed very much for a good six months now, which suggests I guessed rightly about what kind of things would be useful.

Templer - My Perl-based static-site generator.

This was recently updated to add two new plugins to the core:

  • A redis plugin to allow you to set variables to values retrieved from redis.
  • An RSS plugin to allow you to inline (remote) RSS feeds into your static HTML. Useful for building news-pages, etc.

Although there are a million static-site generators I still think mine has value, and I am consistently using it.

Months ago I said "I'm writing a mail-client", and that all I needed to do was handle three cases:

  • Display a list of folders.
  • Display index of messages.
  • Display a single message.

Then add some new things like "Compose", "Reply", and "Forward". I remember somebody commented along the lines of "Yeah, but MIME will make you hate your life". I laughed. Now I know better. Still, it works, it works well, and I'm glad I did it.

 

A mixed week with minor tweaks

As previously mentioned I was looking to package pwsafe for Wheezy, as this is one of the few tools that I rely upon which isn't present.

There are now packages available, with the source on GitHub.

I've also been doing some minor scripting because I've run into a few common problems recently:

run-parts

run-parts is a simple utility which will run every executable in a directory, more or less.

In Debian-land run-parts is the mechanism for /etc/cron.daily and /etc/cron.hourly - and that is where I've had problems recently.

Imagine you run a backup via cron.daily. Further imagine that you run a post-backup rsync and that this might take many many hours. If your backup takes >=24 hours you're screwed: the next day's run will start while the previous one is still going.

To that end I've patched my run-parts tool to alert and exit if a prior invocation is still running.
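The post doesn't say how the patched run-parts detects a prior invocation. One common way is a non-blocking lock on a file, sketched here in Python (Unix-only, since it uses `fcntl`). The lock path and the function names are assumptions for illustration.

```python
import fcntl
import os
import tempfile


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def run_guarded(path, job):
    """Run job() unless a prior invocation still holds the lock at `path`."""
    lock = acquire_lock(path)
    if lock is None:
        return False  # a real tool would alert and exit here
    try:
        job()
        return True
    finally:
        lock.close()  # closing the file releases the lock


def demo():
    path = os.path.join(tempfile.mkdtemp(), "cron.daily.lock")
    results = []
    # The outer job tries to start a second run while it is still going.
    ran = run_guarded(path, lambda: results.append(run_guarded(path, lambda: None)))
    # Once the first run has finished, a new run is allowed again.
    return ran, results, run_guarded(path, lambda: None)
```

The advantage of a kernel-held lock over a plain PID file is that it disappears by itself if the first run crashes, so there is no stale state to clean up.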

silent-run

I think everybody has this script - hide all output when running a command, unless the command fails. Looking today I see chronic from Joey's excellent moreutils does this. D'oh.
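For anyone who doesn't have this script yet, the behaviour fits in a few lines. A minimal sketch in Python, with the function name `silent_run` chosen just for this example:

```python
import subprocess
import sys


def silent_run(argv):
    """Run argv and return (exit_code, output_to_show).

    The captured output is only handed back when the command fails.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
    )
    if proc.returncode == 0:
        return 0, ""
    return proc.returncode, proc.stdout


# A command that succeeds: its output is swallowed.
ok = silent_run([sys.executable, "-c", "print('noise')"])

# A command that fails: the exit code and the output are both reported.
code, shown = silent_run(
    [sys.executable, "-c", "import sys; print('boom'); sys.exit(3)"]
)
```

A cron wrapper would print `shown` and exit with `code`, so that mail is only generated for failures.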

I think I've done more, but I cannot remember. In conclusion software is both easy and hard - easy because these two trivial changes were within my reach, but hard because years after encountering GNU/Linux we still have to add in the missing pieces.

Still could be worse, I spent four/five hours yesterday evening fighting with MS-SQL server, and that is time I'm never going to get back.

 

Steve, in brief

In brief:

Finally having recently bought the Canon 70-200mm f/2.8 lens for a King's ransom I've agreed to buy the 24-105mm f/4.0 lens from a friend - that will be my new portrait lens of choice, and I'll sell my existing 85mm f/1.8.

ObQuote: "I could help you cross your yard." - Up

 

My node-reverse-proxy is both stable and public

I posted a brief snippet of code on Friday which was my initial stab at a reverse HTTP proxy in JavaScript (using node.js).

Over the past couple of days I've tidied it up, added a command line parser, and made it flexible enough that it works for me.

My node reverse HTTP proxy is now both documented ( a little ) and available for further eyeballs.

Usage is pretty much:

$ node ./node-reverse-proxy.js --config ./path/to/config.file.js

The configuration file defines lists of virtual hosts along with the destination back-ends to proxy to - which is usually going to be a server running upon a high port on the loopback adapter, but might not be.

In addition to that we can perform rewrites such as:

/**
  * Handler for wildcard host: *.repository.steve.org.uk
  *
  */
'([^.]*).repository.steve.org.uk':
    {
        /**
         * Rewrites for static files - these will be handled via a
         * separate virtual host.
         */
        'rules': {
            '^/robots.txt':  'http://repository.steve.org.uk/robots.txt',
            '^/favicon.ico': 'http://repository.steve.org.uk/favicon.ico',
        },
     },

That says requests for http://chronicle.repository.steve.org.uk/robots.txt will be redirected to http://repository.steve.org.uk/robots.txt.
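To make the lookup concrete, here is a sketch of how such a table could be consulted. It is written in Python rather than JavaScript, and it is a guess at the matching logic, not the proxy's actual code: `find_redirect` is a hypothetical name, and whether the real proxy anchors its patterns is not stated in the post.

```python
import re

# The same virtual host and rules as the configuration above.
VHOSTS = {
    r"([^.]*).repository.steve.org.uk": {
        "rules": {
            r"^/robots.txt": "http://repository.steve.org.uk/robots.txt",
            r"^/favicon.ico": "http://repository.steve.org.uk/favicon.ico",
        },
    },
}


def find_redirect(host, path, vhosts=VHOSTS):
    """Return the redirect target for host + path, or None if no rule matches."""
    for host_pattern, config in vhosts.items():
        # Assumption: host patterns are matched from the start of the Host header.
        if not re.match(host_pattern, host):
            continue
        for path_pattern, target in config.get("rules", {}).items():
            if re.search(path_pattern, path):
                return target
    return None


hit = find_redirect("chronicle.repository.steve.org.uk", "/robots.txt")
miss = find_redirect("chronicle.repository.steve.org.uk", "/index.html")
other = find_redirect("example.org", "/robots.txt")
```

Requests with no matching rule (`miss`, `other`) would fall through to the normal back-end proxying.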

Alternatively we can invoke javascript for each request matching a pattern:

/**
  * static.steve.org.uk will mostly proxy to 127.0.0.1:1008
  * but files beneath /private/ have an IP-based ACL.
  */
'static.steve.org.uk':
    {
        host: 'localhost',
        port: '1008',
        'functions': {
            '/private': (function(orig_host, vhost, req, res) {
                var remote = req.connection.remoteAddress;
                if ( ( remote != "80.68.85.46" ) &&
                     ( remote != "82.41.51.252" ) &&
                     ( remote != "89.16.161.34" ) &&
                     ( remote != "89.16.161.98" ) )
                {
                    res.writeHead(403);
                    res.write( "Denied access to " + req.url + " from " + remote );
                    res.end();
                }
            }),
        }
    },

Fun stuff. It was live for my server, replacing Apache, for a few hours today. I need to add some trivial HTTP Basic-Auth handling and then it will go back.

Otherwise I hope it is vaguely useful to others, and that the provided examples explain things neatly.

ObQuote: "Only one thing alive with less than four legs can hear this frequency" - Superman.

 

It would be nice if we could record which files populate or read

It would be really neat if there were some tool which recorded which dotfiles an application read, used, or created.

As an example emacs uses .emacs, but won't create it. However firefox will create and fill ~/.mozilla if it isn't present, and links will create ~/.links2.

What would we do with that data? I'm not sure off the top of my head, but I think it is interesting to collect regardless. Perhaps a simple tool such as apt-file to download the data and let you search:

who-creates ~/.covers
who-creates ~/.dia
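A toy version of that lookup, in Python, might look like the following. The index contains only the three examples mentioned above; the data layout and the `who_creates` function are hypothetical, and a real tool would download a much larger index.

```python
# Hypothetical index: dotfile -> application, and whether it creates the file.
DOTFILE_INDEX = {
    "~/.emacs": {"application": "emacs", "creates": False},
    "~/.mozilla": {"application": "firefox", "creates": True},
    "~/.links2": {"application": "links", "creates": True},
}


def who_creates(path, index=DOTFILE_INDEX):
    """Return the application known to create `path`, or None if unknown."""
    entry = index.get(path)
    if entry and entry["creates"]:
        return entry["application"]
    return None
```

Here `who_creates("~/.mozilla")` names firefox, while `~/.emacs` returns nothing because emacs reads the file but never creates it.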

Obviously the simple use is to purge user-data when matching packages are removed - e.g. dpkg-posttrigger hook. But that's a potentially dangerous thing to do.

Anyway I'm just pondering - I expect that over time applications will start switching to using "centralised" settings such as ~/.gconf2 etc.

In the meantime I've started cleaning up ~/ on my own machines - things like ~/.spectemurc, ~/.grip, etc.

ObQuote: What a long sword. I like that in a man - Blood of the Samurai (Don't be tempted; awful film.)

 

A good cockerel always points north

I spent a while yesterday thinking over the software projects that I'm currently interested in. It is a reasonably short list.

At the time I just looked over the packages that I've got installed and the number of bugs. I'm a little disappointed to see that the bugfixes that I applied to GNU screen have been mostly ignored.

Still I have the day off work on Thursday and Friday this week and will probably spend them releasing the pending advisories I've got in my queue, and then fixing N bugs in a single package.

The alternative is to build a quick GPG-based mailing list manager.

I'd like a simple system which allowed users to subscribe, and only accepted GPG-signed mails. The subscriber could choose to receive their messages either signed (as-is) by the submitter or encrypted to them.

So to join you'd do something like this:

subscribe foo@example.org [encrypted]
--BEGIN PUBLIC KEY--
...
--END PUBLIC KEY--

There is the risk, with a large enough number of users, that a list could DoS the host if it had to encrypt each message to each subscriber. But if the submissions were validated as being signed by a user with a known key the risk should be minimal, unless there is a lot of traffic.

The cases are simple:

  • foo-subscribe => Add the user to the list, assuming valid key data found
  • foo-unsubscribe => Do the reverse.
  • foo:
    • If the message is signed accept and either mail to each recipient, or encrypt on a per-recipient basis.
    • If the message is not signed, or signed by a non-subscriber drop it.
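The three cases above can be sketched as a small state machine. This is Python, and it deliberately leaves out the GPG part: `signed_by` stands for "the address whose known key produced a valid signature, or None", and the class and method names are invented for the example.

```python
class GpgList:
    """Bookkeeping for a signed-mail-only list; real GPG checks are out of scope."""

    def __init__(self):
        # address -> "signed" (forward as-is) or "encrypted" (encrypt per recipient)
        self.subscribers = {}

    def subscribe(self, address, key_data, encrypted=False):
        # foo-subscribe: only add the user if key data was found in the mail.
        if not key_data:
            return False
        self.subscribers[address] = "encrypted" if encrypted else "signed"
        return True

    def unsubscribe(self, address):
        # foo-unsubscribe: do the reverse.
        return self.subscribers.pop(address, None) is not None

    def post(self, signed_by):
        # foo: drop unsigned mail and mail signed by a non-subscriber,
        # otherwise return each recipient with their chosen delivery mode.
        if signed_by not in self.subscribers:
            return []
        return sorted(self.subscribers.items())


lst = GpgList()
lst.subscribe("foo@example.org", "KEYDATA", encrypted=True)
lst.subscribe("bar@example.org", "KEYDATA")
rejected = lst.subscribe("baz@example.org", "")
deliveries = lst.post("bar@example.org")
```

Only the recipients marked "encrypted" cost a per-message encryption, which is where the load concern above comes from.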

There are some random hacks out there for this, including a mailman patch (did I mention how much I detest mailman yet today?) but nothing recent.

 

So here it is Merry Christmas

Lars Wirzenius recently released, and packaged for Debian, a simple script to make release tarballs. He calls it Unperish.

It makes me wonder how many other people use that kind of system?

Off the top of my head the only similar thing I can recall using is Brad Fitzpatrick's ShipIt - another modular/plugin-based system (Perl rather than Python this time.)

For my needs I tend to just write a Makefile which has a "dist" target, and then I have a simple script called "release". This runs:

  1. make dist / make release.
  2. creates a gpg signature of the release.
  3. scp's the resulting files to a remote source.

All this is configurable via a per-project .release file.
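The three steps are simple enough to sketch. This Python version only builds the command lines rather than running them, and it assumes a `key = value` layout for the .release file, since the post doesn't describe the real format. The key names, host and paths are made up for illustration.

```python
def parse_release_config(text):
    """Parse an assumed 'key = value' .release file, ignoring comments and blanks."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def release_commands(tarball, remote, make_target="dist"):
    """Return the commands for: build, sign, upload."""
    return [
        ["make", make_target],
        # Writes an ASCII-armoured detached signature to <tarball>.asc
        ["gpg", "--armor", "--detach-sign", tarball],
        ["scp", tarball, tarball + ".asc", remote],
    ]


config = parse_release_config(
    "# hypothetical .release file\n"
    "tarball = foo-1.0.tar.gz\n"
    "remote = example.org:/srv/releases/\n"
)
commands = release_commands(config["tarball"], config["remote"])
```

Each entry in `commands` could then be passed to `subprocess.run(..., check=True)` so that a failed build stops the upload.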

The configuration files are very simple, the script itself is almost trivial but being able to sit in a random project directory and have a new tarball on my webserver just by typing "release" is enormously useful.

There are times when I think I should make it a mini-project of its own, with the ability to auto-build Debian packages, etc. Other times I just think .. well it's a hell of a lot better than my previous ad-hoc solution.

At the very least I think I will make the cosmetic change of updating the script to run "make test" if there is a test/ or t/ directory inside the generated tarball.
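Checking the tarball for a test directory needs nothing beyond the standard library. A sketch in Python, assuming the usual layout where everything sits beneath a single top-level project directory (so the check looks at the first two path components). The helper names are made up for this example.

```python
import io
import os
import tarfile
import tempfile


def has_test_dir(tarball_path):
    """Return True if the tarball contains a t/ or test/ directory near the top."""
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            # Directory components: every part of a directory entry,
            # all but the last part of anything else.
            dirs = parts if member.isdir() else parts[:-1]
            # Top level, or one level down (e.g. foo-1.0/t/).
            if any(d in ("t", "test") for d in dirs[:2]):
                return True
    return False


def make_demo_tarball(names):
    """Build a throwaway tarball of empty files, purely for demonstration."""
    path = os.path.join(tempfile.mkdtemp(), "demo.tar.gz")
    with tarfile.open(path, "w:gz") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name), io.BytesIO(b""))
    return path


with_tests = make_demo_tarball(["foo-1.0/Makefile", "foo-1.0/t/basic.t"])
without_tests = make_demo_tarball(["foo-1.0/Makefile", "foo-1.0/src/test.c"])
```

The release script would run "make test" only when `has_test_dir` returns true for the freshly generated tarball.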

In real news - tomorrow I leave for a two week holiday with my partner's parents. Yesterday I got back from a night spent with her in York. The Bytemark staff night out. Lots of fun. Over too soon, but lots of fun.