Steve Kemp's Blog Writings relating to Debian & Free Software

A busy few weeks - bah humbug

Sun, 25 Nov 2012 12:32:38 GMT

The following companies are amongst those showing Christmas Adverts on television before the start of December:

  • Tesco
  • Homebase
  • M&S
  • Waitrose
  • John Lewis

I will boycott these companies until next year.

In happier news I've spent the past week or two replacing the monitoring system that we use at work.

Our previous monitoring system had been struggling to keep up with the sheer number of tests it was being asked to process. This was partly because we carry out many ping-tests, ssh-tests, http-tests, dns-tests, etc. The other reason was that our monitoring system was a behemoth of threaded Ruby, all of which ran upon a single host. This made adding another monitoring host a complex undertaking.

The new solution uses a work-queue:

  • Tests to apply are parsed and inserted into a single, global, beanstalkd queue.
  • Workers continuously poll the queue for tests to execute. They then execute them, and alert on failures as appropriate.
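As a rough illustration of the parse-then-work split described above, here is a minimal sketch. It is not the actual code: Ruby's built-in Queue stands in for the beanstalkd server so that it runs on its own, and the job layout, the CHECKS table, and the method names enqueue_tests and run_worker are all hypothetical.

```ruby
require "json"

# Hypothetical checks keyed by test type; each returns true on success.
# Real checks would be ping, ssh, http, dns and so on.
CHECKS = {
  "always-pass" => ->(_target) { true },
  "always-fail" => ->(_target) { false }
}.freeze

# Parser side: serialise each test definition and push it onto the queue.
# Assumption: a plain in-process Queue stands in for the global beanstalkd queue.
def enqueue_tests(tests, queue)
  tests.each { |test| queue << JSON.generate(test) }
end

# Worker side: pull jobs, execute them, and record an alert for each failure.
# A real worker would keep polling; this sketch stops once the queue is empty.
def run_worker(queue, alerts)
  loop do
    job = JSON.parse(queue.pop(true)) # non-blocking pop raises ThreadError when empty
    check = CHECKS.fetch(job["type"])
    next if check.call(job["target"])

    alerts << "#{job['type']} failed for #{job['target']}"
  end
rescue ThreadError
  alerts
end
```

Because the workers only ever talk to the queue, adding capacity in a design like this should just mean starting another worker against the same queue, which is the property the old single-host system lacked.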

The code is open-source, written in Ruby, and available here:

I've completed the process of tidying up the code to the extent I'm happy with it, and I believe I've also abstracted away the work-specific pieces of the code.

That said, I'd not be surprised if it needs a few minor tweaks before it is useful for other people.


Comments On This Entry

[gravitar] Roberto

Submitted at 21:58:17 on 25 November 2012

One question, just out of curiosity: did you try already existing monitoring systems before implementing custodian? Why did you decide to deploy your own?

[author] Steve Kemp

Submitted at 22:30:40 on 25 November 2012

Custodian was designed to replace a previous in-house tool (the name is a synonym of the previous project's name).

Prior to that we'd looked at using other things but not found anything suitable, largely because of the size of the tests and the desire to have a "good" configuration file to specify things.

We use nagios/netflow for some monitoring, but they weren't even serious contenders for this role.

 

