Entries tagged "lazyweb".

That friend promises his undying friendship if you would do him a small favour.

Perl & Apache?

Once upon a time, within the past year, I saw mention of a simpler version of mod_perl - an apache module which let you write code to run within the context of a persistent perl process.

However my DuckDuckGofu is weak, and I'm struggling to find this project.

Did I dream it, or could somebody tell me where it lives?

Dynamic Picture Frames

So I've been taking pictures recently. Lots of pictures.

Many times many images have been printed and hung upon my walls, and the price of frames is starting to become onerous.

I'd love to see some kind of "dynamic" picture wall - but the two alternatives I considered fail:

Metal & Magnets

Place a huge sheet of metal upon your wall. Then put wee magnets inside your frames.

Corkboard

Imagine a full wall that was paneled with what is essentially a large notice-board..

Both of these would look ugly; the metal one perhaps less so.

But the idea of a wall upon which pictures could be mounted, without leaving big nail holes when you rearrange them, and which could cope with dynamic repositioning and sizes, is nice ..

Invent it for me? I'll buy one. Probably even two...

ObFilm: The Godfather

 

I go down with one helluva bang.

Right now I have a lot of music, and I primarily interact with it via playlists.

I have a cronjob that generates, and populates, ~/Playlists/ every night. I generate playlists on multiple criteria:

  • ~/Playlists/Artist/
  • ~/Playlists/Albums/
  • ~/Playlists/Titles/
  • ~/Playlists/Keywords/

Playlists for specific artists & albums are probably self-explanatory, but the others might be interesting.

For every unique songtitle I have a playlist. In most cases that means there is a playlist called "Song Title" having one entry. But, as an explicit example, I have a playlist called "Under The Bridge" with two entries:

All Saints/Under The Bridge.mp3
Red Hot Chili Peppers/Under The Bridge.mp3

Similarly I break each song title into words, and generate one playlist for each distinct word discovered.

As a matter of randomness I have:

Term  Count
Girl  83
Boy   31

(e.g. Songs containing "girl" in their title: "Madonna:Material Girl", "Amy Winehouse:Hey Little Rich Girl", "Garbage:Stupid Girl"..)
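A minimal sketch of the per-word playlist idea, assuming files are laid out as "Artist/Song Title.mp3" as in the example above. The function name and the word-splitting rule are illustrative guesses, not the actual cronjob:

```python
import re
from collections import defaultdict


def keyword_playlists(paths):
    """Map each lower-cased title word to the sorted files whose title contains it."""
    playlists = defaultdict(set)
    for path in paths:
        # Assumed layout: "Artist/Song Title.mp3"; the title is the file name minus its extension.
        title = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        for word in re.findall(r"[a-z0-9']+", title.lower()):
            playlists[word].add(path)
    return {word: sorted(files) for word, files in playlists.items()}


songs = [
    "All Saints/Under The Bridge.mp3",
    "Red Hot Chili Peppers/Under The Bridge.mp3",
    "Garbage/Stupid Girl.mp3",
]
playlists = keyword_playlists(songs)
print(playlists["bridge"])
```

Each key would become one file under ~/Playlists/Keywords/, and the length of each list gives the per-term counts.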

There are times when I want something specific and my playlist approach doesn't work. For example "All songs which are 2 minutes long, and happy". I guess the problem is working out which meta-data is worth searching/storing, and then working out how to jump from that data to a playlist.

Today, whilst walking into town to buy some new pies, I wondered "How many songs do I have that end in a chuckle, or laughter?"

If I wanted an "ends in laughter" playlist right now I'm screwed. Yet no system I've ever seen allows you to add that level of detail. (To be honest I'd probably give up even entering it.)

In conclusion, my music collection is vast and various, and dealing with it is sometimes harder than I'd like.

How do you handle the music on your computer(s)? (When it comes to mobile-music I just use an ipod telling it to play all, randomly. If a song comes on I don't like I just skip it.)

ObFilm: Lolita

 

You think we just work at a comic book store for our folks, huh?

I'm only a minimal MySQL user, but I've got a problem with a large table full of data and I'm hoping for tips on how to improve it.

Right now I have a table which looks like this:

CREATE TABLE `books` (
  `id` int(11) NOT NULL auto_increment,
  `owner` int(11) NOT NULL,
  `title` varchar(200) NOT NULL,
  ....
  PRIMARY KEY  (`id`),
  KEY( `owner`)
)  ;

This allows me to lookup all the BOOKS a USER has - because the user table has an ID and the books table has an owner attribute.

However I've got hundreds of users, and thousands of books. So I'm thinking I want to be able to find the list of books a user has.

Initially I thought I could use a view:

CREATE VIEW view_steve  AS select * FROM books WHERE owner=73

But that suffers from a problem - the view has discontinuous IDs coming from the books table, and I'd love to be able to work with them in steps of 1. (Also having to create a view for each user is an overhead I could live without. Perhaps some stored procedure magic is what I need?)

Is there a simple way that I can create a view/subtable which would allow me to return something like:

|id|book_id|owner|title      |....|
|0 |17     |Steve|Pies       |..  |
|1 |32     |Steve|Fly Fishing|..  |
|2 |21     |Steve|Smiles     |..  |
|3 |24     |Steve|Debian     |..  |

Where the "id" is a consecutive, incrementing number, such that "paging" becomes trivial?
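One way to get consecutive numbers without a per-user view is to assign them in application code while paging. The sketch below is only an illustration under assumptions: sqlite3 stands in for MySQL so that it runs anywhere, the table is cut down to the three columns shown above, and the function name is made up. The positions only stay static while none of that owner's earlier books are deleted.

```python
import sqlite3


def numbered_books(conn, owner, page, per_page):
    """Return (position, book_id, title) rows for one page of an owner's books."""
    offset = page * per_page
    rows = conn.execute(
        "SELECT id, title FROM books WHERE owner = ? ORDER BY id LIMIT ? OFFSET ?",
        (owner, per_page, offset),
    ).fetchall()
    # Positions are consecutive per owner, starting at 0, regardless of gaps in books.id.
    return [(offset + i, book_id, title) for i, (book_id, title) in enumerate(rows)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, owner INTEGER NOT NULL, title TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO books (id, owner, title) VALUES (?, ?, ?)",
    [(17, 73, "Pies"), (18, 12, "Other"), (21, 73, "Smiles"),
     (24, 73, "Debian"), (32, 73, "Fly Fishing")],
)
first_page = numbered_books(conn, owner=73, page=0, per_page=2)
second_page = numbered_books(conn, owner=73, page=1, per_page=2)
```

The existing KEY on owner keeps the WHERE clause cheap; the ordering by id is what makes the numbering repeatable between requests.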

ObQuote: The Lost Boys

Update: without going into details the requirement for known, static, and ideally consecutive identifiers is related to doing correct paging.

 

No, no, no, no.

I'm going to admit up front here that I'm pushing my luck, and that I anticipate the chances of success are minimal. But that aside .. There are a lot of people who read my entries, because of syndication, and I'm optimistic that somebody here in the UK will have a copy of the following three books they could send me:

  • Flash Gordon vol 3: Crisis on Citadel II
  • Flash Gordon vol 5: Citadels under attack
  • Flash Gordon vol 6: Citadels on Earth

(All three are cheap paperback pulp fiction novels from the 1980s written by Alex Raymond.)

If you have a copy of any of those three books, and are willing to part with them, then I'd love to hear from you. Either as a comment or via email.

I'm certainly expecting to pay for them up to around £5 for each volume.

Backstory: I read the first when I was 10-12, then mostly forgot about it.

A while back I remembered enjoying it and bought volumes 1, 2, 3, & 4 from an online store. I got screwed and volume 3 hasn't arrived, but possibly that will be rectified soon.

Here in the UK the last two volumes are either extremely rare or extremely in demand. Typically they seem to sell for £15-30 - I'm frustrated to not have the conclusion, but not desperate to spend so much money upon them, (been there, done that).

So if anybody has some or all of these books and can bear to part with them please do let me know.

</luck:pushing>

 

I got the poison

I've two video-related queries, which I'd be grateful if people could help me out with:

Mass Video Uploading

Is there any tool, or service, which will allow me to upload a random movie to multiple video-download sites? Specifically I'm curious to learn whether there is a facility to transcode as necessary a given input file and then upload to youtube, google video, and other sites as a one-step operation.

Mass Video Searching

Relating to that, is there a service which will allow me to search for videos with given titles/tags/keywords across multiple video-hosting networks?

Regarding the searching I see that YouTube has support for "OpenSearch", but Google's video hosting has neither that nor a sitemap.xml file: Irony Is ...

 

It's a lot like life

Assume for a moment that you have 148 hosts logging, via syslog-ng, to a central host. That host is recording all log entries into a MySQL database. Assume that each of these machines is producing a total of 4698816 lines per day.

(Crazy random numbers pulled from thin air; globviously).

Now the question: How do you process, read, or pay attention to those logs?

Here is what we've done so far:

syslog-ng

All the syslog-ng client machines are logging to a central machine, which inserts the records into a database.

This database may be queried using the php-syslog-ng script. Unfortunately this search is relatively slow, and the user-interface is appallingly bad: it allows only searches, with no view of the most recent logs, no auto-refreshing via AJAX, etc.

rss feeds

To remedy the slowness, and poor usability of the PHP front-end to the database I wrote a quick hack which produces RSS feeds via queries, against that same database, accessed via URIs such as:

  • http://example.com/feeds/DriveReady
  • http://example.com/feeds/host/host1

The first query returns an RSS feed of log entries containing the given term. The second shows all recent entries from the machine host1.

That works nicely for a fixed set of patterns, but the problem with this approach, and that of php-syslog-ng in general, is that it will only show you things that you look for - it won't volunteer trends, patterns, or news.

The fundamental problem is a lack of notion in either system of "recent messages worth reading" (on a global or per-machine basis).

To put that into perspective given a logfile from one host containing, say, 3740 lines there are only approximately 814 unique lines if you ignore the date + timestamp.

Reducing logentries by that amount (78% decrease) is a significant saving, but even so you wouldn't want to read 22% of our original 4698816 lines of logs as that is still over a million log-entries.
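The "unique lines if you ignore the date + timestamp" count can be reproduced with something like the following. It is a sketch that assumes the classic syslog "Mon DD HH:MM:SS" prefix; the sample lines are invented for illustration and the regular expression would need adjusting for other formats.

```python
import re

# Assumed classic syslog prefix, e.g. "Jan  5 10:00:01 "; adjust for other formats.
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+")


def unique_messages(lines):
    """Return the set of log lines with the leading date + timestamp removed."""
    return {TIMESTAMP.sub("", line.rstrip("\n")) for line in lines}


# Invented sample lines, purely for illustration.
sample = [
    "Jan  5 10:00:01 host1 sshd[1]: Did not receive identification string from 1.3.3.4\n",
    "Jan  5 10:00:09 host1 sshd[1]: Did not receive identification string from 1.3.3.4\n",
    "Jan  5 10:01:00 host1 cron[2]: job started\n",
]
unique = unique_messages(sample)

# The figures quoted above: 814 unique lines out of 3740, applied to 4698816 lines.
quoted_ratio = 814 / 3740
remaining = round(quoted_ratio * 4698816)
print(len(unique), f"{quoted_ratio:.0%}", remaining)
```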

I guess we could trim the results down further via a pipe through logcheck or similar, but I can't help thinking that still isn't going to give us enough interesting things to view.

To reiterate I would like to see:

  • per-machine anomalies.
  • global anomalies.

To that end I've been working on something, but I'm not too sure yet if it will go anywhere... In brief you take the logfiles and tokenize, then you record the token frequencies as groups within a given host's prior records. Unique pairings == logs you want to see.

(i.e. token frequency analysis on things like "<auth.info> yuling.example.com sshd[28427]: Did not receive identification string from 1.3.3.4")
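A rough sketch of how that idea might look, under several assumptions of mine: digits are masked so that pids and IP addresses collapse to the same token, "pairings" are taken to mean adjacent token pairs, and the class and function names are hypothetical. The second and third log lines are invented for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(line):
    """Split a log line into lower-cased tokens, masking digits so pids and IPs collapse."""
    masked = re.sub(r"\d+", "#", line.lower())
    return re.findall(r"[a-z#_.]+", masked)


class HostHistory:
    def __init__(self):
        # host -> Counter of adjacent token pairs seen in that host's prior records
        self.pair_counts = defaultdict(Counter)

    def observe(self, host, line):
        """Record the line and return the token pairs never seen before for this host."""
        tokens = tokenize(line)
        pairs = list(zip(tokens, tokens[1:]))
        seen = self.pair_counts[host]
        novel = [pair for pair in pairs if seen[pair] == 0]
        seen.update(pairs)
        return novel


history = HostHistory()
line = ("<auth.info> yuling.example.com sshd[28427]: "
        "Did not receive identification string from 1.3.3.4")
first = history.observe("yuling", line)
# Same message with a different pid and source address: nothing new to report.
repeat = history.observe("yuling", line.replace("28427", "28500").replace("1.3.3.4", "1.3.3.9"))
other = history.observe("yuling", "<auth.info> yuling.example.com sshd[28501]: Accepted publickey for skx")
```

A line whose `observe()` result is non-empty is a candidate for "worth reading"; a global view could be built the same way by using one shared history instead of one per host.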

What do other people do? There must be a huge market for this? Even amongst people who don't have more than 20 machines!

 

I just don't understand

Whilst I'm very pleased with my new segmented network setup, and the new machine, I'm extremely annoyed that I cannot get a couple of (graphical) Xen desktop guests up and running.

The initial idea was that I would set up a 64-bit installation of Etch and then communicate with it via VNC - xen-tools will do the necessary magic if you create your guest with "--role=gdm". Unfortunately it doesn't work.

When vncserver attempts to start upon an AMD64 host it dies with a segfault - meaning that I cannot create a scratch desktop environment to play with.

All of this works perfectly with a 32-bit guest, and that actually is pretty neat. It lets me create a fully virtualised, restorable, environment for working with flash/java/etc.

The bug was filed over three years ago as #276948, but there doesn't appear to be a solution.

Also, only on the amd64 guest, I'm seeing errors when I try to start X which mention things like "no such file or directory /dev/tty0". I've no idea what's going on there - though it could be a vt (virtual terminal) thing?

The upshot of all this is that I currently have fewer guests than I was expecting:

skx@gold:~/blog/data$ xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3114     2 r-----   1180.6
cfmaster.services.xen                      1      256     1 -b----      1.0
etch32.desktop.xen                         2      256     1 -b----      1.4
etch32.security-build.xen                  3      128     1 -b----      1.4
etch64.security-build.xen                  4      128     1 -b----      1.4
sarge32.security-build.xen                 5      128     1 -b----      1.0

 

You're making me live

Is there an existing system which will allow me to query Apache logfiles via an SQL string? (Without importing into a database first).

I've found the perl library SQL::YASL - but that has a couple of omissions which mean it isn't ideal for my task:

  • It doesn't understand DISTINCT
  • It doesn't understand COUNT
  • It doesn't understand SUM

Still it did allow me to write a simple shell which works nicely for simple cases:

SQL>LOAD /home/skx/hg/engaging/logs/access.log;
SQL>select path,size from requests where size > 10000;
path size 
/css/default.css 13813 
/js/prototype.js 71261 
/js/effects.js 37872 
/js/dragdrop.js 30645 
/js/controls.js 28980 
/js/slider.js 10403 
/view/messages 15447 
/view/messages 15447 
/recent/messages 25378 

It does mandate the use of a "WHERE" clause, but that was easily fixed with "WHERE 1=1". If I could just have support for count I could do near realtime interesting things...
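For comparison, here is a sketch of a different approach from SQL::YASL: parse the log and load it into a throwaway in-memory SQLite database. That is still technically "importing into a database", but it needs no server and gives COUNT, SUM and DISTINCT for free. The regular expression assumes Apache's common/combined log format, the table and column names mirror the session above, and the sample lines are invented for illustration.

```python
import re
import sqlite3

# Assumed Apache common/combined log format; only path, status and size are kept.
LOG_LINE = re.compile(r'"(?:[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)')


def load_requests(lines):
    """Parse access-log lines into an in-memory table requests(path, status, size)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE requests (path TEXT, status INTEGER, size INTEGER)")
    for line in lines:
        match = LOG_LINE.search(line)
        if match:
            size = 0 if match["size"] == "-" else int(match["size"])
            conn.execute(
                "INSERT INTO requests VALUES (?, ?, ?)",
                (match["path"], int(match["status"]), size),
            )
    return conn


# Invented sample lines, purely for illustration.
sample = [
    '127.0.0.1 - - [10/Oct/2007:13:55:36 +0000] "GET /view/messages HTTP/1.1" 200 15447',
    '127.0.0.1 - - [10/Oct/2007:13:55:40 +0000] "GET /view/messages HTTP/1.1" 200 15447',
    '127.0.0.1 - - [10/Oct/2007:13:55:41 +0000] "GET /js/slider.js HTTP/1.1" 200 10403',
]
conn = load_requests(sample)
totals = conn.execute(
    "SELECT path, COUNT(*), SUM(size) FROM requests GROUP BY path ORDER BY path"
).fetchall()
```

Re-reading the file on every query is wasteful for large logs, so this only really suits ad-hoc questions rather than anything near realtime.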

Then again maybe I should just log directly and not worry about it. I certainly don't want to create my own SQL engine .. it just seems that Perl doesn't have a suitable library already made which is a bit of a shocker!

 

Open your eyes, look up to the skies and see

If you have a (public) revision controlled ~/bin/, or bash/shell scripts I'd love to see them. Feel free to post links to your repositories as comments.

I'm certain there are some great tools and utilities out there which I could be using. Right now the only external thing I'm using is Martin Krafft's pub script. I don't use it often, but it is very neat and handy when I do want it. (Something that I'd never have considered writing myself, which suggests there are many more gems I'm missing!)

In other news my migration to mercurial is going extremely well, with only minimal downtime. Downtime for services really comes about because I have several websites which are powered entirely with a CVS checkout of remote repositories, so the process looks a little like this:

  • Convert CVS repository to hg.
  • Archive "live" CVS checkout from the server.
  • Move the local CVS checkout somewhere temporary.
  • Checkout from the new mercurial repository.
  • Fix any broken symlinks.
  • Do a recursive diff to make sure there are no unexpected changes.
  • Remove the previously archived local CVS checkout
  • Done!

 

No, I don't want your number

I'm still in the middle of a quandary with regards to revision control.

90% of my open code is hosted via CVS at a central site.

I wish to migrate away from CVS in the very near future, and having ummed and ahhed for a while I've picked mercurial as my system of choice. There is extensive documentation, and it does everything I believe I need.

The close-runner was git, but on balance I've decided to choose mercurial as it wins in a few respects.

Now the plan. I have two options:

  • Leave each project in one central site.
  • Migrate project $foo to its own location.

e.g. My xen-tools could be hosted at mercurial.xen-tools.org, my blog compiler could live at mercurial.steve.org.uk.

Alternatively I could just leave the one site in place, ignoring the fact that the domain name is now inappropriate.

The problem? I can't decide which approach to go for. Both have plusses and minuses.

Suggestions or rationales welcome - but no holy wars on why any particular revision control system is best...

I guess ultimately it matters little, and short of mass-editing links it's 50/50.

 

I love this hive employee

Russell Coker wants something to save and restore file permissions en masse.

That exists already:

apt-get install acl

Once installed you can dump the filesystem permissions of, for example, /etc/ recursively with this:

 getfacl -R  /etc > orig.perms

Want to see what is different? First change something:

steve@steve:~$ sudo chmod 0 /etc/motd

Now see what would be restored:

setfacl --test -R --restore=./orig.perms /etc | grep -v "\*,\*"
etc/motd : u::rw-,g::r--,o::r--,*

Finally let's make it do the restoration:

steve:/# setfacl -R --restore=./orig.perms /etc

Job done.

 

For you the sun will be shining

Thanks to the people who commented on my post about a decent apt cacher, it was good to see that I'm not alone.

Thanks to RobertH for recommending the new tool acng - I've not used it yet, instead I gave it a quick look and reported a potentially serious bug. Hopefully that'll be fixed in the next release.

In the meantime apt-cacher actually appears to be holding up quite nicely and the nice HTML report it generates is cute!

Now onto the next challenge...

I would like some kind of tool to convert a random hierarchy of images (jpg) into a small gallery. (Utterly non-dynamic - but ideally with tagging support and RSS feeds).

There seem to be a plethora of options to the problem, surprisingly many of them involving Python ..

If anybody has any pointers I'd appreciate a link.

For reference my current galleries tend to look like this - warning fluffy animals!

Using "apt-cache search static gallery" I find three programs:

bins - Very heavyweight. Unattractive.

photon - Pretty. Requires GIMP for creating thumbnails - unsuitable for my lightweight webhost.

jigl - Looks great. Does 90% of what I want - specifically misses tags & rss.

 

Are you talking to me?

My GNOME desktop is broken upon my primary machine, and it has taken me too long to get it sorted out.

Short version: metacity will not run:

skx@vain:~$ metacity
metacity: symbol lookup error: /usr/lib/libgthread-2.0.so.0: undefined symbol: g_thread_gettime

The .so file referenced is a symlink to libgthread-2.0.so.0.1200.13, and using nm I can see there are no symbols listed:

skx@vain:~$ nm /usr/lib/libgthread-2.0.so.0.1200.13
nm: /usr/lib/libgthread-2.0.so.0.1200.13: no symbols

That seems weird to me, but libraries are mysterious beasts, so I might be expecting this behaviour?

Anyway dpkg claims this file is installed by libglib2.0-0, and the package hasn't had an upload since July 17th, so I can't believe this is the reason for the recent breakage (Even given that I don't logout often..)

Reinstalling both packages (metacity + libglib2.0-0) has failed to fix the problem so I'm lost.

Right now I'm running GNOME with a different window manager, icewm, via a ~/.gnome2/session file:

gnome-wm --default-wm /usr/bin/icewm-gnome --sm-client-id default0

This works almost perfectly - it is better than metacity in the sense that new windows don't overlap existing ones if there is spare screen space, but worse in that alt-TAB shows two windows "Top extended Edge Panel" and "Bottom Extended Edge Panel" - which I don't need/want to see.

I'd be happy to stay with IceWM if I could fix those two problems, but I'd love to know why metacity is broken, and how I can fix it. I can't see any obvious bug reports - and I'm not 100% certain that the gthread package is the source of the error...

Any suggestions welcome.

ii  metacity       1:2.18.5-1     A lightweight GTK2 based Window Manager
ii  libglib2.0-0   2.12.13-1      The GLib library of C routines

 

Now some men like the fishing

Xen Migration

This afternoon I mostly migrated Xen guests from their old host to their new. (As part of an upgrade of facilities. Upgrading in place would have been much fiddlier and more annoying!)

The migration took almost three hours, which was longer than anticipated but shorter than I'd feared. In the future I'll know to do it differently, but I managed to script it fairly well after the first couple were done manually.

Everything appears to be working correctly so I will soon nip out for some high quality beer.

Xen Help?

One thing that I wanted to do with the new host was track bandwidth usage upon a per-guest basis.

This should be possible with something like vnstat - however solutions counting traffic by interface name are not a good mesh with Xen - since by default a guest will have an interface with a name like 'vif20.0' - and no means of mapping that to a specific guest.

Each of my guests has been allocated three IPs which are defined like this in the Xen configuration file:

vif = [ 'ip=1.2.3.4 1.2.3.5 1.2.3.6' ]

This works perfectly.

This also works:

vif = [ 'ip=1.2.3.4,vifname=foo 1.2.3.5 1.2.3.6' ]

Unfortunately anything else I've tried to give each IP a static interface name fails. I've seen reports of this online but no solutions.

Given a configuration file like this the Xen guest doesn't receive any traffic upon the second + third address:

vif = [ 'ip=1.2.3.4,vifname=foo1',
        'ip=1.2.3.5,vifname=foo2',
        'ip=1.2.3.6,vifname=foo3' ]

Any suggestions welcome.

 

She said she'd teach me 'bout voodoo

So I've been very happy with exaile - the media player - for the past week or so.

I think I'm going to switch to it full time.

The "random play" is surprisingly random. Despite listening to music 24x7 I'm finding myself hearing new music. I can only conclude that xmms and xmms2 have poor random functionality ..

The bigger issue is the handling of plugins. How do plugins get loaded? Via an external website.

You do the pointy-clicky dance with the user-interface, and the system downloads arbitrary code from exaile.org, installs it into ~/.exaile/plugins and executes it.

Double-plus ungood.

download_url = "http://www.exaile.org/plugins/plugins.py?version=%s&plugin=%s" \
    % (self.app.get_plugin_location(), file)
xlmisc.log('Downloading %s from %s' % (file, download_url))

Let us hope they never lose control of that domain, (and never implement automatic plugin updates) otherwise all current users will hit the site, be persuaded there are newer plugins available and be compromised en masse...

In other news, even with my planet-searching script, I cannot find the blog entry I wanted to refer people to. It involved people looking pretty and acting miserable. Possibly on buses?

 

Other things just make you swear and curse

I find myself in need of a simple "blogging system" for a small non-dynamic site I'm putting together.

In brief I want to be able to put simple text files into "blog/", and have static HTML files built from them, with the most recent N being included in an index - and each one individually linked to.

At a push I could just read "entries/*.blog", then write a perl script to extract a date + title and code it myself - but I'm sure such a thing must already exist? I vaguely remember people using debian/changelog files as blogs a while back - that seems similar?
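For what it's worth, the "code it myself" fallback is fairly small. The sketch below is in Python rather than perl, and it assumes a hypothetical entry format that the post does not specify: the first line is the title, the second an ISO date (YYYY-MM-DD), and the rest is the body. A file with fewer than two lines would make it fail, so treat it as a starting point only.

```python
import html
from pathlib import Path


def build_blog(src_dir, out_dir, recent=10):
    """Render every *.blog file to HTML and write an index of the most recent entries."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    entries = []
    for path in Path(src_dir).glob("*.blog"):
        # Hypothetical format: line 1 = title, line 2 = ISO date, remaining lines = body.
        title, date, *body = path.read_text().splitlines()
        page = f"{path.stem}.html"
        text = html.escape("\n".join(body)).strip()
        (out / page).write_text(
            f"<h1>{html.escape(title)}</h1>\n<p>{html.escape(date)}</p>\n<pre>{text}</pre>\n"
        )
        entries.append((date, title, page))
    entries.sort(reverse=True)  # ISO dates sort correctly as plain strings
    links = "\n".join(
        f'<li>{html.escape(date)}: <a href="{page}">{html.escape(title)}</a></li>'
        for date, title, page in entries[:recent]
    )
    (out / "index.html").write_text(f"<ul>\n{links}\n</ul>\n")
    return [page for _, _, page in entries[:recent]]
```

Calling `build_blog("entries", "blog", recent=10)` would write one page per entry plus an index.html linking to the ten newest.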

Update: NanoBlogger it is.