Skip to content

Entries tagged "apache".

That friend promises his undying friendship if you would do him a small favour.

Perl & Apache?

Once upon a time, within the past year, I saw mention of a simpler version of mod_perl - an apache module which let you write code to run within the context of a persistent perl process.

However my DuckDuckGofu is weak, and I'm struggling to find this project.

Did I dream it, or could somebody tell me where it lives?

Dynamic Picture Frames

So I've been taking pictures recently. Lots of pictures.

Many times many images have been printed and hung upon my walls, and the price of frames is starting to become onerous.

I'd love to see some kind of "dynamic" picture wall - but the two alternatives I considered fail:

Metal & Magnets

Place a huge sheet of metal upon your wall. Then put wee magnets inside your frames.

Corkboard

Imagine a full wall that was paneled with what is essentially a large notice-board..

Both of these would look ugly; the metal one perhaps less so.

But the idea of having a wall which could have pictures mounted upon it, without having big nail holes if you rearranged and which could cope with dynamic repositioning and sizes is nice ..

Invent it for me? I'll buy one. Probably even two...

ObFilm: The Godfather

 

If you were a comic book character, what character would you be?

I've been overhauling the way that I am host a number of virtual websites upon my main box. Partly to increase security, and partly for a cleaner separation or roles, ownership, and control. (In general everything on my box is "mine", but some things are "ours"...)

After a fair amount of experimentation I decided that I wasn't willing or able to rewrite all my Apache mod_rewrite rules just yet. So my interim plan was to update each existing virtual host:

  • Add a dedicated user & group to run it under.
  • Launch it via a minimal server listening upon the loopback adapter.
  • Have Apache 2.x proxy through to it.
    • Expanding any mod_rewrite rules prior to the proxying.

To make it clear what the users were for I decided that every hosting-user would have an "s-" prefix. So the virtual host "static.steve.org.uk" was initially going to be served by the s-static user.

The thttpd configuration file would look like this, and would be located in /etc/thttpd/sites/static.steve.org.uk:

host=127.0.0.1
port=1008
dir=/home/www/static.steve.org.uk/htdocs/
chroot
user=s-static
throttles=/etc/thttpd/throttle.conf
logfile=/home/www/static.steve.org.uk/logs/thttpd.log
pidfile=/home/www/static.steve.org.uk/pid/file

(I wrote a trivial script to stop/start all the sites en mass, and removed the default thttpd init script, logrotation job, and similar things.)

How did I decide which port to run this instance under? By taking the UID of the user:

steve@skx:~$ id s-static
uid=1008(s-static) gid=1009(s-static) groups=1009(s-static)

With this in place I could then update the Apache configuration file from serving the site directly to merely proxying to the back-end server:

<VirtualHost *>
    ServerName  static.steve.org.uk

    # Proxy ACL
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>

    # Proxy directives
    ProxyPass          /   http://localhost:1008/
    ProxyPassReverse   /   http://localhost:1008/
    ProxyPreserveHost on
</VirtualHost>

So was that all there is to it? Sadly not. There were a couple of minor issues, some of which were:

cronjobs

I have various cron-jobs in my main steve account which previously updated blog indexes, etc. (I use namazu2 to make my blog searchable.)

I had to change the ownership of the existing indexes, the scripts themselves, and move the cronjob to the new s-blog user.

cross-user dependencies

I run a couple of sites which pull in content from other locations. For example a couple of list summaries, and archives. These are generally fed from a ~/.procmail snippet under my primary login.

Since my primary login no longer owns the web-tree it is no longer able to update things directly. Instead I had to duplicate a couple of subscriptions and move this work under the UID of the site-owner.

I'm no longer running apache

For a day or two I'd forgotten I was using the apache facility to include snippets in my site; such as links to my wishlist.

Since I'm not using Apache in the back-end server-parsed files no longer work. Happily I'm using a simple template-based setup for my main sites, so I updated the template-parser to understand "##include ./path/to/file". For example this source file produces my donation page.

The upshot is my "static" site is even more static, which is a good thing.

uploads are harder

Several of my domains host entirely static content which is generated on my main desktop machine, and then uploaded via rsync post-build.

I had to add some more accounts and configure SSH keys, then update the uploading routines/Makefiles appropriately. Not a major annoyance, but suddenly my sshd_config file has gone from "PermitUser steve,backup" to including many additional accounts.

The single biggest pain was handling my my mercurial repositories - overhauling that took a bit of creativity to ensure that nothing was broken for existing or new checkouts. I wish that a backport of mercurial-server was trivial because I'd love to be using that.

In general though watching the thttpd logs has been sufficient to spot problems. I had to tweak things a little to generate statistics properly, but otherwise all is good.

Why thttpd? Well small, lightweight, and the ability to run CGI scripts. Something missing from nginx for example.

I'm still aiming to remove apache2 from the front-end - it is mostly just a dumb proxy, but it does perform some ACL operations and expand mod_rewrite rules. I could port those to another engine .. but not today.

The most likely candidates are nginx, perlbal, or lighttpd - each of these should be capable of doing simple ACL checks, and performing mod_rewrite-like rules.

ObFilm: Mallrats

 

I am lightened, can we drop this?

As part of some house-keeping I've been checking over my systems and ensuring they're all tickity-boo for the past couple of days.

One thing that I'm getting increasingly tempted by is converting my kvm guest to a 64-bit system.

I've not quite sold myself on the prospect of what will be a fair amount of downtime, but I'm 90% there.

I do think that a lot of my setup needs an overhaul, for example:

  • Running all my websites under www-data is beginning to worry me.
  • Running services as root is beginning to make me more and more paranoid.

One possible plan is to wipe my system, and then restore data from backups. A perhaps saner approach is divide my guest into two smaller ones, and migrate services over one by one (e.g. website1, website2, .. websiteN, email, etc).

For the moment I've taken a complete dump of my existing guest, and I'm running it with an IP in the 10.0.0.0/24 range on my desktop. That's at least given me a clear idea of the amount of work involved.

I'm still a little unclear on how best to manage running N websites with the intention they'll each run under their own UID. I guess it comes down to having a few instances of nginx/lighttpd/apache and then proxy from *:80 to the actual back-end. Precisely which mixture of services to use is a little overwhelming. Though at some point soon I need to start enabling IPv6 support, and that changes things a little.

(Not least because nginx has no IPv6 support present in the Lenny release - I've got a backported package which I run on the Debian Administration website.)

It's possible I could hack mod_vhost_alias to redirect/proxy to a local port based upon the virtual hostname present in the request - that's pretty trivial and I've already done something similar for work purposes. Though something like that should presumably already exist? I would expect a map of some form:

example.org: 127.0.0.1:8080
example.net: 127.0.0.1:9090

That has to be about the minimum necessary information to make the decision; a pair of vhost name & local destination.

/me googles some..

Update

OK quick update I've added local users for some of my sites, and now have them running under thttpd.

skx:/etc/thttpd# ls -ltr /home/www/ | tail -n 4
drwxr-sr-x  4 s-static   s-static   4096 Jan 15 01:41 static.steve.org.uk
drwxr-sr-x  5 s-openid   s-openid   4096 Feb 16 21:31 openid.steve.org.uk
drwxr-sr-x  6 s-images   s-images   4096 Feb 16 21:52 images.steve.org.uk
drwxr-sr-x  5 s-packages s-packages 4096 Feb 16 22:03 packages.steve.org.uk

That seems to work well, with a small wrapper script to start N instances of thttpd instead of a single one. Minor issues are that I'm using mod_proxy to forward requests to the thtpd instances running upon the loopback - and it was initially logging 127.0.0.1 as the source IP. A quick patch later all is well.

I'll leave it running a couple of the simple sites for the next few days and see if it kills children. If it does I'll convert the rest.

Probably will aim to have nginx in front of thttpd, instead of Apache, but this way I don't have to worry about mod_rewrite rules just yet.

ObFilm: Cruel Intentions

 

This is the part where you tell me what matters is on the inside

Some technology evolves very quickly, for example the following things are used by probably 80% of readers of this page:

  • A web browser.
  • A mail client.
  • A webserver.

But other technology is stuck in the past and only sees laclustre updates and innovations (not that inovation is mandatory or automatically a good thing)

Right now I'm looking at my webserver logs, trying to see who is viewing my sites, where they came from, and what their favourite pie is.

In the free world we have the choice of awstats, webalizer, and visitors (possibly more that I'm unaware of). In the commercial world everybody and their dog uses Google's analytics.

On the face of it a web analysis package is trivial:

  • Read in some access.log files.
  • Process to some internal database representaqtion.
  • Generate static/dynamic HTML output from your intermediate form, optionally including graphs, images, and pie-charts.

If you add javascript-fu to each of your pages you can track page titles, exit links, screen resolutions, and other data to record too. (Though I guess thats a seperate problem; trying to merge that data in with the data you have in your access log without making nasty links like "GET /trackin.gif?x_res=800;y_res=600". Anyway I guess with cookies you could correlate reasonably carefully.)

In conclusion why are my web statistics so dull, boring, and less educational than I desire?

I'd be tempted to experiment, but I suspect this is a problem which has subtle issues I'm overlooking and requires an artistic slant to make pretty.

(ObLink: asql is my semi-solution to logfile analysis.)

ObFilm: Bound

 

You're making me live

Is there an existing system which will allow me to query Apache logfiles via an SQL string? (Without importing into a database first).

I've found the perl library SQL::YASL - but that has a couple of omissions which mean it isn't ideal for my task:

  • It doesn't understand DISTINCT
  • It doesn't understand COUNT
  • It doesn't understand SUM

Still it did allow me to write a simple shell which works nicely for simple cases:

SQL>LOAD /home/skx/hg/engaging/logs/access.log;
SQL>select path,size from requests where size > 10000;
path size 
/css/default.css 13813 
/js/prototype.js 71261 
/js/effects.js 37872 
/js/dragdrop.js 30645 
/js/controls.js 28980 
/js/slider.js 10403 
/view/messages 15447 
/view/messages 15447 
/recent/messages 25378 

It does mandate the use of a "WHERE" clause, but that was easily fixed with "WHERE 1=1". If I could just have support for count I could do near realtime interesting things...

Then again maybe I should just log directly and not worry about it. I certainly don't want to create my own SQL engine .. it just seems that Perl doesn't have a suitable library already made which is a bit of a shocker!