Skip to content

Entries tagged "public bugfixing".

What can you do? Sparta will need sons.

Since I've ranted a little recently lets do another public bugfix. The last few times people seemed to like them, and writing things down helps me keep track of where I am, what I'm doing, and how soon it will be "beer o'clock".

So I looked over the release critical bug list, looking for things that might be easy to fix.

One bug jumped out at me:

I installed the package:

skx@gold:~$ apt-get install gnomad2
..
The following NEW packages will be installed
  gnomad2 libmtp7 libnjb5 libtagc0
..

Once I copied an .mp3 file to my home directory, with the .ogg suffix I got a segfault on startup:

skx@gold:~$ cp /home/music/Audio/RedDwarf-back-to-reality.mp3 ~/foo.ogg
skx@gold:~$ gnomad2
..
LIBMTP_Get_First_Device: No Devices Attached
PDE device NULL.
TagLib: Ogg::File::packet() -- Could not find the requested packet.
TagLib: Vorbis::File::read() - Could not find the Vorbis comment header.
[segfault]
skx@gold:~$

So I downloaded the source ("apt-get source gnomad2"), and the dependencies for rebuilding ("apt-get build-dep gnomad2"). This allowed me to rebuild it locally:

skx@gold:/tmp/gnomad2-2.9.1$ ./configure --enable-debug && make
skx@gold:/tmp/gnomad2-2.9.1$ cd src/

And now it can be run under GDB.

skx@gold:/tmp/gnomad2-2.9.1/src$ gdb ./gnomad2
GNU gdb 6.8-debian
...
(gdb) run
Starting program: /tmp/gnomad2-2.9.1/src/gnomad2
...
LIBMTP_Get_First_Device: No Devices Attached
PDE device NULL.
[New Thread 0x41dcd950 (LWP 23593)]
TagLib: Ogg::File::packet() -- Could not find the requested packet.
TagLib: Vorbis::File::read() - Could not find the Vorbis comment header.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x41dcd950 (LWP 23593)]
0x00007fc89e0340b0 in taglib_tag_artist () from /usr/lib/libtag_c.so.0
(gdb)

Interestingly the crash comes from a library, libtag_c.so.0. So either:

  • The bug is realy in the libtag module.
  • The gnomad2 package doesn't handle a failure case it should.

Time to guess which that is. Charitably we'll assume any library that segfaults will be quickly fixed, because it will have more users than a single program..

Moving on we can look at the gnomad2 source for mentions of taglib. Several mentions are found, but src/tagfile.c is clearly the correct place to look. That file contains the following code:

void
get_tag_for_file (metadata_t *meta)
{
  gchar *tmppath = filename_fromutf8(meta->path);
  TagLib_Tag *tag;
  const TagLib_AudioProperties *properties;
  TagLib_File *file = taglib_file_new(tmppath);

  if(file == NULL) {
    g_printf("could not open file %s", tmppath);
    g_free(tmppath);
    return;
  }
  g_free(tmppath);

  tag = taglib_file_tag(file);
  properties = taglib_file_audioproperties(file);

  gchar* artist = taglib_tag_artist(tag);
  ..

This looks like a great place to explore because opening a file to read the tags is most likely where the crash is going to be coming from.

Interestingly we see the code:

  • Calls taglib_file_new to get a handle of some kind relating to the file.
  • Tests for errors with that operation.
  • Calls taglib_file_tag to fetch tag information using the handle, and indirectly the file.
  • But then uses taglib_tag_artist to fetch the artist (?I guess?) without testing the handle previously obtained was valid.

Let us guess that the file opening is succeeding but that the tag structure fetched via taglib_file_tag is NULL - and this causes the crash.

A quick edit:

  g_free(tmppath);

  tag = taglib_file_tag(file);
  if(tag == NULL) {
    g_printf("tag was null");
    return;
  }

  properties = taglib_file_audioproperties(file);
  gchar* artist = taglib_tag_artist(tag);

Rebuild, and the segfault is gone. We have a winner. Now we just need to file a patch...

ObFilm: 300

 

Guess he wasn't too popular at the end, huh?

Previously I've posted a couple of running commentries describing how I've examined and gone about fixing a couple of bugs in Debian packages I use, enjoy, or have stumbled upon by accident.

Each of these commentries has resulted in a change or two to the affected software to make the bug vanish.

Fixing the bug is usually the hard part, but obviously it isn't the only thing that you need to do. While it is fun to have a locally fixed piece of software if you don't share it then the very next release will have the same bug again - and you'll spend your life doing nothing more than fixing the same bug again and again.

Generally the way that I report my fixes is by sending email to the Debian bug tracker - because I usually only try to fix bugs that I see reported already. Specificially I tend to only care about:

  • Bugs in packages that I maintain.
  • Bugs that affect me personally. I'm selfish.
  • Bugs in packages I use, even if they don't affect me.
  • Segfaults. Because segfaults and security issues often go hand in hand.
This entry got a little bit long .. click to read more

ObFilm: Ghostbusters 2

 

Don't you just hate loose ends?

Today I spent a while fixing some more segfault bugs. I guess that this work qualifies as either fixing RC bugs, or potential security bugs.

Anyway I did an NMU of libpam-tmpdir a while back to fix all but one of the open bugs against it.

I provided a patch for #461625 yelp: segfault while loading info documentation, which fixes the symptoms of bad info-parsing, and avoids the segfault.

I also looked into the #466771 busybox cpio: double free or corruption during cpio extraction of hardlinks - but it turns out that was already fixed in Sid.

Finally I found a segfault bug open against ftp:

To reproduce this bug run:

skx@gold:~$ ftp ftp.debian.org
220 saens.debian.org FTP server (vsftpd)
Name (ftp.debian.org:skx): anonymous
331 Please specify the password
Password: foo@bar.com
ftp> cd debian/doc
250 Directory successfully changed.
ftp> get dedication-2.2.cn.txt dedication-2.2.de.txt dedication-2.2.es.txt ..
local: dedication-2.2.de.txt remote: dedication-2.2.cn.txt
Segmentation fault

You need to repeat the arguments about 50 times. But keep adding more and more copies of the three files to the line until you get the crash.

It isn't interesting as a security issue as it is client side only; but as a trivially reproducable issue it becomes fun to solve.

Click to read the rest of the entry

I mailed the maintainer of FTP and said unless I heard differently I'd NMU and cleanup the package in a week.

All being well this entry will be nicely truncated in the RSS feeds as support for the <cut> tag was the main new feature in my previous upload of chronicle - the blog compiler I use/wrote/maintain.

ObQuote: Razor Blade Smile

 

Some people get by with a little understanding

Since my last example of fixing a bug received some interesting feedback (although I notice no upload of the package in question ..) we'll have another go.

Looking over my ~/.bash_history file one command I use multiple times a day is make. Happily GNU make has at least one interesting bug open:

I verified this bug by saving the Makefile in the report and running make:

skx@gold:~$ make
make: file.c:84: lookup_file: Assertion `*name != '\0'' failed.
Aborted

(OK so this isn't a segfault; but an assertion failure is just as bad. Honest!)

So I downloaded the source to make, and rebuilt it. This left me with a binary with debugging symbols. The execution was much more interesting this time round:

skx@gold:~$ ./make
*** glibc detected ***
  /home/skx/./make: double free or corruption (fasttop): 0x00000000006327b0 ***
======= Backtrace: =========
/lib/libc.so.6[0x2b273dbdd8a8]
/lib/libc.so.6(cfree+0x76)[0x2b273dbdf9b6]
/home/skx/./make[0x4120a5]
/home/skx/./make[0x4068ee]
/home/skx/./make[0x406fb2]
...
[snip mucho texto]

And once I'd allowed core-file creation ("ulimit -c 9999999") I found I had a core file to help debugging.

Running the unstripped version under gdb showed this:

(gdb) up
#5  0x00000000004120a5 in multi_glob (chain=0x1c, size=40) at read.c:3106
3106			    free (memname);

So it seems likely that this free is causing the abort. There are two simple things to do here:

  • Comment out the free() call - to see if the crash goes away (!)
  • Understand the code to see why this pointer might be causing us pain.

To get started I did the first of these: Commenting out the free() call did indeed fix the problem, or at least mask it (at the cost of a memory leak):

skx@gold:~$ ./make
make: *** No rule to make target `Erreur_Lexicale.o', needed by `compilateur'.  Stop.

So, now we need to go back to read.c and see why that free was causing problems.

The function containing the free() is "multi_glob". It has scary pointer magic in it, and it took me a lot of tracing to determine the source of the bug. In short we need to change this:

free (memname);

To this:

free (memname);
memname = 0;

Otherwise the memory is freed multiple times, (once each time through the loop in that function. See the source for details).

Patch mailed.

 

Listen to me when I'm telling you

So today I'm a little bit lazy and I've got the day off work. As my previous plan suggested I wanted to spend at least some of the day tackling semi-random bugs. Earlier I picked a victim: less.

less rocks, and I use it daily. I even wrote an introduction to less once upon a time.

So lets take a look at two bugs from the long-neglected pile. These two issues are basically the same:

They seem like simple ones to fix, with the same root cause. Here's an example if you want to play along at home:

 cp /dev/null testing
 gzip testing
 zless testing.gz

What do you see? I see this:

"testing.gz" may be a binary file.  See it anyway?

When I select "y" I see the raw binary of the compressed file.

So, we can reproduce it. Now to see why it happens. /bin/zless comes from the gzip package and is a simple shell script:

#!/bin/sh
# snipped a lot of text
LESSOPEN="|gzip -cdfq -- %s"; export LESSOPEN
exec less "$@"

So what happens if we run that?

$ LESSOPEN="|gzip -cdfq -- ~/testing.gz" /usr/bin/less ~/testing.gz
"/home/skx/testing.gz" may be a binary file.  See it anyway?

i.e. it fails in the same way. Interestingly this works just fine:

gzip -cdfq -- ~/testing.gz | less

So we've learnt something interesting and useful. We've learnt that when LESSOPEN is involved we get the bug. Which suggests we should "apt-get source less" and then "rgrep LESSOPEN ~/less-*/".

Doing so reveals the following function in filename.c:

	public char *
open_altfile(filename, pf, pfd)
	char *filename;
	int *pf;
	void **pfd;
{

/* code to test whether $LESSOPEN is set, and attempt to run the
   command if it is */

		/*
		 * Read one char to see if the pipe will produce any data.
		 * If it does, push the char back on the pipe.
		 */
		f = fileno(fd);
		SET_BINARY(f);

		if (read(f, &c, 1) != 1)
		{
			/*
			 * Pipe is empty.  This means there is no alt file.
			 */
			pclose(fd);
			return (NULL);
		}
		ch_ungetchar(c);
		*pfd = (void *) fd;
		*pf = f;
		return (save("-"));

That might not be digestible, but basically less runs the command specified in $LESSOPEN. If it may read a single character of output from that command it replaces the file it was going to read with the output of the command instead!

(i.e. Here less failed to read a single character, because our gzipped file was zero-bytes long! So instead it reverted to showing the binary gzipped file.)

So we have a solution: If we want this to work we merely remove the "read a single character test". I can't think of circumstance in which that would do the wrong thing, so I've submitted a patch to do that.

Bug(s) fixed.

Incidentally if you like these kind of "debuggin by example" posts, or hate them, do let me know. So I'll know whether to take notes next time or not..