David Baron's weblog: January 2006

Sunday 2006-01-22

The danger of extensions (16:25 -0800)

I've been concerned for a while about the quality of Firefox extensions, and I'd like to explain why.

To start, I think one of the original motivations for having an extensions system in Firefox was to reduce demands for feature additions that are only used by a small number of people or are experimental. I have no problem with this. What I'm concerned about is that extensions are being promoted to large numbers of users as one of the advantages of Firefox. I think this may come back to haunt the Mozilla community.

I'm under the impression that extensions are used quite commonly, perhaps even beyond the more technical users of Firefox. This means that problems with extensions change users' perception of the quality of Mozilla products as a whole. But extensions are not of the same quality as the applications they extend.

But what do I mean by quality? Everybody has different definitions. And that's actually the key point here. It could mean that the program does what it's supposed to do. Or that its interface is easy for users to understand. Or that it is free of security holes. Or that it doesn't use excessive amounts of memory, or leak memory over time. Or that it is fast. Or that it fits well with the conventions of the operating environment (Windows, MacOS, GNOME). Or many other things.

The Mozilla applications are developed in an environment with tools and conventions that lead to openness and accountability. This helps different people work on improving different measures of quality and thus improves user perception of overall quality. The source tree is accessible in many ways, people can easily monitor changes being made to it (while keeping in mind the reputations of those making the changes), and most of those changes have been discussed publicly on bugs and code reviewed by reviewers who are accountable for their reviews. It's generally clear what's changing, why it's changing, and who is responsible for those changes. And enough people (eyeballs) with these different ideas of quality pay attention to current changes that lapses below the expected quality in many of these areas are often caught quite quickly.

Furthermore, with the Mozilla source code all in one repository, it is possible to improve the level of quality in one of these areas or in a new area. It's easy to search the entire tree for uses of strcat that could cause buffer overflows. It's easy to search the entire tree for memory leaks caused by calls to addObserver not matched by calls to removeObserver. And so on, for many of the measures of quality that I listed above.
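To make the kind of tree-wide search described above concrete, here is a minimal sketch in Python. It is only a heuristic and rests on assumptions of mine: it counts textual occurrences of addObserver( and removeObserver( per file (the JavaScript spelling) and flags files with more adds than removes. The function names observer_imbalance and scan_tree and the choice of file suffixes are hypothetical. A flagged file is only a candidate for human review, not a confirmed leak.

```python
import re
from pathlib import Path

# Textual patterns for the JavaScript spelling of the calls. This is a
# heuristic: it does not parse the code or match observers to topics.
ADD = re.compile(r"\baddObserver\s*\(")
REMOVE = re.compile(r"\bremoveObserver\s*\(")


def observer_imbalance(text):
    """Return (number of addObserver calls) - (number of removeObserver calls)."""
    return len(ADD.findall(text)) - len(REMOVE.findall(text))


def scan_tree(root, suffixes=(".js", ".xul")):
    """Map each matching file under root that has more adds than removes
    to its imbalance. The suffix list is an assumption for illustration."""
    suspects = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            diff = observer_imbalance(path.read_text(errors="replace"))
            if diff > 0:
                suspects[str(path)] = diff
    return suspects


if __name__ == "__main__":
    sample = (
        "svc.addObserver(this, 'quit-application', false);\n"
        "svc.addObserver(this, 'profile-change', false);\n"
        "svc.removeObserver(this, 'quit-application');\n"
    )
    print(observer_imbalance(sample))  # prints 1
```

The point is less the script itself than the precondition: a check like this can only cover code that lives somewhere it can be searched in one pass.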

Extensions don't benefit from these community mechanisms. There's no central source code repository for all the extensions hosted on addons.mozilla.org. There's no general mechanism for monitoring all the changes. There's no way of knowing who reviewed the code of an extension, although I've heard there is a review process before extensions can be listed. (Review for what? But remember that formal pre-checkin review is only a small part of the review that happens for the Mozilla core and applications.)

Thus the pressures to improve extension quality come from fewer sources: the programming knowledge of just the author and the reviewer, and feedback about issues the extension's users know are related to the extension rather than the application (generally only its correct operation and user interface). As a result, many dimensions of quality will be missed. And many of these dimensions can be spoiled by one rotten apple. If a user has ten extensions installed, one of which crashes often and another of which leaks a lot of memory, then the user's perception of the application as a whole will be that it crashes a lot and leaks a lot of memory.

There are further problems with testing. Just as some people look over source code for a specific type of problem, other people test in specific areas, such as speed, memory use, various types of testing for security problems, or just basic functioning and usability. Extensions pose a different problem here: there's a combinatorial explosion of configurations to test. The person who cares enough about memory leaks to do a bit of memory leak testing might not have the leakiest extension installed. The person who tests for security bugs using mangleme might not have the extension with the most serious vulnerabilities installed.

And this problem gets even worse because extension versions are separate from application versions. There's no need to test the View Source window from Firefox 1.0.3 with the browser from Firefox 1.5. But users can use different versions of an extension with the same version of the browser, so each combination might get only a fraction of the total combined testing.

And it's worse still because extensions can interfere with each other. In the most obvious case, two extensions might try to add a button in the same place and cause broken user interface, if only because the buttons don't all fit anymore. But the same thing could happen with memory leaks, or slowdowns loading pages, or any of the other measures of quality. I doubt there's been much, if any, serious testing of memory leaks, performance, and many other important dimensions of quality across combinations of extensions.
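A little arithmetic shows how quickly the number of configurations grows. The sketch below assumes a simplified model that is mine, not a measurement: each of n extensions is either absent or installed at one of v versions, independently of the others, which gives (v + 1)^n possible configurations for a single browser version. The function name configurations is hypothetical.

```python
def configurations(num_extensions, versions_each):
    """Number of distinct install states under the simplified model:
    each extension is absent or present at one of `versions_each` versions."""
    return (versions_each + 1) ** num_extensions


for n in (5, 10, 20):
    print(n, configurations(n, 3))
# prints:
# 5 1024
# 10 1048576
# 20 1099511627776
```

Even with only three versions per extension, twenty extensions already give over a trillion combinations, so any one combination can expect almost no dedicated testing.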

This combinatorial explosion leads to large numbers of small problems. And large numbers of small problems are less likely to be noticed and fixed than small numbers of large problems.

I worry that these aren't just theoretical concerns. How many of the reports of huge memory leaks in Firefox 1.5 are because N of the M different versions/forks of AdBlock leak every page you visit? (I've been told N is nonzero, but haven't tested it for myself.) And I worry that the Mozilla community is staking too much of its communal reputation on something that the community as a whole has too little control over.

Saturday 2006-01-14

Please file good memory leak bugs, part 2 (09:45 -0800)

It didn't even occur to me when writing leak-gauge.pl that many Windows users wouldn't already have perl installed. I'm so used to having it available everywhere, since whenever I deal with a Windows machine it's as a development environment.

So I've now written a version of the same script in JavaScript and HTML, so that users don't need to go through the trouble of installing perl to use leak-gauge.pl. (Coincidentally, the early versions of leak-gauge.pl also didn't work on Windows; that's fixed now.)

The process of using this script is a little more complicated, since you need to either:

This script is also Mozilla-specific and must be downloaded to a local file (with .htm or .html extension) and granted additional privileges when run (something that's equivalent to installing software).

(Note that if file upload controls allowed script in the page to read the file, the elevated privileges this script needs would not be needed -- it's using them only to read a file in.)

Tuesday 2006-01-10

Please file good memory leak bugs (08:42 -0800)

Firefox and SeaMonkey trunk builds from 2006-01-06 or newer have logging code that should make it easier to file useful leak bug reports.

What do I mean by a useful leak bug report? The same two things that are critical for all bug reports:

  1. The bug report needs to have enough information that other people can see the bug on their own machine (as easily as possible).
  2. The bug report needs to describe a problem specific enough that it is possible to tell that the bug is fixed (as easily as possible).

Saying "I ran Firefox for 5 hours and it started using a lot of memory" isn't very useful. It doesn't meet either of the above criteria.

Leaks might occur only when loading a certain Web site, only when clicking a certain button on a certain Web site, only when opening a certain dialog, etc. Figuring out which Web page, which button, or which dialog is usually enough to meet (1) above. But it's sometimes not enough to meet (2). When the real bug is that "loading Web site X and moving the mouse pointer over the upper half of the page leaks", then "loading Web site X sometimes leaks" meets (1), but not (2). If the bug is that opening the view source window by operating the main menu using the keyboard (i.e., Alt-V, O) leaks, then a bug report that opening the view source window leaks meets neither (1) nor (2).

Since the particular thing that causes a leak can be one of these details, when filing leak bugs, it doesn't hurt to give very precise steps to see the bug. But they shouldn't be too much larger than necessary. You can make the list of steps shorter by trying to remove steps from the list: if the problem doesn't go away, then the step can be removed. Repeat until the list is simple enough that repeating the steps can be done quickly, and then file the bug (but see below about duplicates).
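The step-removal procedure above can be sketched as a small loop. This is an illustration under assumptions: in practice the check is a person repeating the steps and looking for the leak, so the reproduces callback here is a hypothetical stand-in, and the step names in the example are made up.

```python
from typing import Callable, List, Sequence


def minimize_steps(steps: Sequence[str],
                   reproduces: Callable[[List[str]], bool]) -> List[str]:
    """Drop steps one at a time, keeping each removal only if the
    problem still reproduces without that step."""
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if reproduces(candidate):
                # The problem did not go away, so this step is unnecessary.
                current = candidate
                changed = True
                break
    return current


if __name__ == "__main__":
    steps = [
        "open preferences",
        "load site X",
        "scroll down",
        "hover over top half of page",
        "open view source",
    ]

    # Stand-in for a manual check: pretend the leak needs exactly these two steps.
    def leaks(s):
        return "load site X" in s and "hover over top half of page" in s

    print(minimize_steps(steps, leaks))
    # prints ['load site X', 'hover over top half of page']
```

Each call to the check costs one run through the steps, which is why it pays to stop once the list is short enough to repeat quickly.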

Anyway, the point of this logging code is that it allows a log to be stored during normal use of the browser to see what leaks. If you download this perl script, follow the instructions right below the license header when starting the browser, and then run the perl script over the resulting log, you can see a summary of which pages had leaks of certain objects that are associated with many of the larger memory leaks in Mozilla. This does not give information about all memory leaks: the logging needed for that couldn't be in a release build. But it does give information about some of the most significant ones.

Once you see a Web page in the log, that doesn't mean it's time to file a bug. If you see a Web page in the log, try loading the page again and then just closing the browser, and see if the same leak happens. If it does, that's a useful bug. If it doesn't, try to remember what you were doing on the page, and see if you can get the leak to happen again. If you can, then you probably have a useful bug to file.

Those are useful bugs, with one catch. It may be hard to tell if the bug is a duplicate. If it's a site that a lot of users testing leaks will notice, then it's probably worth having a bug on file even without a simplified testcase, just as a duplicate-catcher. In general, there's not much need to have simplified testcases for leak bugs (as long as the steps to see the leak can be done quickly), where a simplified testcase means turning a Web page into the minimum set of HTML, CSS, JavaScript, etc., needed to cause the leak. However, the reason to want simplified testcases for leak bugs is that they're often needed to tell whether leaks on two (or a hundred) different Web pages are the same underlying leak bug. It also may be useful to have a saved copy of the Web page, perhaps simplified, in case the page changes in a way that makes the leak go away. (For example, the leak could be related to the JavaScript code associated with a particular banner advertisement.)

So, whether to file memory leak bugs without simplified testcases really depends on how many bugs accumulate and how quickly they're fixed. If people start filing bugs faster than they can be debugged and fixed, then simplified testcases will be needed to track duplicates. If the bugs are fixed quickly after they're filed, then simplified testcases won't be needed.

But right now, our biggest problem is that people complain that they use the browser for hours and it leaks a lot of memory. So if you have more specific leak bugs to file, file away, but watch out of the corner of your eye to make sure the filing doesn't get out of hand.

A few final notes: