Friday, November 28, 2008

Coders, Unit Tests, and Testers

On the QAGuild group on LinkedIn, Prasad Narayan asked "Does the Dev team in your organization indulge in Unit Testing? I would appreciate some details, if the answer is in the affirmative."

Here's my response:

I've worked in plenty of teams where the coders have written unit tests. Typically at organisations which explicitly care about coding; less so at banks, service or entertainment organisations. On a reasonable proportion of those teams, and most especially on agile teams, the coders have written very large numbers of unit tests that act as a scaffold for the code (that is, very large compared with the expectations of teams that don't use unit tests, so perhaps this is tautological). More than a simple framework, the tests also act as a way to frame thinking about the code, to consider it before it is made, and to experiment with it when it is under construction.

Such teams are (in my experience) universally proud of their unit tests, and actively show them off. Their code tends to be better, too. When I've worked with teams who are shy of showing me their unit tests, and shy of letting me review them, then (in my experience) the code is universally duff. I'd rather work in a team that has no unit tests than a team that says it does, but won't show me (as a project member who is interested in testing) the tests.

I have worked with teams that write unit tests, but don't run them. This sounds bizarre, but tends to be a problem that creeps up on teams. I've seen it as a result of commenting out unit tests that break because of a non-code-related change (without replacing the test). Be aware also of unit tests that don't test anything important, or which pass with or without the code in place. I have heard of (but not worked on) teams where the testers wrote all the unit tests, and the coders wrote the code. I guess I'd hope that the two groups work closely enough that tests and code could be written together.
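
As an aside, here's the sort of thing I mean by a test that passes with or without the code in place – a minimal, hypothetical sketch (Python, names invented for illustration). It drives a mock and then asserts against that same mock, so the production code is never imported, called, or checked:

    import unittest
    from unittest import mock

    class TestSaveOrder(unittest.TestCase):
        # This test exercises only its own mock; the (hypothetical) production
        # save_order() is never touched, so the test passes whether that code
        # exists, works, or has been deleted.
        def test_save_hits_the_database(self):
            db = mock.Mock()
            db.save({"id": 1})
            db.save.assert_called_once_with({"id": 1})

    if __name__ == "__main__":
        unittest.main()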

It's all too easy to let good unit testing and the resultant relatively-clean code lull one into a false sense of security about the viability of the system as a whole. If I work with a team that is using lots of unit tests, I (very broadly) take this approach:

  • reviewing the existing unit tests

  • being sure they're run regularly

  • trying not to duplicate them too much with my own / the customer's / other teams' confirmatory scripted tests

  • using exploratory / experimental / diagnostic approaches to pick up and dig into all those unexpected risks and surprises

  • working closely with the coders to enhance, streamline, and otherwise improve their unit tests

Monday, November 24, 2008

Exploration

"Do not go where the path may lead; go instead where there is no path and leave a trail"
Ralph Waldo Emerson

Saturday, November 22, 2008

Can you start using Exploratory Testing without needing an expert?

On Software Testing Club, Anna Baik asked "Can you *bootstrap* your team into using a more exploratory testing approach without needing in-house expertise? Or are you likely to have problems unless you have either a consultant or an already experienced exploratory tester on staff?", which struck me as a reasonable question. Here's my answer (also posted as a reply on Anna's blog, mostly).

You can do exploratory testing without an experienced exploratory tester. Indeed, that's what all exploratory testers do with a new system / technology / customer etc.; start from a position of ignorance and get exploring.

There are a couple of ideas that I'd recommend keeping in mind:
  • one of the things you'll be doing is gaining experience, and (through reflection) expertise. This is part of all exploration, but it's particularly true when getting going on something new. Expertise will help you find more/better information – and you'll find fewer/poorer without expertise – but it takes time to build. However useful it is, it is neither necessary, nor sufficient, and even expert teams explore better with a newbie in the numbers. This is because...
  • if a few of you are exploring, there will be great diversity in your approaches. Learn from each other - and try to take those lessons in a way that doesn't flatten that diversity.
With those thoughts, here are a couple of recipes. Adapt as you see fit. For both, you'll need to set aside some time. Regard this time as a gamble, and assess its value when you're done. If I was doing this, I'd prefer to work in a group (so for a 3 person-hour recipe, schedule 2 people for 90 minutes). I might assess value by looking at whether I'd changed my bug rate, or found any particularly useful information / nasty bugs.

1        Exploring via Diagnosis: 3 person-hours
  • Pick out some current bugs that aren't clear, or that aren't reproducible. Doesn't matter if they're in your bug tracking system or not, but they should be already known, and not yet fixed.
  • Explore those bugs, seeking information that will clarify them or make them more reliably reproducible. Keep track of your activity.
  • Review what you've done, collate the information gained. Log new bugs, update clarified bugs.
  • If you can generalise from the investigative approaches you've used, then do so.
  • Tell the whole team your results.
  • Schedule another 3 hours on a different day and repeat!
2        The Usual Suspects: 2 person-hours + 10 minutes preparation
  • Spend 10 minutes writing down lots of different ways that bugs manifest / get into your software (Use any or all of cause, effect, location, conditions, etc.). Aim for diversity and volume, not completeness or clarity. This might be fun as a whole-team exercise.
  • Leaving enough time for the review+generalise+share steps at the end, split the remaining time in two.
  • In one half of the time, pick out problems that you've not yet seen in this release, and look for them. Keep track of your activity.
  • In the other half, pick out places that haven't yet seen many problems, and look in those for problems. Keep track of your activity.
  • Review, collate, log, update.
  • Generalise and tell the whole team your results.
  • Schedule another chunk of time on a different day and repeat!
  • Make your trigger list publicly accessible. Invite people to contribute, share, refine etc.
That said, ET is a skilled approach, and it's easier to get those skills into a team with a bit of reading / taking a class / involving a coaching consultant. There are plenty of sources around about getting started with exploration. Niel vanEeden and I wrote a paper called Adventures in Session-Based Testing which may help. It's here: http://www.workroom-productions.com/papers.html. For pithy heuristics, Michael Bolton has recently blogged the results of a fine EuroSTAR workshop here, and you'll also want to check out Elisabeth Hendrickson's cheat sheet, James Bach's mnemonics, Jon Bach's questioning approach, James Whittaker's tours (hopefully turning up on his blog soon).

If you have a favourite bootstrapping-into-exploration source, post it here or on Anna's original posting.

Note - I have an upcoming, no frills, public class on Exploratory Testing, in London, December 8-9. Lots of exercises, discussions, and learning by example.

Thursday, October 30, 2008

Checklists

A regular correspondent (hello, thanks for triggering this) asked me about checklists and testing. I had a half-written blog entry on some checklist rules of thumb, and shared it with him. Here it is:

Personal
  • Checklists I make are typically more useful than checklists I get. I expect this is more closely related to ownership than to quality.
  • I tend to do items at the top first, so ordering may become important even if it isn't meaningful.
  • I can't take in more than around 20 items on a list without giving it some structure. To help make a list comprehensible, I use outliners, item codes etc, and group items into categories.
General
  • Laundry-type lists are different from shopping-type lists. Laundry lists tend to be set up as general resources: they tend to be long, many items don't apply to current circumstances (so one picks a few), and they can be inspiring. Shopping lists are often made for a specific need and discarded afterwards; one tries to cover everything on the list. I can use each type to enhance the other, but it's generally a bad thing if someone else uses one of my laundry lists as a shopping list, and vice versa. Testing Example: "All my charters" is a laundry-type list. "All my charters for today" is a shopping-type list.
  • The value given to a laundry-type list can come from an assumption that it is exhaustive, and that its items are mutually exclusive. Few lists are either – there are often missing items, and existing items overlap. Testing example: Lists of non-functional testing qualities. The real value in this laundry-type list is often that it inspires a shopping-type list.
  • When categorising, it's important not to use categories as proxy items. Even if the list is exhaustive, many items can be re-categorised – so doing none of, or all of, a category can be more arbitrary than it seems. Testing example: charters grouped in a hierarchy.
My correspondent indicated I might be interested in a not-so-recent New Yorker article by Atul Gawande about checklists in medicine. Surfacing briefly into the rolling boil of tester blogs, it turns out that the article has triggered a gentle meme-ripple through the industry.

Gawande's article describes shopping-type lists, to aid memory and set out minimum requirements. It describes how those lists were used by medical professionals to help them do their jobs more reliably. The checklists covered the simple stuff, and called for no judgement or skill in their use (of course, plenty of skill is needed in their construction, and in following whatever comes next to the little tickety box). The results were impressive – but just as impressive was that the medics actually used the simple things.

It seems that the people involved were responsible for making their own lists (perhaps collectively rather than individually) and also for finding out if the lists were working. They were supported at multiple levels – nurses checked that the checklist was in use and the boxes ticked off, executives made sure that necessary supplies were available for the tasks on the list – so that there were few reasons not to actively use the checklist.

I'll add another note to my 'General' list above:
  • When you're doing a difficult job under pressure, checklists help you concentrate on the job, by allowing you to expend less attention on avoiding mistakes. Testing Example: a list of possible input means (type, mouse, paste, drag in, cause default, undo delete, refresh).
The trick in this counter-intuitive heuristic is the difference between concentrate on, and do. A checklist can let your mind work more freely because the standard stuff isn't going to be forgotten. Indeed, I make shopping lists to go shopping with so I can multitask (for which read listen to The Goons on my iPod).

The article doesn't deal with two important ideas:
  • Change: when and why do items come off a checklist (important for shopping-type lists).
  • Use: what kinds of situation are most amenable to lists.
Aside from recommending regular reviews, I have nothing to say here about changing checklists.
Checklists generally help in situations which are:
      • well-known
      • busy (in the sense of being dense with stimulus)
It's clear that the massive confirmatory unit tests (and 'acceptance' tests) that characterise agile development can be seen as shopping-type lists, and are all the more powerful for it. The subject is well known (and the tests describe that knowledge) and the environment busy (in the sense of very many tests being run in quick succession). As lists, they help exactly because they allow one to expend less attention on mistakes. The great (though arguable) strengths of massive confirmatory tests are, however, a special case.

Software development is certainly busy, but much of it is not all that well known. From a test point of view, often we're looking for the unexpected. From an engineering point of view, it's often hard to know what is reliably effective, sufficient and harm-free. Partly as a consequence of this, we tend to start out with laundry-type lists rather than shopping-type lists.

To pick out a few busy+well-known areas specific to testing, one might look at test environment setup, and the list of information on the top of a session form. Both these have a kinship with pre-flight checklists, and if you're not already checklisting in these situations, I expect you would find it valuable.

However helpful checklists are, they are frequently resisted as 'dumbing-down' a skilled task. As one who has resisted in the past, and who is likely to resist in the future – I feel this is an entirely fair criticism. Perhaps the best bet is to take an approach similar to that taken by Peter Pronovost, the article's protagonist:

  1. get whoever is making the {mistakes you're trying to avoid} to put together their own mnemonic/minimum standard list

  2. get them to measure themselves with and without the list in use

  3. provide strategic support to help ensure that there's no practical reason why something on the list can't be done, and tactical support to help ensure that the list is actually used, and used honestly.

Monday, October 13, 2008

Applied Improvisation

I just got back from the "First in a series of Applied Improvisation Network (AIN) London Events", and thought I'd share some impressions.

The AIN describes itself as "Spreading the Transforming Power of Improvisation". At this particular event, AIN founder Paul Jackson was going to "use improvisation activity to introduce a business theme ... [using] complexity/emergence as the business theme example". I decided to go along because it sounded fun. Re-reading this, I wonder if perhaps my taste for fun has become a little over-sophisticated. Never mind.

The evening was pretty much as described. Just under a dozen people turned up to the usual oddly-shaped room in a re-purposed building. We talked, interacted as individuals and as a group in a set of well-structured exercises, and pushed off for a pint/meal. All was fine and dandy, and I'll go again. I found the exercises interesting, and may adapt* some for my own consultancy and groups - LEWT people, expect to gather in self-selecting groups sometime soon.

However, in this evening's exercises, there was a frustrating focus on the game over its content. I was reminded of peer events I have attended which degenerate (and I mean degenerate) into good teachers swapping their favourite lessons. Enjoyable and informative, but I took much more away about facilitation exercises and ways to get people to engage in improv than about the structures and ideas of improvisation. Emergent behaviours were discussed, but as personal lessons emerging from an exercise, rather than as properties emerging from a system. Business was discussed, but in terms of getting business people up and interacting in workshops, not in terms of translating improvisational skills into their working environments.

Reacting to this, I revisited some of the ideas about improvisation that I was playing with over the summer, and present them here, tidied and decorated for your amusement.

I'm interested in the improvisation involved in exploring a city, making an extempore speech, singing harmony to an unfamiliar tune. We improvise when we cook a meal with whatever is in the fridge, when we need to get a USB key from behind that hotel radiator, when someone falls off a ladder in front of us, when we get lost - especially when we get lost. To be expert is to be able to improvise with confidence.

It's clear that my interest in improvisation is, in particular, the improvisation we do as individuals under pressure from external circumstances. Perhaps I'm just not a team player; after all, my sports are/were skiing, swimming and fencing. Ask the Choir if they agree before you jump directly to any conclusions...

Improvisation as AIN addresses it is useful, interesting, but seems (on the strength of a single meeting and a swift half) to be biased towards shortform group improvisation under circumstances imposed by the group. This is more complex in at least two ways, and a wonderful field of study - but the interests I list above would be poorly served if this was where improvisation stopped. Conversations indicated that, perhaps, improv was the only improvisation the group could discuss with engagement. I think there's more, and I look forward to interacting with the group and its approaches.

One-line summary: improvisation ≠ improv. Who knew!?


* at the workshop, no ownership was claimed, and no attribution given. One could use the viral meme (pace GPL) and apply the same rules when passing it on, or apply one's own standards if more stringent. I choose to apply my own standards - these exercises were facilitated, and may have been devised, by Paul Z Jackson. However, if it is the practice within this industry to change and neither claim nor attribute, I may yet adjust those standards to fit the context. For those interested in improv exercises, http://improvencyclopedia.org/ is a resource with more than enough (500+) to tickle your fancy.

Decisions and responsibility

There's plenty of mileage in group decisions, and in the wisdom of crowds. I presume that, as technology enables the convening of groups, we'll see more decisions made collectively. I hope - and believe - that in general those decisions will be better.

However, making a decision is not the whole picture. There is a degree of responsibility that goes with a decision, and I'm coming to the conclusion that a group decision is worth nothing if the individuals in the group are not prepared to take individual responsibility for their decision.

One-line summary: watch out for decisions made by groups whose members are disengaged

Tuesday, October 07, 2008

Agile2008 (and Glasto 2008)

Under the inspired guidance of Rachel Davies, Agile2008 modelled itself on Glastonbury Festival. I found myself performing at both, which came as a bit of a surprise.

Aside: I'll leave the muck and general debauchery of Glasto to your imaginations - suffice it to say that I was entirely sober, stayed at my Mum's, and was on-stage with my wife*. With 25 stages and 150,000 revellers spread across a Somerset valley, Glastonbury's scale is as staggering now as it was when I first went in 1987 (no wife, no sleep, not sober). It turns out I'm no less impressionable at 40 than I was at 19. I loved it.

With 1500 people, Agile2008 was a couple of orders of magnitude smaller than Glastonbury. Like Glastonbury, it was as fascinating as it was overwhelming. Attendance was about as big as a relatively-technical hotel conference gets, but the truly staggering element was the 25 concurrent tracks. Being a power-law thing, this of course did not mean 60 people in each session, but hundreds in a few, and a small handful of patient listeners in most others. It meant that nobody saw more than a tiny fraction of the material on offer. However, the keynotes** were attended by a vast majority of participants, and served to align the subjects of conversation. With this, and the attention given to breakout areas, triggers for discussion, and informal entertainment/events there was a clear feeling of community. 

Performing or not (unless you're headlining), these events are at least as much about going and being with the crowd as they are about seeing the stars. You're as likely to love an act you stumble upon as an act you've waited years to see. I'm perversely proud of wandering away from the mighty (and very favourite) Massive Attack at the height (depth?) of their Other Stage thunder and into the elderly groove of the wonderful (and utterly new to me) Ethiopiques. I'm happy to have voted with my feet in Ron Jeffries / Chet Hendrickson's surprisingly artificial Natural Laws of Software Development and just as happy to have made the temporary acquaintance of that embittered sage, Brian Foote. I crossed paths with Toby Mayer half-a-dozen times, each time coming away with insight and inspiration.

The most interesting element for me was the degree to which the (formal) practices of Agility were not only reinvented by each team, but to a significant extent rejected. Two talks highlighted this particularly well:

Bob Martin's keynote gave the most visceral example, as he asked everyone in the room to put their hands up if they were involved with an agile project, then read a list of common practices and asked people to put their hands down as he listed practices which they did *not* do. By the time he had got about five items down his list, 1500 hands in the air had reduced to just one group, and a couple of dozen isolated hands across the hall. His next point – 'keep your hands up if all your tests are automated' – took out the group (oddly enough, a gang of testers from Menlo Innovations), and only very few individuals remained in the game.

Scott Ambler's talk (which packed the 'Questioning Agile' stage/room) put real numbers on this phenomenon with a survey from Dr Dobbs in February. You can read his conclusions ("Agile in Practice: What Is Actually Going On Out There?") on his site, but better yet see a video of the talk (from about 5 feet from where I was squashed in) on InfoQ. You might be interested to know that he's made his data available for analysis. I've been looking through it from a testing point of view for a financial client, and his conclusion seems supported: "The easier practices, particularly those around project management, seem to have a higher adoption rate than the more difficult practices (such as TDD) ... For all the talk about TDD that we see in the agile community, it’s nowhere near as popular as doing a bit of up-front modeling, which we rarely hear anything positive about.". Indeed, I'd be tempted to say that the numbers indicate that practices related to testing are typically among the less-likely to be used. None the less, 80% of respondents felt they had better quality and happier customers. 

For those of you who are interested, my own talk (also on the Questioning Agile stage) went well (rather better than the Guardian Stage at Glasto), captured a good audience and generated some fruitful discussions. I took my slightly-jetlaggy part in the pre-conference "functional test tools" workshop (a physical extension of the ongoing discussion on yahoo group aa-ftt) which was worthwhile, but not terribly conclusive.

An excellent event - great for new perspectives, for new people, and for fun. I'd certainly go to Glastonbury again - and with any luck, Agile20xx.

~ o ~



* she leads, and I sing in, the London Bulgarian Choir. The lovely British Sea Power lent us part of their acoustic spot in the Guardian tent, and the girls sang with them for songs on the John Peel stage and the Left Field stage.

** James Surowiecki on diversity/wisdom of crowds, Alan Cooper on engineering user experience and iterative/incremental methods, Bob Martin on a 5th line to the Agile Manifesto ("we value craftsmanship over crap" - although I think there are efforts to make this more boardroom-friendly)

Sunday, September 21, 2008

Fun for explorers

If you like physics, or playing with stuff, you'll like this: FantasticContraption

A perfect example of exploring solutions, alternatives and refinements. For those of you who have been on my exploratory testing course, I couldn't teach ET with it, but I urge any readers interested in exploration to watch themselves - or someone else - working towards solutions, general applications, principles, components. I've just lost a chunk of time myself in a gentle whirl of levers and engines. Reminds me of Meccano, but it's fast, flexible, and never runs out of bits (or seizes up with friction). Must get AquaForest back on to the iPod Touch.

Wednesday, September 17, 2008

Not every revolution is part of an evolution.

Evolution is always revolutionary to those caught up in it. Only hindsight makes a revolution part of an ongoing evolution.

If you're not part of the revolution, you won't evolve. If you don't evolve, you're stuck in a dead end of ever-decreasing resources.

Of course, this is to confuse the individual and the tribe.

An alternate way of putting this - no individual survives (Darwinian) evolution. Thank goodness culture is Lamarckian.

Monday, September 15, 2008

Exploration and experimentation

I thought I'd share this, from The Economist:

In the 19th century it was commonplace to do an experiment simply to see what would happen. That was, in part, because experimenters were often amateurs who were spending private money. In these days of taxpayer-financed science, most experiments are executed with a pretty clear idea of what the outcome ought to be, especially when they are part of wars and campaigns against this or that. The paradox is that, although such efforts do not eliminate Becquerel-like discoveries, they risk limiting the chances of making them.

Tuesday, June 17, 2008

Tools for Exploratory Testing

I am often asked what tools I use for Exploratory Testing – most recently, by Patrick O'Beirne while doing a series of talks at SoftTest.

It rather depends on what I need to explore, but my own explorer's toolkit (with a probable unix bias) includes:

- for recording visual stuff: I most often use a camera, which allows screensnaps and brief audio+video without being platform-dependent. For screenshots, on a given platform, I use SnapzPro on Macintosh/OS X, and HyperSnap on Windows. I've been known to rig up a video camera to watch keystrokes and screen. I also use Mouseposé, which makes clicks obvious and shows keystrokes (including modifiers) in a big translucent bezel.

- an aside: I also use Mouseposé when I'm showing people what I'm doing, and even to help me get visual feedback on the buttons I'm pressing. Butterfingers = happy accidents.

- for tools that specifically help with recording manual exploratory testing (ie records keystrokes and timings, allows annotation of screen movies etc.), I've used SiriusSQA's TestExplorer and BlueBerry's BBTestAssistant. For my purposes, they're often a bit heavy and windows-centric, but you may find one or the other is just what you need. You can try them yourselves, as both have a great attitude to limited-use trials. Spector - the outrageously intrusive spy-software - was often put forward as an alternative before tester-oriented tools became available. It seems un-disruptive, and offers comprehensive monitoring, but I found it hard to use for testing. You may find the terms of the license prevent you from using it well in a test environment, and there are difficulties in saving the information you've captured. Finally, it's so very much oriented to spying on spouses and employees that having the license makes me feel slightly creepy myself.

- for recording what I do, I was pretty-much paper based until a couple of years ago. I now use a dual system, keeping most of my scribbles and diagrams on paper as an extension of my thought processes, but using OmniOutliner to keep track of bugs found, to put timestamps on observations, arrange stuff hierarchically / foldably, and to allow me to search my notes.

- for monitoring system activity, I use "top" on UNIX-based systems. I've recently discovered the joy of Process Explorer and perfmon (when I got the logging finally turned on) on windows. On the Mac, Activity Monitor is reasonable, but Instruments (a wrapper for DTrace) is for serious work. Crash monitors for specific applications/OSes are good, too, as is a knowledge of where the logs are hidden (and how to extract them).

- for looking at differences, I use the unix/windows tools comm and diff. Unix tools can be introduced to windows environments with Cygwin, UnxUtils or many others. You could also try kdiff, which is pretty comprehensive and runs on everything, or FileMerge (in Xcode) on the Mac. It's good to know one's way around regular expressions, so here's a link to ilovejackdaniels' great Regular Expressions cheat sheet. If you need to do specific Windows Registry checks, Process Monitor and TestExplorer will help you out.
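
As an aside: if you'd rather do that comparison from a script, here's a rough sketch (Python's standard difflib, with invented file names) that prints a unified diff much as diff -u would:

    import difflib

    # Compare two captured outputs (hypothetical file names) line by line.
    with open("expected_output.txt") as f:
        expected = f.readlines()
    with open("actual_output.txt") as f:
        actual = f.readlines()

    for line in difflib.unified_diff(expected, actual,
                                     fromfile="expected_output.txt",
                                     tofile="actual_output.txt"):
        print(line, end="")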

- for influencing system activity, it's got to be Holodeck - the freebie on the back of "how to break software security" is less buggy than the version on "how to break software", and (being free) is $$$ less than the retail version. Roll on the day when developers expect this sort of capability to be built into OSs.

- for input data, I often have a library of useful data about the place - large files, empty files, different sizes of different formats of picture, files that have meaning to something I'm testing (ie an Excel file for testing within MS Word). I sometimes use Bach's perlclip for creating data. I use Excel for creating test data to fit requirements, and load it into SQL tables via CSV.
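
To illustrate the kind of thing I keep in that library, here's a small sketch (Python rather than Excel or perlclip; the values and field names are invented) that writes a CSV of awkward input values, ready to paste into a form or load into a table:

    import csv

    # A few awkward values to seed input fields with - illustrative, not exhaustive.
    awkward = [
        "",                       # empty
        "   ",                    # whitespace only
        "x" * 5000,               # long enough to trouble length limits
        'comma, "quote" and ;',   # delimiters and quoting
        "café ☃ 測試",            # non-ASCII
    ]

    with open("test_inputs.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["case_id", "value"])
        for i, value in enumerate(awkward, start=1):
            writer.writerow([i, value])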

- for output data, I primarily use Excel for most parsing/analysis work - with a little bit of fiddling to get in amongst datasets that break Excel's 65K row limit. I very much prefer DataGraph for exploring datasets graphically. It's Mac-only, but IMHO worth buying a Mac for.

- for automating, I use whatever's lying about at the client's site. Last time I bought a tool, it was Vermont High Test, so my licence is a bit out of date these days.

- for web testing, I push hard for Firefox to be within the browser gamut, and use FireBug. iMacros was in my toolkit for a while, and I wasn't over-familiar with it, but I've recently tried it out for loading web forms, and it's fine. The firefox version is still free (as in beer).

- for cross-browser testing, I use the free web service netrenderer if I've not got my kit with me. It opens a publicly-accessible web page in the browser of your choice, takes a screensnap, and shows it to you.

- for putting load into an app and measuring output (although not graphing it, because Excel's far better, and DataGraph better still), I use (cross-platform) JMeter. You might want to consider The Grinder.

- for emulators (dead handy), I've had great results with VirtualPC on PCs. The field is moving fast, and these days I'd look specifically at VMWare and Parallels. I've used Parallels on the Mac lots, but not for testing (yet).

I'm sure you use different tools. When I teach Exploratory Testing, the tools workshop is often most eye-opening in closed teams, where one person is often sitting on a tool whose use is immediately apparent to other participants - yet they've never shared their toolsets until the class. Try it yourself – I'd be fascinated to know what tools you use.

[Edited to include OmniOutliner, Mouseposé and to clarify the TestExplorer / BBTA entry]
[Edited to include netrenderer]

Tuesday, May 13, 2008

Heathrow T5 and software testers

CIO magazine pulls no punches when they say that inadequate software testing (was) behind T5 problems. I won't make easy assumptions based on that article - you can read it and make your own.

However, if you've been following the story, you'll be interested to know that Huub van der Wouden and Roger Derksen from Transfer Solutions in the Netherlands will be giving the opening keynote on Testing the Heathrow Terminal 5 Baggage Handling System at the London SIGiST on June 18th.

Saturday, May 10, 2008

Searchable library - content, not title

So I was going to buy Gerard Meszaros' book xUnit Test Patterns at StarEast, but the excellent selection from Breakpoint Books had sold out by midway through day 2.

I find that I already rent the online, searchable edition. If it wasn't such a pig to use, I'd love Safari books online.

I'd like to have an online search for my existing luggable library, to let me find what I need in books where I don't have an index, or where the index is rubbish. Google Books gets close, but only gives one return per book. If you know how to get more than one, do let me know.

Thursday, May 01, 2008

How much TV does a Wikipedia cost?

Clay Shirky has written a book called "Here Comes Everybody: The Power of Organizing Without Organizations". Here's a couple of localised Amazon links - amazon.co.uk, amazon.com. Now, I'm not about to recommend the book without reading it. I am about to buy it so I can read it on my way to teach at STAREast.

However, the interconnected world allows us all to read / see what the book's about. Here's a fine blog post of his summarising the talk. For those more visually/aurally oriented, here's the talk itself.

Answer to the question above: Clay says that each year, the people in the USA spend 2,000 times more time watching television than has been needed, so far, to construct Wikipedia. Except that he says it better. Read the article.

Let's work these numbers a tad to give us a feel for their reasonableness. If there are 500,000,000 people in the US (a generous round number), then the work done on Wikipedia so far is equivalent to the TV-watching habits of 250,000 people. That's round about the population of Boise (Idaho) or Daytona Beach. For those of us closer to GMT, think Belfast. If the average citizen spends 10% of their waking time watching TV (The NYT says it's over 4 hours, which is closer to 30%, but let's make life simple and not cut out TV entirely - I assume that many have the TV on while eating, talking, making macramé wallhangings), then it would take the waking time of 25,000 people for around a year. English-only Wikipedia has 2.5 million articles, so that's around 100 articles a year per person, or one every three or four days - I'd have said one a day or less, so we would seem to be around the right kind of numbers.
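
If you'd rather scribble on the envelope in code, here's the same calculation, using the round (and assumed) figures above:

    # Back-of-the-envelope figures from the paragraph above - round numbers, not research.
    us_population = 500_000_000           # generous round figure
    tv_to_wikipedia_ratio = 2_000         # Clay's "2,000 times more time" per year
    fraction_of_waking_time_on_tv = 0.10  # assume 10% of waking time
    articles = 2_500_000                  # English-language Wikipedia, 2008

    people_equivalent = us_population / tv_to_wikipedia_ratio          # ~250,000 people's TV habit
    person_years = people_equivalent * fraction_of_waking_time_on_tv   # ~25,000 waking person-years
    articles_per_person_year = articles / person_years                 # ~100
    days_per_article = 365 / articles_per_person_year                  # one every ~3.7 days

    print(f"{people_equivalent:,.0f} people's worth of TV time")
    print(f"{person_years:,.0f} waking person-years")
    print(f"{articles_per_person_year:.0f} articles per person per year, "
          f"one every {days_per_article:.1f} days")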

Of course, this may be a circular analysis if Clay started with 2.5 million articles at around 4 hours apiece to write.

A question for you: Do you know anyone who actually understands the numbers, ie can apply them in their everyday lives? Who understands fundamentally how large numbers of people and small commitments/risks/expenditures actually add up? It doesn't exactly come up in conversation much, but I don't think I know anyone who has a clear feel for this. Perhaps it's my generation. Then again, I know plenty of people who have a fine handle on atomic measurements and cosmological time - perhaps I should get to know some social scientists. 

I'm aware that, as humans, we're bad at things outside our direct experience - and have to either put things into scales we can understand, or manipulate the numbers directly. My problem is, perhaps, that we're bad at viscerally understanding how large (small)  a thousand (th), a million (th), or a billion (th) actually is - and so when we combine a tiny with a huge we're bad at understanding what that means. In our actual lives, being out by a factor of two is plenty - but when dealing with things beyond our ken, it's much harder to spot. Both the following are out, by a bit - but by how much? 1) A trillion (US) dollars spent on the Iraq war. 2) One in ten million chance of winning the (UK) lottery. To get light-headed, what if the war spend had been aimed at the lottery? Fifty thousand winners, you say, currency conversion being on your mind. Let's say you'd only want one winning ticket a week, if you could help it - that's a winning lottery ticket every week all the way from the battle of Hastings to sometime around my unborn children's late middle age.

Hmm. Let's get back on theme. For the testers among you, another question: How are we sizing our beta tests? At what point might a beta test outweigh a local test team? At what point might a beta test be reasonably expected to have found problems that could surface in the first month of general use?

Justified finger in the air estimates preferred to unsubstantiated formulae.

Wednesday, April 30, 2008

Fixing 7digital's playback problems in iTunes

Summary: you'll need to rename .mp4 to .m4a

I just bought The Age of The Understatement by The Last Shadow Puppets, which is rather lovely. As an experiment, I bought it from 7digital - it cost a fiver rather than £7.99 on iTunes.

The beggar doesn't play properly on my kit. Each track pauses for 5 seconds or so, just under 10 seconds into the track - which is hardly conducive to great listening pleasure. As far as maintaining the atmosphere goes, it's a bit like turning the house lights on in a nightclub for a few seconds, just as every track gets going.

This is not normal, so for the configurators among you, I'll describe normal. Music comes from the iTunes store or CDs. I play the songs using iTunes under Leopard on my non-intel Mac Mini. As a tester, I've got CPU monitoring on, always, and as a tester I notice that the machine is pushing as hard as it can during the gap, but not either side. This is also not normal.

Let me also describe the differences - comparing specifically with iTunes Plus, as both deliver AAC audio without DRM protection.

The most obvious difference is that 7digital sent me the files encoded at 320kbps, compared with iTunes' 256kbps. The problem could be to do with the way iTunes' codec responds to higher-than-expected encoding rates. However, I've successfully worked with AAC from my own projects encoded at this rate.

One consequence of this difference is that the files are larger, so perhaps this is at the root of my problem. However, I'm instinctively less than keen on this as a target for investigation. Digging into that instinct, I can rationalise that 1) the kit is perfectly capable of shifting the quantity of data, 2) the problem is limited to part of a song, 3) the position and duration of the dropout seem unrelated to song length. It doesn't feel like something to do with the amount of data.

Another difference: while looking at iTunes' information for the song, I find that it's listed as an MPEG-4 video file. This seems inappropriate - I'd expect it to be an AAC audio file. This is, after all, how it was described at download. I'm aware that this is linked to the extension and, checking the files, I find the extension is .mp4. Pretty much everything else is .m4a, including those iTunes Plus files. Other iTunes files are .m4p. Both these extensions indicate Apple's AAC format, one unprotected, the other rights-managed. Those in the know can chastise me for (at least) two reasons. Firstly, MPEG-4 is a container, not a format. Secondly, iTunes should treat .mp4, .m4a and .m4p files the same - if it doesn't, it's iTunes that has the bug, not 7digital.

In the spirit of investigation and experimentation, I change the extension from .mp4 to .m4a, and the problem is gone - no hot-running CPU, the right details in the information. Most importantly, the music now plays without interruption.
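
For anyone with a folder full of these, here's a rough sketch of the rename (the folder path is hypothetical; it changes only the extension, not the file contents):

    from pathlib import Path

    # Rename every .mp4 in the download folder to .m4a, so iTunes treats the
    # files as AAC audio rather than MPEG-4 video.
    folder = Path("~/Music/7digital-downloads").expanduser()
    for track in folder.glob("*.mp4"):
        track.rename(track.with_suffix(".m4a"))
        print("renamed", track.name)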

Now, this is a hack. It may have corrected a simple mis-configuration. However, sound files are complex containers, and their interactions with various players are not always predictable. Will iTunes be able to burn these tracks onto a one-off mix CD? Will they play nicely with the iPod? If I chuck them into Ableton's Live to mix and mash, will there be a problem? Heaven knows, until I try. Perhaps I should have stuck with their MP3 versions . . .

The immediate problem is solved. I've not resolved the bug, but I've developed an understanding of it which has led to a fix for my machine and my purposes.

Now I'm bug-hunting, here are a few more I noticed along the way. These may help you fix/get round/be prepared for stuff on your own machine.

  • The transition between Black Plant and I Don't Love You Anymore is stuffed - there's a hole where there should be a transition. This isn't iTunes' infamous truncation bug, it's because it sticks a pause between tracks unless told not to. It's easily fixed - setting the album to "gapless" does the trick.
  • 7digital's site, for reasons best known to itself, gives the tracks un-numbered, and in reverse order - it was only when I played the album that I noticed that due to my own finger trouble, I had two Calm like Us and no Standing Next to Me. That said, re-downloading is trivially easy and (with the server in the UK) pleasantly speedy.
  • I have to trigger each download individually (dull) if I don't want to install their downloader (I don't, thankyou - and I can't anyway as it's PC only), and when done have to put the tracks into my library (whine whine whine).

Customer impressions of 7digital: they are good value, but slightly troublesome. The trouble may not be their fault, but it means I have to work to use their music on my kit. Not their fault, but I may not buy again.

Customer impressions of The Age of the Understatement: a fine album - good songs, great words. Retro and fun without too much pastiche. Epic, if you're in the mood; insert iPod and become the hero of your supermarket trolley. Better fixed than buggy - I like it enough to mean that I'd buy a working copy if I'd not managed to fix this one, but not without a niggling sense of being conned.

Update: shortly after posting this entry, I was contacted by the CTO of 7digital, who is clearly on the ball. The .mp4/.m4a thing is now with his technical people. Good to help - better to be useful.

Monday, April 14, 2008

Comedy bug

So: I'm having a problem or two that's likely to be related to a particular software package. I can't say I'm surprised - the software's been buggy and crash-prone from the off, has a UI designed to appeal only to a designer, and the online support is anything but.

I choose to uninstall the software with an application the supplier supplies. Halfway through the uninstall, it tells me that I should have quit out of mail, as it's uninstalling a component with which it vandalised my mail application. It quits its own uninstall and gives me a nice, although temporary, log.

I quit mail, and (with a hint of a tester premonition) head off to find the uninstaller, so I can uninstall the final component of this rotten puzzle.

The uninstaller has, naturally, uninstalled itself.

Marvellous.

(for those of you unfamiliar with why this is a problem, it's broadly analogous to locking the keys inside the car)

Wednesday, April 09, 2008

Running LEWT

Rikard Edgren's Where Testing Creativity Grows pointed me to a resource that should be read by anyone trying to create an environment that supports creativity. Nils-Eric Sahlin's article 'creative environments: a simple recipe' (translated by Linda Schenck) neatly encapsulates and codifies the principles that I try to use for LEWT - and, indeed, extends them.

To borrow Rikard's summary, they include:
  • generosity
  • a sense of community
  • qualifications
  • cultural diversity
  • trust and tolerance
  • equality
  • curiosity
  • freedom of spirit
  • small scale


It's interesting that the facile quality 'communication' is not included. It's also interesting that qualifications (which don't particularly play their part in LEWT) is more subtle than I had perhaps expected. I'll be considering making some minor changes so that people are more confident in their qualification to be part of the group - although I'll be careful about affecting its diversity. My feeling that LAWST's guru-led approach is problematic is neatly caught by the note on equality.

I'll be posting a link to the article to the LEWT discussion group, and hope to start a conversation. Let me know if you'd like to be part of it.

Creativity in Software Testing

Software testing is frequently - and fundamentally - a creative endeavour. That creativity is closer to that employed for mathematics and music than for statistical analysis or singing.

Although I've talked about this until I've bored all around me, I've never managed to write anything coherent on the subject. So I'm pleased that I came across Rikard Edgren's Where Testing Creativity Grows, as it is one of the very few papers I'm aware of that looks at this important, yet unaccountably neglected perspective.

You can find it here: http://qualtechconferences.arobis.com/upload/documents/REdgren_where_testing_creativity_grows.pdf

I encourage you to have a read. If you know of other papers on the subject, post them here.

James