Wednesday, August 27, 2008

Wikipedia and the question of who owns Darwin

We live in an age where "information wants to be free", and ever more information that one used to have to pay for is now available online free of charge, from maps to bibliographic databases such as Medline to scientific articles themselves. But "free to look at" does not mean "free to use for whatever purpose one likes", and what happens when two online giants of the information age (Wikipedia and Darwin Online) collide?

The Wikipedia must surely rate as one of the seven wonders of the Internet. But although simple in concept--an encyclopedia in which anyone can create and edit material--in fact, it has grown to be a highly complex trans-national organisation, with its own culture and customs. Beneath the simple exterior of the wiki article lies a complex machinery of rules, edits, discussions and guidelines...

A couple of times when I have tried to contribute myself, I have come away somewhat disgruntled--once over the neutering suffered by the no-bacteria-on-the-moon article mentioned in an earlier post and on another occasion, when I tried to create an entry for Charles Darwin 1st (1758-1778), the uncle of the more famous CD and the son of Erasmus Darwin. Before I had even finished the article, a notice was slapped on, saying that the article was likely to be deleted very soon on grounds of lack of notability--a decision made by someone whose credentials one could only guess at!

But although my own experiences have led me to have mixed feelings about the Wikipedia, I marvel at the effort put in behind the scenes in terms of arguments for or against a given position on what should be included. There is something magical here with the Wikipedia, as I know from experience that attempts to get research communities to engage in community annotation or maintenance of subject-specific wikis (e.g. the E. coli wiki) usually collapse in apathy, as academics say they are just too busy to contribute. How is it that the Wikipedia manages to enlist so much support from so many unpaid volunteers? And how do these people find the time for all this, when academics are so busy?

A couple of examples suffice to illustrate the remarkable internal machinations of the Wikipedia:

1. In one of the Wikipedia's lamest edit wars, the issue of whether the fact that Darwin and Lincoln were born on the very same day is notable enough for inclusion in their respective Wikipedia entries has generated a short novel's worth of discussion! See further discussion on the I'm-from-Missouri blog.

2. Just this morning I came across this protracted and fascinating discussion as to whether John van Wyhe at Darwin Online is able to assert copyright on the images scanned from out-of-copyright printed works. This issue has a particular interest for me, as we explored whether we could afford to use some of John's images in The Rough Guide to Evolution in the light of the tight budget we had been allotted for pictures. In a nutshell, the argument is whether the Wikipedia and its sister resource, Wikimedia Commons, have the right to appropriate these images from John's Darwin Online site without his permission and without payment on the grounds that the images are in the public domain.

John argues that the act of scanning creates copyright, at least in English law. Various wikipedians argue that mechanical processes that do not add originality do not create copyright where none exists to start with; that English copyright law does or does not allow for "sweat of the brow" efforts; that the Wikipedia should or should not ignore English law and so on! 

John, perhaps wisely, has refused to take part in the discussions--he is busily writing or editing several books on Darwin (working on one book has been exhausting enough for me!). In the end, the wikipedians seem to have side-stepped the issue by starting to scan in their own images, and so I am left unclear what the precise legal position is. One elephant-in-the-room argument that they haven't addressed is whether changing an image from a paper format to a digital format creates copyright. Or does this not make any difference?

The question of who owns Darwin goes deeper than these arguments over copyright of images. Cambridge University claims to own the copyright on all Darwin's letters, whether held by the University or in private hands. This seems a bit odd to me, as when Darwin sent someone a letter, one would have thought that the recipient gained ownership of the letter and perhaps even copyright of what was in it. I wonder what case law exists on this.

One thing is clear: in the Internet age, who owns Darwin remains a highly contentious point!


nickloman said...

The database right may be of even more importance than copyright, if Wikipedia would wish to re-use a substantive portion of John's collection.

Mark Pallen said...

I think they only ever intended to use a few of the images, rather than any of the text files or images of text.

Nihiltres said...

See the Bridgeman Art Library v. Corel Corp. Wikipedia article for a rough explanation of why Wikipedia can use scans of public-domain documents. Wikipedia, which is hosted in the United States (though its scope is international), generally follows U.S. copyright law.