The Value of Gnutella and Freenet

by Andy Oram
May 12, 2000

Notice how much bad press has fallen recently on the networking technologies Gnutella, Freenet, and Napster? I think some of the public alarm over genetic crop modification has cross-pollinated over to software. Suffering from legitimate fears over far-reaching technologies like genetic modification, the Strategic Defense Initiative, and nuclear waste disposal, the press and the public are ready to listen to anything bad said about anything new, even a clean, open, noninvasive technology like distributed computing.

If you check my biography, you will see that I make my living selling content. I do not extend knee-jerk sympathy to systems publicized as ways to circumvent copyright enforcement. But investigating Gnutella, Freenet, and Napster, I have been pleasantly surprised to find that they’re intriguing innovations in the best tradition of the Internet heroes. While it’s important to talk about their potential for the distribution of illegal content, we have to start with their larger goals and the promise they offer.

The title of this essay contains a hidden message. There are important areas where Gnutella and Freenet have value, but there are also areas where they don’t offer much value. The area where all the fears are being spawned—the distribution of illegal, defamatory, or copyright-infringing material—is actually not a big danger, according to my analysis. I’ll return to this controversial conclusion after I describe the two systems.

Basic goals

Gnutella and Freenet are simple protocols that let sites query one another in a chain—the way systems have always exchanged news and mail over UUCP—in order to find material matching a search string. On the most superficial level, they can be treated as alternative search engines; in fact, the most exciting potential for Gnutella right now is to enable a new generation of super-search engines. I talk more about the technical aspects of these systems in a companion article, Gnutella and Freenet Represent True Technological Innovation.
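The chained querying described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not either system's actual wire protocol: the `Node` class, its `search` method, the `ttl` hop limit, and the shared `seen` set are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical site that holds some files and knows a few neighbors."""
    name: str
    files: list
    neighbors: list = field(default_factory=list)

    def search(self, term, ttl=3, seen=None):
        """Answer locally, then pass the query down the chain of neighbors.

        ttl is an assumed hop limit; seen is a simplification that stops a
        query from visiting the same site twice.
        """
        seen = set() if seen is None else seen
        if self.name in seen:
            return []
        seen.add(self.name)
        # Local matches: a plain case-insensitive substring test.
        hits = [(self.name, f) for f in self.files if term.lower() in f.lower()]
        # Forward the query only while hops remain.
        if ttl > 0:
            for neighbor in self.neighbors:
                hits.extend(neighbor.search(term, ttl - 1, seen))
        return hits


# Three sites in a chain: a -> b -> c
c = Node("c", ["song.mp3"])
b = Node("b", ["essay.txt"], [c])
a = Node("a", ["notes.txt"], [b])
print(a.search("song"))
```

The site that starts the query never needs to know where the matching file lives; it only needs to know one neighbor. Lowering `ttl` shrinks how far down the chain the query travels.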

More conceptually, Gnutella and Freenet make location irrelevant; data belongs to the whole system rather than to a particular server. Freenet in particular is aimed at protecting anonymity and distributing information in such a way that its origin cannot be traced and its location is irrelevant. Once somebody releases a copy of Battlefield: Earth to Freenet, not even the battalions of lawyers mobilized by the Church of Scientology would be able to get it removed.

While most press reports lump Gnutella together with Freenet, Gnutella is really not designed for anonymity. When queried for information, Gnutella sites are likely to return a URL or some other identifying information. Two Gnutella developers I talked to, Gene Kan and Spencer Kimball, explained that Gnutella goes beyond simple file sharing to allow the distributed processing of search queries, and thus to better distribute information about what’s available online.

Gnutella and Freenet software are both open source. The Free Software Foundation designed the GNU General Public License so that source code could not be stripped of its open status. Gnutella and Freenet extend this irreversible status to content: any content placed on one of those systems becomes nearly impossible to control. Freenet is particularly well designed with that end in mind; as we shall see, Gnutella offers its own impressive benefits. By extending the freedom of open-source software to anything that can be digitized, both are profoundly viral.

I have addressed Napster briefly in another well-publicized paper, a comment to the U.S. Copyright Office on behalf of Computer Professionals for Social Responsibility. The comment points out that Napster is essentially a combination of two well-established technologies, a directory service and a file transfer protocol. (Erik Nilsson has pointed out that Napster is also a new namespace with powerful capabilities.) Some librarians are even thinking of adapting the basic Napster model for a system that would facilitate interlibrary document exchange. Surprised to hear that the model has legitimate uses?

The lawyers I’ve heard don’t hold great expectations for the legal success of Napster, because it focuses on the distribution of MP3 files and is wide open to the charge of contributory infringement. But my comment to the Copyright Office concludes, “A challenge to Napster, based simply on the proclivity of its users to breach copyright, is a challenge to the basic technologies on which the Internet is based.” As I will show, a challenge to Gnutella and Freenet is even worse, because it cuts off promising directions where the Internet needs to grow. The rest of this article concentrates on those two systems.

The next stage in search engines

One of the most worrisome developments on the Web is the inadequacy of existing search tools to work in an era when Web sites depend increasingly on database queries and dynamically-generated temporary URLs. Many sites have their own sophisticated searches, but you have to visit the site and enter the string manually (or study the site’s HTML form and write a customized LWP script—in any case, you have to narrow your search to that single site). Data is generated dynamically for each query. There is no way for a search engine to find the information during a Web crawl, because no URL even exists until the user queries the database. In short, users never find many sites that have the information they want.

Gnutella offers the path forward. It governs how sites exchange information, but says nothing about what each site does with the information. A site can plug the user’s search string into a database query or perform any other processing it finds useful. Search engines adapted to use Gnutella would thus become the union of all searches provided by all sites. A merger of the most advanced technologies available (standard formats like XML for data exchange, database-driven content provision, and distributed computing) could take the Internet to new levels.
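To make the "union of all searches" idea concrete, here is a minimal sketch in which each site answers the same search string in its own way. Everything in it is an assumption made for illustration: the function names `make_db_site`, `make_file_site`, and `federated_search`, the `articles` table, and the sample data are hypothetical, and the code says nothing about how Gnutella itself carries queries between sites.

```python
import sqlite3


def make_db_site():
    """A hypothetical site that plugs the search string into a database query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (title TEXT)")
    conn.executemany(
        "INSERT INTO articles VALUES (?)",
        [("Freenet design notes",), ("Cooking with XML",)],
    )

    def handler(query):
        # The result exists only once the query is run; no static URL holds it.
        rows = conn.execute(
            "SELECT title FROM articles WHERE title LIKE ?", (f"%{query}%",)
        )
        return [row[0] for row in rows]

    return handler


def make_file_site(names):
    """A hypothetical site that simply matches the string against file names."""
    def handler(query):
        return [n for n in names if query.lower() in n.lower()]

    return handler


def federated_search(query, sites):
    """Collect every site's answers; the caller never sees how each was produced."""
    results = []
    for site_name, handler in sites.items():
        for hit in handler(query):
            results.append((site_name, hit))
    return results


sites = {
    "db-site": make_db_site(),
    "file-site": make_file_site(["xml-primer.txt", "notes.txt"]),
}
print(federated_search("xml", sites))
```

The point of the design is that the search layer only agrees on the shape of a query and an answer. What happens inside each handler, whether a database lookup or something else entirely, stays the site's own business.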

Is the genie no longer a dream?

A government could theoretically shut down all computers within its jurisdiction that run a Gnutella or Freenet site, and could force routing points to filter out packets from Gnutella or Freenet sites outside its jurisdiction. Some countries have had fairly good success at screening unwanted sites (mostly countries with small populations and minimal Internet penetration, like Saudi Arabia and Vietnam). So-called democratic governments could try to do the same on the grounds that the sites are guilty of contributory and vicarious copyright infringement, as the Recording Industry Association of America claims in its suit against Napster. Even the software itself could be suppressed on the grounds that its primary purpose is to overcome copyright restrictions; that’s how a notorious Copyright Act clause is being used against DeCSS.

But to do so would be a crying shame. Gnutella and Freenet have much to offer; in addition to the search possibilities already mentioned, they distribute information in a way that offers an intriguing alternative to the heavy, expensive, overly centralized servers that characterize the Web at present. The data propagation model used by Freenet, in which data spreads out in unusual and surprising patterns like the classic computer game of Life, is a model well worth studying.

Ian Clarke, creator of Freenet, is pretty sure the genie is out of the bottle. “If I don’t release Freenet, the copyrighted information will get out eventually. Maybe Freenet will make it happen a little faster, but it should serve as a wake-up call.” And Gene Kan says, “Copyright holders have encountered waves of new technologies over the decades; they’ve started by fighting every one and ended by reaping even bigger profits from the new technologies than before. Every week that the RIAA spends trying to get rid of things like Napster is a big wasted opportunity for it to capitalize on this method of distribution.”

Would you get free content from Napster, Gnutella, or Freenet?

The spread of MP3 files, and their centrality to Napster, skew the debate over free and copyrighted content. Lots of people are willing to download free music files from strangers, because if they find out that the sampling quality is lousy or the song breaks off halfway through, nothing has been lost. They can go back to Napster and try another site.

Matters would be entirely different if you tried to get free software from strangers, especially in binary form. You’d never know whether a Trojan Horse was introduced that, two years later, would wipe your hard disk clean and send a photo of a naked child to the local police chief. (And you thought UCITA’s self-help provision was as bad as it could get!)

True, people get binary software or “warez” from unauthorized sources already, but they often have a pre-existing relationship with the person putting up the software. Ironically, they can trust the unauthorized software precisely because it is copyrighted and available only in binary form; malicious people would find it extremely difficult to patch it so that it can still run but produce deleterious effects on the user. (The exception would be malicious people who are angry manufacturers. Will we start to experience this kind of self-help from vendors?)

The gist of this section is to counter John Perry Barlow’s famous phrase “Information wants to be free” with the somewhat less well-known reply, “Information wants to be valuable.” When software comes from anonymous sources—unless you obtain and read the source code—its value drops to nearly nothing.

Let’s take a more meritorious example: a human rights observer who posts a long list of crimes committed by the Pinochet regime in Chile, along with precise descriptions of how military leaders were implicated in each crime. If the observer wants to remain anonymous, it will be hard to trust this report, but sometimes internal details can convince trained experts that a report is genuine. Still, nothing prevents the implicated military leaders from flooding a system like Gnutella or Freenet with altered versions of the report that are plausible enough to cause confusion and raise doubts about which version is the real one. Unless digitally signed and traceable, such a report will have little value.
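One simple way to tell a genuine copy from an altered one is to compare it against a digest published through a channel readers already trust. The sketch below shows only that check; it is not a digital signature, which would need public-key cryptography from a library outside Python's standard distribution. The function names `digest` and `matches_published` and the sample report text are hypothetical.

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 digest of a document as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_published(copy: str, published_digest: str) -> bool:
    """True only if this copy is byte-for-byte what the digest was made from."""
    return digest(copy) == published_digest


original = "Report: list of documented incidents."
published = digest(original)  # assumed to be announced via a trusted channel

altered = original.replace("documented", "alleged")
print(matches_published(original, published))
print(matches_published(altered, published))
```

The check is only as trustworthy as the channel that carried the digest, which is the essay's point: some identifiable source has to vouch for the content before it regains its value.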

In short, anonymity is the enemy of reliability. Anonymity is valuable for many purposes, such as in support groups for the victims of abuse; it is also a shield for distributing certain types of content where reliability doesn’t matter. Gnutella and Freenet could therefore be sources for pornography and for copyrighted music or movies. But people who care about quality will choose identifiable sources.

We need systems like Gnutella and Freenet. They are not only legitimate objects for research, but solutions to certain technical problems arising on the Internet. When did we start to fear the future so much that we subject such innovations to calumny?


Andy Oram is an editor at O’Reilly & Associates. This article represents his views only. It was originally published in the online magazine Web Review.