Project Gutenberg (1971-2008)

Chapter 3

Donations are used to buy equipment and supplies, mostly computers and scanners. Founded in 2000, the PGLAF (Project Gutenberg Literary Archive Foundation) has only three part-time employees.

More generally, Michael should be given more credit as the real inventor of the electronic book (eBook). If we consider the eBook in its etymological sense, that is to say a book that has been digitized to be distributed as an electronic file, it is now 37 years old and was born with Project Gutenberg in July 1971. This is a much more comforting paternity than the various commercial launches in proprietary formats that peppered the early 2000s. There is no reason for the term "eBook" to be the monopoly of Amazon, Barnes & Noble, and others. The non-commercial eBook is a full eBook, and not a "poor" version, just as non-commercial electronic publishing is a fully-fledged way of publishing, and as valuable as commercial electronic publishing. Project Gutenberg eTexts are now called eBooks, to use the recent terminology in the field.

In July 1971, sending a 5K file to 100 people would have crashed the network of the time. In November 2002, Project Gutenberg could post the 75 files of the Human Genome Project, with files of dozens or hundreds of megabytes, shortly after its initial release in February 2001, because it was public domain. In 2004, a computer hard disk costing US$140 could potentially hold the entire Library of Congress. And we probably are only a few years away from a storage disk capable of holding all the print media of our planet.

What about documents other than text? In September 2003, Project Gutenberg launched Project Gutenberg Audio eBooks. As of December 2006, there were 367 computer-generated audio books and 132 human-read audio books. The number of human-read books should greatly increase over the next few years; it stood at 412 in May 2008. As for computer-generated books, they will no longer be stored in a specific section, but "converted" on request from the existing electronic files in the main collections. Voice-activated requests will be possible, as a useful tool for visually impaired readers.

Launched at the same time, The Sheet Music Subproject is dedicated to digitized sheet music. It also contains a few music recordings. Some still pictures and moving pictures are also available. These new collections should take off in the future.

But digitizing books remains the priority, and there is a big demand, as confirmed by the tens of thousands of books that are downloaded every day. For example, on July 31, 2005, there were 37,532 downloads for the day, 243,808 downloads for the week, and 1,154,765 downloads for the month. On May 6, 2007, there were 89,841 downloads for the day, 697,818 downloads for the week, and 2,995,436 downloads for the month. A few days later, the number of downloads for the month hit the landmark of 3 million downloads. On May 8, 2008, there were 115,138 downloads for the day, 714,323 downloads for the week, and 3,055,327 downloads for the month. These figures cover only transfers from ibiblio.org (University of North Carolina at Chapel Hill), the main book distribution site (which also hosts the website). The Internet Archive is the backup distribution site and provides unlimited disk space for storage and processing.

Project Gutenberg has 40 mirror sites in many countries and is looking for new ones. It also encourages the use of P2P for sharing its books.

The "Top 100" lists the top 100 books and the top 100 authors for the previous day, the last 7 days and the last 30 days.

Project Gutenberg books can also help bridge the "digital divide." They can be read on a computer or a secondhand PDA costing just a few dollars. Solar-powered PDAs offer a good solution in remote regions and developing countries.

Later on, it is hoped that machine translation software will be able to convert the books from any one of 100 languages into another. Ten years from now, it is possible that machine translation will be judged 99% satisfactory (research is very active on that front, but there is still a lot to do), allowing for the reading of literary classics in a choice of many languages. In 2004, Project Gutenberg was in touch with a European project studying how to combine translation software and human translators, somewhat as OCR software is now combined with the work of proofreaders.

Thirty-seven years after the beginnings of Project Gutenberg, Michael Hart describes himself as a workaholic who devotes his entire life to his project, because he thinks electronic books will become the "killer ap(plication)" of the computer revolution. He considers himself a pragmatic and farsighted altruist. For years he was regarded as a nut but now he is respected. He wants to change the world through freely-available books that can be used and copied endlessly: reading and culture for everyone at minimal cost. Project Gutenberg's mission can be stated in eight words: "To encourage the creation and distribution of eBooks," by everybody, and by every possible means, while implementing new ideas, new methods and new software.

According to him, there might be 25 million books belonging to the public domain in the main regional and national libraries in the world, without counting various editions. If Gutenberg allowed everyone to get print books at little cost, Project Gutenberg could allow everyone to get a library of electronic books at no cost on a cheap device like a USB drive. So far, in April 2008, 25,000 high-quality books were available for free.

Let us give the last word to Michael, whom I asked in August 1998: "What is your best experience with the internet?" His answer was: "The notes I get that tell me people appreciate that I have spent my life putting books, etc., on the internet. Some are quite touching, and can make my whole day." Ten years later, he confirms that his answer would still be the same.

8. CHRONOLOGY

[*1971/07 = year/month]

1971/07: Michael Hart keyed in The United States Declaration of Independence (eBook #1) and informed the first 100 internet users. Project Gutenberg was born.

1972: He keyed in The United States Bill of Rights (eBook #2).

1973: He keyed in The United States Constitution (eBook #5).

1974-88: He keyed in parts of the Bible and several works of Shakespeare.

1989/08: The King James Bible (eBook #10).

1991/01: Alice's Adventures in Wonderland, by Lewis Carroll (eBook #11).

1991/06: Peter Pan, by James Barrie (eBook #16).

1991: Digitization of one book per month.

1992: Digitization of two books per month.

1993: Digitization of four books per month.

1993/12: Creation of three main sections: Light Literature, Heavy Literature, Reference Literature.

1994: Digitization of eight books per month.

1994/01: The Complete Works of William Shakespeare (eBook #100).

1995: Digitization of 16 books per month.

1996-97: Digitization of 32 books per month.

1997/08: La Divina Commedia di Dante, in Italian (eBook #1000).

1997: Launching of Project Gutenberg Consortia Center (PGCC).

1998-2000: Digitization of 36 books per month.

1999/05: Don Quijote, by Cervantes, in Spanish (eBook #2000).

2000: Creation of Project Gutenberg Literary Archive Foundation (PGLAF).

2000/10: Charles Franks started Distributed Proofreaders to assist Project Gutenberg.

2000/12: A l'ombre des jeunes filles en fleurs, 3rd volume, by Proust, in French (eBook #3000).

2001/08: Creation of Project Gutenberg of Australia.

2001/10: The French Immortals Series (eBook #4000).

2001: Digitization of 104 books per month.

2001: Distributed Proofreaders became the main source of Project Gutenberg books.

2002: Distributed Proofreaders became an official Project Gutenberg site.

2002/04: The Notebooks of Leonardo da Vinci (eBook #5000).

2002: Digitization of 203 books per month.

2003/08: "Best of Gutenberg" CD with 600 books.

2003/09: Launching of Project Gutenberg Audio eBooks.

2003/10: The number of books doubled in 18 months, going from 5,000 to 10,000.

2003/10: The Magna Carta (eBook #10000).

2003/12: First DVD, with 9,400 books.

2003: Digitization of 348 books per month.

2003: Project Gutenberg Consortia Center (PGCC) became an official Project Gutenberg site.

2003/12: Launching of Distributed Proofreaders Europe by Project Rastko.

2004/01: Launching of Project Gutenberg Europe by Project Rastko.

2004/02: Michael Hart went off to Europe (Paris, Brussels, Belgrade).

2004/02: Michael Hart's presentation at UNESCO headquarters, in Paris.

2004/02: Michael Hart's visit to the European Parliament, in Brussels.

2004/10: 5,000 books processed by Distributed Proofreaders.

2004: Digitization of 338 books per month.

2005/01: The Life of Reason, by George Santayana (eBook #15000).

2005/05: 7,000 books processed by Distributed Proofreaders.

2005/05: First 100 books processed by Distributed Proofreaders Europe.

2005/06: 16,000 books in Project Gutenberg.

2005/06: First 100 books in Project Gutenberg Europe.

2005/07: 500 books at Project Gutenberg of Australia.

2005/10: 5th anniversary of Distributed Proofreaders.

2005: Digitization of 252 books per month.

2006/01: Launching of Project Gutenberg PrePrints.

2006/02: 8,000 books processed by Distributed Proofreaders.

2006/05: Creation of the Distributed Proofreaders Foundation.

2006/07: 35th anniversary of Project Gutenberg.

2006/07: New DVD, with 17,000 books.

2006/11: Launching of the Project Gutenberg News website.

2006/12: 20,000 books in Project Gutenberg.

2006/12: 400 books processed by Distributed Proofreaders Europe.

2006: Digitization of 345 books per month.

2007/03: 10,000 books processed by Distributed Proofreaders.

2007/04: 1,500 books in Project Gutenberg of Australia.

2007/07: Creation of Project Gutenberg Canada (PGC).

2007/12: Launching of Distributed Proofreaders of Canada (DPC).

2007: Digitization of 338 books per month.

2008/03: 100 books in Project Gutenberg of Canada.

2008/04: 25,000 books in Project Gutenberg.

2008/04: English Book Collectors, by William Younger Fletcher (eBook #25000).

2008/05: 500 books in Project Gutenberg Europe.

9. STATS

*All the stats below are the main Project Gutenberg stats. Stats about other Project Gutenberg sites (Australia, Canada, Europe) are provided in Project Gutenberg News.

= A Few Milestones

1,000 books in August 1997.

2,000 books in May 1999.

3,000 books in December 2000.

4,000 books in October 2001.

5,000 books in April 2002.

10,000 books in October 2003.

15,000 books in January 2005.

20,000 books in December 2006.

25,000 books in April 2008.
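
The pace between milestones can be read off the dates above. Here is a minimal Python sketch, assuming each milestone is dated to the month given in the list. The names used (MILESTONES, months_between, milestone_intervals) are illustrative and not part of any Project Gutenberg tool.

```python
# Milestones copied from the list above: (books, year, month).
MILESTONES = [
    (1000, 1997, 8),
    (2000, 1999, 5),
    (3000, 2000, 12),
    (4000, 2001, 10),
    (5000, 2002, 4),
    (10000, 2003, 10),
    (15000, 2005, 1),
    (20000, 2006, 12),
    (25000, 2008, 4),
]


def months_between(start, end):
    """Whole calendar months from one (year, month) pair to another."""
    (y1, m1), (y2, m2) = start, end
    return (y2 - y1) * 12 + (m2 - m1)


def milestone_intervals(milestones):
    """Return (from_books, to_books, months) for each consecutive pair."""
    intervals = []
    for (b1, y1, m1), (b2, y2, m2) in zip(milestones, milestones[1:]):
        intervals.append((b1, b2, months_between((y1, m1), (y2, m2))))
    return intervals


if __name__ == "__main__":
    for b1, b2, months in milestone_intervals(MILESTONES):
        print(f"{b1:>6,} -> {b2:>6,} books: {months} months")
```

Under these assumptions, the step from 5,000 to 10,000 books comes out at 18 months, which agrees with the chronology entry for October 2003.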

= New Books: Yearly Averages

2001: 1,244 books per year.

2002: 2,432 books per year.

2003: 4,176 books per year.

2004: 4,058 books per year.

2005: 3,019 books per year.

2006: 4,141 books per year.

2007: 4,049 books per year.

= New Books: Monthly Averages

2001: 104 books per month.

2002: 203 books per month.

2003: 348 books per month.

2004: 338 books per month.

2005: 252 books per month.

2006: 345 books per month.

2007: 338 books per month.

= New Books: Weekly Averages

2001: 24 books per week.

2002: 47 books per week.

2003: 79 books per week.

2004: 78 books per week.

2005: 58 books per week.

2006: 80 books per week.

2007: 78 books per week.
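
The monthly and weekly figures appear to follow from the yearly totals by dividing by 12 and by 52. The Python sketch below makes that assumption explicit and compares the result with the published figures. The source does not say how the figures were rounded, so a difference of a book or so is to be expected. All names are illustrative.

```python
# Yearly totals and published averages, copied from the lists above.
YEARLY = {2001: 1244, 2002: 2432, 2003: 4176, 2004: 4058,
          2005: 3019, 2006: 4141, 2007: 4049}
PUBLISHED_MONTHLY = {2001: 104, 2002: 203, 2003: 348, 2004: 338,
                     2005: 252, 2006: 345, 2007: 338}
PUBLISHED_WEEKLY = {2001: 24, 2002: 47, 2003: 79, 2004: 78,
                    2005: 58, 2006: 80, 2007: 78}


def derived_averages(books_per_year):
    """Assumed derivation: divide by 12 months and by 52 weeks."""
    return books_per_year / 12, books_per_year / 52


def compare(year):
    """Return (monthly_diff, weekly_diff): published minus derived."""
    monthly, weekly = derived_averages(YEARLY[year])
    return (PUBLISHED_MONTHLY[year] - monthly,
            PUBLISHED_WEEKLY[year] - weekly)


if __name__ == "__main__":
    for year in sorted(YEARLY):
        monthly, weekly = derived_averages(YEARLY[year])
        print(f"{year}: {monthly:6.1f}/month "
              f"(published {PUBLISHED_MONTHLY[year]}), "
              f"{weekly:5.1f}/week (published {PUBLISHED_WEEKLY[year]})")
```

Keeping the published values next to the derived ones makes it easy to spot where the two differ, rather than silently replacing one with the other.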

= A Few eBooks

eBook #1: The United States Declaration of Independence (1776) [posted in July 1971].

eBook #2: The United States Bill of Rights (1789) [posted in 1972].

eBook #5: The United States Constitution (1787) [posted in 1973].

eBook #10: The King James Bible (1769) [posted in August 1989].

eBook #11: Alice's Adventures in Wonderland, by Lewis Carroll (1865) [posted in January 1991].

eBook #16: Peter Pan, by James Barrie (1904) [posted in June 1991].

eBook #100: The Complete Works of William Shakespeare (1590-1613) [posted in January 1994].

eBook #1000: La Divina Commedia di Dante (1321, in Italian) [posted in August 1997].

eBook #2000: Don Quijote, by Cervantes (1605, in Spanish) [posted in May 1999].

eBook #3000: A l'ombre des jeunes filles en fleurs, vol. 3, by Marcel Proust (1919, in French) [posted in December 2000].

eBook #4000: The French Immortals Series (1905) [posted in October 2001].

eBook #5000: The Notebooks of Leonardo da Vinci (early 16th century) [posted in April 2002].

eBook #10000: The Magna Carta (early 13th century) [posted in October 2003].

eBook #15000: The Life of Reason, by George Santayana (1906) [posted in January 2005].

eBook #20000: Twenty Thousand Leagues Under the Sea, by Jules Verne (1869), audio book [posted in December 2006].

eBook #25000: English Book Collectors, by William Younger Fletcher (1902) [posted in April 2008].

= Number of Languages With 50+ Books

January 2004: 25 languages.

July 2005: 42 languages.

December 2006: 50 languages.

April 2008: 55 languages.

= Main Languages

July 2005: English, French, German, Finnish, Dutch, Spanish, Chinese. [Out of a total of 16,800 books on July 27, 2005, 14,548 books are in English, 577 books in French, 349 books in German, 218 books in Finnish, 130 books in Dutch, 103 books in Spanish and 69 books in Chinese.]

December 2006: English, French, German, Finnish, Dutch, Spanish, Italian, Chinese, Portuguese, Tagalog. [Out of a total of 19,996 books on December 16, 2006, 17,377 books are in English, 966 books in French, 412 books in German, 344 books in Finnish, 244 books in Dutch, 140 books in Spanish, 102 books in Italian, 69 books in Chinese, 68 books in Portuguese and 51 books in Tagalog.]

April 2008: English, French, German, Finnish, Dutch, Portuguese, Chinese, Spanish, Italian, Latin, Tagalog. [Out of a total of 25,004 books on April 21, 2008, 21,475 books are in English, 1,168 books in French, 530 books in German, 433 books in Finnish, 326 books in Dutch, 217 books in Portuguese, 196 books in Chinese, 180 books in Spanish, 128 books in Italian, 55 books in Latin and 54 books in Tagalog.]
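
The bracketed counts above can be turned into shares of the whole collection. Here is a short Python sketch, limited to the three largest languages for brevity. The structure and function names are illustrative assumptions, and the figures are copied from the notes above.

```python
# Snapshot date -> (total books, {language: books}), from the notes above.
SNAPSHOTS = {
    "2005-07-27": (16800, {"English": 14548, "French": 577, "German": 349}),
    "2006-12-16": (19996, {"English": 17377, "French": 966, "German": 412}),
    "2008-04-21": (25004, {"English": 21475, "French": 1168, "German": 530}),
}


def share(books, total):
    """Percentage of the collection, rounded to one decimal place."""
    return round(100 * books / total, 1)


def shares_for(date):
    """Return {language: percentage} for one snapshot date."""
    total, counts = SNAPSHOTS[date]
    return {language: share(books, total)
            for language, books in counts.items()}


if __name__ == "__main__":
    for date in sorted(SNAPSHOTS):
        print(date, shares_for(date))
```

On these figures, English accounts for about 86 to 87 percent of the collection at each of the three dates.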

= Downloads From ibiblio.org

*ibiblio.org is the main downloading site. The downloads of mirror sites are not included here.

July 31, 2005: 37,532 files downloaded in the day; 243,808 files downloaded in the week; 1,154,765 files downloaded in the month.

May 6, 2007: 89,841 files downloaded in the day; 697,818 files downloaded in the week; 2,995,436 files downloaded in the month.

May 8, 2008: 115,138 files downloaded in the day; 714,323 files downloaded in the week; 3,055,327 files downloaded in the month.

10. LINKS

Distributed Proofreaders (DP): https://www.pgdp.net/

Distributed Proofreaders Canada (DPC): http://www.pgdpcanada.net/

Distributed Proofreaders Europe (DP Europe): http://dp.rastko.net/

Hart, Michael (blog): http://hart.pglaf.org/

Project Gutenberg: https://www.gutenberg.org/

Project Gutenberg / Catalog: https://www.gutenberg.org/catalog/

Project Gutenberg / File Recode Service: https://www.gutenberg.org/catalog/world/recode.php

Project Gutenberg / Top 100: https://www.gutenberg.org/browse/scores/top

Project Gutenberg Canada (PGC): http://www.gutenberg.ca/

Project Gutenberg Consortia Center (PGCC): http://www.gutenberg.us/

Project Gutenberg Europe (PG Europe): http://pge.rastko.net/

Project Gutenberg Literary Archive Foundation (PGLAF): https://www.pglaf.org/

Project Gutenberg News (PG News): http://www.pg-news.org/

Project Gutenberg of Australia: https://gutenberg.org.au/

Project Gutenberg of the Philippines (PGPH): http://www.gutenberg.ph/

Project Gutenberg PrePrints: http://preprints.readingroo.ms/

Projekt Gutenberg-DE: http://gutenberg.spiegel.de/

Project Runeberg: http://runeberg.org/

Copyright © 2008 Marie Lebert

End of Project Gutenberg's Project Gutenberg (1971-2008), by Marie Lebert