The Internet and Languages [around the year 2000]
Chapter 3
1. Computer technology has traditionally been the sole domain of a 'techie' elite, fluent in both complex programming languages and in English -- the universal language of science and technology. Computers were never designed to handle writing systems that couldn't be translated into ASCII. There wasn't much room for anything other than the 26 letters of the English alphabet in a coding system that originally couldn't even recognize acute accents and umlauts -- not to mention non-alphabetic systems like Chinese. But tradition has been turned upside down. Technology has been popularized. GUIs (graphical user interfaces) like Windows and Macintosh have hastened the process (and indeed it's no secret that it was Microsoft's marketing strategy to use their operating system to make computers easy to use for the average person). These days this ease of use has spread beyond the PC to the virtual, networked space of the internet, so that now non-programmers can even insert Java applets into their webpages without understanding a single line of code.
2. An extension of (local) popularization is the export of information technology around the world. Popularization has now occurred on a global scale and English is no longer necessarily the lingua franca of the user. Perhaps there is no true lingua franca, but only the individual languages of the users. One thing is certain -- it is no longer necessary to understand English to use a computer, nor is it necessary to have a degree in computer science. A pull from non-English-speaking computer users and a push from technology companies competing for global markets have made localization a fast-growing area in software and hardware development. This development has not been as fast as it could have been. The first step was for ASCII to become Extended ASCII. This meant that computers could begin recognizing the accents and symbols used in variants of the English alphabet -- mostly used by European languages. But only one language could be displayed on a page at a time.
3. The most recent development is Unicode. Although still evolving and only just being incorporated into the latest software, this new coding system translates each character into 16 bits. Whereas 8-bit Extended ASCII could only handle a maximum of 256 characters, Unicode can handle over 65,000 unique characters and therefore potentially accommodate all of the world's writing systems on the computer. So now the tools are more or less in place. They are still not perfect, but at last we can at least surf the web in Chinese, Japanese, Korean, and numerous other languages that don't use the Western alphabet. As the internet spreads to parts of the world where English is rarely used -- such as China, for example -- it is natural that Chinese, and not English, will be the preferred choice for interacting with it. For the majority of the users in China, their mother tongue will be the only choice. There is a change-over period, of course. Much of the technical terminology on the web is still not translated into other languages. And as we found with our Multilingual Glossary of Internet Terminology -- known as NetGlos -- the translation of these terms is not always a simple process. Before a new term becomes accepted as the 'correct' one, there is a period of instability where a number of competing candidates are used. Often an English loan word becomes the starting point -- and in many cases the endpoint. But eventually a winner emerges that becomes codified into published technical dictionaries as well as the everyday interactions of the nontechnical user. The latest version of NetGlos is the Russian one and it should be available in a couple of weeks or so [at the end of September 1998]. It will no doubt be an excellent example of the ongoing, dynamic process of 'russification' of web terminology.
4. Whereas 'mother-tongue education' was deemed a human right for every child in the world by a UNESCO report in the early '50s, 'mother-tongue surfing' may very well be the Information Age equivalent. If the internet is to truly become the Global Network that it is promoted as being, then all users, regardless of language background, should have access to it. To keep the internet as the preserve of those who, by historical accident, practical necessity, or political privilege, happen to know English, is unfair to those who don't.
5. Although a multilingual web may be desirable on moral and ethical grounds, such high ideals are not enough to make it other than a reality on a small scale. As well as the appropriate technology being available so that the non-English speaker can go, there is the impact of 'electronic commerce' as a major force that may make multilingualism the most natural path for cyberspace. Sellers of products and services in the virtual global marketplace into which the internet is developing must be prepared to deal with a virtual world that is just as multilingual as the physical world. If they want to be successful, they had better make sure they are speaking the languages of their customers!"
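The progression described in points 2 and 3 of the interview (ASCII, then Extended ASCII, then Unicode) can be illustrated with a short sketch. It is only an illustration under stated assumptions: Python's "latin-1" codec stands in for "Extended ASCII" (which in practice was a family of different 8-bit code pages), "utf-16-be" stands in for the 16-bit Unicode form mentioned in the interview, and the helper name try_encode is hypothetical.

```python
# Illustrative sketch: which of the three encodings can represent a character.
# Assumptions: "latin-1" stands in for Extended ASCII (one of several 8-bit
# code pages), and "utf-16-be" stands in for 16-bit Unicode.

def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

# An unaccented letter, two accented letters, and a Chinese character.
samples = ["A", "é", "ü", "中"]

for ch in samples:
    print(
        ch,
        "ascii:", try_encode(ch, "ascii"),
        "latin-1:", try_encode(ch, "latin-1"),
        "utf-16-be:", try_encode(ch, "utf-16-be"),
    )

# The size limits quoted in the interview follow from the number of bits used.
print("8-bit code points:", 2 ** 8)    # 256
print("16-bit code points:", 2 ** 16)  # 65536
```

In this sketch the accented letters fail under plain ASCII but fit in a single byte under the 8-bit code page, while the Chinese character needs the 16-bit form. Unicode has since been extended beyond a 16-bit range, so the figure of "over 65,000" reflects the standard as it stood when the interview was given.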
How about the future of the WorldWide Language Institute? "As a company that derives its very existence from the importance attached to languages, I believe the future will be an exciting and challenging one. But it will be impossible to be complacent about our successes and accomplishments. Technology is already changing at a frenetic pace. Lifelong learning is a strategy that we all must use if we are to stay ahead and be competitive. This is a difficult enough task in an English-speaking environment. If we add in the complexities of interacting in a multilingual/multicultural cyberspace, then the task becomes even more demanding. As well as competition, there is also the necessity for cooperation -- perhaps more so than ever before. The seeds of cooperation across the internet have certainly already been sown. Our NetGlos Project has depended on the goodwill of volunteer translators from Canada, the U.S., Austria, Norway, Belgium, Israel, Portugal, Russia, Greece, Brazil, New Zealand and other countries. I think the hundreds of visitors we get coming to the NetGlos pages every day are an excellent testimony to the success of these types of working relationships. I see the future depending even more on cooperative relationships -- although not necessarily on a volunteer basis."
= Logos
Logos is a global translation company with headquarters in Modena, Italy. In 1997, Logos had 200 in-house translators in Modena and 2,500 freelance translators worldwide, who processed around 200 texts per day. The company made a bold move and decided to put the linguistic tools used by its translators on the web, so that the internet community could freely use them as well. The linguistic tools were the Logos Dictionary, a multilingual dictionary with 7 billion words (in fall 1998); the Logos Wordtheque, a multilingual library with 300 billion words extracted from translated novels, technical manuals and other texts; the Logos Linguistic Resources, a database of 500 glossaries; and the Logos Universal Conjugator, a database for verbs in 17 languages.
When interviewed by Annie Kahn in December 1997 for the French daily Le Monde, Rodrigo Vergara, head of Logos, explained: "We wanted all our translators to have access to the same translation tools. So we made them available on the internet, and while we were at it we decided to make the site open to the public. This made us extremely popular, and also gave us a lot of exposure. This move has in fact attracted many customers, and also allowed us to widen our network of translators, thanks to contacts made in the wake of the initiative."
In the same article, "Les mots pour le dire" (The Words to Tell it), Annie Kahn wrote: "The Logos site is much more than a mere dictionary or a collection of links to other online dictionaries. The cornerstone is the document search program, which processes a corpus of literary texts available free of charge on the web. If you search for the definition or the translation of a word ('didactique' [didactic], for example), you get not only the answer sought, but also a quote from one of the literary works containing the word (in our case, an essay by Voltaire). All it takes is a click on the mouse to access the whole text or even to order the book, including in foreign translations, thanks to a partnership agreement with the famous online bookstore Amazon.com. However, if no text containing the required word is found, the program acts as a search engine, sending the user to other web sources containing this word. In the case of certain words, you can even hear the pronunciation. If there is no translation currently available, the system calls on the public to contribute. Everyone can make suggestions, after which Logos translators check the suggested translations they receive."
ONLINE LANGUAGE DICTIONARIES
= [Quote]
WordReference.com was created in 1999 by Michael Kellogg, who wrote on his project's website: "I started this site in 1999 in an effort to provide free online bilingual dictionaries and tools to the world for free on the internet. The site has grown gradually ever since to become one of the most-used online dictionaries, and the top online dictionary for its language pairs of English-Spanish, English-French, English-Italian, Spanish-French, and Spanish-Portuguese. Today, I am happy to continue working on improving the dictionaries, its tools and the language forums. I really do enjoy creating new features to make the site more and more useful."
= From print versions
The first online language dictionaries stemmed from print versions, with websites launched in the mid-1990s.
On the website "Merriam-Webster Online: The Language Center", Merriam-Webster, a main publisher of English-language dictionaries, gave free access to online resources stemming from its print publications. The online resources were: Webster Dictionary, Webster Thesaurus, Webster's Third (a lexical landmark), Guide to International Business Communications, Vocabulary Builder (with interactive vocabulary quizzes), and the Barnhart Dictionary Companion (hot new words). The goal was also to help track down definitions, spellings, pronunciations, synonyms, vocabulary exercises, and other key facts about words and language.
The "Dictionnaire Francophone en Ligne" was the web version of the "Dictionnaire Universel Francophone", published by Hachette, a major French publisher, and the University Agency for Francophony (AUF: Agence Universitaire de la Francophonie, also known as AUPELF-UREF). The dictionary included not only standard French but also the French-language words and expressions used worldwide. French is the official language of 49 states, with a number of them in Africa, and is spoken by 500 million people worldwide. The Agency of French-speaking Countries (Agence de la Francophonie), which has included the AUF, was founded in 1970 as an instrument of multilateral cooperation at the international level. As a side remark, English and French are the only official and/or cultural languages that are widely spread on five continents.
= Directories of dictionaries
Directories of dictionaries have been useful too, such as "Dictionnaires Électroniques" (Electronic Dictionaries), an online catalog of electronic dictionaries maintained by the French Section of the Swiss Federal Administration's Central Linguistic Services (SLC-f: Section Française des Services Linguistiques Centraux). The catalog included five main sections: abbreviations and acronyms, monolingual dictionaries, bilingual dictionaries, multilingual dictionaries, and geographical information. The catalog could also be searched by keywords.
Marcel Grangier was the head of the French Section of Central Linguistic Services, which means he was in charge of organizing translation matters into French for the linguistic services of the Swiss government. He wrote in January 1999: "Our website was first conceived as an intranet service for translators in Switzerland, who often deal with the same kind of material as the Federal government's translators. Some parts of it are useful to any translators, wherever they are. The section "Dictionnaires Électroniques" is only one section of the website. Other sections deal with administration, law, the French language, and general information. The site also hosts the pages of the Conference of Translation Services of European States (COTSOES). (...) To work without the internet is simply impossible now. Apart from all the tools used (email, the electronic press, services for translators), the internet is for us a vital and endless source of information in what I'd call the 'non-structured sector' of the web. For example, when the answer to a translation problem can't be found on websites presenting information in an organized way, in most cases search engines allow us to find the missing link somewhere on the network."
How about the future? "We can see multilingualism on the internet as a happy and irreversible inevitability. So we have to laugh at the doomsayers who only complain about the supremacy of English. Such supremacy isn't wrong in itself, because it is mainly based on statistics (more PCs per inhabitant, more people speaking English, etc.). The answer isn't to 'fight English', much less whine about it, but to build more sites in other languages. As a translation service, we also recommend that websites be multilingual. (...) The increasing number of languages on the internet is inevitable and can only boost multicultural exchanges. For this to happen in the best possible circumstances, we still need to develop tools to improve compatibility. Fully coping with accents and other characters is only one example of what can be done."
The section "Dictionnaires Électroniques" was later transferred to the website of the Conference of Translation Services of European States (COTSOES), when COTSOES launched its own website.
= The yourDictionary.com portal
Robert Beard, a language teacher at Bucknell University, in Lewisburg, Pennsylvania, created the website "A Web of Online Dictionaries" (WOD) in 1995. In September 1998, the website provided an index of 800 online dictionaries in 150 languages, as well as specific sections: multilingual dictionaries, specialized English dictionaries, thesauri and other vocabulary aids, language identifiers and guessers, an index of dictionary indices, the Web of Online Grammars, and the Web of Linguistic Fun (i.e. linguistics for non-specialists).
Robert Beard wrote in September 1998: "There was an initial fear that the web posed a threat to multilingualism on the web, since HTML and other programming languages are based on English and since there are simply more websites in English than any other language. However, my websites indicate that multilingualism is very much alive and the web may, in fact, serve as a vehicle for preserving many endangered languages. I now have links to dictionaries in 150 languages and grammars of 65 languages. Moreover, the new attention paid by browser developers to the different languages of the world will encourage even more websites in different languages."
A few months later, Robert Beard co-founded a larger project, yourDictionary.com, that included his previous website and was launched in February 2000. He wrote in January 2000: "The new website is an index of 1,200+ dictionaries in more than 200 languages. Besides the WOD, the new website includes a word-of-the-day feature, word games, a language chat room, the old 'Web of Online Grammars' (now expanded to include additional language resources), the 'Web of Linguistic Fun', multilingual dictionaries; specialized English dictionaries; thesauri and other vocabulary aids; language identifiers and guessers, and other features; dictionary indices. yourDictionary.com will hopefully be the premier language portal and the largest language resource site on the web. It is now actively acquiring dictionaries and grammars of all languages with a particular focus on endangered languages. It is overseen by a blue ribbon panel of linguistic experts from all over the world. (...) Indeed, yourDictionary.com has lots of new ideas. We plan to work with the Endangered Language Fund in the U.S. and Britain to raise money for the Foundation's work and publish the results on our site. We will have language chatrooms and bulletin boards. There will be language games designed to entertain and teach fundamentals of linguistics. The Linguistic Fun page will become an online journal for short, interesting, yes, even entertaining, pieces on language that are based on sound linguistics by experts from all over the world."
How about the future of the web? "The web will be an encyclopedia of the world by the world for the world. There will be no information or knowledge that anyone needs that will not be available. The major hindrance to international and interpersonal understanding, personal and institutional enhancement, will be removed. It would take a wilder imagination than mine to predict the effect of this development on the nature of humankind."
= Terminological databases
Some terminological databases are run by international organizations in their own field of expertise, with free online versions, for example ILOTERM maintained by the International Labor Organization (ILO), TERMITE (ITU Telecommunication Terminology Database) maintained by the International Telecommunication Union (ITU), WHOTERM (WHO Terminology Information System) maintained by the World Health Organization (WHO), and Eurodicautom maintained by the European Commission.
ILOTERM is a quadrilingual (English, French, German, Spanish) terminology database maintained by the Terminology and Reference Unit of the Official Documentation Branch (OFFDOC) at the International Labor Office (ILO) in Geneva, Switzerland. As explained on its website, ILOTERM's primary purpose is to provide solutions, reflecting current usage, to terminological problems in the social and labor fields. Terms are entered in English with their French, Spanish and German equivalents. The database also includes records for the ILO structure and programs, official names of international institutions, national bodies and employers' and workers' organizations, and titles of international meetings.
TERMITE (which stands for: Telecommunication Terminology Database) is maintained by the Terminology, References and Computer Aids to Translation Section of the Conference Department at the International Telecommunication Union (ITU) in Geneva, Switzerland. It is a quadrilingual (English, French, Spanish, Russian) terminological database built on the content of all ITU printed glossaries since 1980, and updated with recent entries.
WHOTERM (which stands for: WHO Terminology Information System) is maintained by the World Health Organization (WHO) in Geneva, Switzerland. It has included: (a) the WHO General Dictionary Index (in English, with the French and Spanish equivalents); (b) three glossaries in English: Health for All, Programme Development and Management, and Health Promotion; (c) the WHO TermWatch, an awareness service from the Technical Terminology, reflecting the current WHO usage, but not necessarily terms officially approved by WHO, and links to health-related terminology.
Eurodicautom, a multilingual terminological database maintained by the Translation Service of the European Commission, was initially developed to assist in-house translators. The free online version was used by European Union officials and by language professionals throughout the world. Its contents were available in the eleven official languages of the European Union (Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, Swedish), plus Latin. Eurodicautom covered "a broad spectrum of human knowledge", mainly relating to economy, science, technology and legislation in the European Union. In late 2003, the website announced the inclusion of the existing database into a larger terminological database that would also include databases from other official European institutions. The new terminological database would be available in more than 20 languages, because a number of Eastern European countries were expected to join the European Union in the near future, thus the need for more languages than the eleven original ones. The European Union went from 15 country members to 25 country members in May 2004, and 27 country members in January 2007. The website of IATE (Inter-Active Terminology for Europe) was launched in March 2007 as an eagerly awaited free service on the web, with 1.4 million entries in 24 languages.
= Wikipedia
Wikipedia was launched in January 2001 by Jimmy Wales and Larry Sanger (Larry resigned later on). It has quickly grown into the largest reference website on the internet, financed by donations, with no advertising. Its multilingual content is free and written collaboratively by people worldwide, who contribute under a pseudonym. Its website is a wiki, which means that anyone can edit, correct and improve information throughout the encyclopedia. The articles stay the property of their authors, and can be freely used according to the GFDL (GNU Free Documentation License).
Wikipedia had 1.3 million articles (by 13,000 contributors) in 100 languages in December 2004, 6 million articles in 250 languages in December 2006, and 7 million articles in 192 languages in May 2007, including 1.8 million articles in English, 589,000 articles in German, 500,000 articles in French, 260,000 articles in Portuguese, and 236,000 articles in Spanish. In August 2009, Wikipedia was among the top five websites in the world, with a total of 330 million visitors a month.
Wikipedia is hosted by the Wikimedia Foundation, founded in June 2003, which has run a number of other projects, beginning with Wiktionary (launched in December 2002) and Wikibooks (launched in June 2003), followed by Wikiquote, Wikisource (texts from public domain), Wikimedia Commons (multimedia), Wikispecies (animals and plants), Wikinews, Wikiversity (textbooks), and Wiki Search (search engine).
LEARNING LANGUAGES ONLINE
= [Quote]
Robert Beard, a language teacher at Bucknell University, in Lewisburg, Pennsylvania, wrote in September 1998: "As a language teacher, the web represents a plethora of new resources produced by the target culture, new tools for delivering lessons (interactive Java and Shockwave exercises) and testing, which are available to students any time they have the time or interest -- 24 hours a day, 7 days a week. It is also an almost limitless publication outlet for my colleagues and I, not to mention my institution. (...) Ultimately all course materials, including lecture notes, exercises, moot and credit testing, grading, and interactive exercises will be far more effective in conveying concepts that we have not even dreamed of yet."
= CTI Centre for Modern Languages
Since its inception in 1989, the CTI (Computers in Teaching Initiative) Centre for Modern Languages, based in the Language Institute at the University of Hull, United Kingdom, has aimed to promote and encourage the use of computers in language learning and teaching. The CTI Centre provides information on how computer-assisted language learning (CALL) can be effectively integrated into existing courses. It offers support to language lecturers who are using computers in their teaching, or who wish to use them.