History of Search Engines
Early Beginnings of the Internet and the World Wide Web
After the U.S.S.R. launched Sputnik (the first artificial earth satellite) in 1957, the United States created the Advanced Research Projects Agency (ARPA) as a part of the Department of Defense. Its purpose was to establish U.S. leadership in science and technology applicable to the military. Part of ARPA’s work was to prepare a plan for the United States to maintain control over its missiles and bombers after a nuclear attack. Through this work the ARPANET, the network that grew into the Internet, was born. The first ARPANET connections were made in 1969, and in October 1972 ARPANET went ‘public.’
Almost 20 years after the creation of the Internet, the World Wide Web was born to allow the public exchange of information on a global basis. It was built on the backbone of the Internet. According to Tim Berners-Lee, creator of the World Wide Web, “The Internet is a network of networks. Basically it is made from computers and cables. The World Wide Web is an abstract imaginary space of information. On the Net, you find computers; on the Web, you find documents, sounds, videos, information. On the Net, the connections are cables between computers; on the Web, connections are hypertext links. The Web exists because of programs which communicate between computers on the Net. The Web could not be without the Net. The Web made the Net useful because people are really interested in information and don’t really want to have to know about computers and cables.” With information being shared worldwide, there was eventually a need to find that information in an orderly manner.
The very first tool used for searching on the Internet was called “Archie.” (The name stands for “archives” without the “v,” not the kid from the comics.) It was created in 1990 by Alan Emtage, a student at McGill University in Montreal. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of filenames.
While Archie indexed computer files, “Gopher” indexed plain text documents. Gopher was created in 1991 by Mark McCahill at the University of Minnesota. (The program was named after the school’s mascot.) Because these were text files, most of the Gopher sites became Web sites after the creation of the World Wide Web. Two other programs, “Veronica” and “Jughead,” searched the files stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles across the Gopher listings. Jughead (Jonzy’s Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from various Gopher servers.
In 1993, MIT student Matthew Gray created what is considered the first robot, called World Wide Web Wanderer. It was initially used for counting Web servers to measure the size of the Web. The Wanderer ran monthly from 1993 to 1995. Later, it was used to obtain URLs, forming the first database of Web sites called Wandex.
Web robots are sometimes referred to as web wanderers, web crawlers, or spiders. These names are a bit misleading, as they give the impression that the software itself moves between sites like a virus. This is not the case; a robot simply visits sites by requesting documents from them. Initially, the robots created a bit of controversy because they used large amounts of bandwidth, sometimes causing servers to crash. The newer robots have been tweaked and are now used for building most search engine indexes.
In 1993, Martijn Koster created ALIWEB (Archie-Like Indexing of the Web). ALIWEB allowed users to submit their own pages to be indexed. According to Koster, “ALIWEB was a search engine based on automated meta-data collection, for the Web.”
Eventually, as it seemed that the Web might be profitable, investors started to get involved and search engines became big business. Excite was started in 1993 by six Stanford University students. It used statistical analysis of word relationships to aid in the search process. Excite was later incorporated and went online in December 1995. Today it’s a part of the AskJeeves company.
EINet Galaxy (Galaxy) was established in 1994 as part of the MCC Research Consortium at the University of Texas, in Austin. It was eventually purchased from the University and, after being transferred through several companies, is a separate corporation today. It was created as a directory, containing Gopher and telnet search features in addition to its Web search feature.
Jerry Yang and David Filo created Yahoo in 1994. It started out as a listing of their favorite Web sites. What made it different was that each entry, in addition to the URL, also had a description of the page. Within a year the two received funding and Yahoo, the corporation, was created. Later in 1994, WebCrawler was introduced. It was the first full-text search engine on the Internet; the entire text of each page was indexed for the first time.
The entire directory is maintained by human input. Search engines were also among the brightest stars of the Internet investing frenzy of the late 1990s.[10] Several companies entered the market spectacularly, recording large gains during their initial public offerings. Some, such as Northern Light, have taken down their public search engines and now market enterprise-only editions. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.
Around 2000, Google’s search engine rose to prominence. The company achieved better results for many searches with an innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others.
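The iterative idea behind PageRank can be illustrated with a small sketch. This is not Google's implementation; it is a minimal textbook-style version, and the damping factor of 0.85, the fixed iteration count, the handling of pages with no outgoing links, and the toy link graph with its example hostnames are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a link graph given as {page: [pages it links to]}.

    damping and iterations are illustrative assumptions, not values
    taken from the article.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: three pages link to c.example, which links to a.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links ranks highest, and the page it links to ranks above pages that nobody links to, which matches the premise that a link from a highly ranked page counts for more.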
Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. By 2000, Yahoo! was providing search services based on Inktomi’s search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! used Google’s search engine until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.
Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from LookSmart blended with results from Inktomi, except for a short time in 1999 when results from AltaVista were used instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).
Microsoft’s rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal in which Yahoo! Search would be powered by Microsoft Bing technology.
A search engine operates in the following order:
1. Web crawling
2. Indexing
3. Searching
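The three steps above can be sketched on a toy scale. The sketch below is only an illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary rather than real sites, the crawler follows links breadth-first, the index is a plain inverted index from words to URLs, and a search returns pages containing every query word. The names TOY_WEB, crawl, build_index, and search are invented for this example.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "http://a.example/": ("Archie indexed FTP file names",
                          ["http://b.example/"]),
    "http://b.example/": ("Gopher indexed plain text documents",
                          ["http://c.example/", "http://a.example/"]),
    "http://c.example/": ("WebCrawler indexed the full text of pages", []),
    "http://d.example/": ("A page about text that nothing links to", []),
}


def crawl(seed, fetch):
    """Step 1: visit pages reachable from seed by requesting each one in turn."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        result = fetch(url)
        if result is None:  # unknown URL, nothing to store
            continue
        text, links = result
        pages[url] = text
        for link in links:
            if link not in seen:  # never request the same URL twice
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Step 2: map every word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Step 3: return the URLs that contain all words of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


pages = crawl("http://a.example/", TOY_WEB.get)
index = build_index(pages)
hits = search(index, "indexed text")
print(hits)
```

Because the crawler only discovers pages by following links, the page at d.example never enters the index, even though its text would match a query for "text". A real crawler would also need politeness delays, robots.txt handling, and error handling, all of which are left out here.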