A brief history of search

Network information search has a short but exciting history. Let’s look at some trendsetters and heroes of search and try to see where this whole thing is going.

1) 1990 - Archie. Created by a few students at McGill University in Montreal and designed to search for scientific documents. Archie was originally a local tool that grew into a network service based on the FTP protocol. It had a crawler, an index database and a search interface. An important limitation: it could only search document titles.
Actually, you can still check it out. There is a server that takes your search query and returns the list of results via e-mail. Neat!

[Archie picture]

2) 1993 - Wanderer. Created by Matthew Gray at the Massachusetts Institute of Technology. Wanderer was the first Web-based search engine and introduced both indexing of web sites and autonomous crawling agents (web robots). It had the same limitation of indexing only web site titles, not the content. I have not found any working services, but interestingly enough, Matthew Gray now works for Google, and one of his latest creations is a map of all locations mentioned in all books accessible via Google Books. The result is quite interesting.

[Google Books map picture]

3) 1994 - WebCrawler. Created by Brian Pinkerton at the University of Washington. This was the first search engine that indexed the full content of the web pages it found. It also arguably contributed to the creation of the internet bubble after it was acquired by AOL for around $1 million in 1995. Currently WebCrawler is a meta-search engine that uses other search engines’ indexes.

[Webcrawler picture]
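
To make the difference between title-only and full-content indexing concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration: the sample pages and the function names build_index and search are hypothetical, and this is not a description of how WebCrawler or any earlier engine was actually implemented.

```python
from collections import defaultdict

# Hypothetical sample pages, keyed by URL: (title, body).
PAGES = {
    "http://example.org/a": ("Fishing tips", "How to choose a net for deep water"),
    "http://example.org/b": ("Search engines", "A crawler visits pages and builds an index"),
}

def build_index(pages, full_content):
    """Map each word to the set of URLs whose indexed text contains it."""
    index = defaultdict(set)
    for url, (title, body) in pages.items():
        # Title-only indexing ignores the body entirely.
        text = title + " " + body if full_content else title
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

title_index = build_index(PAGES, full_content=False)
full_index = build_index(PAGES, full_content=True)
```

With these sample pages, searching for "crawler" finds nothing in the title-only index, because the word appears only in a page body, while the full-content index returns the second page.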

4) 1995 - AltaVista. The first truly commercial multipurpose internet search engine, created by Louis Monier at DEC. AltaVista brought search to the place it deserves: a utility we rely on heavily when interfacing with the World Wide Web. Almost all modern innovations were there: a good interface, a powerful backend with quick response times, fast crawlers capable of indexing the entire Web, and an index that used both link names and document content.

[Altavista picture]

5) 1996 - Google. With the Web growing much faster than any single human can process information, relevance becomes an increasingly important task. Is one resource better than another? And what does better mean? Google came up with a very good and perfectly scientific answer: a document is better if more other documents contain a reference to it. Wisdom of crowds at its best - the bigger the Web, the better it works.

[PageRank picture]
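
The idea that a document is better when more documents refer to it can be sketched in a few lines of Python. What follows is a simplified PageRank-style iteration, not Google's actual implementation: the damping value of 0.85, the iteration count, and the tiny LINKS graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it references."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: a, b and d all reference c; c references a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(LINKS)
best = max(ranks, key=ranks.get)
```

In this toy graph, page c comes out on top because three pages reference it, and page a outranks b and d because its single reference comes from the highly ranked c. That second effect is what separates this approach from simply counting incoming links.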

Of course, there were other great search engines along the way that introduced important pieces of search as we know it. But all of them exhibited the same important trend: applying statistical methods of structuring and ranking to an ever-growing amount of information. This worked quite well. But does this approach have any limitations? I think it does.
Let’s look at the pace at which information is growing. Five years ago it took 3 years to double the entire sum of information available to mankind. Now it takes only one year. With publishing and broadcasting tools so accessible, more and more people are involved in the production and consumption of information. Most of those producers are located towards the end of the Long Tail. Imagine that someone wants to learn about social networks and read a blog about them. She enters “blog social network” into Google search and instantaneously gets 64,700,000 results. Even among the UK pages alone there are 6,250,000 results. At the very top of the list is the great Mashable! blog. This blog, or one of the couple below it in the search results, will probably be her final destination. She would never come across a great blog on page 15, and she would never read an interesting post such as “Why Do Black Men Date White Women?”.
There is a lot of great stuff out there, and all or almost all of it is hidden behind a PageRank barrier.
A solution? Make search even more relevant to end users. Not to everybody en masse, but to every individual with unique preferences and interests. There are a few interesting social-network initiatives that aim to improve relevance by introducing human interpretation to the cold science of statistical analysis.
Mahalo introduces a search engine with index information that is handpicked by a number of editors and volunteers.
Wikia has a wiki based search.
We have already covered Google’s new feature for voting on search results, Digg-style.

So will my list get a new entry? Time will tell :)

2 Responses to “A brief history of search”


  1. Houf

    Thank you for this interesting blog on the history of search. It’s a simple thing we use almost every day without a hint of what’s behind it or where it’s coming from. It’s amazing: in less than 20 years the evolution has been quite phenomenal.

    You raised an interesting point at the end of the article: the amount of information is enormous and the right stuff might be hidden behind the PageRank barrier. Will we reach a point where, for the same keywords, the first 10 results for me will be different from the first 10 results for my neighbor?

    Then maybe the next entry on your list will be one of those social networks introducing a search engine that uses social information about the user to customize the search results? Let’s see.

  2. Alexey Gabsatarov

    Houf, thank you for your comment.

    Search, like any other activity performed by an individual, is subjective. By subjective I mean that people do different things and expect different results. Even if two different people search using exactly the same search phrase, it is logical to assume that they actually SEARCH for different subjective things.

    So a good search engine will present subjective results that are indeed different.
