Wednesday, November 21, 2007

My Research Proposal for PhD

Background


In the age of information explosion, the volume of information on the Internet keeps growing, and the web has become a huge warehouse of information resources. However, effective access to the most relevant of these resources is one of the chief concerns for users and is becoming a critical problem. In this situation, Internet search engines and search-engine-based portals act as navigators for their users. Bharat and Broder (1998) believe that “search engines are among the most useful and popular services on the web”. However, some (e.g. Moghaddam and Parirokh, 2006; Dogpile, 2005) argue that currently available search engines have been overwhelmed by thousands of documents, resulting in information overload or chaos, and that no single search engine, however powerful, can comprehensively search and retrieve all of the resources available on the Internet. Moghaddam and Parirokh (2006) believe that the major search engines (such as Google, Yahoo, AltaVista, Excite, etc.) index only a fraction of the publicly indexable web.

The web is growing much faster than the indexing capabilities of those search engines. In addition, programs such as robots and spiders, which are used by the majority of search engines, cannot index many web pages, although those pages can be searched through local search engines (Meng, Yu and Wu, 2001). So no single search engine can index the entire web. Moreover, every search engine indexes different web pages, which means that if only one search engine is used, relevant results from other search engines will be missed (Bazac, 2002, cited in Moghaddam and Parirokh, 2006).
Spink et al. (2006) suggested that web search engines can differ from one another in three ways: crawling reach, frequency of updates, and relevancy analysis. The performance capabilities and limitations of each search engine are therefore different. Various studies (e.g. Ding and Marchionini, 1996; Bharat and Broder, 1998; Spink et al., 2006) showed very low overlap between the results retrieved by different web search engines for the same queries. Furthermore, Lawrence and Giles's (1998) study showed that any single web search engine indexes no more than 16 percent of all web sites.


Meta-search engines

As the web continues to grow, the relative coverage of individual search engines is decreasing, and search tools that combine the results of multiple search engines are becoming more valuable (Lawrence and Giles, 1999). Consequently, search tools called meta-search engines have been designed so that users can search the web more effectively. Meta-search engines send the query to several search engines at once and return the results from all of them in one long unified list, simultaneously exploiting the best features of many individual search engines (Langville and Meyer, 2006). Unlike individual search engines and subject-directory-based information systems, meta-search engines do not crawl the Internet, and therefore they do not maintain their own databases. A meta-search engine's performance thus depends on the combined performance of the individual search engines it includes. This unique feature enables users to search more of the web in less time, and to do it with only one click. Furthermore, Zhang and Cheung (2003) argue that meta-search engines' customised and advanced search features offer an excellent alternative for users to improve both search effectiveness and efficiency, which will make them more appealing to information professionals and personal users. Various other studies (e.g. Hu et al., 2001; Hai et al., 2004) also show the attractiveness of meta-search engines.
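
The mechanism described above (send one query to several engines at once, then return a single unified list) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: engine_a, engine_b and engine_c are hypothetical stand-ins that return fixed lists, and the merging rule (reciprocal rank fusion with a constant k) is one possible choice I picked for the example, not the method used by any particular meta-search engine.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search engines: each maps a query to a
# ranked list of URLs. A real meta-search engine would call remote services.
def engine_a(query):
    return ["a.example/1", "shared.example/x", "a.example/2"]

def engine_b(query):
    return ["shared.example/x", "b.example/1"]

def engine_c(query):
    return ["c.example/1", "a.example/1"]

def meta_search(query, engines, k=60):
    """Send the query to every engine at once and merge the ranked lists.

    Merging uses reciprocal rank fusion (an illustrative assumption): each
    URL scores sum(1 / (k + rank)) over the engines that returned it.
    """
    # Fan the query out to all engines concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    # Accumulate one score per distinct URL, which also removes duplicates.
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)

    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

merged = meta_search("meta-search engines", [engine_a, engine_b, engine_c])
for position, url in enumerate(merged, start=1):
    print(position, url)
```

Pages returned by more than one engine rise to the top of the unified list, and pages returned by only one engine are still kept, which is the coverage advantage discussed above.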

Interestingly, none of the meta-search engines ranks near the top in the 2006 report of the Nielsen//NetRatings search reporting service. According to this report, search market share is divided as follows: Google 49.2%, Yahoo 23.8%, MSN 9.6%, AOL 6.3%, Ask 2.6% and others 8.5%. My study therefore intends to focus on the underlying issues of meta-search engines and to identify what is preventing them from serving a greater percentage of users, even though they have considerable advantages over individual search engines.

Research questions

In particular, I would like to explore the following two questions:
1. Which search engines deal best with which types of queries?
2. Can a meta-search engine be designed that directs queries to the search engines best able to handle them?
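
To make the second question more concrete, here is a rough Python sketch of what directing queries could look like. Everything in it is an assumption made for illustration: the engine names, the three query types, the toy classifier, and above all the scores in the table, which are placeholders and not measured results. Real values would have to come from answering the first question.

```python
# Placeholder scores in [0, 1]: how well each (hypothetical) engine handles
# each query type. These are NOT measured results; answering research
# question 1 is what would supply real values.
PERFORMANCE = {
    "engine_x": {"question": 0.4, "phrase": 0.7, "keyword": 0.6},
    "engine_y": {"question": 0.8, "phrase": 0.3, "keyword": 0.5},
    "engine_z": {"question": 0.5, "phrase": 0.6, "keyword": 0.9},
}

QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "which"}

def classify(query):
    """Toy classifier (an assumption): quoted phrase, question, or keyword."""
    text = query.strip()
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return "phrase"
    words = text.lower().split()
    if text.endswith("?") or (words and words[0] in QUESTION_WORDS):
        return "question"
    return "keyword"

def route(query, performance=PERFORMANCE, top_n=2):
    """Return the query type and the top_n engines scoring best for it."""
    query_type = classify(query)
    ranked = sorted(
        performance,
        key=lambda name: (-performance[name][query_type], name),
    )
    return query_type, ranked[:top_n]

print(route("how do meta-search engines work"))
print(route('"information overload"'))
print(route("search engine coverage"))
```

Only the selected engines would then receive the query, instead of every engine the meta-search engine knows about.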

I believe the findings of my study will benefit both meta-search engine developers and users.