Information Retrieval System and Its Evaluation Report

Updated: Mar 6th, 2024

Introduction

The web is the most common place for retrieving information, ranging from scientific to general content. A variety of search engines are available, and each retrieves information differently, so the quality of their results is not the same. It is therefore important for a researcher to choose the search engine best suited to his or her work. This report discusses Google Scholar, Scirus, and PubMed. The Scirus engine was designed specifically for retrieving scientific data from journals and selected web resources. Scirus lets one search across a wide range of science-related subjects and narrow the search to a particular author or journal. Scirus also offers the option of setting a preference to access full-text articles through the Weber State Library.

The PubMed search tool provides free access to MEDLINE, the NLM's database of citations and abstracts from the fields of medicine, nursing, dentistry, health care systems, and clinical science (Norbert, Mounia, & Andrew, 2007). PubMed was designed to link a researcher to full-text articles in PubMed Central, on publisher web sites, or in other related resources on the web. It also offers clinical query search filters, links to articles related to the search term, and a spell checker that corrects a mistyped term automatically. MEDLINE is the bibliographic database used for citations; it covers approximately 5,400 journals in medicine and is used worldwide. Citations in MEDLINE are assigned Medical Subject Headings (MeSH), a controlled vocabulary that helps users carry out their research.

According to Johannes (1982), to search PubMed a user is prompted to enter the search topic and click the search button. PubMed can be searched by MeSH terms and by author name. There are additional search modes capable of finding complex terms in specific fields, as well as a specialized search for clinical terms (2002).

Scirus is used for scientific research and indexes over 370 million items in aid of that research. The reason for choosing Scirus for information retrieval is that it focuses only on web pages that contain scientific content, which helps locate information on the web quickly.

Scirus can filter out sites that are not scientific and can locate peer-reviewed articles in formats such as PDF and PostScript, which are difficult to find with other search engines. The engine searches information based on what is on the server, as explained by Daniel & Ishwa (2002).

Seed List Creation

The seed list is the basis on which Scirus crawls the internet. The Scirus seed list is developed by the following methods:

  • An automatic URL extractor tool locates new scientific seeds based on link analysis of the most popular sites in specific subject areas.
  • The Elsevier publishing units are asked on a regular basis to supply a list of sites in their subject areas.
  • Scirus users and webmasters regularly submit suggestions for new sites.
  • Members of the Scirus Scientific, Library and Technical Advisory Boards provide input on an ongoing basis.

The seed list contains only URLs that have been manually checked for scientific content, so Scirus crawls the internet efficiently and gives deep coverage of scientific web sites.

Information Classification

Scirus uses software to analyze the profile of a page and classify it by type. The recognized types are: scientific abstracts, full-text scientific articles, scientific home pages, and other page types relevant to the scientific domain. The classification algorithm analyzes the structure and vocabulary of the page to assign one category (Soffer & Samet, 1996). For example, scientific home pages are recognized by structural information such as the presence of address information, a biographical data layout, and a publication list.
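As a rough illustration of classifying a page from its structure and vocabulary, the following Python sketch assigns one category by counting simple textual cues. The cue lists and the classify_page function are hypothetical and chosen only for illustration; the actual Scirus classifier is not described in enough detail here to reproduce.

```python
import re

# Hypothetical cue patterns for each page type (assumptions, not Scirus's rules).
CUES = {
    "scientific abstract": [r"\babstract\b", r"\bkeywords\b"],
    "full-text article": [r"\bintroduction\b", r"\bmethods\b",
                          r"\bresults\b", r"\breferences\b"],
    "scientific home page": [r"\bpublications\b", r"\bbiography\b",
                             r"\baddress\b", r"\bcurriculum vitae\b"],
}


def classify_page(text):
    """Return the category whose cues match most often, or 'other'."""
    lowered = text.lower()
    scores = {
        category: sum(1 for pattern in patterns if re.search(pattern, lowered))
        for category, patterns in CUES.items()
    }
    best = max(scores, key=scores.get)
    # A page that matches no cue at all falls into the catch-all category.
    return best if scores[best] > 0 else "other"
```

A page that mentions an address, a biography, and a publication list would score highest for the home page category, which mirrors the example given above.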

Scirus Index

After classification is complete, Scirus indexes the pages for searching. Scirus returns results drawn from the whole page, and these may include terms that were not included in the search index.

Scirus Query

The query handling implemented by Scirus improves the ranking and relevance of results. It is designed to infer the user's intention automatically and to search more intelligently by rewriting queries.

Scirus Ranking

Scirus uses an algorithm to rank the documents returned for a query, ordering results by relevance. Ranking is based on two values: terms and links.

Term frequency

The location and frequency of occurrence of a term within the document are measured, and the global frequency of the term within the whole index is taken into consideration. When searching for a term, Scirus asks the following questions about term location and frequency:

  • Is the term in the title?
  • Is the term in the text of a link?
  • Is the term located near the top or the bottom of the text?
  • How many times is the term used?

To prevent full-text articles from ranking higher than title/abstract pages simply because they are longer, Scirus counts the number of keyword occurrences and divides it by the total number of terms in the document. Terms that appear near each other within a document are considered more relevant than terms that appear at a greater distance from each other. We therefore conclude that the proximity of search terms affects the Scirus ranking, as discussed by Johannes (1982).
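The length normalization and the proximity idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the tokenizer and the function names normalized_term_frequency and min_distance are hypothetical, and the report does not say how Scirus combines these signals into a final score.

```python
from itertools import product

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase the text and split it into words without punctuation."""
    words = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in words if word]


def normalized_term_frequency(tokens, query_terms):
    """Keyword occurrences divided by the total number of terms."""
    if not tokens:
        return 0.0
    wanted = {term.lower() for term in query_terms}
    hits = sum(1 for token in tokens if token in wanted)
    return hits / len(tokens)


def min_distance(tokens, term_a, term_b):
    """Smallest gap in tokens between the two terms, or None if one is absent."""
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    if not positions_a or not positions_b:
        return None
    return min(abs(a - b) for a, b in product(positions_a, positions_b))


abstract = tokenize("Blood pressure and stress")
article = tokenize("Blood flow was measured. Later we discuss pressure in detail")
print(normalized_term_frequency(abstract, ["blood", "pressure"]))
print(normalized_term_frequency(article, ["blood", "pressure"]))
print(min_distance(abstract, "blood", "pressure"),
      min_distance(article, "blood", "pressure"))
```

Both texts contain each keyword once, but the short abstract gets the higher normalized frequency and the smaller distance between the two terms, so a longer document does not win on raw counts alone.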

Google Scholar

Adam (2002) explained that Google Scholar is a search engine developed by Google and designed to search for scholarly literature on the web. Google has been working with publishers, which gives the company access to materials that are then reachable through Google Scholar. This access to publishers' content is what gives Google Scholar an advantage over other search engines: searchers get easy access to the material in which they are interested. Researchers can reach full text because of the relationship between Google and publishers.

Google Scholar uses indexes to search content. It covers a great range of areas, including science and specifically medicine. PubMed is among the sources that have agreed to let Google Scholar's crawler index the full content of their sites, as discussed by Soffer & Samet (1996).

Google Scholar makes information search fast and easy: it retrieves documents based on the searched keywords and orders the results with a relevance algorithm. Much of the content indexed by Google Scholar is licensed commercial journal content, so when a user searches, the result may be only an abstract, leaving out the full text. To address this issue, institutions have to configure an OpenURL link resolver, such as SFX, which authenticates users and provides access to full-text content available through institutional subscriptions (Peters, 2002).

Google Scholar has weaknesses. It is devoid of advanced search features. Medical research found that Google Scholar has limitations for clinical use. In tests conducted by medical practitioners, Google Scholar was found to crawl only a subset of the available content.

Google Scholar recognizes that there is a problem in accessing articles after a search: most articles are available only to journal subscribers, and only users at subscribing institutions will see the institution's link to a paid article. To mitigate this problem, Google Scholar has worked with various service and content providers to offer OpenURL links to researchers (Eakin & Margaret, 1999). Google Scholar shows OpenURL links to the SFX link server from Ex Libris; this allows institutions with an SFX link server to have their electronic library holdings displayed in Google Scholar search results.
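To make the OpenURL mechanism more concrete, the Python sketch below builds a link of the kind a resolver such as SFX accepts. The key names follow the key/encoded-value form of the OpenURL standard (ANSI/NISO Z39.88-2004); the resolver address, the citation fields, and the build_openurl function are hypothetical placeholders, not values taken from any real institution or from Google Scholar.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, citation):
    """Build an OpenURL query for a journal article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": citation["title"],
        "rft.jtitle": citation["journal"],
        "rft.date": citation["year"],
        "rft.volume": citation["volume"],
    }
    # urlencode percent-encodes the values so the link is safe to embed.
    return f"{resolver_base}?{urlencode(params)}"


# Hypothetical resolver address and citation, for illustration only.
link = build_openurl(
    "https://resolver.example.edu/sfx",
    {"title": "Example article", "journal": "Example Journal",
     "year": 2002, "volume": 1},
)
print(link)
```

The institution's link server receives the citation metadata in the query string and decides, from its own holdings, whether it can offer the user a full-text copy.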

Research found that Google Scholar suffers from the following problems:

  • Google Scholar cannot index citations in the database, which results in a lack of clickable links.
  • The citation counts displayed are highly inflated and not accurate.
  • Google Scholar does not remove duplicate citations.
  • The citation ranks given by Google are not clear compared with those from established services.
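The duplicate-citation problem in the list above can be illustrated with a short Python sketch of one possible remedy. It treats two citations as duplicates when their normalized titles and years match. The matching rule and the names citation_key and remove_duplicates are assumptions made for this example, not a description of how any search engine handles duplicates.

```python
import re


def citation_key(citation):
    """Normalize the title (case, punctuation, spacing) and pair it with the year."""
    title = re.sub(r"[^a-z0-9 ]", "", citation["title"].lower())
    return (" ".join(title.split()), citation.get("year"))


def remove_duplicates(citations):
    """Keep the first citation seen for each normalized key."""
    seen = set()
    unique = []
    for citation in citations:
        key = citation_key(citation)
        if key not in seen:
            seen.add(key)
            unique.append(citation)
    return unique
```

Counting citations only after such a step would also reduce the inflation of citation counts, since the same record would no longer be counted more than once.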

Google Scholar lacks important features of MEDLINE: it does not permit nested Boolean searching, and it lacks features such as explosion, publication-type limits, and subheadings. Researchers who are looking into specific questions will use Google Scholar.

Google has its own mechanisms for identifying the authors and publishers of the papers being searched. Google Scholar uses citations in calculating its ranking.

Google Scholar Ranking

Search engines return different information for the same search because they rank sites in different ways. Google ranks a site by counting the other sites that link to it, so popular sites are given a higher ranking (Deb, 2004). Other engines count the number of times a keyword appears and the places where it appears. A keyword that appears in the title is ranked higher than one that appears in the body of the web page, as discussed by Eakin & Margaret (2002).
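The two ranking styles can be contrasted with a small Python sketch. The first function ranks pages by a plain count of inbound links, which is a deliberate simplification and not Google's actual algorithm. The second scores a page by keyword occurrences, with title matches weighted more heavily. The function names and the title weight of 3.0 are hypothetical.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    links is an iterable of (source, target) pairs.
    """
    inbound = Counter(
        target for source, target in set(links) if source != target
    )
    # Sort by descending count, then by name for a stable order.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


def keyword_score(title, body, keyword, title_weight=3.0):
    """Score a page by keyword count, weighting title matches more heavily."""
    word = keyword.lower()
    in_title = title.lower().split().count(word)
    in_body = body.lower().split().count(word)
    return title_weight * in_title + in_body
```

Because the two functions look at different evidence (links between pages versus words on a page), the same set of pages can come out in a different order, which is why two engines can return different results for the same search.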

This section compares test searches in Scirus (PubMed) and Google Scholar. Google Scholar provides a simple way of accessing peer-reviewed papers, academic journals, and articles. Its main search interface is intuitive, with a simple query box, while PubMed's databases use search interfaces with a variety of advanced features (Zongmin, 2009).

Google Scholar provides broad access to gray literature. It retrieves content beyond journals, including preprint archives, conference proceedings, and institutional repositories, and it includes links to the online collections of academic libraries. Despite these benefits, Google Scholar has some disadvantages: it lacks advanced search functions, it lacks vocabulary control, and its scope of coverage is limited. When Google Scholar is used to retrieve information from PubMed, much of the information is missing.

Methodology

A clinician performed the following test using the two retrieval tools, PubMed and Google Scholar. Searches were run by topic, author, title, journal name, and a combination of all fields (Datta, 2006). The selected topics included blood pressure, stress, and articles by specific authors in specific journals. Topics were selected from questions received during reference transactions.

The citations found through Google Scholar and PubMed were examined for format, date, Medical Subject Headings (MeSH), uniqueness, duplication, and availability of full text from the author's institution. The search results were analyzed to determine which items were retrieved uniquely by each of the two tools.
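The uniqueness analysis amounts to comparing two sets of retrieved items. A minimal Python sketch, assuming each citation has already been reduced to a comparable identifier, is given below; the identifiers and the names compare_result_sets and percent are placeholders, not data or tooling from the study.

```python
def compare_result_sets(google_scholar_ids, pubmed_ids):
    """Split retrieved items into those unique to each tool and those shared."""
    scholar, pubmed = set(google_scholar_ids), set(pubmed_ids)
    return {
        "only_google_scholar": scholar - pubmed,
        "only_pubmed": pubmed - scholar,
        "in_both": scholar & pubmed,
    }


def percent(part, whole):
    """Share of part in whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2) if whole else 0.0


# Placeholder identifiers, for illustration only.
comparison = compare_result_sets(["a", "b", "c"], ["b", "c", "d"])
print(comparison["only_google_scholar"], comparison["only_pubmed"])
print(percent(len(comparison["in_both"]), 3))
```

The same percent helper can express measures such as the share of citations with full text available, once the underlying counts are known.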

Result

Google Scholar returned a larger retrieval set than PubMed (Daniel & Ishwa, 2002). Most of the items retrieved by Google Scholar were journal articles, while PubMed retrieved items in the form of books, web pages, subject index lists, theses, bibliographies, newsletter items, and annual meeting abstracts. The latter result contained few gray literature items.

The main title link in each Google Scholar citation was used to determine whether full text was available. Full text was available for 45% of the total citations retrieved. The assumption was that full-text access depended on the institutional subscriptions available to the author of the study, while the other retrieved items were assumed to be freely available. In 23.34% of results, the Google Scholar citation linked out to a PubMed record. Almost half (47.90%) of PubMed citations provided full-text access through the author's institution.

The analysis found that:

  • Thirty items retrieved by Google Scholar were in a format other than journal article.
  • A number of the unique Google Scholar items are not indexed by PubMed.
  • Google Scholar retrieves items by searching the full text of many articles rather than citation data alone.

Terminology was found to be a major factor affecting retrieval and the ability of both systems to return unique items. Most of the items retrieved by Google Scholar were off topic: Google Scholar also returned items that contained the search terms but did not correspond to the intended search. PubMed retrieved complete citations for unique items that Google Scholar failed to retrieve, because most were indexed under the appropriate MeSH term, as argued by Adam (2002).

Google Scholar and PubMed: A Comparison

Comparing Google Scholar and PubMed is difficult because the two systems operate in different ways. PubMed is designed to search a well-defined set of journals, while Google Scholar searches resources beyond journals, and its exact coverage is not well described. The results obtained from the systems differ because the systems search different data (Johannes, 1982). Google Scholar lacks the specialized search features found in PubMed. Even so, Google Scholar has some benefits. It provides an easy place to start a search and locate an initial set of possible articles, and it enables the searcher to locate citations of older items that would be missed if only PubMed were used. Google Scholar can also provide access to gray literature, which eases access to biomedical literature. Such material would otherwise be difficult to find, which would have implications for the field of public health, as explained by Datta (2006).

The benefit of searching PubMed is the ability to use the MeSH vocabulary, while Google Scholar has no feature for controlled-vocabulary searching. MeSH provides a way of narrowing results to what the user needs. PubMed also restricts its retrieval to citation data.

There is also a need to establish a standard means by which IR software can interact with reference databases, using a published web services interface that enables an IR administrator to select between reference providers, for example PubMed and CrossRef.

Conclusion

With the increasing use of automated web searches, search engines have needed to develop tailored indices that surface the academic sources scholars want. More specifically, science researchers need their searches to be based on scientific journals. The common search engines have not been up to this task, and Scirus has made it a reality. The Scirus search engine has been developed and tailored to ensure that users have scientific information at hand. Compared with other search engines, Scirus ranks at the top of the list of engines and tools that give high-quality, specific information. It was designed specifically for retrieving scientific data from journals and selected web resources, it lets one search across a wide range of science-related subjects and narrow the search to a particular author or journal, and it offers the option of setting a preference to access full-text articles through the Weber State Library.

References

Adam, D. (2002). The counting house. Nature, 415(6873).

Daniel, S., & Ishwa, K. S. (2002). Image retrieval using a hierarchy of clusters.

Datta, R. (2006). Image retrieval: Ideas, influences, and trends of the new age. Pennsylvania, USA.

Deb, S. (2004). Multimedia systems and content-based image retrieval. New York: Idea Group Inc (IGI).

Eakin, J. P., & Margaret, E. (1999). Content-based image retrieval.

Johannes, A. B. (1982). User evaluation of information retrieval systems: Some methodological considerations. Indian University; University of South Africa.

Norbert, F., Mounia, L., & Andrew, T. (2007). Comparative evaluation of XML information retrieval systems. Germany: Springer.

Peters, C. (2002). Evaluation of cross-language information retrieval systems. Germany: Springer.

Soffer, A., & Samet, H. (1996). Retrieval by content in symbolic databases.

Zongmin, M. (2009). Artificial intelligence for maximizing content based image retrieval. IGI Global.
