Search Engine Structure

             The Internet is a vast and overwhelming collection of information on
             any subject that can be imagined. To provide structure to this huge amount
             of information, search engines allow users to search for specific pieces of
             information. Search engines such as Google and Yahoo are technically known
             as information retrieval (IR) systems (Liddy, 2001). These systems work
             from indexes built in advance, and the indexes are matched against
             queries entered by users. Each index is built from the words found in
             documents, together with pointers to where those words occur. An IR system
             of this kind has four components: a document processor, a query processor,
             a search and matching function, and a ranking capability (Liddy, 2001).
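
             The relationship between an index and a query can be sketched in a few
             lines of Python. This is a minimal illustration, not the design of any
             real search engine: the names build_index and search, the sample
             documents, and the simple "every word must match" rule are all
             assumptions made for the example, and ranking is left out entirely.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample collection, keyed by document id.
documents = {
    1: "search engines build an index",
    2: "an index maps words to documents",
    3: "users enter queries",
}
index = build_index(documents)

print(search(index, "index"))        # documents 1 and 2
print(search(index, "index words"))  # document 2 only
```

             The index is built once, ahead of time; each query is then answered by
             looking up words rather than rereading the documents.
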
             The document processor prepares, processes, and inputs the documents
             that users search against (Liddy, 2001). Several
             functions are inherent in this process, including normalizing the document
             stream, breaking it into retrievable units, metatagging subdocument pieces,
             identifying indexable elements, etc. The first three functions are known
             as pre-processing, and the main aim is standardization of multiple formats.
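
             The three pre-processing steps might look roughly like the sketch below.
             Everything in it is an assumption chosen for illustration: the function
             names, the use of a regular expression to strip HTML-like tags as a
             stand-in for format standardization, the choice of paragraphs as the
             retrievable unit, and the fields attached to each unit.

```python
import re


def normalize(raw):
    """Reduce a raw document to plain lower-case text."""
    text = re.sub(r"<[^>]+>", " ", raw)  # drop HTML-like tags
    text = re.sub(r"[ \t]+", " ", text)  # collapse runs of spaces and tabs
    return text.strip().lower()


def split_units(text):
    """Break normalized text into paragraph-sized retrievable units."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def tag_units(doc_id, units):
    """Attach simple metadata (document id and position) to each unit."""
    return [{"doc": doc_id, "unit": i, "text": unit} for i, unit in enumerate(units)]


# Hypothetical input mixing markup and plain text.
raw = "<h1>Title</h1>\n\nFirst  paragraph.\n\n<p>Second paragraph.</p>"
units = split_units(normalize(raw))
tagged = tag_units(7, units)

for entry in tagged:
    print(entry)
```

             Whatever the input format, the later indexing stages then see the same
             kind of record: a piece of plain text with a label saying where it came
             from.
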
             The nature and quality of search results depend largely on the stage at
             which indexable elements are identified. Quality is also affected by the
             elimination of stop words: words that contribute little to the meaning of
             a document or query, such as "and", "but", and "of". Deleting these
             words saves search time and index volume. Closely related is term
             stemming, in which suffixes are removed so that variants of a word reduce
             to a common stem. This reduces the number of unique words in an index and
             again saves storage space. A disadvantage is that the precision of search
             results may suffer. A stronger or weaker stemming algorithm can, however,
             be chosen in order to regulate precision. Finally, the document
             processor extracts i...
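
             Stop word removal and stemming, as described above, can be illustrated
             with a toy example. This is only a sketch under stated assumptions: the
             stop word list extends the essay's three examples with "the", the two
             suffix lists are invented to contrast a weak stemmer with a strong one,
             and the suffix-stripping rule is far cruder than published stemming
             algorithms.

```python
# "and", "but", "of" come from the text above; "the" is an added assumption.
STOP_WORDS = {"and", "but", "of", "the"}

# Hypothetical suffix lists: the weak stemmer strips plurals only,
# the strong one strips more (longer suffixes are tried first).
WEAK_SUFFIXES = ("s",)
STRONG_SUFFIXES = ("ation", "ing", "ed", "es", "s")


def remove_stop_words(words):
    """Drop words that carry little content."""
    return [word for word in words if word not in STOP_WORDS]


def stem(word, suffixes, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def index_terms(text, suffixes):
    """Lower-case, remove stop words, then stem what remains."""
    words = remove_stop_words(text.lower().split())
    return [stem(word, suffixes) for word in words]


text = "the indexing of documents and indexed pages"
weak = index_terms(text, WEAK_SUFFIXES)
strong = index_terms(text, STRONG_SUFFIXES)

print(weak)    # ['indexing', 'document', 'indexed', 'page']
print(strong)  # ['index', 'document', 'index', 'pag']
```

             The strong list merges "indexing" and "indexed" into one term, leaving
             three unique terms instead of four, but it also cuts "pages" down to
             "pag". That is the trade-off between index size and precision in
             miniature.
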

Search Engine Structure. (1969, December 31). In MegaEssays.com. Retrieved 14:10, April 19, 2024, from https://www.megaessays.com/viewpaper/200658.html