A Comprehensive Guide to Different Types of Search Engines and Search Techniques - Prof. J, Assignments of Geology

An overview of various search engines and their functionalities, including open web search engines, metasearch engines, and directory search engines. It also discusses search parameters and strategies such as keyword searching, phrase searching, title searching, and date searches. The document aims to help users understand the differences between these search engines and how to effectively use search techniques to find relevant information.

Typology: Assignments

2009/2010

Uploaded on 02/24/2010

Assignment 25

A Brief Guide to World Wide Web Search Engines

Anyone who has used the World Wide Web has probably used a search engine to find information and resources. Search engines are the only reasonable means of indexing and ranking sites that exist on the vast expanse of the internet. In a way, search engines are the means by which anyone can alphabetize, sort, or prioritize the contents of the web.

Web searching can in some respects be considered an art, and search styles and techniques can be quite personal and individualized. However, there are sets of rules and constraints that channel and govern users of search engines. In addition, there is a wide variety of search engines available for users to take advantage of (it’s true: Google actually has some competition!). The aim of this assignment is to familiarize you with some of the many alternative search engines and search tools available to all web users, and to expose you to the principles for constructing efficient and useful web searches.

Search Engines

Search engines operate much the same way a relational database management system does: a body of data is subjected to a query, and the results that match the query are returned as output. Web queries are structured in a manner similar to database queries, and phrase searching and Boolean operators are employed in the same way. The difference, of course, is the body of data being searched. World Wide Web search engines, without exception, search ‘distillations’ of the web: abbreviated indexes produced by specialized programs that “crawl” the internet, following links and sending back to the mother database a text-based copy of every linked page they encounter. These programs, called ‘web crawlers’ or ‘spiders,’ are key to the success of any search engine.
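The crawl-and-index idea can be sketched in a few lines of Python. This is only a toy illustration, not how any real engine is built: the four-page "web" below, the `.example` addresses, and the `crawl` function are all invented for this sketch, and a real spider would fetch pages over the network rather than read them from a dictionary.

```python
from collections import deque

# Hypothetical miniature "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("geology of georgia", ["b.example", "c.example"]),
    "b.example": ("quartz watches for sale", ["a.example"]),
    "c.example": ("environmental geology notes", []),
    "d.example": ("unlinked page about quartz", []),  # no page links here
}

def crawl(seed, web):
    """Follow links breadth-first from seed and return a word index."""
    index = {}              # word -> set of URLs whose text contains it
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # "Send back" a text copy of the page by indexing each word.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Queue every linked page that has not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("a.example", TOY_WEB)
print(sorted(index["geology"]))   # pages reached that mention "geology"
print("unlinked" in index)        # False: d.example was never reached
```

Note that the page nobody links to never enters the index, which is one reason a search can miss content that does exist on the web.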
The more numerous and efficient the crawlers for a particular engine are, the more pages can be indexed and added to the engine’s index or database, and the more comprehensive that engine’s results will be.

One thing to remember about web search engines is that the index they actually search is a snapshot of the World Wide Web’s content taken at an earlier time. Though the index may be under continual revision, parts of it are always older, and thus more ‘stale.’ For instance, it is unlikely that any search engine will turn up a result for a search that includes the terms “geol2002” and “assignment 25” within hours, or possibly even days, of my uploading this assignment to my website. How quickly “assignment 25” becomes indexed depends on the number and efficiency of the spiders that compile an engine’s index.

Another critical aspect of search engine functionality is the way query results are presented to the user. Since many generalized searches may return hundreds, thousands, or even millions of “hits,” ranking these returns by their relevance to the original query is of utmost importance. Some engines rank results by the number of times a search term appears on a web page; others employ complex algorithms that rank results based on the links between pages. The parent companies of some search engines even accept payment from web page owners to promote their sites to the top of a ranked search list.

Taking the above into consideration, search engines can be divided into three classes based on what and how they search the web.
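As a rough illustration of the two ranking approaches mentioned above, the Python sketch below orders the same hits first by how often the search term appears and then by how many other pages link to each hit. The three pages and both function names are hypothetical, and a simple inbound-link count is only a stand-in for the far more complex link-based algorithms real engines use.

```python
# Hypothetical pages: URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("geology geology geology field trip", ["c.example"]),
    "b.example": ("geology department home page", ["c.example"]),
    "c.example": ("introduction to geology", []),
}

def rank_by_term_count(term, pages):
    """Order matching pages by how often the term appears on each."""
    counts = {url: text.split().count(term) for url, (text, _) in pages.items()}
    hits = [url for url, n in counts.items() if n > 0]
    # Highest count first; ties are broken alphabetically by URL.
    return sorted(hits, key=lambda url: (-counts[url], url))

def rank_by_inbound_links(term, pages):
    """Order matching pages by how many other pages link to each."""
    inbound = {url: 0 for url in pages}
    for _, links in pages.values():
        for target in links:
            if target in inbound:
                inbound[target] += 1
    hits = [url for url, (text, _) in pages.items() if term in text.split()]
    # Most-linked-to first; ties are broken alphabetically by URL.
    return sorted(hits, key=lambda url: (-inbound[url], url))

print(rank_by_term_count("geology", PAGES))
print(rank_by_inbound_links("geology", PAGES))
```

The same three hits come back in a different order under each scheme, which is why identical queries can look quite different from one engine to the next.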
The following list categorizes thirteen of the more popular search engines available to web users, with a brief explanation of each group’s characteristics and functionality:

Open Web Search Engines

This category of search engine provides the most comprehensive set of results, because there are no real constraints on the scope of the material included in the indexed database the engine searches. All data returned by the spiders is available to be searched, and all data is treated equally. In other words, the only limitation on the data to be searched is the quantity of data the spiders actually return to the index database. Below is a list of six of the more popular open search engines on the World Wide Web.

http://www.alltheweb.com
This engine has a moderate-to-large searchable index and is notable for allowing users to display up to 100 results per page. It also supports title searching, where users may restrict results to pages whose titles contain the keywords.

http://www.altavista.com
One of the original, founding search engines on the web, Altavista defined web searching in the mid-1990s. Although its index is not as large as Google’s or Yahoo’s, this engine has at least one unique search function not shared by the others: the “near” operator, which allows users to find web pages with two words located within 10 words of each other.

http://www.ask.com
Formerly known as Ask Jeeves, Ask has been on the scene since 1996. Its index is of the same order of magnitude as those of Yahoo and Live.

http://www.gigablast.com/
A comparatively recent search engine on the web, this engine is notable for the ability to search web page metadata. Otherwise, it has a relatively small search index that is not updated as frequently as many other engines’.

Keyword searching

Plug a word like ‘geology’ into Google’s search window and press the Google Search button. This is an example of a simple keyword search.
You’ll receive about 64.5 million hits, which shows you that this is not exactly a useful search. Using some of the operators below, you’ll be able to narrow your results to a meaningful list of relevant sites.

Boolean Operators

The operators below are called ‘full Boolean’ because they give you (where supported) complete logical control over your searches. Remember to always capitalize all the letters in full Boolean operators.

AND
Use the AND operator to strongly limit your results. Where ‘geology’ alone returns tens of millions of hits, the query ‘geology AND georgia’ will return only those pages with both keywords in them: about 3.25 million on Google.

OR
The OR operator broadens searches: in a two-word search, it admits any indexed web page that contains one or the other or both of your search terms. In Google, the query ‘geology OR georgia’ returns about 333 million results (that is, one third of a billion)!

AND NOT
The AND NOT operator is useful for excluding potentially irrelevant web pages from your searches. For example, the query ‘quartz AND NOT watches’ will remove pages that describe or sell quartz timepieces. Be cautious, however: you can also lose web pages that describe the use and application of quartz in commerce or industry if they happen to mention watches. NOTE that many search engines do not support this full Boolean operator; most, however, will support the implied Boolean not (-).

NEAR
This is a proximity search operator supported primarily by Altavista. The NEAR operator limits search results to pages where two search terms fall within ten words of each other.

Implied Boolean Operators

+ (plus sign)
Functionally similar to the AND operator described above.

- (minus sign)
Functionally similar to the AND NOT operator described above.
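The logic behind these operators maps directly onto set operations, as the short Python sketch below suggests. The four documents and the helper names `pages_with` and `near` are invented for illustration, and reading "within ten words" as a difference of at most ten word positions is an assumption of this sketch, not Altavista's documented behavior.

```python
# Hypothetical documents: ID -> text.
DOCS = {
    1: "geology of north georgia",
    2: "georgia tourism guide",
    3: "quartz watches on sale",
    4: "quartz veins in georgia geology",
}

def pages_with(term):
    """Return the set of document IDs whose text contains the term."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}

def near(a, b, window=10):
    """Return IDs of documents where a and b occur within `window` words."""
    hits = set()
    for doc_id, text in DOCS.items():
        words = text.split()
        pos_a = [i for i, w in enumerate(words) if w == a]
        pos_b = [i for i, w in enumerate(words) if w == b]
        if any(abs(i - j) <= window for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits

both = pages_with("geology") & pages_with("georgia")      # geology AND georgia
either = pages_with("geology") | pages_with("georgia")    # geology OR georgia
excluded = pages_with("quartz") - pages_with("watches")   # quartz AND NOT watches

print(sorted(both))       # [1, 4]
print(sorted(either))     # [1, 2, 4]
print(sorted(excluded))   # [4]
print(sorted(near("quartz", "geology")))             # [4]
print(sorted(near("quartz", "geology", window=2)))   # []
```

AND can only shrink a result set and OR can only grow it, which matches the hit counts quoted above.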
Phrase searching

Phrase searching is an extremely useful method of filtering your search results: it admits only those web pages that contain a particular phrase. It is considered the simplest form of proximity search. For example, you can use Google to find environmental geology web pages by entering (in quotes) "environmental geology" as your search criterion. This will return 913,000 results. Using the Boolean OR instead (environmental OR geology) returns 313 million results.

Limiting Strategies

Title searches
Most of the major search engines support title searches, usually under the advanced search link on the main search page. Altavista allows users to type title: at the start of a query in the simple search box. For example, title:"environmental geology" in Altavista returns about 20,700 web pages that have environmental geology in their titles.

Date searches
Another limiting parameter is the date: you can specify a date range on the advanced search page to limit your results to, for example, only the most recent web pages that contain your search terms.

Domain searches
The advanced search functions of many popular search engines also let you limit your search to specific internet domains (.com, .org, .edu) or to country domains (.us, .uk, .cn, etc.).

URL searches
This search strategy allows you to limit results to specific URLs. For example, you could find the number of web pages that mention orthophotos on the USGS website by phrasing your search in Altavista like this:
url:www.usgs.gov orthophoto

Link searches
This is a specialized but interesting search function that returns a list of web pages that link to a specific web page. It is supported by many engines, but with widely disparate results. For example, the Google link search
link:www.westga.edu
returns about 847 hits, while the same search in Altavista returns over 38,000!
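The phrase, title, URL, and link searches described above all amount to applying a different filter to each indexed page. The Python sketch below shows one possible reading of each; the three page records, their fields, and the function names are hypothetical, and real engines implement these filters in their own ways.

```python
# Hypothetical indexed pages; every field is invented for illustration.
PAGES = [
    {"url": "www.usgs.gov/ortho", "title": "orthophoto products",
     "text": "digital orthophoto quadrangles", "links": ["www.westga.edu"]},
    {"url": "www.westga.edu/geo", "title": "environmental geology",
     "text": "course notes on environmental geology", "links": []},
    {"url": "www.example.org/news", "title": "campus news",
     "text": "geology and the environmental sciences",
     "links": ["www.westga.edu"]},
]

def contains_phrase(text, phrase):
    """True if the words of phrase appear consecutively, in order, in text."""
    words, target = text.split(), phrase.split()
    return any(words[i:i + len(target)] == target
               for i in range(len(words) - len(target) + 1))

def phrase_search(phrase):
    """Pages whose body text contains the exact phrase."""
    return [p["url"] for p in PAGES if contains_phrase(p["text"], phrase)]

def title_search(phrase):
    """Pages whose title contains the exact phrase (like title:)."""
    return [p["url"] for p in PAGES if contains_phrase(p["title"], phrase)]

def url_search(prefix, term):
    """Pages under a given URL prefix that mention the term (like url:)."""
    return [p["url"] for p in PAGES
            if p["url"].startswith(prefix) and term in p["text"].split()]

def link_search(target):
    """Pages that link to the target address (like link:)."""
    return [p["url"] for p in PAGES if target in p["links"]]

print(phrase_search("environmental geology"))
print(title_search("environmental geology"))
print(url_search("www.usgs.gov", "orthophoto"))
print(link_search("www.westga.edu"))
```

The third page contains both words but not the phrase, so the phrase search skips it even though a Boolean OR (or AND) search would return it.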
Internet scavenger hunt

Answer the following ten questions using whatever internet resources you choose. Submit your responses to me in the form of a Word document attached to an email (jconglet@westga.edu). Be sure I receive your answers by 11:00 AM on Monday, May 4, 2009. No assignments will be accepted after this time.

1. Who was the first director of the USGS?
2. What are the two species of elephants?
3. In what year did the Washington Senators play their last game?
4. From what place did the fine-grained ultramafic rock called Komatiite get its name?
5. What is the physical (street) address of the USGS facility in Menlo Park, CA?
6. What is Georgia's state mineral?
7. What was Richter’s (of earthquake fame) home institution at the time he established his seismic scale?
8. Who invented the paper clip?
9. Where was the epicenter of the most recent earthquake in Illinois located (give the town name)?
10. What country had the largest recorded earthquake?