The first tool at your disposal is the search engine. A search engine asks you to describe what you are looking for and then gives you a list of pages that have something to do with what you asked about.

1. How do search engines find stuff?

Large search engines like Digital's AltaVista employ some earthy friends, the spider and the worm. Programs that gather information about what's available on the World Wide Web and the Internet are called crawlers, spiders, worms, or robots (bots for short). These creepy critters essentially spend their days traversing the Web and Internet, doing nothing but browsing. Everything they look at is sent back to home base and cataloged.

Web designers can add special instructions to their pages, such as meta descriptions and keywords, to give these guys more information about the pages they're looking at. For instance, the Vegetarian Pages web site contains keywords and descriptions of the type of information that can be found on the site. Keywords for this site are: vegetarian, vegan, vegetarianism, veganism, meatless, compassion, health, etc. These keywords are not seen on the web pages themselves; instead, they are hidden and visible only to programs that collect that data for cataloging and indexing. Once the spider or worm has collected these keywords and descriptions, your search engine can come along and search through all of the information that has been gathered.

2. How do search engines differ?

The biggest difference between search engines is where and how they will let you search. Today, more and more search engines let you choose to search through Usenet newsgroups rather than the World Wide Web. Since DejaNews focuses exclusively on Usenet newsgroups, I usually use their search engine to find a topic of interest in a newsgroup or a particular article in a group.

3. What's the difference between a search engine and a directory?
A directory classifies and catalogs the documents found on the World Wide Web. A search engine simply lets you search through those documents. A directory requires more maintenance, because new pages are added to the Web by the minute and each new page has to be indexed and filed away in a suitable category. Someone or something, usually a computer program, has to look at all of these new pages and decide which category each one best fits into.

Both Yahoo and Wired Source are excellent directories that let you search through their database of information. Either will allow you to focus your browsing. Directories are your best bet if you are not sure what you are after yet, but you have some idea of what category it might fall under. But if you are looking for something specific, search engines may be able to find it faster, once you are accustomed to using them.

4. How can I improve my search?

Searching can be frustrating. However, there are a few simple steps you can take to improve your search. Since the WebTV Network gives you the power of the Excite search engine right from your home page, I will explain how to make better use of their engine.

Excite is unique because it searches by concept. As Excite puts it: "Excite uses Intelligent Concept Extraction (ICE) to find relationships that exist between words and ideas, so the results of a search will contain words related to the concepts you're searching for."

Excite's concept searching will often find relevant pages that may not even contain the words used in the search statement. It also helps you out by moving the most relevant documents to the top of the results, even when thousands of documents are found. In other words, it's smart and it tries to comprehend just what you are after. It's also versatile: it lets you employ many of the same searching techniques found in other search engines, in a simpler format.
I'll now demonstrate just what this means. Let's try to find something. Some of us here at WebTV Networks really dig J. Otto Seibold, author of children's books. He does a series about a character named Mister Lunch, who is a professional bird-chasing dog. So, since just about everyone else has a home page, let's see if Mr. Lunch has one of his own!

If we search for just mister lunch, we wind up with pages and pages of results. Personally, I would rather be reading a Mr. Lunch book than thousands of search results. Let's narrow the search down. Excite lets us use several shortcuts to narrow our search:

Since lunch is such a common word, Excite uses its ICE to substitute words for lunch to find something similar. Using the + symbol cuts the concept searching down a bit. Now our search results are down to a more manageable number (1759 to be exact; I checked on my own WebTV Internet terminal). However, this is still around War and Peace size. So, let's try something new:

It's worth sticking in a word or two about capitalization here. If you type your search in either all lowercase or all UPPERCASE, the search engine will ignore capitalization. However, if you type your search with a mixture of upper and lower case, the engine will look for an exact match of what you typed. For instance, "mister lunch" finds those two words in any case, but "Mister Lunch" will cause the search engine to become case sensitive.

Back to our search. A search for "Mr. Lunch" found 34 documents. Much better.

5. What is a Boolean expression?

It's an imposing term but a useful tool. Named after mathematician George Boole, Boolean expressions are the common thread between nearly all of the search engines on the web. Here is a list:

AND: works in the same way that the + (plus) symbol does, as described above.

AND NOT: works in the same way that the - (minus) symbol does, as described above.

OR: will return pages with the word on either side of it.
For instance, a search for children OR kids would return any pages that contain the word children or the word kids. This helps if you are looking for something that could go by many different names.

( ): Parentheses are used to group portions of Boolean queries together for more complicated queries. For example, to find web pages that contain the word "sports" and either the word "soccer" or the word "hockey" you would enter:

sports AND (soccer OR hockey)

Read through Excite's Search Help for a more detailed discussion of their Advanced Search and how it uses Boolean queries.

6. I still haven't found what I'm searching for!

The key to searching is to know what you want to find. The more keywords you can include in a search query, the more a search engine has to go on. Let's go back to our search for the Mister Lunch web page. Having read J. Otto Seibold's books, I already know a little bit about Mister Lunch. I know that Seibold goes by the name Jotto and that Mr. Lunch has a friend named Space Monkey. Fortunately, that's all I need to know. When I entered:

jotto AND space AND monkey AND lunch AND (mr OR mister)

The search returned only 3 documents, all of them about Mr. Lunch. Notice that the Boolean expressions AND and OR, together with parentheses, helped me find Mr. Lunch no matter where he might be or how his name is spelled.

Another way that I could have found Mr. Lunch's home page is by searching for images of Mr. Lunch. After all, he is an illustrated character. To search for an image I would use either the WebSEEk directory or Wired Magazine's search engine, HotBot. HotBot is particularly helpful because it allows you to search for types of data like a sound, an image, a JavaScript, or whatever you like!

7. What cool things can I do with a search engine?

Well, here's something all of you web developers out there might find useful. AltaVista allows you to find web pages that have linked to your web page.
See what has linked to WebTV Networks Inc. To learn how to do this, take a look at AltaVista's help page on advanced searches.

Once you have found what you were looking for, Excite will let you find a lot of documents just like it! Click on More Like This to the right of the item listed and Excite will immediately use that document as an example in a new search.

If you would like to limit your search to just the title of a web page or site, or to just an image on a site, enter it this way:

title:"The Washington Post"
image:"flower.jpg"

If you know part of a web site's actual address but not all of it, you could try the following. Say you were looking for a web page that has information on BMW cars. Type in:

url:BMW.html

This search will return web pages that contain both BMW and html in a web site's address.

The search tools page in the reference section of Explore is a good place to jump off into many different search engines and directories.

A recent special report by Scientific American is a must read for anyone who is interested in the future of searching on the World Wide Web. I highly recommend taking a look at it.

Next month I hope to talk more about searching the Internet (instead of the World Wide Web). You'll learn about all of its wonderful text based protocols like FTP and Gopher. Until then, good hunting!
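
For the curious, here is a minimal sketch in Python of how a Boolean query like the ones in questions 5 and 6 could be evaluated. It is only an illustration, not how Excite or any real search engine works. The four pages, their names, and the search function are all made up for this example, and I am assuming that AND binds more tightly than OR unless parentheses say otherwise.

```python
import re

# Hypothetical mini "web": page name -> page text (invented for illustration).
PAGES = {
    "lunch-home": "Mister Lunch and Space Monkey, by Jotto",
    "lunch-fan": "Mr Lunch fan page: Jotto draws Space Monkey at lunch",
    "cafeteria": "School lunch menu for kids",
    "zoo": "See the monkey house, then eat lunch",
}


def words(text):
    """Lowercase a page and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# A tiny catalog: page name -> set of words found on that page.
INDEX = {name: words(text) for name, text in PAGES.items()}


def search(query):
    """Return the sorted names of pages matching a Boolean query.

    Operators (AND, OR, AND NOT) must be typed in uppercase.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("query ended unexpectedly")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        result = parse_and()
        while peek() == "OR":
            take()
            result = result | parse_and()   # pages matching either side
        return result

    def parse_and():
        result = parse_term()
        while peek() == "AND":
            take()
            if peek() == "NOT":
                take()
                result = result - parse_term()  # AND NOT: drop matches
            else:
                result = result & parse_term()  # AND: keep pages in both
        return result

    def parse_term():
        tok = take()
        if tok == "(":
            result = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return result
        # A plain word: every page whose word set contains it.
        return {name for name, ws in INDEX.items() if tok.lower() in ws}

    matches = parse_or()
    if peek() is not None:
        raise ValueError("unexpected token: " + peek())
    return sorted(matches)


print(search("jotto AND space AND monkey AND lunch AND (mr OR mister)"))
# prints ['lunch-fan', 'lunch-home']
print(search("lunch AND NOT monkey"))
# prints ['cafeteria']
```

Notice that the (mr OR mister) group is what lets both made-up Mr. Lunch pages match, whichever way the name is spelled, while the AND terms weed out the cafeteria and zoo pages.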