Tons of search engines are out there. Novices will feel
most comfortable in a friendly, easy-to-manipulate
environment such as Yahoo!, About.com, or AskJeeves,
while power users swear by Google (hands down the best
search engine around), AltaVista, and HotBot, the
engines that store the most sites. Other well-known
engines are Excite, Infoseek, LookSmart, Lycos, and
WebCrawler, and there's plenty more out there, believe
me. Each one has its advantages and its disadvantages.
Some, like AltaVista, are search indexes, which
attempt to index all the sites they visit, based on the
text of each site. Subject directories like
Yahoo! act more as a "card catalog," assigning each site
to a subject category. More specialized search engines
abound, such as the Britannica Internet Guide, which
focuses more on encyclopedia-like sites about history,
science, the arts, and so forth; LookSmart, which offers
a smaller, more detailed database; and Lycos, which
gives you more control over your search, as well as
focusing more on pages concerning art, science, and
literature. Most of the directories and smaller search
engines provide links to the behemoths such as
AltaVista, Google, or Excite in case their smaller
database doesn't contain what you want; conversely, some
search engines like AltaVista are supplementing their
sites with directory listings. (Web addresses for these
search engines, and others, are in the
Search Engine Links
and
Metasearch Engine Links
pages of this site.) Search sites have come under fire
for not investing in new technology, selling rankings in
search results, censoring results from rival sites, and
focusing too much on advertising and marketing to do a
decent job of searching the Web for you. If you think a
particular search site is failing you, you've got plenty
of other sites to choose from. And remember: according
to rather old (i.e. pre-Google) data from the NEC
Research Institute, the most complete search engine
available, Northern Light, only covers 16% of the sites
out there. AltaVista and NBCi (formerly Snap!) cover
15.5%, HotBot covers 11.3%, and things drop off from
there, with Lycos and EuroSeek limping in with less than
3% coverage. There are almost 1.5 billion Web sites as
of February 2000; think the number has shrunk since
then? The NEC researchers recommend using metasearch
engines such as Dogpile or MetaCrawler and performing
multiple searches on different search engines. Note that
this survey was done around 1999; Google has since
lapped AltaVista and the others as the most complete
search engine around.
The best search engines index less than 1 in 6 of the
Web sites out there (currently numbering somewhere
around 1.5 billion), so don't restrict your search
to one engine or directory.
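To put those percentages in perspective, here's a rough back-of-the-envelope calculation in Python. The coverage figures and the 1.5 billion total are the ones quoted above; the function names are made up for this sketch. The overlap between the engines' indexes isn't given here, so the sketch only works out the lowest and highest combined coverage that simple arithmetic allows.

```python
# Figures quoted in the text: total sites and each engine's share,
# stored as tenths of a percent so the arithmetic stays in integers.
TOTAL_SITES = 1_500_000_000
COVERAGE = {"Northern Light": 160, "AltaVista": 155, "HotBot": 113}


def indexed_sites(tenths_of_percent, total=TOTAL_SITES):
    """Approximate number of sites an engine indexes."""
    return total * tenths_of_percent // 1000


def combined_bounds(coverage=COVERAGE):
    """Lowest and highest possible combined coverage, in tenths of a percent.

    Lowest: the smaller indexes sit entirely inside the biggest one.
    Highest: the indexes don't overlap at all (capped at 100%).
    """
    shares = list(coverage.values())
    return max(shares), min(1000, sum(shares))


for engine, share in COVERAGE.items():
    print(f"{engine}: about {indexed_sites(share):,} sites")

low, high = combined_bounds()
print(f"All three together: between {low / 10}% and {high / 10}%")
```

Even in the best case, three engines together leave more than half of the Web unsearched, which is the argument for spreading your searches around.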
A number of search engines such as Microsoft's, AOL's,
GoTo, and Yahoo!, among others, use the search engine
software provided by Inktomi (www.inktomi.com).
Inktomi has released the third generation of its search
technology, called GEN3. The claims are that it
currently indexes over 500 million Web documents and has
the ability to hold over a billion. The new search
technology can be seen at any Inktomi-driven site.
A few basics of the kinds of search engine styles:
-
Any words.
This is what most novice searchers enter, and they
end up with yea thousand results, most of which
aren't anywhere close to what they want. An any-word
search for "Martin Luther King" turns up pages about
Martin Tupper, purple martins, Martin Luther, Luther
Vandross, King George III, the rock band King
Crimson, and who knows what else.
-
All words.
These kinds of searches turn up indexed pages with
every search word, in any order. You stand a better
chance of finding what you're looking for, but
off-topic results are still very possible. For
example, entering an all-word search for "Martin
Luther King birthdate" may give you a page about
Leon Smith, the NBA rookie who went to MLK High
School.
-
Exact phrase.
Every indexed page containing the exact phrase turns
up. Carefully worded phrases can often turn up very
useful results, though spurious results still pop
up.
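The three styles above can be sketched in a few lines of Python. This is only a toy: the page texts are invented for illustration, the all-word match is assumed to ignore word order, and a real engine works from a prebuilt index rather than scanning every page.

```python
# Made-up page texts, keyed by a short name.
PAGES = {
    "mlk": "Martin Luther King was born in Atlanta",
    "reformer": "Martin Luther began the Reformation",
    "crimson": "King Crimson is a rock band",
    "mix": "Luther Vandross and King George and Martin Tupper",
    "school": "Leon Smith went to Martin Luther King High School",
}


def words(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def any_words(query, text):
    """Any-word search: at least one query word appears."""
    return any(w in words(text) for w in words(query))


def all_words(query, text):
    """All-word search: every query word appears, in any order."""
    return all(w in words(text) for w in words(query))


def exact_phrase(query, text):
    """Exact-phrase search: the query words appear side by side, in order."""
    q, t = words(query), words(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


def search(match, query):
    """Return the names of pages accepted by the given match function."""
    return sorted(name for name, text in PAGES.items() if match(query, text))


print(search(any_words, "Martin Luther King"))     # all five pages
print(search(all_words, "Martin Luther King"))     # mix, mlk, school
print(search(exact_phrase, "Martin Luther King"))  # mlk, school
```

Note that even the exact-phrase search still returns the high school page, which is the kind of spurious result the list warns about.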
Here are some tips on using the more advanced search
features (often listed under "Power Search" or "Advanced
Search"):
-
Boolean search
You're using algebraic notation, more or less, to
group and set off your search parameters. Bare bones
Boolean entries use AND, NOT, and OR to narrow down
results, such as "UNC AND Tarheels AND basketball
NOT football" to find info about the UNC Tarheels
basketball program, but to weed out pages about the
football program. Parentheses often come into play,
such as "Independence Day AND (NOT movie)" to find
pages about the topic of Independence Day, but to
weed out pages about the movie of the same title.
Many people find Boolean searching quite difficult.
If you do go for Boolean operations, don't forget
the NEAR attribute. It works something like AND,
except that it insists that the two words be near
each other. Most Web search sites default to a NEAR
distance of 10 or 20 words. You might also like some
shorthand notations for commonly used Boolean
operands. AND is the same as & is the same as +. OR
is the same as |. NOT is the same as ! is the same
as -. NEAR is the same as ~. ( ) is the same as " ".
And this should confuse you: when you put items of a
search phrase within parentheses or quotation marks,
you are telling the search engine that you want to
find only those pages that contain all of the items
of the search phrase in the order shown within
parentheses or quotation marks. So, for example,
searching for lions and tigers produces a
list of all pages that mention both lions and
tigers. Meanwhile, searching for "lions and
tigers" or (lions and tigers) produces a
list of all pages that contain the phrase lions
and tigers in that order. That list of results
from the second search doesn't include any pages that
mention only lions or pages that mention only
tigers. Nor does it list any pages that mention
tigers and lions. (Note the different order of
the words in the phrase.) Whee!
-
Combining Boolean operands: There's no limit on how
you can use Boolean operators to expand or focus
your search phrases. You can just keep adding them
until you find what you want, and don't find what
you don't want. For example, you could run this
entire search:
lions AND tigers AND bears OR "dorothy and toto"
NOT "wicked witch"
That search would help you discover almost any
Wizard of Oz pages that don't mention the Wicked
Witch.
-
Categories.
Much easier. Most directories, such as Yahoo!,
separate their indexed pages into a multilevel
directory -- categories. Thus, this site might be
found under "Computers -- Software -- Operating
Systems -- Microsoft -- Windows" or some such.
Usually you can search within categories.
-
Read the site instructions.
Every site does things differently, sometimes
dramatically so.
-
Exclude words.
Sort of a "Boolean lite" technique, this involves
putting a minus sign ( - ) in front of a word to
exclude it. A search for "salsa -dance" gives you
more pages on the condiment and fewer on the dance.
-
Include words.
The flip side of the above. Put a plus sign in front
of words you want to ensure appear in the pages.
Good for fine-tuning any-word searches.
-
Number of results.
Unless you're impatient or your connection is slow,
go for the maximum.
-
Personalization.
These are the search sites' attempts to become your
"portal" to the Internet. Usually they're highly
configurable for your tastes, and often offer other
services such as Webmail, personal calendars, etc.
-
Quoted phrases.
Define entire phrases for searching. "Lord of the
Rings" inside quotation marks will give more pages
about the Tolkien trilogy as opposed to an any-word
search. Probably the most useful single tip in this
list.
-
Search form.
A few sites give you short forms to use for focusing
your searches. You may be able to set the desired
languages, restrict the search to a particular
domain, search for titles only, etc.
-
Search within results.
Some sites let you modify your search and search
again within the previous results.
-
Word forms.
These search for variants of given words: "mouse"
also gives results for "mice," or "children" for
"child."
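If the Boolean operators in the list above still feel slippery, it may help to think of them as set operations. The sketch below assumes a tiny made-up index that maps each word to the pages containing it, and adds a simple NEAR check with the 10-word default mentioned above. It shows the idea only; it is not how any particular search site is implemented.

```python
# Hypothetical mini-index: word -> set of page ids that contain it.
INDEX = {
    "unc": {1, 2, 3},
    "tarheels": {1, 2, 4},
    "basketball": {1, 2, 4, 5},
    "football": {2, 5},
}

# AND is set intersection, OR is union, NOT is set difference.
unc_hoops = INDEX["unc"] & INDEX["tarheels"] & INDEX["basketball"]
any_sport = INDEX["basketball"] | INDEX["football"]
hoops_not_football = unc_hoops - INDEX["football"]

print(unc_hoops)           # pages 1 and 2
print(any_sport)           # pages 1, 2, 4 and 5
print(hoops_not_football)  # page 1 only


def near(text, a, b, distance=10):
    """True if words a and b occur within `distance` words of each other."""
    tokens = text.lower().split()
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


print(near("lions sleep near tigers", "lions", "tigers", distance=3))  # True
print(near("lions sleep near tigers", "lions", "tigers", distance=2))  # False
```

The third result set corresponds to "UNC AND Tarheels AND basketball NOT football": page 2 mentions football, so it drops out.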
Fine-tune your searches:
-
Start
within a specific category, if available.
-
Avoid
computer-specific terms such as "file," "folder," or
"memory," unless you intend that meaning.
-
If
you can use the advanced search features, then do
so.
-
Find
proper names faster by surrounding them in quotes.
-
If
you get too few results, switch to any-word search,
or reduce the number of keywords; for too many
results, try switching to an all-word search or add
keywords.
-
Avoid
articles (a, an, the); the search engines ignore
them.
-
If
you're using a plain-English search such as Ask
Jeeves, restrict yourself to very simple questions:
"Where is Constantinople?" "Who is Marilyn Monroe?"
-
Spell the words right.
Try to use a larger number of words to limit your
search. For example, hunting up a page on North Carolina
men's basketball will be more productive if you use a
search query something like NORTH CAROLINA TARHEELS
COLLEGE BASKETBALL MEN NCAA rather than just NORTH
CAROLINA BASKETBALL. Also, it's helpful to include
synonyms when possible; for example, when hunting down
something on MONITORS, include the word DISPLAYS also.
Try searching for both upper- and lower-case versions of
the words. Try singular instead of plural words. Avoid
letter/number combinations such as "Windows NT" or "3DO
technology."
Google is probably the single most used search engine
out there. Although it is quite simple to use, and
almost eerily accurate, there are methods that will make
Google results even better. First, use the Google
Toolbar (available at toolbar.google.com/) if you
have Internet Explorer 5 or later. Even the advanced
Google search options are available on the toolbar. If
you have another browser, use the Google Buttons
(available at www.google.com/options/buttons.html)
on your browser's toolbar, or try GGSearch from
www.frysianfools.com/ggsearch/. Another tip is to
have Google open its links in new windows; in the
Toolbar, click the Google button and select "Search
Preferences Page." Check the box labeled "Open search
results in a new window." Google will also translate
pages in foreign languages; copy the page's address, and
in the toolbar, click the Google button, select
"Language Tools," paste the copied URL into the
"Translate a Web Page" field, select your language from
the drop-down menu, and click Translate. Google also
translates foreign phrases; in the same "Language Tools"
page, enter the phrase in the "Translate Text" field,
and press Translate. (Don't expect miracles on the
translations, but you should get enough information to
at least get the gist of the phrase.) Not enough
parameters? Try www.google.com/advanced_search;
if that isn't enough, try the Google Ultimate
Interface, with lots of parameters that Google
itself doesn't include, such as date ranges, file types,
language, and country. Check it out at
www.faganfinder.com/google.html. Want a definition
of a word? Search for the word, and click on the
underlined word in the blue bar at the top of the
results page for a quick definition from Dictionary.com,
or find the definition of a word by typing
"define:word." If you're unsure of the spelling of a
word, just enter "Spell:" followed by the word, like so:
"Spell: speling" (don't include the quotes). And for
absolute language fun, go through the Preferences link
and choose a language from the drop-down menu to have
Google display its pages in any of 88 languages and
dialects, including some silly ones like Elmer Fudd, Pig
Latin, or Klingon. Qapla'!
Google
limits its search phrases to 10 words, so shorter is
better. And word order matters: a search for "tarheel
basketball" gives a different set of results than
"basketball tarheel."
Google
has a plethora of syntax tricks that most of us don't
use. You can find out more at
www.google.com/help/operators.html, but here are
some of the most useful. Intitle: use this at the
beginning of a query to find words or phrases
restricted to the titles of Web pages, e.g.
intitle:"Three Blind Mice". Intext does the
opposite: a query for intext:"Three Blind Mice"
hunts down the phrase in the body of Web pages without
looking through the titles. A good example is using the
syntax "intext:HTML" to find pages that talk
about HTML without getting results like
www.fubar.com/index.html . Use the Link
syntax to find out who's linked to a particular page:
for example, I might type in
link:http://www.toejumper.net to see who's linking
to this site. The site syntax restricts searches
to particular domains: for example, I might do best
hunting down scholastic references to, say, Mark Twain
by restricting my search to .EDU sites: I would use
"Mark Twain" site:edu to find these pages. I could
refine my search even more by using a combination: for
example, intitle:"Mark Twain" site:edu to find only
.EDU pages with Twain in the title.
It's easy
enough to use the Google Toolbar to search within a
specific site, but you can use the syntax directly: to
search within this site, type "site:toejumper.net
google" (without quotes) to hunt down all Google
references in these pages.
The plus
(+) and minus (-) symbols have their uses in Google. To
force Google to include so-called "stop words" (words it
normally ignores, such as the), place a plus in
front of it: +the. To exclude words, use the minus
symbol. Don't put spaces between the symbols and the
words.
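If you run the same kinds of searches over and over, you can assemble these query strings with a short script. The Python sketch below is one way to do it; the helper names are invented, the colon form of the site operator follows the site:toejumper.net example, and the search URL shape (a "q" parameter on www.google.com/search) is an assumption rather than anything documented here.

```python
from urllib.parse import urlencode


def google_query(terms="", phrase=None, intitle=None, site=None,
                 force=(), exclude=()):
    """Assemble a query string from the operators described above."""
    parts = []
    if intitle:
        parts.append(f'intitle:"{intitle}"')
    if phrase:
        parts.append(f'"{phrase}"')
    if terms:
        parts.append(terms)
    # No space between the symbol and the word.
    parts.extend(f"+{word}" for word in force)
    parts.extend(f"-{word}" for word in exclude)
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query):
    """Turn a query into a URL (assumed URL shape, see the note above)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_query(phrase="Mark Twain", site="edu"))
print(google_query(terms="google", site="toejumper.net"))
print(google_query(terms="salsa", exclude=["dance"]))
print(search_url(google_query(phrase="Mark Twain", site="edu")))
```

The urlencode call takes care of escaping the quotation marks, spaces, and colons, so you don't have to do it by hand.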
Want to
find something in your area? Go to local.google.com
and enter whatever you're looking for along with the
city, state, or ZIP code.
Find maps
for a specific location by typing "map location"
(without the quotes, and where the word "location" is
replaced by the city or state you desire). More exacting
maps results can be had by entering a US street address
along with the city, state, and/or ZIP code. Entering a
phone area code gets you a regional map.
Find
someone by entering their phone number: you get a name,
an address, and a map which can lead you to their front
door. Makes you leery about giving out your own phone
number to just anyone, doesn't it? Looking for someone?
Enter their name, city, and state into the search box
and see what comes up. It's not always accurate: for
example, I learned that I'm a basketball player for the
Loyola Greyhounds. News to me...!
Google
gives us a nifty calculator that can be used from the
search box: go to
www.google.com/help/features.html#calculator.
Enter
FedEx, US Postal Service, or UPS tracking numbers to
track errant packages. Entering a UPC (Universal Product
Code) gets you info on the product and its maker.
Entering an airline flight number, such as "United
Airlines 150" (without the quotes) gets you info on the
flight.
Google
News, though comprehensive and up-to-date, can't be
customized, so you're stuck with information you don't
necessarily want. You can drag the linked name of a News
section such as Sports to IE's Links toolbar, and then
click that link when you want to see sports news. Links
to the other Google news sections appear on the left
side of each news section's page.
"The In
URL and All In URL Options:" The InURL option is simple:
by using the inurl: keyword, you force Google to hunt
only within URLs themselves. You wouldn't find this site
by using "inurl:troubleshooting", but you would find it
by using "inurl:toejumper" (leave out the quotes). If
you're not sure of the entire URL of a particular site,
precede your Google search phrase with "allinurl:" For
example, if you're looking for the URL of a site with
the words Tarheel Basketball, type "allinurl: tarheel
basketball" (again, sans quotes).
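Here is roughly what those two options do, sketched in Python. It treats a match as a plain case-insensitive substring test, which is a simplification, and the second URL is a made-up example.

```python
def inurl(term, url):
    """Rough stand-in for inurl: the term appears somewhere in the URL."""
    return term.lower() in url.lower()


def allinurl(terms, url):
    """Rough stand-in for allinurl: every term appears in the URL."""
    return all(inurl(term, url) for term in terms.split())


site = "http://www.toejumper.net"
print(inurl("toejumper", site))        # True
print(inurl("troubleshooting", site))  # False

# Hypothetical URL, for illustration only.
print(allinurl("tarheel basketball",
               "http://www.example.com/tarheel/basketball.html"))  # True
```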
Find
locations fast by entering the address into the Google
search field; you'll get two links, one to Yahoo! Maps
and the other to MapQuest. Find out about phone numbers
by entering the area code and number you're curious
about (like this: 123 456-7890) and see if useful
information comes up.
Want to
know if your site is listed in Google? For the fewest
and most accurate hits, substitute your Web domain name
for each "example" in the following: example
site:www.example.com and enter it into Google's search
field.
Finding
images in Google is simple; just use the Images button
on the home page, the "Search Images" button on the
Toolbar, or go to images.google.com/. For best
results, select "Advanced Image Search" and start with
the "related to the exact phrase" field.
Stock
ticker information can be accessed by simply putting in
the appropriate ticker symbol (for a stock listed on,
say, the NYSE, AMEX, or NASDAQ) and
clicking the "Show stock quotes" link at the top of the
page to get a special page from Yahoo! Finance, along
with tabs to get info from ClearStation, The Motley
Fool, MSN MoneyCentral, and Quicken.
Broadband
users can change their results numbers from 10/page to
30/page with a negligible increase in download time. Just make
the changes in the Preferences page, as detailed above.
Want to
search within a site? The Toolbar has a "Search Site"
button that makes it simple, once you go to the site
itself.
Use the
Adult Filter to keep out the adult-related sites and
images by going through the Preferences link, scrolling
down to SafeSearch Filtering, and clicking "Use Strict
Filtering."
View .PDF
files in HTML simply by clicking on the "View as HTML"
link. It won't always display as nicely as the actual
.PDF file, but you can see it well enough to decide
whether it's worth the download. In fact, Google can
hunt down file types: just use the "filetype" search
marker, as in "filetype:doc tarheel basketball" (sans
quotes, as always) to find .DOC files about Tarheel
basketball. Other file types include Adobe Acrobat files
(.pdf), Lotus 1-2-3 files (.wk1, .wk2, and so forth),
Excel (.xls), PowerPoint (.ppt), Rich Text files (.rtf),
Flash (.swf), regular text (.ans, .txt), and others.
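A small Python helper can build these file-type queries for you. This is a minimal sketch: the function name is invented, and the set of extensions is just a subset of the ones listed above, not a complete list of what Google supports.

```python
# A subset of the extensions named in the text.
KNOWN_TYPES = {"doc", "pdf", "xls", "ppt", "rtf", "swf", "txt"}


def filetype_query(extension, terms):
    """Build a query such as 'filetype:doc tarheel basketball'."""
    ext = extension.lower().lstrip(".")
    if ext not in KNOWN_TYPES:
        raise ValueError(f"unrecognized file type: {extension}")
    return f"filetype:{ext} {terms}"


print(filetype_query(".DOC", "tarheel basketball"))
print(filetype_query("pdf", "tarheel basketball"))
```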
Find
quotes or phrases by wrapping the portion you do know in
quotation marks, e.g. "I pity the fool" -- just enter it
and find out the rest of the quote and who said it.
Note: Yahoo! is rather unique in the way it handles
search queries. It assumes that every word you enter is
part of an AND strand (i.e. "North AND Carolina AND
basketball") unless you go into the Advanced options and
choose the "Matches on any word (OR)" choice. Then you
have to insert the + sign to get the AND operation.
Don't forget, you can bookmark and save search pages, to
use at a later time. Or for real down-the-road use, copy
them to your hard disk with SurfSaver (www.surfsaver.com/).
It's free.
As mentioned above, some search sites give you a
"natural language" option that purports to translate
your English phrasings (often questions such as "Where
can I find a '59 Corvette page?") into something that
search engines can use. Computers don't work this way;
don't use this option, unless you're using the
AskJeeves, Excite, or Lycos Pro engines (see below), and
then don't expect miracles. However, Ask Jeeves and its
associate AltaVista do quite well with simple questions
such as "How do I learn about patent law?"
Excite has recently refined its natural search option,
calling it the "Zoom In" feature. Now you enter your
search phrase, press the Zoom In button, and Excite
provides you with a list of alternate terms and phrases
that might help you narrow the focus of your search.
Naturally, it works best with broader topic searches.
Start with as narrow a search as you can. Use a
specialized search engine to search a narrower database
if that's possible.
If a site has advanced search options, learn to use
them. One useful example is HotBot's advanced site,
which you can learn from the info on
www.hotbot.com/help/tips/search_features.asp.
HotBot, Lycos, and the others doing this have tried hard
to make it easier for plain folks to use.
Typing multiple words in a search box will give you
varying results depending on the search engine. Yahoo!
assumes you want an OR between each word (i.e. "North OR
Carolina OR Tarheels") and gives you results from pages
with any of those words in them. HotBot assumes that you
want an AND between each word (i.e. "North AND Carolina
AND Tarheels") and gives you search results restricted
to all of those words. You're dependent on each search
engine's default assumptions; hunt them out.
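One way to stop depending on each engine's defaults is to spell out the operator yourself. The Python sketch below simply joins your words with an explicit AND or OR; the function name is invented, and it assumes the engine you're using accepts those operators in its search box.

```python
def explicit_query(words, operator="AND"):
    """Join the words with an explicit Boolean operator."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return f" {operator} ".join(words.split())


print(explicit_query("North Carolina Tarheels", "AND"))
# North AND Carolina AND Tarheels
print(explicit_query("North Carolina Tarheels", "OR"))
# North OR Carolina OR Tarheels
```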
By using the phrasing "title: Pearl Jam" (without the
quotes), you restrict the search engine to finding only
those sites with the phrase "Pearl Jam" in their titles.
In this case, you'd avoid getting a million listings of
homepages from people who "reely love Pearl Jam" and say
so on their site, right above the picture of their
beagle.
If you put a domain name such as .COM or .EDU in your
search string, most engines will only pull up sites with
those domains.
The "url" option restricts engines to presenting you
with results from sites with the given word or phrase in
the site's URL. For example, the search string "url:
sasquatch" would give you www.sasquatch.com and
www.whattheheck.com/sasquatch but not
www.bigfoot.com.
Some engines support the "host field" option. In this
case, you could use the string "host: ebay.com" (again,
no quotes) and limit your search to pages on the eBay
site.
Some engines also support the "related" option, where
you can type "related: www.microsoft.com" and get pages
that the engine lists as being related to Microsoft's
home page.
An even more restrictive search option is the "image"
selection. Type "image: penguin" to find only images
with the word "penguin" in their file names. Not always
useful, as many sites use codes or odd combinations of
characters to name their images -- check out the various
NASA sites, for example, for beautiful shots of the Ring
Nebula named "STS-4356id6.jpg" or something similar and
uninformative.
"Wildcards" isn't just a term for your black-sheep
in-laws; it means symbols that stand for something else.
An asterisk * means "anything," so searching for the
string "mam*" gets you results on "mama," "Mame,"
"mam4," "mamma," whatever's out there. A few search
engines automatically insert invisible wildcards at the
end of every search phrase, so searching for "flow" gets
you results on "flowers," "flowbee," "flow-control," and
all sorts of possibly irrelevant results. Use quotation
marks to rid yourself of the automatic wildcards.
Here's a
list of examples of some commonly used wildcards and an
explanation for each one:
-
WIN*
searches for any file starting with the letters
win -- Windows, wince, wine cellar, etc.
-
*DATA*
searches for any file containing the word data
-- databank, rawdatafile, etc.
-
*UP
searches for any file ending with the letters up
-- backup, fouled up, etc.
-
P?P
searches for any three-letter filename beginning and
ending with the letters P -- pip, pop, pup,
etc.
-
*??T
searches for any filename and its accompanying
extension ending in the letter T --
README.TXT, AUTOEXEC.BAT, etc.
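You can try these patterns out with Python's standard fnmatch module, which uses the same * and ? conventions for file names. The list of names is made up, and *DATA* is the assumed wildcard form of the "contains data" entry. A search engine's own wildcard rules may differ, so treat this as a way to get a feel for the symbols.

```python
from fnmatch import fnmatchcase

# Made-up names to test the patterns against.
NAMES = ["windows", "wine", "databank", "rawdatafile",
         "backup", "pip", "pop", "poop", "mama", "mamma"]


def matches(pattern, names=NAMES):
    """Return the names matching a wildcard pattern, ignoring case."""
    pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


print(matches("WIN*"))    # ['windows', 'wine']
print(matches("*DATA*"))  # ['databank', 'rawdatafile']
print(matches("*UP"))     # ['backup']
print(matches("P?P"))     # ['pip', 'pop']
print(matches("mam*"))    # ['mama', 'mamma']
```

Notice that P?P skips "poop": each question mark stands for exactly one character, while an asterisk stands for any number of them.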
"Field" searches limit your search to web pages'
"fields," such as the title, the URL, or the top-level
domain.
If the search engine is subdivided into categories,
drill down into the category that is applicable to your
search before submitting a search query. You'll often
find more, and more relevant, results.
Use quotation marks to force a multi-word phrase to
appear. For example, "fox terrier" will not give you
pages devoted to Scotties or Cairn terriers, or foxes.
Most search engines let you save successful searches for
later browsing; you can also bookmark search pages.
Tailor your search query to a single site by using the
HOST protocol: if you only wanted to search the ZDNet
site for information about board games, you would type
HOST:ZDNET.COM "BOARD GAMES" to limit your search to that
particular site.
AltaVista (www.altavista.com/) gives you the
option to hunt down pages containing specific Java
applets by using the "applets:class" search parameter.
Just replace the term "class" with the name of the
applet you're looking for, lose the quotation marks, and
begin your search.
AltaVista and some others will let you use the
"like:URL" modifier to search for sites similar to the
one you list.
Some search engines let you use a text-only version; use
this version to speed up your searches.
Multi-search engines are coming into vogue. Dogpile,
MetaCrawler, All4One, MetaSeek, and others combine up to
20 different search engines in their queries. Lots of
users don't even bother going to single search engines
anymore; rather, they go straight to one of the meta
engines.
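What a meta engine does with all those result lists can be sketched in Python. The version below interleaves the ranked lists round-robin and drops duplicates; that merging strategy is an assumption made for illustration, not a description of how Dogpile, MetaCrawler, or any other meta engine actually ranks things, and the result lists are invented.

```python
def merge_results(result_lists):
    """Interleave ranked URL lists round-robin, dropping duplicates."""
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged


# Invented results from two hypothetical engines.
engine_a = ["a.com", "b.com", "c.com"]
engine_b = ["b.com", "d.com"]

print(merge_results([engine_a, engine_b]))
# ['a.com', 'b.com', 'd.com', 'c.com']
```

Each engine's top hit comes first, and a site that both engines found shows up only once.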
Running a complex search or hunting for more than one
item? Run your search, bookmark the first page of your
results, and keep searching. Check your bookmarked sites
later.
Search for your target words, but also include synonyms
and common variants when possible. Language-impaired?
Try www.thesaurus.com/ for help.
From a Search Results page on Yahoo! (and also from the
Yahoo! home page), you can opt for a more sophisticated
set of options for your keywords. On any of those pages,
click the "Advanced Search" link to see the Search
Options page. From here you can give Yahoo! more clues
than the raw keywords that you use in a simple search.
You can: choose between Category and Site searching,
apply Boolean search operators to your keywords, select
to avoid the Web entirely in favor of Usenet newsgroups,
and alter the time period of your search.
Northern Light (www.northernlight.com/) has a
special collection of 2900+ full-text periodicals
unavailable elsewhere on the Web. Searching it is free,
but downloading full-text articles can cost from $1 to
$4. Researchers who find Lexis-Nexis too pricey may want
to give Northern Light a try. Another alternative is
InfoBeat, which feeds your e-mail account with lots
of general news and information. I'd recommend setting
up a freemail account for this, since InfoBeat tends to
overwhelm you with stuff. Try this free service from
www.infobeat.com/.
Don't forget to search Usenet (groups.google.com/)
for info.
Off-line search tools are becoming more popular,
especially for those power searchers who grind their
teeth when AltaVista gives them a kazillion sites to
hunt through for a single piece of information. The
available utilities run the gamut from giving you
automated search functions and weeding out invalid and
outdated links to compiling results from several search
engines and letting you group links by category for
later perusal. Copernic 2001 comes highly
recommended.
Want to pick up a little Internet litter? When you enter
a URL and get the "Error: 404" message, submit that site
to one of the major search engines such as HotBot or
AltaVista. Their spider will visit the site, see it
doesn't exist, and delete it from their listings. That way
the next searcher doesn't get that link returned to them
on their search. They ought to give a Boy Scout award for
this one.