16. Tailor search engines to
By Mark Alberstat
Q. I thought some columns might usefully focus on search engines. I
usually just type a name in either the MSN search bar or the Google search
bar but wonder if there are not more sophisticated engines or how to use
IN THE TIME it takes you to read this short column, thousands of Internet
searches will have taken place. McAnerin Networks Inc., a Canada-based
Web site promotion company, estimates that there are more than 319 million
search queries each month; that's about 127 searches every second.
Gobbling up almost 97 per cent of those queries are Google, Yahoo, MSN,
AOL, Go Network, Excite, Altavista and Lycos. These are the names you
would expect to see in a ranking of search engines, the biggies with more
hits than you could aim at a well-formed search.
The question arises, however: Which search engine is best for you and how
do the main ones differ from each other? A few years ago this was a
simpler question than it is today, with seemingly rival search engine
companies buying each other and morphing into larger sites with sponsored
The sites that users go to can be divided into two basic categories -
directories and engines.
Directories, such as Yahoo, are based
on a hierarchical database that you can either search through or drill
down into. If you were looking for sites featuring mathematical
probability puzzles - and don't we all? - you would drill down into the
Science directory, then into the Mathematics directory, then into the
Probability directory. Although this example is only three levels deep,
some categories are much deeper than that.
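To make the drill-down idea concrete, here is a minimal Python sketch of a
hierarchical directory. The category tree, the site names and the
drill_down function are invented for illustration; this is not how Yahoo
or any real directory stores its data.

```python
# Hypothetical toy directory: nested dicts are categories,
# lists are the site listings filed under a category.
DIRECTORY = {
    "Science": {
        "Mathematics": {
            "Probability": ["Probability Puzzle Page", "Dice and Cards Brainteasers"],
            "Geometry": ["Tiling Patterns Gallery"],
        },
        "Astronomy": ["Backyard Stargazing Guide"],
    },
    "Recreation": {
        "Crafts": ["Cross-Stitch for Beginners"],
    },
}


def drill_down(directory, path):
    """Follow a list of category names and return whatever sits at the end."""
    node = directory
    for category in path:
        if not isinstance(node, dict) or category not in node:
            raise KeyError(f"No category {category!r} along {path}")
        node = node[category]
    return node


# Three levels deep, as in the example above.
print(drill_down(DIRECTORY, ["Science", "Mathematics", "Probability"]))
```

Each extra level of the tree is simply one more name in the path, which is
why deeper categories take more clicks to reach.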
Directories are great when you have a vague notion of the site you are
looking for and want to do a bit of searching yourself. When you send a
query through a directory, you are searching the text of each listed
site's title and description, not the contents of the site itself.
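The difference can be sketched in a few lines of Python. The Listing
class, the sample listings and the directory_search function are all
hypothetical, and the simple substring matching is an assumption made to
keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    description: str
    page_text: str  # what is actually on the site; the directory never reads it


# Hypothetical sample listings.
LISTINGS = [
    Listing(
        "Cross-Stitch for Beginners",
        "Tutorials and free patterns",
        "This month: a cat sampler pattern with a free chart.",
    ),
    Listing(
        "Cat Lovers Club",
        "Photos and stories about cats",
        "Meet our members and their pets.",
    ),
]


def directory_search(listings, query):
    """Return titles of listings whose title or description contains every query word."""
    words = query.lower().split()
    hits = []
    for item in listings:
        # Only the title and description are searched, never page_text.
        searchable = f"{item.title} {item.description}".lower()
        if all(word in searchable for word in words):
            hits.append(item.title)
    return hits


print(directory_search(LISTINGS, "patterns"))  # matches a description
print(directory_search(LISTINGS, "sampler"))   # only in page_text, so no hits
```

A query for "sampler" comes back empty even though the word appears on
the first site, because the sketch only looks at what the listing says
about the site.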
Search engines, such as Google and Altavista, use electronic robots to
crawl through the Web, cataloguing the information they find and
organizing it into a database on the engine's own servers, which you can
then search. Because these crawlers are always working and there are lots of
them, these sites tend to have a larger chunk of the World Wide Web in
their databases than do the directory sites.
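As a rough illustration of what a crawler does, the Python sketch below
walks a tiny made-up "Web" held in memory and builds a word index from
it. The pages, the addresses and the function names are assumptions for
the example; real engines fetch pages over the network and rank results
in far more elaborate ways.

```python
from collections import deque

# Hypothetical in-memory "Web": each address maps to (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("cross stitch patterns and supplies", ["a.example/cats", "b.example/"]),
    "a.example/cats": ("cross stitch patterns featuring cats", []),
    "b.example/": ("probability puzzles with dice", ["a.example/home"]),
}


def crawl(web, start_url):
    """Breadth-first crawl from start_url, building a word -> set-of-addresses index."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Catalogue every word found in the page contents.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Follow links to pages that have not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return the addresses whose text contains every word in the query."""
    matches = [index.get(word, set()) for word in query.split()]
    return sorted(set.intersection(*matches)) if matches else []


index = crawl(TOY_WEB, "a.example/home")
print(search(index, "patterns cats"))
```

Because the index is built from the words on each page rather than from a
short description, a specific query can pick out a single page.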
Engine-based sites are ideal when you know exactly what you are looking
for, such as "cross stitch patterns featuring cats." If, however, you
were simply interested in cross-stitch as a hobby, a directory site such
as Yahoo would work best.
Metasearch sites have become more popular in the past couple of years
because of their power to search several of the top search sites, both
engines and directories, with a single query.
These sites have their uses but can be frustrating because they display
only a small number of the results from each of the sites they query. The
result you are looking for could get left behind, leaving you to search
again at another site.
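The trade-off can be sketched in Python. The engine names, the result
lists and the metasearch function below are invented for illustration, and
the cut-off of two results per site is an arbitrary assumption.

```python
# Hypothetical ranked results from three search sites (best match first).
ENGINE_RESULTS = {
    "engine_a": ["site1", "site2", "site3", "site4"],
    "engine_b": ["site2", "site5", "site6", "site7"],
    "engine_c": ["site8", "site1", "site9", "site4"],
}


def metasearch(engine_results, per_engine=2):
    """Keep the top per_engine results from each site and merge them without repeats."""
    merged = []
    for results in engine_results.values():
        for site in results[:per_engine]:
            if site not in merged:
                merged.append(site)
    return merged


print(metasearch(ENGINE_RESULTS))                # short list, site4 is cut off
print(metasearch(ENGINE_RESULTS, per_engine=4))  # longer list, site4 appears
```

In this toy data, site4 is found by two of the three sites but never ranks
in anyone's top two, so the merged list leaves it out.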
If you are curious about what other people are searching for on the Web,
Metaspy has two interesting services for you, introduced by a Sherlock
Holmes character who describes what you may be getting yourself into. The
two services show filtered and unfiltered search strings; the unfiltered
list is not for family viewing. The list of 10 search strings refreshes
every 15 seconds and represents only a small number of the searches being
performed at Metacrawler.
The following are a few related sites you may not know about:
- A listing of directories and engines from 195 countries and 39
territories around the world.
- www.ixquick.com: A powerful and popular metasearch engine (available in
several languages).
- www.alltheweb.com: A search engine that claims to search 3.1 billion
Web pages for you. It also has a button that lets you view the last 10
queries it performed.
The Mousepad runs every two weeks. It's a service of Chebucto Community
Net, a community-owned Internet provider. If you have a question about
computing, e-mail email@example.com. If we use your question in
a column, we'll send you a free mousepad.
Originally published 31 August 2003