Sunday, December 04, 2005

DEEP WEB SEARCH

WHAT IT IS: Technology that boldly goes where no search engine has gone before.

WHY IT'S HOT: Google may have already indexed 8 billion webpages, but that's just the tip of the iceberg. Many more pages are hidden behind corporate firewalls or sit in databases waiting to be indexed. By some estimates, this so-called deep Web is 500 times bigger than the World Wide Web as we know it. Unlike the public Web, however, its content can't be retrieved by the usual Web crawlers, which only follow links from page to page. Instead, the information must be fed into search engines' mammoth databases using special retrieval techniques.
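To make the crawler limitation concrete, here is a minimal sketch in Python. Everything in it is a made-up toy: the LINKS map, the DATABASE behind the /search form, and the probe_form helper are hypothetical names for illustration, not any vendor's actual technique. It only shows why a link-following crawl never reaches records that exist solely as answers to a query, and how submitting candidate terms to the form could surface them.

```python
from collections import deque

# Toy site (hypothetical): each static page maps to the links it contains.
LINKS = {
    "/": ["/about", "/search"],
    "/about": [],
    "/search": [],  # a search form: no outgoing links, results appear only per query
}

# Hypothetical database behind the /search form: query term -> result page.
DATABASE = {
    "aspirin": "/record/1",
    "ibuprofen": "/record/2",
}

def crawl(start):
    """Breadth-first, link-following crawl; returns the set of pages reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in LINKS.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def probe_form(terms):
    """Submit each candidate term to the form and collect the result pages."""
    return {DATABASE[term] for term in terms if term in DATABASE}

surface = crawl("/")
deep = probe_form(["aspirin", "ibuprofen", "zinc"])

print(sorted(surface))  # pages reachable by following links
print(sorted(deep))     # pages reachable only by querying the form
```

The crawl stops at the form page because nothing links to the individual records; only the query probe reaches them. Real systems would have to guess useful query terms and respect access controls, which this toy ignores.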

Before the advent of desktop search, our PCs were part of that invisible Web -- connected to the Internet but not indexed. File-sharing networks already search your PC for MP3s, but there are tricky privacy and security issues to resolve before your hard drive can join the visible Web. There are also millions of digitally transcribed books waiting to be connected. Ultimately, deep Web search could answer a direct question better than hundreds of links, because many of the most authoritative sources have yet to make it online.

KEY PLAYERS: Endeca, Glenbrook Networks, Google, IBM, Kozoru, and Yahoo.
