Tuesday, September 25, 2007

The Invisible Web

In instruction sessions I like to talk to students about the "invisible web," or what some people call the "deep web." Instructors often ask me to explain the difference between a simple Google search and a search in the library databases. Basically, search engines send out programs called "crawlers" or "spiders" that follow links from page to page and copy what they find back to the search engine's servers. The search engine stores this information in its own database, so a search on Google is actually a search of Google's database, not of the live web. Google ranks its results by relevance using an algorithm that depends in part on how many other pages link to a given page. Theoretically, the most authoritative sources on a specific topic would have the most links pointing to them. This does not always happen in reality, because individuals and businesses create tons of links to their own pages so that those pages will climb the list of Google results. Some political or television personalities tell people to embed links in their web sites to push an agenda. Several years ago, such a prank was pulled when many web pages used the phrase "miserable failure" as the text of a link to the President's biography page on the White House web site, so that a search for that phrase brought up that page. These pranks are sometimes called Google bombs: http://blogoscoped.com/googlebomb/.
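To make the ranking idea concrete, here is a minimal sketch in Python. It is not Google's actual algorithm, which is far more involved; it simply counts inbound links and sorts pages by that count. The site names and the `rank_by_inbound_links` function are made up for illustration.

```python
from collections import Counter

# Hypothetical link graph: each pair is (page containing the link, page it points to).
links = [
    ("blog-a.example", "encyclopedia.example"),
    ("blog-b.example", "encyclopedia.example"),
    ("news.example", "encyclopedia.example"),
    ("blog-a.example", "hobby-site.example"),
]


def rank_by_inbound_links(link_pairs):
    """Return the linked-to pages, ordered from most to fewest inbound links."""
    counts = Counter(target for _source, target in link_pairs)
    return [page for page, _count in counts.most_common()]


# A coordinated campaign adds ten extra links that all point at one page.
bombed = links + [(f"campaign-{i}.example", "hobby-site.example") for i in range(10)]

print(rank_by_inbound_links(links))   # encyclopedia.example comes first
print(rank_by_inbound_links(bombed))  # hobby-site.example now comes first
```

In this toy version, ten planted links are enough to push the hobby site above the encyclopedia, which is the basic weakness a Google bomb takes advantage of.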
Additionally, web search engines are expected to follow certain internet conventions. This means that if one of their crawlers comes upon a web site with a robots.txt file (a "robots exclusion" file), the crawler is supposed to skip whatever pages that file tells it to stay out of. Search engines also do not have the authority to read and retrieve everything on the internet. Some information belongs to publishers and authors and must be accessed with a user name and password, which usually requires paying a fee. Databases house information that costs a lot of money; more precisely, the information cost a lot of money to produce, and the publishers and authors must be compensated accordingly. Academic libraries, therefore, purchase subscriptions to many online databases to support the research efforts of the students and scholars they serve. Tuition money from students helps to pay for these databases, so students should take advantage of the vast amounts of knowledge that can be found in them.
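The crawler rule described above is usually expressed in a small text file named robots.txt at the top level of a web site. As a rough sketch of how a well-behaved crawler might check that file before fetching a page, the example below uses Python's standard `urllib.robotparser` module. The robots.txt contents, the "ExampleBot" crawler name, the journal.example addresses, and the `may_fetch` helper are all invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of /subscribers/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subscribers/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A public page is allowed; a page in the subscribers-only area is not.
print(may_fetch(ROBOTS_TXT, "ExampleBot", "https://journal.example/about.html"))
print(may_fetch(ROBOTS_TXT, "ExampleBot", "https://journal.example/subscribers/article1.html"))
```

Note that robots.txt is a request, not a lock. Pages behind a user name and password are kept out of search engines by the login itself, whether or not a robots.txt file is present.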
Databases include newspaper articles, book reviews, popular magazine articles, and scholarly articles. College students should endeavor to find the most accurate and relevant information for their assignments. Often the most authoritative information can be found in scholarly, or peer-reviewed, articles. In peer review, other scholars critique a scholar's work before it can be published, and authors often must revise their work in response. This rigorous process helps to promote the advancement of truth and knowledge.

