Everyone knows that the Internet and the Web contain an enormous amount of information; one estimate put the number of documents stored online in 2005 at over 12 billion. To sort through all of those documents and find what we need, we rely on Internet search engines and, more often than not, Google. Web crawlers (or spiders) sift through HTML source code, extracting keywords that are indexed for future searches.
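
To make the crawl-and-index idea concrete, here is a minimal sketch (my own illustration, not Google's actual implementation): download a page, pull the visible text out of its HTML, and record which keywords appear at which URL in a simple inverted index.

```python
# Toy crawl-and-index sketch: fetch a page, extract its text,
# and note which words appear on which URL (an inverted index).
from html.parser import HTMLParser
from urllib.request import urlopen
from collections import defaultdict
import re

class TextExtractor(HTMLParser):
    """Collects the visible text chunks from an HTML document."""
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        self.chunks.append(data)

def index_page(url, index):
    """Download one page and add its keywords to the inverted index."""
    html = urlopen(url).read().decode("utf-8", errors="ignore")
    extractor = TextExtractor()
    extractor.feed(html)
    words = re.findall(r"[a-z0-9]+", " ".join(extractor.chunks).lower())
    for word in set(words):
        index[word].add(url)

index = defaultdict(set)              # keyword -> set of URLs containing it
index_page("https://example.com", index)
print(index["example"])               # pages where the keyword appears
```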

Although search engines have become very advanced, there is a critical, fundamental flaw in the technology, mainly due to the tremendous growth rate of information on the Internet. According to a New York Times article, by 2010 the amount of information available in the world will exceed the available storage capacity by a factor of nearly two to one. Thinking logically about the situation, search engine results cannot possibly filter those 12 billion documents well enough, or fast enough, to surface the best, most relevant information.

According to a 1999 study by Lawrence and Giles, who were among the first to explore search engine technologies, no search engine indexes more than 16 percent of the Web. Although a team of four Google “spiders” can reportedly crawl an estimated 100 pages per second – or around 650 kilobytes of data per second – that is still not fast enough. What this ultimately means is that there is too much information on the Internet for our relatively primitive search engine technologies to sort through.
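
A quick back-of-the-envelope calculation shows why that rate falls short. Using only the figures quoted above, a single crawling team would need years just to visit every document once:

```python
# Rough arithmetic with the figures quoted in this column:
# ~12 billion documents online and a crawl rate of 100 pages per second.
documents = 12_000_000_000
pages_per_second = 100

seconds_needed = documents / pages_per_second
years_needed = seconds_needed / (60 * 60 * 24 * 365)
print(f"About {years_needed:.1f} years for one pass over the Web")  # ~3.8 years
```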

In addition to this primary flaw, search engines have other issues. Because they work in a straightforward, methodical way, their algorithms can be exploited to make sites rank higher in Google search results, thus bringing in more traffic and therefore more revenue. This practice is known as search engine optimization (SEO) or search engine marketing (SEM).

As we look into the current landscape of the Web, we can see a possible solution to the problem of too much information for such primitive search engines. The solution involves the human mind and, more importantly, the human network.

First, let’s examine the capability of the human mind. The new Cell central processing unit used in the PlayStation 3 is estimated to deliver a performance of two teraFLOPS (two trillion floating-point operations per second); however, the human brain is believed to have a theoretical performance of one hundred teraFLOPS. What this means is that humans have a vastly superior ability to sift through information and decide whether it is relevant; with the help of the human social networking of the emerging Internet, we can predict that future search engines will not rely on computer software to sort out relevant information and documents but will instead have human beings deciding on the relevancy of Internet content.

I don’t want to reiterate my previous column, but we can already see this happening on the Web. Take Younanimous.com, for example: the user searches through one of the popular search engines (e.g. Google, MSN, Yahoo!) and submits what he or she has found to Younanimous.com, which then ranks those links based on popularity and archives them for future search queries. Through this democratic system, superior, quality content will float to the top of the rankings, where it has the best chance of exposure. I’ve tried out their service, and the search results were mediocre. However, because the site is relatively new, we’ll have to wait a bit to see its full potential.
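
As a rough sketch of how such human-powered ranking could work (my own toy illustration based on the description above, not Younanimous.com’s actual system), imagine an archive where every submitted link counts as a vote for a query, and later searches return the most-voted links first:

```python
# Toy human-ranked search archive: users submit links they found useful
# for a query, each submission is a vote, and searches return links
# ordered by popularity.
from collections import defaultdict

class HumanRankedIndex:
    def __init__(self):
        # query -> {url: number of user submissions (votes)}
        self.votes = defaultdict(lambda: defaultdict(int))

    def submit(self, query, url):
        """A user found `url` relevant for `query`; record it as one vote."""
        self.votes[query.lower()][url] += 1

    def search(self, query):
        """Return archived links for the query, most popular first."""
        ranked = self.votes[query.lower()]
        return sorted(ranked, key=ranked.get, reverse=True)

archive = HumanRankedIndex()
archive.submit("search engines", "https://example.org/overview")
archive.submit("search engines", "https://example.org/overview")
archive.submit("search engines", "https://example.net/history")
print(archive.search("search engines"))  # most-submitted link ranks first
```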

Currently, content gains exposure mainly through good SEO or SEM practice, which does not guarantee the best content. With this new search engine architecture, we can see a more balanced and fair method of ranking quality content. This will ultimately mean a more advanced Internet and more relevant content for the end-user, which, especially for university-level students, is a good thing.
