Weekend Wrapper: More thoughts on GOOG database; the drag on internet performance from the bot armies; and SVW readers get discount to upcoming Churchill Club Mossberg/Swisher gadget fest!

By Tom Foremski - October 28, 2005

By Tom Foremski, Silicon Valley Watcher.com

The Google database is the logical next step in the creation of a second, proprietary internet. Upload your content to the GOOG, the planet's largest computer.

Why host your own site? Just dump it into GOOGbase, the search algorithms will organize it for you, and for your customers/readers/users.

Content owners such as craigslist won't dump their content into GOOGbase. But many smaller sites will--because they are desperate for clicks and get excited when the GOOGbot comes around every three months. No waiting--just dump it in.

I've often said to the fine Search Engine Optimization folk out there: Let the search engines optimize themselves to find the sites, it's their job; concentrate on optimizing for your customers/users--not the bot.

Now, no need for the bot. And no need for the SEO most probably. Because the GOOG knows what is located where--if it's hosting and holding the content.

And this also is the best way to stop the masses of online scams and fraud. Google can argue that it can track every bit of the bytes that cross its network, from start to finish. (It is a shame AOL went open--the right move at the right time, again ;-)

The burden of search sites on the internet

But, do we still need GOOG? Search engines don't drive much traffic to sites--direct bookmarks do and those are the ones you want. This is repeat traffic. You get them by optimizing for the user not the bot.

Let me say again: you do not want to rely on search engines to bring you your traffic. That traffic comes from people who don't know where you live or what you do. You want repeat traffic, from people who know you, trust you and know where you live. They don't come in through a search engine. So optimize for the user--not the bot.

BTW, I would love to see how much internet bandwidth the bots, crawlers, scrapers, etc. are using up. I wonder if anybody has done a study of the amount of traffic and server performance degradation that they cause?

It would not surprise me if the number is gobsmackingly high. Can anybody do the math? Let me know...

In SVW's case, for example, the MSN and all the other spiders and bots, used up about one third of our bandwidth, and provided less than 4 per cent of the views. And degraded the performance of the site for others.
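For anyone who wants to do the math on their own site, here is a minimal sketch of one way to estimate the bandwidth side of that figure from a web server access log. It assumes the log is in the common "combined" format, and it guesses which requests are bots by looking for a few substrings in the user agent; the BOT_MARKERS list, the function name and the sample lines are all made up for illustration, not taken from SVW's logs. It says nothing about how many views the search engines send back.

```python
import re

# Tail of a combined-format log line:
#   "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical substrings used to flag crawler user agents; extend as needed.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def bot_share(lines):
    """Return (fraction of bytes, fraction of requests) attributed to bots."""
    totals = {"bot": [0, 0], "human": [0, 0]}  # [bytes, requests]
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = 0 if match["bytes"] == "-" else int(match["bytes"])
        agent = match["agent"].lower()
        kind = "bot" if any(m in agent for m in BOT_MARKERS) else "human"
        totals[kind][0] += size
        totals[kind][1] += 1
    all_bytes = totals["bot"][0] + totals["human"][0]
    all_requests = totals["bot"][1] + totals["human"][1]
    if all_requests == 0:
        return 0.0, 0.0
    byte_fraction = totals["bot"][0] / all_bytes if all_bytes else 0.0
    return byte_fraction, totals["bot"][1] / all_requests


# Two invented log lines: one crawler request, one browser request.
SAMPLE = [
    '1.2.3.4 - - [28/Oct/2005:10:00:00 -0700] "GET / HTTP/1.1" 200 3000 '
    '"-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '5.6.7.8 - - [28/Oct/2005:10:00:05 -0700] "GET / HTTP/1.1" 200 6000 '
    '"-" "Mozilla/5.0 (Windows; U) Firefox/1.0.7"',
]

if __name__ == "__main__":
    byte_fraction, request_fraction = bot_share(SAMPLE)
    print(f"bots: {byte_fraction:.0%} of bytes, {request_fraction:.0%} of requests")
```

Matching on user-agent substrings is crude: it will miss bots that pose as browsers and may flag the odd legitimate browser, so treat the result as a rough lower bound rather than a measurement.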

Coming SF Event:

SVW Readers get the member price. Please contact Julie Crabill for details: jcrabill at shiftcomm.com.

Tuesday, November 8, 2005

Making a List: What's Hot and What's Not in Personal Technology

Walt Mossberg, Columnist, The Wall Street Journal
Kara Swisher, Columnist, The Wall Street Journal

Sponsored by Cadence Design Systems, Inc.

Time: Registration/Buffet: 6:00 PM; Program: 7:00-8:30 PM

Location: San Francisco Marriott, 55 Fourth Street, San Francisco

Registration: Members $60; Non-Members $75



Comments (4)

Tom,

I like SVW and agree with you that repeat visitors are better than one-off search engine visitors. But my logs tell a different story than yours. Google provides roughly 30-40% of my readership (the rest are mainly direct visitors and RSS readers). The rest of the search engines are tiny in terms of traffic. But even at 30-40%, Google and other search engines are major providers of readers.

Content is king (in some senses) but I think that the most valuable position, in dollar terms, in the Internet landscape is the attention-directing ability that search engines like Google have.


"In SVW's case, for example, the MSN and all the other spiders and bots, used up about one third of our bandwidth, and provided less than 4 per cent of the views. And degraded the performance of the site for others."

2 comments. First, if the bots are causing havoc, you should add a robots.txt file to your site. All of the major bots are very good about checking this file first, before crawling your site.

Second, I come to this site via bookmark, but I originally found it via Google. 99.99% of my hits are from the bookmark. It would have been zero hits overall if not for the bots...


I take everyone's point about search generating regular readers...but at some point, the resources that search consumes must outweigh its value. Also, the robots.txt file is very limited. Why can't I ban some bots and favor others?


Repeat visitors are certainly the key to building a successful web site, but they have to be a first time visitor first. Search engines are certainly a great source of traffic for sites like this one because people may stumble across it while looking for the answer to a question using search. Once they've arrived, they may become your next bookmarked visitor, RSS subscriber, del.icio.us bookmarker, etc.

BTW, you can selectively exclude / limit robots using robots.txt.
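A minimal sketch of what that selective exclusion might look like, checked with the robots.txt parser in Python's standard library. The rules and the /archives/ path are invented for illustration and are not SVW's actual configuration; the bot names are only examples. The file is also advisory: it only works for bots that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: favor one bot, ban another, limit everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: msnbot
Disallow: /

User-agent: *
Disallow: /archives/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if the rules above let this user agent fetch the path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("Googlebot", "msnbot", "SomeOtherBot"):
        for path in ("/index.html", "/archives/2005/10/"):
            print(agent, path, allowed(agent, path))
```

An empty Disallow line means "nothing is off limits" for that bot, while "Disallow: /" shuts it out of the whole site. Bots with no section of their own fall through to the "*" rules.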

