Weekend Wrapper: More thoughts on the GOOG database; the drag on internet performance from the bot armies; and SVW readers get a discount to the upcoming Churchill Club Mossberg/Swisher gadget fest!
By Tom Foremski - October 28, 2005
The Google database is the logical next step in the creation of a second, proprietary internet. Upload your content to the GOOG, the planet's largest computer.
Why host your own site? Just dump it into GOOGbase and the search algorithms will organize it for you, and for your customers/readers/users.
Content owners such as craigslist won't dump their content into GOOGbase. But many smaller sites will--because they are desperate for clicks and get excited when the GOOGbot comes around every three months. No waiting--just dump it in.
I've often said to the fine Search Engine Optimization folk out there: Let the search engines optimize themselves to find the sites, it's their job; concentrate on optimizing for your customers/users--not the bot.
Now, no need for the bot. And most probably no need for SEO either, because the GOOG knows what is located where if it's hosting and holding the content.
And this also is the best way to stop the masses of online scams and fraud. Google can argue that it can track every bit of the bytes that cross its network, from start to finish. (It is a shame AOL went open--the right move at the right time, again ;-)
The burden of search sites on the internet
But do we still need GOOG? Search engines don't drive much traffic to sites. Direct bookmarks do, and those are the visits you want. This is repeat traffic. You get it by optimizing for the user, not the bot.
Let me say again: you do not want to rely on search engines to bring you your traffic. That traffic comes from people who don't know where you live or what you do. You want repeat traffic, from people who know you, trust you and know where you live. They don't come in through a search engine. So optimize for the user--not the bot.
BTW, I would love to see how much internet bandwidth the bots, crawlers, scrapers, etc. are using up. I wonder if anybody has done a study of the amount of traffic and server performance degradation that they cause?
It would not surprise me if the number is gobsmackingly high. Can anybody do the math? Let me know...
In SVW's case, for example, the MSN and all the other spiders and bots used up about one third of our bandwidth and provided less than 4 per cent of the views. They also degraded the performance of the site for others.
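For anybody who wants to do the math on their own site, here is a rough sketch in Python. It assumes the server writes its access log in the common "combined" format, and it guesses which requests are bots from a hypothetical list of user-agent substrings (BOT_MARKERS), so treat the result as an estimate. It covers only the bandwidth side: the share of views that search engines send you would have to come from the referer field instead.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

# Hypothetical substrings used to flag crawler user agents; adjust for your logs.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_bot(user_agent):
    """Guess whether a user-agent string belongs to a crawler."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def bot_share(lines):
    """Return (share of bytes, share of requests) attributed to bots."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        size = 0 if match["size"] == "-" else int(match["size"])
        kind = "bot" if is_bot(match["agent"]) else "human"
        totals[kind + "_bytes"] += size
        totals[kind + "_hits"] += 1
    all_bytes = totals["bot_bytes"] + totals["human_bytes"]
    all_hits = totals["bot_hits"] + totals["human_hits"]
    if all_hits == 0:
        return 0.0, 0.0
    byte_share = totals["bot_bytes"] / all_bytes if all_bytes else 0.0
    return byte_share, totals["bot_hits"] / all_hits


# Two made-up log lines, for illustration only.
SAMPLE = [
    '1.2.3.4 - - [28/Oct/2005:10:00:00 -0700] "GET / HTTP/1.1" 200 3000 '
    '"-" "Mozilla/5.0 (Windows; U)"',
    '5.6.7.8 - - [28/Oct/2005:10:00:01 -0700] "GET /a HTTP/1.1" 200 1000 '
    '"-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
]

if __name__ == "__main__":
    byte_share, hit_share = bot_share(SAMPLE)
    print(f"bots: {byte_share:.0%} of bytes, {hit_share:.0%} of requests")
```

Point bot_share at the lines of a real log file (for example, an open file object) to get the two shares for your own site.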
Coming SF Event:
SVW Readers get the member price. Please contact Julie Crabill for details: jcrabill at shiftcomm.com.
Tuesday, November 8, 2005
Making a List: What's Hot and What's Not in Personal Technology
Walt Mossberg, Columnist, The Wall Street Journal
Kara Swisher, Columnist, The Wall Street Journal
Sponsored by Cadence Design Systems, Inc.
Time: Registration/Buffet: 6:00 PM; Program: 7:00-8:30 PM
Location: San Francisco Marriott, 55 Fourth Street, San Francisco
Registration: Members $60; Non-Members $75
Comments (4)
Tom,
I like SVW and agree with you that repeat visitors are better than one-off search engine visitors. But my logs tell a different story than yours. Google provides roughly 30-40% of my readership (the rest are mainly direct visitors and RSS readers). The other search engines are tiny in terms of traffic. But even at 30-40%, Google and other search engines are major providers of readers.
Content is king (in some senses), but I think that the most valuable position, in dollar terms, in the Internet landscape is the attention-directing ability that search engines like Google have.
Posted: October 28, 2005 5:42 PM
"In SVW's case, for example, the MSN and all the other spiders and bots, used up about one third of our bandwidth, and provided less than 4 per cent of the views. And degraded the performance of the site for others."
Two comments. First, if the bots are causing havoc, you should add a robots.txt file to your site. All of the major bots are very good about checking this file first, before crawling your site.
Second, I come to this site via bookmark, but I originally found it via Google. 99.99% of my hits are from the bookmark. It would have been zero hits overall if not for the bots...
Posted: October 29, 2005 9:59 AM
I take everyone's point about search generating regular readers...but at some point, the resources that search consumes must outweigh its value. Also, the robots.txt file is very limited. Why can't I ban some bots and favor others?
Posted: October 31, 2005 2:07 PM
Repeat visitors are certainly the key to building a successful web site, but they have to be first-time visitors first. Search engines are certainly a great source of traffic for sites like this one because people may stumble across it while looking for the answer to a question using search. Once they've arrived, they may become your next bookmarked visitor, RSS subscriber, del.icio.us bookmarker, etc.
BTW, you can selectively exclude / limit robots using robots.txt.
Posted: December 20, 2005 12:39 PM
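To illustrate the point about selective exclusion, here is a minimal sketch of a robots.txt that treats bots differently, checked with Python's standard urllib.robotparser module. The rules and paths are made up for illustration: they ban msnbot outright, keep Googlebot out of one hypothetical directory, and give every other bot a default rule. Keep in mind that robots.txt is only a request, and a badly behaved bot can ignore it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only; the paths are hypothetical.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /archive/
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    checks = [
        ("msnbot", "/index.html"),
        ("Googlebot", "/index.html"),
        ("Googlebot", "/private/notes.html"),
        ("SomeOtherBot", "/archive/2005.html"),
    ]
    for agent, path in checks:
        verdict = "allowed" if rules.can_fetch(agent, path) else "blocked"
        print(f"{agent:12} {path:22} {verdict}")
```

Each User-agent group applies to the bot it names, and the "*" group is the fallback for bots not named elsewhere in the file.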