Autonomy CEO says tags don't work

By Tom Foremski - April 6, 2007

Mike Lynch, the CEO of Autonomy, the UK enterprise search company, was in town this week (video is coming). I used to meet with Mr Lynch regularly when I was at the Financial Times, and when the dotcom boom was in full swing.

The dotbomb affected Autonomy along with tens of thousands of other companies, and Mr Lynch's visibility suffered too. So it was good to catch up with Mr Lynch and we joked that the "bubble" was back (it's not, of course).

We chatted about some of the trends in Autonomy's space, which is how to deal with unstructured data--estimated at about 87 per cent of all data.

Autonomy's technology uses statistics to find correlations between data, documents, video, and audio. It doesn't need artificial intelligence to understand the connection between data--it understands the probability of that data being related.

Autonomy uses Bayesian probability developed by Thomas Bayes, an 18th century British Presbyterian minister. The advantage of this approach is that Autonomy can find stuff without knowing any keywords or tags or taxonomy--it can determine the taxonomy on the fly.
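To make the Bayesian idea concrete, here is a minimal sketch of a naive Bayes classifier that picks the most probable topic for a piece of text from word statistics alone. It is not Autonomy's method, which is not described here; unlike the on-the-fly taxonomy mentioned above, this simplified version learns from a handful of labelled examples. The sample documents, the topic names, and the function names (`train`, `most_probable_topic`) are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(labelled_docs):
    """Count words per topic and documents per topic.

    labelled_docs is a list of (text, topic) pairs.
    """
    word_counts = {}
    doc_counts = Counter()
    for text, topic in labelled_docs:
        doc_counts[topic] += 1
        word_counts.setdefault(topic, Counter()).update(tokenize(text))
    return word_counts, doc_counts


def most_probable_topic(text, word_counts, doc_counts):
    """Return the topic with the highest posterior probability for text."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_topic, best_score = None, -math.inf
    for topic, counts in word_counts.items():
        # Bayes' rule in log space: log P(topic) + sum of log P(word | topic).
        score = math.log(doc_counts[topic] / total_docs)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Hypothetical training data, invented for this example.
docs = [
    ("enterprise search indexes unstructured documents", "search"),
    ("search engines rank documents by relevance", "search"),
    ("video upload infringes copyright of the studio", "copyright"),
    ("copyright holders detect ripped audio and video", "copyright"),
]
word_counts, doc_counts = train(docs)
print(most_probable_topic("ripped video clips", word_counts, doc_counts))
```

Note that the query never has to contain an agreed keyword or tag: the classifier only compares how likely the words are under each topic.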

During a presentation, Mr Lynch slammed the popular practice of tagging web content and said that it won't help to organize information. Mr Lynch quoted an essay by Cory Doctorow, the science fiction writer, titled Metacrap. "Tags don't work because people lie, they are lazy, and they use different tags. And there is a huge amount of information that will never be tagged."

I agree, and I resent the work in having to tag everything (See: Is search broken?). But tagging does work to some extent, and in some applications.

Autonomy's technology could be used to improve "tagging." Often, it is not clear what tags to apply--technology such as Autonomy's could help identify the appropriate tags. For example, are the Technorati tags at the end of this post the right ones to use to associate this post with other, similar posts?

...

Earlier this week, Autonomy introduced a new product called Virage ACID (Automatic Copyright Infringement Detection), which uses its technology to search through video images. It can automatically detect copyrighted videos.

 Being totally independent of media format, ACID can not only detect whether distributed video infringes copyright, but also whether audio content ripped from a copyrighted video or audio track that is overlaid on legitimate video has been uploaded.

Seems like it was designed for YouTube...

----

» Is search broken Tom Foremski IMHO ZDNet.com

Autonomy

 


Category: Search Watch

Comments (8)

Well, I'd have to disagree, although I do see how hard it is to apply this to all the information on the 'web. We use http://www.taglocity.com to tag our email data as it's a place that (a) really reaps the benefits and (b) is intrinsically 'social'.

Like Autonomy, Taglocity uses Bayesian statistics to 'AutoTag', and this works very well. You can use the product for free too.

More info/review here: http://online.wsj.com/article/SB117038342247195757.html


hi tom -- so, tags don't work? Right..., and wikis don't work, blogs don't work, and - as we roll back in time - no new technology works, especially if it disrupts a technology *I* sell. I remember when "PCs don't work."

One good measure of whether a new Web-based technology works is what one might call the "time-to-a-million users" measure. How long do you think it took before a million users globally worked in wikis (starting in '95 when Ward Cunningham invented them)?

How long between the invention of weblogs (start where you like - one could imagine a number of origin points) and 1 million blogs?

Now, how long between the launch of del.icio.us and 1 million users?

I'd wager del.icio.us got to a million users faster than blogs. And, I'm sure it beat the wiki growth rate easily.

Yes, tagging works -- that is why people use it. Because it works.


Mike Lynch's comments are the equivalent of arguing that "motorised transport doesn't work", because of the number of accidents.


Ged:

Tom, Mike's comments needed a more detailed rebuttal than a comments box would allow so I have posted here: http://renaissancechambara.com/blog/2007/04/07/256/


Arguing that tags "work" or "don't work" without specifying exactly what they're supposed to accomplish is a waste of time.

Are tags useful? Yes.

Can you depend on tags to find everything you need to find? No.

Human-generated tags lack standardization. If I post about my cat, I might tag it with "Cat". You might write about your pet and tag it "Kitten", or "Feline".

Someone looking for blog posts about "pets" would find neither.
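The mismatch is easy to reproduce. The sketch below uses made-up post titles and a hypothetical hand-built map from narrow tags to a broader one (`BROADER`); it only illustrates the problem and one possible clean-up step, and is not how any particular tagging service works.

```python
# Hypothetical posts and the tags their authors chose.
posts = {
    "My cat": {"Cat"},
    "Our new pet": {"Kitten"},
    "Big cats at home": {"Feline"},
}

# Hypothetical hand-built map from a narrow tag to a broader one.
BROADER = {"cat": "pets", "kitten": "pets", "feline": "pets"}


def find_by_tag(posts, query):
    """Return titles whose tags match the query exactly (ignoring case)."""
    q = query.lower()
    return sorted(
        title for title, tags in posts.items()
        if q in {t.lower() for t in tags}
    )


def find_by_tag_normalized(posts, query):
    """Like find_by_tag, but also matches the broader tag from BROADER."""
    q = query.lower()
    hits = []
    for title, tags in posts.items():
        lowered = {t.lower() for t in tags}
        expanded = lowered | {BROADER[t] for t in lowered if t in BROADER}
        if q in expanded:
            hits.append(title)
    return sorted(hits)


print(find_by_tag(posts, "pets"))             # exact matching finds nothing
print(find_by_tag_normalized(posts, "pets"))  # all three posts are found
```

The catch is that someone still has to build and maintain a map like `BROADER`, which is the kind of work a statistical approach could help with.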



Phil Manchester:

Buckminster Fuller - father of domes, synergy and a million other wonders - saw capitalism moving through several phases - industrial, merchant, accounting and legal (see Critical Path). The final phase of Lawcap is perfectly enshrined in ACID - Lynch's latest manifestation of IP paranoia... Let's scan everything to make sure it does not contain one iota of anything that has gone before. It will all, inevitably, end in tears. Such a pity so much effort is being expended on such worthless endeavours....


Tom Foremski:

Lots of good comments on Mike Lynch and tagging...

I think there is room for many approaches to finding stuff. And Mike Lynch did concede that some customers, such as BAE, are using Autonomy to improve tagging, and to tag untagged data. But it's clear that if you were to rely only on users tagging content, it would not be good enough. Autonomy's technology could also be used to "clean up" tags by helping to remove spam tags, and to provide more appropriate tags.


Hi, I am looking for Mike Lynch email address. Need to discuss some business opportunities in China.
Tks.
Savio

