Finding your media on the Web

Most people know how to search for text on the Web. Bradley Horowitz, Yahoo’s Director of Technology Development for Search, wants you to know they’re doing big things to make audio and video searching just as easy and satisfying. Bradley took a few minutes to share his take on media searching on the Web and what Yahoo has in store for the future.

DANA GREENLEE: Can you give us a little background on your work at Yahoo?

BRADLEY HOROWITZ: I looked at what was happening in my backyard in Silicon Valley — the search wars that were brewing between Yahoo and Google and Microsoft, as a late entry — and I wanted to get a piece of that. I got to know Yahoo a little bit, and I looked at the DNA in the company from executive leadership like Terry [Semel] and Lloyd [Braun], who have deep roots in Hollywood — Terry was the chairman of Warner Bros. and Lloyd was the ABC television executive who greenlighted programs like “Desperate Housewives” and “Lost.” We had media people with deep connections who could really make things happen. Then what was surprising to me, because Yahoo has such a low-key reputation, was the level of research and platform within the company on the technology side. Yahoo Search was rolled out from a number of acquisitions: Inktomi, AllTheWeb, AltaVista, Overture and Fast.

GREENLEE: How is Yahoo indexing audio and video files?

HOROWITZ: We’re doing speech recognition. We take these media assets and run them through a speech recognizer and pull out descriptive text. The publisher of the content needs to describe the media assets and get them into an index: title, author, keywords, description, the things that make it findable. There is a lot of metadata in a media file. Our strategy for pulling that out is pragmatic. One way to do that is context. When we bump into an MP3 file on the Internet, it typically doesn’t live in isolation, just floating in space. It’s on a page and there is text around it. The titles of the MP3 files can give you clues as to their content. If there are links that point to that MP3 file, what is the anchor text or the alt text of those links? There are a lot of techniques where we can deduce information via context.
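
As an illustration of the context clues Horowitz mentions, here is a minimal sketch in Python. It is not Yahoo's implementation; the class and function names (MediaLinkParser, filename_words, describe_media) and the sample page are hypothetical, made up for this example. It assumes the page is plain HTML and gathers, for each link to an MP3 file, the page title, the link's anchor text, the alt text of any image inside the link, and the words in the file name.

```python
import re
from html.parser import HTMLParser
from posixpath import basename, splitext
from urllib.parse import unquote, urlparse


class MediaLinkParser(HTMLParser):
    """Collect context clues for every link that points at an MP3 file."""

    def __init__(self):
        super().__init__()
        self.page_title = ""
        self.found = []        # one dict per MP3 link seen so far
        self._current = None   # the MP3 link currently open, if any
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = attrs.get("href") or ""
            if urlparse(href).path.lower().endswith(".mp3"):
                self._current = {"url": href, "anchor_text": "", "alt_text": ""}
        elif tag == "img" and self._current is not None:
            # An image used as the link body: its alt text describes the link.
            self._current["alt_text"] += (attrs.get("alt") or "") + " "

    def handle_data(self, data):
        if self._in_title:
            self.page_title += data
        if self._current is not None:
            self._current["anchor_text"] += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._current is not None:
            self.found.append(self._current)
            self._current = None


def filename_words(url):
    """Split the file name (without extension) into lowercase words."""
    stem = splitext(basename(unquote(urlparse(url).path)))[0]
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", stem) if w]


def describe_media(html):
    """Return one record of context clues per MP3 link found in the page."""
    parser = MediaLinkParser()
    parser.feed(html)
    parser.close()
    return [
        {
            "url": link["url"],
            "page_title": parser.page_title.strip(),
            "anchor_text": " ".join(link["anchor_text"].split()),
            "alt_text": " ".join(link["alt_text"].split()),
            "filename_words": filename_words(link["url"]),
        }
        for link in parser.found
    ]


# Hypothetical sample page, invented for this sketch.
SAMPLE = """<html><head><title>Live Recordings</title></head><body>
<p>Hear <a href="/audio/sample_song-live.mp3">Sample Song, live</a></p>
<a href="/audio/intro.mp3"><img src="play.png" alt="Play the intro"></a>
<a href="/about.html">About</a>
</body></html>"""

for record in describe_media(SAMPLE):
    print(record)
```

A real indexer would also weigh the text surrounding the link and links from other pages; this sketch only looks inside a single page.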

GREENLEE: How did you envision audio and video search working, and what led you to improve it?

HOROWITZ: One of the things that dictates our road map is how our users are using the traditional search methods we provide today, and how we can better fulfill those intentions. Audio search is a great example of that. When we looked at our query logs, we noticed that a large percentage of searches are for music. When you stack all of these searches, you see before your eyes a significant category. We ask ourselves if there is a better way to fulfill that class of query than how people are getting it right now. When those kinds of stars align, that’s what sends us into the lab to build a product to meet that need.

To try out audio search, go to