Today is my last day with Search Engine Watch, with me heading to my new digs at Search Engine Land tomorrow. I wanted to wish Search Engine Watch all the best going forward, plus help readers understand some of the changes that are happening. To do that best, I thought I'd go all the way back to the beginning, to the birth of Search Engine Watch.
In case you missed it, My Decade Of Writing About Search Engines from earlier this year covers how I got into reporting on search engines in the first place. Information posted as part of my web development work in 1996 expanded and relaunched as Search Engine Watch on June 9, 1997 (that's it in the image above). It rapidly drew more attention and traffic, in no small part due to Eric Ward's fantastic way of getting news around.
Later that year, I was approached by Mecklermedia (later renamed Internet.com, INT Media and Jupitermedia) about buying the site. I decided to sell to them on November 19, 1997. It meant the site could grow and I could stay firmly focused on the editorial development, which is my passion. I stayed on, contracted to be editor.
Two years later, the first companion conference to the site was held, Search Engine Strategies in San Francisco on November 18, 1999. I produced the content for that event as a contractor and have since continued to produce the major shows in the US, as the series has grown.
Last year, both the site and the conference series were sold to the current owner, Incisive Media. For 2007, we didn't agree on contract renewal terms, which resulted in me last August announcing my departure from both SEW and SES.
I'm happy to say that further talks resulted in me staying on to do the SES shows in the US in 2007. I will chair the SES New York 2007 event, then cochair the San Jose show and take part in Chicago at the end of 2007 as a speaker and moderator.
Search Engine Watch was a different matter. I felt it was better for me to go off on my own, which is what I'm going to do. In some ways, I'm leaving my baby behind. But the baby's pretty grown up now!
I joked with my managing editor Elisabeth Osmeloski that I'll likely become one of the top traffic referral sources to Search Engine Watch, since I'll be mentioning stories I've done in the past over here. But it won't be only past stories that I'll be referring to. If there's good content on Search Engine Watch, I'll be mentioning it and talking about it, just as I've always done for any web site even if it might have been seen as a competitor to SEW by some.
As I said earlier this year:
Whatever I do, I've tried to make it a hallmark to always be inclusive of content, people, web sites or organizations that will help my readers, even if I might technically be competing with them. Whatever I end up doing, you can expect I'll still be pointing at Search Engine Watch as appropriate and wish those that remain a part of it the very best.
That remains the case!
My goodbye is less tearful because writers I've worked with day-in and day-out are joining me at Search Engine Land. Barry Schwartz (he told me to say goodbye to everyone), Phil Bradley (despite having a name that doesn't end in S), Bill Slawski, Jennifer Slegg, Brian Smith and Greg Sterling will be writing with me from December. Chris Sherman joins us in January. I'm naturally thrilled to continue working with them.
Elisabeth, who I mentioned already, stays on here at Search Engine Watch as managing editor and is working on plans with Incisive to take the site into its new life without me at the helm, a new generation for Search Engine Watch. She'll be along later with a post of her own on this.
I am saying a sad goodbye to my days administrating and moderating the Search Engine Watch Forums. In just over two years, an incredible community has sprung up over there, with nearly 15,000 members.
Earlier this week, I said a private goodbye and thank you to the hard-working moderators that have nurtured the community over this time. I'll share part of that to underscore what I said earlier about being inclusive:
I have absolutely no intention of going over to the new place with any type of "us versus them" type of attitude. I've always tried to be inclusive of good content and communities regardless if they might be seen as competitive to SEW. At SEL, I plan to continue the same. If there are good discussions here, I'm going to be pointing at them. If there are good opportunities for the mods with SEW, I honestly want the best for you. By no means do I want anyone thinking that staying on here, or perhaps doing other things with SEW, is somehow something I won't like or perhaps "disloyal" in any way. I don't know if anyone was even thinking like that -- but if so, don't!
That's pretty much it. I'm going to finish my last day doing a bit of blogging, do my last monthly newsletter, then I'm giving Elisabeth a virtual hug and dropping my keys off at the virtual door.
Any comments, please feel free to add them to this thread at the Search Engine Watch Forums, Best Wishes, Search Engine Watch!
Posted by Danny Sullivan at 8:30 AM | Permalink
Google has purchased the garage where the company developed after its initial birth at Stanford University. Actually, they've purchased the home of Google vice president of product management Susan Wojcicki. Before she became a Google VP, Wojcicki rented the garage attached to her home to Google cofounders Larry Page and Sergey Brin. Google buys garage that launched Internet's top search engine from the Associated Press has details about the sale, which was probably in the $1.2 million range. As of yet, Google doesn't know exactly what it may do with the home, the article reports. It's already a tourist attraction, it seems.
Posted by Danny Sullivan at 8:23 AM | Permalink
Google put up a special birthday logo for themselves today, since today is Google's 8th birthday. Or is it? In last year's post, we asked Didn't Google Just Have A Birthday? We quoted the When is Google's birthday? answer from Google's support page, which said back then:
Google's official birthday is September 7, 1998. If Google were a person, it would have started elementary school late last summer (around August 19), and today it would have just finished the first grade. In other words, we're just getting started. To discover more about Google's history, please visit http://www.google.com/intl/en/corporate/history.html. To learn about our mission, please see http://www.google.com/intl/en/corporate/index.html
Today, it says:
Google opened its doors in September 1998. The exact date when we celebrate our birthday has moved around over the years, depending on when people feel like having cake. For more on Google's history: http://www.google.com/corporate/history.html
So maybe today is Google's birthday, and not 9/7 or some other date. In any event, happy 8th, Google!
Postscript: Google just blogged about the birthday here.
Posted by Barry Schwartz at 8:23 AM | Permalink
Perspective: The man who would be Sergey from News.com talks with Gary Culliss, formerly of Direct Hit, on cashing out of search early on. Google and Direct Hit came along at the same time (see Counting Clicks and Looking at Links from me in 1998). Ask Jeeves bought Direct Hit, making the original group involved with it a good chunk of money. But Direct Hit effectively died as a brand and a technology while Google....
I disagree with News.com that in 1998, Google was somehow lumped in with "non-household name" sites while Direct Hit was the shining hope. They both got a lot of attention, but Google very quickly surpassed Direct Hit as the wunderkind to watch. A little bit of history and reflection, in this Q&A.
Posted by Danny Sullivan at 10:19 AM | Permalink
Ten years ago today, I first started writing publicly about search engines. If we had blogs back then, I suppose I would have been a search blogger. But we didn't. We hand-coded our HTML, walked through the snow for eight miles to FTP files to our web servers, and we liked it :)
My involvement with search engines goes back to my first year as a student at the University Of California, Irvine in 1983. No, I wasn't part of the university's highly regarded information and computer science department. Instead, I was an English major -- and a pretty bored one for my first two months, when I had to commute until getting on-campus housing.
I spent some time exploring the library, having been a big library user since I was a child. The library had a magical electronic card catalog called Melvyl (named for Melvil Dewey, who created the Dewey Decimal System). For fun, I'd do Melvyl searches for broad topics such as history, art or love, to see how many matches would come back. I could routinely crash the search routine by doing this.
The system would diligently try, telling me it would take 50, 60, 80 search cycles, and then the countdown would begin. Some searches would eventually get through all cycles and give me a matching results count. Often, the system would just give up as the countdown approached the teens.
My 1996 Study
Search engines remained fascinating to me when I reencountered them in 1995. I'd left working as a newspaper reporter to go into web development, since I didn't want to miss out on what was obviously going to be the future of publishing. As the general manager of Maximized Online, my job was to help get people in the Orange County, California area online. We'd build web sites, get them publicized to search engines and other publicity venues, plus host them.
One of our clients was upset at the end of 1995 that his OC jobs site wasn't ranking tops for a search on "orange county" in WebCrawler. We didn't have a good answer to give him. We'd done the submission, made use of the meta tags the search engines said to use, but why exactly a site would rank well wasn't well known. So I decided to look into it.
I spent January through April 1996 making changes to the InfoPages directory that my company maintained, a search engine just for Orange County web resources, to see if it could rank better in a search for "orange county." I tried putting those words in the body text, the title tag, in the meta tags and also checked to see if spamming helped, if repeating the word over and over would have an impact.
I published the results online, and 10 years on, a lot of the advice remains exactly the same. Don't depend on ALT text. Don't fixate on only one or two terms, because there are many ways people will seek you -- a long tail before we had talk of long tails. Build links, because links can send you traffic. And don't fixate on getting traffic just from search engines. The conclusions from that study are below for those really interested; others can jump past for the rest of this article.
There don't seem to be any magic methods that will make a page appear at the top of every search engine's listings. There's too much fluctuation on the web for any page to claim a foothold, and all the engines handle relevancy slightly differently. However, there are some general tips that do help a page appear more relevant.
A Webmaster's Guide To Search Engines
Along with the study, I also published a collection of documents called "A Webmaster's Guide To Search Engines." My goal was to help site owners better understand the essentials of being found plus identify which search engines really mattered. Knowing who mattered was crucial when you'd have some search engines like Galaxy forcing you through a three-part, multiple-question submission process to be included in their directory. Was spending all that time worthwhile? (For Galaxy, the answer was no!)
The guide provided links to the FAQs of each search engine, along with my own observations about whether how each search engine said it worked actually lived up to reality. There was a guide to which search engines I considered to be "major" or most important to site owners and searchers alike. I had a "Strategic Alliances & Victories" chart to show which search engines had deals with the Netscape or Internet Explorer browsers and which had gained positive reviews in magazines.
The information I published quickly generated a lot of positive feedback, both from site owners and searchers such as librarians. At the same time, the web development company I worked for closed, so that the parent firm could concentrate on web software development. I hung out my internet consultant shingle and kept maintaining the Webmaster's Guide on a part time basis, sending out a newsletter update (The Search Engine Report) twice that year, along with making further site updates.
In 1997, I moved to the UK from California, so my wife could be closer to her family. I also began spending more and more time on the site, as well as writing freelance articles on search for various publications. In the middle of the year, I rebranded the site as Search Engine Watch, which generated more attention. By the end of the year, Mecklermedia purchased the site from me, and I continued on as editor of it.
The Search Revolution
Ten years on, I remain as fascinated with search engines as ever. I've been fortunate to help chronicle the birth of an entirely new advertising medium. Equally important has been the birth of an entirely new way for people to seek out information.
I knew search engines were important when I decided to write about them. The journalist in me could see they were a good story, especially when you realized that under the hood, they weren't doing things like crawling as often as people widely believed. But a study by Keen in 2001 especially resonated with me. Search engines (as a whole -- we weren't Google obsessed yet then) were the single most likely way people would seek information.
The study was small, but the findings were still stunning. In only about five years, search engines had ousted things like friends, family, books, magazines, libraries and other perfectly good resources for seeking answers.
Some of this was bad. When doing search training, I'd personally watched people spend ages trying to find a phone number, when a call to telephone information would have found it much faster. Old but still useful search strategies were abandoned in favor of the magic search box.
Lots of this is good. Search engines remain amazing tools that get us the right answers quickly in many circumstances.
Looking Ahead
Will I still be doing this in 20 years? Almost certainly not, at least not in the daily grind format I've been doing. I'd like to keep writing about search issues, but eventually I'll move away from the regular day-to-day coverage to perhaps focus on less frequent but deeper looks at particular search issues.
I'm also thinking a lot about doing a book these days. I'd always wanted to do a book on search, indeed the exact type of history that John Battelle did a fantastic job with in The Search.
Since that's come out, I've thought more and more about doing a more personal retelling of web search history -- the evolution, developments and trends I've seen from having been in the trenches of covering them over the years.
I'd also like to do a separate one talking to various search marketers, spotlighting them and focusing on how that medium has evolved over the years and where it will be going. The most fascinating book idea remains the impact of search on our everyday lives: how people make use of search engines, how habits have changed, how our laws are starting to account for the power of search and many related issues like that.
Someday! What I can say is that for the near future, I expect to remain working on the site and coverage as I have, bringing some of our standing content back up to date, which I know has been neglected due to the need to cover the news that continues to flow in. My original Webmaster's Guide helped many understand search engines, and I very much want to ensure Search Engine Watch remains as a leading resource doing that in the years to come.
Looking Back
I don't have a succinct list of big picture items or "high order bits" to offer. A lot of this has already been covered in things I've written, so instead I'm going to spend some time recapping pieces I think are most important below. These are either big trend pieces I've done or big shifts in the search landscape I think worth noting.
I know -- I KNOW -- I've left some things out. My apologies, if so. It's a bit easier for me to cover all the things I've written than what Chris Sherman and I both have done, and he's clearly covered tons himself. Plus, skimming through 10 years' worth of writings means I'll accidentally miss stuff. If you want to go poking yourself, I'll give more tips after the summary.
The biggest overall theme in doing the recap is how that big old wheel keeps spinning around and around, with people often buying hype because they don't remember things have come before -- or marketers making errors because they don't understand issues that were explored already in the past.
I've definitely felt myself getting more and more jaded. Part of that's bad, because there are cool, new things that I don't want to be blinded to. But then again, you go through the list below and tell me if you don't emerge feeling a bit jaded about some ideas and concepts that are retreads.
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006 (To Date)
As I said earlier, I know I've missed stuff. Unfortunately, there's no easy way for me to see everything written on Search Engine Watch over the years in one single list. For those who wish to explore, the Search Engine Report archives are probably the best thing to review. Each month, there's an issue of the Search Engine Report that recaps virtually everything of importance that was published on the site. You can also see the SearchDay archives and the SEW Blog archives, though material from both of those places is integrated into Search Engine Report mailings.
Thanks
Finally, some thanks....
Want to comment or discuss? There's a thread going at our Search Engine Watch Forums.
Posted by Danny Sullivan at 1:05 PM | Permalink
Once a potential rival to the Open Directory, LookSmart has decided to close its volunteer-driven Zeal directory at the end of this month. LookSmart acquired the service in 2000. Though the Zeal site itself is still accepting registrations, here's the closure message LookSmart's sent out:
Dear Valued Zealots and Contributors,
Thank you for being a part of the Zeal community and contributing your time and knowledge to the Directory. After trying to put the deserved resources behind Zeal, we have made the conscious decision to shut down Zeal.com. On March 28, 2006 Zeal will no longer be available. We are not selling Zeal.com and have no future plans for it at this time.
We think that avid Zeal users will appreciate the large, interesting and vital community at Furl.net (www.furl.net). Furl is an online book marking service that helps you save information that’s important to you, share it with others and see what others are saving and find important.
With Furl you can:
Whether you are browsing the Web, travel planning, recipe sharing, house hunting or researching, Furl is a wonderful tool. Join the thousands of people that are using Furl every day to find, save and share information that is important to them.
Postscript: Chris Fillius, who was involved with Zeal early on, shares some thoughts on its demise in Zeal.com, R.I.P. at his Search Lounge blog, and Guest Obituary for Zeal, by Alice Swanberg there has some further thoughts on the closure.
Posted by Danny Sullivan at 3:31 PM | Permalink
Sad news that Paul Flaherty, one of the founders of AltaVista, passed away on Friday. I don't have more details at the moment, other than that there was to be a wake and funeral for him today. The news came to me from Jeff Black, a former general manager of AltaVista. I'll postscript more below as I hear. FYI, Richard Seltzer has a nice guide to some early AltaVista alumni here. An old home page for Paul is here. Chris Sherman has some early history of AltaVista here.
Postscript: More information on funeral arrangements, Paul's background, family donations can now be found here.
Postscript by Barry: On March 26th, KTLA posted this update on Paul Flaherty.
AP also has coverage here.
Want to comment or express condolences or thoughts? You can also visit our SEW Forums thread, An AltaVista Founder, Paul Flaherty, Passes Away.
Posted by Danny Sullivan at 7:02 PM | Permalink
Posted by Danny Sullivan at 10:45 PM | Permalink
Jeeves, the butler mascot for Ask Jeeves, is retiring, with the search engine slimming down to its long used but little promoted ask.com domain name. More about Jeeves' retirement, including highlights of his career, in today's SearchDay article, Jeeves Retires.
Posted by Chris Sherman at 5:16 AM | Permalink
Battelle over at Searchblog reports that two sources have told him that Lycos may have let most of its search team go and will keep just a skeleton crew in place to run the service.
For the search historians out there, the Lycos crawling technology was originally developed at Carnegie Mellon University in Pittsburgh and then entered into an exclusive licensing deal with CMG@Ventures on June 20, 1995. Here's the original press release. The news release credits Michael Mauldin as the lead developer of the Lycos spider. He is also listed as the inventor on two U.S. patents. An interesting read by Mauldin from 1997, "Lycos: Design choices in an Internet search service," is available here.
Postscript: If you would like to read more early web search announcements, here's a compilation of them I put together several years ago.
Postscript by Barry: Loren Baker reports that Lycos Responds to Rumors of Abandoning Search, citing a letter received by Jim Hedger from Lycos stating: "Lycos has had a recent restructuring, which did involve members of the search team. But Lycos is NOT divesting in Search and has not abandoned search." More information at Search Engine Journal.
Posted by Gary Price at 5:34 PM | Permalink
For your "the importance of search marketing" folder.
An article from DMNews, FTD: Online Search Misstep Cost Sales, points out that FTD (the flower delivery people) said that not doing more search engine marketing during the Christmas season caused the consumer end of FTD not to do as well as they had planned. The comments were made on Wednesday when FTD announced their quarterly earnings.
From the statement (full text here): "The consumer business's order growth for the 2005 Christmas season was below expectations [because of] our decision not to pursue high-cost order volume associated with online search," FTD president/CEO Michael J. Soenen said in a statement. "In anticipation of continued competitiveness in the online search environment and to better manage the consumer segment business going forward, we have made management changes within this segment including the replacement of our head of marketing."
I wonder if the now replaced head of marketing received some flowers to make him or her feel better? (-:
Want to comment or discuss? Visit our SEW Forums thread, FTD's Head Of Marketing Replaced Over Failure In Maximizing Search.
Posted by Gary Price at 12:06 AM | Permalink
Those of you who track Google's each and every move might be interested to learn that the company has just updated their Google Milestones page, which serves as a brief corporate history, with more key events from 2005. They've also already added a few events from 2006 (Google Video Store, Google Pack, and Larry Page's keynote at CES).
Posted by Gary Price at 11:08 AM | Permalink
Today, metasearch tool Dogpile begins celebrating its 10th birthday. Congrats! Here's the Dogpile home page from December of 1996 and a USENET announcement from its original developer, Aaron Flin.
A decade is a long time both for technology and for me (I had a lot more hair on my head back then). These days, as many of my posts reflect, I'm a strong believer in the meta/federated search concept and what it can potentially offer the searcher. It's still far from perfect, but the technology from various providers is getting better all of the time.
To celebrate their 10th birthday, the Dogpile team has put together a page of their "favorite combinations." It's a fun list and they invite users to submit their favorites.
Examples: Spoon + Fork = Spork; Turkey + Duck + Chicken = Turducken
Posted by Gary Price at 2:16 PM | Permalink
Tomorrow, the Google Library Project will be one year old. It's been quite a year of news and controversy. Here's a link to our first SearchDay article about the Google Library Project from December 14, 2004. In this article, I made sure to mention other digitization projects like Project Gutenberg that have been around since 1971.
What's crucial to remember is that Google Print (recently renamed Google Book Search) itself was around before Google's announcement to digitize the full or partial holdings of five large university libraries.
Let's review.
Google Print for books (materials direct from publishers) was opened "widely to publishers" on October 6, 2004. However, the existence of Google Print goes back even before that to December 17, 2003, when Google began offering book searches. The original Google Book search indexed "only a small excerpt from each book, typically taken from the inside cover, jacket reviews, author biographies or the book's introduction."
To this day, Google Book Search (the material direct from publishers) and the Google Library Project are frequently confused. SEW BLOG has tried to make the differences clear since day one. Recently, Danny did a great job of explaining the important differences in his post: Once Again -- The Difference Between Google Print & Google Library. This post also contains links to many other articles about the project, digitization, and opinion about copyright issues.
The remainder of this post will offer a few key posts about the project (yes, I could have included more), a timeline of sorts, from the past year, along with links to some other Google Print/Book Search/Library Project related documents.
+ Questions & Answers Recap On Google Library This Info Today article by Barbara Quint is loaded with details about the project.
+ France, Google & The Need for Digitization Project Cooperation
+ Copyright Questions On Google Digitization Project
+ Some Publishers Not Happy With Google's Library Digitization Program
+ Google Library Digitization Agreement With University Of Michigan Now Available
+ The Digitization Of The Library
+ More On Publisher Concerns On Google Library Project
+ Google Gives Publishers Opt-Out From Library Scanning Project; One Group Still Not Happy
+ More Publishing Trade Groups Weigh In On Changes to Google's Library Scanning Project
+ Legal Experts Say Google Library Digitization Project Likely OK; Will It Revolve Around Snippets?
+ Breaking Down The Google Print 5 Libraries An article from Digital Libraries.
+ Google's Library Scanning Project Heads to Court
+ A New Alternative to Google Print Say hello to the Open Content Alliance!
+ Google Print Press Review & Just A Bit About Search Inside the Book This post includes a link to Eric Schmidt's op/ed column in the Wall St. Journal.
+ Association of American Publishers Sues Google over Library Digitization Plan
+ Great Google Print Controversy Bibliography Includes link to, "The Google Print Controversy: A Bibliography" by Charles W. Bailey, Jr. Impressive!!!
+ Microsoft Announces MSN Book Search; Joins Open Content Alliance
+ Google Gears Up to Resume Book Scanning
+ Google Print Now Publishing Out-Of-Copyright Works Gained Through Library Scanning Program
Yes, it has been quite a year and I could have listed more reports. I'm sure year two will have as many, if not more, news stories, court hearings and events as the Google Library Project's first 365 days.
Postscript: I've recently posted about two other currently available services, ebrary and NetLibrary, that offer the full text of books (with no limit on how much you can read). Looking for public domain full text books? Visit "Public Domain Books: More than 25,000 Full Text Books in a Single Database."
Want to comment or discuss? Visit our SEW Forums thread, Google's Library Project One Year Old Already.
Posted by Gary Price at 3:50 PM | Permalink
How about some web search industry history? Long before two major search engines, Google and Yahoo, offered Google Answers and Yahoo Answers, their human-powered question answering services and/or communities, another major engine offered something very similar. It was called Answer Point and came from Ask Jeeves. It existed (it's long gone) around 1999-2000. This page (via the Wayback Machine) has it looking similar to what we see today from others.
Answer Point appears to have been a free service to ask questions and with the help of other people have them answered. The service had its own logo and used the slogan: "The Ask Jeeves Answer Point is the place where you can ask and answer questions. Have a question? Post it! Know the answer? Post it!"
Subcategories were moderated.
The last archived version of the site I could find in Wayback was from late 2001. Answer Point also allowed users to register to become Answer Point "enthusiasts" and receive points and rankings for how many times they posted. Something we recently said is not the best idea. You could even personalize with "My Answer Point."
Before I go any further, I'm well aware of other question answering services involving humans. This post is not to recollect about all of them. My point here was just a bit of web history and to note that another large web search company was doing something similar years ago.
I've sent a note to Jim Lanzone, Senior Vice President of Search Properties at AJ, and asked if he could provide a bit more background about the service itself. What kind of usage (not much would be the answer I would bet on) did AP receive? Why and when was the plug pulled? If Jim sends a note back, I'll add it as a postscript.
I would like to encourage you to take a look (if you haven't already) at my post about "other" question answering services that discusses what libraries of all types offer "virtually" in terms of question answering without having to visit the library itself. In many cases we're talking 24x7 access.
Finally, a point I failed to make in that article was the recent launch of a nationwide "virtual reference service" in the UK called "Enquire."
More about it here. Click the Enquire button. Enquire is part of The People's Network. Btw, Australia also has a national virtual reference service named Ask Us. Again, more about these services and others in my other post.
Postscript: Well, it seems that Jim Lanzone from Ask Jeeves reads his email on the weekend. What follows are his comments about Answer Point.
Ask Jeeves' AnswerPoint operated from early 2000 through May '02. AnswerPoint wasn't a failure, nor a smashing success. At that point in Ask's turnaround we simply had to make choices, so we shut AnswerPoint down (among other things) to focus our energy on things like Teoma and Smart Answers. The user base of AnswerPoint was actually pretty upset about it: they were a very small, but very loyal group, which made it a difficult decision for us. Ironically, as I recall, Google Answers launched the same week that we shut down AnswerPoint. I commend Yahoo for joining sites like Wondir in trying the free model again. Beyond the obvious issues like spam, I can share a few challenges with community-driven question-answering that we experienced. First, as a free service, there was little incentive for people to answer other people's questions. I think the dynamic of question-answering is/was different than other user-generated content. With user reviews, like those found on Amazon, TripAdvisor or Citysearch, people are playing "critic", a long-standing model from newspapers and magazines. With Wikipedia, participants are creating specialized content, in one centralized location, for the masses to consume. With De.icio.us and Flickr, tagged items are made public, but the initial motive is borne at least somewhat from self-interest: organization of bookmarks and photos. With question-answering, on the other hand, it takes a true good samaritan to spend the time to provide answers to one-off questions for people you don't know. (And an even better samaritan to perform this good deed repeatedly, over time, for free.) Meanwhile, if you do it for ego, your answers get lost in the system pretty quickly. So neither motive was that compelling. We observed that only a small group of "experts" took the time to answer questions for others. Secondly, if not enough people provide answers, then you can't answer enough questions. 
This is a problem when search has such a long tail of queries, as we showed at Web 2.0. Most searches are unique. This is why search engines are so useful, even though relevance is far from perfect: we can cast a very broad net. The notion of waiting for an answer is also in conflict with one of the biggest user needs in search: speed. Most things that people search for are things they want an answer to, or a solution for, almost immediately. In theory people will put in more effort to get a better answer, but in practice they seldom do. For example, 30% of users surveyed say they want advanced search, but only 1% of them ever use it. The same thing applied to AnswerPoint. It was usually just faster and easier for people to search normally, iterating on their searches, than to submit a question to the community and wait for an answer. Lastly, there's the reason we created Smart Answers in the first place: people like to search from one box. Getting them to head to a different part of our site for results is always an uphill battle for any engine. It's true that there are subjective answers out there that search engines are not (yet) able to respond to accurately. And sites like Yahoo's own Groups product, started by our own Mark Fletcher, have proven that communities can generate valuable information in search. It will be interesting to see if and how community-based "answers" search can evolve to plug the gaps that exist.
Note: In many situations the "good samaritan" Lanzone writes about might be a subject expert (via an AskA service) or a librarian (via a virtual reference service).
Postscript 2 (from Danny): Along the same lines as Ask Jeeves, LookSmart Live was an online answer service that was born in 1999 but quietly died. More about the service from when it launched is here, LookSmart Live Looks-Up Answers.
Posted by Gary Price at 6:09 PM | Permalink
Andrei Broder, former vice president of research at AltaVista and until recently Distinguished Engineer & CTO, IBM Research, is joining Yahoo as research fellow and vice president of emerging search technology at Yahoo Research, according to this News.com article.
Broder has been involved in a wide range of research activities related to the web and information retrieval, including the famous "bow-tie" study of web size and connectivity, and the web archaeology project together with other well-known researchers Krishna Bharat (Google News) and Monika Henzinger (Google research).
Postscript from Gary: Here's a list with a few research papers and articles that Broder has authored or co-authored that might be of interest.
Title: A Taxonomy of Web Search Author: Andrei Broder Source: ACM SIGIR Forum 8 pages; PDF Abstract: "Classic IR (information retrieval) is inherently predicated on users searching for information, the so-called "information need". But the need behind a web search is often not informational -- it might be navigational (give me the url of the site I want to reach) or transactional (show me sites where I can perform a certain transaction, e.g. shop, download a file, or find a map). We explore this taxonomy of web searches and discuss how global search engines evolved to deal with web-specific needs."
Title: Sampling Search-Engine Results Authors: Aris Anagnostopoulos, Andrei Z. Broder, David Carmel Source: WWW 14 Conference (2005) 12 pages; PDF. From the abstract: We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approximate algorithms for several applications, such as:
• Determining the set of categories in a given taxonomy spanned by the search results;
• Finding the range of metadata values associated to the result set in order to enable "multi-faceted search;"
• Estimating the size of the result set;
• Data mining associations to the query terms.
Title: Sic Transit Gloria Telae: Towards an Understanding of the Web's Decay Authors: Z. Bar-Yossef, A. Broder, R. Kumar and A. Tomkins Source: WWW 13 Conference (2004) 10 pages; PDF. From the abstract: "The rapid growth of the web has been noted and tracked extensively. Recent studies have however documented the dual phenomenon: web pages have small half lives, and thus the web exhibits rapid death as well. Consequently, page creators are faced with an increasingly burdensome task of keeping links up-to-date, and many are falling behind. In addition to just individual pages, collections of pages or even entire neighborhoods of the web exhibit significant decay, rendering them less effective as information resources. Such neighborhoods are identified only by frustrated searchers, seeking a way out of these stale neighborhoods, back to more up-to-date sections of the web; measuring the decay of a page purely on the basis of dead links on the page is too naive to reflect this frustration."
Title: Towards the next generation of enterprise search technology Authors: A. Z. Broder and A. C. Ciccolo Source: IBM Systems Journal (2004) Abstract: "Unstructured information represents the vast majority of data collected and accessible to enterprises. Exploiting this information requires systems for managing and extracting knowledge from large collections of unstructured data and applications for discovering patterns and relationships. This paper elucidates the differences between search systems for the Web and those for enterprises, with an emphasis on the future of enterprise search systems. It also introduces the Unstructured Information Management Architecture (UIMA) and provides the context for the unstructured information management (UIM) papers that follow."
Title: A technique for measuring the relative size and overlap of public Web search engines Authors: Krishna Bharat and Andrei Broder Source: WWW 7 Conference From the abstract: "Search engines are among the most useful and popular services on the Web. Users are eager to know how they compare. Which one has the largest coverage? Have they indexed the same portion of the Web? How many pages are out there? Although these questions have been debated in the popular and technical press, no objective evaluation methodology has been proposed and few clear answers have emerged. In this paper we describe a standardized, statistical way of measuring search engine coverage and overlap through random queries."
Posted by Chris Sherman at 12:25 PM | Permalink
This was one of those, "should I skip it" decisions, but I did find it interesting. Phillip at Google Blogoscoped in Yahoo in Battle Mode summarizes how Yahoo's mail team was given a statue (yep, there's even a picture) for "kicking an enemy's ass." That would be Google's bottom being whacked, specifically.
Phillip then points to Google's Kevin Fox, who has a long commentary on the statue. Kevin used to be at Yahoo, and he compares and contrasts the two companies, feeling that Google's about making better products while Yahoo's focused on "how to beat Google." He also finds that the statue's comparison to Britain fighting Nazi Germany takes the competition too far.
The comments after Kevin's post go all over the place and are fun to read -- pro-Yahoo, anti-Yahoo, pro-Google, anti-Google. Phillip also points to two Yahoo employees who comment on the statue as well (Ryan Kennedy suggests a toned-down description for the statue; this employee prefers the "be humble" approach).
Yahoo's new email interface is way, way cool (double verified by checking with my wife, who is a regular user) -- but honestly, the old system was already kicking Google's butt for the simple fact that anyone could sign up for it without getting someone to send you an invite or having to get text messaged a secret code. When Gmail's freely open to anyone, then let the weigh-up really take place.
Speaking of statues, how about Yahoo putting a little message on the Bob's Big Boy statue that Chris and I came across in one of the Yahoo buildings when visiting this summer? I'll see about getting the photo off Chris's phone -- I made him stand there and take it. But it looks just like this, except the hamburger was replaced with the Inktomi logo.
Bob's an old friend I remember well, from my days of visiting Inktomi. He was in the lobby, and I'd sit next to him waiting for someone to come meet me.
If memory serves, Inktomi founder Eric Brewer bought him to represent the serving/caching service that Inktomi used to provide. When Yahoo bought Inktomi, Bob came over -- and apparently was nearly tossed out until someone gave him a home.
He deserves a better home and maybe his own message devoted to the Yahoo web search team -- those from Inktomi, plus the AltaVista and FAST/AllTheWeb vets. They assembled a great product that directly rivals Google's core search results. Heck, put Bob out in the main entrance of Yahoo! Just make the message praising the efforts without dissing the competition, and I suppose everyone will be happy.
Posted by Danny Sullivan at 7:58 AM | Permalink
Happy Birthday to the British Pathe Digital Archive!
Two years ago, Chris wrote this great introduction to the newly digitized archive of British Pathe newsreel content: British Pathe Develops Huge Historic Picture Archive.
From SearchDay: British Pathe is offering free access to a digitized collection of more than 12 million historic photographs from its 20th century cinema newsreel archive. Archivists and technicians at ITN, which now operates the British Pathe library, have created the images by re-scanning the newsreel's 3,500 hours of 35 millimeter film. Images are displayed as a "storyboard" of thumbnails. Since the images were digitized from motion pictures, a full page of thumbnails represents about 50 seconds of material from a video clip...The site offers free search and preview storyboards for anyone to view. To save a preview image, simply right-click it and save it to your hard drive. Preview files display a large copyright notice, but can be used for personal or educational purposes. Enhanced high resolution versions of the images are also available for web publishing and use in PowerPoint presentations for a fee.
So, Happy 2nd Birthday to all involved. Here are a few stats about how things have been going in the first two years:
If you've never visited the British Pathe archive, it's more than worthy of your time, attention, and use.
Posted by Gary Price at 8:43 PM | Permalink
Yesterday, I wrote of how search engines could do a better job of query refinement and indeed did so in the past, especially because there was more human involvement in the search process. That drew out Jim Lanzone, senior vice president of search properties at Ask Jeeves, who sent me an email raising a good point that humans haven't gone away just because of the expense. Humans also have a "scale" problem. More comments from Jim on that are below, along with a further look at the scale issue and the need for the Google Generation to rediscover query refinement.
I agree with both Jim and Christopher Coulter, who commented on Robert Scoble's reference to my blog post:
Yeah well the Yahoo Human-Edited model couldn't scale, so you get Google Pagerank automational noised chaos. It's back to 1996 all over again. And the best 'search' is with a database
Indeed, humans haven't scaled well in terms of helping us gather content from across the web. Crawlers do a great job of that. I used a library metaphor on my ODP Founder Comments & Moving Past Directories post earlier this year to explain why directories, after some promise, went away in the face of crawlers.
In short, imagine you go into a library and can use one of two card catalogs to find books on a topic you are interested in:
The crawler-built catalog is far more comprehensive. It's also far more up-to-date. Remember, in the library of the web, the books often rewrite themselves or add pages in the way books in a physical library do not. Humans simply can't keep up with that activity.
The key, of course, is that the crawler service isn't just comprehensive but relevant. It will find not just all the matching pages but often rank them so you are getting the very best ones.
While humans don't scale well in the info gathering and retrieval side, they can play a role. More on that in a moment, but first, here's what Jim said in response to my post:
To say that the problem with human editors was due to it being "expensive" is true, but I don’t think it goes far enough to explain the problem.
Sure, Ask had great relevancy, but only for a single-digit percentage of the overall query stream. That is not how people search, and neither you nor I nor any number of Web Search Universities is going to change that for the vast majority of searchers.
Algorithmic search was the only solution to that problem because only an index of billions of pages could meet the user need that exists across the long tail of rare queries.
At its peak, the Ask Jeeves "knowledgebase," as it was known internally, matched on about 85% of searches. That was a lot. However, it was only picked 20-25 percent of the time, despite having premium placement at the top of the page.
Sure, some searches resulted in far higher pick rates than this. But the vast majority did not. Therein lay the problem. And due to the exquisite overpromise made by the premise of question-answering and the butler, this had consequences for the Ask Jeeves brand.
The brand was lucky, on the other hand, to gain a foothold in the market early on, and to hold on to millions of users because of it. But at the end of the day, people use a search engine to find what they need - quickly. That foothold would have