Seems our nine-year-old is having severe growing pains lately. Come on, everyone reading this considers Google part of their lives, so why not think of it as a nine-year-old child? And, fellow relatives, we are all being affected by the growth spurts.
Here are just a few of the things that have been mentioned in the past week or so.
Seems the bulk upload for AdWords is throwing a 502 error - had not heard of a 502 before. Others are seeing a lot of timeouts when searching. I have had a lot of problems staying logged in to Gmail, especially the GChat part. Google Analytics is having timing problems also. Other countries are still dealing with pervasive spam in the search results. The #6, -50, 950 penalties - to name just a few - seem to be a little too aggressive. There is a long turnaround time for removing submitted URLs. And Google is being overly generous to new sites.
And don't get me started on the WiFi efforts or Docs etc. They just seem to be pushing so hard at the envelope they have forgotten about the letter inside.
But we have to be patient; after all, Google is only nine years old. And if we are not able to deal now, think of what the teenage years are going to bring!
Posted by aussiewebmaster at 3:05 PM | Permalink
Last week a group of technology companies got together to announce the Climate Savers Computing Initiative. Led by Intel, other participants included: Dell, Microsoft, HP, IBM, and The World Wildlife Fund.
While only tangentially related to search, I think it's great when companies in our space take a direct hand in environmental issues, such as Yahoo! did when it announced its plans to become carbon neutral.
I spoke this past Thursday with Jon Weisblatt, who heads up the Power and Cooling Group initiative at Dell. He indicated that there were two major components of the initiative:
Dell is showing great leadership in these initiatives as well. Jon Weisblatt indicated to me that Michael Dell committed the company to becoming the "greenest tech company on the planet".
Posted by Eric Enge at 10:03 AM | Permalink
Okay, the Quality Score discussions have been busy lately. While we would all like to have a light turned on inside the 'black box', some insight has been given recently if you have been watching carefully.
Yesterday, Barry Schwartz posted about Google's announcement of more small changes to the algorithm. He noted that while Google claimed not many advertisers would be impacted by the changes, the members at both DigitalPoint and WebmasterWorld seem to differ on that point.
We all love finding our accounts littered with inactivated terms without any prior notice.
Peter Hershberg posted an interesting piece of information inside his take on QS. In an email back from Google about the impact of recent QS changes, he was told this nugget: keywords are not dynamically inserted into your ad text because their corresponding Quality Scores aren't high enough to qualify for keyword insertion.
Not only does it add information about QS, but it also answers a question people have been asking about problems with keyword inserts.
Here are a few more articles worth a read on this topic.
Amy Konefal, Susan Esparza, Geordie Carswell, Greg Meyers
Posted by aussiewebmaster at 12:44 PM | Permalink
Google held an in-house conference this week, gathering techies from their various departments. They assembled "engineers from Testing, Development, User Experience, and other groups to submit conference sessions: tool presentations, tutorials, workshops, panels, and experience reports", the Google testing blog reported.
Testapalooza was a big success, according to the blog.
"The idea for Testapalooza came out of discussions about how to build a vibrant testing community here at Google. Many diverse groups work daily on quality-related activities, but each group uses different tools and has different ideas for testing an application, so it can be difficult to find out what others are doing. So we decided to put on a conference!"
"All Testapalooza sessions were video recorded (many were videoconferenced to other offices). We want to publish as many of these videos as possible, and will review them over the coming weeks to publish sessions which did not contain any confidential information. Watch this space for more information on the videos."
Posted by aussiewebmaster at 9:11 PM | Permalink
At the annual American Association for the Advancement of Science conference, Larry Page commented that the human brain algorithm is a simple one.
Guess Google is really pushing forward with its studies into Artificial Intelligence - so now we have to wonder if the Google future is going to be an "AI" world or a "Terminator" world.
I think it will be time to run to the mountains if they launch the first robot and it looks like California Governor Arnold.....
Imagine a robot armed just with Google tools.... today it would be more than half way to the Terminator model.... we should be keeping an eye on their future company purchases.
Posted by aussiewebmaster at 12:22 PM | Permalink
Over at WebmasterWorld Tedster references an interesting short paper about creating your own search engine by Googler Anna Lynn Patterson. This document makes for a good read.
The paper was published in April of 2004 when she was a student at Stanford University. She is also the person whose name appears on the recent Google patent application titled Detecting spam documents in a phrase based information retrieval system.
Basically, she breaks it down into hard drive space, having lots of servers, and CPU power. Anna's document is a good initial primer, but there is another aspect of building a search engine that deserves some emphasis.
The search engine companies have built the largest networks of servers the world has ever known. When I think of Google's core technology assets, I don't think about search engine algorithms, I think about massively deployed server networks operating in close harmony.
Posted by Eric Enge at 10:46 AM | Permalink
As Kevin Newcomb mentioned yesterday, Danny Sullivan had an outstanding write-up yesterday about Google's enhancement of its "Link:" operator which allows researchers to discover many of the links that Google has indexed as pointing to a particular URL: Google Releases New Link Reporting Tools.
Google will allow users of its Webmaster Central tools to see more thorough reports of inbound links as measured to domains and even particular pages. More information is also available at the Webmaster Central blog.
This underscores the importance of working with Google by signing up with Webmaster Central. Not only will it help to get important pages of a Web site indexed, but it will also assist webmasters in conducting important competitor analysis. In the past, many researchers have almost completely ignored the Google "Link:" command, or operator, since it is known that Google does not display all of the links it knows about. Others have continued to use it, thinking that the ones that Google shows "must be more valuable" than others.
This has in fact been an often discussed topic at the Search Engine Watch Forums, where a sticky thread discusses the topic of the difference in inbound link reporting at various engines, and reveals that the current link discovery tool of consensus choice is the one found at Yahoo! Site Explorer. Although it is unlikely that those many converts will now abandon Yahoo! to use only the Google enhanced version, this news has made many webmasters and search engine optimization specialists happy.
(Begin editorial) Our engineers at Avenue A | Razorfish love to use the Webmaster Central tools, but, believe it or not, sometimes have problems with getting clients to approve the use, since it requires verification code to be placed on the Web site. Perhaps if Google would be more open about sharing its information without requiring this code, it would get a better reputation with some marketers that feel that they require too much "inside information." Google has done a great job in helping webmasters with their Web sites, but still needs to improve its relationship and willingness to work with agencies and other SEO companies, in the opinion of some. (/end editorial)
Posted by Chris Boggs at 11:13 AM | Permalink
The US Patent Office has issued Google two more patents.
One patent, for a similarity search engine - to check duplicate content - was first filed in 2001.
The other, Google's Digital Mapping Systems patent, describes 'various methods, systems and apparatus for implementing aspects of a digital mapping system'.
The similarity engine has had variations filed by IBM and Hitachi, according to a report by VNUNET.
Posted by aussiewebmaster at 12:54 PM | Permalink
This weekend The Register published an article named Google developing eavesdropping software. The article describes how Google uses existing PC microphones and sound fingerprinting technology to show ads that are more relevant to you. The article goes on to explain how the sound fingerprinting works; it "breaks sound into a five-second snippets to pick out audio from a TV, reducing the snippet to a digital "fingerprint", which it matches on an internet server." Privacy folks are worried about the repercussions of such software.
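To make the snippet-and-fingerprint idea a little more concrete, here is a minimal Python sketch. It is not Google's method, which the article does not spell out. The sample rate, the frame count, the energy-based fingerprint, the Hamming-distance lookup and every function name are assumptions made up for illustration; only the five-second snippet length comes from the article.

```python
import math

SNIPPET_SECONDS = 5        # snippet length mentioned in the article
SAMPLE_RATE = 8000         # assumed sample rate (hypothetical)
FRAMES_PER_SNIPPET = 33    # assumed: 33 frames yield a 32-bit fingerprint


def demo_signal(seconds=10, rate=SAMPLE_RATE):
    """Synthetic stand-in for microphone input: a tone whose loudness drifts."""
    return [
        math.sin(2 * math.pi * 440 * t / rate)
        * (1 + 0.5 * math.sin(2 * math.pi * 0.7 * t / rate))
        for t in range(seconds * rate)
    ]


def snippets(samples, rate=SAMPLE_RATE, seconds=SNIPPET_SECONDS):
    """Split a list of samples into consecutive fixed-length snippets."""
    size = rate * seconds
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def fingerprint(snippet, frames=FRAMES_PER_SNIPPET):
    """Reduce a snippet to an integer: one bit per frame-to-frame energy rise."""
    frame_len = len(snippet) // frames
    energies = [
        sum(x * x for x in snippet[f * frame_len:(f + 1) * frame_len])
        for f in range(frames)
    ]
    bits = 0
    for prev, cur in zip(energies, energies[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def best_match(fp, known, max_distance=4):
    """Server-side lookup: label of the closest known fingerprint, or None."""
    label, dist = None, max_distance + 1
    for name, known_fp in known.items():
        d = hamming(fp, known_fp)
        if d < dist:
            label, dist = name, d
    return label
```

In this toy version only the small integer fingerprint would need to leave the PC, not the recorded audio itself.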
Postscript from Barry: I should link to Google Paper Explains Listening To Your TV Can Help It Put Ads & Info On Your Computer, which we covered back on June 9, 2006.
Posted by Barry Schwartz at 10:50 AM | Permalink
New York Times Looks At Google's Hardware & Infrastructure
A New York Times article has a detailed analysis of Google's infrastructure and a discussion with Urs Hölzle, senior vice president for operations at Google. Here are some of the key points I pulled from that article.
+ Google tends to build from the ground up versus buying.
+ Google's computing costs are half those of other large Internet companies and a tenth those of traditional corporate technology users.
+ Critics call Google's philosophy "unnecessary and inefficient."
+ "Google is reducing cost while maintaining performance by shifting the burden of reliability from hardware to software - individual hardware components can fail, but software automatically shifts the local task and the data to other machines."
+ Google is among Advanced Micro's five largest clients.
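The point about shifting reliability from hardware to software can be sketched in a few lines of Python. This is a toy illustration only, not a description of Google's actual systems: the `Cluster` class, its method names and the least-loaded placement rule are all hypothetical, and it models only tasks, not the replicated data the quote also mentions.

```python
class Cluster:
    """Toy scheduler: tasks live on machines and move when a machine fails."""

    def __init__(self, machine_names):
        # Only healthy machines appear in this mapping (assumed design).
        self.tasks = {name: [] for name in machine_names}

    def assign(self, task):
        """Place a task on the least-loaded healthy machine; return its name."""
        if not self.tasks:
            raise RuntimeError("no healthy machines left")
        # Ties are broken by machine name so placement is deterministic.
        target = min(self.tasks, key=lambda name: (len(self.tasks[name]), name))
        self.tasks[target].append(task)
        return target

    def fail(self, name):
        """Drop a failed machine and reassign its tasks to the survivors."""
        orphaned = self.tasks.pop(name)
        return {task: self.assign(task) for task in orphaned}
```

For example, after `cluster = Cluster(["m1", "m2", "m3"])` and a few `assign` calls, `cluster.fail("m1")` returns a mapping from each task that was on `m1` to the machine that picked it up. The hardware is allowed to die; the software keeps the work running.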
Posted by Barry Schwartz at 9:51 AM | Permalink
New Search Patents: June 22, 2006 - Google File System, Microsoft Blocks, and Yahoo Autonotifications
Google patents the Google File System, Microsoft claims a Functional Object Model for mobile devices, and Yahoo! (Overture) describes an autonotification process to inform advertisers of when a certain condition has been met concerning one of their ads.
The authors of a paper on the Google File System (pdf) are listed as the inventors of this patent filing. Another similarity between the two documents is that both cite mostly the same reference documents. The patent and paper appear to cover much of the same ground. This looks like the patent for the Google File System.
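The abstract quoted further down describes sending the data to the primary and secondary replicas based on network topology, and independently sending a control signal that asks them to carry out the data-modifying operation. As a rough illustration of that separation, here is a minimal Python sketch. The `Replica` class, the numeric `distance` used as a stand-in for network topology, and the `write` function are all hypothetical names invented for this example; this is not the code or the exact protocol from the patent or the paper.

```python
class Replica:
    """One server holding a copy of the data (hypothetical model)."""

    def __init__(self, name, distance):
        self.name = name
        self.distance = distance   # assumed stand-in for network topology
        self.staged = {}           # data received but not yet applied
        self.chunk = bytearray()   # this replica's copy of the data

    def receive_data(self, data_id, payload):
        """Data flow: hold the payload until a control signal arrives."""
        self.staged[data_id] = payload

    def apply(self, data_id):
        """Control flow: perform the modification using the staged data."""
        self.chunk += self.staged.pop(data_id)


def write(primary, secondaries, data_id, payload):
    """Push data first, then trigger the modification with a separate signal."""
    replicas = [primary] + list(secondaries)
    # Step 1: data flow. Visit replicas nearest-first, regardless of role.
    push_order = sorted(replicas, key=lambda r: r.distance)
    for replica in push_order:
        replica.receive_data(data_id, payload)
    # Step 2: control flow (assumed order): primary first, then secondaries.
    for replica in replicas:
        replica.apply(data_id)
    return [r.name for r in push_order]
```

The thing to notice is that the order in which the data travels (by distance) is independent of which replica is the primary.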
Leasing scheme for data-modifying operations
Invented by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Assigned to Google
US Patent 7,065,618
Granted on June 20, 2006
Filed on June 30, 2003
Abstract
A system may facilitate performance of a data-modifying operation in a file network that includes multiple servers that store replicas of data. One of the servers may serve as a primary replica for one of the replicas of data and at least one other one of the servers may serve as at least one secondary replica for the replica of data. The system may send data associated with the data-modifying operation to the primary replica and the at least one secondary replica based on a network topology and independently send a data-modifying control signal that requests execution of the data-modifying operation using the data associated with the data-modifying operation to the primary replica and the at least one secondary replica.
Microsoft
When presenting a web page on a mobile device, it's sometimes best not to display the whole page. But trying to decide which parts to show, and which not to display can be difficult. More information is sometimes needed about the web page.
Microsoft has been experimenting with ways to identify what different parts of a web page do, based upon the layout and functions of those parts. One Microsoft paper on this type of analysis that has seen some popularity recently is Block-level Link Analysis (pdf).
It wasn't a surprise to see Wei-Ying Ma's name on this patent application, as one of the authors of that paper, and an earlier paper on VIPS: a Vision-based Page Segmentation Algorithm.
Another Wei-Ying Ma paper on that topic is Efficient Browsing of Web Search Results on Mobile Devices Based on Block Importance Model (pdf). It cites a function based analysis like the one described in this patent, and points to a document that explains some of the concepts - Function-Based Object Model Towards Website Adaptation (pdf). The other inventor listed in this patent, Jin-Lin Chen, is one of the authors of that paper. Taking a look at those papers may make understanding this patent easier.
Segmenting and indexing web pages using function-based object models
Invented by Jin-Lin Chen and Wei-Ying Ma
Assigned to Microsoft
US Patent 7,065,707
Granted on June 20, 2006
Filed on June 24, 2002
Abstract
By understanding a website author's intention through an analysis of the function of a website, website content can be adapted for presentation or rendering in a manner that more closely appreciates and respects the function behind the website. A website's function is analyzed so that its content can be adapted to different client environments. A function-based object model (FOM) identifies objects associated with a website, and analyzes those objects in terms of their functions. Desktop oriented websites are adapted for mobile devices based on the FOM and on a mobile control intermediary language. While the FOM attempts to understand a website author's intention based on functional analysis of web content, the mobile control intermediary language enables the author to create web content that can be presented in various mobile devices by processing the objects, by extracting forms from the objects, and by generating a file in the mobile control intermediary language for each form.
Yahoo
This patent describes an autonotification system, enabling automated messages to be sent to an advertiser regarding their paid search listings when certain pre-defined conditions are met. Here are the areas those conditions listed in the patent encompass:
Automatic advertiser notification for a system for providing place and price protection in a search result list generated by a computer network search engine
Invented by Narinder Pal Singh, Scott W. Snell, Douglas T. Huffman, Darren J. Davis, Thomas A. Soulanille, and Dominic Dough-Ming Cheung
Assigned to Overture Services, Inc.
US Patent 7,065,500
Granted on June 20, 2006
Filed on September 26, 2001
Abstract
A notification method in a computer database system includes receiving a notification instruction from an owner associated with a search listing stored in the computer database system, monitoring conditions specified by the notification instruction for the search listing, and sending a notification to the owner upon detection of a changed condition of the search listing.
My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good.)
There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company's employees. And sometimes patents are just purchased.
Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
Posted by Bill Slawski at 3:41 AM | Permalink
New Search Patents: June 22, 2006 - Google File System, Microsoft Blocks, and Yahoo AutonotificationsGoogle patents the Google File System, Microsoft claims a Functional Object Model for mobile devices, and Yahoo! (Overture) describes an autonotification process to inform advertisers of when a certain condition has been met concerning one of their ads.
The authors of a paper on the Google File System (pdf) are listed as the inventors of this patent filing. Another similarity between the two documents is that both cite mostly the same reference documents. The patent and paper appear to cover much of the same ground. This looks like the patent for the Google File System.
Leasing scheme for data-modifying operations Invented by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Assigned to Google US Patent 7,065,618 Granted on June 20, 2006 Filed on June 30, 2003
Abstract
A system may facilitate performance of a data-modifying operation in a file network that includes multiple servers that store replicas of data. One of the servers may serve as a primary replica for one of the replicas of data and at least one other one of the servers may serve as at least one secondary replica for the replica of data. The system may send data associated with the data-modifying operation to the primary replica and the at least one secondary replica based on a network topology and independently send a data-modifying control signal that requests execution of the data-modifying operation using the data associated with the data-modifying operation to the primary replica and the at least one secondary replica.Microsoft
When presenting a web page on a mobile device, it's sometimes best not to display the whole page. But trying to decide which parts to show, and which not to display can be difficult. More information is sometimes needed about the web page.
Microsoft has been experimenting with ways to identify what different parts of a web page do based upon the layout and functions of parts of pages, and a paper from Microsoft that has seen some popularity recently on this type of analysis has been one on Block-level Link Analysis (pdf).
It wasn't a surprise to see Wei-Ying Ma's name on this patent application, as one of the authors of that paper, and an earlier paper on VIPS: a Vision-based Page Segmentation Algorithm.
Another Wei-Ying Ma paper on that topic is Efficient Browsing of Web Search Results on Mobile Devices Based on Block Importance Model (pdf). It cites a function based analysis like the one described in this patent, and points to a document that explains some of the concepts - Function-Based Object Model Towards Website Adaptation (pdf). The other inventor listed in this patent, Jin-Lin Chen, is one of the authors of that paper. Taking a look at those papers may make understanding this patent easier.
Segmenting and indexing web pages using function-based object models Invented by Jin-Lin Chen and Wei-Ying Ma Assigned to Microsoft US Patent 7,065,707 Granted on June 20, 2006 Filed on June 24, 2002
Abstract
By understanding a website author's intention through an analysis of the function of a website, website content can be adapted for presentation or rendering in a manner that more closely appreciates and respects the function behind the website. A website's function is analyzed so that its content can be adapted to different client environments. A function-based object model (FOM) identifies objects associated with a website, and analyzes those objects in terms of their functions. Desktop oriented websites are adapted for mobile devices based on the FOM and on a mobile control intermediary language. While the FOM attempts to understand a website author's intention based on functional analysis of web content, the mobile control intermediary language enables the author to create web content that can be presented in various mobile devices by processing the objects, by extracting forms from the objects, and by generating a file in the mobile control intermediary language for each form.Yahoo
This patent describes an autonotification system, enabling automated messages to be sent to an advertiser regarding their paid search listings when certain pre-defined conditions are met. Here are the areas those conditions listed in the patent encompass:
Automatic advertiser notification for a system for providing place and price protection in a search result list generated by a computer network search engine Invented by Narinder Pal Singh, Scott W. Snell, Douglas T. Huffman, Darren J. Davis, Thomas A. Soulanille, and Dominic Dough-Ming Cheung. Assigned to Overture Services, Inc. US Patent 7,065,500 Granted on June 20, 2006 Filed on September 26, 2001
Abstract
A notification method in a computer database system includes receiving a notification instruction from an owner associated with a search listing stored in the computer database system, monitoring conditions specified by the notification instruction for the search listing, and sending a notification to the owner upon detection of a changed condition of the search listing.My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good.)
There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company's employees. And sometimes patents are just purchased.
Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
Posted by Kevin Heisler at 3:41 AM | Permalink
New Search Patents: June 22, 2006 - Google File System, Microsoft Blocks, and Yahoo AutonotificationsGoogle patents the Google File System, Microsoft claims a Functional Object Model for mobile devices, and Yahoo! (Overture) describes an autonotification process to inform advertisers of when a certain condition has been met concerning one of their ads.
The authors of a paper on the Google File System (pdf) are listed as the inventors of this patent filing. Another similarity between the two documents is that both cite mostly the same reference documents. The patent and paper appear to cover much of the same ground. This looks like the patent for the Google File System.
Leasing scheme for data-modifying operations Invented by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Assigned to Google US Patent 7,065,618 Granted on June 20, 2006 Filed on June 30, 2003
Abstract
A system may facilitate performance of a data-modifying operation in a file network that includes multiple servers that store replicas of data. One of the servers may serve as a primary replica for one of the replicas of data and at least one other one of the servers may serve as at least one secondary replica for the replica of data. The system may send data associated with the data-modifying operation to the primary replica and the at least one secondary replica based on a network topology and independently send a data-modifying control signal that requests execution of the data-modifying operation using the data associated with the data-modifying operation to the primary replica and the at least one secondary replica.Microsoft
When presenting a web page on a mobile device, it's sometimes best not to display the whole page. But trying to decide which parts to show, and which not to display can be difficult. More information is sometimes needed about the web page.
Microsoft has been experimenting with ways to identify what different parts of a web page do based upon the layout and functions of parts of pages, and a paper from Microsoft that has seen some popularity recently on this type of analysis has been one on Block-level Link Analysis (pdf).
It wasn't a surprise to see Wei-Ying Ma's name on this patent application, as one of the authors of that paper, and an earlier paper on VIPS: a Vision-based Page Segmentation Algorithm.
Another Wei-Ying Ma paper on that topic is Efficient Browsing of Web Search Results on Mobile Devices Based on Block Importance Model (pdf). It cites a function based analysis like the one described in this patent, and points to a document that explains some of the concepts - Function-Based Object Model Towards Website Adaptation (pdf). The other inventor listed in this patent, Jin-Lin Chen, is one of the authors of that paper. Taking a look at those papers may make understanding this patent easier.
Segmenting and indexing web pages using function-based object models Invented by Jin-Lin Chen and Wei-Ying Ma Assigned to Microsoft US Patent 7,065,707 Granted on June 20, 2006 Filed on June 24, 2002
Abstract
By understanding a website author's intention through an analysis of the function of a website, website content can be adapted for presentation or rendering in a manner that more closely appreciates and respects the function behind the website. A website's function is analyzed so that its content can be adapted to different client environments. A function-based object model (FOM) identifies objects associated with a website, and analyzes those objects in terms of their functions. Desktop oriented websites are adapted for mobile devices based on the FOM and on a mobile control intermediary language. While the FOM attempts to understand a website author's intention based on functional analysis of web content, the mobile control intermediary language enables the author to create web content that can be presented in various mobile devices by processing the objects, by extracting forms from the objects, and by generating a file in the mobile control intermediary language for each form.Yahoo
This patent describes an autonotification system, enabling automated messages to be sent to an advertiser regarding their paid search listings when certain pre-defined conditions are met. Here are the areas those conditions listed in the patent encompass:
Automatic advertiser notification for a system for providing place and price protection in a search result list generated by a computer network search engine Invented by Narinder Pal Singh, Scott W. Snell, Douglas T. Huffman, Darren J. Davis, Thomas A. Soulanille, and Dominic Dough-Ming Cheung. Assigned to Overture Services, Inc. US Patent 7,065,500 Granted on June 20, 2006 Filed on September 26, 2001
Abstract
A notification method in a computer database system includes receiving a notification instruction from an owner associated with a search listing stored in the computer database system, monitoring conditions specified by the notification instruction for the search listing, and sending a notification to the owner upon detection of a changed condition of the search listing.My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good.)
There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company's employees. And sometimes patents are just purchased.
Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
Posted by Kevin Heisler at 3:41 AM | Permalink
New Search Patents: June 22, 2006 - Google File System, Microsoft Blocks, and Yahoo AutonotificationsGoogle patents the Google File System, Microsoft claims a Functional Object Model for mobile devices, and Yahoo! (Overture) describes an autonotification process to inform advertisers of when a certain condition has been met concerning one of their ads.
The authors of a paper on the Google File System (pdf) are listed as the inventors of this patent filing. Another similarity between the two documents is that both cite mostly the same reference documents. The patent and paper appear to cover much of the same ground. This looks like the patent for the Google File System.
Leasing scheme for data-modifying operations Invented by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Assigned to Google US Patent 7,065,618 Granted on June 20, 2006 Filed on June 30, 2003
Abstract
A system may facilitate performance of a data-modifying operation in a file network that includes multiple servers that store replicas of data. One of the servers may serve as a primary replica for one of the replicas of data and at least one other one of the servers may serve as at least one secondary replica for the replica of data. The system may send data associated with the data-modifying operation to the primary replica and the at least one secondary replica based on a network topology and independently send a data-modifying control signal that requests execution of the data-modifying operation using the data associated with the data-modifying operation to the primary replica and the at least one secondary replica.Microsoft
When presenting a web page on a mobile device, it's sometimes best not to display the whole page. But deciding which parts to show and which to leave out can be difficult, and more information about the web page is sometimes needed.
Microsoft has been experimenting with ways to identify what different parts of a web page do, based upon the layout and functions of those parts. One Microsoft paper on this type of analysis that has seen some popularity recently is Block-level Link Analysis (pdf).
It wasn't a surprise to see Wei-Ying Ma's name on this patent, as one of the authors of that paper and of an earlier paper on VIPS: a Vision-based Page Segmentation Algorithm.
Another Wei-Ying Ma paper on that topic is Efficient Browsing of Web Search Results on Mobile Devices Based on Block Importance Model (pdf). It cites a function-based analysis like the one described in this patent, and points to a document that explains some of the concepts: Function-Based Object Model Towards Website Adaptation (pdf). The other inventor listed in this patent, Jin-Lin Chen, is one of the authors of that paper. Taking a look at those papers may make understanding this patent easier.
Segmenting and indexing web pages using function-based object models
Invented by Jin-Lin Chen and Wei-Ying Ma
Assigned to Microsoft
US Patent 7,065,707
Granted on June 20, 2006
Filed on June 24, 2002
Abstract
By understanding a website author's intention through an analysis of the function of a website, website content can be adapted for presentation or rendering in a manner that more closely appreciates and respects the function behind the website. A website's function is analyzed so that its content can be adapted to different client environments. A function-based object model (FOM) identifies objects associated with a website, and analyzes those objects in terms of their functions. Desktop oriented websites are adapted for mobile devices based on the FOM and on a mobile control intermediary language. While the FOM attempts to understand a website author's intention based on functional analysis of web content, the mobile control intermediary language enables the author to create web content that can be presented in various mobile devices by processing the objects, by extracting forms from the objects, and by generating a file in the mobile control intermediary language for each form.
Yahoo
This patent describes an autonotification system that sends automated messages to advertisers about their paid search listings when certain pre-defined conditions are met. The patent lists the areas that those conditions can cover.
Automatic advertiser notification for a system for providing place and price protection in a search result list generated by a computer network search engine
Invented by Narinder Pal Singh, Scott W. Snell, Douglas T. Huffman, Darren J. Davis, Thomas A. Soulanille, and Dominic Dough-Ming Cheung
Assigned to Overture Services, Inc.
US Patent 7,065,500
Granted on June 20, 2006
Filed on September 26, 2001
Abstract
A notification method in a computer database system includes receiving a notification instruction from an owner associated with a search listing stored in the computer database system, monitoring conditions specified by the notification instruction for the search listing, and sending a notification to the owner upon detection of a changed condition of the search listing.
My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good).
There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company's employees. And sometimes patents are just purchased.
Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
Posted by Kevin Heisler at 3:41 AM | Permalink
New Search Patent Applications: June 19, 2006 - Autolinking, and Better Advertising through Deletion Predictions
Four patent applications from Google describe fighting spam in emails, providing product review searches, moving large amounts of data, and autolinking. Yahoo matches, and raises, with five patent filings: one on watching deletions to choose better ads, another on serving dynamic information through an additional browser interface, and three more on multimedia and RSS.
Microsoft goes TV 2.0 with an electronic program guide, and describes a way of matching advertising content with certain search queries before those searches are made. IBM comes up with a unique way of presenting the results of a search from more than one search engine, and a way of reducing the amount of irrelevant results in a search by analyzing an initial set of results, identifying an appropriate additional query term from those results, and searching the original results again but with the additional query term included in the search.
Go Daddy describes a way of fighting spam in emails. Xerox employs collaborative filtering from previous users' searches to predict search results. Apostolos Gerasoulis, from Ask.com, with a couple of co-inventors, ranks and displays pages (objects) based upon linkage and textual data, and then defines a way to identify and assign topics to them.
Email Spam
Emails with links in them could be considered spam if the links point to pages that are in a conceptual category considered spammy. This patent application really doesn't describe the concept categorization part of the process. That's done in a related patent application mentioned within this document, and the related document lists Georges Harik as one inventor. Dr. Harik's name is on a very large percentage of the patent applications involving Gmail-type processes.
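As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch. It is not the method claimed in the filing: the URL pattern, the function names, the caller-supplied categorize function (a stand-in for the concept categorization step that this application leaves to the related filing), and the 0.5 threshold are all assumptions made for the example.

```python
import re

# Assumption: a deliberately simple URL pattern, good enough for a sketch.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_links(message_body):
    """Return every http(s) link found in the message text."""
    return URL_PATTERN.findall(message_body)


def looks_like_spam(message_body, categorize, spammy_categories, threshold=0.5):
    """Flag a message when enough of its links land in a spammy category.

    categorize is a hypothetical callable that maps a URL to a concept
    category name; how that categorization works is outside this sketch.
    """
    links = extract_links(message_body)
    if not links:
        return False  # nothing to judge the message by
    hits = sum(1 for url in links if categorize(url) in spammy_categories)
    return hits / len(links) >= threshold
```

A caller would pass in whatever categorizer it has, for example a lookup backed by a page classifier, along with the set of category names it treats as spammy.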
Method and system to detect e-mail spam using concept categorization of linked content
Invented by Johnny Chen
US Patent Application 20060122957
Published June 8, 2006
Filed December 3, 2004
Abstract
A system and method for detecting undesired electronic messages (e.g., spam) using concept categorization of hyperlinks is disclosed. A server receives an electronic message and retrieves web pages that correspond to hyperlinks in the message. The server performs concept categorization on the retrieved web pages based on semantic relationships in the received information to determine whether the electronic message meets predefined criteria associated with undesired messages.
Searching and Aggregating Product Reviews
If Google wanted to get into the product or services review business, the next patent filing describes a blueprint for a process that might make for an effective and innovative system.
Method and system for finding and aggregating reviews for a product
Invented by Jan Matthias Ruhl and Mayur D. Datar
US Patent Application 20060129446
Published June 15, 2006
Filed December 14, 2004
Abstract
The embodiments disclosed herein include new, more efficient ways to collect product reviews from the Internet, aggregate reviews for the same product, and provide an aggregated review to end users in a searchable format. One aspect of the invention is a graphical user interface on a computer that includes a plurality of portions of reviews for a product and a search input area for entering search terms to search for reviews of the product that contain the search terms.
Scaling and Distributing Data
Arvind Jain is the head of Research and Development in Google's Bangalore office, and has spoken at a number of conferences on infrastructure projects and issues involving such things as Google's crawl and indexing system, distributed file replication system, and compression techniques for large scale storage systems. He's listed as the inventor for this next Google filing.
System and method for scalable data distribution
Invented by Arvind Jain
US Patent Application 20060126201
Published June 15, 2006
Filed December 10, 2004
Abstract
A system having a resource manager, a plurality of masters, and a plurality of slaves, interconnected by a communications network. To distribute data, a master determined that a destination slave of the plurality slaves requires data. The master then generates a list of slaves from which to transfer the data to the destination slave. The master transmits the list to the resource manager. The resource manager is configured to select a source slave from the list based on available system resources. Once a source is selected by the resource manager, the master receives an instruction from the resource manager to initiate a transfer of the data from the source slave to the destination slave. The master then transmits an instruction to commence the transfer.
Autolinking
Google's Autolink raised a lot of eyebrows, and brought some negative reactions. A Search Engine Watch Blog post from Danny Sullivan, Google Toolbar's AutoLink & The Need For Opt-Out, defined many of the issues around the toolbar feature. The following patent application explains how such a system might work from the search engine's perspective.
Providing useful information associated with an item in a document
Invented by Gueorgui Djabarov
US Patent Application 20060129910
Published June 15, 2006
Filed December 14, 2004
Abstract
A method includes recognizing an item within a first document based on a pattern associated with the item but not the exact content of the item. The method further includes identifying a link for the item and providing a second document that includes information associated with the item when the link for the item is selected.
Yahoo
Choosing Better Ads through User Behavior
Some queries involve the use of concepts and units, as described in at least five Yahoo patent filings (see the Yahoo sections of the previous patent posts Yahoo Units and Microsoft Redundancy Filters, and More Yahoo Concepts and Google Predictive Searches).
But sometimes a two-term query isn't a concept as much as it is a couple of keywords that someone may use to search for something. If that person performs a second search after deleting one of the words, then the record of that deletion and second search might help Yahoo calculate "deletion probability scores" for words being used in these kinds of two-term queries.
This can be helpful when there isn't a good keyword-based advertising match for that query, but there might be a good match individually for each of the terms that make up the query. The "deletion probability scores" can help determine which of the two terms to show keyword-based advertising for in search results.
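To make the scoring idea concrete, here is a minimal Python sketch of one reading of the ratio given in the filing's abstract: for each term, the number of times that term was deleted before a follow-up search, divided by the number of times a query containing that term had any term deleted. The log format (pairs of first and second queries from the same user), the function names, the 0.5 default for unseen terms, and the 0.2 gap standing in for "significantly different" are all assumptions for illustration, as is treating the term less likely to be deleted as the more relevant one.

```python
from collections import Counter


def deletion_probability_scores(reformulations):
    """Score terms from (first_query, second_query) pairs by the same user.

    Only pure deletions are counted: the second query must keep at least
    one term, drop at least one term, and add nothing new.
    """
    deleted = Counter()   # times the term itself was deleted
    eligible = Counter()  # times a query containing the term lost any term
    for first_query, second_query in reformulations:
        first_terms = set(first_query.split())
        kept_terms = set(second_query.split())
        removed_terms = first_terms - kept_terms
        if not removed_terms or not kept_terms or not kept_terms <= first_terms:
            continue
        for term in first_terms:
            eligible[term] += 1
        for term in removed_terms:
            deleted[term] += 1
    return {term: deleted[term] / eligible[term] for term in eligible}


def term_to_advertise(query, scores, min_gap=0.2, default=0.5):
    """Pick the term least likely to be deleted, if the gap is large enough.

    Returns None when the two lowest scores are too close to call.
    """
    terms = query.split()
    if len(terms) < 2:
        return terms[0] if terms else None
    ranked = sorted(terms, key=lambda term: scores.get(term, default))
    best, runner_up = ranked[0], ranked[1]
    if scores.get(runner_up, default) - scores.get(best, default) < min_gap:
        return None
    return best
```

With a log in which searchers usually drop "cheap" from "cheap flights" and search again for "flights", the sketch would score "cheap" as the more deletable term and pick "flights" as the one to match ads against.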
System and methods for ranking the relative value of terms in a multi-term search query using deletion prediction
Invented by Rosemary Jones and Daniel C. Fain
US Patent Application 20060129534
Published June 15, 2006
Filed December 14, 2004
Abstract
The likely relevance of each term of a search-engine query of two or more terms is determined by their deletion probability scores. If the deletion probability scores are significantly different, the deletion probability score can be used to return targeted ads related to the more relevant term or terms along with the search results. Deletion probability scores are determined by first gathering historical records of search queries of two or more terms in which a subsequent query was submitted by the same user after one or more of the terms had been deleted. The deletion probability score for a particular term of a search query is calculated as the ratio of the number of times that particular term was itself deleted prior to a subsequent search by the same user divided by the number of times there were subsequent search queries by the same user in which any term or terms including that given term was deleted by the same user prior to the subsequent search. Terms are not limited to individual alphabetic words.
Browser Interface Helpers
This next document describes some ways to provide additional dynamic information to someone via a toolbar-style interface while they are browsing pages on the web.
Method of controlling an Internet browser interface and a controllable browser interface
Invented by Thomas J. Shafron
Assigned to Yahoo
US Patent Application 20060129937
Published June 15, 2006
Filed February 2, 2006
Abstract
The present invention is directed to a method of dynamically controlling and displaying an Internet browser interface, and to a dynamically controllable Internet browser interface. In accordance with the present invention, a browser interface may be customized using a controlling software program that may be provided by an Internet content provider, an ISP, or that may reside on an Internet user's computer. The controlling software program enables the Internet user, the content provider, or the ISP to customize and control the information and/or functionality of a user's browser and browser interface.
RSS Enhancements
The following three Yahoo filings all list the same inventors, including John Thrall, who is the head of media search engineering for Yahoo Search. They cover different aspects of using RSS with multimedia files.
Syndicating multiple media objects with RSS
Invented by Andrew R. Volk, David D. Hall, and John J. Thrall
US Patent Application 20060129917
Published June 15, 2006
Filed December 1, 2005
Abstract
System and method for syndicating more than one media object in an element using Real Simple Syndication (RSS). In one embodiment, multiple media objects with at least one shared characteristic are syndicated under the same element. For example, a single media object can come in multiple formats and/or compression rates.
Syndicating multimedia information with RSS
Invented by Andrew R. Volk, David D. Hall, and John J. Thrall
US Patent Application 20060129907
Published June 15, 2006
Filed December 1, 2005
Abstract
System and method for adding descriptive information to a Real Simple Syndication (RSS) document. The descriptive information describes the content of media objects syndicated through the document. The descriptive information can be used to provided additional information to a subscriber, and can be used in searching for syndicated media content.
RSS rendering via a media player
Invented by Andrew R. Volk, David D. Hall, and John J. Thrall
US Patent Application 20060129916
Published June 15, 2006
Filed December 1, 2005
Abstract
System and method for syndicating media objects through a link to a media player using Real Simple Syndication (RSS). A content provider may not want to give direct access to a media object to a subscriber. Instead a content provider can give the subscriber a link to a media player that can access the media object.
Microsoft
Searching electronic program guide data Invented by Pradhan S. Rao, David Hendler Sloo, Daniel Danker, and George K. Nyako Assigned to Microsoft US Patent Application 20060130098 Published June 15, 2006 Filed December 15, 2004
Abstract
Searching electronic program guide (EPG) data is described. The EPG data may be compartmentalized into channel metadata that describes characteristics of one or more channels and content metadata that describes characteristics of one or more content items. In a implementation, a method includes searching channel metadata and content metadata. A result of the searching is formed for output in conjunction with an electronic program guide (EPG).
System and method for indexing and prefiltering Invented by Brian Burdick, Joshua J. Forman, Kevin P. Kornelson, Murali Vajjiravel, and Rajeev Prasad Assigned to Microsoft US Patent Application 20060129555 Published June 15, 2006 Filed December 9, 2004
Abstract
A method and system are provided for selecting advertisements for presentation to a user in response to a user search query. The system may include a keyword server for parsing the user search query and an index server for receiving the parsed search query. The index server may include an index of advertising phrases and pre-filtering components for comparing index entries to the parsed user search query in order to discard non-matching index entries and locate matching entries. The pre-filtering components may include either a phrase length pre-filtering component or a word hash pre-filtering component. The system may additionally include a listing server for sorting through the matching entries located by the index server and further filtering the matching entries for retrieval and presentation to the user.
IBM
Ring method, apparatus, and computer program product for managing federated search results in a heterogeneous environment Invented by Wade Shelby Beavers and David Joseph Borrillo Assigned to IBM US Patent Application 20060129530 Published June 15, 2006 Filed December 9, 2004
Abstract
A method, apparatus and computer program product are provided for managing federated search results in a heterogeneous environment. A user enters a search term and the search term is submitted to multiple selected search engines. Search results are gathered from each selected search engine. A search ring is generated including a ring section to represent each of the selected search engines for enabling the user to view search results from one or more of the selected search engines.
Method and system for suggesting search engine keywords Invented by Cary Lee Bates Assigned to IBM US Patent Application 20060129531 Published June 15, 2006 Filed December 9, 2004
Abstract
A search engine receives a search query having one or more keywords. The documents in the result set from that search query are analyzed to identify one or more additional keywords that further segment, or separate, the initial result set. These additional keywords are presented to the user who then selects whether to include or exclude documents matching the additional keywords. In this way, the number of documents in the initial result set is reduced in a relatively quick and effortless manner.
Go Daddy
Email filtering system and method Invented by Brad Owen and Jason Steiner US Patent Application 20060129644 Published June 15, 2006 Filed December 14, 2004
Abstract
Systems and methods of the present invention allow filtering out spam and phishing email messages based on the links embedded into the email messages. In a preferred embodiment, an Email Filter extracts links from the email message and obtains desirability values for the links. The Email Filter may route the email message based on desirability values. Such routing includes delivering the email message to a Recipient, delivering the message to a Quarantine Mailbox, or deleting the message.
Xerox
Personalized web search method Invented by Lisa S. Purvis Assigned to Xerox Corporation US Patent Application 20060129533 Published June 15, 2006 Filed December 15, 2004
Abstract
A method for contextualizing search results is disclosed. The method includes performing a traditional web query that returns a set of result pages, using collaborative filtering techniques to generate a set of predicted pages, comparing the set of predicted pages with the set of result pages, and ranking the set of result pages so that result pages that are also included in the set of predicted pages are ranked higher than those that are not. Methods herein also contemplate using the search history of the user or others to refine the results of searches.
Ask.com
Relevancy-based database retrieval and display techniques Invented by Tao Yang, Wei Wang, and Apostolos Gerasoulis US Patent Application 20060129552 Published June 15, 2006 Filed February 2, 2006
Abstract
Techniques to retrieve, rank and display data objects retrieved form a database are described. In particular, methods to assign a global ranking value to a data object based on a combination of that object's link-based (e.g., vector-space cluster analysis) and text-based (e.g., word frequency) ranks are described. Additional techniques to determine a set of concepts, topics or key words associated with each retrieved data objects are described.
My usual reminder about patents: Some of the processes and technology described in patents are created in house, and some are developed with the assistance of contractors and partners. A percentage are never developed in a tangible manner, but may serve as a way to attempt to exclude others from using the technology, or even to possibly mislead competitors into exploring an area that they might not have an interest in (sometimes skepticism is good.)
There are times when a Google or Yahoo acquires a company to gain access to the intellectual property of that company, or the intellectual prowess and expertise of that company's employees. And sometimes patents are just purchased.
Want to comment or discuss? Visit our Search Technology & Relevancy area of the Search Engine Watch Forums.
Posted by Kevin Heisler at 8:42 PM | Permalink
New Search Patent Applications: June 19, 2006 - Autolinking, and Better Advertising through Deletion Predictions
Four patent applications from Google describe fighting spam in emails, providing product review searches, moving large amounts of data, and autolinking. Yahoo matches and raises with five patent filings: one on watching deletions to choose better ads, another on serving dynamic information through an additional browser interface, and three more on multimedia and RSS.
Microsoft goes TV 2.0 with an electronic program guide, and describes a way of matching advertising content with certain search queries before those searches are made. IBM comes up with a unique way of presenting the results of a search from more than one search engine, and a way of reducing the amount of irrelevant results in a search by analyzing an initial set of results, identifying an appropriate additional query term from those results, and searching the original results again but with the additional query term included in the search.
Go Daddy describes a way of fighting spam in emails. Xerox employs collaborative filtering from previous users' searches to predict search results. Apostolos Gerasoulis, from Ask.com, with a couple of co-inventors, ranks and displays pages (objects) based upon linkage and textual data, and then defines a way to identify and assign topics to them.
Email Spam
Emails with links in them could be considered spam if the links point to pages that are in a conceptual category considered spammy. This patent application really doesn't describe the concept categorization part of the process. That's done in a related patent application mentioned within this document, and the related document lists Georges Harik as one inventor. Dr. Harik's name is on a very large percentage of the patent applications involving Gmail-type processes.
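As a rough illustration of the idea, here is a minimal Python sketch that flags a message when one of its links points to a page in a category treated as spammy. Since the filing leaves the concept categorization step to a related application, the sketch replaces it with a hand-written lookup table; the table `HOST_CATEGORIES`, the set `SPAMMY_CATEGORIES`, the example hostnames, and both function names are assumptions made for this example, not anything taken from the patent application.

```python
import re
from urllib.parse import urlparse

# Hypothetical stand-in for concept categorization: a real system would
# retrieve each linked page and categorize its content. Here the category
# is simply looked up by hostname.
HOST_CATEGORIES = {
    "pills.example": "pharmacy",
    "news.example": "news",
}

# Hypothetical set of categories treated as spammy.
SPAMMY_CATEGORIES = {"pharmacy", "gambling"}

# Simple pattern for http/https links in plain-text message bodies.
LINK_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def linked_categories(message):
    """Return the set of known categories for pages linked from the message."""
    categories = set()
    for url in LINK_PATTERN.findall(message):
        host = urlparse(url).hostname
        category = HOST_CATEGORIES.get(host)
        if category is not None:
            categories.add(category)
    return categories


def looks_like_spam(message):
    """Flag the message if any linked page falls in a spammy category."""
    return bool(linked_categories(message) & SPAMMY_CATEGORIES)
```

In this sketch a message with no links, or with links only to uncategorized hosts, is never flagged; a fuller system would need its own policy for unknown pages.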
Method and system to detect e-mail spam using concept categorization of linked content Invented by Johnny Chen US Patent Application 20060122957 Published June 8, 2006 Filed December 3, 2004
Abstract
A system and method for detecting undesired electronic messages (e.g., spam) using concept categorization of hyperlinks is disclosed. A server receives an electronic message and retrieves web pages that correspond to hyperlinks in the message. The server performs concept categorization on the retrieved web pages based on semantic relationships in the received information to determine whether the electronic message meets predefined criteria associated with undesired messages.
Searching and Aggregating Product Reviews
If Google wanted to get into the product or services review business, the next patent filing describes a blueprint for a process that might make for an effective and innovative system.
Method and system for finding and aggregating reviews for a product Invented by Jan Matthias Ruhl and Mayur D. Datar US Patent Application 20060129446 Published June 15, 2006 Filed December 14, 2004
Abstract
The embodiments disclosed herein include new, more efficient ways to collect product reviews from the Internet, aggregate reviews for the same product, and provide an aggregated review to end users in a searchable format. One aspect of the invention is a graphical user interface on a computer that includes a plurality of portions of reviews for a product and a search input area for entering search terms to search for reviews of the product that contain the search terms.
Scaling and Distributing Data
Arvind Jain is the head of Research and Development in Google's Bangalore office, and has spoken at a number of conferences on infrastructure projects and issues involving such things as Google's crawl and indexing system, distributed file replication system, and compression techniques for large scale storage systems. He's listed as the inventor for this next Google filing.
System and method for scalable data distribution Invented by Arvind Jain US Patent Application 20060126201 Published June 15, 2006 Filed December 10, 2004
Abstract
A system having a resource manager, a plurality of masters, and a plurality of slaves, interconnected by a communications network. To distribute data, a master determined that a destination slave of the plurality slaves requires data. The master then generates a list of slaves from which to transfer the data to the destination slave. The master transmits the list to the resource manager. The resource manager is configured to select a source slave from the list based on available system resources. Once a source is selected by the resource manager, the master receives an instruction from the resource manager to initiate a transfer of the data from the source slave to the destination slave. The master then transmits an instruction to commence the transfer.
Autolinking
Google's Autolink raised a lot of eyebrows, and brought some negative reactions. A Search Engine Watch Blog post from Danny Sullivan, "Google Toolbar's AutoLink & The Need For Opt-Out," defined many of the issues around the toolbar feature. The following patent application explains how such a system might work from the search engine's perspective.
Providing useful information associated with an item in a document Invented by Gueorgui Djabarov US Patent Application 20060129910 Published June 15, 2006 Filed December 14, 2004
Abstract
A method includes recognizing an item within a first document based on a pattern associated with the item but not the exact content of the item. The method further includes identifying a link for the item and providing a second document that includes information associated with the item when the link for the item is selected.
Yahoo
Choosing Better Ads through User Behavior
Some queries involve the use of concepts and units, as described in at least five Yahoo patent filings (see the Yahoo sections of the previous patent posts "Yahoo Units and Microsoft Redundancy Filters" and "More Yahoo Concepts and Google Predictive Searches").
But sometimes a two-term query isn't a concept as much as it is a couple of keywords that someone may use to search for something. If that person performs a second search after deleting one of the words, then the record of that deletion and second search might help Yahoo calculate "deletion probability scores" for words being used in these kinds of two-term queries.
This can be helpful when there isn't a good keyword-based advertising match for that query, but there might be a good match individually for each of the terms that make up the query. The "deletion probability scores" can help determine which of the two terms to show keyword-based advertising for in search results.
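To make the idea concrete, here is a minimal Python sketch of how such scores might be tallied from a query log. It is an illustration under stated assumptions, not Yahoo's implementation: the log format (pairs of a first query and the same user's follow-up query), the function names `deletion_scores` and `term_to_advertise`, and the rule of advertising on the term least likely to be deleted are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def deletion_scores(log):
    """Estimate, per query, how often each term is the one that gets deleted.

    `log` is an iterable of (first_query, follow_up_query) pairs issued by
    the same user. This log format is an assumption for the example.
    """
    deleted = defaultdict(Counter)  # query terms -> times each term was dropped
    totals = Counter()              # query terms -> follow-ups with any deletion
    for first, second in log:
        first_terms = tuple(first.lower().split())
        second_terms = set(second.lower().split())
        dropped = [t for t in first_terms if t not in second_terms]
        # Count only pure deletions: at least one term dropped, none added,
        # and the follow-up query is not empty.
        if not dropped or not second_terms or not second_terms <= set(first_terms):
            continue
        totals[first_terms] += 1
        for term in dropped:
            deleted[first_terms][term] += 1
    return {
        query: {term: deleted[query][term] / totals[query] for term in query}
        for query in totals
    }


def term_to_advertise(scores, query):
    """Pick the term users are least likely to delete (assumed most relevant)."""
    per_term = scores[tuple(query.lower().split())]
    return min(per_term, key=per_term.get)
```

A log in which searchers drop "cheap" from "cheap flights" more often than they drop "flights" would lead this sketch to favor ads for "flights". Whether a low deletion probability should always mean the more valuable advertising term is a design choice the sketch simply assumes.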
System and methods for ranking the relative value of terms in a multi-term search query using deletion prediction Invented by Rosemary Jones and Daniel C. Fain US Patent Application 20060129534 Published June 15, 2006 Filed December 14, 2004
Abstract
The likely relevance of each term of a search-engine query of two or more terms is determined by their deletion probability scores. If the deletion probability scores are significantly different, the deletion probability score can be used to return targeted ads related to the more relevant term or terms along with the search results. Deletion probability scores are determined by first gathering historical records of search queries of two or more terms in which a subsequent query was submitted by the same user after one or more of the terms had been deleted. The deletion probability score for a particular term of a search query is calculated as the ratio of the number of times that particular term was itself deleted prior to a subsequent search by the same user divided by the number of times there were subsequent search queries by the same user in which any term or terms including that given term was deleted by the same user prior to the subsequent search. Terms are not limited to individual alphabetic words.
Browser Interface Helpers
This next document describes some ways to provide additional dynamic information to someone through a toolbar-style interface while they are browsing pages on the web.
Method of controlling an Internet browser interface and a controllable browser interface Invented by Thomas J. Shafron Assigned to Yahoo US Patent Application 20060129937 Published June 15, 2006 Filed February 2, 2006
Abstract
The present invention is directed to a method of dynamically controlling and displaying an Internet browser interface, and to a dynamically controllable Internet browser interface. In accordance with the present invention, a browser interface may be customized using a controlling software program that may be provided by an Internet content provider, an ISP, or that may reside on an Internet user's computer. The controlling software program enables the Internet user, the content provider, or the ISP to customize and control the information and/or functionality of a user's browser and browser interface.
RSS Enhancements
The following three Yahoo filings all list the same inventors, including John Thrall, who is the head of media search engineering for Yahoo Search. They cover different aspects of using RSS with multimedia files.
Syndicating multiple media objects with RSS Invented by Andrew R. Volk, David D. Hall, and John J. Thrall US Patent Application 20060129917 Published June 15, 2006 Filed December 1, 2005
Abstract
System and method for syndicating more than one media object in an element using Real Simple Syndication (RSS). In one embodiment, multiple media objects with at least one shared characteristic are syndicated under the same element. For example, a single media object can come in multiple formats and/or compression rates.
Syndicating multimedia information with RSS Invented by Andrew R. Volk, David D. Hall, John J. Thrall US Patent Application 20060129907 Published June 15, 2006 Filed December 1, 2005
Abstract
System and method for adding descriptive information to a Real Simple Syndication (RSS) document. The descriptive information describes the content of media objects syndicated through the document. The descriptive information can be used to provided additional information to a subscriber, and can be used in searching for syndicated media content.
RSS rendering via a media player Invented by Andrew R. Volk, David D. Hall, John J. Thrall US Patent Application 20060129916 Published June 15, 2006 Filed December 1, 2005
Abstract
System and method for syndicating media objects through a link to a media player using Real Simple Syndication (RSS). A content provider may not want to give direct access to a media object to a subscriber. Instead a content provider can give the subscriber a link to a media player that can access the media object.
Microsoft
Searching electronic program guide data Invented by Pradhan S. Rao, David Hendler Sloo, Daniel Danker, and George K. Nyako Assigned to Microsoft US Patent Application 20060130098 Published June 15, 2006 Filed December 15, 2004
Abstract
Searching electronic program guide (EPG) data is described. The EPG data may be compartmentalized into channel metadata that describes characteristics of one or more channels and content metadata that describes characteristics of one or more content items. In a implementation, a method includes searching channel metadata and content metadata. A result of the searching is formed for output in conjunction with an electronic program guide (EPG).
System and method for indexing and prefiltering Invented by Brian Burdick, Joshua J. Forman, Kevin P. Kornelson, Murali Vajjiravel, and Rajeev Prasad Assigned to Microsoft US Patent Application 20060129555 Published June 15, 2006 Filed December 9, 2004
Abstract
A method and system are provided for selecting advertisements for presentation to a user in response to a user search query. The system may include a keyword server for parsing the user search query and an index server for receiving the parsed search query. The index server may include an index of advertising phrases and pre-filtering components for comparing index entries to the parsed user search query in order to discard non-matching index entries and locate matching entries. The pre-filtering components may include either a phrase length pre-filtering component or a word hash pre-filtering component. The system may additionally include a listing server for sorting through the matching entries located by the index server and further filtering the matching entries for retrieval and presentation to the user.
IBM
Ring method, apparatus, and computer program product for managing federated search results in a heterogeneous environment Invented by Wade Shelby Beavers and David Joseph Borrillo Assigned to IBM US Patent Application 20060129530 Published June 15, 2006 Filed December 9, 2004
Abstract
A method, apparatus and computer program product are provided for managing federated search results in a heterogeneous environment. A user enters a search term and the search term is submitted to multiple selected search engines. Search results are gathered from each selected search engine. A search ring is generated including a ring section to represent each of the selected search engines for enabling the user to view search results from one or more of the selected search engines.
Method and system for suggesting search engine keywords Invented by Cary Lee Bates Assigned to IBM US Patent Application 20060129531 Published June 15, 2006 Filed December 9, 2004
Abstract
A search engine receives a search query having one or more keywords. The documents in the result set from that search query are analyzed to identify one or more additional keywords that further segment, or separate, the initial result set. These additional keywords are presented to the user who then selects whether to include or exclude documents matching the additional keywords. In this way, the number of documents in the initial result set is reduced in a relatively quick and effortless manner.
Go Daddy
Email filtering system and method Invented by Brad Owen and Jason Steiner US Patent Application 20060129644 Published June 15, 2006 Filed December 14, 2004
Abstract
Systems and methods of the present invention allow filtering out spam and phishing email messages based on the links embedded into the email messages. In a preferred embodiment, an Email Filter extracts links from the email message and obtains desirability values for the links. The Email Filter may route the email message based on desirability values. Such routing includes delivering the email message to a Recipient, delivering the message to a Quarantine Mailbox, or deleting the message.
Xerox