The Search Lounge » Question #3, Part 5 - Phrase Matching and Blog Searching

9/24/2005

Question #3, Part 5 - Phrase Matching and Blog Searching

Filed under: — Chris

What could search engines do better?

One person responded:

    * …I’ll do a search and get results with just one of the key words in it…e.g. if I did a search on Tamil Tigers, I would probably get results with just the word Tigers, which I’m not at all interested in.

    Comments: partial phrase matching is indeed an issue with search engines. Quoting the query can help, but it can also limit the result set too much. Proximity can be a problem too, as with pages that mention both Tamil and Tigers but never together as a phrase (probably unlikely for this example, but you get the point). Phrase matching is really the key to relevancy. For most one-word queries it’s either easy enough to assume what the user is searching for (eBay), or ambiguous enough (jaguar) that a breadth of results is appropriate. But with a multi-term query you get into the world of concept matching, as when a web site labels an item Navy Blue Shirt but the searcher types Dark Blue Shirt.
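
To make the distinction concrete, here is a minimal Python sketch of the three kinds of match discussed above: a page containing only one of the terms, a page containing all the terms but not together, and a page containing the exact phrase. The function names and sample pages are made up for illustration, and this is not how any particular search engine works.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_phrase(query_terms, doc_terms):
    """True if the query terms appear adjacent and in order in the document."""
    n = len(query_terms)
    return any(doc_terms[i:i + n] == query_terms
               for i in range(len(doc_terms) - n + 1))

def match_level(query, doc):
    """Classify a document as 'phrase', 'all_terms', 'some_terms' or 'none'.

    Hypothetical helper: the labels are illustrative, not a standard.
    """
    q, d = tokenize(query), tokenize(doc)
    if not q:
        return "none"
    if matches_phrase(q, d):
        return "phrase"
    present = set(d)
    if all(term in present for term in q):
        return "all_terms"   # both words occur, but not as a phrase
    if any(term in present for term in q):
        return "some_terms"  # the "just the word Tigers" case
    return "none"

pages = [
    "The Tamil Tigers issued a statement.",
    "Tigers are found in reserves across Tamil Nadu.",
    "Tigers are endangered.",
]
for page in pages:
    print(match_level("Tamil Tigers", page), "-", page)
```

A quoted search keeps only the first kind of page. The respondent's complaint is about engines that also return the third kind.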

    But maybe the bigger challenge is “small mentions”. These days, with a two-word query like Tamil Tigers, I think most results will match on both terms. But what about pages that make only a passing reference to the term?
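
One rough way to tell a passing reference from a page that is really about the topic is to look at how much of the page the phrase takes up. The sketch below assumes a simple "phrase density" measure (words belonging to phrase occurrences divided by total words). The measure, the function names, and the sample text are illustrative assumptions, not a description of any real ranking algorithm.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_count(query, doc):
    """Count how many times the query appears as an exact phrase."""
    q, d = tokenize(query), tokenize(doc)
    if not q:
        return 0
    return sum(1 for i in range(len(d) - len(q) + 1) if d[i:i + len(q)] == q)

def phrase_density(query, doc):
    """Fraction of the document's words that belong to a phrase occurrence.

    Hypothetical measure: a low value suggests a passing mention.
    """
    d = tokenize(doc)
    if not d:
        return 0.0
    return phrase_count(query, doc) * len(tokenize(query)) / len(d)

focused = "Tamil Tigers leaders met. The Tamil Tigers agreed to talks."
passing = ("A long travel diary about Sri Lanka food and beaches "
           "that mentions the Tamil Tigers once.")

print(phrase_density("Tamil Tigers", focused))
print(phrase_density("Tamil Tigers", passing))
```

Both pages match the phrase, so a plain phrase match cannot separate them. The density is higher for the focused page, which is one signal an engine could use to rank it above the passing mention.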

Another respondent wrote:

    * Update blog searches quicker.

    Comments: most blog search engines are near real-time right now, though I guess “near” is not real-time. I notice a lag sometimes between when I publish a posting on the Search Lounge and when it gets pushed to my RSS feeds.
