Monthly Archives: October 2004



Snap (No, not the old Cnet/NBC Snap. This is a new Snap)
Type of Engine: Popularity based on user data* and shopping.
Overall: Good.
If this engine were a drink it would be…a Bloody Mary. There are different ways to make it; there are lots of ingredients inside; and you can see all the various colors and vegetables floating around in there whether you want to or not.

*I couldn’t think of a better way to say this, but basically Snap incorporates data into their algorithm about which sites users clicked on. In the search industry that’s known as CTR (click-through rate). If you know a cleaner way to express what kind of engine Snap is, please email me.

Snap uses click data, including which sites get clicked on and how many sub-pages are viewed by users, to help determine relevancy. They state they have data from 1 million users going back to January 2004. This is an idea that has been kicked around and used to various degrees by other companies, including LookSmart and Google AdWords (maybe others too). But Snap has taken things a step further; not only do they incorporate click-through rates, but you can also manipulate your results using data from other users. I’ll explain more later.
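To make the idea concrete, here’s a toy sketch of how an engine might blend click data into ranking. The weights, field names, and formula are my own invention, not Snap’s actual algorithm; only the two signals (clicks on a result, and average page views per visit) come from Snap’s own description.

```python
# Hypothetical sketch of blending user click data into a relevance score.
# The weights and the formula are my guesses, not Snap's real algorithm.

def popularity_score(base_relevance, clicks, total_clicks, avg_page_views,
                     click_weight=0.5, depth_weight=0.2):
    """Combine a text-matching score with user-behavior signals."""
    # Share of all clicks for this query that went to this result (CTR-like).
    click_share = clicks / total_clicks if total_clicks else 0.0
    # Deeper browsing of a site suggests users found it useful.
    depth = min(avg_page_views / 10.0, 1.0)  # cap so one signal can't dominate
    return ((1 - click_weight - depth_weight) * base_relevance
            + click_weight * click_share
            + depth_weight * depth)

# Example using the click count reported later for the Nike query (1,735);
# the other numbers are made up for illustration.
score = popularity_score(base_relevance=0.8, clicks=1735,
                         total_clicks=5000, avg_page_views=4.2)
```

The interesting design question, which comes up again below with the cliff house query, is how heavily to weight the behavioral signals against plain text relevance.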

–A quick aside: If you’re starting a new company, why use the name of a deceased company that did business in the same industry? The departed Snap, the search engine jointly owned by Cnet and NBC that later became part of NBCi, existed for several years. So I was very surprised to see a new Phoenix arise from its ashes (there is no affiliation between the two Snaps as far as I know).–

UI and Features
Because there’s a lot going on with Snap’s UI I’ve broken this section into subsections.
The Home Page
The first thing you’ll notice is their cluttered homepage. It’s jumbled compared to the now-ubiquitous Google-style UI. But if you take a moment, there’s some pretty good stuff to see. On the left of the page are a few summaries of popular queries: Top Products, Top People and Top Music. On the right are the latest articles from Snap’s Blog. In the middle of the homepage there’s a lot going on: a running tally of the number of searches done, plus graphs; click on those and you’ll see all kinds of stats about Revenue, Search and Advertisers. Check it out. Seeing as how I have a weird hobby of analyzing search queries, I like the keyword statistics page, but I’m not sure the average user has much use for it. It’s all part of Snap’s goal of transparency, though.

It’s certainly a different type of homepage, but I think they’d do better to put some of this information into their About Us section or elsewhere. When I first landed on the Snap site I wasn’t sure if I’d reached a regular search engine or something else, like an enterprise search site where I was being sold a search product. I’m glad they have all this information, but I wish they’d move it off the landing page.

Related Keywords/Count
After you do a search you’ll see in the upper right corner a list of similar searches and their frequency. Interestingly enough, I remember that the original Snap (the Cnet/NBC one) had a similar feature that displayed related searches. I tried a search for library disaster plans and the related keywords were things like library of congress, floor plans, wet, libraries, etc. None of the terms were relevant to my search as stand-alone terms. I clicked on wet because I was curious about it, and of course the related keywords for wet were porn queries that I won’t repeat here.

Refining Queries
After you do a search there’s a search box called Type Here To Refine where you can search within results, almost like a “Control F” find functionality merged with a search functionality. Give it a try, it’s fun to play around with.

On the search results page several columns appear alongside the site results. Clicking on a column lets you sort results in different ways. This is a nice feature, but again it’s one that most searchers really don’t need, in my opinion. I can see its value for advertisers who may want to purchase a keyword – think of the Overture bidding model – but for the average searcher, sorting by conversion rate isn’t necessary. (By the way, the founder of Snap is Bill Gross, the same guy who founded Overture.) You can even reverse-sort by rank, but again I’m not sure why I’d really want to do that. The columns in the results are as follows (note: I started to define each one but then realized Snap has a nice glossary, so I took the definitions directly from them):

1. No. of Clicks is the count of users in the Snap Network since January 2004 who did the same search you just did, and who then clicked on this listing. Typically, the higher the better. An asterisk* indicates that the data is estimated.

2. Average Page Views is the average number of pages of this site that were viewed by users in the Snap Network who did the same search that you just did. Again, higher numbers are typically better. An asterisk* indicates that the data is estimated.

3. Cost to Advertiser is the amount that an advertiser pays Snap for referring a customer who subscribes, purchases, bids, registers, or downloads.

4. Conversion Rate is the percentage of users who subscribe, purchase, bid, register, or download at an advertiser’s site. You guessed it, higher numbers are better — they typically mean that this advertiser is better at fulfilling customer needs.

5. Domain is the top level domain of the site. Most commercial sites in the United States are ‘.com’ sites. However, depending on the search term, there are often excellent results in other domains, such as .edu (educational) and .gov (government).

You can filter multiple columns simultaneously. For example, in a column that displays numbers, type ‘>’ (greater-than), ‘=’ (equal-to), or ‘<’ (less-than) followed by a number, and at the same time filter on a phrase in another column.
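A rough sketch of how filters like those might be evaluated: a leading operator means a numeric comparison, anything else is treated as a substring match, and a row survives only if every active filter passes. The operator syntax follows the description above; the row data and function names are invented for illustration.

```python
# Sketch of evaluating Snap-style column filters. The '>', '=', '<' syntax
# follows the description in the post; the rows below are made up.
import operator

OPS = {'>': operator.gt, '=': operator.eq, '<': operator.lt}

def matches(value, filter_text):
    """Apply one filter to one cell: numeric comparison or substring match."""
    filter_text = filter_text.strip()
    if filter_text and filter_text[0] in OPS:
        return OPS[filter_text[0]](float(value), float(filter_text[1:]))
    return filter_text.lower() in str(value).lower()

def filter_rows(rows, filters):
    """Keep rows that satisfy every active column filter simultaneously."""
    return [row for row in rows
            if all(matches(row[col], f) for col, f in filters.items())]

results = [
    {'domain': 'nike.com', 'clicks': 1735},
    {'domain': 'msn.com',  'clicks': 917},
    {'domain': 'shop.example.edu', 'clicks': 12},
]
# Filter two columns at once: more than 100 clicks AND a '.com' domain.
filtered = filter_rows(results, {'clicks': '>100', 'domain': '.com'})
```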

Shopping Queries
Shopping queries, like ipod, produce different columns. I could sort by price range, type of memory, amount of memory, weight, etc. These columns vary based on the shopping query, so if you search for laptops you’ll see columns like OS, screen size, etc. Very nice, because what they’re doing is customizing the UI based on the query, which I love. If you click on one of the results you’ll get a nice preview lower on the page. Although all the results were relevant in that they were indeed portable music players, many of them were actually ipod competitors. It’s a common practice for shopping engines, but I think it negatively impacts relevance. If I wanted to see ipod competitors I’d search for something like portable MP3 players. Many commercial searches, such as bose wave, didn’t get any special shopping columns. Not so good, but maybe they’re still building out their shopping coverage.

Corporate Queries
Another type of query that gets special treatment is a query for a company, like Amazon. The interface is different: there is a list of the most popular sub-pages, a cached snapshot of Amazon’s homepage, a company snapshot, and news headlines. A nice feature, but again it’s inconsistent, because when I searched for Nike I got a regular results page.

Next to each result there’s a spot for a logo. It’s at the domain level so that AOL member personal pages have the AOL logo even though they’re not corporate-sponsored. But as with other things on Snap this is inconsistent. For my Nike query the only sites that had logos were a sub-page from MSN and a sub-page from Amazon.

Query Examples
I cheated a little and included several query examples in the UI and Features section in order to illustrate Snap’s features. But in terms of relevancy, Snap is pretty good. For my library disaster plans query the results were good, though I should point out that result #1 wasn’t, because it’s a FEMA information sub-page titled “library.” But I can see why Snap returned it, so I’ll let it slide. I reordered results by number of clicks and things looked slightly better because the FEMA site went away. Though again I should point out that for this obscure query the highest number of clicks was only 5. For context, the highest number of clicks for Nike was 1,735, followed in second place by 917.

I tried cliff house restaurant san francisco and the results were good. The official page was #1, though it’s interesting to note that it had 1 click whereas the second result, a Yahoo Travel restaurant review, was highest with 8 clicks. This is a good example of when click data can be misleading for algorithms. Even if more people click on the review the official page needs to be the first result. And Snap got it right.

I like what Snap’s trying to do, but I fear they’re overloading the average user with too much information. They’re showing the guts of their technology rather than incorporating it seamlessly behind the scenes. After you do a search and are waiting for the results, which can be slow, the screen will say things like Preparing Data for Display, De-Duplicating Listings and Getting Top Listings. I appreciate the honesty but I’m not sure it enhances my experience.

The columns are fun to play around with but I think they take up valuable real estate that could be used for displaying more metadata and text about the sites. Maybe Snap can play around with its UI the way A9 has and make it more customizable so that I can eliminate or add columns as I see fit.

They also need to take some of their nice features, such as the differentiated results page for shopping queries, and roll that out for many more terms. I’ll give them the benefit of the doubt and assume they’ll get to this soon.

I want to say one more thing though: Snap is pushing things and I really appreciate that. I like very much that they’re making an effort to be unique and I will definitely pop in periodically to see how they’re doing and what new innovations they’ve implemented.


Netnose
Type of Engine: User ratings.
Overall: Needs Improvement.
If this engine were a drink it would be…a wine cooler. I like wine and I like juice, but the two together aren’t so great.

Netnose offers the ability for everyone to rate the relevancy of search terms to specific web sites. If I understand it correctly, sites only appear in Netnose results if they’ve been rated as relevant by a user. I guess that explains why so many queries I tried performed very poorly; the content simply had not been added to their database. I like the idea of user ratings, but a little-known engine like Netnose needs to do more to supplement their immediate relevance, otherwise users won’t come back. Right now it seems like they’re waiting for the community to do enough to make it a usable search tool, but they’re far from that goal.
I almost hate to give them a Needs Improvement rating, but my perspective is that of the searcher and relevance is my goal. So although the technology and goals of letting users add ratings may be admirable, I won’t be totally convinced until I see the relevancy improve.

UI and Features
One thing that bothered me was figuring out how to rate specific sites rather than random ones. Let me explain. If you click on rating you’ll go to a page that lets you rate now. A Javascript pop-up window then opens along with a random web site. In the pop-up window there’s a selection of suggested query terms for the site. There’s also a ratings scale of bad, fair, good, better or best. You can also add commercial and adult tags. So you rate each term and then go on to the next random site. Generally the suggested terms were decent. OK, so far so good, I suppose. But what about when I do a search? I want to be able to rate the search results right then and there, and I don’t see a way to do that. You can assign a category at that point, but not a rating other than dead link/totally irrelevant. It seems like tying the search results to the rating process would streamline things and garner more ratings.

The site submission page actually lets you enter metadata for newly added sites, including 5 search terms and a category. Although I love the category idea, the categories don’t make a whole lot of sense to me: Just for Kids, Research, How To, Entertainment, Real Time Stats, Shopping, Business Related. Go ahead and try assigning one of these categories; it’s not always easy. There are also questions about whether the site sells products, has adult content, uses pop-ups, or has paid content. Letting the public add metadata is a noble idea and I’m far from giving up on it as viable, but having worked for two different search engines I know first-hand how devious spammers can be. Though I should add that sites don’t go live in search results until other user(s) rate them. All very interesting ideas, similar to what some other community-sustained sites have done, such as Zeal and ODP, though those two directories are more browsing-based and Netnose is purely search-driven.
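Here’s a toy model of that moderation gate as I understand it: a submitted site only becomes searchable once users other than the submitter have rated it. The bad-to-best scale comes from Netnose’s rating pop-up described above; the thresholds, names and function are my own invention.

```python
# Toy model of Netnose-style moderation: a submitted site goes live only
# after other users rate it. Thresholds here are my guesses, not Netnose's.
SCALE = {'bad': 0, 'fair': 1, 'good': 2, 'better': 3, 'best': 4}

def is_live(submitter, ratings, min_raters=1, min_avg=1.0):
    """Decide visibility from (user, rating) pairs, ignoring the submitter."""
    # Ratings by the submitter don't count toward going live, which is
    # one simple defense against the spammers mentioned above.
    others = [SCALE[r] for user, r in ratings if user != submitter]
    if len(others) < min_raters:
        return False
    return sum(others) / len(others) >= min_avg

ratings = [('alice', 'good'), ('bob', 'fair')]
live = is_live(submitter='alice', ratings=ratings)
```

Even this trivial gate shows the chicken-and-egg problem discussed below: with too few raters, nothing goes live and the index stays thin.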

Query Examples
The relevancy is not so good. In fact, sometimes it’s very bad. For example, I searched for Barry Zito and the 1st and 4th results were about Barry Manilow. The 2nd was about Dave Barry and the 3rd about the Barry Choral Society. So then I went to the Advanced Search page and required both terms, Barry and Zito, and got no results. That’s the outcome of no one having mapped Barry Zito to a web page about him. It would be better to default to fall-through results from another engine than to show me sites about Barry Manilow that only match the first term of a two-term phrase query. Bad.

Try doing a search and then refreshing the browser. The ordering of results slightly changes each time. Odd…

I like the idea of Netnose, but for something like this to work there really has to be a critical mass of users working diligently to make a difference. Right now I don’t think that’s the case. Letting users rate sites is an idea that’s been discussed for years. I know Google played around with a toolbar rating system. I don’t think that ever caught on, but does anyone else know differently? It’s a chicken and egg thing here, because you won’t get enough users interested unless the search is good, but the search won’t get good until enough users rate sites.


Entireweb
Type of Engine: Basic search engine.
Overall: Average.
If this engine were a drink it would be…a glass of brännvin, as in Swedish schnapps. A traditional drink that may never make it into your repertoire but it’s good to know it’s out there.

Entireweb is a Swedish engine. It’s basic and straight-forward with a few nice touches. Nothing on the front-end drastically sets them apart.
On their About Us page they state very clearly that the search is currently a Beta version, though I don’t know how long that’s been the case.
Their spider is called SpeedySpider.

UI and Features
Entireweb offers several ways to refine queries. There are more than 30 languages to choose from, or you can limit by continent (when did Scandinavia become a continent?), domain, country or a combination thereof.

There’s an advanced search page for Boolean searching. One option they have is “How many rows of ‘Page Content’ to show.” This is a nice feature, though at present the max is 5 lines. I’d be interested in playing around with setting it higher than that. I’ve always felt that site descriptions on search engines are a major area for improvement.

I like the language and region metadata they show on search results page. Here’s an example (I snipped description and page content to save space):
1. Brad Friedel – Latest News Headlines [SpeedyView]
Page content:
Language: english Region:uk
Related links:

I like the language and region tags. I’m not so excited about Related Links, because for all the searches I did, I never saw URLs that were not sub-pages from the same domain. So instead of Related Links, it should be called More results from the same source. SpeedyView is a page-preview feature.

Check out their Speedy Spy page. On it you’ll find their Top 20 queries and also the latest 20 queries. I kept entering a search and then quickly refreshing the Speedy Spy page, but I never did see my query appear on the list. I appreciate it though when engines give insights into user behavior.

Query Examples
I had a little soccer injury last week so I searched for “strained calf muscle”. I wasn’t too impressed with the results. The first couple seemed to be from commercial sites, and when I’m looking for medical info I really want a trusted source. Some of the other results were OK, but they were focused on the knee or the Achilles tendon, with the calf being mentioned in passing. A better results page would have had content specifically about calf muscles from non-commercial medical sites.

The number of results is always pretty low. I searched for “huckabees” and got 4 results. All four were about the movie. The first two were very relevant, and the second two were too specific for my query. But where is IMDB? Where is the trailer? Where are the movie reviews? And none of the results had the word “huckabees” anywhere in the visible display text, and that’s bad.

I’ve had fun reviewing visual and clustering and music and meta-search engines, so it’s nice to return to a classic AltaVista/Google/FAST/Wisenut/Teoma-style search engine. Entireweb seems to have a good base to build on. They say they’ve been around since May of 2000, but that in 2002-2003 they redesigned their search technology. So I’m not sure where that leaves us now, at the end of 2004. Is the 2002/03 search the beta search that’s live? If so, that’s not very impressive. I don’t need them to revolutionize their UI or search features, but their relevance needs some improvement. They need to build up the size of their index as well as improve the order of results.



Mooter
Type of Engine: Visual and clustering.
Overall: Good.
If this engine were a drink it would be…an Emu Export. It’s Australian, has a funny name, I’d never heard of it until very recently, and it’s a safe bet that you’ve never heard of it.

Mooter is a visual clustering search engine and I like it. They’re from Australia and have been live only about a year.

According to their Technology site, which actually provides some useful information about what they’re up to, “Mooter gets it results from its own spidering, and a unique index of websites. While we are growing, we are supplementing our index with metasearch, and comparing the results from various engines before applying our analysis algorithms.” This is an interesting statement and I’m not exactly sure what they mean by it. If I had to guess it sounds like they’re spidering other engines’ indexes to create their clusters. Is this different from what Clusty or Clush does? I’m not sure, but would love to know the answer. Please email me if you know. It sounds like they plan to generate an entire web index, but that could be wishful thinking.
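Since I can only guess at what Mooter is doing, here’s a deliberately naive sketch of one way a metasearch engine could group results into clusters: bucket each result title under its most prominent non-query keyword. Everything here (the stopword list, the labeling rule, the sample titles) is my own invention, and it even reproduces the single-word-cluster weakness I complain about below.

```python
# A naive guess at metasearch clustering: group result titles under their
# first informative keyword. Pure speculation, not Mooter's algorithm.
from collections import defaultdict

STOPWORDS = {'the', 'a', 'of', 'and', 'for'}

def cluster_titles(titles, query):
    """Group result titles under a shared non-query, non-stopword term."""
    query_terms = set(query.lower().split())
    clusters = defaultdict(list)
    for title in titles:
        terms = [t for t in title.lower().split()
                 if t not in STOPWORDS and t not in query_terms]
        # Label the cluster with the first informative term (very crude;
        # this is exactly how you end up with unhelpful one-word clusters).
        label = terms[0] if terms else 'other'
        clusters[label].append(title)
    return dict(clusters)

titles = ['William Styron biography', 'William Styron novels',
          'Styron biography notes']
clusters = cluster_titles(titles, 'william styron')
```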

UI and Features
For the most part I like their interface; it’s simple and almost cheesy, but somehow likable. The Overture-supplied Sponsored Links are killing me, though. When you click into a cluster, the Sponsored Links take up nearly half the screen; bad, very bad.

You can click “All Results” to get the full list of results. Mooter maxes out at 120 results, or at least I didn’t find any queries that produced more than that.

If you don’t like the first cluster you see, click on the “Next Clusters” icon (the icon needs some improvement; it looks like a cluster of red pimples) to see another cluster.

Query Example
For phrase searches, each word usually becomes a cluster. For the search “William Styron” one of the clusters was “William.” Not good, but then I clicked on the cluster link and the sites were indeed about William Styron, and not just any old William. But still a “William” cluster doesn’t really help me.

Even if the name of a cluster doesn’t sound relevant, the links contained therein were generally on target. So I’d say they’re getting the back-end organization of clusters correct, but what they need to do is improve their cluster names and concepts. Maybe more phrase matching rather than pulling out just single terms, as if I know what I’m talking about.

They could also make the visual part of their results more compelling. As it is right now, it almost doesn’t need to be visual because the visual part of it doesn’t add much beyond novelty (and even the novelty is wearing off as more Kartoo-style visual engines appear).


Musicplasma
Overall: Average quality, yet still very enjoyable to play around with.
If this engine were a drink it would be…a mint julep. It’s not your everyday drink, but you’ll find it a sweet break from the norm.

Musicplasma is a music search tool that lets you discover music artists similar to ones you already like. Oh, and it’s visual, like Kartoo.

I’m not really sure how they determine similarities. If I had to guess I’d say they base it on an ontology of genres (rock, rap, etc.), and on mining something like Amazon’s “customers who bought that also bought these” type of functionality.
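To illustrate that guess, here’s a small sketch of co-purchase similarity: two artists are “close” when many of the same customers bought both. The purchase data is invented, and Jaccard overlap is my choice of metric, not anything Musicplasma has documented. For fun, the made-up data mirrors the David Byrne oddity discussed below.

```python
# Sketch of co-purchase artist similarity (my guess at Musicplasma's
# approach). Buyer sets are invented; Jaccard overlap is my metric choice.

def jaccard(buyers_a, buyers_b):
    """Overlap between two buyer sets (0 = disjoint, 1 = identical)."""
    a, b = set(buyers_a), set(buyers_b)
    return len(a & b) / len(a | b) if a | b else 0.0

purchases = {
    'David Byrne':     {'u1', 'u2', 'u3', 'u4'},
    'Talking Heads':   {'u2', 'u3', 'u5', 'u6'},
    'Paul Westerberg': {'u1', 'u2', 'u3'},
}

def nearest(artist, purchases):
    """Rank the other artists by shared-buyer overlap with this one."""
    return sorted((other for other in purchases if other != artist),
                  key=lambda o: jaccard(purchases[artist], purchases[o]),
                  reverse=True)

neighbors = nearest('David Byrne', purchases)
```

With this fake data, Paul Westerberg ends up closer to David Byrne than the Talking Heads do, which is exactly the kind of surprising-but-possibly-accurate link described in the query examples.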
UI and Features
You can zoom in or out on clusters, thereby focusing or expanding your view of similar artists.

Clicking on the links – those ethereal lines – scrolls the page in that direction. Nice feature!
Clicking on other clusters will refocus the clusters around that artist.
The Design panel allows for changing colors and other appearances if you’re into that kind of thing.

Query Examples
Sometimes the clusters make total sense. Try a search for Guided by Voices and the closest cluster will be Robert Pollard, the lead singer, who has done solo albums. Sometimes the clusters are a bit off. Try searching for David Byrne, and for some reason Paul Westerberg – lead singer of the Replacements – comes between Byrne and the Talking Heads. I’m not saying that’s incorrect, but my first reaction was surprise. It could be accurate that people who like David Byrne’s solo stuff, which doesn’t sound much like the Talking Heads, might like Paul Westerberg, Warren Zevon and Roxy Music (all closer than the Talking Heads).
I noticed that powerhouses like the Rolling Stones and Neil Young show up in lots of places. I searched for Prince Buster, the 60s ska pioneer, and there’s Neil’s cluster. A search for Bad Brains similarly showed the Stones lurking one link away. Now obviously Neil Young and the Stones have influenced tons of groups, but I’m not sure that Bad Brains should be one link away. Anyone know why that would be?

I’d like to know more about the links. Is one artist linked to another because they collaborated? Or are they linked because they play similar music? Or are on the same label?
OK, so it’s fun to play with, but give me some song samples.
How about letting me type in more than one group so I can really focus in?
Focus by time period. I really like early Stones, when they sounded like, say, the Small Faces, but I hate recent Stones, when they sound like, say, crap.

Musicplasma is fun to play with, but it needs to be more practical. Take the visual music search engine and turn it into an audio search engine. If that’s too far-flung, then at least show more context on how artists are linked. But like I said, it sure is fun…
