BrainBoost is a natural language search engine. Ask BrainBoost questions in plain English and you’ll get answers in plain English. BrainBoost is automated and uses no human editorial intervention. The legend goes that BrainBoost was created by 24-year-old software programmer Assaf Rozenblatt. It took him a year to build, and he built it so that his fiancée could better do her college research.
For more information and analysis, check out the review of BrainBoost I did for the Search Lounge.
Hi Assaf, thank you so much for joining me here at the Search Lounge. I know you started BrainBoost, but what exactly is your role these days? And can you provide some more background about the size and structure of the company?
We are a very small team at the moment, with only a handful of developers.
We are still primarily focused on development, but we will be switching gears soon to the sales and marketing of our licensable AnswerRank technology.
I am still very hands-on with the software development and continually help improve the technology.
A big issue in Internet search is evaluating the trustworthiness of sources. This issue is amplified in BrainBoost because the answers are shown right on the search results page and do not require users to click through to investigate the trustworthiness of the source. For example, for the search "what is the population of Scotland?", the first three answers are slightly different (5.2 million, 5.1 million, and just over 5 million; like I said, just a slight difference). Would it help if you included a published or crawled date? Or some kind of page rank metric? Do you have any suggestions for how BrainBoost users should address this issue?
We are currently working on a PageRank-like system to help identify trustworthy sources.
How do you evaluate the relevancy and quality of the results that are returned on BrainBoost? Do you have a formal process in place for doing this? And, what subjects or types of queries do you think BrainBoost is particularly good at? How about subjects or types of queries that need some improvement?
For QA, we compiled a database of common questions and manually researched the answers for each of them. We then run the questions through the BrainBoost engine, which in turn automatically goes out to find answers. Precision is then easily determined by comparing what percentage of the automatically generated BrainBoost answers match our manually found answers.
There really isn’t a question type that is problematic for us at this time.
BrainBoost is 100% automated, but would you consider blending BrainBoost’s technology with some editorial content or mapping of results for certain types of queries?
Extracting answers from unstructured documents is what really sets us apart from existing ‘Answer Engines’ like Ask Jeeves and the new MSN search. It’s a much trickier problem to solve, and we are going to continue focusing on it for the time being.
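The QA procedure described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BrainBoost's actual code: the function names, the sample questions, and the rule that two answers "match" when they are equal after lowercasing and whitespace cleanup are all hypothetical.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def precision(gold: dict[str, str], predicted: dict[str, str]) -> float:
    """Share of automatically answered questions that match the manual answer.

    Assumption: a match means equality after normalization. The interview
    does not say how matches are actually judged.
    """
    answered = [q for q in predicted if q in gold]
    if not answered:
        return 0.0
    correct = sum(
        1 for q in answered if normalize(predicted[q]) == normalize(gold[q])
    )
    return correct / len(answered)


# Made-up sample data, for illustration only.
gold_answers = {
    "what does NASA stand for": "National Aeronautics and Space Administration",
    "what is the capital of France": "Paris",
}
engine_answers = {
    "what does NASA stand for": "national aeronautics and  space administration",
    "what is the capital of France": "Lyon",
}

print(precision(gold_answers, engine_answers))  # one of two answers matches
```

A real evaluation would likely need a looser notion of a match (for example, the manual answer appearing inside a longer extracted sentence), which is why the comparison is isolated in its own function here.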
Can you provide any insight into how BrainBoost reformulates a query when it sends it to another engine? Any chance you might be willing to provide an example of how this works?
Query reformulation helps ensure search engines return web pages that most likely contain answers somewhere within them. A simple example: “what does NASA stand for” gets reformulated into “NASA stands for”. This simple reordering of words (and the conjugation of the verb) greatly boosts the likelihood that relevant documents are returned by the engines. With larger and especially multipart questions this can get very complicated.
There’s something I don’t quite understand about BrainBoost. I enter a search on BB; BB reformulates my query and sends the new query against other engines; the other engines provide results; BB gathers those results and ranks them. OK, so here’s the question: how does BB take a result from another engine and then show a different description (and title?) than what I would see on the other engine? Or am I missing a piece of the puzzle?
BrainBoost does not just display the results it gathers from other engines. It merely uses those results as its starting point. The core technology of BrainBoost is a system we call AnswerRank. The AnswerRank system is given a question and a collection of documents. AnswerRank then analyzes the documents line by line and automatically extracts the very best answers from those documents. The top few hundred search results from the popular engines are what we feed into AnswerRank. BrainBoost begins where the search engines leave off.
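The NASA example can be reproduced with a small pattern-based rewriter. This is a minimal sketch, assuming reformulation works by matching a question template and emitting the start of a declarative sentence; the `RULES` table and the `reformulate` function are hypothetical and say nothing about how BrainBoost actually implements this step.

```python
import re

# Hypothetical rewrite rules: (question pattern, declarative template).
RULES = [
    (re.compile(r"^what does (?P<x>.+) stand for$", re.IGNORECASE), "{x} stands for"),
    (re.compile(r"^who invented (?P<x>.+)$", re.IGNORECASE), "{x} was invented by"),
]


def reformulate(question: str) -> str:
    """Rewrite a question into the phrase an answering sentence might contain."""
    cleaned = question.strip().rstrip("?").strip()
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(**match.groupdict())
    # No rule matched: fall back to the question text without the question mark.
    return cleaned


print(reformulate("what does NASA stand for?"))
```

Two hand-written rules are enough to show the reordering and verb conjugation mentioned in the answer. Multipart questions, as noted, would need far more than a flat list of patterns.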
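The shape of that pipeline (a question plus a collection of documents in, the best lines out) can be illustrated as follows. The scoring rule here, counting content words shared between the question and each line, is a stand-in chosen for brevity; AnswerRank's real scoring is not described in the interview, and every name below is hypothetical.

```python
import re

# Assumed stopword list, kept tiny on purpose for the example.
STOPWORDS = {"what", "is", "the", "of", "a", "an", "does", "do", "how", "who", "in"}


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_answers(question: str, documents: list[str], top_n: int = 3) -> list[str]:
    """Scan each document line by line and return the highest-scoring lines.

    Stand-in scoring: number of content words a line shares with the question.
    """
    query_terms = tokens(question) - STOPWORDS
    scored = []
    for doc in documents:
        for line in doc.splitlines():
            line = line.strip()
            if not line:
                continue
            score = len(query_terms & tokens(line))
            if score > 0:
                scored.append((score, line))
    # sort() is stable, so lines with equal scores keep their document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in scored[:top_n]]


# Made-up documents standing in for pages returned by other search engines.
docs = [
    "Scotland is part of the United Kingdom.\n"
    "The population of Scotland is just over 5 million.",
    "Edinburgh is the capital of Scotland.",
]
print(extract_answers("what is the population of Scotland?", docs, top_n=1))
```

The point of the sketch is the unit of analysis: individual lines are ranked, not whole pages, which is why the displayed text can differ from the title and description shown by the originating engine.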
Does BrainBoost give a higher weight to certain sources? How about results from certain engines?
No, not at this time. All sources begin processing with an equal weight.
I’ve noticed that it matters if I don’t format my search like a question. Compare these two queries: "population of Scotland" vs. "what is the population of Scotland?" Is that done on purpose?
BrainBoost pays close attention to all words in the question. The types of words you use and the order in which you use them determine what classification, or algorithm, BrainBoost will use to answer your question. Whereas most search engines ignore words like ‘what’, ‘where’, ‘when’ and ‘how’, BrainBoost very much relies on them. In this case, the wording of the two questions resulted in two distinct classifications.
Sometimes I see repeated phrases being displayed. For example, for the query "What is BrainBoost", the following phrase is repeated several times:
-BrainBoost is a Question Answering search engine.-
This probably is not too big a deal, and in fact it may even be a good thing because it shows agreement, but what is your opinion about it?
We chose not to filter out answers that provide the same information in slightly different ways. Like you said, it really does help with identifying agreement towards a specific answer.
I read some helpful information you posted about BrainBoost in a thread on Search Guild. You wrote: “BrainBoost classifies incoming questions into distinct categories. Classification enables BrainBoost to predict what lexical properties the answer will most likely contain.” Can you expound on this? Do you classify searches based on the subject or topic of the search? Or do you parse the query to look for clues in the phrasing of the search? Or…?
It’s best to give an example: When asked “how long do cats live?” BrainBoost recognizes that the user is looking for sentences that quantify the answer in terms of years/months/weeks etc. Responding with an answer that talks about inches/feet/centimeters would not be very intelligent at all. BrainBoost has many dozens of these types of classifications, all of which help ensure suitable answers are returned.
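The cats example suggests a two-step idea: classify the question, then keep only candidate sentences that contain the kind of words that class predicts. The sketch below assumes just two classes and a prefix-based classifier; the class names, unit lists, and candidate sentences are all invented for illustration and do not reflect BrainBoost's actual categories.

```python
import re
from typing import Optional

# Hypothetical answer types, each with unit words an answer should contain.
ANSWER_TYPES = {
    "duration": {"year", "years", "month", "months", "week", "weeks", "day", "days"},
    "length": {"inch", "inches", "foot", "feet", "centimeter", "centimeters"},
}


def classify(question: str) -> Optional[str]:
    """Guess the expected answer type from the wording of the question."""
    q = question.lower().strip()
    if q.startswith(("how long do ", "how long does ")):
        return "duration"
    if q.startswith(("how long is ", "how tall ")):
        return "length"
    return None


def suitable(question: str, candidates: list[str]) -> list[str]:
    """Keep candidates that mention a unit matching the question's class."""
    answer_type = classify(question)
    if answer_type is None:
        return list(candidates)  # unknown class: do not filter
    expected = ANSWER_TYPES[answer_type]
    return [c for c in candidates if expected & set(re.findall(r"[a-z]+", c.lower()))]


# Made-up candidate sentences, for illustration only.
candidates = [
    "Indoor cats often live for 12 to 15 years.",
    "An adult cat is about 18 inches long.",
]
print(suitable("how long do cats live?", candidates))
```

Note that "how long" alone is ambiguous between a duration and a length, so the verb that follows is what decides the class in this sketch. That is one small instance of word order mattering.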
It seems like I hear very little about BrainBoost. Are you purposefully trying to keep a low profile? Or might that change in the future? I like BrainBoost and since it is so easy to use I think a lot of other people would like it too.
Yes, we have been trying to keep a low profile. It’s given us the luxury of time we needed to perfect our AnswerRank system.
A considerable amount of time was also spent on packaging AnswerRank technology into a licensable software component that can be ‘plugged into’ any existing keyword-based search system, allowing companies to add Question Answering to their existing in-house search.
What do you see as the current state of natural search engines on the web? Would you care to predict for us what the world of natural search will look like a couple years from now?
I think Natural Language question answering mixed with sophisticated personalization is the future of search.
Lastly, what is your favorite drink?
Triple Grande Latte
Assaf, thank you for your time. Is there anything else you would like to add?
Thanks for your time, Chris.