05.15.08

New Sidebar Widget

Posted in Clipping, Contextual, ImplicitWeb, NLP, Orchestr8, Uncategorized, Widgets at 9:02 am by elliot

I’ve just added a new sidebar widget to my blog: “Related Content”.

This is a demonstration of “contextual widgets” from my company’s AlchemyGrid service.

Contextual widgets use a custom-engineered facility for statistical topic keyword extraction from natural language text, which we’ve recently integrated into our products. If you’re familiar with the “Yahoo Term Extraction” API, our system does essentially the same sort of thing. Natural language processing is fun (and challenging) stuff. Here are a few notes regarding our implementation:

1. AlchemyGrid’s Term Extraction facility supports multiple languages (English, German, French, Italian, Spanish, and Russian!). This was an important requirement for us, to enable contextual content generation for non-English websites and blogs. Languages differ significantly in punctuation rules, word stemming, and other details. Hats off to our Term Extraction developers: you’ve done a great job ensuring good initial language coverage.

2. Our Term Extraction facility is entirely statistical and does not rely on a hard-coded lexicon. This enables it to extract contextually relevant topic keywords even when they are (a) new topics, (b) rarely used common nouns or people’s names, or (c) misspelled.
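
For the geeks in the audience, here is a rough feel for what “statistical, no lexicon” can mean. This is a minimal Python sketch, not AlchemyGrid’s actual algorithm: it uses a simple TF-IDF-style weighting, and the function names (`tokenize`, `extract_terms`) and the tiny background corpus are made up purely for illustration. The idea is that a term scores highly when it is frequent in the page but rare in a background collection, so a brand-new topic or an unusual name can surface without appearing in any word list.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase runs of letters (Unicode-aware, no digits)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def extract_terms(document, background_docs, top_n=5):
    """Rank terms by (count in document) * (rarity in background_docs).

    Hypothetical helper for illustration only. No stop-word list or
    dictionary is used; common words drop out because they occur in
    most background documents and so get a weight near zero.
    """
    term_counts = Counter(tokenize(document))

    # Document frequency: in how many background documents each term occurs.
    doc_freq = Counter()
    for bg in background_docs:
        doc_freq.update(set(tokenize(bg)))

    n_docs = len(background_docs)
    scores = {}
    for term, count in term_counts.items():
        # Smoothed inverse document frequency; unseen terms get the top weight.
        idf = math.log((1 + n_docs) / (1 + doc_freq[term]))
        scores[term] = count * idf

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


background = [
    "the cat sat on the mat",
    "the dog and the cat",
    "a walk in the park with the dog",
]
post = "the zeppelin flew over the park and the zeppelin landed"

# "zeppelin" ranks first: it is repeated in the post and absent from the background.
print(extract_terms(post, background, top_n=3))
```

Note that “the” appears three times in the sample post but scores zero, because it occurs in every background document. A real system would need a far larger background collection per language, plus the stemming and punctuation handling mentioned in note 1.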

We’ve just integrated Term Extraction into our Grid service, so there may be a few minor kinks to work out in the coming weeks, but overall we’re happy with the initial results. Contextual capability vastly expands the utility of AlchemyGrid widgets, as their content can now be automatically customized to relate to your content. This applies to *any* input-enabled widget in the grid (ALL widgets are contextual). Here’s another contextual example (a related Amazon book):

We’ll be enabling the other supported languages in the coming weeks, as well as rolling out some additional enhancements to our text processing algorithms (for the geeks in the audience: enhancements to our sentence boundary detector, inline punctuation processor, etc.).