05.15.08

New Sidebar Widget

Posted in Clipping, Contextual, ImplicitWeb, NLP, Orchestr8, Uncategorized, Widgets at 9:02 am by elliot

I’ve just added a new sidebar widget to my blog: “Related Content”.

This is a demonstration of “contextual widgets” from my company’s AlchemyGrid service.

Contextual widgets utilize a custom-engineered “statistical topic keyword extraction from Natural Language Text” facility we’ve recently integrated into our products. If you’re familiar with the “Yahoo Term Extraction” API, our system does essentially the same thing. Natural language processing is fun (and challenging) work. Here are a few notes regarding our implementation:

1. AlchemyGrid’s Term Extraction facility supports multiple languages (English, German, French, Italian, Spanish, and Russian!). This was an important requirement for us, to enable contextual content generation for non-English websites/blogs. There are significant differences between languages in terms of punctuation rules, word stemming, and other details. Hats off to our Term Extraction developers: you’ve done a great job ensuring good initial language coverage.

2. Our Term Extraction facility is entirely statistical and does not rely on a hard-coded lexicon. This enables it to extract contextually relevant topic keywords even when they’re (a) new topics, (b) rarely used common nouns/people-names, or (c) misspelled.
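To give a rough feel for what lexicon-free, statistical keyword extraction can look like, here is a minimal Python sketch. It is not AlchemyGrid's actual algorithm, which isn't described in this post. It simply scores each word by how often it appears in a document relative to a small background collection (a TF-IDF-style weighting). The function names and the sample texts are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased runs of letters; the pattern is Unicode-aware, so it is
    # not limited to English text. No dictionary or lexicon is consulted.
    return re.findall(r"[^\W\d_]+", text.lower())


def extract_keywords(document, background_docs, top_n=5):
    """Rank terms in `document` by a simple TF-IDF-style score."""
    term_counts = Counter(tokenize(document))

    # Document frequency: in how many background texts does each term occur?
    doc_freq = Counter()
    for text in background_docs:
        doc_freq.update(set(tokenize(text)))

    total = len(background_docs)
    scores = {}
    for term, count in term_counts.items():
        # Smoothed inverse document frequency: terms never seen in the
        # background (new topics, rare names, misspellings) score highest.
        idf = math.log((total + 1) / (doc_freq[term] + 1)) + 1.0
        scores[term] = count * idf

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    background = [
        "the web is a big place",
        "a browser uses the web",
        "in the news a story adds detail",
    ]
    post = "Leopard adds web clipping. Clipping in Leopard uses a mini browser."
    print(extract_keywords(post, background, top_n=2))
```

Because the score depends only on counts, a word that is frequent in the post but absent from the background text ranks highly whether or not it is a "known" word. That is the property the note above relies on for new topics and misspellings.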

We’ve just integrated Term Extraction into our Grid service, so there may be a few minor kinks to work out in the coming weeks — but overall we’re happy with the initial results. Contextual capability vastly expands the utility of AlchemyGrid widgets, as their content can now be automatically customized to relate to your content. This applies to *any* input-enabled widget in the grid (ALL widgets are contextual). Here’s another contextual example (a related Amazon book):

We’ll be enabling the other supported languages in the coming weeks, as well as rolling out some additional enhancements to our text processing algorithms (for the geeks in the audience, enhancements to our sentence boundary detector, inline punctuation processor, etc.).

10.16.07

Web Clipping in OS X Leopard

Posted in Clipping, Mashups, Orchestr8 at 11:05 am by elliot

TechCrunch has posted an interesting write-up on a new feature in OS X Leopard: Web Clipping.

It’s great seeing innovative companies like Apple embracing web clipping technology. I’m a big believer in the “cut-and-paste web” and my company has been working for some time now to make this concept a reality.

I’ve blogged on this subject previously, discussing some of the technical hurdles that must be overcome to reliably clip arbitrary web content. Regarding clipping in Leopard: Apple’s solution is somewhat limited in that it only displays clipped content in a mini-browser; it isn’t capable of inserting clipped content into other web pages or applications.

For those interested in seeing mouse-based clipping of web content in action, check out any of these screencasts:

Clipping a 10-day Weather Forecast and Inserting It Into Another Webpage

Clipping Search Results from Yahoo News and Integrating Into Google Search Results

A tutorial on how to clip web content is also available here.

08.23.07

The Cut-and-Paste Web

Posted in Clipping, ImplicitWeb at 1:59 pm by elliot

A few days ago I noticed a great quote from Steve Rubel @ Micro Persuasion:

“Imagine for a moment that you can take any piece of online content that you care about – a news feed, an image, a box score, multimedia, a stream of updates from your friends – and easily pin it wherever you want.”

[...snip...]

“This isn’t some far off vision. It’s the near-term future. It’s the coming era of the Cut and Paste Web.”

It’s exciting to see discussion on this topic, as this is something my company has been working towards for some time now. Our AlchemyPoint mashup platform enables the visual cutting and pasting of web content, even dynamic content (like search results). “Clipped” content can be inserted anywhere — into your home page or blog, Google results pages, CNN articles, etc.

Below are several screencasts that illustrate cut-and-paste clipping of web content:

Adding Yahoo Image Search Results into Google Search Results

Integrating the Google News Top-Story into the Rocky Mountain News Homepage

These screencasts illustrate two things:

  1. Grabbing content from a page via the mouse, and storing it in a “Clipboard” for later reuse.
  2. Inserting content into a new page, selecting from the available “Clipboard” of previously grabbed content.

Using this methodology one can clip any arbitrary piece of web content (images, articles, headlines, blog posts, etc.) and insert it into any other web page. It’s worth noting that this process occurs almost entirely using the mouse; the only keyboard interaction required involves typing out a name to identify the clipped content.
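The two-step grab-then-insert workflow described above can be pictured as a small named store of clips. The Python sketch below is purely illustrative: the `Clip` and `Clipboard` classes, their methods, and the placeholder-comment convention are hypothetical and are not AlchemyPoint's API.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    name: str        # the name typed by the user when clipping
    source_url: str  # page the content was grabbed from
    html: str        # the grabbed markup


@dataclass
class Clipboard:
    clips: dict = field(default_factory=dict)

    def grab(self, name, source_url, html):
        # Step 1: store grabbed content under a user-supplied name.
        self.clips[name] = Clip(name, source_url, html)

    def names(self):
        # The list a user would pick from when inserting content.
        return sorted(self.clips)

    def paste(self, name, page_html, marker):
        # Step 2: insert a stored clip where `marker` appears in the page.
        return page_html.replace(marker, self.clips[name].html, 1)


if __name__ == "__main__":
    board = Clipboard()
    board.grab("weather", "http://example.com/forecast", "<div>Sunny</div>")
    page = "<body><!--clip-here--></body>"
    print(board.paste("weather", page, "<!--clip-here-->"))
```

A real implementation would also need to rewrite the clipped markup before pasting it.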

On a technical level, cutting and pasting web content is difficult; one cannot simply grab and re-insert raw HTML fragments into web pages. There are a number of hurdles to overcome in order to perform these types of manipulations reliably. A few items that must be considered include: relative URL links, CSS content, JavaScript, name/class/id conflicts between a web page and any pasted content, character set differences, how remote servers deal with HTTP Referer headers, etc. We’ve had a good time working out solutions to these issues and others not mentioned above.
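Two of the hurdles listed above, relative URLs and id conflicts, are easy to illustrate. The Python sketch below uses only the standard library to resolve `href`/`src` values against the source page's URL and to prefix `id` attributes so they are less likely to collide with ids on the destination page. It is a simplified illustration under those assumptions, not the approach our platform takes; the `ClipRewriter` class, the `rewrite_clip` function, and the `clip-` prefix are invented for this example.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

URL_ATTRS = {"href", "src"}  # attributes treated as URLs in this sketch


class ClipRewriter(HTMLParser):
    """Re-emit an HTML fragment with absolute URLs and prefixed ids."""

    def __init__(self, base_url, id_prefix):
        # Keep entity references as-is so text passes through unchanged.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.id_prefix = id_prefix
        self.out = []

    def _render(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute such as "disabled"
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = urljoin(self.base_url, value)
            elif name == "id":
                value = self.id_prefix + value
            # The parser unescapes attribute values, so escape them again.
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<" + " ".join(parts) + closing + ">"

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " /"))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)


def rewrite_clip(html, base_url, id_prefix="clip-"):
    rewriter = ClipRewriter(base_url, id_prefix)
    rewriter.feed(html)
    rewriter.close()
    return "".join(rewriter.out)


if __name__ == "__main__":
    fragment = '<div id="main"><a href="/news/1">Story</a><img src="img/a.png"></div>'
    print(rewrite_clip(fragment, "http://example.com/section/page.html"))
```

Note what this leaves out: CSS selectors and scripts that refer to the original ids would break after prefixing, URLs inside stylesheets are untouched, and character sets and Referer handling are not addressed at all.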

For those interested in playing around with cutting-and-pasting web content, we’re going to be opening up invitations to our AlchemyPoint Technology Preview in the next few weeks. This preview supports the ability to perform all sorts of web manipulations, cut-and-paste of web content being just one example.