05.20.08

Automated Content Alerting

Posted in Coding, ImplicitWeb, Orchestr8, Protocols at 10:15 am by elliot

Too many websites. Too little syndication.

This is a theme that’s been bouncing around my subconscious for months; something I’ve blogged about in the past.

But really, syndication is only part of the problem. Syndication normalizes data, and makes it readily accessible to 3rd parties — but it doesn’t push data where you want it. It’s a pull-focused technology.

For push, we need some sort of alerting capability.

Recently, I’ve been in the habit of checking delegate counters for the 2008 Presidential Election Primary Races. I check them daily to see updates to pledged delegates, super-delegates, etc.

Checking for updates doesn’t take a significant amount of time, but it’s yet another activity that can conceivably interrupt my workflow. Automating it would be a much better approach.

Recently, my company added an Alerts capability to our AlchemyGrid beta service. You can create an alert based on any sort of web content (syndicated or not). Alerts can travel over many communication channels (Email, AIM, SMS, Twitter, etc.) and support many customization options (how often to check for updates, what’s considered a “unique update”, etc.).
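For the geeks: here is a minimal Python sketch of the "unique update" idea, not AlchemyGrid's actual implementation. The `Alert` class, `fingerprint` helper, and the `fetch`/`notify` callables are all hypothetical names chosen for illustration. It assumes an update counts as "unique" when the whitespace-normalized content changes.

```python
import hashlib
from typing import Callable, Optional


def fingerprint(content: str) -> str:
    """Hash whitespace-normalized content so reformatting alone is not an update."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Alert:
    """Hypothetical alert: remembers the last fingerprint and notifies on change."""

    def __init__(self, fetch: Callable[[], str], notify: Callable[[str], None]) -> None:
        self.fetch = fetch      # returns the current content being watched
        self.notify = notify    # delivers the update (email, SMS, etc.)
        self.last_seen: Optional[str] = None

    def check(self) -> bool:
        """Fetch once; notify and return True only if the content changed."""
        content = self.fetch()
        digest = fingerprint(content)
        if digest == self.last_seen:
            return False
        first_run = self.last_seen is None
        self.last_seen = digest
        if first_run:
            return False  # the first check only records a baseline
        self.notify(content)
        return True
```

The "how often to check" setting would then be a scheduler or a simple loop that calls `check()` and sleeps for the chosen interval.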

I used this new service to create an Alert that monitors delegate counts for the Democratic Presidential candidates (the GOP has already chosen its candidate). Any updates to delegate counts are automatically posted to the Twitter account “demdelegate08”. You can see the Twitter feed (and follow it if you wish) here:

http://twitter.com/demdelegate08

If you aren’t a Twitter user, and are interested in getting delegate updates via Email, SMS, or AIM, you can “Subscribe to / Follow” my Alert here:

DCW Delegate Alert

A few implementation notes, for the geeks out there: We’re using a custom-engineered AIM, Email, and SMS backend for our Alerts implementation. We’re interacting with external services directly at the Protocol/API level, not piggy-backing off 3rd-party gateways or using other unreliable modes of communication.

05.15.08

New Sidebar Widget

Posted in Clipping, Contextual, ImplicitWeb, NLP, Orchestr8, Uncategorized, Widgets at 9:02 am by elliot

I’ve just added a new sidebar widget to my blog: “Related Content”.

This is a demonstration of “contextual widgets” from my company’s AlchemyGrid service.

Contextual widgets utilize a custom-engineered “statistical topic keyword extraction from Natural Language Text” facility we’ve recently integrated into our products. If you’re familiar with the “Yahoo Term Extraction” API, our system is essentially doing the same sort of thing. Natural language processing is fun (and challenging) stuff. Here are a few notes regarding our implementation:

1. AlchemyGrid’s Term Extraction facility supports multiple languages (English, German, French, Italian, Spanish, and Russian!). This was an important requirement for us, to enable contextual content generation for non-English websites/blogs. There are significant differences between languages in terms of punctuation rules, word stemming, and other details. Hats off to our Term Extraction developers: you’ve done a great job ensuring good initial language coverage.

2. Our Term Extraction facility is entirely statistical in its basis, not using a hard-coded lexicon, etc. This enables it to extract contextually-relevant topic keywords even when they’re (a) new topics, (b) rarely used common nouns/people-names, or (c) misspelled.
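To illustrate what "statistical, no hard-coded lexicon" can mean, here is a small TF-IDF-style sketch in Python. It is a stand-in technique, not AlchemyGrid's algorithm: the function names and the idea of scoring against a `background` set of documents are assumptions made for this example. Terms score highly when they are frequent in the document but rare in the background, so new or misspelled words can still surface.

```python
import math
import re
from collections import Counter
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into runs of Unicode letters (no dictionary needed)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def top_terms(document: str, background: Iterable[str], k: int = 3) -> List[str]:
    """Rank the document's terms by term frequency times inverse document frequency."""
    docs = [set(tokenize(d)) for d in background]
    n = len(docs)
    counts = Counter(tokenize(document))

    def score(term: str) -> float:
        df = sum(1 for d in docs if term in d)  # background docs containing the term
        return counts[term] * math.log((1 + n) / (1 + df))

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(counts, key=lambda t: (-score(t), t))
    return ranked[:k]
```

A word that appears in every background document gets a score of zero, which is how common function words drop out without a stop-word list.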

We’ve just integrated Term Extraction into our Grid service, so there may be a few minor kinks to work out in the coming weeks — but overall we’re happy with the initial results. Contextual capability vastly expands the utility of AlchemyGrid widgets, as their content can now be automatically customized to relate to your content. This applies to *any* input-enabled widget in the grid (ALL widgets are contextual). Here’s another contextual example (a related Amazon book):

We’ll be enabling the other supported languages in the coming weeks, as well as rolling out some additional enhancements to our text processing algorithms (for the geeks in the audience, enhancements to our sentence boundary detector, inline punctuation processor, etc.).

02.14.08

Automated Content Monitoring

Posted in ImplicitWeb, Orchestr8, Scraping at 12:54 pm by elliot

I use the Internet constantly during the course of my everyday life — looking up telephone numbers, reading restaurant reviews, etc.

One task I’m frequently engaged in is Content Monitoring; that is, checking (and re-checking) websites of interest for updates and new information.

Now wait a sec — wasn’t syndication (RSS, ATOM, etc.) supposed to do this for me? Sure, if a website actually exposes data feeds. If they don’t, you’re mostly out of luck.

Alas, there are many websites out there with no form of syndicated access. This is just plain irritating.

Luckily, tools are starting to appear that can eliminate this irritant. My company released a new service earlier this week that makes great strides toward solving this problem.

This new service performs Automated Content Monitoring: a way of programmatically monitoring information sources that currently lack syndication features.

I’m a big fan of leveraging automated techniques to optimize my daily workflow; many of my previous blog posts have focused on this topic. Leveraging algorithms to improve efficiency, access to information, and integration of data is a central theme of the Implicit Web, which is both a personal interest of mine and a business interest of my company. Automated Content Monitoring fits perfectly within this arena.

I’m currently using our new Automated Content Monitoring service to track a variety of information sources: new events at my preferred concert venues, special deals offered by local radio stations, etc. Monitoring each of these sources automatically frees up my time for more useful activities, and notifies me of website updates far sooner than if I were checking manually.
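As a rough illustration of monitoring a page that has no feed, the Python sketch below compares two HTML snapshots and reports text that is new. It is only a sketch under stated assumptions, not how our service works: `TextExtractor`, `visible_lines`, and `new_lines` are hypothetical names, and fetching the snapshots is left out.

```python
import difflib
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> bodies."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and not self._skip_depth:
            self.lines.append(text)


def visible_lines(html: str) -> List[str]:
    """Return the page's visible text as a list of normalized chunks."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines


def new_lines(old_html: str, new_html: str) -> List[str]:
    """Return text chunks present in the new snapshot but not in the old one."""
    diff = difflib.ndiff(visible_lines(old_html), visible_lines(new_html))
    return [line[2:] for line in diff if line.startswith("+ ")]
```

Comparing extracted text instead of raw HTML means markup-only changes (a new stylesheet or tracking script) do not trigger a notification.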

We’ll likely see increased uptake of Automated Content Monitoring solutions in the near future, as more individuals succumb to the dreaded Attention Crash.

12.19.07

A Car In Every Garage..

Posted in API, Orchestr8, Scraping at 12:42 pm by elliot

And a Content Scraping API On Every Desktop :)

11.02.07

Attention Profiling & APML

Posted in APML, Attention, Orchestr8, SemanticWeb at 9:02 am by elliot

APML, a new standardized format for expressing attention preferences, has been receiving a lot of buzz in recent weeks. Mashable covers the topic here, Brad Feld here, Jeff Nolan here, and Read/WriteWeb here.

It’s great to see an increasing number of folks getting behind the concept of ‘standardized structured attention’ and embracing this emerging standard.

Attention has always been a topic of interest to me, something I’ve blogged about on a number of occasions. At my company Orchestr8, we’ve been working on solutions that can automatically capture the ‘context’ of a user’s attention and leverage this data in various ways. We’re currently adding APML support to the next version of our software, which should provide for some really interesting capabilities.

The thing that excites me about APML is that it’s a relatively straightforward standard (far, far simpler than the many RSS/ATOM variants). This will ease adoption and simplify portability of attention preference data across many products / services. Since APML expresses attention in a relatively abstract way, multiple products (even product domains, for instance Web versus Email) can leverage the same attention data.

Additional tech. note: Thank you, APML authors, for strictly standardizing the date format in the APML spec (ISO8601). If only we could have been so lucky with RSS/ATOM. Now let’s hope people actually stick to the date formats!
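To show why a single date format matters, here is a short Python sketch of the kind of fallback parsing a feed consumer ends up writing. The `parse_feed_date` helper is a hypothetical name, and treating a date without a timezone as UTC is an assumption. RSS 2.0 specifies RFC 822 dates while ISO 8601 strings parse directly; note that `datetime.fromisoformat` accepts only a subset of ISO 8601 on Python versions before 3.11.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_feed_date(value: str) -> datetime:
    """Parse an ISO 8601 or RFC 822 date string into a timezone-aware UTC datetime."""
    value = value.strip()
    try:
        parsed = datetime.fromisoformat(value)   # ISO 8601 style
    except ValueError:
        parsed = parsedate_to_datetime(value)    # RFC 822 style, as in RSS 2.0
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed.astimezone(timezone.utc)


iso = parse_feed_date("2007-11-02T09:02:00+00:00")
rfc822 = parse_feed_date("Fri, 02 Nov 2007 09:02:00 GMT")
print(iso == rfc822)
```

With one mandated format, the `try`/`except` fallback (and the guesswork about which variant a publisher used) goes away.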

10.16.07

Web Clipping in OSX Leopard

Posted in Clipping, Mashups, Orchestr8 at 11:05 am by elliot

TechCrunch has posted an interesting write-up on a new feature in OS X Leopard: Web Clipping.

It’s great seeing innovative companies like Apple embracing web clipping technology. I’m a big believer in the “cut-and-paste web” and my company has been working for some time now to make this concept a reality.

I’ve blogged on this subject previously, discussing some of the technical hurdles that must be overcome to reliably clip arbitrary web content. Regarding clipping in Leopard: Apple’s solution is somewhat limited in that it only displays clipped content in a mini-browser; it isn’t capable of inserting clipped content into other web pages or applications.

For those interested in seeing mouse-based clipping of web content in action, check out any of these screencasts:

Clipping a 10-day Weather Forecast and Inserting It Into Another Webpage

Clipping Search Results from Yahoo News and Integrating Into Google Search Results

A tutorial on how to clip web content is also available here.

09.17.07

Company Website / Tech. Preview Launch

Posted in ImplicitWeb, Mashups, Orchestr8 at 11:49 am by elliot

My company (Orchestr8) launched its corporate website today:

We’re letting a select number of individuals take a “first look” at our AlchemyPoint platform in the form of a Technology Preview. We’re looking for user feedback to help us expand and improve the AlchemyPoint system before its final release.

Access to the Technology Preview is being provided in a staged fashion with priority given to users who apply first, so sign up today!

Please note: The AlchemyPoint Technology Preview is pre-Beta software. It may contain software bugs or other limitations.