05.16.12

Google Knowledge Graph: A Tipping Point

Posted in Contextual, News, SemanticWeb at 2:05 pm by elliot

The semantic web / NLProc world is abuzz today with the news of Google’s Knowledge Graph.

I’m thrilled and fascinated by Google’s work in this arena.  They’re taking a true “web scale” approach towards knowledge extraction.  My company (AlchemyAPI) has been working in this area intensely over the past year, examining large swaths of the web (billions of pages), performing language / structural analysis, and extracting reams of factual and ontological data.  We’re using gathered data for different purposes than Google (we’re enhancing our semantic analysis service, AlchemyAPI — whereas Google is improving the search experience for their customers), but we are both using some analogous approaches to find and extract this sort of information.

What’s interesting to me, however, is how this is really a sort of tipping point for Google. We’re witnessing their evolution from “search engine” to “knowledge engine”, something many have expected for years — but which carries a number of consequences (intended and unintended).

Google has always maintained a careful balance of risk/reward with content owners/creators. They provide websites with referral traffic (web page hits), while performing what some may argue is wholesale copyright infringement (copying entire web pages, images, even screenshots of web pages).

This has historically worked out quite well for Google. Website owners get referral traffic, and thus can show ads, sell subscriptions, and get paid. Google copies their content (showing snippets, images, etc. on Google.com properties) to make this virtuous cycle happen.

Stuff like the “Knowledge Graph” potentially torpedoes this equation. Instead of pointing users to the web page that contains the answer to their search, Google’s semantic algorithms can directly display an answer, without the user ever leaving Google.com.

Say you’re a writer for About.com, spending your time gathering factual information on your topic of choice (e.g., “Greek Philosophers”). You carefully curate your About.com page, and make money on ads shown to users who read your content (many of whom are referred from Google.com).

If Google can directly extract the “essence” of these pages (the actual entities and facts contained within), and show this information to users — what incentive do these same individuals have to visit your About.com page? And where does this leave content creators?

The risk here isn’t necessarily a legal one: there’s quite a bit of established precedent holding that “facts” cannot be easily owned or copyrighted. But sites could start blocking Google’s crawlers. No one is likely to do this anytime soon, as Google’s semantic features are only just getting started and “referral traffic” is still the biggest game in town. But what does the future hold?
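For the curious, blocking a crawler is mechanically simple: it usually comes down to a couple of lines in a site's robots.txt. Here is a minimal sketch using Python's standard `urllib.robotparser` to check what such a file would allow. The robots.txt contents, the `can_crawl` helper, and the example.com URL are all made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn away Google's crawler, admit everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/greek-philosophers"  # illustrative URL only
    print("Googlebot allowed:", can_crawl("Googlebot", page))
    print("SomeOtherBot allowed:", can_crawl("SomeOtherBot", page))
```

Keep in mind that robots.txt is only a request that well-behaved crawlers choose to honor, which is part of why the decision to block is a business question rather than a technical one.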

I’m guessing Google will work out these sorts of bumps in the road on their path towards becoming a true Knowledge Engine. But it’s an interesting point to think about.

PS: Google Squared could be argued to be an earlier “tipping point”, but it was more of an experiment. The Google Knowledge Graph represents a true, web-scale commercial effort in this arena. A real tipping point.

05.11.12

3d Visualization of Semantic Data using MS-Kinect

Posted in Coding, NLP, Personal, SemanticWeb at 1:02 pm by elliot

Here’s a fun little demo app a co-worker and I built:

This application leverages the MS Kinect to manipulate 3d visualizations of social media data. The application tracks 3d motion of a person’s hand, using it as a virtual mouse cursor.
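To give a rough idea of what “hand as a virtual mouse cursor” involves, here is a small sketch that maps a tracked hand position onto screen pixels. It does not use the Kinect SDK and is not the demo's actual code; the `HandPosition` type, the `hand_to_cursor` function, the coordinate convention (metres relative to a neutral point in front of the user), and the default `reach` value are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class HandPosition:
    """Hypothetical tracked hand position, in metres from a neutral point."""
    x: float  # positive = user's hand moves right
    y: float  # positive = hand moves up
    z: float  # distance from the sensor (unused for the 2D cursor)


def _clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def hand_to_cursor(hand: HandPosition, screen_w: int = 1920,
                   screen_h: int = 1080, reach: float = 0.4) -> tuple[int, int]:
    """Map hand x/y within +/- reach metres onto pixel coordinates.

    Positions beyond the reach are clamped to the screen edge. Screen y
    grows downward, so the vertical axis is flipped.
    """
    x = _clamp(hand.x, -reach, reach)
    y = _clamp(hand.y, -reach, reach)
    px = round((x + reach) / (2 * reach) * (screen_w - 1))
    py = round((reach - y) / (2 * reach) * (screen_h - 1))
    return px, py


if __name__ == "__main__":
    print(hand_to_cursor(HandPosition(0.1, 0.2, 1.5)))
```

A real tracker would also need smoothing to keep the cursor from jittering, which this sketch leaves out.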

The social media data was mined from tens of millions of news articles and blog posts over a period of 1+ month, using natural language processing algorithms to analyze article/blog contents, identify named entities and trends, and track momentum over time.

Info on this app:

  • real-time 3d visualization of social media data, represented as a force-directed graph.
  • social media data was mined from tens of millions of news articles and blog posts over a 1+ month period.
  • news / blog data analyzed using natural language processing (NLP) algorithms including: named entity extraction, keyword extraction, concept tagging, sentiment extraction.
  • high-performance temporal data-store enables visualization of connections between named entities (e.g., “Nicolas Sarkozy -> Francois Hollande”)
  • system tracks billions of data-points (persons, companies, organizations, …) for tens of millions of pieces of content.
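
As a toy illustration of the “connections between named entities” and “momentum over time” ideas above, the sketch below counts how often two entities appear in the same article on a given day, then compares one day against the previous one. It is not the actual data-store or pipeline: the sample articles, the `edges_by_day` and `momentum` functions, and the definition of momentum as a simple day-over-day difference are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: (day number, entities found in one article), roughly
# what a named entity extraction step might emit per article.
ARTICLES = [
    (1, ["Nicolas Sarkozy", "Francois Hollande"]),
    (1, ["Nicolas Sarkozy", "Angela Merkel"]),
    (2, ["Nicolas Sarkozy", "Francois Hollande"]),
    (2, ["Francois Hollande", "Nicolas Sarkozy", "Angela Merkel"]),
    (2, ["Francois Hollande", "Nicolas Sarkozy"]),
]


def edges_by_day(articles):
    """Count entity pairs that co-occur in an article, grouped by day.

    Each pair is stored as an alphabetically sorted tuple so that
    (A, B) and (B, A) land on the same graph edge.
    """
    counts = {}
    for day, entities in articles:
        day_counts = counts.setdefault(day, Counter())
        for pair in combinations(sorted(set(entities)), 2):
            day_counts[pair] += 1
    return counts


def momentum(counts, pair, day):
    """Day-over-day change in how often a pair co-occurs (assumed metric)."""
    today = counts.get(day, Counter())[pair]
    yesterday = counts.get(day - 1, Counter())[pair]
    return today - yesterday


if __name__ == "__main__":
    counts = edges_by_day(ARTICLES)
    pair = ("Francois Hollande", "Nicolas Sarkozy")
    print(pair, "momentum on day 2:", momentum(counts, pair, 2))
```

The per-day edge counts are the kind of thing a force-directed layout could consume, with heavier edges pulling their two entities closer together.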

This is an example of a “20% time” employee project at my company, AlchemyAPI. We do fun projects like this to spur the imagination and as a creative diversion. Other projects (which I’ll get around to posting at some point) involve speech recognition, robots, and other geektacular stuff.

05.18.10

Web 3.0 & Disruptive Technology

Posted in Denver, SemanticWeb at 11:11 am by elliot

I’ve always been fascinated by data — both of the companies I’ve founded have addressed aspects of the “data overload” problem. The first, MimeStar, developed NIDS (Network Intrusion Detection System) technology that analyzed gigabits of network traffic every second, reconstructing every IP frame, TCP session, and application-layer protocol stream — looking for computer intrusions and other inappropriate activity. MimeStar was acquired in early 2000 and our products are still protecting government and corporate networks 10 years later. NIDS is fascinating technology, reducing massive packet flows down to intelligible event/activity streams & security alerts.

My present company builds natural language processing (computational linguistics) technology to make sense of the huge quantities of unstructured text residing across the web and within company data warehouses. We’re helping build the semantic web, by “bootstrapping” unstructured content into a form that is understandable by machines. NLP is an exciting space, with real disruption potential. It’s becoming a critical technology for Semantic & Web 3.0 applications/services.

What’s that? You haven’t heard of the Semantic Web? Check out this fantastic video, created by Kate Ray of NYU. Her short documentary does a great job of summing up many of the drivers behind the Semantic Web (such as data overload), and touches upon many of the future applications of this technology.

Web 3.0 from Kate Ray on Vimeo.

If disruptive innovation, artificial intelligence, and Web 3.0 are your bread-and-butter, AlchemyAPI is currently hiring. We’re based in Denver, CO and are growing rapidly. Join our team and help build the next generation of semantic technology!

01.04.10

Colorado Technology & Entrepreneurship

Posted in Boulder, Colorado, Companies, Denver at 1:02 pm by elliot

Colorado has a truly vibrant entrepreneurial ecosystem.  Everywhere you look, new startup companies are being formed to solve interesting technology and clean energy problems.  I cannot stress enough how true this is: in my neighborhood, on my street, there are at least three entrepreneurs involved with startup companies.  Truly amazing.

Some talented folks put together a video that details aspects of the startup / technology scene in Boulder, a close neighbor of Denver, CO.  Boulder is a fantastic place (my company has a number of customers in Boulder, so I’m up there quite often).  If you’re interested in what it’s like to work in a technology startup in Boulder, check out this video:


If you’re interested in forming a technology / clean-tech startup and are looking to plant some roots, check out Colorado.  Denver, and its northern cousin, Boulder, are fantastic places to run a startup.

03.27.09

AlchemySnap - OCR+Photo Search for T-Mobile G1

Posted in API, Coding, Contextual, NLP, Technology, Twitter at 11:56 am by elliot

Here’s a demo app I created for the T-Mobile G1, to show off my company’s AlchemyAPI image / text mining infrastructure service.

Watch the video for more info:

01.20.09

Jan. 20th, 2009

Posted in Uncategorized at 4:05 pm by elliot

43rd peaceful transfer of power.  44th president.  democracy in action.

11.04.08

It’s November 4th..

Posted in Colorado, Defrag, Uncategorized at 10:30 am by elliot

Go Vote!

- posted from the Defrag ‘08 conference

10.27.08

American President

Posted in Colorado, Denver at 9:56 am by elliot

Colorado has been buzzing with political activity this year: the Democratic National Convention, rallies, protests, volunteers knocking on doors — you name it.

Yesterday my wife and I went to an Obama rally in Denver.  We were two of around 100,000 folks in attendance.

Regardless of your political persuasion, stuff like this should make you proud.  People taking part in the political process at this level is just amazing.

10.20.08

Colorado Early Voting Starts Today

Posted in Colorado at 8:36 am by elliot

Early voting in Colorado starts today!  To find your closest polling location go here.

Get out and vote!

09.22.08

I want an OmniGlobe!

Posted in Colorado, Technology at 7:37 am by elliot

OK, the OmniGlobe is just super-cool (and made by a Colorado company! — go Colorado!):

(right-click above and choose “Play” to watch the OmniGlobe in action, or just click here)
