Thursday, June 18, 2009

The Meaning of "semantic"

Going through the exhibit hall of the Semantic Technologies conference, one quickly notices that there are two types of vendors:

  1. Providers of middleware and tools that leverage Semantic Web standards (RDF, RDFS, OWL and SPARQL) to support a variety of business applications. This is the area where TopQuadrant plays.
  2. Providers of search and text mining products.
The word "semantic" is used when talking about both types of products, and the reason is quite clear. Before the W3C coined the term "Semantic Web", the word "semantics" was used to describe "smart software" capable of extracting some meaning from text. With the development of the Semantic Web standards, it is increasingly used to describe the semantics (schemas) of data, to support integration across different sources and formats (databases, XML, spreadsheets, etc.), and to support new, model-driven ways of developing applications.

I wonder if this creates confusion. In fact, I know it does. At a recent customer meeting, we gave a two-hour product presentation. There were many good questions afterward. I did not think there was any confusion until one of the attendees asked if our software had multi-lingual support. I explained that yes, language-specific labels can be provided and information can then be displayed in a selected language. He looked puzzled, thought a little and then said, "I am not talking about labels. Different languages have different semantics, like sentence structure, etc. How do you address that?"

I explained that TopBraid Suite does not directly provide text extraction. Instead, we integrate with software that has these capabilities. This can be a product like Calais, which provides the extracted concepts directly in RDF, or a product that provides results in XML, which we can then convert to RDF in an automated way. Integration with text mining software makes it possible to bring together (and, consequently, query) structured and unstructured information using a common RDF infrastructure.
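
As a rough illustration of what converting XML results to RDF can look like, here is a minimal Python sketch. It is not how TopBraid Suite performs the conversion. The XML layout, the example.org namespace, the mentionedIn property and the function name xml_to_ntriples are all assumptions made up for this example; a real text mining product will have its own output format.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of a text mining tool. The element and attribute
# names are invented for this sketch and do not match any real product.
SAMPLE_XML = """
<entities document="http://example.org/doc/42">
  <entity type="Company" id="c1">Example Corp</entity>
  <entity type="City" id="l1">Springfield</entity>
</entities>
"""

EX = "http://example.org/"  # assumed namespace for the generated resources
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def xml_to_ntriples(xml_text):
    """Turn each <entity> element into three N-Triples statements."""
    root = ET.fromstring(xml_text.strip())
    document = root.get("document")
    triples = []
    for entity in root.findall("entity"):
        subject = f"<{EX}entity/{entity.get('id')}>"
        label = escape_literal((entity.text or "").strip())
        # The entity's class, its human-readable label, and a link back
        # to the source document (mentionedIn is a made-up property).
        triples.append(f"{subject} <{RDF_TYPE}> <{EX}{entity.get('type')}> .")
        triples.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
        triples.append(f"{subject} <{EX}mentionedIn> <{document}> .")
    return triples


if __name__ == "__main__":
    for line in xml_to_ntriples(SAMPLE_XML):
        print(line)
```

Once the extracted concepts are expressed as triples like these, they can be loaded into the same RDF store as data coming from databases or spreadsheets and queried together with SPARQL.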

With this, I wonder whether there should be two very distinct tracks at the conference. Is there some other space where these two different streams of technology come together? If so, can the intersection be better explained and positioned?


This is a blog by TopQuadrant, developers of the TopBraid Suite, created to support the pursuit of our ongoing mission - to explode strange semantic myths, to seek out new models that support a new generation of dynamic business applications, to boldly integrate data that no one has integrated before.