Friday, January 22, 2010

How to: create and run a semantic web service

An updated version of this blog post is now available as a chapter in the TopBraid Application Development Quickstart Guide (pdf).

When you know how to define a new function and create a SPARQLMotion script in TopBraid Composer, you can put these two skills together to create a RESTful web service that can be called from any computer with HTTP access to the server running your web service. In a production environment where many people and systems might call your web service, TopBraid Live would be the best way to host it, but for developing and testing you can use the HTTP server built into TopBraid Composer Maestro Edition. Today we're going to see how to create a web service that searches the database about the Kennedy family included with TopBraid Composer and returns RDF/XML that lists everyone with a particular first name.

But first, what makes this a "semantic" web service? TopBraid Composer puts all the power of semantic web technology, such as SPARQL, OWL, and more, behind your development efforts, and it lets you read and write both semantic web data formats and more traditional ones. By delivering RDF, your web service can plug into other semantic web applications, although you're certainly not limited to RDF as a delivery format—you can deliver XML, a spreadsheet, or even HTML, so that a browser hitting the URI you designate for your service would actually be displaying a dynamically generated web page.

If you've never defined your own functions or created a SPARQLMotion script in TopBraid Composer, you'll want to review the earlier entries How to: write new SPARQL functions with SPIN and How to: create and run a SPARQLMotion script before continuing.

Creating your script

Start by creating a SPARQLMotion File, but on the Create SPARQLMotion File dialog box, remember to check the "Script will declare (Web) Services or Functions (.sms extension)" checkbox. If you name your file ws1, you'll see that TopBraid Composer then saves it with the full name ws1.sms.n3.

Once you've created this file, turning it into a web service script involves two steps:

  1. Defining the function that the web service URI will call.
  2. Defining the script that will be triggered by the function call.

Defining the function to call

Create a new subclass of spin:Functions and name it searchKennedys. Add an sp:arg1 argument as a spin:constraint value using "Create from SPIN template" (as described in How to: write new SPARQL functions with SPIN) with a predicate value of sp:arg1. For the argument's comment field, enter "String to search for." Also, set its valueType to xsd:string so that the function receiving the passed argument value knows to treat it as a string and not as a URI. With that, you'll be finished configuring your spl:Argument and can click the OK button.

A given SPARQLMotion script can have several possible endpoints that return different variations on how the data is processed, so identifying a set of script modules to run means identifying a specific endpoint in a script. On the class form for your new searchKennedys function, set the sm:returnModule property by clicking the white triangle to display its context menu and then selecting "Create and add..." You create this return module as you add it because you haven't created your script yet, so there's no existing endpoint module to point to.

On the Create and add dialog box, pick sml:ReturnRDF as the first module to add, as shown below. (It's a subclass of ExportToRemoteModules because your web service will return its results to a remote caller; you'd pick something from ExportToLocalModules if it were going to save the results in a local file.) Name your new module ReturnSearchResults.

Click the OK button.

Creating the SPARQLMotion script

Next, select "Edit SPARQLMotion script" from the Scripts menu. You'll see that two modules have already been added to your new script: your ReturnSearchResults module and an Argument module to represent the arg1 argument being passed to your script.

Note the gray "arg1" with a little arrow on a circle in the lower-right of the Argument module, which shows the name of the variable that's being set and passed along from it. Your script will use this variable to specify what to search for in the Kennedy family data.

The next step is to add something to the script to read the data that it will search. Drag an Import RDF from Workspace icon from the Import from Local section of the palette onto the workspace and name it GetKennedysData. To configure it, double-click its icon, set its sml:sourceFilePath property to /TopBraid/Examples/kennedys.owl, and then click the Edit GetKennedysData dialog box's Close button.

Now we'll add the module that searches through the data and extracts the subset that we need. Drag an Apply Construct module from the palette's RDF Processing section onto the workspace and name it SelectData. Set its sml:replace property to True so that it only passes along the selected data and not the input data as well. Set the sml:constructQuery value to the following:

PREFIX k: <http://topbraid.org/examples/kennedys#>
CONSTRUCT {
?s k:firstName ?first .
?s k:lastName ?last .
}
WHERE {
?s k:firstName ?first .
?s k:lastName ?last .
FILTER regex(?first, ?arg1, "i" )
}

(After you press Enter, TopBraid Composer will replace the "k" namespace prefixes with the namespace URIs.) If you wanted this query to always search for firstName values with "John" in them, you could put that as the second argument of the regex function, but this query has something more flexible: a reference to the arg1 variable that will be passed in from the Argument module. (The third argument of "i" to this function tells it to ignore case when searching.) The use of the regex function means that it will search for substring matches as well, so that a search for "Car" will turn up Caroline Kennedy, Carolyn Bessete, and Cart Hood.
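To make the matching behavior of that FILTER concrete, here is a minimal Python sketch that mimics it outside of TopBraid Composer. It is only an illustration: the list of names is a hand-written stand-in for the Kennedy data (the "Oscar" entry is hypothetical), the function names are made up for this example, and Python's regular expression syntax is close to, but not identical to, the one SPARQL uses.

```python
import re

# Hand-written (first name, last name) pairs; a stand-in for the Kennedy data.
# The "Oscar" entry is hypothetical and only shows an unanchored match.
people = [
    ("Caroline", "Kennedy"),
    ("Jean", "Kennedy"),
    ("Oscar", "Example"),
]


def matches_first_name(first_name, pattern):
    """Mimic FILTER regex(?first, ?arg1, "i"): unanchored and case-insensitive."""
    return re.search(pattern, first_name, re.IGNORECASE) is not None


def select_data(rows, arg1):
    """Keep only the rows whose first name matches the pattern in arg1."""
    return [(first, last) for first, last in rows if matches_first_name(first, arg1)]
```

With this stand-in data, select_data(people, "Car") keeps both Caroline and Oscar, because the pattern can match anywhere in the name. Passing "^car" instead anchors the match to the start of the name and keeps only Caroline.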

You're finished configuring this module, so click the Close button and connect the two input modules to it like this (don't worry about connecting up the ReturnSearchResults module just yet, and remember that at any time you can drag the module icons around to make the script's flow easier to understand):

Select the SelectData module and then click the debug icon near the upper-left of the workspace to test it. Because no value has been passed for the Argument, a SPARQLMotion Script Input dialog box prompts you for one. Enter "jean" as a test, press Enter, and then click the Next button. When it's finished, you should see triples storing the firstName and lastName values for Jean Olssen, Jean Kennedy, and Jeannie Ripp appear in the SPARQLMotion Results view.

Your web service script can now do everything except return the data that it extracted from the data source to the process that called the service. Connect the SelectData module to the ReturnSearchResults module, which still needs one bit of configuration: to tell it to return the data as RDF/XML, double-click the icon and click the context menu for the sml:serialization property. Don't pick "Add empty row" this time; you want to add one of the predefined values from the sml:RDFSerialization class, so pick "Add Existing." On the Add existing dialog box, click on sml:RDFSerialization, select sml:RDFXML on the right (note the other choices available to you), click this dialog box's OK button, and then click the Edit ReturnSearchResults dialog box's Close button.

Save your work. The final script should look something like this:

Registering and calling your web service

Before something can tell TopBraid Live or the server built into TopBraid Composer to call the searchKennedys function that calls this script, the function must be registered with the server. To do this, select "Refresh/Display SPARQLMotion functions..." from the Scripts menu, and after a few seconds you'll see the updated list of registered functions on the Console view.

To test your web service, enter the following into any web browser running on the same computer as your copy of TopBraid Composer; note how its two parameters identify the function to call and the argument value to pass to it:

http://localhost:8083/tbl/actions?action=sparqlmotion&id=searchKennedys&arg1=Rob

After making this call, you should see RDF/XML data in your web browser about the six people in the database who have "Rob" in their first name: five Roberts and a Robin. (For some browsers, you may need to do a View Source to see all the XML.)
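The same call can be made from a program instead of a browser. The following Python sketch builds the URL shown above and fetches the response using only the standard library. The function names are invented for this example, and decoding the response as UTF-8 is an assumption rather than something the server is known to guarantee.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Address of the server built into TopBraid Composer, as used in the URL above.
BASE_URL = "http://localhost:8083/tbl/actions"


def build_service_url(function_id, arg1, base_url=BASE_URL):
    """Build the web service URL; urlencode escapes special characters in arg1."""
    query = urlencode({"action": "sparqlmotion", "id": function_id, "arg1": arg1})
    return f"{base_url}?{query}"


def call_service(function_id, arg1, timeout=10):
    """Fetch the service's response as text (UTF-8 is assumed here)."""
    with urlopen(build_service_url(function_id, arg1), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With TopBraid Composer running and the function registered, call_service("searchKennedys", "Rob") should return the same RDF/XML that the browser displayed.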

Congratulations! You now have a working, RESTful, semantic web service up and running. Any tool that can send a URI to the HTTP server built into TopBraid Composer and then parse the result can use it. (Most modern programming languages include libraries that make this simple.) If you install your web service and the appropriate data files on a computer running TopBraid Live, multiple systems can retrieve data from that service at once.
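As a rough idea of what parsing the result might look like, here is a Python sketch that pulls first and last names out of an RDF/XML document with the standard library's XML parser. The sample document is hand-written for illustration and is not actual output from the service; the resource URI in it is hypothetical, and the real serialization may be laid out differently. RDF/XML can express the same triples in several ways, so a dedicated RDF library such as rdflib would be a more robust choice for real use.

```python
import xml.etree.ElementTree as ET

# Namespace of the Kennedy example data, taken from the query's PREFIX line.
K_NS = "http://topbraid.org/examples/kennedys#"

# Hand-written sample, NOT real service output; the rdf:about URI is hypothetical.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:k="http://topbraid.org/examples/kennedys#">
  <rdf:Description rdf:about="http://topbraid.org/examples/kennedys#ExamplePerson">
    <k:firstName>Jean</k:firstName>
    <k:lastName>Kennedy</k:lastName>
  </rdf:Description>
</rdf:RDF>"""


def extract_names(rdf_xml):
    """Return (first, last) pairs from each top-level resource that has both."""
    root = ET.fromstring(rdf_xml)
    names = []
    for node in root:
        first = node.find(f"{{{K_NS}}}firstName")
        last = node.find(f"{{{K_NS}}}lastName")
        if first is not None and last is not None:
            names.append((first.text, last.text))
    return names
```

Calling extract_names(SAMPLE_RESPONSE) gives a list with one (first name, last name) pair; the text returned by the web service could be passed in the same way if it follows a similar layout.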

When you consider using data from more diverse, large-scale data sources and creating SPARQLMotion scripts that take advantage of a wider range of the modules available on the workspace palette, you'll start to see how much you can do with your TopBraid semantic web services.


This is a blog by TopQuadrant, developers of the TopBraid Suite, created to support the pursuit of our ongoing mission - to explode strange semantic myths, to seek out new models that support a new generation of dynamic business applications, to boldly integrate data that no one has integrated before.