Skip to content

Latest commit

 

History

History

taverna-scufl2-wfdesc

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 

Convert SCUFL2 to wfdesc

This is an extension to Scufl2 API which provides the export capability to wfdesc ontology from the Wf4Ever [2] RO wfdesc ontology.

You can use the tavlang convert command line tool for converting Taverna workflows to wfdesc.

Example

The output is a RDF Turtle document containing statements about the workflow structure according to the RO ontology wfdesc ontology, with additional service descriptions using the wf4ever ontology where supported.

: stain@ralph ~/src/wf4ever/scufl2-wfdesc; cat src/test/resources/helloworld.wfdesc.ttl
	@base <http://ns.taverna.org.uk/2010/workflowBundle/8781d5f4-d0ba-48a8-a1d1-14281bd8a917/workflow/Hello_World/> .
	@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
	@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
	@prefix owl: <http://www.w3.org/2002/07/owl#> .
	@prefix prov: <http://www.w3.org/ns/prov#> .
	@prefix wfdesc: <http://purl.org/wf4ever/wfdesc#> .
	@prefix wf4ever: <http://purl.org/wf4ever/wf4ever#> .
	@prefix roterms: <http://purl.org/wf4ever/roterms#> .
	@prefix dc: <http://purl.org/dc/elements/1.1/> .
	@prefix dcterms: <http://purl.org/dc/terms/> .
	@prefix comp: <http://purl.org/DP/components#> .
	@prefix dep: <http://scape.keep.pt/vocab/dependencies#> .
	@prefix biocat: <http://biocatalogue.org/attribute/> .
	@prefix : <#> .

	<datalink?from=processor/hello/out/value&to=out/greeting> a wfdesc:DataLink ;
		wfdesc:hasSource <processor/hello/out/value> ;
		wfdesc:hasSink <out/greeting> .

	<> a wfdesc:Workflow , wfdesc:Description , wfdesc:Process ;
		dc:creator "Stian Soiland-Reyes" ;
		dcterms:description "One of the simplest workflows possible. No workflow input ports, a single workflow output port \"greeting\",  outputting \"Hello, world!\" as produced by the String Constant \"hello\"." ;
		dcterms:title "Hello World" ;
		rdfs:label "Hello_World" ;
		wfdesc:hasOutput <out/greeting> ;
		wfdesc:hasSubProcess <processor/hello/> ;
		wfdesc:hasDataLink <datalink?from=processor/hello/out/value&to=out/greeting> .

	<out/greeting> a wfdesc:Output , wfdesc:Description , wfdesc:Input ;
		rdfs:label "greeting" .

	<processor/hello/> a wfdesc:Process , wfdesc:Description ;
		rdfs:label "hello" ;
		wfdesc:hasOutput <processor/hello/out/value> .

	<processor/hello/out/value> a wfdesc:Output , wfdesc:Description ;
		rdfs:label "value" .

Extracted annotations

Annotations in the workflow are also extracted for the workflow, processors and input ports (see namespaces above):

  • dc:creator
  • dcterms:description
  • dcterms:title
  • biocat:exampleData

Richer semantic annotations (e.g. on Taverna Components) are extracted verbatim, e.g.:

	<> a wfdesc:Workflow , wfdesc:Description , wfdesc:Process ;
		comp:fits comp:MigrationAction ;
		comp:migrates _:node18musbm56x1 .

	_:node18musbm56x1 a comp:MigrationPath ;
		comp:fromMimetype "image/tiff" ;
		comp:toMimetype "image/tiff" .

See valid_component_imagemagickconvert.wfdesc.ttl for the complete example.

Example:

import org.apache.taverna.scufl2.api.container.WorkflowBundle;
import org.apache.taverna.scufl2.api.io.ReaderException;
import org.apache.taverna.scufl2.api.io.WorkflowBundleIO;
import org.apache.taverna.scufl2.api.io.WriterException;

..
File original = new File("helloworld.t2flow");
File output = new File("helloworld.wfdesc.ttl");
String original = null; // to guess filetype
WorkflowBundle wfBundle = io.readBundle(original, null);
io.writeBundle(wfBundle, output, "text/vnd.wf4ever.wfdesc+turtle");

Common queries

These queries can be executed using SPARQL on the returned wfdesc Turtle, and assume these PREFIXes:

PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX wfdesc: <http://purl.org/wf4ever/wfdesc#>
PREFIX wfprov: <http://purl.org/wf4ever/wfprov#>
PREFIX tavernaprov: <http://ns.taverna.org.uk/2012/tavernaprov/>
PREFIX cnt:  <http://www.w3.org/2011/content#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wf4ever: <http://purl.org/wf4ever/wf4ever#>
PREFIX comp: <http://purl.org/DP/components#>

How do I exclude nested workflows from a query?

The wfdesc model does not have a concept of "top-level" workflow, however you can use a SPARQL NOT EXISTS filter to skip any nested workflows:

SELECT ?wf WHERE {
  ?wf a wfdesc:Workflow .

  FILTER NOT EXISTS {
      ?processInParent prov:specializationOf ?wf .
      ?parent wfdesc:hasSubProcess ?processInParent
  } .
}

Thus any workflow ?wf which (specialization) has been used as a sub-process in ?parent are excluded from the result. As SCUFL2 workflows have unique URIs within each workflow bundle, e.g. http://ns.taverna.org.uk/2010/workflowBundle/01348671-5aaa-4cc2-84cc-477329b70b0d/workflow/Hello_Anyone/ - the above would not be excluding a workflow just because its structure has been reused elsewhere.

How do I select processors in the top-level workflows?

A variant of the above is if you are only interested in processors in top-level workflows:

SELECT ?proc WHERE {
  ?wf a wfdesc:Workflow ;
      wfdesc:hasSubProcess ?proc .

  FILTER NOT EXISTS {
      ?processInParent prov:specializationOf ?wf .
      ?parent wfdesc:hasSubProcess ?processInParent
  } .
}