02_How to run


Run via Docker

The easiest way to run bpmn.ai is via the provided Docker images from Docker Hub. This way you neither need to build the application, nor do you need to have Apache Spark set up locally.

The latest tag is synchronised with the master branch, so use it to get the most recent state, or pick a specific version tag for reproducible runs.

When specifying paths as parameters, it is important to use paths from inside the Docker container; local folders have to be mapped into the container so that it can access them.

The Docker container will automatically use all CPUs available to Docker.

docker run -it --rm \
	-v <local_folder_to_map_into_docker_container>:/data \
	viadee/bpmn.ai:latest \
	de.viadee.bpmnai.core.CSVImportAndProcessingApplication \
	-fs /data/<path_to_input_csv_using_path_inside_docker_container> \
	-fd /data/<path_to_target_folder_for_results_using_path_inside_docker_container> \
	-d <field_delimiter>
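
For example, assuming your Camunda CSV export lives at ~/bpmnai-data/processes.csv and uses semicolons as field delimiters (folder and file names here are placeholders), the call could look like this:

docker run -it --rm \
	-v ~/bpmnai-data:/data \
	viadee/bpmn.ai:latest \
	de.viadee.bpmnai.core.CSVImportAndProcessingApplication \
	-fs /data/processes.csv \
	-fd /data/result \
	-d ";"

Note that -fs and -fd use the container-side path /data, while the -v mapping is the only place the local path appears.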

All application parameters available in each bpmn.ai application can be found in this document.

Make sure to run the application twice if no configuration file existed during the first run: in that case the first run only creates the configuration file and does not do the complete processing.

Run with spark-submit

In order to run the application with spark-submit, you first need to package the application into a JAR file with Maven.

mvn clean package
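
Maven places the packaged JAR in a target directory. Assuming the standard Maven layout and the jar-with-dependencies naming referenced below, a quick way to locate it after the build:

find . -name "*jar-with-dependencies*.jar"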

Then you can run the spark-submit command from the bin folder of your Apache Spark installation by referencing the created JAR file and the Spark application class you would like to run, including its parameters.

bin/spark-submit \
	--class de.viadee.bpmnai.core.CSVImportAndProcessingApplication \
	--master "local[*]" \
	--deploy-mode client \
	--name ViadeeBpmnai \
	<path_to_packaged_jar_with_dependencies> \
	-fs <path_to_input_csv> \
	-fd <path_to_target_folder_for_results> \
	-d <field_delimiter>
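
As a concrete sketch with placeholder paths (the actual JAR file name depends on the module and version you built):

bin/spark-submit \
	--class de.viadee.bpmnai.core.CSVImportAndProcessingApplication \
	--master "local[*]" \
	--deploy-mode client \
	--name ViadeeBpmnai \
	target/bpmnai-core-jar-with-dependencies.jar \
	-fs /home/user/bpmnai-data/processes.csv \
	-fd /home/user/bpmnai-data/result \
	-d ";"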

All application parameters available in each bpmn.ai application can be found in this document.

Make sure to run the application twice if no configuration file existed during the first run: in that case the first run only creates the configuration file and does not do the complete processing.

Setup and run in IntelliJ / Eclipse

In order for the bpmn.ai-core applications to run in IntelliJ, the run configuration needs to be amended. Run the application once as a Java application, then add the following parameters in the run configuration:

VM arguments

mandatory

This defines that Spark should run in local standalone mode and utilise all CPUs.

-Dspark.master=local[*]
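
If Spark should not occupy all cores, the standard Spark master syntax lets you pin a fixed number instead, e.g. four:

-Dspark.master=local[4]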

optional

These settings raise the driver heap and the executor memory overhead, which can be necessary for larger data sets:

-Dspark.executor.memoryOverhead=1g 
-Dspark.driver.memory=2g

Program arguments

Here you define the parameters of the respective Spark application, as listed above, e.g.

-fs <path_to_input_csv> -fd <path_to_target_folder_for_results> -d <field_delimiter>
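
For example, with placeholder paths and a semicolon as field delimiter:

-fs /home/user/bpmnai-data/processes.csv -fd /home/user/bpmnai-data/result -d ";"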

Now you can run the application via the run configuration.

All application parameters available in each bpmn.ai application can be found in this document.

Make sure to run the application twice if no configuration file existed during the first run: in that case the first run only creates the configuration file and does not do the complete processing.

bpmn.ai is built to harvest low-hanging fruit with ML. Getting started is easy: take a look at the tutorials in this wiki to get your Camunda event history into an ML-ready table.
