# LDBC SNB Interactive workload implementation in SQL Server

SQL Server implementation of the LDBC Social Network Benchmark's Interactive workload.

## Prerequisites

The recommended setup is to run the benchmark scripts (Bash) and the LDBC driver (Java 8) on the host machine, while the SQL Server database runs in a Docker container. The requirements are therefore:

- Bash
- Java 8
- Docker 19+
- sufficient free space in the `${MSSQL_DATA_DIR}` directory
## Building

To build, use the build script:

```bash
scripts/build.sh
```
## Generating the data set

This SQL Server implementation uses the `composite-merged-fk` CSV layout, with headers and without quoted fields. To generate data that conforms to this requirement, run Datagen without any layout or formatting arguments (`--explode-*` or `--format-options`).

In Datagen's directory (`ldbc_snb_datagen_spark`), issue the following commands. We assume that the Datagen project is built and `sbt` is available:
```bash
export SF=desired_scale_factor
export LDBC_SNB_DATAGEN_MAX_MEM=available_memory
export LDBC_SNB_DATAGEN_JAR=$(sbt -batch -error 'print assembly / assemblyOutputPath')
rm -rf out-sf${SF}/graphs/parquet/raw
tools/run.py \
    --cores $(nproc) \
    --memory ${LDBC_SNB_DATAGEN_MAX_MEM} \
    -- \
    --format csv \
    --scale-factor ${SF} \
    --mode bi \
    --output-dir out-sf${SF}
```
## Loading the data set

Before starting the SQL Server Docker instance, set the `MSSQL_CSV_DIR` variable in the `.env` file to the path where the dataset is located, e.g.:

```bash
MSSQL_CSV_DIR=`pwd`/social-network-sf1-bi-composite-merged-fk/
```

By default, the dataset is loaded again whenever the Docker container is restarted. To prevent reloading, set the `MSSQL_RECREATE` variable to `False`, e.g.:

```bash
MSSQL_RECREATE=False
```
To persist the data by storing the database files outside a Docker volume, uncomment the following lines in the `docker-compose.yml` file:

```yaml
- type: bind
  source: ${MSSQL_DATA_DIR}
  target: /var/opt/mssql/data
- type: bind
  source: ${MSSQL_DATA_LOGS}
  target: /var/opt/mssql/log
- type: bind
  source: ${MSSQL_DATA_SECRETS}
  target: /var/opt/mssql/secrets
```

Make sure the following folders exist relative to the `docker-compose.yml` file:

```
scratch/data
scratch/logs
scratch/secrets
```
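The folders above can be created in one step. A minimal sketch, assuming you are in the directory that contains `docker-compose.yml`:

```shell
# Create the bind-mount directories expected by the compose file.
# The scratch/ paths match the folders listed above; adjust them if
# your .env points MSSQL_DATA_DIR and related variables elsewhere.
mkdir -p scratch/data scratch/logs scratch/secrets
```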
## Running the benchmark

To run the benchmark, change the following properties in `driver/benchmark.properties`:

- `thread_count`: the number of threads to use
- `ldbc.snb.interactive.parameters_dir`: path to the folder with the substitution parameters
- `ldbc.snb.interactive.updates_dir`: path to the folder with the update streams. Make sure the update streams correspond to the `thread_count`.
- `ldbc.snb.interactive.scale_factor`: the scale factor to use (it must match the substitution parameters and update streams)
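Put together, the relevant part of `driver/benchmark.properties` might look like the following sketch; the thread count and the SF1 directory names are illustrative assumptions, not values shipped with this repository:

```properties
# Illustrative values only -- adjust to your environment.
thread_count=1
ldbc.snb.interactive.parameters_dir=substitution-parameters-sf1/
ldbc.snb.interactive.updates_dir=update-streams-sf1/
ldbc.snb.interactive.scale_factor=1
```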
To validate the benchmark results, change the following properties in `driver/validate.properties`:

- `validate_database`: the validation parameters CSV file to use
- `ldbc.snb.interactive.parameters_dir`: path to the folder with the substitution parameters
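A corresponding sketch for `driver/validate.properties`; the CSV file name and the parameters directory are hypothetical examples, not values from this repository:

```properties
# Illustrative values only -- the file and directory names are assumptions.
validate_database=validation_params.csv
ldbc.snb.interactive.parameters_dir=substitution-parameters-sf1/
```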
The dataset is loaded automatically using the `db-loader` container. To start the SQL Server container and load the data, `docker-compose` is used:

- `docker-compose build` to build the `db-loader` container
- `docker-compose up` to start the SQL Server container and the `db-loader` container
To run the benchmark, run the following command:

```bash
driver/benchmark.sh
```

To validate the results, run the following command:

```bash
driver/validate.sh
```