pytest plugin to run tests with support for pyspark (Apache Spark).
This plugin allows you to specify the SPARK_HOME directory in pytest.ini and thus make "pyspark" importable in your tests executed by pytest.
You can also define "spark_options" in pytest.ini to customize pyspark, including the "spark.jars.packages" option, which lets you load external libraries (e.g. "com.databricks:spark-xml").
pytest-spark provides two session-scoped fixtures, spark_context and spark_session, which can be used in your tests.
```shell
$ pip install pytest-spark
```
To run tests with the required spark_home location, define it using one of the following methods:
1. Specify the command line option "--spark_home":

   ```shell
   $ pytest --spark_home=/opt/spark
   ```
Add "spark_home" value to
pytest.ini
in your project directory:[pytest] spark_home = /opt/spark
Set the "SPARK_HOME" environment variable.
pytest-spark will try to import pyspark from the provided location.
Note: "spark_home" will be read in the order specified above, i.e. you can override the pytest.ini value with the command line option.
Just define "spark_options" in your pytest.ini
, e.g.:
[pytest] spark_home = /opt/spark spark_options = spark.app.name: my-pytest-spark-tests spark.executor.instances: 1 spark.jars.packages: com.databricks:spark-xml_2.12:0.5.0
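If you want to confirm from a test that these options were applied, a minimal sketch (assuming the pytest.ini above is in effect) can read the session configuration:

```python
# A minimal sketch: checks that spark_options from the pytest.ini above
# reached the Spark session. "my-pytest-spark-tests" is the illustrative
# value from the example config; adjust it to your own spark.app.name.
def test_spark_options_applied(spark_session):
    assert spark_session.conf.get("spark.app.name") == "my-pytest-spark-tests"
```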
Use the spark_context fixture in your tests as a regular pytest fixture. A SparkContext instance will be created once and reused for the whole test session.
Example:

```python
def test_my_case(spark_context):
    test_rdd = spark_context.parallelize([1, 2, 3, 4])
    # ...
```
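A complete, self-contained version of this sketch might look as follows (the RDD contents and the expected sum are illustrative):

```python
# Uses the session-scoped spark_context fixture provided by pytest-spark;
# the data and the asserted sum are illustrative values.
def test_rdd_sum(spark_context):
    test_rdd = spark_context.parallelize([1, 2, 3, 4])
    assert test_rdd.sum() == 10
```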
Use the spark_session fixture in your tests as a regular pytest fixture. A SparkSession instance with Hive support enabled will be created once and reused for the whole test session.
Example:

```python
def test_spark_session_dataframe(spark_session):
    test_df = spark_session.createDataFrame([[1, 3], [2, 4]], "a: int, b: int")
    # ...
```
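Again, a complete version might look like this (the rows and expected values are illustrative):

```python
# Uses the session-scoped spark_session fixture provided by pytest-spark;
# the rows and the asserted shape are illustrative values.
def test_dataframe_shape(spark_session):
    test_df = spark_session.createDataFrame([[1, 3], [2, 4]], "a: int, b: int")
    assert test_df.count() == 2
    assert test_df.columns == ["a", "b"]
```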