jsonl

About

jsonl is a lightweight Python library that simplifies reading and writing data in the JSON Lines format.

Features

🌎 Provides an API similar to Python's standard json module.
🚀 Supports custom (de)serialization via user-defined callbacks.
🗜️ Built-in support for gzip, bzip2, and xz compression, as well as ZIP and TAR archives.
🔧 Skips malformed lines during file loading.
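
The callback and malformed-line features can be pictured with a standard-library sketch. It does not call jsonl or reproduce its implementation; `iter_json_lines`, `decimal_loads`, and the `loads` and `skip_broken` parameters are hypothetical names chosen for illustration.

```python
import json
from decimal import Decimal


def iter_json_lines(lines, loads=json.loads, skip_broken=False):
    """Yield one decoded value per non-empty line (hypothetical helper)."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # Ignore blank lines
        try:
            yield loads(line)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            if not skip_broken:
                raise ValueError(f"line {number} is not valid JSON") from exc
            # With skip_broken=True the malformed line is silently dropped


def decimal_loads(text):
    # A custom deserialization callback: parse floats as Decimal
    return json.loads(text, parse_float=Decimal)


raw = ['{"price": 1.10}', '{"price": ', '{"price": 2.5}']
rows = list(iter_json_lines(raw, loads=decimal_loads, skip_broken=True))
```

Here the second line is truncated JSON, so only the first and third records end up in `rows`, each with a `Decimal` price.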

Installation

To install jsonl using pip, run the following command:

pip install py-jsonl

Getting Started

Dumping data to a JSON Lines File

Use jsonl.dump to incrementally write an iterable of dictionaries to a JSON Lines file:

import jsonl

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]

jsonl.dump(data, "file.jsonl")
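
To make "incrementally" concrete, here is a standard-library sketch of writing one JSON document per line while consuming the input one item at a time. It illustrates the file format only, not how jsonl.dump is implemented; `write_json_lines` and `players` are hypothetical names.

```python
import json
import os
import tempfile


def write_json_lines(rows, path):
    """Write each item as one JSON document per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as fp:
        for row in rows:  # Items are consumed one at a time, so generators work
            fp.write(json.dumps(row, ensure_ascii=False))
            fp.write("\n")


def players():
    # A generator: the full dataset never has to exist as a list
    yield {"name": "Gilbert", "wins": [["straight", "7♣"]]}
    yield {"name": "May", "wins": []}


path = os.path.join(tempfile.mkdtemp(), "file.jsonl")
write_json_lines(players(), path)

with open(path, encoding="utf-8") as fp:
    lines = fp.read().splitlines()
```

Each element of `lines` is a complete JSON document on its own, which is what lets readers process such a file line by line.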

Loading data from a JSON Lines File

Use jsonl.load to incrementally load a JSON Lines file into an iterable of objects:

import jsonl

iterator = jsonl.load("file.jsonl")
print(tuple(iterator))
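
The lazy behaviour of such an iterator, combined with compressed input, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: `read_json_lines` is a hypothetical helper, and treating a ".gz" suffix as the signal for gzip is an assumption of the sketch.

```python
import gzip
import json
import os
import tempfile


def read_json_lines(path):
    """Lazily yield one decoded value per line (hypothetical helper)."""
    # Assumption for this sketch: a ".gz" suffix means gzip-compressed text
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as fp:
        for line in fp:
            if line.strip():
                yield json.loads(line)


path = os.path.join(tempfile.mkdtemp(), "file.jsonl.gz")
with gzip.open(path, "wt", encoding="utf-8") as fp:
    fp.write('{"name": "Gilbert"}\n{"name": "May"}\n')

iterator = read_json_lines(path)
first = next(iterator)  # Only the first record has been decoded so far
rest = list(iterator)  # Exhausting the generator decodes the remaining lines
```

Because records are decoded on demand, memory use stays roughly constant regardless of file size, as long as the caller does not collect everything into a list or tuple.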

Dump multiple JSON Lines Files into an Archive (ZIP or TAR)

Use jsonl.dump_archive to incrementally write structured data to multiple JSON Lines files, which are then stored in a ZIP or TAR archive.

import jsonl

data = [
    # Create `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `path/to/file2.jsonl` within the archive
    ("path/to/file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl` within the archive
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_archive("archive.zip", data)
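
The resulting archive layout, including how repeated names end up in a single member, can be sketched with the standard library's zipfile module. This is not how jsonl.dump_archive works internally: `write_zip_of_json_lines` is a hypothetical helper, and unlike an incremental writer it buffers every row in memory before writing.

```python
import collections
import io
import json
import zipfile


def write_zip_of_json_lines(target, entries):
    """Store each named group of rows as one member (hypothetical helper)."""
    grouped = collections.defaultdict(list)
    for name, rows in entries:
        # Entries that repeat a name are concatenated in order
        grouped[name].extend(json.dumps(row) + "\n" for row in rows)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, lines in grouped.items():
            archive.writestr(name, "".join(lines))


buffer = io.BytesIO()  # In-memory archive, so the sketch leaves no files behind
write_zip_of_json_lines(buffer, [
    ("file1.jsonl", [{"name": "Alice"}]),
    ("path/to/file2.jsonl", [{"name": "Charlie"}]),
    ("file1.jsonl", [{"name": "Eve"}]),
])

with zipfile.ZipFile(buffer) as archive:
    names = sorted(archive.namelist())
    file1 = archive.read("file1.jsonl").decode("utf-8")
```

After this runs, the archive holds two members, and `file1.jsonl` contains the Alice record followed by the Eve record.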

Load multiple JSON Lines Files from an Archive (ZIP or TAR)

Use jsonl.load_archive to incrementally load multiple JSON Lines files from a ZIP or TAR archive. This function allows you to filter files using Unix shell-style wildcards.

import jsonl

# Load all JSON Lines files matching the pattern "*.jsonl" from the archive
for filename, iterator in jsonl.load_archive("archive.zip"):
    print("Filename:", filename)
    print("Data:", tuple(iterator))
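
Unix shell-style wildcard filtering of archive members can be illustrated with the standard library's fnmatch and zipfile modules. The sketch below is an assumption-laden stand-in, not the library's implementation: `read_zip_of_json_lines` and its `pattern` parameter are hypothetical names.

```python
import fnmatch
import io
import json
import zipfile


def read_zip_of_json_lines(source, pattern="*.jsonl"):
    """Yield (member name, row iterator) pairs (hypothetical helper)."""
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if not fnmatch.fnmatch(name, pattern):
                continue  # Skip members that do not match the wildcard
            text = archive.read(name).decode("utf-8")
            yield name, (json.loads(line) for line in text.splitlines() if line)


# Build a small in-memory archive with one matching and one non-matching member
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("users/a.jsonl", '{"id": 1}\n{"id": 2}\n')
    archive.writestr("notes.txt", "not JSON Lines\n")

loaded = {
    name: list(rows)
    for name, rows in read_zip_of_json_lines(buffer, "users/*.jsonl")
}
```

Note that in fnmatch the `*` wildcard also matches path separators, so a pattern such as `*.jsonl` selects members in nested folders as well.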

Dumping data to Multiple JSON Lines Files

Use jsonl.dump_fork to incrementally write structured data to multiple JSON Lines files, which can be useful when you want to separate data based on some criteria.

import jsonl

data = [
    # Create `file1.jsonl` or overwrite it if it exists
    ("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
    # Create `file2.jsonl` or overwrite it if it exists
    ("file2.jsonl", [{"name": "Charlie", "age": 35}, {"name": "David", "age": 40}]),
    # Append to `file1.jsonl`
    ("file1.jsonl", [{"name": "Eve", "age": 28}]),
]
jsonl.dump_fork(data)
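
The create-then-append behaviour described in the comments above can be sketched with the standard library by keeping one open handle per path. This is a guess at the general technique, not the code behind jsonl.dump_fork; `write_forked` is a hypothetical helper.

```python
import contextlib
import json
import os
import tempfile


def write_forked(entries):
    """Route each group of rows to its own file (hypothetical helper)."""
    with contextlib.ExitStack() as stack:
        handles = {}
        for path, rows in entries:
            if path not in handles:
                # First sighting: create the file or truncate an existing one
                handles[path] = stack.enter_context(
                    open(path, "w", encoding="utf-8")
                )
            for row in rows:  # Later sightings append through the open handle
                handles[path].write(json.dumps(row) + "\n")
    # Leaving the ExitStack closes every file that was opened


folder = tempfile.mkdtemp()
first = os.path.join(folder, "file1.jsonl")
second = os.path.join(folder, "file2.jsonl")
write_forked([
    (first, [{"name": "Alice"}]),
    (second, [{"name": "Charlie"}]),
    (first, [{"name": "Eve"}]),
])

with open(first, encoding="utf-8") as fp:
    first_lines = fp.read().splitlines()
```

Keeping the handles open means each file is truncated at most once per call, so rows routed to the same path at different points in the input accumulate in order.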

Documentation

For more detailed information and usage examples, refer to the project documentation.

Development

To contribute to the project, use the commands below to run the tests, run the linter, and build the documentation.

First, ensure you have the latest version of pip:

python -m pip install --upgrade pip

Running Unit Tests

Install the development dependencies and run the tests:

pip install --group=test  # Install test dependencies
pytest tests/  # Run all tests
pytest --cov jsonl  # Run tests with coverage

Running Linter

pip install --group=lint  # Install linter dependencies
ruff check .  # Run linter

Building the Documentation

To build the documentation locally, use the following commands:

pip install --group=doc  # Install documentation dependencies
mkdocs serve  # Start live-reloading docs server
mkdocs build  # Build the documentation site

License

This project is licensed under the MIT license.
