Download an RSS or Atom feed and save it to a SQLite database. This is meant to work well with datasette.
Install with pip:

```bash
pip install feed-to-sqlite
```
Let's grab the Atom feeds for items I've shared on NewsBlur and my Instapaper favorites, saving each to its own table:

```bash
feed-to-sqlite feeds.db http://chrisamico.newsblur.com/social/rss/35501/chrisamico https://www.instapaper.com/starred/rss/13475/qUh7yaOUGOSQeANThMyxXdYnho
```
This will use a SQLite database called `feeds.db`, creating it if necessary. By default, each feed gets its own table, named based on a slugified version of the feed's title.
To load all items from multiple feeds into a common (or pre-existing) table, pass a `--table` argument:

```bash
feed-to-sqlite feeds.db --table links <url> <url>
```
That will put all items in a table called `links`.
Each feed also creates an entry in a `feeds` table containing top-level metadata for each feed. Each item will have a foreign key to the originating feed. This is especially useful if combining feeds into a shared table.
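With that foreign key in place, items can be joined back to their feed metadata with ordinary SQL. The sketch below assumes items were loaded into a shared `links` table, that the foreign key column is named `feed` and points at an `id` column on `feeds`, and that both tables have a `title` column; those names are assumptions for illustration, so check the actual schema before relying on them.

```python
import sqlite3

conn = sqlite3.connect("feeds.db")
conn.row_factory = sqlite3.Row

# Join each item in the shared "links" table back to its originating feed.
# The "feed" foreign key and the "id"/"title" columns are assumed names;
# verify them against your own schema first.
rows = conn.execute(
    """
    SELECT links.title AS item_title, feeds.title AS feed_title
    FROM links
    JOIN feeds ON links.feed = feeds.id
    """
)
for row in rows:
    print(f"{row['feed_title']}: {row['item_title']}")
```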
One function, `ingest_feed`, does most of the work here. The following will create a database called `feeds.db` and download my NewsBlur shared items into a new table called `links`:
```python
from feed_to_sqlite import ingest_feed

url = "http://chrisamico.newsblur.com/social/rss/35501/chrisamico"

ingest_feed("feeds.db", url=url, table_name="links")
```
When working in Python directly, it's possible to pass in a function to transform rows before they're saved to the database.
The `normalize` argument to `ingest_feed` is a function that will be called on each feed item, useful for fixing links or doing additional work.
Its signature is `normalize(table, entry, feed_details, client)`:

- `table` is a SQLite table (from sqlite-utils)
- `entry` is one feed item, as a dictionary
- `feed_details` is top-level feed information, as a dictionary
- `client` is an instance of `httpx.Client`, which can be used for outgoing HTTP requests during normalization
That function should return a dictionary representing the row to be saved. Returning a falsey value for a given row will cause that row to be skipped.
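Here's a minimal sketch of what a `normalize` function might look like, assuming feed items carry `link` and `title` keys (typical for RSS and Atom entries, but not guaranteed): it trims any query string from each link, records the source feed's title on the item, and skips items that have no link at all.

```python
from feed_to_sqlite import ingest_feed


def normalize(table, entry, feed_details, client):
    # Hypothetical cleanup: drop any query string from the item's link
    # (a crude way to remove tracking parameters) and note which feed
    # the item came from. The "link" and "title" keys are assumptions.
    link = entry.get("link", "")
    entry["link"] = link.split("?", 1)[0]
    entry["feed_title"] = feed_details.get("title")

    # Returning a falsey value skips the row, so drop items with no link.
    if not entry["link"]:
        return None
    return entry


ingest_feed(
    "feeds.db",
    url="http://chrisamico.newsblur.com/social/rss/35501/chrisamico",
    table_name="links",
    normalize=normalize,
)
```

Because `client` is an `httpx.Client`, the same hook could also make outgoing requests, for example to resolve shortened URLs, at the cost of slower ingestion.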
Tests use pytest. Run `pytest tests/` to run the test suite.