Scraper with CSS Selectors and XPath. #18

sjehuda · 2025-04-15T07:33:46Z

A dialog that would allow fetching the title, date, content, and link.

It would be possible to create custom rules and also import and export sets of rules.

Rules would be written as CSS Selectors or XPath expressions.
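To make the import/export idea concrete, here is a minimal sketch of a rule set serialized to JSON. The `Rule` class, its field names, and the `export_rules`/`import_rules` helpers are hypothetical; they only mirror the command-line options of the example in this issue and are not taken from any existing code.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Rule:
    """One scraping rule set for a single page (hypothetical structure)."""
    url: str
    kind: str                # "xpath" or "css": how the expressions are read
    root: str                # expression selecting each entry node
    entry_title: str
    entry_link: str
    entry_description: str
    entry_date: str
    date_format: str = "%Y-%m-%d"


def export_rules(rules):
    """Serialize a list of Rule objects to a JSON string."""
    return json.dumps([asdict(rule) for rule in rules], indent=2)


def import_rules(text):
    """Parse a JSON string produced by export_rules back into Rule objects."""
    return [Rule(**item) for item in json.loads(text)]


# Values copied from the Slackware example in this issue.
slackware = Rule(
    url="http://www.slackware.com/",
    kind="xpath",
    root="//center[not(parent::body)]/table",
    entry_title="./tr[1]//b/text()",
    entry_link="@href",
    entry_description="./tr[2]/td[1]//text()",
    entry_date="normalize-space(./tr[2]/td[2]//b/text())",
)

exported = export_rules([slackware])
restored = import_rules(exported)
```

A plain JSON list keeps a rule set easy to share and to edit by hand; any other serialization would work equally well.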

Reference: https://github.com/sjehuda/html2atom

I use the script html2atom.py.gz (not published).

Example usage with Liferea and Snownews (i.e. stdout):

python .local/share/liferea/scripts/html2atom.py \
  --url 'http://www.slackware.com/' \
  --title 'The Slackware Linux Project' \
  --description 'News' \
  --subtitle '' \
  --language 'en' \
  --root '//center[not(parent::body)]/table' \
  --entry-title './tr[1]//b/text()' \
  --entry-link '@href' \
  --entry-description './tr[2]/td[1]//text()' \
  --entry-date 'normalize-space(./tr[2]/td[2]//b/text())' \
  --date-format '%Y-%m-%d'
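The script itself is not published, so the following is only a guess at the general pattern: a root expression selects each entry node, and the per-entry expressions are evaluated relative to that node. The sample page, the `RULES` dictionary, and the `scrape` function are hypothetical. The sketch uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath; expressions such as `not(parent::body)`, `text()`, and `normalize-space()` from the command above would need a full XPath engine, and real-world HTML would need an HTML parser.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, well-formed sample page used only for illustration.
PAGE = """
<html><body>
  <div class="news">
    <a href="/a.html"><b>First item</b></a>
    <p>Summary of the first item.</p>
    <span class="date">2025-04-15</span>
  </div>
  <div class="news">
    <a href="/b.html"><b>Second item</b></a>
    <p>Summary of the second item.</p>
    <span class="date">2025-04-14</span>
  </div>
</body></html>
"""

# Hypothetical rules, limited to the XPath subset ElementTree understands.
RULES = {
    "root": ".//div[@class='news']",   # one match per entry
    "title": "./a/b",
    "link": "./a",                     # the href attribute is read from this node
    "description": "./p",
    "date": "./span[@class='date']",
    "date_format": "%Y-%m-%d",
}


def scrape(page, rules):
    """Return one dict per entry node matched by the root rule."""
    tree = ET.fromstring(page.strip())
    entries = []
    for node in tree.findall(rules["root"]):
        link_node = node.find(rules["link"])
        date_text = node.findtext(rules["date"], default="").strip()
        entries.append({
            "title": node.findtext(rules["title"], default="").strip(),
            "link": link_node.get("href") if link_node is not None else None,
            "description": node.findtext(rules["description"], default="").strip(),
            "date": datetime.strptime(date_text, rules["date_format"]).date(),
        })
    return entries


entries = scrape(PAGE, RULES)
```

Each resulting dictionary holds the fields an Atom entry needs (title, link, summary, date), so a feed writer could consume the list directly.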

Node    : _______________________
Title   : _______________________
Date    : _______________________
Summary : _______________________
Content : _______________________
Language: _______________________
Type    : [                     ] # News, Updates, Catalogue
