-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generalized feed reader #31
base: master
Are you sure you want to change the base?
Conversation
👍 for "epic saga" How to test this? With one feed and one channel, I set the pull interval much smaller than min_time and confirmed that the posts were periodically sent to the channel. With many-to-many I could see this getting fussy. |
I'm currently playing around in #semmons99-test1 and #semmons99-test2 with a handful of different RSS feeds and configurations. |
I think I've worked out the kinks. I'm going to leave my test bot sitting in #semmons99-test1 and #semmons99-test2 for a few hours to make sure things go as planned. If everything looks good I'll merge later tonight. |
Sorry I didn't have anything to contribute here. I should be around for most of the day if there is something else I can help with. |
Sweet. Let me know once this is merged so I can take care of #26 |
@gjp actually there's a couple things you could help with.
It would also be fun to see multiple templates so it's not always "an epic saga". |
👍 I would love to see that! |
@semmons99 what sort of errors were you running across earlier? You'd mentioned every test feed was failing. Was that due to the RSS namespace? alternatives to "epic saga": hmm, should we drop that into #mendicant and see how many suggestions we can gather? |
@gjp mostly missing pubDates. Yeah ask #mendicant for suggestions. |
I've been playing around with this for the past couple hours and have come to the conclusion that most site's RSS feeds are a pile of 💩. We can't rely on much of anything to be there other than
Thoughts? |
@semmons99 the two main RSS feeds I want to support are Community and GitHub. It looks like GitHub is using Would that be enough to get this working? |
I'm pretty new to RSS formats but I think I see what's going on here. There's RSS, there's Atom, and then there's a site like Hacker News which strips out every byte they can to save bandwidth, standards be damned, and calls it RSS anyway. I'd suggest checking both the pubdate and updated attributes to handle both RSS and Atom, but to ignore invalid feeds. Any site which is so busy that they need to strip all dates is probably too spammy for IRC anyway. Or they just suck. unrelated: @sandal suggested combining synonyms of "epic" and "saga". I'll make a list of overused words from book blurbs. It'll be a stunning tour de force. |
@jordanbyron @gjp sounds good. @gjp think you can update this branch to use |
Since we want to support just a few feeds, can't we use a 3rd party tool, e.g. feedburner, to normalize this feeds for us? Or maybe Yahoo! pipes? |
@wicz I'd rather avoid thirdparty tools and just write the logic to handle the feed types we need. The more we can rely on ourselves at mendicant the better. |
@semmons99 👍 I totally agree. Keep it simple |
@wicz: With the exception of Github, Mendicant University does not use third party services unless they are based on open source and easily replaceable. There are a ton of reasons for this, which reminds me that @jordanbyron and I really need to write up our guidelines... |
@semmons99 @jordanbyron Agreed, I don't like to add dependencies either, unless they're extremely necessary. But just to make sure we're on the same page here, I was talking about using the feedburner.com service and not any other library with the same name. Just pointing it out cause I've used it before when needed to aggregate news from a bunch of shitty feeds. EDIT: Anyway, it won't work cause it seems it doesn't support HTTPS :boo: |
@wicz: A service dependency is an even heavier dependency than a third party library. We don't mind using libraries, because we can maintain them ourselves if/when they have problems. Services we have no control over. |
@sandal You're absolutely right! ps: comments ordering got a bit clumsy, but I'm glad we agreed anyway 😉 |
@semmons99 sure, I can roll up some small additions. Would you like to merge this now or wait for me to experiment on this branch? My schedule is uncertain today. |
@gjp got any time to work on this? |
I had open tickets on 4 projects before I drove out of state for the weekend. By the time I was home, 💤. I'll background MoM and Community and start here with pubdate/updated and author. Do you mind if we make wordplay on the announcements a separate patch? I don't want to cause further delay to this one. |
@gjp sure, that's fine. |
@gjp feel free to work on MoM and Community first too. |
@semmons99 @gjp both are blocked by me at this point, so feel free to work on this first :wink2: |
Working on this one. It's trial and error with different test feeds, so I might narrow it down to Community and GitHub and not worry about weird variations. Do you think the refresh interval should remain universal, or set per-feed? |
@gjp Just focus on Community and Github. They're all we care about right now. Also I can see the value in having a per-feed interval. It would allow us to follow github more actively. |
Hi guys, sorry for not responding earlier. This week I'll be working on puzzlenode, but you can count on me to help with this and #32 as well. |
@wicz awesome! |
I'm feeling the temptation to try a different parser. Doing this on multiple fields feels wrong and item.updated is not being cast to a Time but item.pubDate is. 😒 @semmons99 How were you using Pry to help here? I've only noodled around with it. Given how long it's taking me to make one change and test it, I must be doing something wrong. |
Feedzirra is making me happy. |
@gjp awesome, the description alone warrants inclusion into mendibot. I was using pry by inserting: require "pry"
binding.pry where I kept having trouble and inspecting the items as they came through. I really don't use any of it's fancy features. Most of the time I just inspect the object, look at methods, etc. If you're stuck at all and want to try some remote pairing tomorrow, let me know. I'm pretty sure I can setup my vps to function for remote pairing. |
Feedzirra normalizes the feeds nicely. Now I'm stuck on something which is making me feel brain-dead. With a single global timer, we just iterate through all feeds on each refresh. With per-feed timers, we need to know which feed to refresh, when the timers are set up.
Will Cinch accept arguments to the Also I'd prefer to be able to do feed.interval, but feeds are just a hash. |
Let's just define each of the pull methods programmatically. require 'cinch'
require 'open-uri'
require 'rss'
module Mendibot
module Plugins
class RSSRelay
include Cinch::Plugin
Mendibot::Config::RSS_SETTINGS[:feeds].each do |feed|
method = "#{feed[:name]}_pull".to_sym
timer feed[:interval], method: method
define_method method do
open(feed[:url]) do |rss|
#... regular rss stuff
end
end
end
end
end
end |
Let's go by parts 😉 @jordanbyron The thing about one using @gjp I don't think we need a different parser here. The stdlib's RSS is capable of parsing both formats, all we need is to be able to distinguish between them. gh = RSS::Parser.parse(open('https://github.com/jordanbyron/mission_of_mercy/commits/master.atom'))
gh.feed_type # => "atom"
cm = RSS::Parser.parse(open('http://community.mendicantuniversity.org/articles.rss'))
cm.feed_type # => "rss"
gh_entry = gh.items.first
cm_entry = cm.items.first
cm_entry.title.class # => String
cm_entry.title # => "Tempr - no fussin' temporal expressions"
cm_entry.pubDate.class # => Time
cm_entry.pubDate # => 2012-02-16 16:07:15 -0200
gh_entry.title.class # => RSS::Atom::Feed::Title
gh_entry.title.content # => "Now that loading zipcodes is fast, we don't need progress_bar"
gh_entry.updated.content.class # => Time
gh_entry.updated.content # => 2012-02-16 18:32:34 -0200 Note that we have to call About Feedzirra, it seems a nice project, and even though it can make our life easier, there is a heavy tradeoff in game: it depends on ActiveSupport. I think we can cast our own solution based only in the standard RSS parser. Maybe we can create an @gjp @semmons99 You might want take a closer look at pry-nav and pry-stack_explorer Thoughts? |
@semmons99 so I was not missing something? Dynamically creating N methods is the simplest solution? @wicz after a couple hours of seemingly getting a different failure mode each time I changed feeds (grrr), dropping in Feedzirra and having it just work with less code (specific to Mendibot) across community, github, and CNN (a test for missing author) felt like a win. 👍 for abstracting away trivia.
Note how pubDate and updated are normalized to How about if I get at least one working version committed to this branch with Feedzirra before breaking it again, and then we can attempt to reach the same or better level of brevity without it? |
Simplest that comes to mind. |
This amuses me way more than it probably should.
...using the code to monitor changes to itself ✨ |
infinite regression!
|
I should probably explain why feed_interval exists, since my config file is not tracked here. I experimented with having both a global (default) interval and a per-feed interval. It may make more sense to remove this and always set the interval per-feed.
|
tl;dr on the thread but @jordanbyron @semmons99 @gjp, is there any reason why we wouldn't want to switch community to output Atom 1.0 and then not have to consider multiple types of feeds? |
If it's easy enough for Jordan, that'd be fine with me. |
@gjp I think it makes more sense to just require each feed to specify it's own interval. |
I think I might have just awoken my wife having literally lol'd 😂 |
Hum ... that sounds familiar to something I suggested a few days ago 😏
Switching the RSS to Atom or adding an additional Atom feed to Community makes sense to me. |
@jordanbyron Think you can do that this week? Or do you want to keep the focus on PuzzleNode? |
I don't want to overcommit, so let's say I won't have time to do that until next week. Created the ticket for adding an Atom feed in community if anyone is interested in picking it up: https://github.com/mendicant-university/community/issues/73 |
Thanks to @wicz community now has an Atom feed: http://community.mendicantuniversity.org/articles.atom Let me know if there is anything else we need to do on the community site to get this patch merged 😄 |
Here's the beginnings of a generalized RSS plugin for Mendibot. @gjp what do you think?
#30