Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rethinkdb-changefeed-reconnect #77

Open
zllovesuki opened this issue Sep 2, 2017 · 24 comments
Open

Add rethinkdb-changefeed-reconnect #77

zllovesuki opened this issue Sep 2, 2017 · 24 comments

Comments

@zllovesuki
Copy link

I have an application running v2.1.0 and updated the dependency to v3.1.3, and the queue stops working after some period of time.

It works when I initially start up the queue, the handler can process incoming jobs. However, after some time the handler was no longer called (.process(function(job, done) {}))

Could you walk me through the changes? I've eliminated other dependency in the application.

@grantcarthew
Copy link
Owner

Hi @zllovesuki
As per the Change Log the only breaking change from v2 to v3 was the event signatures. Specifically the error event signatures.

Read more about the error event signatures here: https://github.com/grantcarthew/node-rethinkdb-job-queue/wiki/Event.error

So check your event callbacks.

@zllovesuki
Copy link
Author

@grantcarthew I think it's the same error that if a changefeed becomes unavailable, the Queue just silently ignores it and no errors were thrown.

@grantcarthew
Copy link
Owner

grantcarthew commented Sep 7, 2017

Oh. Not seen that before. There is a package called rethinkdb-changefeed-reconnect that might do the job here. Was wondering if it was worth adding it to the queue dependencies.

@zllovesuki
Copy link
Author

It's similar to #72 .

It would be great if you could implement it. :)

@zllovesuki
Copy link
Author

Actually, now that I think about it, it may not seem so easy.

My application connects to a proxy node (same application on the same node connects to the proxy on the same node), and each proxy node then connects to the actual data nodes.

If the proxy node drops (which is connection closed), then the liveness probe will pick it up and nothing terrible will happen.

However, if the data node drops, the error is silently ignored. If I use changefeeds by itself (.changes()), RethinkDB actually complains something like "changefeed unavailable".

@zllovesuki
Copy link
Author

Right now my workaround is to query server_status and check for changes in time_connected (so the application can tell that a data node was dropped and it should restart).

@disarticulate
Copy link

@grantcarthew I used the rethinkdb-changefeed-reconnect almost immediately since the connections can be flaky.

I'd consider it a necessary dependency.

@grantcarthew
Copy link
Owner

OK, I just had a closer look at the project and will add this to the To Do list. My head isn't in this space at the moment so it will not be done asap. Sorry. If anyone wants to contribute?

I imagine exposing the reconnect package options onto the existing rethinkdb-job-queue connection options would work well.

@grantcarthew grantcarthew changed the title What's changed from 2.x to 3.x? Add rethinkdb-changefeed-reconnect Sep 7, 2017
@disarticulate
Copy link

This is the code I had when I inquired about broadcasting queues:

var processChangefeed = require('rethinkdb-changefeed-reconnect')
...
const queueChangeFeedWatcher = (name, q, queueHandler, vm) => {
  // wait for the q ready
  q.on('ready', (queueId) => {
    var r = q.r
    var newTasksFeed = () => { return r.table(name).changes() } // get the changes
    var handleNewTask = ({new_val, old_val}) => { //handle updates to changes
      if ('data' in new_val
        && new_val.status === 'completed'
        && new_val.data.broadcast
        && new_val.data.broadcasters.length > 0) { // watch for a broadcast flag
        var [host,proc,cmd,count,uuid] = new_val.queueId.split(':')
        if (new_val.data.broadcasters.indexOf(hostID()) === -1
            && queueHandler[cmd]) { // check whether processor has already managed
          vm.log('[queueChangeFeedWatcher] > ', new_val.data.broadcasters, host, proc, cmd)
          queueHandler[cmd] (// run handler from handler objects
            new_val,
            (err, result) => {},
            (job, cb) => {}
          )
        }
      }
    }

    var handleError = (err) => {
      vm.log(err.stack)
    }
    processChangefeed(newTasksFeed, handleNewTask, handleError, { 
      changefeedName: name,
      attemptDelay: 30000,
      maxAttempts: 30,
      silent: false,
      logger: console
    })
  })
}

I roughed out what it looks like would take to just swap the reconnect driver.

master...disarticulate:patch-1

@grantcarthew
Copy link
Owner

Ah, that's where I saw it! From your issue @disarticulate. I couldn't remember.

That master compare is a good start thanks, however as I said before the reconnect options need to be exposed as part of the public API. Again I imagine either properties of the cxn object or a nested object on the cxn object. There should also be an options to disable the reconnect feature by ether a specific switch or lack of options.

For reference I did some research into where we are at with respect to reliable change feeds. Here is the main discussion as far as I can tell: rethinkdb/rethinkdb#3471

It doesn't look like it will be anytime soon so the rethinkdb-changefeed-reconnect module looks like the way forward for now.

@grantcarthew
Copy link
Owner

I just published a minor change updating dependencies. I'm not adding features at this time. If someone is interested in working on this feature, please do so. I see this feature as the most useful addition to the queue.

@sagivf
Copy link
Collaborator

sagivf commented May 28, 2018

@grantcarthew Can you help clear up the issue here?
Which changefeed might need restarting?

Is it the one mentioned here - Queue.changeFeed which should only affect custom client event handling functionality if I understand correctly?

Or is it something basic to rethinkdb-job-queue which might make job processing stop working? Thanks 😄

@grantcarthew
Copy link
Owner

grantcarthew commented May 29, 2018

Hi @sagivf, been a while.
I'm not using RethinkDB at the moment and have been working on another package called perj.

The plan is to add rethinkdb-changefeed-reconnect as a dependency to the job queue and replace the change feed with connections mannaged by rethinkdb-changefeed-reconnect.

The code that needs to be updated would be here: https://github.com/grantcarthew/node-rethinkdb-job-queue/blob/master/src/queue-db.js#L25

I'm busy now so can not make the the changes needed.

Adding you as a collaborator.

https://github.com/grantcarthew/node-rethinkdb-job-queue/invitations

@grantcarthew
Copy link
Owner

Added you as a maintainer for the NPM package also @sagivf.

@sagivf
Copy link
Collaborator

sagivf commented May 30, 2018

Thanks @grantcarthew, been to long, good to hear from you :)
I will try to dive into the code next week.

Just to get started can you clarify how changefeeds are used in rethinkdb-job-queue? Are they just for events the user wants to tap into? Or is it core to the way the distributed queuing works?
Is the management and communication of the distributed jobs done via some sort of pulling?

I realise you are not currently dedicated to this project, but you've built a great tool here, one which I use a lot an example. So I hope you stay around in some capacity.

I now the whole RethinkDB situation has been discouraging but just so you know there is an attempt to fork and revive it (https://spectrum.chat/rebirthdb) It seems to be moving along nicely, thelinuxlich has been leading it nicely and we are doing interesting things like rebuilding the admin in electron and separating it from the main repo along with some other important cleanup and plans down the road.

If this effort succeeds I would really love for https://github.com/grantcarthew/node-rethinkdb-job-queue to take a front stage as it is really useful and works great (great docs to 😄).

See you around buddy.

@thelinuxlich
Copy link

Also, since you probably dived deeply in changefeeds, you can help us format what needs to be done for 2.6, where part of the focus may be directed in providing changefeed reliability and performance.

@grantcarthew
Copy link
Owner

Change Feeds

It's great to hear that you are happy using rjq. I don't get much feedback apart from stars.

To answer your questions @sagivf, change feeds are core to how the distributed queue works. Polling is only carried out by the Queue Master. All other workers rely on the change feed. If the feed stops, they stop.

They have been implemented in a very simple way though. Rather than setup a heap of feeds, there is only one change feed applied to the whole table: https://github.com/grantcarthew/node-rethinkdb-job-queue/blob/master/src/queue-db.js#L28

This causes all the Queue objects to receive changes for any changes on the queue. The changes are then processed locally in the queue-change.js file. I have no doubt that there is a better way of doing this.

With this in mind, you can see the benefit of switching to an automatically reconnecting feed project.

RethinkDB Project

Yes I had seen the communications by @thelinuxlich and am a little dissapointed the guys with RethinkDB collaborator permissions can't add an active user like @thelinuxlich in to help. I even tweeted as such asking for help: https://twitter.com/GrantCarthew/status/996555096426074112

I moved away from RethinkDB because of the ghost town that is the original repository. I tried out PostgreSQL for a while and have now moved to MongoDB. This really hurts as you can imagine. I spent six months building this Job Queue with the goal of using it extensively.

I can't really help out with the RethinkDB fork though @thelinuxlich. I would love to spend my time working on such projects but this is not my primary job. The time I spend on open source is my own time. Not surprisingly it doesn't generate any income to help with life. Sorry.

@thelinuxlich
Copy link

@grantcarthew no problem man, any time you can spend helping us test would be great, take your time ^^

@sagivf
Copy link
Collaborator

sagivf commented May 31, 2018

I think the project wasn't adopted much due to:
a) timing - about the same time as the start up closed.
b) not enough PR.

@thelinuxlich I think we can help on the PR side as Job processing (and Job scheduling) is something most stacks need very early on and it's great when it just fits into your existing DB stack.

@thelinuxlich
Copy link

Sure, let's fork and bring the project to the org

@thelinuxlich
Copy link

I think this can get better with the write hooks feature in 2.4

@sagivf
Copy link
Collaborator

sagivf commented May 31, 2018

@thelinuxlich you mean use the right hooks instead of changefeeds? @grantcarthew Would you like to move it into the org or would you prefer it under your account?

@thelinuxlich
Copy link

Yeah, use write hooks as state machine for jobs

@grantcarthew
Copy link
Owner

I can move it guys. What's the address?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants