Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add posibility to use RedisCluster as connection Class #2030

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

MykolaBordakov
Copy link

Hi. Here my suggest about fixing this issue: #2021
There i add some implementation about using RedisCluster connection type.

Copy link

codecov bot commented Jan 24, 2024

Codecov Report

Attention: Patch coverage is 36.84211% with 12 lines in your changes are missing coverage. Please review.

Project coverage is 90.82%. Comparing base (34f83d6) to head (2b92254).
Report is 39 commits behind head on master.

❗ Current head 2b92254 differs from pull request most recent head f6480a8. Consider uploading reports for the commit f6480a8 to get more accurate results

Files Patch % Lines
rq/worker.py 42.85% 8 Missing ⚠️
rq/queue.py 20.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2030      +/-   ##
==========================================
- Coverage   93.61%   90.82%   -2.79%     
==========================================
  Files          28       30       +2     
  Lines        3758     3978     +220     
==========================================
+ Hits         3518     3613      +95     
- Misses        240      365     +125     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

rq/worker.py Outdated
timeout_config = {"socket_timeout": self.connection_timeout}
connection.connection_pool.connection_kwargs.update(timeout_config)
return connection
if connection.__class__.__name__ == 'Redis':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind changing this approach of explicit class name check? This runs afoul of Python's duck typing concept (e.g it wouldn't work with FakeRedis).

I'd prefer using try/except to try and detect the type of connection the worker has.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. No problem. With try excrpt will be work well.

@selwin
Copy link
Collaborator

selwin commented Jan 27, 2024

Is it possible to setup a Redis cluster to test this in CI?

Comment on lines +422 to +435
current_socket_timeout = connection.connection_pool.connection_kwargs.get("socket_timeout")
if current_socket_timeout is None:
timeout_config = {"socket_timeout": self.connection_timeout}
connection.connection_pool.connection_kwargs.update(timeout_config)
return connection
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd also need to test this so test coverage doesn't go down. An easy way to do this would be to move this logic to a separate setup_connection() and pass both Redis and RedisCluster instances into it so we can easily test this.

@MykolaBordakov
Copy link
Author

MykolaBordakov commented Jan 27, 2024

Yes, it is possible. Bast way to do this is use docker compose file. Here it : https://github.com/bitnami/containers/blob/main/bitnami%2Fredis-cluster%2Fdocker-compose.yml
I used it for my tests. Code and cluster works well.

@selwin
Copy link
Collaborator

selwin commented Jan 27, 2024

Ok great. Do you mind adding Redis cluster to the test matrix then? I think we can just test it with the latest Python and redis-py version.

@MykolaBordakov
Copy link
Author

I add yours previous commnets. I reopen pull request.

@MykolaBordakov
Copy link
Author

Hi @selwin
I tried to add redis cluster to test. Can you explain to me, how you create connection by this function : find_empty_redis_database() (init.py from tests folder)? Does you have some real redis DB during test pipeline ?? Can we do same thing for redis cluster ??

@MykolaBordakov
Copy link
Author

Hi @selwin sorry for pushing you,
Is any information about how to organize redis cluster tests ?

@selwin
Copy link
Collaborator

selwin commented Mar 21, 2024

@MykolaBordakov I don't think need separate tests for Redis Cluster. The important thing is making sure that existing tests run on Redis cluster. So what you need to figure out is how to spin up Redis Cluster when CI runs.

@MykolaBordakov
Copy link
Author

Did I understand your test pipeline correctly?
You had some code, that creates test Redis connection:

def find_empty_redis_database(ssl=False):
    """Tries to connect to a random Redis database (starting from 4), and
    will use/connect it when no keys are in there.
    """
    for dbnum in range(4, 17):
        connection_kwargs = {'db': dbnum}
        if ssl:
            connection_kwargs['port'] = 9736
            connection_kwargs['ssl'] = True
            connection_kwargs['ssl_cert_reqs'] = None  # disable certificate validation
        testconn = Redis(**connection_kwargs)
        empty = testconn.dbsize() == 0
        if empty:
            return testconn
    assert False, 'No empty Redis database found to run tests in.'

If we needs to test it with RedisCluster i needs to create a same thing. But, redis cluster connection didn`t set up in such way.
Or, maybe, we does not do any additional tests. And run yours pipeline without redis cluster connection set up?

@MykolaBordakov
Copy link
Author

Hi @selwin
Did I understand your test pipeline correctly?
You had some code, that creates test Redis connection:

def find_empty_redis_database(ssl=False):
    """Tries to connect to a random Redis database (starting from 4), and
    will use/connect it when no keys are in there.
    """
    for dbnum in range(4, 17):
        connection_kwargs = {'db': dbnum}
        if ssl:
            connection_kwargs['port'] = 9736
            connection_kwargs['ssl'] = True
            connection_kwargs['ssl_cert_reqs'] = None  # disable certificate validation
        testconn = Redis(**connection_kwargs)
        empty = testconn.dbsize() == 0
        if empty:
            return testconn
    assert False, 'No empty Redis database found to run tests in.'

If we needs to test it with RedisCluster i needs to create a same thing. But, redis cluster connection didn`t set up in such way.
Or, maybe, we does not do any additional tests. And run yours pipeline without redis cluster connection set up?

@MykolaBordakov
Copy link
Author

Hi @selwin
Can you help me with tests?? I don't quite understand how REDIS is currently being deployed for testing??

@andreasciamanna
Copy link

I gave this branch a try.

While it seems to solve the original problem, the worker throws these exceptions:

  File "(...)/.venv/lib/python3.12/site-packages/redis/cluster.py", line 2282, in watch
    raise RedisClusterException("method watch() is not implemented")
redis.exceptions.RedisClusterException: method watch() is not implemented

@MykolaBordakov
Copy link
Author

Hi @andreasciamanna
What tests are doing ?? Or you find solution for adding tests in to CI??
Thank you.

@andreasciamanna
Copy link

Hi @andreasciamanna What tests are doing ?? Or you find solution for adding tests in to CI?? Thank you.

I didn't mention any tests.

I'm using this branch in my project and noticed that issue.

The worker seems to work anyway, but the exceptions are flooding the logs, and I wonder if they may cause any problems.

@MykolaBordakov
Copy link
Author

Hi @andreasciamanna What tests are doing ?? Or you find solution for adding tests in to CI?? Thank you.

I didn't mention any tests.

I'm using this branch in my project and noticed that issue.

The worker seems to work anyway, but the exceptions are flooding the logs, and I wonder if they may cause any problems.

What recis lib version are you using?? Can you try redis lib version 5.0.1??

@andreasciamanna
Copy link

Hi @andreasciamanna What tests are doing ?? Or you find solution for adding tests in to CI?? Thank you.

I didn't mention any tests.
I'm using this branch in my project and noticed that issue.
The worker seems to work anyway, but the exceptions are flooding the logs, and I wonder if they may cause any problems.

What recis lib version are you using?? Can you try redis lib version 5.0.1??

I was using 5.0.0. I've tried 5.0.1 but got the same results.

Full error, in case the previous wasn't enough:

13:22:11 [Job 2be735ce-921e-4c31-8c52-7aaf8b5fa9fe]: exception raised while executing (apps.queue_worker.update_references)
Traceback (most recent call last):
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 1552, in perform_job
    self.handle_job_success(job=job, queue=queue, started_job_registry=started_job_registry)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 1485, in handle_job_success
    pipeline.watch(job.dependents_key)
  File "/(omitted)/.venv/lib/python3.12/site-packages/redis/cluster.py", line 2282, in watch
    raise RedisClusterException("method watch() is not implemented")
redis.exceptions.RedisClusterException: method watch() is not implemented

@MykolaBordakov
Copy link
Author

Hi @andreasciamanna What tests are doing ?? Or you find solution for adding tests in to CI?? Thank you.

I didn't mention any tests.
I'm using this branch in my project and noticed that issue.
The worker seems to work anyway, but the exceptions are flooding the logs, and I wonder if they may cause any problems.

What recis lib version are you using?? Can you try redis lib version 5.0.1??

I was using 5.0.0. I've tried 5.0.1 but got the same results.

Full error, in case the previous wasn't enough:

13:22:11 [Job 2be735ce-921e-4c31-8c52-7aaf8b5fa9fe]: exception raised while executing (apps.queue_worker.update_references)
Traceback (most recent call last):
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 1552, in perform_job
    self.handle_job_success(job=job, queue=queue, started_job_registry=started_job_registry)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 1485, in handle_job_success
    pipeline.watch(job.dependents_key)
  File "/(omitted)/.venv/lib/python3.12/site-packages/redis/cluster.py", line 2282, in watch
    raise RedisClusterException("method watch() is not implemented")
redis.exceptions.RedisClusterException: method watch() is not implemented

Thanks, full log is nice. One question, does job works well?? Did all you tasks have been done??

@andreasciamanna
Copy link

Yes. Everything works fine, so far.

@andreasciamanna
Copy link

andreasciamanna commented Apr 11, 2024

@MykolaBordakov have you tried your branch with a scheduled job? e.g. queue.enqueue_in( ... )

It seems like if you run the worker with the scheduler (in my case, from the script itself, using worker.work(with_scheduler=True)), the error shows up again:

Traceback (most recent call last):
  File "/(omitted)/apps/queue_worker.py", line 113, in <module>
    worker.work(with_scheduler=True)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 557, in work
    self._start_scheduler(burst, logging_level, date_format, log_format)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 806, in _start_scheduler
    self.scheduler = RQScheduler(
                     ^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/scheduler.py", line 53, in __init__
    self._connection_class, self._pool_class, self._pool_kwargs = parse_connection(connection)
                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/connections.py", line 12, in parse_connection
    connection_pool_kwargs = connection.connection_pool.connection_kwargs.copy()
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'RedisCluster' object has no attribute 'connection_pool'

I wouldn't expect the project to require duplication of fixes in the codebase, but it seems to be the case here.

@MykolaBordakov
Copy link
Author

@MykolaBordakov have you tried your branch with a scheduled job? e.g. queue.enqueue_in( ... )

It seems like if you run the worker with the scheduler (in my case, from the script itels, using worker.work(with_scheduler=True)), the error shows up again:

Traceback (most recent call last):
  File "/(omitted)/apps/queue_worker.py", line 113, in <module>
    worker.work(with_scheduler=True)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 557, in work
    self._start_scheduler(burst, logging_level, date_format, log_format)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 806, in _start_scheduler
    self.scheduler = RQScheduler(
                     ^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/scheduler.py", line 53, in __init__
    self._connection_class, self._pool_class, self._pool_kwargs = parse_connection(connection)
                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/connections.py", line 12, in parse_connection
    connection_pool_kwargs = connection.connection_pool.connection_kwargs.copy()
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'RedisCluster' object has no attribute 'connection_pool'

I wouldn't expect the project to require duplication of fixes in the codebase, but it seems to be the case here.

This issue was fixed by my branch. If you want to use it with scheduler, you needs to replace original rq file after PIP installation of RQ lib. Also, when you install rq scheduler, rq will be installed to from current latest release. When we will merge this branch, you will not do files replacement. Also, i will try to find fix yours previous errors. Have a nice day.

@MykolaBordakov
Copy link
Author

@MykolaBordakov have you tried your branch with a scheduled job? e.g. queue.enqueue_in( ... )

It seems like if you run the worker with the scheduler (in my case, from the script itels, using worker.work(with_scheduler=True)), the error shows up again:

Traceback (most recent call last):
  File "/(omitted)/apps/queue_worker.py", line 113, in <module>
    worker.work(with_scheduler=True)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 557, in work
    self._start_scheduler(burst, logging_level, date_format, log_format)
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/worker.py", line 806, in _start_scheduler
    self.scheduler = RQScheduler(
                     ^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/scheduler.py", line 53, in __init__
    self._connection_class, self._pool_class, self._pool_kwargs = parse_connection(connection)
                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/(omitted)/.venv/lib/python3.12/site-packages/rq/connections.py", line 12, in parse_connection
    connection_pool_kwargs = connection.connection_pool.connection_kwargs.copy()
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'RedisCluster' object has no attribute 'connection_pool'

I wouldn't expect the project to require duplication of fixes in the codebase, but it seems to be the case here.

Here is example how to find path to yours rq lib.

import rq
print(rq.__file__)

@andreasciamanna
Copy link

andreasciamanna commented Apr 11, 2024

I don't understand your last two comments.

I've installed rq with pip install git+https://github.com/MykolaBordakov/rq.git@support-redis-cluster.

I'm supposed to be using your version of rq.

Also, when you install rq scheduler,

The scheduler is part of rq and is not a separate package.

Could you please let me know if you tested the scheduler?

@MykolaBordakov
Copy link
Author

I don't understand your last two comments.

I've installed rq with pip install git+https://github.com/MykolaBordakov/rq.git@support-redis-cluster.

I'm supposed to be using your version of rq.

Also, when you install rq scheduler,

The scheduler is part of rq and is not a separate package.

Could you please let me know if you tested the scheduler?

I use rq-schedular, it is a other lib. Here link:
https://pypi.org/project/rq-scheduler-ng/
There are rq lib in dependencies. If you install it, you automaticaly install rq from master branch.

@andreasciamanna
Copy link

Is there any advantage in using a separate library when rq already provides a scheduler?

Even if there is, the scheduler in rq would still be broken when using RedisCluster, meaning this PR shouldn't be merged until the problem is resolved, right?

@MykolaBordakov
Copy link
Author

with_scheduler
Ou, sorry. I think i understood yours problem. I will check scheduler code.

@MykolaBordakov
Copy link
Author

Is there any advantage in using a separate library when rq already provides a scheduler?

Even if there is, the scheduler in rq would still be broken when using RedisCluster, meaning this PR shouldn't be merged until the problem is resolved, right?
Yes, you are right. I need to fix scheduler before merge. Also, needs to create tests for my code to CI. Maybe, you can help me with tests?? :)

@andreasciamanna
Copy link

[...] needs to create tests for my code to CI. Maybe you can help me with tests?? :)

I would be happy to do so, but there are a couple of blockers:

  • I have zero experience with GitHub actions: I'm familiar with other CI environments (e.g. GitLab-CI), but GitHub actions is an entirely different approach I never had the need, time or will to learn.
  • I also don't have much experience with Python in general. It's a language I managed to avoid until recently, so I'm still learning and failing to grasp some of its specifics.

There is a lack of information on how to set the test environment, or maybe it's more of a lack of knowledge from my side.
In any case, that docker-compose file shared by @selwin does not help much and he has been in radio silence since his last comment three weeks ago.

Some more leads from @selwin or anyone else would help.

@selwin
Copy link
Collaborator

selwin commented Apr 12, 2024

There is a lack of information on how to set the test environment, or maybe it's more of a lack of knowledge from my side.
In any case, that docker-compose file shared by @selwin does not help much and he has been in radio silence since his last comment three weeks ago.

Like I said, I have never personally used Redis cluster, though I'm not against adding Redis cluster support to RQ so I will accept contributions in this area.

That said, this PR needs to show that it actually works on Redis cluster setup, meaning RQ's CI setup (Github Actions) needs to be updated to run the relevant test suites against a Redis cluster. I have no knowledge on how to set this up, so contributions are welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants