
Use oplog_replay option to optimize queries on oplog #3

Open · wants to merge 2 commits into master
Conversation


@poudro commented Nov 7, 2012

Using an index on the oplog is not very efficient and degrades performance when a large number of writes is performed in a short amount of time.

MongoDB has an internal mechanism for querying the oplog, activated by the "oplog_replay" query option (flag bit 3 of OP_QUERY; see http://www.mongodb.org/display/DOCS/Mongo+Wire+Protocol#MongoWireProtocol-OPQUERY).

pymongo does not yet expose this option, although a similar pull request has been proposed for this very purpose.

Please consider applying this patch once pymongo adds the option to the Cursor class's __init__.

Thx :)
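
For illustration, a minimal sketch of how the bit could be set by hand on a tailable oplog cursor; the Connection-based setup, the names, and the placeholder timestamp are assumptions, not part of this patch, and the exact call depends on the pymongo version (Cursor.add_option where available):

# Hedged sketch, not part of the patch: set the oplogReplay bit manually,
# assuming a pymongo version that exposes Cursor.add_option and the
# tailable/await_data keyword arguments on find().
import pprint
from pymongo import Connection
from bson.timestamp import Timestamp

OPLOG_REPLAY = 1 << 3  # OP_QUERY flag bit 3: oplogReplay

conn = Connection('localhost', 27017)
oplog = conn.local['oplog.rs']

last_ts = Timestamp(0, 0)  # placeholder; normally the last checkpointed timestamp
cursor = oplog.find({'ts': {'$gt': last_ts}}, tailable=True, await_data=True)
cursor = cursor.add_option(OPLOG_REPLAY)

for entry in cursor:
    pprint.pprint(entry)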

poudro and others added 2 commits November 6, 2012 17:50

@gukoff left a comment

It is straight up wrong in so many ways.
Isn't this enough?

q = self._oplog.find(
    spec, tailable=True, await_data=True, oplog_replay=True
)

@@ -1,16 +1,16 @@
-#!/usr/bin/env python
+#!/usr/bin/python

This is wrong

# for name,m in masters.items():
# pprint.pprint(name)
# pprint.pprint(m._coll)
# pprint.pprint('masters output')

There shouldn't be any commented-out code. If you don't use it, remove it.

'''Create a bson.Timestamp. Factored into a method so we can delay
importing socket until gevent.monkey.patch_all() is called. You could
also insert another timestamp function to set options if you so desire.
'''

Here is a clear reason not to remove this method.
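
For context, the pattern that docstring describes looks roughly like the sketch below; the name _timestamp and its body are assumptions, not code from this repository:

# Assumed shape of the helper: any import that (directly or indirectly)
# pulls in socket is deferred to call time, so gevent.monkey.patch_all()
# has already run before it happens.
def _timestamp(self):
    import time
    from bson.timestamp import Timestamp  # deferred import
    return Timestamp(int(time.time()), 0)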

master_uri = self._topology[name_by_id[master['_id']]]['uri']
conn = Connection(master_uri)
coll = conn.local[MMM_DB_NAME]
for cp in coll.find(dict(_id='master_checkpoint')):

Don't use dict() for this: {'_id': 'master_checkpoint'}

    if masters[s._topology[name]['id']] == True:
        masters[s._topology[name]['id']] = s
except:
    True

What does this mean?
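
Presumably the intent is to ignore a missing key; a minimal sketch of a clearer form, assuming KeyError is the only failure expected here:

# Hypothetical rewrite, not from the patch: catch only the expected
# exception and use pass instead of a bare True expression statement.
try:
    if masters[s._topology[name]['id']] == True:
        masters[s._topology[name]['id']] = s
except KeyError:
    pass  # no entry for this id yet; nothing to update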
