-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better rate limited RESTClient #452
Conversation
My review is in progress 📖 - I will have feedback for you in a few minutes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reviewed your code and found 4 potential issues. To discuss my individual comments that I have added, tag me in replies using @korbit-ai.
Please react with a 👍 to my comments that you find helpful and a 👎 to those you find unhelpful - this will help me learn and improve as we collaborate.
tests/backtest/test_polygon.py
Outdated
@@ -18,6 +18,7 @@ | |||
# Global parameters | |||
# API Key for testing Polygon.io | |||
POLYGON_API_KEY = os.environ.get("POLYGON_API_KEY") | |||
POLYGON_IS_PAID_SUBSCRIPTION = os.environ.get("POLYGON_IS_PAID_SUBSCRIPTION") or True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The environment variable 'POLYGON_IS_PAID_SUBSCRIPTION' is used as a boolean, but environment variables are always strings. This could lead to unexpected behavior if the environment variable is set to a string that is not 'True' or 'False'. Please convert the environment variable to a boolean by comparing the string to 'True' and 'False' and assigning the boolean value accordingly.
lumibot/tools/polygon_helper.py
Outdated
def _get(self, *args, **kwargs): | ||
if not self.paid: | ||
print( | ||
f"\nSleeping {self.seconds} seconds while getting data from Polygon because " | ||
f"we don't want to hit the rate limit. IT IS NORMAL FOR THIS TEXT TO SHOW UP SEVERAL TIMES " | ||
"and IT MAY TAKE UP TO 10 MINUTES PER ASSET while we download all the data from Polygon. The next " | ||
"time you run this it should be faster because the data will be cached to your machine. \n" | ||
"If you want this to go faster, you can get a paid Polygon subscription at https://polygon.io/pricing " | ||
f"and set `polygon_has_paid_subscription=True` when starting the backtest.\n" | ||
) | ||
time.sleep(self.seconds) | ||
return super()._get(*args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _get
method in the RLRESTClient
class currently uses a print statement to warn about sleeping due to rate limits. This could lead to a cluttered console output and is not easily manageable. Consider using the logging
module to output this message, which would allow for better control over when and how this message is displayed. Additionally, ensure that this message is only displayed when necessary to avoid repetition.
lumibot/tools/polygon_helper.py
Outdated
@@ -92,8 +89,8 @@ def get_price_data_from_polygon( | |||
# print(f"\nGetting pricing data for {asset} / {quote_asset} with '{timespan}' timespan from Polygon...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lumibot/tools/polygon_helper.py
Outdated
@@ -480,3 +462,23 @@ def update_polygon_data(df_all, result): | |||
df_all = df_all[~df_all.index.duplicated(keep="first")] # Remove any duplicate rows | |||
|
|||
return df_all | |||
|
|||
class RLRESTClient(RESTClient): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Canx This is great! Let me know if it's ready for review/merging |
…to polygon_refactor
lumibot/tools/polygon_helper.py
Outdated
""" | ||
Initialize the PolygonClient with rate limiting. | ||
""" | ||
self.seconds = 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually better to give these kind constants names at the top of the so folks are aware is used in this file rather than having to track it down.
Also the simplicity here is nice but we still have the problem of concurrency to deal with because of these pools. You could have a global var that holds the timestamp of the last access.
with RLRC_LOCK:
now = precise clock timestamp
elapsed_time = now - RLRC_LAST_TIME
RLRC_LAST_TIME = now
if elapsed_time < RLRC_MIN_ELAPSED:
sleep RLRC_MIN_ELAPSED - elapsed_time
Of you could slip the finessing and skip the timestamp and just put sleep inside the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your right! Made a WAIT_SECONDS constant and now init is not needed.
Ok, I need to understand more about this locking problem... At first try chatgpt is proposing to use a semaphore. I'll see what I can do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also thinking of doing a singleton, in tests I've seen more than one RESTClient instance and maybe that could be a problem...
I'm going to close this PR as I created the same one from the repo #455 |
With the new rate limited RESTClient class (PolygonClient) all current and future polygon API calls will be rate limited if needed. This was not done before for list_option_contracts, for example.
This also enables local testing with a free polygon key (works but slow). For local testing with free account you have to add an environment variable called POLYGON_IS_PAID_SUBSCRIPTION=False