Skip to content

Fix: Handle Single-Timestamp Case and Add Test Cases for other functions like intersection of 2 DF` #198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

MAYANK12SHARMA
Copy link

@MAYANK12SHARMA MAYANK12SHARMA commented Mar 3, 2025

Description

This PR updates the find_contiguous_time_periods function to allow for single-timestamp sequences when min_seq_length is set to 1. Previously, the function required min_seq_length to be strictly greater than 1, which prevented valid single-point time periods (e.g., single satellite image selection).

Key Changes:

  • Validation Update: min_seq_length can now be 1.
  • Single-Timestamp Support: If only one timestamp exists and min_seq_length == 1, a valid period is returned with start_dt and end_dt set to the same value.
  • Sequence Length Check: Changed the condition to accept sequences >= min_seq_length, instead of strictly greater.

Fixes #192

How Has This Been Tested?

Tested with various time series scenarios:

  • Single timestamp with min_seq_length = 1
  • Single timestamp with min_seq_length > 1 (raises expected error)
  • Multiple timestamps with and without gaps

Added new test cases to cover these conditions.

  • Single timestamp with min_seq_length = 1
  • Multi-timestamp sequences with valid/invalid gaps
  • Edge cases for exact sequence lengths

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

raise ValueError(
"Only one timestamp found, but min_seq_length > 1. No valid periods.",
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good changes! But there's a ToDo in this same file here that needs to be addressed for some edge cases to work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As i can see in original file, there is a Todo but it is at line 256 in original file and in updated file it is at line 267.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is the same Todo I meant! We have to allow overlap of 1 timestamp between intervals for single timestamp data to work correctly, otherwise some cases won't be covered - e g, when one source's interval ends exactly where the single timestamp of another source lies.

Speaking of edge cases, it would be great if you could add tests to this. I would check that a single timestamp is located correctly, that a t0 can be created when it requires just one timestamp from satellite, and that interval selection works when one of them is a single timestamp and is located inside the other, outside the other, and at the endpoints. The t0 one can be tricky, let me know if you would like me to help with it!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the issue only mentioned the find_contiguous_time_periods function, I initially focused on that one and overlooked the other. I have now addressed the other function as well and added a clearer way to visualize the intersection of the two dataframes. Please review it and let me know if anything is missing.

…ove argument formatting in compute_icon_mean_stddev.py
@MAYANK12SHARMA
Copy link
Author

MAYANK12SHARMA commented Mar 3, 2025

@AUdaltsova , Actually during ruff and linting, i got the problem as
image

so i update the pyprject.toml file can you review and merge now?

…gle-point overlaps and improve intersection logic with detailed documentation and test cases.
@MAYANK12SHARMA MAYANK12SHARMA changed the title Fix: Handle Single-Timestamp Case and Improve Validation for min_seq_length Fix: Handle Single-Timestamp Case and Add Test Cases for other functions` Mar 3, 2025
…clude precise timestamps and enhance coverage for various overlap scenarios.
@MAYANK12SHARMA MAYANK12SHARMA changed the title Fix: Handle Single-Timestamp Case and Add Test Cases for other functions` Fix: Handle Single-Timestamp Case and Add Test Cases for other functions like intersection of 2 DF` Mar 3, 2025
@MAYANK12SHARMA
Copy link
Author

Hi @AUdaltsova, can you take a moment to review it now?

@Sukh-P
Copy link
Member

Sukh-P commented Mar 6, 2025

@MAYANK12SHARMA thanks for having a go at this issue, I just want to flag that since this issue does not have the contributions-welcome tag on it which we are using as stated in our contributions guidelines here this won't be a priority for us to review/implement as we have found in our experience that these sort of issues are more efficient to handle internally, we can try to get back to you when we can but due to this being a busy period at OCF I apologise if there is delay, thanks

@MAYANK12SHARMA
Copy link
Author

MAYANK12SHARMA commented Mar 6, 2025

"Hi @Sukh-P, no worries at all. Whenever you have some time, I’d appreciate it if you could review it. I’ve been contributing to OCF since January. I have worked on ocf proejcts like open_data_pvnet, analysis dashboard, pvsitedatamodel, nowcasting datamodel and more.

I feel it's worth for me to work with this org.

@MAYANK12SHARMA
Copy link
Author

@peterdudfield, @Sukh-P Could you please take a moment to review the PR. It's been 3 weeks.

@AUdaltsova
Copy link
Contributor

Hi @MAYANK12SHARMA, thanks a lot for your patience. We have done some work on this internally, specifically in the tests and the intersection_of_2_dataframes_of_periods() function. Also, the things you have been encountering due to ruff shouldn't be a problem now.

If you still want to work on this PR, the initial changes to min_seq_length handling you did are still very much needed! I think it would be easiest to put just them (I think that's your very first commit) on a new PR, since this one will need commit reverting and for the merge conflicts to be resolved. If you don't want to return to this, that's completely understandable, especially since it's been a while. Either way, I think it's best to close this PR for now - feel free to tag me for a review if you decide to return to this issue! Thank you again for your work.

@AUdaltsova AUdaltsova closed this Apr 10, 2025
@MAYANK12SHARMA
Copy link
Author

Hi, @AUdaltsova i am really sorry for late reply. I will raise the new PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for single timestamp periods during contiguous time periods selection
3 participants