Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sol pipelines #387

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

Sol pipelines #387

wants to merge 15 commits into from

Conversation

MihirT906
Copy link

@MihirT906 MihirT906 commented Feb 16, 2023

Resolve #355

@CLAassistant
Copy link

CLAassistant commented Feb 16, 2023

CLA assistant check
All committers have signed the CLA.

Copy link
Collaborator

@sarahmish sarahmish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR @MihirT906!
Couple of small things need to be fixed (mainly docstrings)

Please remove the changes of the following files:

  • 2' (I think it was added by mistake)
  • Makefile
  • tests/readme_test/README.md
  • tests/readme_test/README_evaluate.md

Comment on lines 12 to 23
"""
The function checks if the dataset being used is valid i.e has a length greater than 0 and contains the dimension required
Args:
X (ndarray):
N-dimensional value sequence to iterate over
dim (int):
Integer indicating the dimension number for a multi-dimensional dataset
Returns:
ndarray:
Returns an nd array that contains a dataset with 2 columns ['timestamp', 'value']

"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of things to keep in mind when writing docstrings. The most important thing is to be consistent with the existing docstrings which should look something like this

"""Short description in a single line ending with a dot.

Longer description that can span across multiple lines. Longer
description that can span across multiple lines. Longer description
that can span across multiple lines.

Args:
    arg_name (arg_type):
        argument description.
    arg_name (arg_type):
        Argument description that spans across multiple lines. Argument
        description that spans across multiple lines. Argument description
        that spans across multiple lines.

Returns:
    return_type:
        description of the returned objects

Based on the current code, what needs to be fixed is:

  • the first sentence is always short
  • if the line is longer than 100 characters, break it over multiple lines
  • add space between each section (Args:, and Returns:)
  • lastly remove unnecessary white space (line 22)

Applying all these changes would look something like:

"""Validate data dimensions.

The function checks if the dataset being used is valid i.e has a length 
greater than 0 and contains the dimension required.

Args:
    X (ndarray): 
        N-dimensional value sequence to iterate over.
    dim (int):
        Integer indicating the dimension number for a multi-dimensional dataset.

Returns:
    ndarray:
        An array that contains a dataset with 2 columns ['timestamp', 'value'].
"""

This applies to all the docstrings


def diff_thres(X, thres = "0.1", op = ">"):
"""
The function detects anomalies that are flagged through moving standard deviation thresholding
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the description here is inaccurate


def thresholding(X, thres, op):
"""
The function detects anomalies that are flagged through moving standard deviation thresholding
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly here, I think it needs to be updated

@@ -0,0 +1,31 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed build_anomaly_intervals is not being used. Is there a reason for adding it to the PR? (similar comment applies to the function, if it is not part of the pipelines, we can remove it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SOL Pipelines
3 participants