Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Introducing local sensitivity analysis #575

Open
wants to merge 16 commits into
base: develop
Choose a base branch
from

Conversation

Lucas-Prates
Copy link
Contributor

@Lucas-Prates Lucas-Prates commented Mar 11, 2024

Pull request type

  • Other (please describe): This PR is a draft to implement sensitivity analysis in RocketPy. It is a work in progress and needs validation.

Checklist

  • Lint (black rocketpy/ tests/) has passed locally

Current behavior

The Sensitivity Analysis notebook teaches the users how to perform the simulations, plot the distribution
of some flight variables (e.g. apogee), and computes the prediction ellipses for the landing point.

New behavior

Our goal is to take sensitivity analysis even further. Briefly, we attempt to answer the following question: Which parameters would reduce the variability of the variable of interest (e.g. apogee) the most if we measured them with greater precision?

To that end, a bit of theory is developed, check the technical document. What was developed resembles the work of [1], a core reference in sensitivity analysis for engineering. His approach is a global sensitivity analysis with a full model containing interaction terms. Our first implementation considers a local sensitivity analysis using only first-order terms.

A quick and dirty test of the functionality of the SensitivityModel class is provided the "sensitivity_model_usage" notebook. This notebook is currently giving weird results! The linear approximations for the variables are, for some reason I still have to figure out, not good enough. This was not happening at previous experimentations that suggested that this approached worked. I have to look carefully at what is happening, but I did not want to delay the PR.

The concepts are discussed in-depth in the "sensitivity_analysis_parameter_importance" notebook (the notebook was not updated to the new SensitivityModel yet!)

Breaking change

  • No

Additional information

Technical Document

[1] Sobol, Ilya M. "Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates." Mathematics and computers in simulation

Copy link

codecov bot commented Mar 11, 2024

Codecov Report

Attention: Patch coverage is 7.25191% with 243 lines in your changes are missing coverage. Please review.

Project coverage is 72.01%. Comparing base (209434f) to head (c91a50e).
Report is 1 commits behind head on develop.

Files Patch % Lines
rocketpy/sensitivity/sensivity_model.py 7.61% 194 Missing ⚠️
rocketpy/tools.py 3.92% 49 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #575      +/-   ##
===========================================
- Coverage    73.60%   72.01%   -1.60%     
===========================================
  Files           70       72       +2     
  Lines        10310    10572     +262     
===========================================
+ Hits          7589     7613      +24     
- Misses        2721     2959     +238     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Gui-FernandesBR Gui-FernandesBR marked this pull request as draft April 4, 2024 04:33
@Lucas-Prates Lucas-Prates changed the base branch from develop to enh/class_dispersion May 14, 2024 21:46
@Lucas-Prates Lucas-Prates added Enhancement New feature or request, including adjustments in current codes Monte Carlo Monte Carlo and related contents labels May 14, 2024
@Lucas-Prates Lucas-Prates changed the title ENH: (DRAFT) Introducing Parameter Importance in Sensitivity Analysis ENH: Introducing local sensitivity analysis May 14, 2024
@Gui-FernandesBR Gui-FernandesBR marked this pull request as ready for review May 15, 2024 13:14
@Gui-FernandesBR Gui-FernandesBR added this to the Release v1.X.0 milestone May 15, 2024
@Gui-FernandesBR Gui-FernandesBR linked an issue May 15, 2024 that may be closed by this pull request
@Gui-FernandesBR
Copy link
Member

Tests are not passing

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/runner/work/RocketPy/RocketPy/rocketpy/__init__.py", line 2, in <module>
    from .environment import Environment, EnvironmentAnalysis
  File "/home/runner/work/RocketPy/RocketPy/rocketpy/environment/__init__.py", line 1, in <module>
    from .environment import Environment
  File "/home/runner/work/RocketPy/RocketPy/rocketpy/environment/environment.py", line [16](https://github.com/RocketPy-Team/RocketPy/actions/runs/9096367936/job/25001616031?pr=575#step:6:17), in <module>
    from ..tools import exponential_backoff
  File "/home/runner/work/RocketPy/RocketPy/rocketpy/tools.py", line 556, in <module>
    parameters_list: list[str],
TypeError: 'type' object is not subscriptable
Error: Process completed with exit code 1.

Could you fix it before our review, please? That would help us. @Lucas-Prates

@Lucas-Prates
Copy link
Contributor Author

Lucas-Prates commented May 15, 2024

Could you fix it before our review, please? That would help us. @Lucas-Prates

Sure, I will fix it briefly. This simplified type hinting started at python 3.9. I will make sure the tests pass this time. :P

@Gui-FernandesBR
Copy link
Member

Please be aware of #444, we are not supporting type hinting or annotations yet.

Base automatically changed from enh/class_dispersion to develop May 21, 2024 22:52
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding a setup.py file?

We are using the pyproject.toml file now. We no longer support the setup.py

Comment on lines +562 to +579
Parameters
----------
input_filename : str
Input file exported by MonteCarlo class. Each line is a
sample unit described by a dictionary where keys are parameters names
and the values are the sampled parameters values.

output_filename : str
Output file exported by MonteCarlo.simulate function. Each line is a
sample unit described by a dictionary where keys are target variables
names and the values are the obtained values from the flight simulation.

parameters_list : list[str]
List of parameters whose values will be extracted.

target_variables_list : list[str]
List of target variables whose values will be extracted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Parameters
----------
input_filename : str
Input file exported by MonteCarlo class. Each line is a
sample unit described by a dictionary where keys are parameters names
and the values are the sampled parameters values.
output_filename : str
Output file exported by MonteCarlo.simulate function. Each line is a
sample unit described by a dictionary where keys are target variables
names and the values are the obtained values from the flight simulation.
parameters_list : list[str]
List of parameters whose values will be extracted.
target_variables_list : list[str]
List of target variables whose values will be extracted.
Parameters
----------
input_filename : str
Input file exported by MonteCarlo class. Each line is a
sample unit described by a dictionary where keys are parameters names
and the values are the sampled parameters values.
output_filename : str
Output file exported by MonteCarlo.simulate function. Each line is a
sample unit described by a dictionary where keys are target variables
names and the values are the obtained values from the flight simulation.
parameters_list : list[str]
List of parameters whose values will be extracted.
target_variables_list : list[str]
List of target variables whose values will be extracted.

Can't have these line skips here. Breaks docs pages

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also happens in a lot of the other functions in this PR

Comment on lines +559 to +560
"""Reads MonteCarlo simulation data file and builds parameters and flight
variables matrices from specified
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"from specified..."

Docs not finished here

Comment on lines +580 to +589
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.

target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo! containing

I recoomend the installation of this extension: https://marketplace.visualstudio.com/items?itemName=streetsidesoftware.code-spell-checker

Suggested change
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.
Returns
-------
parameters_matrix: np.matrix
Numpy matrix containing input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix containing target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.

Comment on lines +25 to +29
def __init__(
self,
parameters_names,
target_variables_names,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs docs

Comment on lines +192 to +197
def fit(
self,
parameters_matrix,
target_data,
):
"""Fits sensitivity model
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any use case where you would define a SensitivityModel and not call .fit() immediately afterward?

This seems like something that could/should be inside __init__

Comment on lines +65 to +81
def set_parameters_nominal(
self,
parameters_nominal_mean,
parameters_nominal_sd,
):
"""Set parameters nominal mean and standard deviation

Parameters
----------
parameters_nominal_mean : np.array
An array contaning the nominal mean for parameters in the
order specified in parameters names at initialization
parameters_nominal_sd : np.array
An array contaning the nominal standard deviation for
parameters in the order specified in parameters names at
initialization
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So do you have to set the mean and sd simultaneously?

Also, do you have to set mean and sd of all parameters? Setting for just some of them does not work?

Another thing, to run a Monte Carlo sim, the mean and sd is already given in the Monte Carlo class right? So it would be natural to get them from there automatically

Comment on lines +140 to +143
if parameters_matrix.shape[1] != self.n_parameters:
raise ValueError(
"Number of columns (parameters) does not match number of parameters passed at initialization."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hasn't this already been checked if there is a parameters_matrix ?

Copy link
Member

@Gui-FernandesBR Gui-FernandesBR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a partial review. I could not run your codes to validate it is working, but I suggested a few changes that might improve the quality (i.e. readability) of the code.

@@ -19,6 +19,7 @@
from cftime import num2pydate
from matplotlib.patches import Ellipse
from packaging import version as packaging_version
import json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to run isort before pushing to origin. You can you the command make isort in the root directory. THis will ensure the imports are conventionally organized.

Comment on lines +580 to +589
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.

target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo! containing

I recoomend the installation of this extension: https://marketplace.visualstudio.com/items?itemName=streetsidesoftware.code-spell-checker

Suggested change
Returns
-------
parameters_matrix: np.matrix
Numpy matrix contaning input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix contaning target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.
Returns
-------
parameters_matrix: np.matrix
Numpy matrix containing input parameters values. Each column correspond
to a parameter in the same order specified by 'parameters_list' input.
target_variables_matrix: np.matrix
Numpy matrix containing target variables values. Each column correspond
to a target variable in the same order specified by 'target_variables_list'
input.

Comment on lines +655 to +661
for i in range(n_parameters):
parameter = parameters_list[i]
parameters_matrix[:, i] = parameters_samples[parameter]

for i in range(n_variables):
target_variable = target_variables_list[i]
target_variables_matrix[:, i] = target_variables_samples[target_variable]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using enumerate is a more pythonic and readable solution.

Suggested change
for i in range(n_parameters):
parameter = parameters_list[i]
parameters_matrix[:, i] = parameters_samples[parameter]
for i in range(n_variables):
target_variable = target_variables_list[i]
target_variables_matrix[:, i] = target_variables_samples[target_variable]
for i, parameter in enumerate(parameters_list):
parameters_matrix[:, i] = parameters_samples[parameter]
for i, target_variable in enumerate(target_variables_list):
target_variables_matrix[:, i] = target_variables_samples[target_variable]


if number_of_samples_parameters != number_of_samples_variables:
raise ValueError(
"Number of samples for parameters does not match the number of samples for target variables!"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is too long (over 88 columns), please break it into 2 lines.

Suggested change
"Number of samples for parameters does not match the number of samples for target variables!"
"Number of samples for parameters does not match the "
"number of samples for target variables!"

Comment on lines +638 to +641
except Exception:
raise Exception(
f"Variable {variable} was not found in {output_filename}!"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your are using a broad exception catch although you are printing a message that is more related to a particular case of exceptions: the KeyError.

Suggested change
except Exception:
raise Exception(
f"Variable {variable} was not found in {output_filename}!"
)
except KeyError as e:
raise KeyError(
f"Variable {variable} was not found in {output_filename}!"
) from e

Comment on lines +594 to +612
# Auxiliary function that unnests dictionary
def unnest_dict(x):
new_dict = {}
for key, value in x.items():
# the nested dictionary is inside a list
if isinstance(x[key], list):
# sometimes the object inside the list is another list
# we must skip these cases
if isinstance(value[0], dict):
inner_dict = unnest_dict(value[0])
inner_dict = {
key + "_" + inner_key: inner_value
for inner_key, inner_value in inner_dict.items()
}
new_dict.update(inner_dict)
else:
new_dict.update({key: value})

return new_dict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would define this function outside the load_monte_carlo_data() function, this would allow us to re-use this function in other contexts.

Also, the term flatten is usually used to describe this kid of "parse a nested dictionary" operation. Maybe it would be a good alternative of name.

@@ -49,3 +49,5 @@
StochasticTail,
StochasticTrapezoidalFins,
)
from .sensitivity import SensitivityModel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run isort


References
----------
[1] Sobol, Ilya M. "Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates." Mathematics and computers in simulation 55.1-3 (2001): 271-280.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line too long... You can use the \ sign to continue the text in the next line whithout breaking it. Like this:

"something here
continues here, do you see?"

"""
if len(parameters_nominal_mean) != self.n_parameters:
raise ValueError(
"Nominal mean array length does not match number of parameters passed at initilization."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo fix. And line is too long.

Suggested change
"Nominal mean array length does not match number of parameters passed at initilization."
"Nominal mean array length does not match number of parameters passed at initialization."

(same thing happens in a few lines below)

Comment on lines +125 to +126
return

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probally remove this return and the code you keep working.

Suggested change
return

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request, including adjustments in current codes Monte Carlo Monte Carlo and related contents
Projects
Status: Backlog
Development

Successfully merging this pull request may close these issues.

Sensitivity Analysis on Monte Carlo Simulations
4 participants