How To Test The Goodness of The Model #389
Comments
Karoline,
A few comments:
1. What river is this? A 15 m mean depth is super deep for any river and suggests the main channel of the Orinoco or something like that. If this depth is really "stage", then it is unsuitable for use in converting volumetric estimates to areal estimates, because you are likely off by a factor of 3 or so.
2. With several months of fitting, it is hard to see the fits on any one day and spot any problems there.
3. A great way to diagnose fitting problems is to plot ER as a function of K600 (see the sketch below). If they strongly covary, then the model has high equifinality, and thus, despite good fits, you will have high uncertainty in metabolism for any one day. This is common as K600 increases or GPP decreases.
Bob
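[Editor's note: a minimal sketch of the checks in points 2 and 3, assuming `fit` is the model object returned by streamMetabolizer's `metab()`. The daily columns `GPP.daily`, `ER.daily`, and `K600.daily` come from `get_params()`; the example dates are placeholders.]

```r
library(streamMetabolizer)
library(ggplot2)

# `fit` is assumed to be the fitted model returned by metab(bayes_specs, data = dat)
daily <- get_params(fit)   # daily GPP.daily, ER.daily, K600.daily estimates

# Point 3: do ER and K600 covary strongly across days?
cor(daily$ER.daily, daily$K600.daily, use = "complete.obs")

# Scatterplot of ER vs. K600; a tight trend suggests high equifinality
ggplot(daily, aes(x = K600.daily, y = ER.daily)) +
  geom_point() +
  labs(x = "K600 (1/d)", y = "ER (g O2 m^-2 d^-1)")

# Point 2: zoom the DO fit in on a couple of days to inspect misfit
# (the dates below are placeholders; substitute days from your own record)
do_preds <- predict_DO(fit)
plot_DO_preds(subset(do_preds, date %in% as.Date(c("2019-07-01", "2019-07-02"))))
```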
On May 8, 2020, at 5:27 PM, Karoline Qasem <[email protected]> wrote:
Hello everyone,
I created a Bayesian model in streamMetabolizer, and this is what my input data looks like:
[Depth_Temp_PAR]: https://user-images.githubusercontent.com/36547773/81456633-adf6be80-9158-11ea-90ec-8e67301a7bf7.png
These are the model specifications I used:
Model specifications:
model_name b_Kn_oipi_tr_plrckm.stan
engine stan
split_dates FALSE
keep_mcmcs TRUE
keep_mcmc_data TRUE
day_start 4
day_end 28
day_tests full_day, even_timesteps, complete_data, ...
required_timestep NA
GPP_daily_mu 3
GPP_daily_lower 0
GPP_daily_sigma 2
ER_daily_mu -7.1
ER_daily_upper 0
ER_daily_sigma 7.1
K600_daily_meanlog_meanlog 2.484906649788
K600_daily_meanlog_sdlog 1.32
K600_daily_sdlog_sigma 0.05
err_obs_iid_sigma_scale 0.03
err_proc_iid_sigma_scale 5
params_in GPP_daily_mu, GPP_daily_lower, GPP_daily_...
params_out GPP, ER, DO_R2, GPP_daily, ER_daily, K600...
n_chains 4
n_cores 1
burnin_steps 100
saved_steps 200
thin_steps 1
verbose FALSE
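[Editor's note: a spec list like the one above is typically assembled with `specs()` and passed to `metab()`. The sketch below is hedged: `dat` is a placeholder for the prepared input data frame (solar.time, DO.obs, DO.sat, depth, temp.water, light), and the numeric values are simply copied from the listing.]

```r
library(streamMetabolizer)

# Sketch of how the specifications above would be assembled and run.
bayes_specs <- specs(
  model_name = "b_Kn_oipi_tr_plrckm.stan",
  day_start = 4, day_end = 28,
  GPP_daily_mu = 3, GPP_daily_lower = 0, GPP_daily_sigma = 2,
  ER_daily_mu = -7.1, ER_daily_upper = 0, ER_daily_sigma = 7.1,
  K600_daily_meanlog_meanlog = 2.484906649788,
  K600_daily_meanlog_sdlog = 1.32,
  K600_daily_sdlog_sigma = 0.05,
  err_obs_iid_sigma_scale = 0.03,
  err_proc_iid_sigma_scale = 5,
  n_chains = 4, n_cores = 1,
  burnin_steps = 100, saved_steps = 200
)
fit <- metab(bayes_specs, data = dat)  # `dat` is the prepared input data frame
```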
This is what the DO predictions look like:
[DO]: https://user-images.githubusercontent.com/36547773/81456751-37a68c00-9159-11ea-879f-b2f80b48e147.png
I am not sure how to test the goodness of the model. I need to improve the predictions and know when the model results are acceptable.
The tutorials do not seem to include an explanation (I might have missed it) of how to test model performance.
What do I need to do next? How can I further optimize the model and improve the results?
Please let me know if there is another area where I should post such questions. I am new here.
Thank you very much.
Session information
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rstan_2.19.3 StanHeaders_2.21.0-1
[3] unitted_0.2.9 ggplot2_3.3.0
[5] tidyr_1.0.2 dplyr_0.8.5
[7] streamMetabolizer_0.11.4
loaded via a namespace (and not attached):
[1] deSolve_1.28 tidyselect_1.0.0 purrr_0.3.4
[4] colorspace_1.4-1 vctrs_0.2.4 generics_0.0.2
[7] stats4_3.6.0 loo_2.2.0 utf8_1.1.4
[10] rlang_0.4.6 pkgbuild_1.0.7 pillar_1.4.3
[13] glue_1.4.0 withr_2.2.0 LakeMetabolizer_1.5.0
[16] readxl_1.3.1 matrixStats_0.56.0 lifecycle_0.2.0
[19] plyr_1.8.6 munsell_0.5.0 gtable_0.3.0
[22] cellranger_1.1.0 codetools_0.2-16 labeling_0.3
[25] inline_0.3.15 callr_3.4.3 ps_1.3.2
[28] parallel_3.6.0 fansi_0.4.1 Rcpp_1.0.4.6
[31] scales_1.1.0 farver_2.0.3 gridExtra_2.3
[34] rLakeAnalyzer_1.11.4.1 digest_0.6.25 processx_3.4.2
[37] grid_3.6.0 cli_2.0.2 tools_3.6.0
[40] magrittr_1.5 lazyeval_0.2.2 tibble_3.0.1
[43] crayon_1.3.4 pkgconfig_2.0.3 ellipsis_0.3.0
[46] prettyunits_1.1.1 lubridate_1.7.8 assertthat_0.2.1
[49] rstudioapi_0.11 R6_2.4.1 compiler_3.6.0
Thanks a lot for your feedback, your time, and your thoughts. Karoline
Hi Karoline and Bob,
Thanks for your questions and answers. I'd like to follow up on using the ER-K600 correlation as a metric to assess the quality of model performance.
Bob mentioned that there is high equifinality when ER and K600 strongly covary. I'm wondering if there is any standard for this correlation? For example, if the correlation between ER and K600 exceeds some value, can we be xx% sure there is equifinality and should reject the results?
Thanks,
Tzu-Yao
Tzu-Yao,
We have no threshold for what a high vs. low ER-K correlation is. It will depend on your question. If your sole interest is, say, a seasonal mean estimate of ER, then a high correlation might be OK because the errors will average out. We did just this in Madinger and Hall 2019 L&O Lett., where we had a horribly high correlation between ER and K. If one is interested in the controls on ER, then I suggest the correlation needs to be low.
For example: let's say you want to know what controls ER, e.g. GPP, temperature, time since flood, and many others. If ER covaries with K at r = 0.8, then under a linear regression r^2 = 0.64, which is to say 64% of the daily variation in ER comes from not knowing K, and 36% comes from your predictor variables plus process error. That seems like a bad time series of ER for that sort of analysis, when most of the variation can be ascribed to an artifact of the analysis. And because GPP and K covary, it will not be possible to tease out the effect of GPP (a known control on ER) if ER and K covary. Where that threshold lies is user dependent, and the reviewers will have the last say.
Bob
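[Editor's note: a small sketch of the calculation Bob describes, assuming `daily` holds the per-day estimates from `get_params(fit)` as in the earlier example.]

```r
# Share of day-to-day ER variation attributable to not knowing K,
# per Bob's back-of-envelope reasoning (r = 0.8 -> r^2 = 0.64).
r <- cor(daily$ER.daily, daily$K600.daily, use = "complete.obs")
r^2        # fraction of daily ER variance tied to the ER-K600 covariation
1 - r^2    # fraction left for real predictors plus process error
```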
Hi Bob,
Thanks for your insights! You raised an example of analyzing controls on ER when ER covaries with K. I was just wondering: if I want to know what controls K (e.g. flow speed, turbulence, water depth, temperature, ...), will I encounter the same problem you described for analyzing controls on ER when ER strongly covaries with K?
Thanks,
Tzu-Yao
If ER and K covary, then the cause of that variance is simply model fitting, i.e., there won't be much ability to explain variation in K or ER. And yes, you won't be able to analyze controls on ER either, because the main cause of variation is simply fitting error and nothing biological.
In theory for a stream, K does not vary outside of some idiosyncratic effect with discharge. In a big river that may not be true since wind can vary and strongly drive K.
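[Editor's note: if you do want to look at controls on K, a minimal sketch under the same caveat. `daily` is from `get_params(fit)` as above, and `discharge_daily` is a hypothetical data frame with `date` and `Q` columns.]

```r
# Hypothetical check of whether daily K600 simply tracks discharge,
# keeping Bob's caveat in mind: if ER and K600 covary strongly,
# much of this "signal" may just be fitting error.
k_vs_q <- merge(daily[, c("date", "K600.daily")], discharge_daily, by = "date")
summary(lm(K600.daily ~ log(Q), data = k_vs_q))
plot(K600.daily ~ log(Q), data = k_vs_q,
     xlab = "log(discharge)", ylab = "K600 (1/d)")
```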
Thanks Bob. Your explanation really cleared things up!