Implement Package Aware Quarto Cache #392

Merged
merged 9 commits into from
Mar 19, 2025

@michaelwalshe michaelwalshe commented Mar 13, 2025

Fixes #391

After it was suggested in the monthly CAMIS meeting, I had some free time this afternoon so thought I'd take a stab at setting up something to make it possible for us to use quarto caching.

Uses a pre-render R script (i.e. one that runs automatically before anything else when running quarto render) alongside freeze: auto to set up caching of rendered qmd files, which will be re-rendered if the packages they use are updated. The script checks and stores dependency information for each quarto file, and quarto itself handles re-rendering files whose source changes. If a file's package info has changed, the script deletes the cached folder for that quarto doc. We currently don't have a location for scripts etc., so the script lives at R/quarto_check_pkg_dependencies.R and writes its output to data/quarto_pkg_dependencies.csv.
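
The script in this PR is written in R and is not reproduced in this thread. As a rough illustration of the invalidation step only, the logic could be sketched in Python as below. The CSV column names (source, package, fingerprint) and every function name are assumptions invented for this sketch; they are not the real layout of data/quarto_pkg_dependencies.csv or the real script's API.

```python
import csv
import shutil
from pathlib import Path


def load_fingerprints(csv_path):
    """Read {source file: {package: fingerprint}} from a CSV.

    Assumed (hypothetical) columns: source, package, fingerprint.
    A missing file is treated as "no previous run".
    """
    records = {}
    path = Path(csv_path)
    if not path.exists():
        return records
    with path.open(newline="") as fh:
        for row in csv.DictReader(fh):
            records.setdefault(row["source"], {})[row["package"]] = row["fingerprint"]
    return records


def stale_sources(previous, current):
    """Return sources whose package fingerprints differ from the previous run.

    Sources that were never recorded before are skipped: they have no
    cache entry to invalidate yet.
    """
    return sorted(
        src for src, deps in current.items()
        if src in previous and previous[src] != deps
    )


def freeze_dir_for(source, project_root):
    # e.g. R/ci_for_prop.qmd -> <root>/_freeze/R/ci_for_prop
    return Path(project_root) / "_freeze" / Path(source).with_suffix("")


def invalidate(previous, current, project_root):
    """Delete the _freeze folder of every stale source; return removed paths."""
    removed = []
    for src in stale_sources(previous, current):
        target = freeze_dir_for(src, project_root)
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Only the cache folder is removed here; deciding what to re-render is left to quarto, which treats a document with no _freeze entry as needing a fresh render.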

Some more info:

  • This uses renv::dependencies to check which R packages a qmd file uses. In some cases that could be inaccurate, but it is typically quite good, and given that we recommend packages are explicitly loaded using library() in a comparison document, this should be fine.
  • Base/recommended packages (e.g. {stats}) do not have entries in renv.lock, so for those the script saves the version of R instead of a package hash.
  • renv can't currently check which Python packages are used in a qmd file, so I've set it up so that if any Python package is updated, all QMD files using Python are re-rendered.
  • Additionally, the info stored in requirements.txt is just the package version, not a hash, so that's all that is saved for Python packages.
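
To make the rules in the list above concrete, here is a small Python sketch of how the stored value for each package could be resolved. It assumes the usual JSON layout of renv.lock (a top-level "R" object with a "Version", and a "Packages" object whose entries carry a "Hash") and name==version pins in requirements.txt. The function names and the "R-" prefix are made up for this example, and the real R script may do this differently.

```python
import json
import re


def r_fingerprints(lock_text, packages):
    """Map each R package to its renv.lock hash.

    Packages without a lockfile entry (or without a Hash field) fall back
    to the R version, mirroring the rule for base/recommended packages.
    """
    lock = json.loads(lock_text)
    r_version = lock["R"]["Version"]
    locked = lock.get("Packages", {})
    result = {}
    for pkg in packages:
        entry = locked.get(pkg)
        if entry and "Hash" in entry:
            result[pkg] = entry["Hash"]
        else:
            result[pkg] = "R-" + r_version  # hypothetical fallback format
    return result


def python_fingerprints(requirements_text):
    """Map each pinned package (name==version) to its version string.

    Unpinned or range-constrained lines are ignored in this sketch.
    """
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        match = re.fullmatch(r"([A-Za-z0-9._-]+)\s*==\s*(\S+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins
```

Because Python usage can't be resolved per file, a change anywhere in the result of python_fingerprints would mark every Python-using QMD file as stale, whereas the R values can be compared file by file.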

I've also committed the _freeze cache. It won't be updated by GitHub Actions (though that may be possible to figure out); instead it will be updated by each PR. As most users only render a single file, someone may need to re-render the whole site and push a new _freeze cache every so often, if the GH Actions runs stop using the cache for some files and start taking longer again.

Apologies for the huge PR. The only major changes are a new R file, the CSV, and an update to renv.lock, which was missing rstan and causing an error in the build. The rest of the diffs should just be the _freeze cache.

… of the search, check against a previous cache, and delete cached entries in _freeze if dependencies have changed. Also add this script to pre-render step, and set freeze: auto in quarto config
@michaelwalshe

One option for setting up the GitHub Action to push the rendered _freeze directory back to the main branch is https://github.com/stefanzweifel/git-auto-commit-action. We'd have to set this up carefully, though, to make sure we can't accidentally break the repository on a bad render.

@statasaurus

So I haven't looked at this yet, but can you check something for me? If a dependent package changes, will the file re-run?

@michaelwalshe

Yes! That's the idea behind the quarto_check_pkg_dependencies.R script. For example, if we render with everything as normal, you get something like:

Quarto Render No Package Updates
michael@KD-C138GL3:~/source/CAMIS$ quarto render
R/quarto_check_pkg_dependencies.R
here() starts at /home/michael/source/CAMIS
Checking package versions against previous executions...
Finding R package dependencies ... [162/162] Done!
Saving current package versions to data/quarto_pkg_dependencies.csv

[  1/166] minutes/posts/9Dec2024.qmd
[  2/166] minutes/posts/17apr2023.qmd
[  3/166] minutes/posts/21Aug2023.qmd
[  4/166] minutes/posts/9oct2023.qmd
[  5/166] minutes/posts/15July2024.qmd
[  6/166] minutes/posts/11Sept2023.qmd
[  7/166] minutes/posts/13May2024.qmd
[  8/166] minutes/posts/19June2023.qmd
[  9/166] minutes/posts/20Nov2023.qmd
[ 10/166] minutes/posts/12Feb2024.qmd
[ 11/166] minutes/posts/10June2024.qmd
[ 12/166] minutes/posts/10Oct2024.qmd
[ 13/166] minutes/posts/10Feb2025.qmd
[ 14/166] minutes/posts/8Jan2024.qmd
[ 15/166] minutes/posts/10July2023.qmd
[ 16/166] minutes/posts/03Jan2025.qmd
[ 17/166] minutes/posts/15May2023.qmd
[ 18/166] minutes/posts/13Feb2023.Rmd
[ 19/166] minutes/posts/12Aug2024.qmd
[ 20/166] minutes/posts/11Mar2024.qmd
[ 21/166] minutes/posts/13mar2023.qmd
[ 22/166] minutes/posts/10Mar2025.qmd
[ 23/166] minutes/posts/23Jan2023.Rmd
[ 24/166] minutes/posts/9sept2024.qmd
[ 25/166] minutes/posts/12Dec2022.qmd
[ 26/166] minutes/posts/8Apr2024.qmd
[ 27/166] minutes/index.qmd
WARN: Unable to create a feed as the required `site-url` property is missing from this project.
[ 28/166] R/Weighted-log-rank.qmd
[ 29/166] R/ci_for_prop.qmd
[ 30/166] R/ttest_2Sample.qmd
[ 31/166] R/association.qmd
[ 32/166] R/survival_cif.qmd
[ 33/166] R/mi_mar_predictive_mean_match.qmd
[ 34/166] R/R_Friedmantest.qmd
[ 35/166] R/mcnemar.qmd
[ 36/166] R/Accelerated_Failure_time_model.qmd
[ 37/166] R/xgboost.qmd
[ 38/166] R/ttest_1Sample.qmd
[ 39/166] R/sample_size_average_bioequivalence.qmd
[ 40/166] R/ancova.qmd
[ 41/166] R/anova.qmd
[ 42/166] R/summary-stats.qmd
[ 43/166] R/linear-regression.qmd
[ 44/166] R/survival.qmd
[ 45/166] R/survival_csh.qmd
[ 46/166] R/survey-stats-summary.qmd
[ 47/166] R/kruskal_wallis.qmd
[ 48/166] R/nparestimate.qmd
[ 49/166] R/jonckheere.qmd
[ 50/166] R/PCA_analysis.qmd
[ 51/166] R/summary_skew_kurt.qmd
[ 52/166] R/marginal_homogeneity_tests.qmd
[ 53/166] R/binomial_test.qmd
[ 54/166] R/nonpara_wilcoxon_ranksum.qmd
[ 55/166] R/mi_mar_regression.qmd
[ 56/166] R/manova.qmd
WARN: Unable to resolve link target: contribution.qmd
[ 57/166] R/gsd-tte.qmd
[ 58/166] R/count_data_regression.qmd
[ 59/166] R/tobit regression.qmd
[ 60/166] R/ttest_Paired.qmd
[ 61/166] R/cmh.qmd
[ 62/166] R/logistic_regr.qmd
[ 63/166] R/rbmi_continuous_joint.qmd
[ 64/166] R/correlation.qmd
[ 65/166] R/wilcoxonsr_hodges_lehman.qmd
[ 66/166] R/mmrm.qmd
[ 67/166] R/rounding.qmd
[ 68/166] blogs/posts/202312_highlights_blog.qmd
[ 69/166] blogs/posts/202403_phuseUS2024.qmd
[ 70/166] blogs/posts/202305_introduction_to_CAMIS_blog.qmd
[ 71/166] blogs/posts/202503_Tobit_regression.qmd
[ 72/166] blogs/index.qmd
[ 73/166] python/paired_t_test.qmd
[ 74/166] python/skewness_kurtosis.qmd
[ 75/166] python/linear_regression.qmd
[ 76/166] python/ancova.qmd
WARN: Unable to resolve link target: python/linear-regression.qmd
[ 77/166] python/anova.qmd
[ 78/166] python/two_samples_t_test.qmd
[ 79/166] python/survey-stats-summary.qmd
[ 80/166] python/logistic_regression.qmd
[ 81/166] python/kruskal_wallis.qmd
[ 82/166] python/Rounding.qmd
[ 83/166] python/Summary_statistics.qmd
[ 84/166] python/binomial_test.qmd
[ 85/166] python/one_sample_t_test.qmd
[ 86/166] python/chi-square.qmd
[ 87/166] python/MANOVA.qmd
WARN: Unable to resolve link target: contribution.qmd
[ 88/166] python/correlation.qmd
[ 89/166] non_website_content/Conferences 2024 archive.qmd
[ 90/166] non_website_content/conferences/2024/abstract_useR2024.md
[ 91/166] non_website_content/Conferences 2023 archive.qmd
[ 92/166] non_website_content/dissertations/202406_MMRM.qmd
[ 93/166] East/gsd-tte.qmd
[ 94/166] contribution/hackathon/contribution_guide_ssh.qmd
[ 95/166] contribution/contribution.qmd
[ 96/166] contribution/get_started.qmd
[ 97/166] LICENSE.md
[ 98/166] Comp/r-sas_ttest_1Sample.qmd
[ 99/166] Comp/r-sas_manova.qmd
[100/166] Comp/r-sas_survival.qmd
[101/166] Comp/r-sas_jonckheere.qmd
[102/166] Comp/r-sas_correlation.qmd
[103/166] Comp/r-sas-summary-stats.qmd
[104/166] Comp/r-sas_cmh.qmd
[105/166] Comp/r-sas-python_survey-stats-summary.qmd
[106/166] Comp/r-sas_ancova.qmd
[107/166] Comp/r-sas_chi-sq.qmd
[108/166] Comp/r-sas_rounding.qmd
[109/166] Comp/r-east_gsd_tte.qmd
[110/166] Comp/r-sas_mmrm.qmd
[111/166] Comp/r-sas-wilcoxonsr_HL.qmd
[112/166] Comp/r-sas_kruskalwallis.qmd
[113/166] Comp/r-sas_mcnemar.qmd
[114/166] Comp/r-sas_ci_for_prop.qmd
[115/166] Comp/r-sas_friedman.qmd
[116/166] Comp/r-sas_linear-regression.qmd
[117/166] Comp/r-sas_survival_csh.qmd
[118/166] Comp/r-sas_negbin.qmd
[119/166] Comp/r-sas_ttest_Paired.qmd
[120/166] Comp/r-sas_anova.qmd
[121/166] Comp/r-sas_ttest_2Sample.qmd
[122/166] Comp/r-sas_summary_skew_kurt.qmd
[123/166] Comp/r-sas_tobit.qmd
[124/166] Comp/r-sas_survival_cif.qmd
[125/166] Comp/r-sas_logistic-regr.qmd
[126/166] Comp/r-sas_rbmi_continuous_joint.qmd
[127/166] templates/RvsSAS_template.qmd
[128/166] templates/template.qmd
[129/166] Clustering_Knowhow.qmd
[130/166] about.qmd
[131/166] index.qmd
[132/166] SAS/ranksum.qmd
[133/166] SAS/ci_for_prop.qmd
[134/166] SAS/ttest_2Sample.qmd
[135/166] SAS/association.qmd
[136/166] SAS/survival_cif.qmd
[137/166] SAS/wilcoxonsr_HL.qmd
[138/166] SAS/mcnemar.qmd
[139/166] SAS/rbmi_continuous_joint_SAS.qmd
[140/166] SAS/SAS_Friedmantest.qmd
[141/166] SAS/ttest_1Sample.qmd
[142/166] SAS/rmst.qmd
[143/166] SAS/tobit regression SAS.qmd
[144/166] SAS/ancova.qmd
[145/166] SAS/jonchkheere_terpstra.qmd
[146/166] SAS/anova.qmd
[147/166] SAS/summary-stats.qmd
[148/166] SAS/linear-regression.qmd
[149/166] SAS/survival.qmd
[150/166] SAS/survival_csh.qmd
[151/166] SAS/survey-stats-summary.qmd
[152/166] SAS/kruskal_wallis.qmd
[153/166] SAS/logistic-regr.qmd
[154/166] SAS/nparestimate.qmd
[155/166] SAS/summary_skew_kurt.qmd
[156/166] SAS/mi_mar_regression.qmd
[157/166] SAS/manova.qmd
[158/166] SAS/ttest_Paired.qmd
[159/166] SAS/cmh.qmd
[160/166] SAS/correlation.qmd
[161/166] SAS/mmrm.qmd
[162/166] SAS/rounding.qmd
[163/166] publication/dissertation.qmd
[164/166] publication/white_paper.qmd
[165/166] publication/index.qmd
[166/166] publication/conference.qmd

Output created: _site/index.html

If I then update a package (which will update the renv.lock package hash):

> install.packages('cardx')
# Downloading packages -------------------------------------------------------
- Downloading cardx from CRAN ...               OK [513.4 Kb in 1.3s]
- Downloading cards from CRAN ...               OK [535.8 Kb in 0.61s]
Successfully downloaded 2 packages in 4.2 seconds.

The following package(s) will be installed:
- cards [0.5.1]
- cardx [0.2.3]
These packages will be installed into "~/source/CAMIS/renv/library/linux-ubuntu-noble/R-4.4/x86_64-pc-linux-gnu".

Do you want to proceed? [Y/n]: y

# Installing packages --------------------------------------------------------
- Installing cards ...                          OK [installed binary and cached in 0.58s]
- Installing cardx ...                          OK [installed binary and cached in 0.58s]
Successfully installed 2 packages in 1.4 seconds.
- Automatic snapshot has updated '~/source/CAMIS/renv.lock'.

And then re-render, I can see that it detects that those packages have changed, uses renv to detect which quarto .qmd files use them, and deletes their caches so that quarto renders them from scratch:

Quarto Render With Package Updates
michael@KD-C138GL3:~/source/CAMIS$ quarto render
R/quarto_check_pkg_dependencies.R
here() starts at /home/michael/source/CAMIS
Checking package versions against previous executions...
Finding R package dependencies ... [162/162] Done!
Dependencies changed, removing cache at: /home/michael/source/CAMIS/_freeze/Comp/r-sas_ci_for_prop
Dependencies changed, removing cache at: /home/michael/source/CAMIS/_freeze/R/ci_for_prop
Saving current package versions to data/quarto_pkg_dependencies.csv

[  1/166] minutes/posts/9Dec2024.qmd
[  2/166] minutes/posts/17apr2023.qmd
[  3/166] minutes/posts/21Aug2023.qmd
[  4/166] minutes/posts/9oct2023.qmd
[  5/166] minutes/posts/15July2024.qmd
[  6/166] minutes/posts/11Sept2023.qmd
[  7/166] minutes/posts/13May2024.qmd
[  8/166] minutes/posts/19June2023.qmd
[  9/166] minutes/posts/20Nov2023.qmd
[ 10/166] minutes/posts/12Feb2024.qmd
[ 11/166] minutes/posts/10June2024.qmd
[ 12/166] minutes/posts/10Oct2024.qmd
[ 13/166] minutes/posts/10Feb2025.qmd
[ 14/166] minutes/posts/8Jan2024.qmd
[ 15/166] minutes/posts/10July2023.qmd
[ 16/166] minutes/posts/03Jan2025.qmd
[ 17/166] minutes/posts/15May2023.qmd
[ 18/166] minutes/posts/13Feb2023.Rmd
[ 19/166] minutes/posts/12Aug2024.qmd
[ 20/166] minutes/posts/11Mar2024.qmd
[ 21/166] minutes/posts/13mar2023.qmd
[ 22/166] minutes/posts/10Mar2025.qmd
[ 23/166] minutes/posts/23Jan2023.Rmd
[ 24/166] minutes/posts/9sept2024.qmd
[ 25/166] minutes/posts/12Dec2022.qmd
[ 26/166] minutes/posts/8Apr2024.qmd
[ 27/166] minutes/index.qmd
WARN: Unable to create a feed as the required `site-url` property is missing from this project.
[ 28/166] R/Weighted-log-rank.qmd
[ 29/166] R/ci_for_prop.qmd


processing file: ci_for_prop.qmd
1/23
2/23 [unnamed-chunk-1]
3/23
4/23 [unnamed-chunk-2]
5/23
6/23 [unnamed-chunk-3]
7/23
8/23 [unnamed-chunk-4]
9/23
10/23 [unnamed-chunk-5]
11/23
12/23 [unnamed-chunk-6]
13/23
14/23 [unnamed-chunk-7]
15/23
16/23 [unnamed-chunk-8]
17/23
18/23 [unnamed-chunk-9]
19/23
20/23 [unnamed-chunk-10]
21/23
22/23 [unnamed-chunk-11]
23/23
output file: ci_for_prop.knit.md

[ 30/166] R/ttest_2Sample.qmd
[ 31/166] R/association.qmd
[ 32/166] R/survival_cif.qmd
[ 33/166] R/mi_mar_predictive_mean_match.qmd
[ 34/166] R/R_Friedmantest.qmd
[ 35/166] R/mcnemar.qmd
[ 36/166] R/Accelerated_Failure_time_model.qmd
[ 37/166] R/xgboost.qmd
[ 38/166] R/ttest_1Sample.qmd
[ 39/166] R/sample_size_average_bioequivalence.qmd
[ 40/166] R/ancova.qmd
[ 41/166] R/anova.qmd
[ 42/166] R/summary-stats.qmd
[ 43/166] R/linear-regression.qmd
[ 44/166] R/survival.qmd
[ 45/166] R/survival_csh.qmd
[ 46/166] R/survey-stats-summary.qmd
[ 47/166] R/kruskal_wallis.qmd
[ 48/166] R/nparestimate.qmd
[ 49/166] R/jonckheere.qmd
[ 50/166] R/PCA_analysis.qmd
[ 51/166] R/summary_skew_kurt.qmd
[ 52/166] R/marginal_homogeneity_tests.qmd
[ 53/166] R/binomial_test.qmd
[ 54/166] R/nonpara_wilcoxon_ranksum.qmd
[ 55/166] R/mi_mar_regression.qmd
[ 56/166] R/manova.qmd
WARN: Unable to resolve link target: contribution.qmd
[ 57/166] R/gsd-tte.qmd
[ 58/166] R/count_data_regression.qmd
[ 59/166] R/tobit regression.qmd
[ 60/166] R/ttest_Paired.qmd
[ 61/166] R/cmh.qmd
[ 62/166] R/logistic_regr.qmd
[ 63/166] R/rbmi_continuous_joint.qmd
[ 64/166] R/correlation.qmd
[ 65/166] R/wilcoxonsr_hodges_lehman.qmd
[ 66/166] R/mmrm.qmd
[ 67/166] R/rounding.qmd
[ 68/166] blogs/posts/202312_highlights_blog.qmd
[ 69/166] blogs/posts/202403_phuseUS2024.qmd
[ 70/166] blogs/posts/202305_introduction_to_CAMIS_blog.qmd
[ 71/166] blogs/posts/202503_Tobit_regression.qmd
[ 72/166] blogs/index.qmd
[ 73/166] python/paired_t_test.qmd
[ 74/166] python/skewness_kurtosis.qmd
[ 75/166] python/linear_regression.qmd
[ 76/166] python/ancova.qmd
WARN: Unable to resolve link target: python/linear-regression.qmd
[ 77/166] python/anova.qmd
[ 78/166] python/two_samples_t_test.qmd
[ 79/166] python/survey-stats-summary.qmd
[ 80/166] python/logistic_regression.qmd
[ 81/166] python/kruskal_wallis.qmd
[ 82/166] python/Rounding.qmd
[ 83/166] python/Summary_statistics.qmd
[ 84/166] python/binomial_test.qmd
[ 85/166] python/one_sample_t_test.qmd
[ 86/166] python/chi-square.qmd
[ 87/166] python/MANOVA.qmd
WARN: Unable to resolve link target: contribution.qmd
[ 88/166] python/correlation.qmd
[ 89/166] non_website_content/Conferences 2024 archive.qmd
[ 90/166] non_website_content/conferences/2024/abstract_useR2024.md
[ 91/166] non_website_content/Conferences 2023 archive.qmd
[ 92/166] non_website_content/dissertations/202406_MMRM.qmd
[ 93/166] East/gsd-tte.qmd
[ 94/166] contribution/hackathon/contribution_guide_ssh.qmd
[ 95/166] contribution/contribution.qmd
[ 96/166] contribution/get_started.qmd
[ 97/166] LICENSE.md
[ 98/166] Comp/r-sas_ttest_1Sample.qmd
[ 99/166] Comp/r-sas_manova.qmd
[100/166] Comp/r-sas_survival.qmd
[101/166] Comp/r-sas_jonckheere.qmd
[102/166] Comp/r-sas_correlation.qmd
[103/166] Comp/r-sas-summary-stats.qmd
[104/166] Comp/r-sas_cmh.qmd
[105/166] Comp/r-sas-python_survey-stats-summary.qmd
[106/166] Comp/r-sas_ancova.qmd
[107/166] Comp/r-sas_chi-sq.qmd
[108/166] Comp/r-sas_rounding.qmd
[109/166] Comp/r-east_gsd_tte.qmd
[110/166] Comp/r-sas_mmrm.qmd
[111/166] Comp/r-sas-wilcoxonsr_HL.qmd
[112/166] Comp/r-sas_kruskalwallis.qmd
[113/166] Comp/r-sas_mcnemar.qmd
[114/166] Comp/r-sas_ci_for_prop.qmd


processing file: r-sas_ci_for_prop.qmd
1/3
2/3 [unnamed-chunk-1]
3/3
output file: r-sas_ci_for_prop.knit.md

[115/166] Comp/r-sas_friedman.qmd
[116/166] Comp/r-sas_linear-regression.qmd
[117/166] Comp/r-sas_survival_csh.qmd
[118/166] Comp/r-sas_negbin.qmd
[119/166] Comp/r-sas_ttest_Paired.qmd
[120/166] Comp/r-sas_anova.qmd
[121/166] Comp/r-sas_ttest_2Sample.qmd
[122/166] Comp/r-sas_summary_skew_kurt.qmd
[123/166] Comp/r-sas_tobit.qmd
[124/166] Comp/r-sas_survival_cif.qmd
[125/166] Comp/r-sas_logistic-regr.qmd
[126/166] Comp/r-sas_rbmi_continuous_joint.qmd
[127/166] templates/RvsSAS_template.qmd
[128/166] templates/template.qmd
[129/166] Clustering_Knowhow.qmd
[130/166] about.qmd
[131/166] index.qmd
[132/166] SAS/ranksum.qmd
[133/166] SAS/ci_for_prop.qmd
[134/166] SAS/ttest_2Sample.qmd
[135/166] SAS/association.qmd
[136/166] SAS/survival_cif.qmd
[137/166] SAS/wilcoxonsr_HL.qmd
[138/166] SAS/mcnemar.qmd
[139/166] SAS/rbmi_continuous_joint_SAS.qmd
[140/166] SAS/SAS_Friedmantest.qmd
[141/166] SAS/ttest_1Sample.qmd
[142/166] SAS/rmst.qmd
[143/166] SAS/tobit regression SAS.qmd
[144/166] SAS/ancova.qmd
[145/166] SAS/jonchkheere_terpstra.qmd
[146/166] SAS/anova.qmd
[147/166] SAS/summary-stats.qmd
[148/166] SAS/linear-regression.qmd
[149/166] SAS/survival.qmd
[150/166] SAS/survival_csh.qmd
[151/166] SAS/survey-stats-summary.qmd
[152/166] SAS/kruskal_wallis.qmd
[153/166] SAS/logistic-regr.qmd
[154/166] SAS/nparestimate.qmd
[155/166] SAS/summary_skew_kurt.qmd
[156/166] SAS/mi_mar_regression.qmd
[157/166] SAS/manova.qmd
[158/166] SAS/ttest_Paired.qmd
[159/166] SAS/cmh.qmd
[160/166] SAS/correlation.qmd
[161/166] SAS/mmrm.qmd
[162/166] SAS/rounding.qmd
[163/166] publication/dissertation.qmd
[164/166] publication/white_paper.qmd
[165/166] publication/index.qmd
[166/166] publication/conference.qmd

Output created: _site/index.html

@statasaurus statasaurus self-requested a review March 19, 2025 09:23

@statasaurus statasaurus left a comment

Thank you so much for coming up with this @michaelwalshe!! I have just done the updates we chatted about yesterday. This is really amazing!

@statasaurus statasaurus merged commit 2f71460 into PSIAIMS:main Mar 19, 2025
1 check passed
Successfully merging this pull request may close these issues.

Cache rendered pages to speed up render times