You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In this issue I will group together comments from my 2 reviewers: ropensci/software-review#674 (comment) in an attempt to tackle the changes in a few PRs that relate to similar files.
✅ [1]To align with the naming of the other exported functions, map_compare() and map_convert(), I would suggest renaming metadata_map() to map_metadata(). While I understand you may have concerns about potential confusion with the package name, I believe this change would harmonize the naming scheme, enhance clarity, and facilitate autocomplete for users.resolved on review thread: ropensci/software-review#674 (comment)
[2] For the map_convert() function, I recommend renaming the argument output_csv to csv_to_convert to make its purpose clearer. Additionally, introducing a new argument, csv_to_convert_dir, could address an issue I’ve identified in the provided function example (see the Examples section below). If you implement this change, the output_dir argument could represent the directory where the generated CSV file is stored (defaulting to csv_to_convert_dir if not specified).
[3] Input checking
I suggest adding explicit input validation to your exported functions. This would enhance the user experience by providing clear and informative error messages or warnings.
For map_compare(), consider verifying that session_dir exists and that the CSV files specified in the user input are present.
For map_convert(), you might want to check that both output_csv and output_dir exist.
For metadata_map() and the other functions, validating file existence, expected dimensions, and column names (if any) could further improve robustness.
[4] README
The demo example in the README feels somewhat extensive and might benefit from being simplified to improve accessibility. Currently, interacting with up to 20 variables requires substantial effort, which could deter users from running the full demo.
For sections involving custom inputs (metadata input, domain list input, and custom lookup), I recommend moving these to the vignette and referencing it in the README. For example:
"If you wish to use custom metadata, domain lists, or lookup tables, please refer to vignette XXX, section YYY."
Additionally, examples in the README should be copy-pasteable and work seamlessly without requiring additional setup. For instance, the following code snippet requires downloading a file.
This is why I would present only the demo example in the README.
Providing examples for custom domain list inputs and custom lookup tables would also be beneficial. For these examples, linking to demo files in inst/inputs could clarify the expected file formats. I guess an explanation about the location of the files used by default and the way to access them programmatically with system.file() would be useful to your users (If this function is probably familiar to R packages developpers, it might not be straightforward for package “users” only).
As an alternative to @nord you might enjoy https://github.com/moodymudskipper/devtag@RayStick (not in the dev guide yet), it generates Rd pages for internal functions, that developers can read with ?fun, but Rbuildignores them so the users don't see them.
[6] Testing
In the tests/testthat.R file, it seems unnecessary to load the {mockery} and {withr} packages explicitly, as you are already using explicit namespacing.
A note regarding {mockery}: its README states that the package is superseded and recommends using testthat::local_mocked_bindings() instead. More details can be found [here](https://github.com/r-lib/mockery).
[7] Examples
Some examples are enclosed in dontrun tags, which may pose an issue if you plan to submit the package to CRAN.
For map_compare(), consider using a temporary directory to avoid creating files in the user’s workspace or package’s installation directory:
# Locate file paths for the example files in the package
demo_session_dir <- system.file("outputs", package = "mapmetadata")
demo_session1_base <- "360_NationalCommunityChildHealthDatabase(NCCHD)_CHILD_2024-12-19-14-11-55"
demo_session2_base <- "360_NationalCommunityChildHealthDatabase(NCCHD)_CHILD_2024-12-19-14-17-45"
demo_metadata_file <- system.file("inputs", "360_NationalCommunityChildHealthDatabase(NCCHD)_Structural_Metadata.csv", package = "mapmetadata")
demo_domain_file <- system.file("inputs", "domain_list_demo.csv", package = "mapmetadata")
output_dir <- tempdir()
message("Output directory: ", output_dir)
# Run the function - requires user interaction
map_compare(
session_dir = demo_session_dir,
session1_base = demo_session1_base,
session2_base = demo_session2_base,
metadata_file = demo_metadata_file,
domain_file = demo_domain_file,
output_dir = output_dir
)
Returning the path of the output file and including a browseURL() call to open the file in the example could enhance usability. This would allow us to delete the temp directory once the generated file has been opened.
The example for map_convert() writes to the package’s installation directory, which is problematic. Modifying this to use temporary directories would improve clarity and compliance. If you implement my earlier suggestion to add a csv_to_convert_dir argument, this example could be significantly simplified. Moreover, I think a message telling the user the path of the output could be nice, like you did with map_compare(). I think this could be nice that this function returns the path of the output file so we could open it in the example with a browseURL() call.
For metadata_map(), since it requires user interaction, you should enclose it in an if(interactive()) block:
#' @examples
# Demo run requires no function inputs but requires user interaction
if(interactive()) {
metadata_map()
}
[8] file path construction
In your file-handling functions, you are constructing paths using paste() or paste0(). To ensure platform independence, I recommend replacing these with file.path(). For example, in map_compare():
This ensures compatibility across operating systems.
[9] Feedback based on devchecks()
Portable file names: Removing parentheses from file names should resolve this issue.
Package size: Large files in inst/outputs, such as BAR_360_NationalCommunityChildHealthDatabase.html, could be removed if not essential to the package.
Future file timestamps: Likely specific to my computer and unrelated to your code.
To support further accessibility I recommend being more descriptive in the text where a link is highlighted in text like for the word here particularly as this is something that helps with accessibility.
Also appears in the vignette which shows an image but only says download here:.
It took me a while to navigate the website as I wasn't familiar with it so I wonder if the image would benefit from a direct hyperlink to the same page?
It might be useful to expand on the point about changing the save location by adjusting the output_dir argument because argument is a technical term that might not be familiar to researchers - perhaps using an example to expand on this would help?
The demo requires initials to be entered but it isn't explained why it is required. It gets recorded in the logs but this might not be apparent to an initial user of the package.
Demo mode is said to process only the first 20 variables from the selected table and it's not clear what the 20 variables refer to particularly when directed to selecting data elements. Are data elements the same as variables in this example? I got a bit confused with this so couldn't complete the instructions.
The parameters used in the metadata_map() function example don't appear to work. The code is metadata_map(csv_file = new_csv_file, domain_file = demo_domains_file) but there isn't a parameter called csv_file - should this be metadata_file? I first of all tried to use the 14_Mortality (Death registration)_Metadata.csv data from the website but got the error because the use of structural metadata is only referred to in the making of metadata. I found this link via the arguments help for metadata_file which is excellent documentation, however, will be missed I think so would be really great to be clear in the README file as well. An enhancement may be to have a message coded if a file is missing structural in the file name that the error is because it's the wrong file?
get("look_up") should perhaps come earlier in the vignette because it's not possible to use this function when in the process of selecting variables from using metadata_map() that appears just before.
For the prompt Enter one or more numbers separated by spaces and then ENTER, or 0 to cancel I tried just using the Enter/Return key to finish which gave an output. Pressing Enter/Return key though isn't a listed option and I just guessed it would work so possibly better to be explicit.
I recommend being explicit in the argument help for metadata_file as ?metadata won't return any files if the package hasn't been loaded: ?mapmetadata::metadata. Although this is functionality in RStudio it could be missed by someone new to R and RStudio and their experience of not finding help might be incorrectly associated with the package.
The reference to data/metadata.rda might not make sense as packages are often just used through the library() so perhaps better to use refer to the code mapmetadata::metadata?
For Using a custom domain list input where domain file inputs are detailed as automatically appending so do not include these in your domain list is this because they get replicated? It would be helpful to be explicit that this wouldn't be likely to break code but just duplicate.
Function documentation sometimes includes references to other functions in the package, for example valid_comparison() refers to it being used in the the map_compare.R function. Cross links are mentioned in the ROpenSci package guidelines and if you use RStudio I recommend using my colleague, Matt Dray's, snorkel package to add these in as you can highlight the text and use the RStudio add in.
Minor spelling have been submitted for correction on a PR but where functions have spelling mistakes:
concensus is a function name and will require updates to corresponding files: concensus_on_mismatch(), test-concensus_on_mismatch.R.
It is also in the message from the map_compare() function and is in the documentation for map_compare().
Object in function metadata_map() (which won't have any effect to the user) mistmatch.
[6] Package build
Notes on no visible binding for global variable can be managed using the utils package and a specific R file (globals.R). More details on more sophisticated treatment can be found in the R packages book. I tend to get these issues particularly when referring to SQL columns in R code.
portable file names... NOTE is appearing because the file names are over 100 characters long. These files are referred to in the map_convert() and map_compare() documentation example so it might be ok to rename this longer name by the parts for generalising? For example:
#' @examples
#' # Locate file path and file name for the example files in the package
#' demo_output_dir <- system.file("outputs", package = "mapmetadata")
#' demo_output_csv <- "MAPPING_360_example_datetime.csv"
#'
#' # Run the function
#' map_convert(output_csv = demo_output_csv, output_dir = demo_output_dir)
[7] Formatting consistency
I noticed that there are line breaks put into function metadata_map.R so almost all the text is within an 80 character limit, but this isn't the case for other scripts. It's quite nice to have a line break set because it makes it easier to read across the screen and as it's been used in one script I wondered if it could be applied to all scripts? I use lintr to highlight the lines that are over 80 characters.
The text was updated successfully, but these errors were encountered:
is there also an editor contribution tag for @maelle
No, thank you for asking. 🙏 In the dev guide we state "Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough!" https://devguide.ropensci.org/pkg_building.html#authorship (I'm actually realizing you listed me as contributor in some other metadata!)
No, thank you for asking. 🙏 In the dev guide we state "Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough!" https://devguide.ropensci.org/pkg_building.html#authorship (I'm actually realizing you listed me as contributor in some other metadata!)
Ah, missed that guide note! Well I am happy to keep you as a contributor (if you are okay with that) as I think you gave ideas - and contributions in this repo - above and beyond your editor role, but correct me if I am wrong. Thanks!
In this issue I will group together comments from my 2 reviewers: ropensci/software-review#674 (comment) in an attempt to tackle the changes in a few PRs that relate to similar files.
Changes to tackle:
After I finish the changes:
is there also an editor contribution tag for @maelle?Reviewer 1: ropensci/software-review#674 (comment)
Function and argument naming
✅ [1]
To align with the naming of the other exported functions,resolved on review thread: ropensci/software-review#674 (comment)map_compare()
andmap_convert()
, I would suggest renamingmetadata_map()
tomap_metadata()
. While I understand you may have concerns about potential confusion with the package name, I believe this change would harmonize the naming scheme, enhance clarity, and facilitate autocomplete for users.[2] For the
map_convert()
function, I recommend renaming the argumentoutput_csv
tocsv_to_convert
to make its purpose clearer. Additionally, introducing a new argument,csv_to_convert_dir
, could address an issue I’ve identified in the provided function example (see the Examples section below). If you implement this change, theoutput_dir
argument could represent the directory where the generated CSV file is stored (defaulting tocsv_to_convert_dir
if not specified).[3] Input checking
I suggest adding explicit input validation to your exported functions. This would enhance the user experience by providing clear and informative error messages or warnings.
map_compare()
, consider verifying thatsession_dir
exists and that the CSV files specified in the user input are present.map_convert()
, you might want to check that bothoutput_csv
andoutput_dir
exist.metadata_map()
and the other functions, validating file existence, expected dimensions, and column names (if any) could further improve robustness.[4] README
The demo example in the README feels somewhat extensive and might benefit from being simplified to improve accessibility. Currently, interacting with up to 20 variables requires substantial effort, which could deter users from running the full demo.
For sections involving custom inputs (metadata input, domain list input, and custom lookup), I recommend moving these to the vignette and referencing it in the README. For example:
Additionally, examples in the README should be copy-pasteable and work seamlessly without requiring additional setup. For instance, the following code snippet requires downloading a file.
This is why I would present only the demo example in the README.
Providing examples for custom domain list inputs and custom lookup tables would also be beneficial. For these examples, linking to demo files in
inst/inputs
could clarify the expected file formats. I guess an explanation about the location of the files used by default and the way to access them programmatically withsystem.file()
would be useful to your users (If this function is probably familiar to R packages developpers, it might not be straightforward for package “users” only).[5] Documentation
For non-exported functions, you might want to prevent
.Rd
files from being generated in theman/
folder. Adding the#' @noRd
tag to these functions' documentation can achieve this. For more details, see [the rOpenSci dev guide](https://devguide.ropensci.org/pkg_building.html#roxygen-2-use).Editor says:
As an alternative to @nord you might enjoy https://github.com/moodymudskipper/devtag @RayStick (not in the dev guide yet), it generates Rd pages for internal functions, that developers can read with ?fun, but Rbuildignores them so the users don't see them.
[6] Testing
In the
tests/testthat.R
file, it seems unnecessary to load the{mockery}
and{withr}
packages explicitly, as you are already using explicit namespacing.A note regarding
{mockery}
: its README states that the package is superseded and recommends usingtestthat::local_mocked_bindings()
instead. More details can be found [here](https://github.com/r-lib/mockery).[7] Examples
Some examples are enclosed in
dontrun
tags, which may pose an issue if you plan to submit the package to CRAN.map_compare()
, consider using a temporary directory to avoid creating files in the user’s workspace or package’s installation directory:Returning the path of the output file and including a
browseURL()
call to open the file in the example could enhance usability. This would allow us to delete the temp directory once the generated file has been opened.The example for
map_convert()
writes to the package’s installation directory, which is problematic. Modifying this to use temporary directories would improve clarity and compliance. If you implement my earlier suggestion to add acsv_to_convert_dir
argument, this example could be significantly simplified. Moreover, I think a message telling the user the path of the output could be nice, like you did withmap_compare()
. I think this could be nice that this function returns the path of the output file so we could open it in the example with abrowseURL()
call.For
metadata_map()
, since it requires user interaction, you should enclose it in anif(interactive())
block:[8] file path construction
In your file-handling functions, you are constructing paths using
paste()
orpaste0()
. To ensure platform independence, I recommend replacing these withfile.path()
. For example, inmap_compare()
:You could replace
This ensures compatibility across operating systems.
[9] Feedback based on devchecks()
inst/outputs
, such asBAR_360_NationalCommunityChildHealthDatabase.html
, could be removed if not essential to the package.R/globals.R
file and useutils::globalVariables()
. For guidance, see [this forum post](https://forum.posit.co/t/how-to-solve-no-visible-binding-for-global-variable-note/28887).Reviewer 2: ropensci/software-review#674 (comment)
[1] ROpenSci packaging guidelines
Console messages asks to avoid the use of
cat
andprint
as these are not easy to suppress.Avoiding verbosity or suppressing messages is hard within this package as the interactions from the user require the messages.
Editor says:
Regarding verbosity, we've published a blog post a while ago (with the goal of updating the dev guide later), that might be relevant: https://ropensci.org/blog/2024/02/06/verbosity-control-packages/
[2] Accessibility ✅ #185
Thank you for the alt text on images!To support further accessibility I recommend being more descriptive in the text where a link is highlighted in text like for the wordhere
particularly as this is something that helps with accessibility.Also appears in the vignette which shows an image but only saysdownload here:
.It took me a while to navigate the website as I wasn't familiar with it so I wonder if the image would benefit from a direct hyperlink to the same page?And the non-descriptivehere
insee here for outputs generated from the demo run
.[3] Technical language ✅ #185
It might be useful to expand on the point about changing the save location by adjusting theoutput_dir
argument becauseargument
is a technical term that might not be familiar to researchers - perhaps using an example to expand on this would help?[4] Expectations of functionality
Only the Return key works to continue where the instruction says to
finish here, or press any other key to continue with mapping variables
. Other keys don't appear to "finish" and also, to be very clear, need to be executed in the Console.The demo requires initials to be entered but it isn't explained why it is required. It gets recorded in the logs but this might not be apparent to an initial user of the package.
Demo mode is said to process only the first 20 variables
from the selected table
and it's not clear what the 20 variables refer to particularly when directed to selecting data elements. Aredata elements
the same asvariables
in this example? I got a bit confused with this so couldn't complete the instructions.The parameters used in the
metadata_map()
function example don't appear to work. The code ismetadata_map(csv_file = new_csv_file, domain_file = demo_domains_file)
but there isn't a parameter calledcsv_file
- should this bemetadata_file
? I first of all tried to use the14_Mortality (Death registration)_Metadata.csv
data from the website but got the error because the use of structural metadata is only referred to in the making ofmetadata
. I found this link via the arguments help formetadata_file
which is excellent documentation, however, will be missed I think so would be really great to be clear in the README file as well. An enhancement may be to have a message coded if a file is missingstructural
in the file name that the error is because it's the wrong file?get("look_up")
should perhaps come earlier in the vignette because it's not possible to use this function when in the process of selecting variables from usingmetadata_map()
that appears just before.For the prompt
Enter one or more numbers separated by spaces and then ENTER, or 0 to cancel
I tried just using the Enter/Return key to finish which gave an output. Pressing Enter/Return key though isn't a listed option and I just guessed it would work so possibly better to be explicit.I recommend linking to the function reference for
map_compare()
in tutorial.I recommend being explicit in the argument help for metadata_file as
?metadata
won't return any files if the package hasn't been loaded:?mapmetadata::metadata
. Although this is functionality in RStudio it could be missed by someone new to R and RStudio and their experience of not finding help might be incorrectly associated with the package.The reference to
data/metadata.rda
might not make sense as packages are often just used through thelibrary()
so perhaps better to use refer to the codemapmetadata::metadata
?For
Using a custom domain list input
where domain file inputs are detailed as automatically appending sodo not include these in your domain list
is this because they get replicated? It would be helpful to be explicit that this wouldn't be likely to break code but just duplicate.For
Using a custom lookup table input (advanced)
I couldn't work out the corresponding example csv that the instructionensure that all domain codes in the lookup file are also included in your *domain file*
without opening all the csvs and looking for the column name. It might be good to expand on this with an example.Function documentation sometimes includes references to other functions in the package, for example
valid_comparison()
refers to it being used in thethe map_compare.R function
. Cross links are mentioned in the ROpenSci package guidelines and if you use RStudio I recommend using my colleague, Matt Dray's, snorkel package to add these in as you can highlight the text and use the RStudio add in.[5] Spelling ✅ #185
Minor spelling have been submitted for correction on a PR but where functions have spelling mistakes:concensus
is a function name and will require updates to corresponding files:concensus_on_mismatch()
,test-concensus_on_mismatch.R
.It is also in the message from themap_compare()
function and is in the documentation formap_compare()
.Object in functionmetadata_map()
(which won't have any effect to the user)mistmatch
.[6] Package build
Notes on
no visible binding for global variable
can be managed using the utils package and a specific R file (globals.R
). More details on more sophisticated treatment can be found in the R packages book. I tend to get these issues particularly when referring to SQL columns in R code.portable file names... NOTE
is appearing because the file names are over 100 characters long. These files are referred to in themap_convert()
andmap_compare()
documentation example so it might be ok to rename this longer name by the parts for generalising? For example:[7] Formatting consistency
I noticed that there are line breaks put into function
metadata_map.R
so almost all the text is within an 80 character limit, but this isn't the case for other scripts. It's quite nice to have a line break set because it makes it easier to read across the screen and as it's been used in one script I wondered if it could be applied to all scripts? I uselintr
to highlight the lines that are over 80 characters.The text was updated successfully, but these errors were encountered: