Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SMARTSviewer #235

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from
Draft

SMARTSviewer #235

wants to merge 7 commits into from

Conversation

gjgetzinger
Copy link
Contributor

@gjgetzinger gjgetzinger commented Apr 6, 2020


Add function to query SMARTS viewer API

Generate a visualization of a SMARTS string using the SMARTSvierwer API provided by ZBH-Center for Bioinformatics at Universitat Hamburg. (https://smartsview.zbh.uni-hamburg.de/rest). For a set of input SMARTS strings, an image file can be returned to R for plotting with rasterImage or directly downloaded in png, pdf or svg format. Throws warning if multiple SMARTS strings provided and download selected to avoid downloading large numbers of files.

PR task list:

  • Update NEWS
  • Add tests (if appropriate)
  • Update documentation with devtools::document()
  • Check package passed

gjgetzinger and others added 2 commits February 14, 2020 10:31
Generate a visualization of a SMARTS string using the SMARTSvierwer API provided by ZBH-Center for Bioinformatics at Universitat Hamburg. (https://smartsview.zbh.uni-hamburg.de/rest). For a set of input SMARTS strings, an image file can be returned to R for plotting with rasterImage or directly downloaded in png, pdf or svg format. Throws warning if multiple SMARTS strings provided and download selected to avoid downloading large numbers of files.
@stitam
Copy link
Contributor

stitam commented Apr 7, 2020

Hi @gjgetzinger, thanks for opening this PR!

@codecov-io
Copy link

codecov-io commented Apr 8, 2020

Codecov Report

Merging #235 into master will not change coverage by %.
The diff coverage is 0.00%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master    #235   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files          19      20    +1     
  Lines        1619    1652   +33     
======================================
- Misses       1619    1652   +33     
Impacted Files Coverage Δ
R/smarts_viewer.R 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ee0de7...6607b8a. Read the comment docs.

Copy link
Contributor

@stitam stitam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @gjgetzinger for your PR! Accessing images is a completely new and very exciting functionality for webchem and I think it is in scope for the package. Given that this functionality is new, I think we should discuss some details before I merge your PR. Details like how these functions should be named, how they should behave, what class they should return, and also some good coding practices, e.g. whether or not we should import image manipulation packages like the magick package and whether or not these functions should offer direct export functionality. I have added my first thoughts as comments to the PR. What do you think? @gjgetzinger, @Aariq, @andschar?

Comment on lines -4 to -21
NEW FEATURES

MINOR IMPROVEMENTS

BUG FIXES

* get_csid() now returns all csids when queried from formula
* get_csid() returned an error when query was NA [PR #226, fixed by stitam]
* get_chebiid() and chebi_comp_entity() fixed for invalid queries [PR #225, fixed by stitam]
* get_cid() returned the PubChem ID of sodium when the query was NA [PR #223, fixed by stitam]
* aw_query() returned a list for successful queries, NA for unsuccessful queries [PR #222, fixed by stitam]

DEPRECATED FUNCTIONS

DEFUNCT FUNCTIONS

webchem 0.5.0
======================
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that you didn't update your branch before making changes? Our contributing guide contains more information about tracking upstream progress.

response <- httr::GET(entity_query)
if (response$status_code == 200) {
if (output == 'image') {
httr::content(response, as = 'parsed', type = 'Image/png')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I try to run the first example with image_format = "svg" or "pdf" the function returns

 Error in png::readPNG(x) : libpng error: Not a PNG file

and Traceback points to httr::content() and httr::GET(). Could this be related to type = "image/png"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also get this error.

Comment on lines +1 to +3
test_that("multiplication works", {
expect_equal(2 * 2, 4)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this second test file was pushed accidentally:)

#' legend_option = "both", filename = "test.pdf")
#' }
#'
smarts_viewer <-
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a function name like sv_view() would be more consistent with our guidelines for naming webchem functions. What do you think?

context("smartsview")

a <- smarts_viewer("[CX3](=[OX1])[OX2][CX3](=[OX1])", "image", "png", 1, "both" )
b <- smarts_viewer("[CX3](=[OX1])[OX2][CX3](=[OX1])", "download", "png", 1, "both", "smartsview_test.png" )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is CRAN policy that "packages should not write in the user’s home filespace (including clipboards), nor anywhere else on the file system apart from the R session’s temporary directory". You can use the cross-platform tempdir() function to access the R sessions's temporary directory.

Comment on lines +89 to +94
utils::download.file(url = response$url,
destfile = gsub(
pattern = paste0('.', image_format),
paste0('_', x, '.', image_format),
filename
))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While being able to export the image from R is a very good idea, I am not sure if this should be handled by the smarts_viewer() function. Alternatively, smarts_viewer() could query the url and always return a plot or a list of plots, and then these plots could be exported by standard R functions. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this would work, as I think only the PNG version can be stored as an R object. If you were to implement this, you'd probably want to have one function that outputs the image data as a matrix with class smarts (for example) and store the query info as attributes. Then, when you run the hypothetical smarts_save() function on a smarts object, it would save the PNG or re-query the database if filetype was .pdf or .svg.

@Aariq
Copy link
Collaborator

Aariq commented Apr 12, 2020

Just wanted to say that I did not know about this resource and this will be a super cool addition to webchem. I'm looking forward to looking at the code. Is there a possibility (in the future, not necessarily this PR) to have a translator function that could translate from, say, InChIKey to SMARTS string? Because that could make for some really cool workflows!

Copy link
Collaborator

@Aariq Aariq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mostly left suggestions, but I'll leave it up to @stitam to decide which changes need to happen before merging this PR. Super cool function! I'm excited to have this be part of webchem!

Comment on lines +49 to +54
function(smarts,
output = c('image', 'download'),
image_format = c('pdf', 'png', 'svg'),
visualization_modus = c(1, 2),
legend_option = c('both', 'none', 'static', 'dynamic'),
filename = NULL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some suggestion that might help streamline the API and address some of @stitam's concerns about downloading. Also, I was confused for a while because I couldn't get the download to work, and then I realized it was because I had output = "image" and supplied an image_format and filename (there's no warning for this situation).

Suggested change
function(smarts,
output = c('image', 'download'),
image_format = c('pdf', 'png', 'svg'),
visualization_modus = c(1, 2),
legend_option = c('both', 'none', 'static', 'dynamic'),
filename = NULL) {
function(smarts,
visualization_modus = c(1, 2),
legend_option = c('both', 'none', 'static', 'dynamic'),
filename = NULL) {

If filename = NULL (default) then only the png data is downloaded and returned as an R object (same as output = 'image'). If filename is anything else, the function can attempt to write a file. The file type can be taken from the file extension in filename as is done in cowplot::save_plot().

Comment on lines +89 to +94
utils::download.file(url = response$url,
destfile = gsub(
pattern = paste0('.', image_format),
paste0('_', x, '.', image_format),
filename
))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this would work, as I think only the PNG version can be stored as an R object. If you were to implement this, you'd probably want to have one function that outputs the image data as a matrix with class smarts (for example) and store the query info as attributes. Then, when you run the hypothetical smarts_save() function on a smarts object, it would save the PNG or re-query the database if filetype was .pdf or .svg.

Comment on lines +27 to +28
#' plot(0:1,0:1, 'n')
#' rasterImage(img[[1]],0,0,1,1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just based on the name, I was expecting smarts_viewer() to automatically plot something when output = "image". It might be nice to have a default plot show up (maybe controlled by a view = TRUE argument) by including this plotting code in the function itself.

@gjgetzinger
Copy link
Contributor Author

@stitam @Aariq Thanks for the edits/comments/suggestions. I will start to incorporate suggestions into the PR later this week.

This was referenced May 6, 2020
@stitam
Copy link
Contributor

stitam commented May 6, 2020

I have opened Issue #249 for discussions on images.

@Aariq Aariq marked this pull request as draft May 12, 2020 00:35
@Aariq Aariq linked an issue May 15, 2020 that may be closed by this pull request
@jranke
Copy link
Contributor

jranke commented Dec 4, 2024

As I have returned to collecting data on chemicals with R, and am very much a fan of being able to display images, I just want to mention that you can get 2D structures via the chents package if you have python3-rdkit configured for use with the reticulate package: https://pkgdown.jrwb.de/chents/reference/chent.html#ref-examples. Maybe someone coming over to this PR is interested in this approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Images
5 participants