Incorporating latest API changes #40

Closed
wants to merge 107 commits into from
19a046d
generated files
mustberuss Dec 10, 2024
acaa797
setting group on top level attributes
mustberuss Dec 10, 2024
62cee5d
Merge branch 'ropensci:master' into master
mustberuss Dec 15, 2024
004664a
feat: httr to httr2
mustberuss Dec 20, 2024
8c44577
feat: httr to httr2
mustberuss Dec 20, 2024
0a2ad58
docs: API link update
mustberuss Dec 20, 2024
c9e8c5e
feat: dont print document number in scientific notation
mustberuss Dec 20, 2024
1d68033
build: version bumps to ko deprecated msgs
mustberuss Dec 20, 2024
9902e2d
added timeout - prevent run away jobs
mustberuss Dec 20, 2024
7cbae0d
feat: casting type changes
mustberuss Dec 20, 2024
1d733b7
docs: field name change
mustberuss Dec 20, 2024
3fa1765
feat: new casting methodology
mustberuss Dec 20, 2024
4b1f8e9
docs: field name change
mustberuss Dec 20, 2024
1b25760
docs: field name change
mustberuss Dec 20, 2024
d078c35
fix: parameters on posts
mustberuss Dec 20, 2024
e6bf61b
feat: httr to httr2
mustberuss Dec 20, 2024
2386c45
feat: new paging methodology
mustberuss Dec 20, 2024
3f74716
feat: removed paging limits
mustberuss Dec 20, 2024
f22c089
feat: search_pv parameter updates
mustberuss Dec 20, 2024
4d2f064
feat: use new api group/field shorthand
mustberuss Dec 21, 2024
edae041
feat: added in_range query function
mustberuss Dec 21, 2024
ed8effe
feat: unnesting plural entities from singular endpoints
mustberuss Dec 21, 2024
8a1c876
feat: added lifecycle deprecations
mustberuss Dec 21, 2024
01ab978
feat: added in_range query function
mustberuss Dec 21, 2024
1e817bc
fix: length check to avoid coercion warning
mustberuss Dec 21, 2024
5f9e38f
feat: query checking in the new api version
mustberuss Dec 21, 2024
6763772
feat: don't require sort fields to be fields param
mustberuss Dec 21, 2024
8ca6ee8
feat validation in the new api version
mustberuss Dec 21, 2024
378e221
feat: search_pv parameter changes
mustberuss Dec 21, 2024
f14126b
feat: new paging methodology
mustberuss Dec 21, 2024
e3d0f18
refactor: getting top level attributes
mustberuss Dec 21, 2024
d5509b0
feat: retrieve_linked can retrieve documentation links
mustberuss Dec 21, 2024
66ad2f5
feat: search_pv parameter updates
mustberuss Dec 21, 2024
065a539
docs: API link update
mustberuss Dec 21, 2024
7ce51b4
docs: parameter and example changes
mustberuss Dec 21, 2024
86dae00
docs: plural to singular endpoints
mustberuss Dec 21, 2024
5956b7b
feat validation in the new api version
mustberuss Dec 21, 2024
07f50e3
generated files
mustberuss Dec 21, 2024
47c014f
generated files
mustberuss Dec 21, 2024
ce1875f
feat: new paging methodology
mustberuss Dec 22, 2024
6e997fc
test: updatng tests for new api version
mustberuss Dec 22, 2024
8d7f7d1
Merge branch 'master' of github.com:mustberuss/patentsview
mustberuss Dec 22, 2024
aae887d
test: updatng tests for new api version
mustberuss Dec 22, 2024
1c1eba5
added skip_on_ci()s
mustberuss Dec 22, 2024
dda8bac
removed skip_on_ci()s
mustberuss Dec 23, 2024
c50f208
removed run_dontrun = TRUE from run_examples
mustberuss Dec 23, 2024
6dc6c7c
generated files
mustberuss Dec 23, 2024
47218b6
docs: API link update
mustberuss Dec 23, 2024
e634e31
ropensci build changes
mustberuss Dec 23, 2024
7f0309e
checking access to the patentsview API key
mustberuss Dec 23, 2024
4434cd2
test removal
mustberuss Dec 24, 2024
3a1efb3
checking secrets access
mustberuss Dec 24, 2024
8e88622
only apply sort if user set one
mustberuss Dec 25, 2024
fdf728b
feat: httr to httr2
mustberuss Dec 20, 2024
9158a56
feat: httr to httr2
mustberuss Dec 20, 2024
ad0defb
docs: API link update
mustberuss Dec 20, 2024
6c2b3dc
feat: dont print document number in scientific notation
mustberuss Dec 20, 2024
62eccf9
build: version bumps to ko deprecated msgs
mustberuss Dec 20, 2024
19f2f43
added timeout - prevent run away jobs
mustberuss Dec 20, 2024
41b246c
feat: casting type changes
mustberuss Dec 20, 2024
8bedcb2
docs: field name change
mustberuss Dec 20, 2024
24c7c51
feat: new casting methodology
mustberuss Dec 20, 2024
1dcb89d
docs: field name change
mustberuss Dec 20, 2024
3454c52
docs: field name change
mustberuss Dec 20, 2024
ad0c9f7
fix: parameters on posts
mustberuss Dec 20, 2024
a5889f8
feat: httr to httr2
mustberuss Dec 20, 2024
6a9aa9c
feat: new paging methodology
mustberuss Dec 20, 2024
7a539b1
feat: removed paging limits
mustberuss Dec 20, 2024
ca71aa1
feat: search_pv parameter updates
mustberuss Dec 20, 2024
f5a1fcb
feat: use new api group/field shorthand
mustberuss Dec 21, 2024
4b2703c
feat: added in_range query function
mustberuss Dec 21, 2024
cdb768c
feat: unnesting plural entities from singular endpoints
mustberuss Dec 21, 2024
467a950
feat: added lifecycle deprecations
mustberuss Dec 21, 2024
d77af76
feat: added in_range query function
mustberuss Dec 21, 2024
722c84c
fix: length check to avoid coercion warning
mustberuss Dec 21, 2024
eb93718
feat: query checking in the new api version
mustberuss Dec 21, 2024
973409b
feat: don't require sort fields to be fields param
mustberuss Dec 21, 2024
146ac99
feat validation in the new api version
mustberuss Dec 21, 2024
8b77cad
feat: search_pv parameter changes
mustberuss Dec 21, 2024
fc56555
feat: new paging methodology
mustberuss Dec 21, 2024
9a8b1df
refactor: getting top level attributes
mustberuss Dec 21, 2024
2d48c50
feat: retrieve_linked can retrieve documentation links
mustberuss Dec 21, 2024
f8da869
feat: search_pv parameter updates
mustberuss Dec 21, 2024
37fd524
docs: API link update
mustberuss Dec 21, 2024
79e232a
docs: parameter and example changes
mustberuss Dec 21, 2024
745570d
docs: plural to singular endpoints
mustberuss Dec 21, 2024
472eb89
feat validation in the new api version
mustberuss Dec 21, 2024
1511e61
generated files
mustberuss Dec 21, 2024
6167721
generated files
mustberuss Dec 21, 2024
20a4f0d
feat: new paging methodology
mustberuss Dec 22, 2024
3040e78
test: updatng tests for new api version
mustberuss Dec 22, 2024
41ec537
test: updatng tests for new api version
mustberuss Dec 22, 2024
f023a31
added skip_on_ci()s
mustberuss Dec 22, 2024
591b82e
removed skip_on_ci()s
mustberuss Dec 23, 2024
a589f38
removed run_dontrun = TRUE from run_examples
mustberuss Dec 23, 2024
399a097
generated files
mustberuss Dec 23, 2024
1ab2581
docs: API link update
mustberuss Dec 23, 2024
bc5467a
ropensci build changes
mustberuss Dec 23, 2024
86b26aa
checking access to the patentsview API key
mustberuss Dec 23, 2024
763d1f2
test removal
mustberuss Dec 24, 2024
7a9c755
checking secrets access
mustberuss Dec 24, 2024
1e400a5
only apply sort if user set one
mustberuss Dec 25, 2024
69c366b
adding secondary sort
mustberuss Dec 28, 2024
1046db4
merge build change
mustberuss Dec 28, 2024
a30f934
Merge branch 'master' of github.com:mustberuss/patentsview
mustberuss Dec 29, 2024
7d6ba01
checking write access to the repo
mustberuss Dec 29, 2024
570edb7
workflow revert
mustberuss Dec 29, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -19,7 +19,7 @@ LazyData: TRUE
Depends:
R (>= 3.1)
Imports:
httr,
httr2,
lifecycle,
jsonlite,
utils
1 change: 1 addition & 0 deletions NAMESPACE
@@ -10,6 +10,7 @@ export(cast_pv_data)
export(get_endpoints)
export(get_fields)
export(get_ok_pk)
export(pad_patent_id)
export(qry_funs)
export(retrieve_linked_data)
export(search_pv)
50 changes: 40 additions & 10 deletions R/cast-pv-data.R
@@ -5,21 +5,23 @@ as_is <- function(x) x
get_cast_fun <- function(data_type) {
# Some fields aren't documented, so we don't know what their data type is. Use
# string type for these.
# new version of the API: state of string vs fulltext is in flux. Latter currently unused
if (length(data_type) != 1) data_type <- "string"
switch(
data_type,
switch(data_type,
"string" = as_is,
"date" = as.Date,
"float" = as.numeric,
"integer" = as.integer,
"number" = as_is,
"integer" = as_is,
"int" = as.integer,
"fulltext" = as_is
"fulltext" = as_is,
"boolean" = as_is,
"bool" = as.logical
)
}

#' @noRd
lookup_cast_fun <- function(name, typesdf) {
data_type <- typesdf[typesdf$field == name, "data_type"]
data_type <- typesdf[typesdf$common_name == name, "data_type"]
get_cast_fun(data_type = data_type)
}
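The type-to-caster lookup in this hunk can be tried in isolation. The following is a minimal sketch, assuming a hypothetical `toy_types` data frame in place of the package's `fieldsdf`. The `switch` arms are copied from the lines above, which interleave the old and new versions, so the exact set of arms is an assumption rather than the merged code.

```r
# Sketch only: toy_types is hypothetical data standing in for fieldsdf.
as_is <- function(x) x

get_cast_fun <- function(data_type) {
  # Undocumented fields match zero rows, so fall back to "string".
  if (length(data_type) != 1) data_type <- "string"
  switch(data_type,
    "string" = as_is,
    "date" = as.Date,
    "number" = as_is,
    "integer" = as_is,
    "int" = as.integer,
    "fulltext" = as_is,
    "boolean" = as_is,
    "bool" = as.logical
  )
}

toy_types <- data.frame(
  common_name = c("patent_date", "patent_year"),
  data_type = c("date", "int"),
  stringsAsFactors = FALSE
)

lookup_cast_fun <- function(name, typesdf) {
  get_cast_fun(typesdf[typesdf$common_name == name, "data_type"])
}

patent_date <- lookup_cast_fun("patent_date", toy_types)("2007-01-04")
patent_year <- lookup_cast_fun("patent_year", toy_types)("2007")
unknown <- lookup_cast_fun("not_documented", toy_types)("kept as-is")
```

A field missing from the lookup table yields a zero-length `data_type`, which is why the length check routes it to the pass-through caster instead of failing inside `switch`.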

@@ -29,6 +31,18 @@ cast_one.character <- function(one, name, typesdf) {
cast_fun(one)
}

#' @noRd
cast_one.double <- function(one, name, typesdf) {
cast_fun <- lookup_cast_fun(name, typesdf)
cast_fun(one)
}

#' @noRd
cast_one.integer <- function(one, name, typesdf) {
cast_fun <- lookup_cast_fun(name, typesdf)
cast_fun(one)
}

#' @noRd
cast_one.default <- function(one, name, typesdf) NA
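The added `cast_one.double` and `cast_one.integer` methods rely on S3 dispatch over a vector's implicit class. A small sketch of that mechanism follows; `describe_one` is a hypothetical generic used only for illustration and is not part of the package.

```r
# Sketch only: describe_one is a hypothetical generic, not a package function.
describe_one <- function(one) UseMethod("describe_one")
describe_one.character <- function(one) "character method"
describe_one.double <- function(one) "double method"
describe_one.integer <- function(one) "integer method"
describe_one.default <- function(one) NA

from_chr <- describe_one("5116621")   # character vector
from_dbl <- describe_one(5116621)     # double vector
from_int <- describe_one(5116621L)    # integer vector
from_lgl <- describe_one(TRUE)        # no logical method, falls to default
```

Without the double and integer methods, numeric columns would reach the default method and come back as `NA`, which appears to be the case the new methods are meant to cover.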

@@ -69,17 +83,33 @@ cast_one <- function(one, name, typesdf) UseMethod("cast_one")
#' \dontrun{
#'
#' fields <- c("patent_date", "patent_title", "patent_year")
#' res <- search_pv(query = "{\"patent_number\":\"5116621\"}", fields = fields)
#' res <- search_pv(query = "{\"patent_id\":\"5116621\"}", fields = fields)
#' cast_pv_data(data = res$data)
#' }
#'
#' @export
cast_pv_data <- function(data) {
validate_pv_data(data)

endpoint <- names(data)
entity_name <- names(data)

if (entity_name == "rel_app_texts") {
# blend the fields from both rel_app_texts entities
typesdf <- unique(fieldsdf[fieldsdf$group == entity_name, c("common_name", "data_type")])
} else {
# need to get the endpoint from entity_name
endpoint_df <- fieldsdf[fieldsdf$group == entity_name, ]
endpoint <- unique(endpoint_df$endpoint)

# watch out here- several endpoints return entities that are groups returned
# by the patent and publication endpoints (attorneys, inventors, assignees)
if(length(endpoint) > 1) {
endpoint <- endpoint[!endpoint %in% c("patent", "publication")]
}

typesdf <- fieldsdf[fieldsdf$endpoint == endpoint, c("common_name", "data_type")]

typesdf <- fieldsdf[fieldsdf$endpoint == endpoint, c("field", "data_type")]
}

df <- data[[1]]

@@ -89,7 +119,7 @@ cast_pv_data <- function(data) {

df[] <- list_out
out_data <- list(x = df)
names(out_data) <- endpoint
names(out_data) <- entity_name

structure(
out_data,
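The endpoint lookup added to `cast_pv_data` above can be sketched on a small table. This is an illustration under assumptions: `toy_fieldsdf` and `resolve_endpoint` are hypothetical names, and the rows are invented to show the disambiguation step, not taken from the real `fieldsdf`.

```r
# Sketch only: toy_fieldsdf is hypothetical and far smaller than fieldsdf.
toy_fieldsdf <- data.frame(
  endpoint = c("patent", "patent", "publication", "inventor"),
  group = c("patents", "inventors", "inventors", "inventors"),
  common_name = c("patent_id", "inventor_id", "inventor_id", "inventor_id"),
  data_type = c("string", "string", "string", "string"),
  stringsAsFactors = FALSE
)

resolve_endpoint <- function(entity_name, fieldsdf) {
  endpoint <- unique(fieldsdf[fieldsdf$group == entity_name, "endpoint"])
  # A group such as "inventors" is also returned nested by the patent and
  # publication endpoints, so drop those when more than one endpoint matches.
  if (length(endpoint) > 1) {
    endpoint <- endpoint[!endpoint %in% c("patent", "publication")]
  }
  endpoint
}

inventors_endpoint <- resolve_endpoint("inventors", toy_fieldsdf)
patents_endpoint <- resolve_endpoint("patents", toy_fieldsdf)
```

The filter only runs when the group name is ambiguous, so an entity that belongs solely to the patent endpoint still resolves to `"patent"`.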
35 changes: 21 additions & 14 deletions R/check-query.R
@@ -10,28 +10,32 @@ is_int <- function(x)

#' @noRd
is_date <- function(x)
grepl("[12][[:digit:]]{3}-[01][[:digit:]]-[0-3][[:digit:]]", x)
grepl("^[12][[:digit:]]{3}-[01][[:digit:]]-[0-3][[:digit:]]$", x)

#' @noRd
one_check <- function(operator, field, value, f1) {

if (nrow(f1) == 0)
stop2(field, " is not a valid field to query for your endpoint")
if (f1$data_type == "date" && !is_date(value))
stop2("Bad date: ", value, ". Date must be in the format of yyyy-mm-dd")
if (f1$data_type %in% c("string", "fulltext") && !is.character(value))
if (f1$data_type %in% c("bool", "int", "string", "fulltext") && !is.character(value))
stop2(value, " must be of type character")
if (f1$data_type == "integer" && !is_int(value))
stop2(value, " must be an integer")
if (f1$data_type == "boolean" && !is.logical(value))
stop2(value, " must be a boolean")
if (f1$data_type == "number" && !is.numeric(value))
stop2(value, " must be a number")

if (
(operator %in% c("_begins", "_contains") && !(f1$data_type == "string")) ||
(operator %in% c("_text_all", "_text_any", "_text_phrase") &&
!(f1$data_type == "fulltext")) ||
(f1$data_type %in% c("string", "fulltext") &&
operator %in% c("_gt", "_gte", "_lt", "_lte"))
)
# The new version of the API blurs the distinction between string/fulltext fields.
# It looks like the string/fulltext functions can be used interchangeably
(operator %in% c("_begins", "_contains", "_text_all", "_text_any", "_text_phrase") &&
!(f1$data_type == "fulltext" || f1$data_type == "string")) ||
(f1$data_type %in% c("string", "fulltext") &&
operator %in% c("_gt", "_gte", "_lt", "_lte"))) {
stop2("You cannot use the operator ", operator, " with the field ", field)
}
}
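The `is_date` change in this hunk adds `^` and `$` anchors to the pattern. A short sketch of the difference, using the two regular expressions exactly as they appear above; the function names and test strings are illustrative.

```r
# Old pattern: matches a date-like run anywhere in the string.
is_date_unanchored <- function(x)
  grepl("[12][[:digit:]]{3}-[01][[:digit:]]-[0-3][[:digit:]]", x)

# New pattern: the whole string must be the date.
is_date_anchored <- function(x)
  grepl("^[12][[:digit:]]{3}-[01][[:digit:]]-[0-3][[:digit:]]$", x)

candidates <- c("2007-01-04", "2007-01-04T00:00:00", "x2007-01-045")
unanchored <- is_date_unanchored(candidates)
anchored <- is_date_anchored(candidates)

# Neither pattern checks the calendar: month 19 and day 39 still pass.
shape_only <- is_date_anchored("2007-19-39")
```

The anchored version rejects trailing or leading text, but it remains a shape check rather than a date validation.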

#' @noRd
@@ -40,13 +44,16 @@ check_query <- function(query, endpoint) {
num_opr <- c("_gt", "_gte", "_lt", "_lte")
str_opr <- c("_begins", "_contains")
fltxt_opr <- c("_text_all", "_text_any", "_text_phrase")
all_opr <- c(simp_opr, num_opr, str_opr, fltxt_opr)
all_opr <- c(simp_opr, num_opr, str_opr, fltxt_opr, "_in_range")

flds_flt <- fieldsdf[fieldsdf$endpoint == endpoint & fieldsdf$can_query == "y", ]
flds_flt <- fieldsdf[fieldsdf$endpoint == endpoint, ]

apply_checks <- function(x, endpoint) {
x <- swap_null_nms(x)
if (names(x) %in% c("_not", "_and", "_or") || is.na(names(x))) {

# troublesome next line: 'length(x) = 2 > 1' in coercion to 'logical(1)'
# if (names(x) %in% c("_not", "_and", "_or") || is.na(names(x))) {
if (length(names(x)) > 1 || names(x) %in% c("_not", "_and", "_or") || is.na(names(x))) {
lapply(x, FUN = apply_checks)
} else if (names(x) %in% all_opr) {
f1 <- flds_flt[flds_flt$field == names(x[[1]]), ]
@@ -61,8 +68,8 @@ check_query <- function(query, endpoint) {
)
} else {
stop2(
names(x), " is either not a valid operator or not a ",
"queryable field for this endpoint"
names(x), " is not a valid operator or not a ",
"valid field for this endpoint"
)
}
}
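The length check added in `apply_checks` guards the `||` chain against a vector-valued left operand, which recent R versions report as a warning or an error. A minimal sketch of the guard; `is_compound` is a hypothetical helper that mirrors the condition above and is not a package function.

```r
# Sketch only: is_compound is a hypothetical helper, not a package function.
is_compound <- function(nms) {
  # length(nms) > 1 is tested first. When it is TRUE, || short-circuits and the
  # vector-valued %in% comparison is never used as a condition.
  length(nms) > 1 || nms %in% c("_not", "_and", "_or") || is.na(nms)
}

two_names <- is_compound(c("_gte", "_lte"))  # multiple names: recurse
joiner <- is_compound("_and")                # logical operator: recurse
comparison <- is_compound("_eq")             # single operator: check it
```

Placing the length test first means the remaining operands only ever see a single name.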
2 changes: 1 addition & 1 deletion R/data.R
@@ -3,7 +3,7 @@
#' A data frame containing the names of retrievable fields for each of the
#' endpoints. You can find this data on the API's online documentation for each
#' endpoint as well (e.g., the
#' \href{https://patentsview.org/apis/api-endpoints/patents}{patents endpoint
#' \href{https://search.patentsview.org/docs/docs/Search%20API/SearchAPIReference/#patent}{patent endpoint
#' field list table}).
#'
#' @format A data frame with the following columns:
57 changes: 50 additions & 7 deletions R/get-fields.R
@@ -1,3 +1,9 @@
#' @noRd
get_top_level_attributes <- function(endpoint) {
fieldsdf[fieldsdf$endpoint == endpoint & !grepl("\\.", fieldsdf$field), "field"]
}


#' Get list of retrievable fields
#'
#' This function returns a vector of fields that you can retrieve from a given
@@ -13,15 +19,18 @@
#' endpoint's fields (i.e., do not filter the field list based on group
#' membership). See the field tables located online to see which groups you
#' can specify for a given endpoint (e.g., the
#' \href{https://search.patentsview.org/docs/docs/Search%20API/SearchAPIReference/#patent}{patent
#' \href{https://search.patentsview.org/docs/docs/Search%20API/SearchAPIReference/#patent}{patents
#' endpoint table}), or use the \code{fieldsdf} table
#' (e.g., \code{unique(fieldsdf[fieldsdf$endpoint == "patent", "group"])}).
#' @param include_pk Boolean on whether to include the endpoint's primary key,
#' defaults to FALSE. The primary key is needed if you plan on calling
#' \code{\link{unnest_pv_data}} on the results of \code{\link{search_pv}}
#'
#' @return A character vector with field names.
#'
#' @examples
#' # Get all assignee-level fields for the patent endpoint:
#' fields <- get_fields(endpoint = "patent", groups = "assignees")
#' # Get all top level (non-nested) fields for the patent endpoint:
#' fields <- get_fields(endpoint = "patent", groups = c("patents"))
#'
#' # ...Then pass to search_pv:
#' \dontrun{
@@ -31,7 +40,7 @@
#' fields = fields
#' )
#' }
#' # Get all patent and assignee-level fields for the patent endpoint:
#' # Get unnested patent and assignee-level fields for the patent endpoint:
#' fields <- get_fields(endpoint = "patent", groups = c("assignees", "patents"))
#'
#' \dontrun{
@@ -41,15 +50,49 @@
#' fields = fields
#' )
#' }
#' # Get the nested inventors fields and the primary key in order to call unnest_pv_data
#' # on the returned data. unnest_pv_data would throw an error if the primary key was
#' # not present in the results.
#' fields <- get_fields(endpoint = "patent", groups = c("inventors"), include_pk = TRUE)
#'
#' \dontrun{
#' # ...Then pass to search_pv and unnest the results
#' results <- search_pv(
#' query = '{"_gte":{"patent_date":"2007-01-04"}}',
#' fields = fields
#' )
#' unnest_pv_data(results$data)
#' }
#'
#' @export
get_fields <- function(endpoint, groups = NULL) {
get_fields <- function(endpoint, groups = NULL, include_pk = FALSE) {
validate_endpoint(endpoint)

# using API's shorthand notation, group names can be requested as fields instead of
# fully qualifying each nested field. Fully qualified, all patent endpoint's attributes
# is over 4K, too big to be sent on a GET with a modest query

pk <- get_ok_pk(endpoint)
plural_entity <- fieldsdf[fieldsdf$endpoint == endpoint & fieldsdf$field == pk, "group"]
top_level_attributes <- get_top_level_attributes(endpoint)

if (is.null(groups)) {
fieldsdf[fieldsdf$endpoint == endpoint, "field"]
c(
top_level_attributes,
unique(fieldsdf[fieldsdf$endpoint == endpoint & fieldsdf$group != plural_entity, "group"])
)
} else {
validate_groups(endpoint, groups = groups)
fieldsdf[fieldsdf$endpoint == endpoint & fieldsdf$group %in% groups, "field"]

# don't include pk if plural_entity group is requested (pk would be a member)
extra_field <- if (include_pk && !plural_entity %in% groups) pk else NULL
extra_fields <- if (plural_entity %in% groups) top_level_attributes else NULL

c(
extra_field,
extra_fields,
groups[!groups == plural_entity]
)
}
}
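The field selection in the new `get_fields` body can be sketched on a toy table. This is a simplified illustration, not the package function: `toy_fieldsdf` and `pick_fields` are hypothetical, and the primary key is passed in directly instead of being looked up with `get_ok_pk`.

```r
# Sketch only: toy_fieldsdf and pick_fields are hypothetical stand-ins.
toy_fieldsdf <- data.frame(
  endpoint = "patent",
  field = c("patent_id", "patent_title", "inventors.inventor_id",
            "assignees.assignee_id"),
  group = c("patents", "patents", "inventors", "assignees"),
  stringsAsFactors = FALSE
)

pick_fields <- function(groups, fieldsdf, pk = "patent_id", include_pk = FALSE) {
  # Top-level attributes are the fields without a "." in their name.
  top_level <- fieldsdf[!grepl("\\.", fieldsdf$field), "field"]
  plural_entity <- fieldsdf[fieldsdf$field == pk, "group"]

  c(
    # The primary key is already in top_level when plural_entity is requested.
    if (include_pk && !plural_entity %in% groups) pk,
    if (plural_entity %in% groups) top_level,
    # Nested groups are requested by name (the API's shorthand).
    groups[groups != plural_entity]
  )
}

with_patents <- pick_fields(c("assignees", "patents"), toy_fieldsdf)
with_pk <- pick_fields("inventors", toy_fieldsdf, include_pk = TRUE)
```

Sending a group name instead of every qualified nested field keeps the field list short, which is the motivation stated in the code comment above.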

5 changes: 4 additions & 1 deletion R/print.R
@@ -24,7 +24,10 @@ print.pv_data_result <- function(x, ...) {
)

utils::str(
x, vec.len = 1, max.level = 2, give.attr = FALSE, strict.width = "cut"
x, vec.len = 1, max.level = 2, give.attr = FALSE, strict.width = "cut",
formatNum = function(x, ...) {
format(x, trim = TRUE, drop0trailing = TRUE, scientific = FALSE, ...)
}
)
}
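The `formatNum` override above keeps long numeric identifiers out of scientific notation when `str()` prints them. A brief sketch of the effect, assuming an invented `doc_number` value; `str()` formats numbers with few significant digits (its `digits.d` option defaults to 3), which `digits = 3` imitates here.

```r
# Sketch only: doc_number is an invented value for illustration.
doc_number <- 20240123456

# With few significant digits, the default formatting switches to scientific
# notation for a number this long.
default_style <- format(doc_number, digits = 3)

# scientific = FALSE forces fixed notation, as in the formatNum override.
fixed_style <- format(doc_number, digits = 3, trim = TRUE,
                      drop0trailing = TRUE, scientific = FALSE)

# The same formatter handed to str(), as print.pv_data_result does.
utils::str(
  doc_number,
  formatNum = function(x, ...) {
    format(x, trim = TRUE, drop0trailing = TRUE, scientific = FALSE, ...)
  }
)
```

Only the display changes; the underlying value stays a double either way.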

57 changes: 0 additions & 57 deletions R/process-error.R

This file was deleted.

18 changes: 2 additions & 16 deletions R/process-resp.R
@@ -1,23 +1,10 @@
#' @noRd
parse_resp <- function(resp) {
j <- httr::content(resp, as = "text", encoding = "UTF-8")
jsonlite::fromJSON(
j,
simplifyVector = TRUE, simplifyDataFrame = TRUE, simplifyMatrix = TRUE
)
}

#' @noRd
get_request <- function(resp) {
gp <- structure(
list(method = resp$req$method, url = resp$req$url),
list(method = resp$request$method, url = resp$request$url),
class = c("list", "pv_request")
)

if (gp$method == "POST") {
gp$body <- rawToChar(resp$req$options$postfields)
}

gp
}
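For context on the httr to httr2 move, here is a hedged sketch of how a request is assembled in httr2 without sending it. The URL, the `X-Api-Key` header name, and the key value are assumptions for illustration; the package's real request code is not shown in this diff. The block only runs when httr2 is installed.

```r
# Sketch only: the URL, header name, and key below are placeholders/assumptions.
if (requireNamespace("httr2", quietly = TRUE)) {
  req <- httr2::request("https://search.patentsview.org/api/v1/patent/")
  req <- httr2::req_headers(req, "X-Api-Key" = "not-a-real-key")
  req <- httr2::req_url_query(req, q = '{"patent_id":"5116621"}')
  # Cap the total request time so a stalled call cannot run indefinitely.
  req <- httr2::req_timeout(req, 60)

  # Nothing is sent here. req_perform(req) would send it and, by default,
  # signal an R error for 4xx/5xx responses.
  request_url <- req$url
}
```

Because `req_perform()` raises HTTP errors itself by default, an explicit `http_error()` check after the call is presumably no longer needed, which would be consistent with the removal of `R/process-error.R` in this pull request.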

@@ -42,11 +29,10 @@ get_query_results <- function(prsd_resp) {

#' @noRd
process_resp <- function(resp) {
if (httr::http_error(resp)) throw_er(resp)

prsd_resp <- parse_resp(resp)
request <- get_request(resp)
data <- get_data(prsd_resp)

query_results <- get_query_results(prsd_resp)

structure(