Census geocoder single address error #168

Open
lrsulli opened this issue Jun 14, 2022 · 9 comments
Labels
bug Something isn't working

Comments

@lrsulli

lrsulli commented Jun 14, 2022

Thank you for the great package!

I am running into an error using the single address method with the Census Geocoder which was not occurring before the Census recently updated their geocoder. The error message reads "Error: lexical error: invalid char in json text." I have tried to geocode the same list of addresses multiple times and the error occurs at different points in the list and sometimes not at all.

# Geocode addresses -------------------------------------------------------

# Generate counter to track progress
counter <- 0

# Get geocodes

data_geo <- lapply(split_list, function(x) {
    dt_geo <- geocode(
        .tbl = x,
        method = "census",
        mode = "single",
        street = "address",
        city = "city",
        state = "state",
        postalcode = "zip",
        full_results = TRUE,
        api_options = list(census_return_type = 'geographies')
        )
    
    # Check progress
    counter <<- counter + 1
    print(paste0('Progress: ', counter, ' of ', num_dt))

    return(dt_geo)
    })
#> Passing 1,000 addresses to the US Census single address geocoder
#> Warning in query_api(api_url, api_query_parameters, method = method): Bad
#> Request (HTTP 400).
#> Error: lexical error: invalid char in json text.
#>                                        <!DOCTYPE html PUBLIC "-//W3C//
#>                      (right here) ------^

Created on 2022-06-14 by the reprex package (v2.0.1)
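Because the failure is intermittent, one way to avoid losing a whole run is to catch errors per chunk rather than letting a single bad response abort the loop. The following is a minimal sketch, assuming `split_list` is a list of data frames as in the code above. `geocode_chunk` and `fake_geocode` are hypothetical names; `fake_geocode` stands in for the real `geocode()` call so the example runs without network access.

```r
# Hypothetical helper: run `fn` on one chunk and capture any error
# instead of letting it abort the whole lapply() loop.
geocode_chunk <- function(chunk, fn) {
  tryCatch(
    list(result = fn(chunk), error = NULL),
    error = function(e) list(result = NULL, error = conditionMessage(e))
  )
}

# Stand-in for geocode(): fails on chunks flagged as bad (illustration only).
fake_geocode <- function(chunk) {
  if (any(chunk$bad)) stop("lexical error: invalid char in json text.")
  chunk$lat <- NA_real_
  chunk
}

# Made-up input: three one-row chunks, the second of which will fail.
split_list <- list(
  data.frame(address = "A", bad = FALSE),
  data.frame(address = "B", bad = TRUE),
  data.frame(address = "C", bad = FALSE)
)

results <- lapply(split_list, geocode_chunk, fn = fake_geocode)

# Indices of chunks that failed and can be re-run later.
failed <- which(vapply(results, function(r) !is.null(r$error), logical(1)))
```

With the real package, `fn` could be a function that calls `geocode()` with the arguments shown above; the successful chunks are then kept and only the entries listed in `failed` need another attempt.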

Here is my session information for your reference:

  • Session info --------------------------------------------------------------------------
    setting value
    version R version 4.1.0 (2021-05-18)
    os Windows Server x64
    system x86_64, mingw32
    ui RStudio
    language (EN)
    collate English_United States.1252
    ctype English_United States.1252
    tz America/New_York
    date 2022-06-14

  • Packages ------------------------------------------------------------------------------
    ! package * version date lib source
    assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.0.3)
    backports 1.2.1 2020-12-09 [2] CRAN (R 4.0.3)
    cachem 1.0.5 2021-05-15 [2] CRAN (R 4.0.5)
    callr 3.7.0 2021-04-20 [2] CRAN (R 4.0.5)
    cli 2.5.0 2021-04-26 [2] CRAN (R 4.0.5)
    clipr 0.7.1 2020-10-08 [2] CRAN (R 4.0.3)
    crayon 1.4.1 2021-02-08 [2] CRAN (R 4.0.4)
    curl 4.3.1 2021-04-30 [2] CRAN (R 4.0.5)
    data.table * 1.14.0 2021-02-21 [2] CRAN (R 4.0.4)
    DBI 1.1.1 2021-01-15 [2] CRAN (R 4.0.3)
    desc 1.3.0 2021-03-05 [2] CRAN (R 4.0.4)
    devtools 2.4.1 2021-05-05 [2] CRAN (R 4.0.5)
    digest 0.6.27 2020-10-24 [2] CRAN (R 4.0.3)
    dplyr 1.0.6 2021-05-05 [2] CRAN (R 4.0.5)
    ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.0.5)
    evaluate 0.14 2019-05-28 [2] CRAN (R 4.0.3)
    fansi 0.5.0 2021-05-25 [2] CRAN (R 4.0.5)
    fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.0.3)
    forcats 0.5.1 2021-01-27 [2] CRAN (R 4.0.3)
    fs 1.5.0 2020-07-31 [2] CRAN (R 4.0.3)
    generics 0.1.0 2020-10-31 [2] CRAN (R 4.0.3)
    glue 1.4.2 2020-08-27 [2] CRAN (R 4.0.3)
    haven * 2.4.3 2021-08-04 [1] CRAN (R 4.1.1)
    highr 0.9 2021-04-16 [2] CRAN (R 4.0.5)
    hms 1.0.0 2021-01-13 [2] CRAN (R 4.0.3)
    htmltools 0.5.1.1 2021-01-22 [2] CRAN (R 4.0.3)
    httr 1.4.2 2020-07-20 [2] CRAN (R 4.0.3)
    jsonlite 1.7.2 2020-12-09 [2] CRAN (R 4.0.4)
    knitr 1.31 2021-01-27 [2] CRAN (R 4.0.3)
    lifecycle 1.0.0 2021-02-15 [2] CRAN (R 4.0.4)
    magrittr 2.0.1 2020-11-17 [2] CRAN (R 4.0.3)
    memoise 2.0.0 2021-01-26 [2] CRAN (R 4.0.3)
    pillar 1.6.1 2021-05-16 [2] CRAN (R 4.0.5)
    pkgbuild 1.2.0 2020-12-15 [2] CRAN (R 4.0.3)
    pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.3)
    pkgload 1.2.1 2021-04-06 [2] CRAN (R 4.0.5)
    prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.0.3)
    processx 3.5.2 2021-04-30 [2] CRAN (R 4.0.5)
    progress 1.2.2 2019-05-16 [2] CRAN (R 4.0.3)
    ps 1.6.0 2021-02-28 [2] CRAN (R 4.0.4)
    purrr 0.3.4 2020-04-17 [2] CRAN (R 4.0.3)
    R6 2.5.0 2020-10-28 [2] CRAN (R 4.0.3)
    readr 1.4.0 2020-10-05 [2] CRAN (R 4.0.3)
    remotes 2.4.0 2021-06-02 [2] CRAN (R 4.0.2)
    reprex * 2.0.1 2021-08-05 [1] CRAN (R 4.1.3)
    rlang 0.4.11 2021-04-30 [2] CRAN (R 4.0.5)
    rmarkdown 2.8 2021-05-07 [2] CRAN (R 4.0.5)
    rprojroot 2.0.2 2020-11-15 [2] CRAN (R 4.0.3)
    rstudioapi 0.13 2020-11-12 [2] CRAN (R 4.0.3)
    sessioninfo 1.1.1 2018-11-05 [2] CRAN (R 4.0.3)
    styler 1.4.1 2021-03-30 [2] CRAN (R 4.0.4)
    testthat 3.0.2 2021-02-14 [2] CRAN (R 4.0.4)
    tibble 3.1.2 2021-05-16 [2] CRAN (R 4.0.5)
    tidygeocoder * 1.0.5 2021-11-02 [1] CRAN (R 4.1.3)
    tidyselect 1.1.0 2020-05-11 [2] CRAN (R 4.0.3)
    usethis 2.0.1 2021-02-10 [2] CRAN (R 4.0.4)
    utf8 1.2.1 2021-03-12 [2] CRAN (R 4.0.4)
    vctrs 0.3.8 2021-04-29 [2] CRAN (R 4.0.5)
    withr 2.4.2 2021-04-18 [2] CRAN (R 4.0.5)
    D xfun 0.19 2020-10-30 [2] CRAN (R 4.0.2)
    yaml 2.2.1 2020-02-01 [2] CRAN (R 4.0.3)

[1] \rschfs1x/userrs/K-Q/lrs263_RS/Documents/R/win-library/4.1
[2] C:/Program Files/R/R-4.1.0/library

D -- DLL MD5 mismatch, broken installation.

@lrsulli lrsulli added the bug Something isn't working label Jun 14, 2022
@jessecambon
Owner

Hi @lrsulli, do you have a reproducible example you could share? (i.e., a few of the addresses that have caused this error would be helpful so I can try to reproduce what you're seeing)

@lrsulli
Author

lrsulli commented Jun 14, 2022

Below are some addresses where I have encountered this error, but they do not always result in an error. For example, the first address (7 ESB SUPP CO UTILITY, CMP PENDLETON, CA 92055) only resulted in this error the first time I ran the code. Thank you for your help!

example <- tibble(address = c("7 ESB SUPP CO UTILITY",
                              "5545 SKY PATRIC WAY # 160",
                              "664 MISSION BLVD",
                              "28003 INDIAN",
                              "611 WOOLERY",
                              "164 EAST ST"),
                  city = c("CMP PENDLETON",
                           "SACRAMENTO",
                           "RIVERSIDE",
                           "PLS VRDS PNSL",
                           "VAN NUYS",
                           "SACRAMENTO"),
                  state = "CA",
                  zip = c("92055", "95823", "92509", "90275", "91436", "95814")
                  )

Created on 2022-06-14 by the reprex package (v2.0.1)

@jessecambon
Owner

I didn't get any coordinates returned for these addresses; however, I also didn't get any errors, even after running it multiple times. Let me know if you are able to run this code:

library(tidygeocoder)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)

example <- tibble(address = c("7 ESB SUPP CO UTILITY",
                              "5545 SKY PATRIC WAY # 160",
                              "664 MISSION BLVD",
                              "28003 INDIAN",
                              "611 WOOLERY",
                              "164 EAST ST"),
                  city = c("CMP PENDLETON",
                           "SACRAMENTO",
                           "RIVERSIDE",
                           "PLS VRDS PNSL",
                           "VAN NUYS",
                           "SACRAMENTO"),
                  state = "CA",
                  zip = c("92055", "95823", "92509", "90275", "91436", "95814")
)

output <- example %>%
  geocode(method = 'census', 
          mode = 'single',
          full_results = TRUE,
          api_options = list(census_return_type = 'geographies'),
          street = "address", city = "city", state = "state", postalcode = "zip")
#> Passing 6 addresses to the US Census single address geocoder
#> Query completed in: 2.3 seconds

output
#> # A tibble: 6 × 6
#>   address                   city          state zip     lat  long
#>   <chr>                     <chr>         <chr> <chr> <dbl> <dbl>
#> 1 7 ESB SUPP CO UTILITY     CMP PENDLETON CA    92055    NA    NA
#> 2 5545 SKY PATRIC WAY # 160 SACRAMENTO    CA    95823    NA    NA
#> 3 664 MISSION BLVD          RIVERSIDE     CA    92509    NA    NA
#> 4 28003 INDIAN              PLS VRDS PNSL CA    90275    NA    NA
#> 5 611 WOOLERY               VAN NUYS      CA    91436    NA    NA
#> 6 164 EAST ST               SACRAMENTO    CA    95814    NA    NA

Created on 2022-06-14 by the reprex package (v2.0.1)

Environment for reference:

> devtools::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.1.3 (2022-03-10)
 os       Ubuntu 22.04 LTS
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2022-06-14
 rstudio  2022.02.3+492 Prairie Trillium (desktop)
 pandoc   2.17.1.1 @ /usr/lib/rstudio/bin/quarto/bin/ (via rmarkdown)

─ Packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package      * version date (UTC) lib source
 assertthat     0.2.1   2019-03-21 [1] CRAN (R 4.1.3)
 brio           1.1.3   2021-11-30 [1] CRAN (R 4.1.3)
 cachem         1.0.6   2021-08-19 [1] CRAN (R 4.1.3)
 callr          3.7.0   2021-04-20 [1] CRAN (R 4.1.3)
 cli            3.2.0   2022-02-14 [1] CRAN (R 4.1.3)
 clipr          0.8.0   2022-02-22 [1] CRAN (R 4.1.3)
 crayon         1.5.1   2022-03-26 [1] CRAN (R 4.1.3)
 curl           4.3.2   2021-06-23 [1] CRAN (R 4.1.3)
 DBI            1.1.2   2021-12-20 [1] CRAN (R 4.1.3)
 desc           1.4.1   2022-03-06 [1] CRAN (R 4.1.3)
 devtools       2.4.3   2021-11-30 [1] CRAN (R 4.1.3)
 digest         0.6.29  2021-12-01 [1] CRAN (R 4.1.3)
 dplyr        * 1.0.9   2022-04-28 [1] CRAN (R 4.1.3)
 ellipsis       0.3.2   2021-04-29 [1] CRAN (R 4.1.3)
 evaluate       0.15    2022-02-18 [1] CRAN (R 4.1.3)
 fansi          1.0.3   2022-03-24 [1] CRAN (R 4.1.3)
 fastmap        1.1.0   2021-01-25 [1] CRAN (R 4.1.3)
 fs             1.5.2   2021-12-08 [1] CRAN (R 4.1.3)
 generics       0.1.2   2022-01-31 [1] CRAN (R 4.1.3)
 glue           1.6.2   2022-02-24 [1] CRAN (R 4.1.3)
 highr          0.9     2021-04-16 [1] CRAN (R 4.1.3)
 hms            1.1.1   2021-09-26 [1] CRAN (R 4.1.3)
 htmltools      0.5.2   2021-08-25 [1] CRAN (R 4.1.3)
 httr           1.4.2   2020-07-20 [1] CRAN (R 4.1.3)
 jsonlite       1.8.0   2022-02-22 [1] CRAN (R 4.1.3)
 knitr          1.38    2022-03-25 [1] CRAN (R 4.1.3)
 lifecycle      1.0.1   2021-09-24 [1] CRAN (R 4.1.3)
 magrittr       2.0.3   2022-03-30 [1] CRAN (R 4.1.3)
 memoise        2.0.1   2021-11-26 [1] CRAN (R 4.1.3)
 mime           0.12    2021-09-28 [1] CRAN (R 4.1.3)
 pillar         1.7.0   2022-02-01 [1] CRAN (R 4.1.3)
 pkgbuild       1.3.1   2021-12-20 [1] CRAN (R 4.1.3)
 pkgconfig      2.0.3   2019-09-22 [1] CRAN (R 4.1.3)
 pkgload        1.2.4   2021-11-30 [1] CRAN (R 4.1.3)
 prettyunits    1.1.1   2020-01-24 [1] CRAN (R 4.1.3)
 processx       3.5.3   2022-03-25 [1] CRAN (R 4.1.3)
 progress       1.2.2   2019-05-16 [1] CRAN (R 4.1.3)
 ps             1.6.0   2021-02-28 [1] CRAN (R 4.1.3)
 purrr          0.3.4   2020-04-17 [1] CRAN (R 4.1.3)
 R6             2.5.1   2021-08-19 [1] CRAN (R 4.1.3)
 remotes        2.4.2   2021-11-30 [1] CRAN (R 4.1.3)
 reprex         2.0.1   2021-08-05 [1] CRAN (R 4.1.3)
 rlang          1.0.2   2022-03-04 [1] CRAN (R 4.1.3)
 rmarkdown      2.13    2022-03-10 [1] CRAN (R 4.1.3)
 rprojroot      2.0.3   2022-04-02 [1] CRAN (R 4.1.3)
 rstudioapi     0.13    2020-11-12 [1] CRAN (R 4.1.3)
 sessioninfo    1.2.2   2021-12-06 [1] CRAN (R 4.1.3)
 testthat       3.1.3   2022-03-29 [1] CRAN (R 4.1.3)
 tibble       * 3.1.7   2022-05-03 [1] CRAN (R 4.1.3)
 tidygeocoder * 1.0.5   2021-11-02 [1] CRAN (R 4.1.3)
 tidyselect     1.1.2   2022-02-21 [1] CRAN (R 4.1.3)
 usethis        2.1.5   2021-12-09 [1] CRAN (R 4.1.3)
 utf8           1.2.2   2021-07-24 [1] CRAN (R 4.1.3)
 vctrs          0.4.1   2022-04-13 [1] CRAN (R 4.1.3)
 withr          2.5.0   2022-03-03 [1] CRAN (R 4.1.3)
 xfun           0.30    2022-03-02 [1] CRAN (R 4.1.3)
 yaml           2.3.5   2022-02-21 [1] CRAN (R 4.1.3)

@lrsulli
Author

lrsulli commented Jun 14, 2022

I was able to run your code without any errors. I believe the probability of the error occurring increases with the size of the address list. I was able to replicate the error below with a list of just over 200 addresses. I got the error on my sixth run of this code, so you will likely have to run it several times to replicate it.

library(tidygeocoder)
#> Warning: package 'tidygeocoder' was built under R version 4.1.3
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

library(tibble)

example <- tibble(address = c("7 ESB SUPP CO UTILITY", "5545 SKY PATRIC WAY # 160",
                              "664 MISSION BLVD", "28003 INDIAN", "611 WOOLERY",
                              "164 EAST ST", "5807 TOPANGA CANYON BLVD APT L208",
                              "26880 LA PAZ RD", "2550 TELEGRAPH AVE APT 418",
                              "1287 S GRANDVIEW AVE # 502", "928 MORRIS PL",
                              "720 SANBORN L AVE", "7630 GOLETA AVE",
                              "480 CAMINO DEL RIO S", "51111 FAYE ST",
                              "223 SORENSON RD", "4630 DORCHESTER LN",
                              "1154 YORKSHIRE CIR", "12007 16TH ST",
                              "RR 190 BOX 3121", "4935 PARKLAND AVE",
                              "1241 DRAKE DR", "21750 MATTESON AVE",
                              "16235 1ST AVE # 2", "1958 MATADOR WAY UNIT 464",
                              "859 E ROUTE 66 STE A", "16015 GARVEY SPC 49",
                              "3216 W WEST ST # 9917", "418 HAZARD",
                              "28882 GAVIOTA AVE # 1", "5149 ANTONE",
                              "411 N VISTA ST", "27301 N EVI LN # 20101",
                              "MR1168 E CONNIE DR", "437 S COCHRAN AVE # 2",
                              "14434 SHARIAN RD","3637 SNELL AVE # 309",
                              "3401 OLIVERWOOD ST", "1300 VIST WALNUT",
                              "128 RANCH RD", "1601 LARK AVE # 8",
                              "127 W C ST", "27696 CAMINO CAPISTRANO",
                              "4307 RENAISSANCE DR APT 304", "1199 E BELLEVUE RD",
                              "3743 GIBSON", "1213 BASSFORD RD", "169 MALIBU RD",
                              "966 WILLOW CREEK RD", "2245 W 250TH ST",
                              "200 E RACQUET CLUB RD UNIT 39", "5245 GOLDENWOOD DARREN J DR",
                              "1202 VERMONT AVE", "13317 CAMINITO CIERA # 131",
                              "691 RANDOLPH AVE", "8180 LANGDON AVE # 301",
                              "2059 CROOKED CRK", "9033 BASELINE RD # F",
                              "12170 WILSTY DIR WAY", "109050 LUGO AVE",
                              "231160 CAVANAUGH", "210 W WASHINGTON ST",
                              "22135 CALIJCO EXPS WAY", "136 72ND AVE # 168",
                              "10561 LUBAO AVE", "VALLEY RANCH RD",
                              "420 S NEW NH HWY # 1", "985 ANDORA PL",
                              "4545 W 71ST ST", "12711 W KING ST", "15831 OLDEN",
                              "2020 CAMINO DE LA REINA UNIT 305", "3528 HOLLOW LN",
                              "PUBLC GDN", "6110 FRIARS RD STE 101",
                              "17099 W BERNARDO DR # 5-208", "109 S SCHOOL ST # A",
                              "6814 TRUS LN", "4 SCG SUPPORT CTR",
                              "2100 S STATE ST APT 41", "317 W WILSHIRE AVE # 1",
                              "2784 BLEMONT", "330 PATEO L REIO # A",
                              "46 DUNE PALMS # 59", "1168 SYDNEY AVE",
                              "358 BROADWAY", "3312 LYMAN RESIDENCES",
                              "1330 W PEARL ST APT B", "16745 2 SHERMAN WAY # 1",
                              "222813 14TH ST", "1107 W MEMORY LN # 13D",
                              "6737 BRENTFORD C", "W 169TH ST", "858 ANGOINETTE LN # F",
                              "1208 BUDGET MOTEL", "7621 S 4TH",
                              "1914 CANYON RD", "4309 30", "1540 PLUMAS",
                              "23416 SANDPIT RD", "9345 HARMON ST",
                              "140 S MARIPOSA AVE APT 309", "336 E HILLCREST BLVD STE 522",
                              "260 N 3RD AVE APT C302", "745 S STATE HIGHWAY 65",
                              "11500 W MANNING", "26 COURT",
                              "932 DPARKMAN AVE", "23 E VALLEY BLVD # 182",
                              "22614 VAN DEENE", "139 59TH SPRING WATER CT",
                              "1819 DE VINA ST LN", "4718 WILLIAMS RD",
                              "21214 MILLPOINT", "157 W ETINANDA AVE",
                              "3120 P O BOX",  "2350 LOWER CHILES RD",
                              "662 W HUNTINGTON DR # 502", "633 COLLEGE GROME",
                              "25 W KAIBAB AVE", "880 W REGENT ST",
                              "76 CERRITOS AVE # 9166", "5929 GEORGIA DR",
                              "108 W 2ND ST APT 312", "3000 SAND HILL RD # 4190",
                              "9603 AMANITO DEL FELIZ", "7005 DUNSMERE ST",
                              "41995 STATE HIGHWAY 74 STE B", "RR 464",
                              "3224 S GA", "48 GREEN HILLS DR",
                              "233446 HAYNES ST", "11633 BAYSE 20 ST",
                              "10824 QUJUNGA CANYON BLVD FOOTHIL BLVD",
                              "2278 SUTTON LOOP", "117 GRANDVIEW",
                              "2722 SEABRILLO CT", "124 N WESTMORELAND AVE APT 105",
                              "5287 E PAOLI WAY", "2395 MACARTHUR RD # S-1-135",
                              "2250 HOMESTEAD CT APT 110", "3961 MARTIN LUTH BLVD # K",
                              "5625 LEHSMAN AVE", "748 ALSBARRY AVE",
                              "120047 CONDON", "8330 CLEARWATER CT",
                              "932012 5TH BLVD", "309 E SEA CRST # 206",
                              "160 VIA PAZ LN",  "11525 VIA PLAYA DE",
                              "THE EST", "3417 CALLE VETA LN",
                              "10 S VOLUNTARIO ST # A", "16211 DORAL ST",
                              "1400 W 2 OLYMPIC BLVD # 1",  "1125 DE GARZA LN # B",
                              "12922 FOX ST", "3500 W 15TH ST",
                              "NONE",  "1614 24TH ST", "12700 ELLIOTT AVE SPC 175",
                              "745 W 19TH ST # 136", "8670 PIGENON RD",
                              "407 S BROADWAY",  "2528 HALF ADELIA AVE",
                              "5426 FANUEL ST", "19307 DEL SOL AVE",
                              "495 VERMONT ST", "3437 SIENN ST",
                              "150180 S RANCHOSANTAFE", "9450 S LACINTAC BLVD",
                              "1035 PARK ST", "535 WOODLAWN AVE",
                              "2121 S EL CAMINO REAL # 616", "15230 OLD CREAK RD",
                              "218 E HFL 10TH ST # 1", "591 STREET # DJ",
                              "68 C", "413 HERMOSA DR",  "780 OAK GROVE RD # RD318",
                              "3087 DAV AVE # 1", "1528 ROUTE AVE",
                              "15038 SANTANA ST", "PO PO BOX", "1316 VIA PORTOLA",
                              "220 E KEEFE # O", "1825 S VARDON",
                              "199 ARLIGTON WAY",  "2720 GARFIELD AVE",
                              "39 SUNSET RD", "10565 HANOVER A AVE",
                              "10279 CAMINITO RIO", "1305 HUMBUG CREEK RD # A",
                              "7061 MADISON AVE # B7", "6129 HALF PIEDMONT AVE",
                              "2044 W 236TH", "411 N CHESTER AVE APT 3",
                              "306 W EL NORTE PKWY # 120",  "1760 VIA PACIFICA # C210",
                              "3149 LA SELVA CIR APT 10", "600 N HUMBOLDT AVE APT 265",
                              "1442 E 215TH ST", "9775 AVENUE",
                              "688 ALTURAS", "246 E 89TH ST", "1419 W 2 36TH # 1"),
                  city = c("CMP PENDLETON", "SACRAMENTO", "RIVERSIDE", "PLS VRDS PNSL",
                           "VAN NUYS", "SACRAMENTO", "WOODLAND HILLS", "ALISO VIEJO",
                           "BERKELEY", "LOS ANGELES", "MONTEBELLO", "LOS ANGELES",
                           "YUCCA VALLEY", "SAN DIEGO", "GARDEN GROVE", "REDCREST",
                           "RNCHO CORDOVA", "STOCKTON", "ARTESIA", "PORTERVILLE",
                           "LOS ANGELES", "SAN DIEGO", "LOS ANGELES", "PARAMOUNT",
                           "NORTHRIDGE", "GLENDORA", "POMONA", "INGLEWOOD",
                           "SANTA ANA", "MENIFEE", "MARIPOSA", "LANCASTER",
                           "CANYON COUNTRY", "HANFORD", "VAN NUYS", "LATHROP",
                           "SAN FRANCISCO", "RIVERSIDE", "ORANGE", "BISHOP",
                           "LOS GATOS", "BRAWLEY", "LAGUNA NIGUEL", "SAN JOSE",
                           "ATWATER", "MONTE RIO", "EUREKA", "BAYSIDE",
                           "LAKE ARROWHEAD", "HARBOR CITY", "PALM SPRINGS",
                           "INGLEWOOD", "LOS ANGELES", "SAN DIEGO", "COSTA MESA",
                           "NORTH HILLS", "POMONA", "SAN PEDRO", "POWAY",
                           "SN BERNRDNO", "EL TORO", "RIVERSIDE", "PERRIS",
                           "TIPTON", "LOS ANGELES", "SANTA CLARITA", "LOS ANGELES",
                           "GILROY", "LAWNDALE", "SN BERNRDNO", "BURBANK",
                           "SAN DIEGO", "HAYWARD", "OAKLAND", "SAN DIEGO",
                           "SAN DIEGO", "GRASS VALLEY", "CITRUS HTS", "ALAMEDA",
                           "UKIAH", "FULLERTON", "BELMONT", "LOS ANGELES",
                           "LA QUINTA", "NEWMAN", "SACRAMENTO", "SN BERNRDNO",
                           "ANAHEIM", "VAN NUYS", "NEWHALL", "ANAHEIM",
                           "OAKLAND", "LAWNDALE", "DALY CITY", "MINERAL",
                           "ALHAMBRA", "SAN RAFAEL", "SACRAMENTO", "LEBEC",
                           "BURNEY", "SANTA ANA", "LOS ANGELES", "INGLEWOOD",
                           "UPLAND", "LINCOLN", "CARUTHERS", "VISALIA",
                           "LOS ANGELES", "ALHAMBRA", "GARDENA", "GARDEN GROVE",
                           "SANTA BARBARA", "SANTA CLARA", "WILMINGTON", "NORTH HILLS",
                           "FALLBROOK", "SAINT HELENA", "MONROVIA", "SAN DIEGO",
                           "BAKERSFIELD", "INGLEWOOD", "ANAHEIM", "SACRAMENTO",
                           "LOS ANGELES", "MENLO PARK", "SAN DIEGO", "LAMONT",
                           "HEMET", "WASCO", "VISALIA", "ANTIOCH", "WINNETKA",
                           "EL MONTE", "TUJUNGA", "FREMONT", "PETALUMA", "ANTIOCH",
                           "LOS ANGELES", "LONG BEACH", "MONTEREY", "LOS ALTOS",
                           "LYNWOOD", "ALTA LOMA", "LA PUENTE", "LAWNDALE",
                           "APPLE VALLEY", "LOS ANGELES", "INGLEWOOD", "SAN MARCOS",
                           "SAN DIEGO", "RESCUE", "SAN CLEMENTE", "SANTA BARBARA",
                           "TUSTIN", "LOS ANGELES", "SAN CLEMENTE", "PACOIMA",
                           "ROSAMOND", "FRESNO", "POINT ARENA", "EL MONTE",
                           "COSTA MESA", "MORENO VALLEY", "SANTA ANA", "S EL MONTE",
                           "SAN DIEGO", "WALNUT", "SAN JOSE", "RIVERBANK",
                           "SAN MARCOS", "INGLEWOOD", "BELLFLOWER", "BEVERLY HILLS",
                           "SAN MATEO", "EL CAJON", "SN BERNRDNO", "CHULA VISTA",
                           "BELLFLOWER", "CORONA", "CONCORD", "SAN JOSE",
                           "ROWLAND HEIGHTS", "CUDAHY", "LOS ANGELES", "LOS ANGELES",
                           "PALO ALTO", "EL TORO", "REDWOOD CITY", "SANTA ANA",
                           "BRENTWOOD", "OAKLAND", "SAN DIEGO", "APPLEGATE",
                           "SACRAMENTO", "LOS ANGELES", "TORRANCE", "COMPTON",
                           "ESCONDIDO", "CORONA", "SAN MATEO", "WILLOWS",
                           "WILMINGTON", "MONTCLAIR", "BAKERSFIELD", "ALPINE",
                           "LOS ANGELES"),
                  state = "CA",
                  zip = c("92055", "95823", "92509", "90275", "91436", "95814",
                          "91367", "92656", "94704", "90066", "90640", "90027",
                          "92284", "92108", "92840", "95569", "95742", "95207",
                          "90701", "93257", "90043", "92114", "90066", "90723",
                          "91330", "91740", "91766", "90302", "92703", "92584",
                          "95338", "93539", "91351", "93230", "91401", "95330",
                          "94101", "92506", "92867", "93514", "95032", "92227",
                          "92677", "95134", "95301", "95462", "95503", "95524",
                          "92352", "90710", "92262", "90302", "90044", "92129",
                          "92626", "91343", "91765", "90731", "92064", "92404",
                          "92630", "92506", "92570", "93272", "90001", "91351",
                          "90020", "95020", "90260", "92410", "91505", "92108",
                          "94541", "94604", "92108", "92127", "95945", "95610",
                          "94501", "95482", "92832", "94002", "90033", "92253",
                          "95360", "95818", "92414", "92801", "91406", "91321",
                          "92807", "94621", "90260", "94015", "96063", "91803",
                          "94903", "95822", "93243", "96013", "92704", "90004",
                          "90301", "91786", "95648", "93609", "93277", "90026",
                          "91801", "90249", "92843", "93101", "95050", "90744",
                          "91343", "92088", "94574", "91016", "92115", "93306",
                          "90301", "92805", "95831", "90012", "94025", "92121",
                          "93241", "92544", "93280", "93277", "94509", "91306",
                          "91732", "91042", "94538", "94952", "94509", "90004",
                          "90803", "93944", "94024", "90262", "91737", "91744",
                          "90260", "92308", "90002", "90301", "92069", "92124",
                          "95672", "92672", "93103", "92782", "90001", "92672",
                          "91331", "93560", "93720", "95468", "91732", "92627",
                          "92557", "92701", "91733", "92109", "91789", "95110",
                          "95367", "92069", "90301", "90706", "90210", "94403",
                          "92021", "92410", "91910", "90706", "92879", "94518",
                          "95128", "91748", "90201", "90086", "90033", "94303",
                          "92630", "94063", "92708", "94513", "94610", "92122",
                          "95703", "95841", "90042", "90501", "90221", "92026",
                          "92882", "94403", "95988", "90748", "91763", "93305",
                          "91901", "90018"))

output <- example %>%
  geocode(method = 'census', 
          mode = 'single',
          full_results = TRUE,
          api_options = list(census_return_type = 'geographies'),
          street = "address", city = "city", state = "state", postalcode = "zip")
#> Passing 206 addresses to the US Census single address geocoder
#> Error: lexical error: invalid char in json text.
#>                                        <!DOCTYPE html PUBLIC "-//W3C//
#>                      (right here) ------^

Created on 2022-06-14 by the reprex package (v2.0.1)
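The observation that longer lists fail more often is what one would expect if each single-address request has a small, independent chance of returning a bad response. The snippet below is only a rough illustration: independence is an assumption, and the per-request failure rate `p` is a hypothetical value, not something measured from the Census service.

```r
# Probability that at least one of n independent requests fails,
# given a per-request failure probability p (assumed, not measured).
p_any_failure <- function(n, p) 1 - (1 - p)^n

p <- 0.001  # hypothetical per-request failure rate

# Compare list sizes that appear in this thread: 6, 206 and 1,000 addresses.
probs <- sapply(c(6, 206, 1000), p_any_failure, p = p)
print(round(probs, 3))
```

Under these assumptions the chance of hitting at least one bad response grows steadily with the number of addresses, which would also explain why a 6-address run rarely reproduces the error.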

@elfluffybunny

I'm experiencing the same issues with the package, and I apologize that I can't post code in reprex format since my employer has blocked GitHub and the code is on my work computer. I've posted some screenshots below. The main issue I've encountered is inconsistency with the error lrsulli described. For example, when I pass a tibble of about 1,300 addresses, the error shows up apparently at random at different points in the code execution, so it doesn't seem to be an issue with the addresses. Perhaps it's something going on with the Census geocoder itself? I did notice that the Census changed the look of its geocoding page and "migrated to the cloud"... If I send fewer addresses, the same error occurs, so I'm not sure this is a function of tibble size either. I have about 50,000 addresses to geocode now that Census has redone its tracts with the new decennial census, so hopefully this will not be an ongoing issue.

Two attempts with same data. Error at different points.

Third attempt - code finally completes without error. No changes were made to the data or code.
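Since the same data eventually completes without any change, a retry wrapper is one possible workaround while the root cause is investigated. This is a minimal sketch, not part of tidygeocoder: `with_retry` and `flaky` are hypothetical names, and `flaky` simulates an intermittent failure so the example runs offline.

```r
# Hypothetical helper: call `fn` up to `max_tries` times, pausing between
# attempts, and return the first successful result.
with_retry <- function(fn, max_tries = 3, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait * attempt)  # simple linear backoff
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Stand-in for a flaky geocoder call: fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("lexical error: invalid char in json text.")
  "ok"
}

out <- with_retry(flaky, max_tries = 5, wait = 0)
```

In practice `fn` could wrap a `geocode()` call on a small chunk of addresses, so that a single bad response costs one chunk's worth of repeated requests rather than the whole run.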

@jessecambon
Owner

jessecambon commented Jun 21, 2022

@elfluffybunny @lrsulli I reached out to the Census and they told me that it looks like the geocoder service is occasionally returning HTML content for some reason (it is supposed to be returning JSON content so this causes a problem when tidygeocoder tries to parse it). I think the next step in debugging this will be to try to return the entire HTML response.

Here's a code example that queries the Census geocoder and I've highlighted the line that should return JSON, but is apparently sometimes returning HTML. If you want to give it a try, you can loop through your inputs with code like this and record all the raw responses (before jsonlite::fromJSON tries to process it). Then you should be able to see if an HTML response was returned and you can post that entire HTML response here.

Depending on what the HTML response looks like, the fix in tidygeocoder might be to parse the HTML data or to just ignore that response and continue without an error.
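The "ignore that response and continue" option could look roughly like the sketch below, which checks that the body is valid JSON before handing it to `jsonlite::fromJSON`. `parse_json_or_null` is a hypothetical helper, not the actual tidygeocoder fix, and both sample strings are made up for illustration.

```r
library(jsonlite)

# Hypothetical helper: parse `txt` as JSON, or return NULL (with a warning)
# when the body is not valid JSON, e.g. an HTML error page.
parse_json_or_null <- function(txt) {
  if (!isTRUE(jsonlite::validate(txt))) {
    warning("Response is not valid JSON; first characters: ",
            substr(txt, 1, 40), call. = FALSE)
    return(NULL)
  }
  jsonlite::fromJSON(txt)
}

# Made-up response bodies: one JSON payload and one HTML page.
good <- '{"status": "ok"}'
bad  <- '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">'

parsed_good <- parse_json_or_null(good)
parsed_bad  <- suppressWarnings(parse_json_or_null(bad))
```

When working with an `httr` response object, the text passed in would come from `httr::content(resp, as = "text", encoding = "UTF-8")`, and `httr::http_type(resp)` can additionally be logged to see which content type the server reported.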

@jessecambon
Owner

@lrsulli I made a script to attempt to reproduce the error and record the raw HTML response. However, I'm not able to reproduce the error even after running this several times. The script is below if you'd like to try. It outputs the content type from the raw response and attempts to parse the content with jsonlite to reproduce the error.

library(httr)
library(tibble)
library(dplyr)
library(jsonlite)
library(tidygeocoder)

# reference: get_census_url() : https://github.com/jessecambon/tidygeocoder/blob/main/R/api_url.R
# api parameters: https://jessecambon.github.io/tidygeocoder/articles/geocoder_services.html#api-parameters

api_url <- tidygeocoder:::get_census_url("geographies", "address")

get_raw_response <- function(street, city, state, zip) {
  httr::GET(url = api_url, 
            query = list(street = street, city = city, state = state, zip = zip,
                         format = 'json', benchmark = 'Public_AR_Current', 
                         vintage = 'Current_Current'))
}

parse_json <- function(resp) jsonlite::fromJSON(httr::content(resp, as = 'text', encoding = "UTF-8"))

test <- get_raw_response("1600 Pennsylvania Ave", "Washington", "DC", "20500")

df <- tibble(address = c("7 ESB SUPP CO UTILITY", "5545 SKY PATRIC WAY # 160",
                              "664 MISSION BLVD", "28003 INDIAN", "611 WOOLERY",
                              "164 EAST ST", "5807 TOPANGA CANYON BLVD APT L208",
                              "26880 LA PAZ RD", "2550 TELEGRAPH AVE APT 418",
                              "1287 S GRANDVIEW AVE # 502", "928 MORRIS PL",
                              "720 SANBORN L AVE", "7630 GOLETA AVE",
                              "480 CAMINO DEL RIO S", "51111 FAYE ST",
                              "223 SORENSON RD", "4630 DORCHESTER LN",
                              "1154 YORKSHIRE CIR", "12007 16TH ST",
                              "RR 190 BOX 3121", "4935 PARKLAND AVE",
                              "1241 DRAKE DR", "21750 MATTESON AVE",
                              "16235 1ST AVE # 2", "1958 MATADOR WAY UNIT 464",
                              "859 E ROUTE 66 STE A", "16015 GARVEY SPC 49",
                              "3216 W WEST ST # 9917", "418 HAZARD",
                              "28882 GAVIOTA AVE # 1", "5149 ANTONE",
                              "411 N VISTA ST", "27301 N EVI LN # 20101",
                              "MR1168 E CONNIE DR", "437 S COCHRAN AVE # 2",
                              "14434 SHARIAN RD","3637 SNELL AVE # 309",
                              "3401 OLIVERWOOD ST", "1300 VIST WALNUT",
                              "128 RANCH RD", "1601 LARK AVE # 8",
                              "127 W C ST", "27696 CAMINO CAPISTRANO",
                              "4307 RENAISSANCE DR APT 304", "1199 E BELLEVUE RD",
                              "3743 GIBSON", "1213 BASSFORD RD", "169 MALIBU RD",
                              "966 WILLOW CREEK RD", "2245 W 250TH ST",
                              "200 E RACQUET CLUB RD UNIT 39", "5245 GOLDENWOOD DARREN J DR",
                              "1202 VERMONT AVE", "13317 CAMINITO CIERA # 131",
                              "691 RANDOLPH AVE", "8180 LANGDON AVE # 301",
                              "2059 CROOKED CRK", "9033 BASELINE RD # F",
                              "12170 WILSTY DIR WAY", "109050 LUGO AVE",
                              "231160 CAVANAUGH", "210 W WASHINGTON ST",
                              "22135 CALIJCO EXPS WAY", "136 72ND AVE # 168",
                              "10561 LUBAO AVE", "VALLEY RANCH RD",
                              "420 S NEW NH HWY # 1", "985 ANDORA PL",
                              "4545 W 71ST ST", "12711 W KING ST", "15831 OLDEN",
                              "2020 CAMINO DE LA REINA UNIT 305", "3528 HOLLOW LN",
                              "PUBLC GDN", "6110 FRIARS RD STE 101",
                              "17099 W BERNARDO DR # 5-208", "109 S SCHOOL ST # A",
                              "6814 TRUS LN", "4 SCG SUPPORT CTR",
                              "2100 S STATE ST APT 41", "317 W WILSHIRE AVE # 1",
                              "2784 BLEMONT", "330 PATEO L REIO # A",
                              "46 DUNE PALMS # 59", "1168 SYDNEY AVE",
                              "358 BROADWAY", "3312 LYMAN RESIDENCES",
                              "1330 W PEARL ST APT B", "16745 2 SHERMAN WAY # 1",
                              "222813 14TH ST", "1107 W MEMORY LN # 13D",
                              "6737 BRENTFORD C", "W 169TH ST", "858 ANGOINETTE LN # F",
                              "1208 BUDGET MOTEL", "7621 S 4TH",
                              "1914 CANYON RD", "4309 30", "1540 PLUMAS",
                              "23416 SANDPIT RD", "9345 HARMON ST",
                              "140 S MARIPOSA AVE APT 309", "336 E HILLCREST BLVD STE 522",
                              "260 N 3RD AVE APT C302", "745 S STATE HIGHWAY 65",
                              "11500 W MANNING", "26 COURT",
                              "932 DPARKMAN AVE", "23 E VALLEY BLVD # 182",
                              "22614 VAN DEENE", "139 59TH SPRING WATER CT",
                              "1819 DE VINA ST LN", "4718 WILLIAMS RD",
                              "21214 MILLPOINT", "157 W ETINANDA AVE",
                              "3120 P O BOX",  "2350 LOWER CHILES RD",
                              "662 W HUNTINGTON DR # 502", "633 COLLEGE GROME",
                              "25 W KAIBAB AVE", "880 W REGENT ST",
                              "76 CERRITOS AVE # 9166", "5929 GEORGIA DR",
                              "108 W 2ND ST APT 312", "3000 SAND HILL RD # 4190",
                              "9603 AMANITO DEL FELIZ", "7005 DUNSMERE ST",
                              "41995 STATE HIGHWAY 74 STE B", "RR 464",
                              "3224 S GA", "48 GREEN HILLS DR",
                              "233446 HAYNES ST", "11633 BAYSE 20 ST",
                              "10824 QUJUNGA CANYON BLVD FOOTHIL BLVD",
                              "2278 SUTTON LOOP", "117 GRANDVIEW",
                              "2722 SEABRILLO CT", "124 N WESTMORELAND AVE APT 105",
                              "5287 E PAOLI WAY", "2395 MACARTHUR RD # S-1-135",
                              "2250 HOMESTEAD CT APT 110", "3961 MARTIN LUTH BLVD # K",
                              "5625 LEHSMAN AVE", "748 ALSBARRY AVE",
                              "120047 CONDON", "8330 CLEARWATER CT",
                              "932012 5TH BLVD", "309 E SEA CRST # 206",
                              "160 VIA PAZ LN",  "11525 VIA PLAYA DE",
                              "THE EST", "3417 CALLE VETA LN",
                              "10 S VOLUNTARIO ST # A", "16211 DORAL ST",
                              "1400 W 2 OLYMPIC BLVD # 1",  "1125 DE GARZA LN # B",
                              "12922 FOX ST", "3500 W 15TH ST",
                              "NONE",  "1614 24TH ST", "12700 ELLIOTT AVE SPC 175",
                              "745 W 19TH ST # 136", "8670 PIGENON RD",
                              "407 S BROADWAY",  "2528 HALF ADELIA AVE",
                              "5426 FANUEL ST", "19307 DEL SOL AVE",
                              "495 VERMONT ST", "3437 SIENN ST",
                              "150180 S RANCHOSANTAFE", "9450 S LACINTAC BLVD",
                              "1035 PARK ST", "535 WOODLAWN AVE",
                              "2121 S EL CAMINO REAL # 616", "15230 OLD CREAK RD",
                              "218 E HFL 10TH ST # 1", "591 STREET # DJ",
                              "68 C", "413 HERMOSA DR",  "780 OAK GROVE RD # RD318",
                              "3087 DAV AVE # 1", "1528 ROUTE AVE",
                              "15038 SANTANA ST", "PO PO BOX", "1316 VIA PORTOLA",
                              "220 E KEEFE # O", "1825 S VARDON",
                              "199 ARLIGTON WAY",  "2720 GARFIELD AVE",
                              "39 SUNSET RD", "10565 HANOVER A AVE",
                              "10279 CAMINITO RIO", "1305 HUMBUG CREEK RD # A",
                              "7061 MADISON AVE # B7", "6129 HALF PIEDMONT AVE",
                              "2044 W 236TH", "411 N CHESTER AVE APT 3",
                              "306 W EL NORTE PKWY # 120",  "1760 VIA PACIFICA # C210",
                              "3149 LA SELVA CIR APT 10", "600 N HUMBOLDT AVE APT 265",
                              "1442 E 215TH ST", "9775 AVENUE",
                              "688 ALTURAS", "246 E 89TH ST", "1419 W 2 36TH # 1"),
                  city = c("CMP PENDLETON", "SACRAMENTO", "RIVERSIDE", "PLS VRDS PNSL",
                           "VAN NUYS", "SACRAMENTO", "WOODLAND HILLS", "ALISO VIEJO",
                           "BERKELEY", "LOS ANGELES", "MONTEBELLO", "LOS ANGELES",
                           "YUCCA VALLEY", "SAN DIEGO", "GARDEN GROVE", "REDCREST",
                           "RNCHO CORDOVA", "STOCKTON", "ARTESIA", "PORTERVILLE",
                           "LOS ANGELES", "SAN DIEGO", "LOS ANGELES", "PARAMOUNT",
                           "NORTHRIDGE", "GLENDORA", "POMONA", "INGLEWOOD",
                           "SANTA ANA", "MENIFEE", "MARIPOSA", "LANCASTER",
                           "CANYON COUNTRY", "HANFORD", "VAN NUYS", "LATHROP",
                           "SAN FRANCISCO", "RIVERSIDE", "ORANGE", "BISHOP",
                           "LOS GATOS", "BRAWLEY", "LAGUNA NIGUEL", "SAN JOSE",
                           "ATWATER", "MONTE RIO", "EUREKA", "BAYSIDE",
                           "LAKE ARROWHEAD", "HARBOR CITY", "PALM SPRINGS",
                           "INGLEWOOD", "LOS ANGELES", "SAN DIEGO", "COSTA MESA",
                           "NORTH HILLS", "POMONA", "SAN PEDRO", "POWAY",
                           "SN BERNRDNO", "EL TORO", "RIVERSIDE", "PERRIS",
                           "TIPTON", "LOS ANGELES", "SANTA CLARITA", "LOS ANGELES",
                           "GILROY", "LAWNDALE", "SN BERNRDNO", "BURBANK",
                           "SAN DIEGO", "HAYWARD", "OAKLAND", "SAN DIEGO",
                           "SAN DIEGO", "GRASS VALLEY", "CITRUS HTS", "ALAMEDA",
                           "UKIAH", "FULLERTON", "BELMONT", "LOS ANGELES",
                           "LA QUINTA", "NEWMAN", "SACRAMENTO", "SN BERNRDNO",
                           "ANAHEIM", "VAN NUYS", "NEWHALL", "ANAHEIM",
                           "OAKLAND", "LAWNDALE", "DALY CITY", "MINERAL",
                           "ALHAMBRA", "SAN RAFAEL", "SACRAMENTO", "LEBEC",
                           "BURNEY", "SANTA ANA", "LOS ANGELES", "INGLEWOOD",
                           "UPLAND", "LINCOLN", "CARUTHERS", "VISALIA",
                           "LOS ANGELES", "ALHAMBRA", "GARDENA", "GARDEN GROVE",
                           "SANTA BARBARA", "SANTA CLARA", "WILMINGTON", "NORTH HILLS",
                           "FALLBROOK", "SAINT HELENA", "MONROVIA", "SAN DIEGO",
                           "BAKERSFIELD", "INGLEWOOD", "ANAHEIM", "SACRAMENTO",
                           "LOS ANGELES", "MENLO PARK", "SAN DIEGO", "LAMONT",
                           "HEMET", "WASCO", "VISALIA", "ANTIOCH", "WINNETKA",
                           "EL MONTE", "TUJUNGA", "FREMONT", "PETALUMA", "ANTIOCH",
                           "LOS ANGELES", "LONG BEACH", "MONTEREY", "LOS ALTOS",
                           "LYNWOOD", "ALTA LOMA", "LA PUENTE", "LAWNDALE",
                           "APPLE VALLEY", "LOS ANGELES", "INGLEWOOD", "SAN MARCOS",
                           "SAN DIEGO", "RESCUE", "SAN CLEMENTE", "SANTA BARBARA",
                           "TUSTIN", "LOS ANGELES", "SAN CLEMENTE", "PACOIMA",
                           "ROSAMOND", "FRESNO", "POINT ARENA", "EL MONTE",
                           "COSTA MESA", "MORENO VALLEY", "SANTA ANA", "S EL MONTE",
                           "SAN DIEGO", "WALNUT", "SAN JOSE", "RIVERBANK",
                           "SAN MARCOS", "INGLEWOOD", "BELLFLOWER", "BEVERLY HILLS",
                           "SAN MATEO", "EL CAJON", "SN BERNRDNO", "CHULA VISTA",
                           "BELLFLOWER", "CORONA", "CONCORD", "SAN JOSE",
                           "ROWLAND HEIGHTS", "CUDAHY", "LOS ANGELES", "LOS ANGELES",
                           "PALO ALTO", "EL TORO", "REDWOOD CITY", "SANTA ANA",
                           "BRENTWOOD", "OAKLAND", "SAN DIEGO", "APPLEGATE",
                           "SACRAMENTO", "LOS ANGELES", "TORRANCE", "COMPTON",
                           "ESCONDIDO", "CORONA", "SAN MATEO", "WILLOWS",
                           "WILMINGTON", "MONTCLAIR", "BAKERSFIELD", "ALPINE",
                           "LOS ANGELES"),
                  state = "CA",
                  zip = c("92055", "95823", "92509", "90275", "91436", "95814",
                          "91367", "92656", "94704", "90066", "90640", "90027",
                          "92284", "92108", "92840", "95569", "95742", "95207",
                          "90701", "93257", "90043", "92114", "90066", "90723",
                          "91330", "91740", "91766", "90302", "92703", "92584",
                          "95338", "93539", "91351", "93230", "91401", "95330",
                          "94101", "92506", "92867", "93514", "95032", "92227",
                          "92677", "95134", "95301", "95462", "95503", "95524",
                          "92352", "90710", "92262", "90302", "90044", "92129",
                          "92626", "91343", "91765", "90731", "92064", "92404",
                          "92630", "92506", "92570", "93272", "90001", "91351",
                          "90020", "95020", "90260", "92410", "91505", "92108",
                          "94541", "94604", "92108", "92127", "95945", "95610",
                          "94501", "95482", "92832", "94002", "90033", "92253",
                          "95360", "95818", "92414", "92801", "91406", "91321",
                          "92807", "94621", "90260", "94015", "96063", "91803",
                          "94903", "95822", "93243", "96013", "92704", "90004",
                          "90301", "91786", "95648", "93609", "93277", "90026",
                          "91801", "90249", "92843", "93101", "95050", "90744",
                          "91343", "92088", "94574", "91016", "92115", "93306",
                          "90301", "92805", "95831", "90012", "94025", "92121",
                          "93241", "92544", "93280", "93277", "94509", "91306",
                          "91732", "91042", "94538", "94952", "94509", "90004",
                          "90803", "93944", "94024", "90262", "91737", "91744",
                          "90260", "92308", "90002", "90301", "92069", "92124",
                          "95672", "92672", "93103", "92782", "90001", "92672",
                          "91331", "93560", "93720", "95468", "91732", "92627",
                          "92557", "92701", "91733", "92109", "91789", "95110",
                          "95367", "92069", "90301", "90706", "90210", "94403",
                          "92021", "92410", "91910", "90706", "92879", "94518",
                          "95128", "91748", "90201", "90086", "90033", "94303",
                          "92630", "94063", "92708", "94513", "94610", "92122",
                          "95703", "95841", "90042", "90501", "90221", "92026",
                          "92882", "94403", "95988", "90748", "91763", "93305",
                          "91901", "90018"))

# iterate through addresses and put raw responses in a list
results <- list()
for (i in seq_len(nrow(df))) {
  results[[i]] <- get_raw_response(df$address[[i]], df$city[[i]], df$state[[i]], df$zip[[i]])
  if (i %% 25 == 0) {
    print(paste0("i = ", i))
  }
}

# examine content types 
content_types <- tibble(value = sapply(results, function(x) x$headers$`content-type`))

content_types %>% count(value) %>% print()

# see if json parsing causes error
parsed <- lapply(results, parse_json) 

# extract coordinates for a specific example
parsed[[198]]$result$addressMatches$coordinates %>% print()

# equivalent tidygeocoder query ------------------------------------------------
output <- df %>%
  geocode(method = 'census',
          mode = 'single',
          full_results = TRUE,
          api_options = list(census_return_type = 'geographies'),
          street = "address", city = "city", state = "state", 
          postalcode = "zip")
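
If one of the responses does come back as HTML, a small helper along these lines could pull out the offending bodies so they can be posted here. This is only a sketch: `find_non_json` is a hypothetical name, and it assumes the content types and body texts have already been extracted into two character vectors of the same length.

```r
# Hypothetical helper: given the content types and body texts collected from
# the raw responses, return the bodies whose content type is not JSON.
find_non_json <- function(content_types, bodies) {
  # grepl() returns FALSE for NA, so a missing header counts as non-JSON
  is_json <- grepl("json", content_types, ignore.case = TRUE)
  bodies[!is_json]
}

# With a list of httr responses such as `results`, the inputs could be built as:
# types <- vapply(results, function(x) {
#   ct <- httr::headers(x)[["content-type"]]
#   if (is.null(ct)) NA_character_ else ct
# }, character(1))
# bodies <- vapply(results, httr::content, character(1),
#                  as = "text", encoding = "UTF-8")
# find_non_json(types, bodies)
```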

@elfluffybunny

In the past week I've processed about 60,000 addresses using the base code I posted above, sending batches of 500 to the Census in a loop and writing the results to a list. In about 2 of every 3 batches, the Census Geocoder returns the batch fully geocoded without issue. This may be why @jessecambon couldn't reproduce the error that @lrsulli and I have been encountering. In the remaining 1 of 3 batches, the script terminates with the error message described above. For now, the remedy seems to be to re-run the batch. Most of the time, the Census Geocoder processes all the addresses without incident on the second run. Occasionally, I have to re-run the batch 3-4 times to get it to go through, but never more than 4 times. It never stops on the same address, so it can't be an issue with bad addresses.

For those who are using this awesome and extremely helpful package to obtain Census geocoding data for a large number of addresses, my suggestion is to split the addresses into batches and save the Census output for each batch to a list. This way, if the Census returns an error message, the successfully coded batches saved in the list won't be lost and the specific batch that threw the error can be re-run. If I get some time, I'll post the code in another comment in this thread.
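
Until then, here is a minimal sketch of the batch-and-retry idea, not the exact code I've been running. `geocode_in_batches` is a hypothetical helper; it assumes `batches` is a list of data frames and `geocode_fn` is whatever function geocodes one batch.

```r
# Hypothetical helper: run `geocode_fn` on each batch, retrying a failed
# batch up to `max_tries` times and keeping completed batches in a list.
geocode_in_batches <- function(batches, geocode_fn, max_tries = 4) {
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    for (attempt in seq_len(max_tries)) {
      # capture the error object instead of letting it stop the loop
      out <- tryCatch(geocode_fn(batches[[i]]), error = function(e) e)
      if (!inherits(out, "error")) {
        results[i] <- list(out)
        break
      }
      message("Batch ", i, " failed on attempt ", attempt, ": ",
              conditionMessage(out))
    }
  }
  results  # a NULL entry marks a batch that never succeeded
}

# Possible usage with tidygeocoder, in batches of 500:
# batches <- split(df, ceiling(seq_len(nrow(df)) / 500))
# out <- geocode_in_batches(batches, function(x) {
#   tidygeocoder::geocode(x, method = "census", mode = "single",
#                         street = "address", city = "city",
#                         state = "state", postalcode = "zip")
# })
```

Batches that succeed stay in the list even if a later one keeps failing, so only the `NULL` entries need another pass.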

@elfluffybunny

elfluffybunny commented Oct 11, 2022 via email
