Warning For Reflective Access #106

Open
billdenney opened this issue May 7, 2019 · 16 comments

Comments

@billdenney

When working with the current versions of R and rJava, extract_tables() produces the following warning:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by RJavaTools to method java.util.ArrayList$Itr.hasNext()
WARNING: Please consider reporting this to the maintainers of RJavaTools
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

Unfortunately, I cannot share the underlying .pdf file that caused the error.

@fpinter

fpinter commented Jun 19, 2019

I reproduced using the first code example from the readme.

library("tabulizer")
f <- system.file("examples", "data.pdf", package = "tabulizer")
out1 <- extract_tables(f)

(Mac 10.13, tabulizer 0.2.2, rJava 0.9-11, R 3.6.0, Java 11.0.1)
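The version details listed in reports like this one can be collected from within R. The following is a minimal sketch, assuming rJava is installed and a JVM can be started; `report_java_env()` is a hypothetical helper name, not part of rJava or tabulizer.

```r
# Hypothetical helper: gather the R, rJava, and Java versions for a bug report.
report_java_env <- function() {
  if (!requireNamespace("rJava", quietly = TRUE)) {
    stop("rJava is not installed")
  }
  # Start the JVM for this R session (or attach to one that is already running).
  rJava::.jinit()
  # Ask the JVM itself which Java version it is running.
  java_version <- rJava::.jcall(
    "java/lang/System", "S", "getProperty", "java.version"
  )
  list(
    r     = R.version.string,
    rJava = as.character(utils::packageVersion("rJava")),
    java  = java_version
  )
}
```

Calling `report_java_env()` returns a named list that can be pasted next to `sessionInfo()` output.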

@bedantaguru

I'm getting the same warning on Linux too.

@ziembaej

Just got the same warning. Using R version 3.6.0 (2019-04-26) on Mac OS 10.14.6

Has this caused any actual problems for others?

@bedantaguru

In Travis it causes build failure.

@antonio1970

Was anyone able to solve this? I got the same error.

@MattCowgill

Same here

@dernapo

dernapo commented Mar 25, 2020

Same issue here

sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
[4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=de_DE.UTF-8 LC_ADDRESS=de_DE.UTF-8
[10] LC_TELEPHONE=de_DE.UTF-8 LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] janitor_1.2.0 tabulizer_0.2.2 data.table_1.12.6 tidytext_0.2.0 dplyr_0.8.3
[6] stringr_1.4.0 rvest_0.3.4 xml2_1.2.2 selectr_0.4-1 cronR_0.4.0

@lefcgis

lefcgis commented Mar 29, 2020

I have the same problem. I wonder whether it depends on the "quality" of the document. In other words, Tabulizer works with some PDFs but not with others.

For example, if you download this pdf, Tabulizer works. However, with this one it does not, and I don't know why. I don't think the warning means there is anything illegal about the document; I think it comes down to the "quality of the information" in it.

If you write a document in Word or Excel, export it to PDF, and try that, it works. So it seems the Tabulizer algorithm doesn't work on all PDF documents 🧙‍♂️

P.S. I ran this in RStudio 1.2.5033 and R 3.6.3 (2020-02-29)

@billdenney
Author

@lefcgis, there definitely could be some documents that trigger the issue and some that do not, but it is a Java coding issue and not an issue with a PDF file (as in, the pdf standard is being followed). For more information, see https://stackoverflow.com/questions/50251798/what-is-an-illegal-reflective-access
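The warning text itself mentions the JVM flag `--illegal-access=warn`. As a hedged sketch of how such a flag could be passed from R: rJava reads JVM start-up options from `options(java.parameters = ...)`, and they only take effect if set before the JVM is started. This assumes a JDK that still accepts the flag (it was introduced in Java 9 and is ignored from Java 17 onward), and it only changes how the accesses are reported; it does not fix the underlying reflective access in RJavaTools.

```r
# Set JVM options BEFORE rJava or tabulizer is loaded; the JVM reads them
# only once, when it starts.
options(java.parameters = "--illegal-access=warn")

# Only run the thread's example if tabulizer happens to be installed.
if (requireNamespace("tabulizer", quietly = TRUE)) {
  f <- system.file("examples", "data.pdf", package = "tabulizer")
  out <- tabulizer::extract_tables(f)
}
```

In a session where the JVM is already running, the option has no effect until R is restarted.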

@lefcgis

lefcgis commented Mar 29, 2020

OK!
So the cause could be the JDK and JRE packages, since they are prerequisites for installing rJava. Thanks for your answer, @billdenney 🧙‍♂️

@bedantaguru

Now it's breaking my build.

@cjyetman

For me, this warning only occurs the first time the example code is run in a new R session. Subsequent runs do not show this warning. Is that the same behavior others here are seeing?

The test code I've been using is...

out <- tabulizer::extract_tables(system.file("examples", "data.pdf", package = "tabulizer"))

If so, I'm curious if #125 resolves this issue for you.

@maahutch

Same thing happened to me. Got this error the first time then just an empty list each subsequent run. I can read other pdfs but it fails on one which is a different format.

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tabulizer_0.2.2

loaded via a namespace (and not attached):
[1] tabulizerjars_1.0.1 compiler_4.0.2      tools_4.0.2         rJava_0.9-13       
[5] png_0.1-7 

@cjyetman

@maahutch An error, or a warning? Those are significantly different.
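To illustrate why the distinction matters in R, here is a small base-R sketch (the messages are made up for illustration). A warning is signalled and execution continues, so the expression still returns its value; an error aborts the expression. Note that the "illegal reflective access" text in this thread is printed by the JVM itself, so it may not be an R condition at all and would not necessarily be caught by handlers like these.

```r
# A warning: the handler runs, then evaluation resumes and the value is returned.
result_after_warning <- withCallingHandlers(
  {
    warning("something looks off")
    "finished"
  },
  warning = function(w) {
    message("caught warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")  # resume after the warning() call
  }
)

# An error: evaluation stops at stop(); the handler's value is returned instead.
result_after_error <- tryCatch(
  {
    stop("cannot continue")
    "finished"
  },
  error = function(e) {
    message("caught error: ", conditionMessage(e))
    NULL
  }
)
```

Here `result_after_warning` holds the normal return value, while `result_after_error` holds whatever the error handler returned.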

@bbolker

bbolker commented Jul 29, 2024

FWIW I'm getting a WARNING from Java (not R), and an empty list, the first time. Subsequently I get an empty list without a warning from Java.

Is it possible that this particular PDF is image-only and has no underlying text anyway?

pg6.pdf

@pachadotdev
Contributor

> FWIW I'm getting a WARNING from Java (not R), and an empty list, the first time. Subsequently I get an empty list without a warning from Java.
>
> Is it possible that this particular PDF is image-only and has no underlying text anyway?
>
> pg6.pdf

Yes, that would require OCR (e.g., tesseract or paws).
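One way to check whether a PDF is image-only before reaching for OCR is sketched below. It assumes the pdftools package (not mentioned in this thread) is installed, along with the tesseract package that `pdftools::pdf_ocr_text()` relies on. `has_text_layer()` and `read_pdf_text()` are hypothetical helper names.

```r
# Hypothetical helper: TRUE if any page contains non-whitespace text.
has_text_layer <- function(page_text) {
  any(nzchar(trimws(page_text)))
}

# Hypothetical helper: use the embedded text when present, otherwise fall back
# to OCR. Returns one character string per page in both cases.
read_pdf_text <- function(path) {
  page_text <- pdftools::pdf_text(path)
  if (has_text_layer(page_text)) {
    return(page_text)
  }
  # No text layer found: render the pages and run them through tesseract.
  pdftools::pdf_ocr_text(path)
}
```

If `has_text_layer()` is FALSE for a file, an empty list from `extract_tables()` is the expected outcome, because there is no text for the table detection to work with.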
