Test out Iceberg 1.7.2 RC0 #1581

Open
Fokko wants to merge 9 commits into main from fd-test-1-7-2-rc0
Conversation

@Fokko (Contributor) commented Jan 27, 2025

dependabot bot and others added 2 commits January 27, 2025 16:17
Bumps [pyspark](https://github.com/apache/spark) from 3.5.3 to 3.5.4.
- [Commits](apache/spark@v3.5.3...v3.5.4)

---
updated-dependencies:
- dependency-name: pyspark
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
@Fokko (Contributor, Author) commented Jan 27, 2025

Doesn't look good:

2025-01-27T15:48:47.7478478Z ----------------------------- Captured stdout call -----------------------------
2025-01-27T15:48:47.7479011Z #
2025-01-27T15:48:47.7479424Z # A fatal error has been detected by the Java Runtime Environment:
2025-01-27T15:48:47.7479975Z #
2025-01-27T15:48:47.7480374Z #  SIGSEGV (0xb) at pc=0x00007f9d2951dc11, pid=11022, tid=11059
2025-01-27T15:48:47.7480874Z #
2025-01-27T15:48:47.7481430Z # JRE version: OpenJDK Runtime Environment Temurin-11.0.25+9 (11.0.25+9) (build 11.0.25+9)
2025-01-27T15:48:47.7482564Z # Java VM: OpenJDK 64-Bit Server VM Temurin-11.0.25+9 (11.0.25+9, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
2025-01-27T15:48:47.7483358Z # Problematic frame:
2025-01-27T15:48:47.7483791Z # V  [libjvm.so+0xd1dc11][thread 11935 also had an error]
2025-01-27T15:48:47.7484783Z   Reflection::verify_class_access(Klass const*, InstanceKlass const*, bool)+0x151
2025-01-27T15:48:47.7486268Z #
2025-01-27T15:48:47.7487561Z # Core dump will be written. Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h" (or dumping to /home/runner/work/iceberg-python/iceberg-python/core.11022)
2025-01-27T15:48:47.7489111Z #
2025-01-27T15:48:47.7489511Z # An error report file with more information is saved as:
2025-01-27T15:48:47.7490224Z # /home/runner/work/iceberg-python/iceberg-python/hs_err_pid11022.log
2025-01-27T15:48:47.7490821Z #
2025-01-27T15:48:47.7491131Z # Compiler replay data is saved as:
2025-01-27T15:48:47.7491739Z # /home/runner/work/iceberg-python/iceberg-python/replay_pid11022.log
2025-01-27T15:48:47.7492308Z #
2025-01-27T15:48:47.7492662Z # If you would like to submit a bug report, please visit:
2025-01-27T15:48:47.7493254Z #   https://github.com/adoptium/adoptium-support/issues
2025-01-27T15:48:47.7493729Z #
2025-01-27T15:48:47.7494146Z ------------------------------ Captured log call -------------------------------

It works on my local machine 🫠

@Fokko Fokko force-pushed the fd-test-1-7-2-rc0 branch from 0031101 to f3f6295 on January 27, 2025 20:22
@corleyma commented Jan 27, 2025

Not to muddy the waters, since this is probably not literally the same thing, but:

I experienced something similar(ish) to this using Iceberg 1.7.1 with Spark 3.5.4. Downgrading to 3.5.3 fixed it; I didn't have time to figure out what was actually going on.

(I only saw the problem when running MERGE INTOs against tables stored in GCP, and only for certain files; integration tests against MinIO never surfaced anything like it.)

Here's what I got:

# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007b90e8939a98, pid=32, tid=82
#
# JRE version: OpenJDK Runtime Environment (17.0.13+11) (build 17.0.13+11-Debian-2deb12u1)
# Java VM: OpenJDK 64-Bit Server VM (17.0.13+11-Debian-2deb12u1, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xe43a98] SymbolTable::do_lookup(char const*, int, unsigned long)+0xd8
#
# Core dump will be written. Default location: /core.%e.32.%t
#
# An error report file with more information is saved as:
# /usr/src/app/hs_err_pid32.log
#
# If you would like to submit a bug report, please visit:
# https://bugs.debian.org/openjdk-17
#

In my case, I do have a setup where I am co-mingling edits to table metadata (schema updates) made by pyiceberg with writes made by Spark/pyspark, so I do think the interplay may be related somehow. I never got this failure on the initial table write, only when subsequently upserting to a table that already has some data in it.
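
To make the interplay concrete, here is a minimal sketch of that kind of co-mingled workflow: pyiceberg evolves the table's schema, then Spark upserts into the same table with MERGE INTO. The catalog name (`demo`), table name (`db.events`), and `updates` source view are hypothetical placeholders, not the actual setup.

```python
# Illustrative sketch only: names are hypothetical, and the SparkSession is
# assumed to already be configured with the Iceberg extensions and a catalog
# named "demo" that points at the same warehouse as the pyiceberg catalog.
from pyiceberg.catalog import load_catalog
from pyiceberg.types import StringType
from pyspark.sql import SparkSession

catalog = load_catalog("demo")            # hypothetical catalog configuration
table = catalog.load_table("db.events")   # table that already has data in it

# Metadata edit (schema update) performed through pyiceberg
with table.update_schema() as update:
    update.add_column("comment", StringType())

# Subsequent upsert performed through Spark against the same table
spark = SparkSession.builder.getOrCreate()  # assumed pre-configured for Iceberg
spark.sql(
    """
    MERGE INTO demo.db.events AS t
    USING updates AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
    """
)
```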

@Fokko (Contributor, Author) commented Jan 27, 2025

Thanks @corleyma for jumping in here; at least I'm not alone 😃

You made me think: I don't think we need to bump Spark separately, as in #1461, since Spark 3.5.4 is not yet supported (see apache/iceberg#11731). So we need to bump to Iceberg 1.7.2 here as well:

f"--packages org.apache.iceberg:iceberg-spark-runtime-{spark_version}_{scala_version}:{iceberg_version},"
I just expected a different error 😮‍💨
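
For reference, a minimal sketch of how those versions might be pinned together when building a test SparkSession. The variable names mirror the snippet above, but the builder code and the concrete values (Spark 3.5 runtime, Scala 2.12, Iceberg 1.7.2) are illustrative assumptions, not the repository's actual test fixture.

```python
# Illustrative sketch: the Spark/Scala/Iceberg versions below are assumptions
# for this RC test, not values pinned by this PR's configuration files.
from pyspark.sql import SparkSession

spark_version = "3.5"
scala_version = "2.12"
iceberg_version = "1.7.2"

packages = (
    f"org.apache.iceberg:iceberg-spark-runtime-"
    f"{spark_version}_{scala_version}:{iceberg_version}"
)

spark = (
    SparkSession.builder.appName("iceberg-1.7.2-rc0-smoke-test")
    .config("spark.jars.packages", packages)
    .config(
        "spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    )
    .getOrCreate()
)
```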
