Open
Labels
type::bug: describes erroneous operation, use severity::* to classify the type
Description
Checklist
- I added a descriptive title
- I searched open reports and couldn't find a duplicate
What happened?
S3 access through pyarrow's filesystem fails when run in a Spark environment created with conda-pack. For simplicity, the example code below doesn't use Spark, but the behavior is the same.
I'm trying to run this code:
from pyarrow.fs import S3FileSystem
import ssl
print(ssl.get_default_verify_paths())
print(S3FileSystem.from_uri("s3://s3-bucket/"))
I'm creating a conda package like so:
(base) bash-4.4# conda create --name pyarrow_conda_from_docker -c conda-forge conda-pack pyarrow python=3.10
Channels:
- conda-forge
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /root/miniconda3/envs/pyarrow_conda_from_docker
added / updated specs:
- conda-pack
- pyarrow
- python=3.10
The following packages will be downloaded:
package | build
---------------------------|-----------------
_libgcc_mutex-0.1 | conda_forge 3 KB conda-forge
_openmp_mutex-4.5 | 2_gnu 23 KB conda-forge
aws-c-auth-0.9.1 | h48c9088_3 120 KB conda-forge
aws-c-cal-0.9.2 | he7b75e1_1 50 KB conda-forge
aws-c-common-0.12.4 | hb03c661_0 231 KB conda-forge
aws-c-compression-0.3.1 | h92c474e_6 22 KB conda-forge
aws-c-event-stream-0.5.6 | h82d11aa_3 58 KB conda-forge
aws-c-http-0.10.4 | h94feff3_3 219 KB conda-forge
aws-c-io-0.22.0 | h57f3b0d_1 177 KB conda-forge
aws-c-mqtt-0.13.3 | h2b1cf8c_6 211 KB conda-forge
aws-c-s3-0.8.6 | h4e5ac4b_5 134 KB conda-forge
aws-c-sdkutils-0.2.4 | h92c474e_1 58 KB conda-forge
aws-checksums-0.2.7 | h92c474e_2 75 KB conda-forge
aws-crt-cpp-0.34.4 | h60c762c_0 399 KB conda-forge
aws-sdk-cpp-1.11.606 | h32384e2_4 3.2 MB conda-forge
azure-core-cpp-1.16.0 | h3a458e0_1 344 KB conda-forge
azure-identity-cpp-1.12.0 | ha729027_0 236 KB conda-forge
azure-storage-blobs-cpp-12.14.0| hb1c9500_1 564 KB conda-forge
azure-storage-common-cpp-12.10.0| h4bb41a7_3 146 KB conda-forge
azure-storage-files-datalake-cpp-12.12.0| h8b27e44_3 293 KB conda-forge
bzip2-1.0.8 | hda65f42_8 254 KB conda-forge
c-ares-1.34.5 | hb9d3cd8_0 202 KB conda-forge
ca-certificates-2025.10.5 | hbd8a1cb_0 152 KB conda-forge
conda-pack-0.8.1 | pyhd8ed1ab_1 34 KB conda-forge
gflags-2.2.2 | h5888daf_1005 117 KB conda-forge
glog-0.7.1 | hbabe93e_0 140 KB conda-forge
icu-75.1 | he02047a_0 11.6 MB conda-forge
keyutils-1.6.3 | hb9d3cd8_0 131 KB conda-forge
krb5-1.21.3 | h659f571_0 1.3 MB conda-forge
ld_impl_linux-64-2.44 | ha97dd6f_2 730 KB conda-forge
libabseil-20250512.1 | cxx17_hba17884_0 1.2 MB conda-forge
libarrow-21.0.0 | h56a6dad_8_cpu 5.9 MB conda-forge
libarrow-acero-21.0.0 | h635bf11_8_cpu 568 KB conda-forge
libarrow-compute-21.0.0 | h8c2c5c3_8_cpu 2.9 MB conda-forge
libarrow-dataset-21.0.0 | h635bf11_8_cpu 566 KB conda-forge
libarrow-substrait-21.0.0 | h3f74fd7_8_cpu 472 KB conda-forge
libbrotlicommon-1.1.0 | hb03c661_4 68 KB conda-forge
libbrotlidec-1.1.0 | hb03c661_4 33 KB conda-forge
libbrotlienc-1.1.0 | hb03c661_4 283 KB conda-forge
libcrc32c-1.1.2 | h9c3ff4c_0 20 KB conda-forge
libcurl-8.14.1 | h332b0f4_0 439 KB conda-forge
libedit-3.1.20250104 | pl5321h7949ede_0 132 KB conda-forge
libev-4.33 | hd590300_2 110 KB conda-forge
libevent-2.1.12 | hf998b51_1 417 KB conda-forge
libexpat-2.7.1 | hecca717_0 73 KB conda-forge
libffi-3.4.6 | h2dba641_1 56 KB conda-forge
libgcc-15.2.0 | h767d61c_7 803 KB conda-forge
libgcc-ng-15.2.0 | h69a702a_7 29 KB conda-forge
libgomp-15.2.0 | h767d61c_7 437 KB conda-forge
libgoogle-cloud-2.39.0 | hdb79228_0 1.2 MB conda-forge
libgoogle-cloud-storage-2.39.0| hdbdcf42_0 785 KB conda-forge
libgrpc-1.73.1 | h1e535eb_0 8.0 MB conda-forge
libiconv-1.18 | h3b78370_2 772 KB conda-forge
liblzma-5.8.1 | hb9d3cd8_2 110 KB conda-forge
libnghttp2-1.67.0 | had1ee68_0 651 KB conda-forge
libnsl-2.0.1 | hb9d3cd8_1 33 KB conda-forge
libopentelemetry-cpp-1.21.0| hb9b0907_1 865 KB conda-forge
libopentelemetry-cpp-headers-1.21.0| ha770c72_1 355 KB conda-forge
libparquet-21.0.0 | h790f06f_8_cpu 1.3 MB conda-forge
libprotobuf-6.31.1 | h9ef548d_1 3.8 MB conda-forge
libre2-11-2025.08.12 | h7b12aa8_1 206 KB conda-forge
libsqlite-3.50.4 | h0c1763c_0 911 KB conda-forge
libssh2-1.11.1 | hcf80075_0 298 KB conda-forge
libstdcxx-15.2.0 | h8f9b012_7 3.7 MB conda-forge
libstdcxx-ng-15.2.0 | h4852527_7 29 KB conda-forge
libthrift-0.22.0 | h454ac66_1 414 KB conda-forge
libutf8proc-2.11.0 | hb04c3b8_0 84 KB conda-forge
libuuid-2.41.2 | he9a06e4_0 36 KB conda-forge
libxcrypt-4.4.36 | hd590300_1 98 KB conda-forge
libxml2-2.15.0 | h26afc86_1 44 KB conda-forge
libxml2-16-2.15.0 | ha9997c6_1 543 KB conda-forge
libzlib-1.3.1 | hb9d3cd8_2 60 KB conda-forge
lz4-c-1.10.0 | h5888daf_1 163 KB conda-forge
ncurses-6.5 | h2d0b736_3 871 KB conda-forge
nlohmann_json-3.12.0 | h54a6638_1 133 KB conda-forge
openssl-3.5.4 | h26f9b46_0 3.0 MB conda-forge
orc-2.2.1 | hd747db4_0 1.3 MB conda-forge
pip-25.2 | pyh8b19718_0 1.1 MB conda-forge
prometheus-cpp-1.3.0 | ha5d0236_0 195 KB conda-forge
pyarrow-21.0.0 | py310hff52083_1 26 KB conda-forge
pyarrow-core-21.0.0 |py310h923f568_1_cpu 5.0 MB conda-forge
python-3.10.18 |hd6af730_0_cpython 23.9 MB conda-forge
python_abi-3.10 | 8_cp310 7 KB conda-forge
re2-2025.08.12 | h5301d42_1 27 KB conda-forge
readline-8.2 | h8c095d6_2 276 KB conda-forge
s2n-1.5.26 | h5ac9029_0 382 KB conda-forge
setuptools-80.9.0 | pyhff2d567_0 731 KB conda-forge
snappy-1.2.2 | h03e3b7b_0 45 KB conda-forge
tk-8.6.13 |noxft_hd72426e_102 3.1 MB conda-forge
tzdata-2025b | h78e105d_0 120 KB conda-forge
wheel-0.45.1 | pyhd8ed1ab_1 61 KB conda-forge
zlib-1.3.1 | hb9d3cd8_2 90 KB conda-forge
zstd-1.5.7 | hb8e6e7a_2 554 KB conda-forge
------------------------------------------------------------
Total: 100.7 MB
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
aws-c-auth conda-forge/linux-64::aws-c-auth-0.9.1-h48c9088_3
aws-c-cal conda-forge/linux-64::aws-c-cal-0.9.2-he7b75e1_1
aws-c-common conda-forge/linux-64::aws-c-common-0.12.4-hb03c661_0
aws-c-compression conda-forge/linux-64::aws-c-compression-0.3.1-h92c474e_6
aws-c-event-stream conda-forge/linux-64::aws-c-event-stream-0.5.6-h82d11aa_3
aws-c-http conda-forge/linux-64::aws-c-http-0.10.4-h94feff3_3
aws-c-io conda-forge/linux-64::aws-c-io-0.22.0-h57f3b0d_1
aws-c-mqtt conda-forge/linux-64::aws-c-mqtt-0.13.3-h2b1cf8c_6
aws-c-s3 conda-forge/linux-64::aws-c-s3-0.8.6-h4e5ac4b_5
aws-c-sdkutils conda-forge/linux-64::aws-c-sdkutils-0.2.4-h92c474e_1
aws-checksums conda-forge/linux-64::aws-checksums-0.2.7-h92c474e_2
aws-crt-cpp conda-forge/linux-64::aws-crt-cpp-0.34.4-h60c762c_0
aws-sdk-cpp conda-forge/linux-64::aws-sdk-cpp-1.11.606-h32384e2_4
azure-core-cpp conda-forge/linux-64::azure-core-cpp-1.16.0-h3a458e0_1
azure-identity-cpp conda-forge/linux-64::azure-identity-cpp-1.12.0-ha729027_0
azure-storage-blo~ conda-forge/linux-64::azure-storage-blobs-cpp-12.14.0-hb1c9500_1
azure-storage-com~ conda-forge/linux-64::azure-storage-common-cpp-12.10.0-h4bb41a7_3
azure-storage-fil~ conda-forge/linux-64::azure-storage-files-datalake-cpp-12.12.0-h8b27e44_3
bzip2 conda-forge/linux-64::bzip2-1.0.8-hda65f42_8
c-ares conda-forge/linux-64::c-ares-1.34.5-hb9d3cd8_0
ca-certificates conda-forge/noarch::ca-certificates-2025.10.5-hbd8a1cb_0
conda-pack conda-forge/noarch::conda-pack-0.8.1-pyhd8ed1ab_1
gflags conda-forge/linux-64::gflags-2.2.2-h5888daf_1005
glog conda-forge/linux-64::glog-0.7.1-hbabe93e_0
icu conda-forge/linux-64::icu-75.1-he02047a_0
keyutils conda-forge/linux-64::keyutils-1.6.3-hb9d3cd8_0
krb5 conda-forge/linux-64::krb5-1.21.3-h659f571_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.44-ha97dd6f_2
libabseil conda-forge/linux-64::libabseil-20250512.1-cxx17_hba17884_0
libarrow conda-forge/linux-64::libarrow-21.0.0-h56a6dad_8_cpu
libarrow-acero conda-forge/linux-64::libarrow-acero-21.0.0-h635bf11_8_cpu
libarrow-compute conda-forge/linux-64::libarrow-compute-21.0.0-h8c2c5c3_8_cpu
libarrow-dataset conda-forge/linux-64::libarrow-dataset-21.0.0-h635bf11_8_cpu
libarrow-substrait conda-forge/linux-64::libarrow-substrait-21.0.0-h3f74fd7_8_cpu
libbrotlicommon conda-forge/linux-64::libbrotlicommon-1.1.0-hb03c661_4
libbrotlidec conda-forge/linux-64::libbrotlidec-1.1.0-hb03c661_4
libbrotlienc conda-forge/linux-64::libbrotlienc-1.1.0-hb03c661_4
libcrc32c conda-forge/linux-64::libcrc32c-1.1.2-h9c3ff4c_0
libcurl conda-forge/linux-64::libcurl-8.14.1-h332b0f4_0
libedit conda-forge/linux-64::libedit-3.1.20250104-pl5321h7949ede_0
libev conda-forge/linux-64::libev-4.33-hd590300_2
libevent conda-forge/linux-64::libevent-2.1.12-hf998b51_1
libexpat conda-forge/linux-64::libexpat-2.7.1-hecca717_0
libffi conda-forge/linux-64::libffi-3.4.6-h2dba641_1
libgcc conda-forge/linux-64::libgcc-15.2.0-h767d61c_7
libgcc-ng conda-forge/linux-64::libgcc-ng-15.2.0-h69a702a_7
libgomp conda-forge/linux-64::libgomp-15.2.0-h767d61c_7
libgoogle-cloud conda-forge/linux-64::libgoogle-cloud-2.39.0-hdb79228_0
libgoogle-cloud-s~ conda-forge/linux-64::libgoogle-cloud-storage-2.39.0-hdbdcf42_0
libgrpc conda-forge/linux-64::libgrpc-1.73.1-h1e535eb_0
libiconv conda-forge/linux-64::libiconv-1.18-h3b78370_2
liblzma conda-forge/linux-64::liblzma-5.8.1-hb9d3cd8_2
libnghttp2 conda-forge/linux-64::libnghttp2-1.67.0-had1ee68_0
libnsl conda-forge/linux-64::libnsl-2.0.1-hb9d3cd8_1
libopentelemetry-~ conda-forge/linux-64::libopentelemetry-cpp-1.21.0-hb9b0907_1
libopentelemetry-~ conda-forge/linux-64::libopentelemetry-cpp-headers-1.21.0-ha770c72_1
libparquet conda-forge/linux-64::libparquet-21.0.0-h790f06f_8_cpu
libprotobuf conda-forge/linux-64::libprotobuf-6.31.1-h9ef548d_1
libre2-11 conda-forge/linux-64::libre2-11-2025.08.12-h7b12aa8_1
libsqlite conda-forge/linux-64::libsqlite-3.50.4-h0c1763c_0
libssh2 conda-forge/linux-64::libssh2-1.11.1-hcf80075_0
libstdcxx conda-forge/linux-64::libstdcxx-15.2.0-h8f9b012_7
libstdcxx-ng conda-forge/linux-64::libstdcxx-ng-15.2.0-h4852527_7
libthrift conda-forge/linux-64::libthrift-0.22.0-h454ac66_1
libutf8proc conda-forge/linux-64::libutf8proc-2.11.0-hb04c3b8_0
libuuid conda-forge/linux-64::libuuid-2.41.2-he9a06e4_0
libxcrypt conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
libxml2 conda-forge/linux-64::libxml2-2.15.0-h26afc86_1
libxml2-16 conda-forge/linux-64::libxml2-16-2.15.0-ha9997c6_1
libzlib conda-forge/linux-64::libzlib-1.3.1-hb9d3cd8_2
lz4-c conda-forge/linux-64::lz4-c-1.10.0-h5888daf_1
ncurses conda-forge/linux-64::ncurses-6.5-h2d0b736_3
nlohmann_json conda-forge/linux-64::nlohmann_json-3.12.0-h54a6638_1
openssl conda-forge/linux-64::openssl-3.5.4-h26f9b46_0
orc conda-forge/linux-64::orc-2.2.1-hd747db4_0
pip conda-forge/noarch::pip-25.2-pyh8b19718_0
prometheus-cpp conda-forge/linux-64::prometheus-cpp-1.3.0-ha5d0236_0
pyarrow conda-forge/linux-64::pyarrow-21.0.0-py310hff52083_1
pyarrow-core conda-forge/linux-64::pyarrow-core-21.0.0-py310h923f568_1_cpu
python conda-forge/linux-64::python-3.10.18-hd6af730_0_cpython
python_abi conda-forge/noarch::python_abi-3.10-8_cp310
re2 conda-forge/linux-64::re2-2025.08.12-h5301d42_1
readline conda-forge/linux-64::readline-8.2-h8c095d6_2
s2n conda-forge/linux-64::s2n-1.5.26-h5ac9029_0
setuptools conda-forge/noarch::setuptools-80.9.0-pyhff2d567_0
snappy conda-forge/linux-64::snappy-1.2.2-h03e3b7b_0
tk conda-forge/linux-64::tk-8.6.13-noxft_hd72426e_102
tzdata conda-forge/noarch::tzdata-2025b-h78e105d_0
wheel conda-forge/noarch::wheel-0.45.1-pyhd8ed1ab_1
zlib conda-forge/linux-64::zlib-1.3.1-hb9d3cd8_2
zstd conda-forge/linux-64::zstd-1.5.7-hb8e6e7a_2
Proceed ([y]/n)? y
Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate pyarrow_conda_from_docker
#
# To deactivate an active environment, use
#
# $ conda deactivate
(base) bash-4.4# conda activate pyarrow_conda_from_docker
(pyarrow_conda_from_docker) bash-4.4# conda-pack -n pyarrow_conda_from_docker --exclude */__pycache__/* -o /conda_env/indocker/pyarrow_conda_from_docker.tar.gz
/root/miniconda3/envs/pyarrow_conda_from_docker/lib/python3.10/site-packages/conda_pack/core.py:16: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Collecting packages...
Packing environment at '/root/miniconda3/envs/pyarrow_conda_from_docker' to '/conda_env/indocker/pyarrow_conda_from_docker.tar.gz'
[########################################] | 100% Completed | 7.4s
Then I unpack the archive:
tkopczynski@dotdata ~/t/c/indocker> tar zxf pyarrow_conda_from_docker.tar.gz -C pyarrow_conda/
I try to check my S3 connectivity:
bash-4.4# /conda_env/indocker/pyarrow_conda/bin/python /conda_env/main.py
DefaultVerifyPaths(cafile=None, capath=None, openssl_cafile_env='SSL_CERT_FILE', openssl_cafile='/home/conda/feedstock_root/build_artifacts/openssl_split_1759323449041/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho/ssl/cert.pem', openssl_capath_env='SSL_CERT_DIR', openssl_capath='/home/conda/feedstock_root/build_artifacts/openssl_split_1759323449041/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho/ssl/certs')
Traceback (most recent call last):
File "/conda_env/main.py", line 5, in <module>
print(S3FileSystem.from_uri("s3://s3-bucket/"))
File "pyarrow/_fs.pyx", line 502, in pyarrow._fs.FileSystem.from_uri
File "pyarrow/_fs.pyx", line 457, in pyarrow._fs.FileSystem._native_from_uri
File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
OSError: When resolving region for bucket 's3-bucket': AWS Error NETWORK_CONNECTION during HeadBucket operation: curlCode: 77, Problem with the SSL CA cert (path? access rights?); Details: error setting certificate file: /home/conda/feedstock_root/build_artifacts/curl_split_recipe_1749032811691/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p
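It fails with an SSL error: the CA certificate path in the message still points at the conda build placeholder prefix rather than at the unpacked environment.
However, it works after I run conda-unpack: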
(pyarrow_conda) bash-4.4# conda-unpack
(pyarrow_conda) bash-4.4# /conda_env/indocker/pyarrow_conda/bin/python /conda_env/main.py
DefaultVerifyPaths(cafile='/conda_env/indocker/pyarrow_conda/ssl/cert.pem', capath='/conda_env/indocker/pyarrow_conda/ssl/certs', openssl_cafile_env='SSL_CERT_FILE', openssl_cafile='/conda_env/indocker/pyarrow_conda/ssl/cert.pem', openssl_capath_env='SSL_CERT_DIR', openssl_capath='/conda_env/indocker/pyarrow_conda/ssl/certs')
(<pyarrow._s3fs.S3FileSystem object at 0x767fc08f90f0>, 's3-bucket')
Unfortunately, running conda-unpack in the Spark environment is currently not possible, as discussed in #89. Also, conda-unpack is not mentioned in the official Spark documentation: https://conda.github.io/conda-pack/spark.html.
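For anyone trying to confirm the same symptom, here is a minimal diagnostic sketch. It only inspects what Python's ssl module reports and does not fix anything. The helper name `stale_verify_paths` and the constant `PLACEHOLDER_MARKER` are made up for illustration; the marker string is taken from the paths in the output above, and the `ssl/cert.pem` location under `sys.prefix` is assumed from the post-conda-unpack output.

```python
import os
import ssl
import sys

# Substring seen in the unrelocated paths above (assumed to be a reliable marker).
PLACEHOLDER_MARKER = "_h_env_placehold"


def stale_verify_paths(paths):
    """Return (field, path) pairs that still contain the build placeholder."""
    stale = []
    for field in ("openssl_cafile", "openssl_capath"):
        value = getattr(paths, field)
        if value and PLACEHOLDER_MARKER in value:
            stale.append((field, value))
    return stale


if __name__ == "__main__":
    paths = ssl.get_default_verify_paths()
    for field, value in stale_verify_paths(paths):
        print(f"{field} still points at the build prefix: {value}")

    # Location where the CA bundle was found after conda-unpack in the output above.
    candidate = os.path.join(sys.prefix, "ssl", "cert.pem")
    print(f"{candidate} exists: {os.path.isfile(candidate)}")
```

Run with the unpacked environment's `bin/python`, this should list both OpenSSL paths before conda-unpack and none afterwards. It only covers Python's own OpenSSL defaults; the curl path in the pyarrow error comes from a separately compiled-in prefix that this check does not read.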
Additional Context
No response