Description
Running Java applications on the current base image amazon/aws-lambda-java:21.2025.02.24.10-x86_64 produces exceptions when reading from or writing to files with non-ASCII characters in their filenames.
This is because the JVM initializes the read-only system property sun.jnu.encoding with the value ANSI_X3.4-1968 (an alias for US-ASCII), so the Java I/O code cannot map the Unicode characters to this encoding when constructing File / Path objects.
The underlying problem seems to be the absence of the proper glibc extensions for the English locale in the base image: installing the package glibc-langpack-en fixes the problem.
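To illustrate why the encoding value matters, here is a minimal sketch that asks a charset whether it can represent a given file name. The class name JnuEncodingCheck and the helper canEncode are made up for this illustration, and it only approximates the check that the JDK performs internally when it encodes a path. The file names are the ones from the reproduction below, written as Unicode escapes so that the source compiles regardless of the source file encoding.

```java
import java.nio.charset.Charset;

public class JnuEncodingCheck {
    // Hypothetical helper: true if the named charset can represent
    // every character of fileName.
    public static boolean canEncode(String charsetName, String fileName) {
        return Charset.forName(charsetName).newEncoder().canEncode(fileName);
    }

    public static void main(String[] args) {
        // Encoding the JVM picked at startup; falls back to US-ASCII
        // if the property is unexpectedly missing.
        String jnu = System.getProperty("sun.jnu.encoding", "US-ASCII");
        String[] names = {
            "Germ\u00e4n \u00dcml\u00e4\u00fcts.txt", // German umlauts
            "\u96a8\u6a5f\u6587\u5b57.txt",           // CJK characters
            "plain.txt"                               // ASCII only
        };
        for (String name : names) {
            System.out.println(jnu + " can encode '" + name + "': "
                    + canEncode(jnu, name));
        }
    }
}
```

With sun.jnu.encoding set to ANSI_X3.4-1968, only the ASCII-only name should be reported as encodable. With UTF-8, all three should be.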
This issue is NOT reproducible with the images for other Java versions:
- 17: amazon/aws-lambda-java:17.2025.02.24.09-x86_64
- 11: amazon/aws-lambda-java:11.2025.02.24.09-x86_64
Would you therefore consider adding this package back to the default Java 21 image as well?
Potentially related issues:
- Unsupported Locale Error with upgrading to Python 3.12 image from Python 3.10/11 #229
- Considerations on Java base docker image size #155
Steps to reproduce
Save this Java program as a file called DebugEnv.java:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DebugEnv {
    public static void main(String[] args) throws IOException {
        System.out.println("Environment Variables:");
        System.getenv().forEach((k, v) -> System.out.println(k + ": " + v));
        System.out.println();
        System.out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));

        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Files.write(tmpDir.resolve("Germän Ümläüts.txt"), "does not matter".getBytes(StandardCharsets.UTF_8));
        Files.write(tmpDir.resolve("隨機文字.txt"), "does not matter".getBytes(StandardCharsets.UTF_8));
    }
}
Then run this docker command from the same directory:
docker run --rm --platform=linux/amd64 -it --entrypoint=java -v ${PWD}/DebugEnv.java:/var/task/DebugEnv.java amazon/aws-lambda-java:21 DebugEnv.java
Output:
Environment Variables:
HOME: /root
LAMBDA_RUNTIME_DIR: /var/runtime
LAMBDA_TASK_ROOT: /var/task
PATH: /var/lang/bin:/usr/local/bin:/usr/bin/:/bin:/opt/bin
TZ: :/etc/localtime
LD_LIBRARY_PATH: /var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt/lib
TERM: xterm
LANG: en_US.UTF-8
HOSTNAME: 54ff4216deb1
sun.jnu.encoding: ANSI_X3.4-1968
Exception in thread "main" java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: t?st.txt
at java.base/sun.nio.fs.UnixPath.encode(Unknown Source)
at java.base/sun.nio.fs.UnixPath.<init>(Unknown Source)
at java.base/sun.nio.fs.UnixFileSystem.getPath(Unknown Source)
at java.base/java.nio.file.Path.resolve(Unknown Source)
at DebugEnv.main(DebugEnv.java:18)
As you can see, the LANG environment variable has been properly set to en_US.UTF-8, yet the JVM has initialized the sun.jnu.encoding system property with ANSI_X3.4-1968. Consequently, the file write operations fail with an exception.
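Until the image is fixed, an application can at least make this failure easier to diagnose. The following is a minimal sketch, not a fix: the class SafeResolve and its helper tryResolve are hypothetical names. It catches the InvalidPathException thrown by Path.resolve and reports the active sun.jnu.encoding value alongside the reason.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class SafeResolve {
    // Hypothetical helper: resolves fileName against dir, returning an
    // empty Optional instead of throwing when the name cannot be encoded.
    public static Optional<Path> tryResolve(Path dir, String fileName) {
        try {
            return Optional.of(dir.resolve(fileName));
        } catch (InvalidPathException e) {
            System.err.println("Cannot build path (sun.jnu.encoding="
                    + System.getProperty("sun.jnu.encoding") + "): "
                    + e.getReason());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        // Non-ASCII name written with a Unicode escape (a-umlaut).
        Optional<Path> result = tryResolve(tmpDir, "t\u00e4st.txt");
        System.out.println(result.isPresent()
                ? "Resolved: " + result.get()
                : "File name is not representable in this environment");
    }
}
```

This only turns the stack trace into a clearer message. Files with non-ASCII names remain inaccessible as long as sun.jnu.encoding is ANSI_X3.4-1968.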
Fixing the image
Open a shell in the default image:
docker run --rm --platform=linux/amd64 -it --entrypoint=/bin/sh -v ${PWD}/DebugEnv.java:/var/task/DebugEnv.java amazon/aws-lambda-java:21
Install the package glibc-langpack-en:
dnf install glibc-langpack-en
Run the demo program:
java DebugEnv.java
Output:
Environment Variables:
HOME: /root
SHLVL: 1
LAMBDA_RUNTIME_DIR: /var/runtime
PATH: /var/lang/bin:/usr/local/bin:/usr/bin/:/bin:/opt/bin
LAMBDA_TASK_ROOT: /var/task
TZ: :/etc/localtime
LD_LIBRARY_PATH: /var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt/lib
TERM: xterm
PWD: /var/task
_: /var/lang/bin/java
LANG: en_US.UTF-8
HOSTNAME: 4a0c0c2fbc7f
sun.jnu.encoding: UTF-8
As you can see, this time the JVM has initialized the sun.jnu.encoding system property with UTF-8, and the file write operations succeed without any exceptions.