
Introduce a pread Directory based on Panama-FFI? #16044

@gf2121

Description

The following was mostly written by an LLM, but the benchmarks were run by me :)

1. Motivation: NIOFSDirectory is still relevant

In recent memory-constrained deployments (cgroup-limited containers with large indices), MMapDirectory triggered severe page-fault storms: pgmajfault rates spiked by an order of magnitude once the working set exceeded the cgroup limit, and query latency degraded sharply. Switching to NIOFSDirectory resolved this for us.

2. Problem: a JDK monitor caps NIOFSDirectory at ~4 threads

After moving more workloads onto NIOFSDirectory, we hit a hard scaling ceiling. The bottleneck is not the kernel — it's a synchronized block in sun.nio.ch.FileChannelImpl. Every positioned read registers the calling thread into a NativeThreadSet (so a concurrent close() can interrupt it via pthread_kill), and that registration takes a global monitor on every read.

// sun.nio.ch.FileChannelImpl
private int readInternal(ByteBuffer dst, long position) throws IOException {
    int n = 0;
    int ti = -1;
    try {
        beginBlocking();
        // ↓↓↓ contention point — monitor-protected, on every single read ↓↓↓
        ti = threads.add();
        if (!isOpen()) return -1;
        do {
            // ... Blocker.begin / IOUtil.read(fd, dst, position, ...) / Blocker.end ...
        } while ((n == IOStatus.INTERRUPTED) && isOpen());
        return IOStatus.normalize(n);
    } finally {
        threads.remove(ti);   // takes the same monitor again
        endBlocking(n > 0);
    }
}
// sun.nio.ch.NativeThreadSet — the monitor every reader fights for
int add() {
    long th = NativeThread.current();
    synchronized (this) {                              // ← global monitor per channel
        // ... grow array, find free slot, write thread handle, return slot index ...
    }
}

Past ~4 threads, this monitor's cache-line bouncing dominates the cost of pread64 itself, and throughput stops scaling. This is structurally tied to the Channel.close() interruption contract and unlikely to be removed from the JDK in the near term.
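The contended path is easy to reproduce outside Lucene with a plain shared FileChannel: every positioned read below goes through FileChannelImpl.readInternal and therefore through the NativeThreadSet monitor. This is an illustrative sketch (class name and parameters are made up for the demo, not taken from the issue's benchmark harness):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

public class SharedChannelReads {

    // All threads share ONE FileChannel; each ch.read(buf, pos) registers the
    // calling thread in the channel's NativeThreadSet under its monitor,
    // which is the contention point described above.
    static long run(Path file, int threads, int readsPerThread, int readSize) throws Exception {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             ExecutorService pool = Executors.newFixedThreadPool(threads)) {
            long size = ch.size();
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    ByteBuffer buf = ByteBuffer.allocateDirect(readSize);
                    long total = 0;
                    for (int i = 0; i < readsPerThread; i++) {
                        buf.clear();
                        long pos = ThreadLocalRandom.current().nextLong(size - readSize + 1);
                        total += ch.read(buf, pos);  // positioned read on the shared channel
                    }
                    return total;
                }));
            }
            long sum = 0;
            for (Future<Long> f : results) sum += f.get();
            return sum;  // total bytes read across all threads
        }
    }

    public static void main(String[] args) throws Exception {
        Path p = Files.createTempFile("shared-channel", ".bin");
        Files.write(p, new byte[1 << 20]);           // 1 MiB of zeros
        long sum = run(p, 4, 200, 4096);
        System.out.println(sum);                     // typically 4 * 200 * 4096 on a local fs
        Files.delete(p);
    }
}
```

Profiling a run like this at 8+ threads is enough to surface NativeThreadSet.add in the hot path.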

3. Benchmark: native pread(2) via Panama FFI scales 4× higher

JMH on Java 25, Linux x86_64, NVMe; 1 GiB file, 16 KiB random reads, 16 reads/op. Throughput in ops/ms (higher is better):

Benchmark              1 thr   2 thr   4 thr    8 thr    16 thr   32 thr
ffiPread               371.8   633.8   1104.5   1854.5   2838.1   2862.5
fileChannelReadDirect  358.9   428.1   683.4    637.3    737.0    737.4
fileChannelReadHeap    318.1   495.4   668.2    596.0    757.4    712.8
  • 1 thread: FFI is ~4% faster — same syscall, less Java overhead.
  • FileChannel plateaus at ~700 ops/ms from 4 threads onward; profiling shows time inside NativeThreadSet's monitor.
  • FFI scales near-linearly to 16 threads, then hits the hardware ceiling at 32.

4. Proposal: PreadDirectory

A new Directory that performs random reads via pread(2) through Panama FFI:

  • POSIX → FFI pread. No NativeThreadSet, no monitor, stateless syscall.
  • Non-POSIX → fallback to NIOFSDirectory. Behavior is never worse than today.
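A minimal sketch of what the POSIX path could look like with the java.lang.foreign (FFM) API on Java 22+. This is not the proposed Lucene code; PreadFFI, readAt, and the error handling are illustrative, and a real implementation would keep the fd open across reads and map errno properly. The point is the shape of the call: a stateless pread(2) downcall with no per-read monitor.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreadFFI {
    private static final Linker LINKER = Linker.nativeLinker();

    private static MethodHandle handle(String symbol, FunctionDescriptor desc) {
        return LINKER.downcallHandle(LINKER.defaultLookup().find(symbol).orElseThrow(), desc);
    }

    // int open(const char *path, int flags)  -- variadic mode arg omitted for O_RDONLY
    private static final MethodHandle OPEN = handle("open",
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.ADDRESS, ValueLayout.JAVA_INT));
    // ssize_t pread(int fd, void *buf, size_t count, off_t offset)
    private static final MethodHandle PREAD = handle("pread",
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.JAVA_INT,
                    ValueLayout.ADDRESS, ValueLayout.JAVA_LONG, ValueLayout.JAVA_LONG));
    // int close(int fd)
    private static final MethodHandle CLOSE = handle("close",
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT));

    /** Reads up to {@code len} bytes at {@code offset}: a stateless syscall, no shared monitor. */
    static byte[] readAt(Path file, long offset, int len) throws Throwable {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cPath = arena.allocateFrom(file.toString()); // NUL-terminated C string
            int fd = (int) OPEN.invokeExact(cPath, 0 /* O_RDONLY on Linux */);
            if (fd < 0) throw new java.io.IOException("open failed: " + file);
            try {
                MemorySegment buf = arena.allocate(len);
                long n = (long) PREAD.invokeExact(fd, buf, (long) len, offset);
                if (n < 0) throw new java.io.IOException("pread failed");
                return buf.asSlice(0, n).toArray(ValueLayout.JAVA_BYTE);
            } finally {
                int rc = (int) CLOSE.invokeExact(fd);
            }
        }
    }

    public static void main(String[] args) throws Throwable {
        Path p = Files.createTempFile("pread-demo", ".bin");
        Files.write(p, "hello pread".getBytes());
        System.out.println(new String(readAt(p, 6, 5)));  // prints "pread"
        Files.delete(p);
    }
}
```

Running on JDK 24+ needs --enable-native-access to silence the restricted-method warning. Unlike FileChannel, a close() racing with an in-flight read would need its own coordination here; that is the trade-off for dropping the NativeThreadSet bookkeeping.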
