Memory Allocation Optimization for String to Path Conversion on non-JVM Platforms #1629

@blastmann

Issue Description

When converting strings to Path objects using String.toPath() on non-JVM platforms, there's a potential memory allocation inefficiency. Unlike the JVM implementation that utilizes a SegmentPool for recycling Segment objects, the non-JVM implementation creates new Segment objects for each conversion without reusing them.

Technical Analysis

The current implementation in String.commonToPath() creates a new Buffer for each conversion:

internal fun String.commonToPath(normalize: Boolean): Path {
  return Buffer().writeUtf8(this).toPath(normalize)
}

On JVM platforms, Segment objects are pooled through the SegmentPool:

// JVM implementation
internal actual object SegmentPool {
  actual val MAX_SIZE = 64 * 1024 // 64 KiB
  // Segment recycling logic...
}

On non-JVM platforms, however, the SegmentPool is effectively a no-op:

// Non-JVM implementation
internal actual object SegmentPool {
  actual val MAX_SIZE: Int = 0
  actual val byteCount: Int = 0
  actual fun take(): Segment = Segment()
  actual fun recycle(segment: Segment) {
  }
}

This means that each toPath() call on non-JVM platforms performs the following steps, none of which reuse earlier allocations:

  1. Creates a new Buffer
  2. Allocates one or more new Segment objects
  3. Runs the path normalization logic
  4. Reads the result into a ByteString
  5. Creates a new Path object

For applications that perform frequent path operations, this can lead to excessive memory allocations and increased GC pressure.
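
To make the difference concrete, here is a minimal, dependency-free sketch that counts allocations under a no-op pool versus a recycling pool. FakeSegment, FakeSegmentPool, NoOpPool, RecyclingPool and countAllocations are hypothetical stand-ins written for this illustration; they are not Okio classes, and the real SegmentPool implementations differ (the JVM one also has to handle concurrency, which this sketch ignores).

```kotlin
// Stand-in for okio.Segment; the real class carries more state than a byte array.
class FakeSegment {
    val data = ByteArray(8192)
}

// Hypothetical interface modelled on the take()/recycle() pair shown above.
interface FakeSegmentPool {
    val allocations: Int
    fun take(): FakeSegment
    fun recycle(segment: FakeSegment)
}

// Mirrors the no-op behaviour: every take() allocates and recycle() drops the segment.
class NoOpPool : FakeSegmentPool {
    private var count = 0
    override val allocations: Int get() = count

    override fun take(): FakeSegment {
        count++
        return FakeSegment()
    }

    override fun recycle(segment: FakeSegment) {
        // Intentionally empty: the segment becomes garbage.
    }
}

// Keeps up to maxSegments idle segments for reuse. Not thread-safe.
class RecyclingPool(private val maxSegments: Int) : FakeSegmentPool {
    private val free = ArrayDeque<FakeSegment>()
    private var count = 0
    override val allocations: Int get() = count

    override fun take(): FakeSegment {
        val reused = free.removeLastOrNull()
        if (reused != null) return reused
        count++
        return FakeSegment()
    }

    override fun recycle(segment: FakeSegment) {
        if (free.size < maxSegments) free.addLast(segment)
    }
}

// Simulates short-lived buffers that each take one segment and release it again.
fun countAllocations(pool: FakeSegmentPool, conversions: Int): Int {
    repeat(conversions) {
        val segment = pool.take()
        pool.recycle(segment)
    }
    return pool.allocations
}

fun main() {
    println(countAllocations(NoOpPool(), 10_000))       // prints 10000
    println(countAllocations(RecyclingPool(8), 10_000)) // prints 1
}
```

The model assumes each conversion needs exactly one segment and releases it before the next conversion starts, which is the best case for pooling; real workloads would land somewhere between the two numbers.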

Steps to Reproduce

Benchmark code that demonstrates the issue:

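import kotlin.time.measureTime
import okio.Path.Companion.toPath

// Create many paths from strings in a loop and report the elapsed time.
// measureTime is used because it is available in common Kotlin code.
fun benchmarkPathCreation() {
    val elapsed = measureTime {
        for (i in 1..10000) {
            val path = "user/documents/file$i.txt".toPath()
            // Use path...
        }
    }
    println("Time: ${elapsed.inWholeMilliseconds}ms")
}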

On non-JVM platforms, this allocates at least 10,000 Buffer objects and at least 10,000 Segment objects, all of which must be garbage collected.

Proposed Solutions

  1. Implement a Segment pooling mechanism for non-JVM platforms
    Similar to the JVM implementation but adapted for non-JVM environments.

  2. Add a caching mechanism for Path objects
    Introduce a Path cache for commonly used paths, potentially as an opt-in feature.

  3. Optimize the Path creation process
    For simple paths that don't require normalization, consider a more direct creation path.

  4. Provide developer guidance
    Document this behavior and provide best practices for efficient Path handling on non-JVM platforms.
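
As a rough illustration of option 2, an opt-in cache could look something like the sketch below. LruCache is a hypothetical helper written for this issue, not an existing Okio API. It is generic over the cached value so the sketch runs without Okio; with Okio on the classpath the factory would presumably be { it.toPath() }. It relies only on LinkedHashMap preserving insertion order, so it should work in common Kotlin code, and it is not thread-safe.

```kotlin
// Hypothetical opt-in cache; not part of Okio. Not thread-safe.
class LruCache<V : Any>(
    private val capacity: Int,
    private val create: (String) -> V,
) {
    init {
        require(capacity > 0) { "capacity must be positive" }
    }

    // LinkedHashMap iterates in insertion order, so the first key is the least recently used.
    private val entries = LinkedHashMap<String, V>()

    // Number of lookups that had to call the factory.
    var misses = 0
        private set

    operator fun get(key: String): V {
        val cached = entries.remove(key)
        if (cached != null) {
            entries[key] = cached // Re-insert to mark the entry as most recently used.
            return cached
        }
        misses++
        val created = create(key)
        if (entries.size >= capacity) {
            entries.remove(entries.keys.first()) // Evict the least recently used entry.
        }
        entries[key] = created
        return created
    }
}

fun main() {
    // A plain String transformation stands in for String.toPath() here.
    val cache = LruCache<String>(capacity = 2) { it.uppercase() }
    cache["a/b"]
    cache["a/b"]
    cache["c/d"]
    println(cache.misses) // prints 2
}
```

A cache like this only helps when the same strings are converted repeatedly; in the benchmark above every string is unique, so every lookup would miss. A real implementation would also need to account for the normalize flag, since the same string can yield different Path values depending on it.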

Environment

  • Okio version: 3.9.7
  • Platforms affected: Non-JVM platforms (Kotlin Native)
  • Priority: Medium (performance optimization)
