[FEATURE] Improve EncryptorImpl with Asynchronous Handling for Scalability #3510

dhrubo-os · 2025-02-06T21:06:35Z

Is your feature request related to a problem?

Summary
The EncryptorImpl class is responsible for encrypting and decrypting text using a tenant-specific master key. It relies on the initMasterKey method to ensure the master key is initialized, creating the plugins-ml-config index if necessary. If the process exceeds 3 seconds, the operation times out. This approach introduces scalability issues and inefficiencies in handling encryption requests.

Problem Statement
The current implementation relies on blocking behavior with a CountDownLatch in initMasterKey, enforcing a strict 3-second timeout. This has several drawbacks:

Scalability Bottleneck: Blocking the thread for up to 3 seconds can lead to resource exhaustion under high concurrency.
Timeout Risks: If the master key retrieval or initialization takes longer than expected due to cluster conditions, requests fail unnecessarily.
Suboptimal Asynchronous Handling: OpenSearch provides ActionListener for non-blocking operations, but the current approach does not leverage it effectively.
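
For readers unfamiliar with the pattern, here is a minimal sketch of the blocking style described above. It is not the actual EncryptorImpl code: the class name `BlockingKeyFetch`, the method `awaitKey`, and the `Consumer`-based loader are hypothetical stand-ins, and the timeout is passed as a parameter rather than fixed at 3 seconds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical illustration of latch-based waiting; not the real EncryptorImpl. */
final class BlockingKeyFetch {

    /**
     * Starts an asynchronous load, then parks the calling thread until the
     * loader calls back or the timeout elapses.
     */
    static String awaitKey(Consumer<Consumer<String>> asyncLoader, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> key = new AtomicReference<>();

        // The loader is expected to invoke this callback once the key is available.
        asyncLoader.accept(loaded -> {
            key.set(loaded);
            latch.countDown();
        });

        try {
            // The calling thread does no useful work while it waits here.
            if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("Timed out waiting for master key");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting for master key", e);
        }
        return key.get();
    }
}
```

Every concurrent caller holds a thread for the full wait, and a slow cluster turns into a hard failure once the timeout expires, even if the key would have arrived shortly afterwards.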

Proposed Solution
Refactor initMasterKey to use a fully asynchronous, ActionListener-based approach, removing the blocking timeout and ensuring encryption/decryption requests proceed once initialization completes. The improved flow should:

Replace CountDownLatch with ActionListener to handle master key initialization without blocking.
Queue encryption/decryption requests while the master key is being initialized, processing them once ready.
Ensure failure handling is robust—if the master key cannot be retrieved, propagate the appropriate failure instead of timing out.

This aligns with OpenSearch's architecture, where long-running tasks should use non-blocking patterns to improve system throughput.
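
A rough sketch of what the proposed flow could look like is below. To keep it self-contained, it uses a hypothetical `Listener` interface as a stand-in for OpenSearch's `ActionListener` (same `onResponse`/`onFailure` shape). The class name `AsyncMasterKeyHolder`, the `keyLoader` parameter, and the retry-after-failure behavior are assumptions for illustration, not the final design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for OpenSearch's ActionListener. */
interface Listener<T> {
    void onResponse(T result);
    void onFailure(Exception e);
}

/** Sketch: loads the key once and queues callers that arrive while the load is in flight. */
final class AsyncMasterKeyHolder {
    private enum State { NOT_STARTED, LOADING, READY }

    private final Consumer<Listener<String>> keyLoader; // hypothetical async loader
    private final List<Listener<String>> waiting = new ArrayList<>();
    private State state = State.NOT_STARTED;
    private String masterKey;

    AsyncMasterKeyHolder(Consumer<Listener<String>> keyLoader) {
        this.keyLoader = keyLoader;
    }

    /** Never blocks: answers immediately if the key is ready, otherwise queues the listener. */
    void getMasterKey(Listener<String> listener) {
        boolean deliverNow = false;
        boolean startLoad = false;
        String key = null;
        synchronized (this) {
            if (state == State.READY) {
                deliverNow = true;
                key = masterKey;
            } else {
                waiting.add(listener);
                if (state == State.NOT_STARTED) {
                    state = State.LOADING;
                    startLoad = true; // only the first caller triggers the load
                }
            }
        }
        // Callbacks and the loader run outside the lock.
        if (deliverNow) {
            listener.onResponse(key);
        } else if (startLoad) {
            keyLoader.accept(new Listener<String>() {
                @Override public void onResponse(String loaded) { complete(loaded, null); }
                @Override public void onFailure(Exception e) { complete(null, e); }
            });
        }
    }

    private void complete(String key, Exception error) {
        List<Listener<String>> toNotify;
        synchronized (this) {
            if (error == null) {
                masterKey = key;
                state = State.READY;
            } else {
                state = State.NOT_STARTED; // assumption: a later request may retry
            }
            toNotify = new ArrayList<>(waiting);
            waiting.clear();
        }
        // Drain the queue: every waiting request gets the key or the real failure.
        for (Listener<String> l : toNotify) {
            if (error == null) {
                l.onResponse(key);
            } else {
                l.onFailure(error);
            }
        }
    }
}
```

In this sketch, encrypt/decrypt would call `getMasterKey` and do their work inside `onResponse`. No thread waits on a latch, and a failed load surfaces the underlying exception to every queued request instead of a generic timeout.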

Expected Impact
Improved Performance: Eliminates unnecessary thread blocking, allowing more efficient request processing.
Better Scalability: Supports high-concurrency workloads without running into thread contention.
Higher Reliability: Reduces unnecessary failures due to timeout conditions, ensuring encryption requests complete successfully as soon as the master key is available.

By adopting this approach, we make encryption handling in OpenSearch ML Commons more robust and efficient, and resilient to operational delays, while keeping the user experience seamless.

Would love to hear thoughts from the team on any additional considerations before moving forward with implementation.

dhrubo-os added the enhancement and untriaged labels Feb 6, 2025

Zhangxunmt assigned dhrubo-os Feb 11, 2025

Zhangxunmt removed the untriaged label Feb 11, 2025
