Description
When merging stored fields, the merge strategy has to switch from `BULK` mode to `DOC` mode as soon as a single document is deleted, which significantly increases merging overhead.
Since stored fields are compressed at chunk granularity, I propose introducing a new merge strategy:
- For a chunk with no deleted documents, retain `copyChunk` during merging;
- For a chunk containing deleted documents, fall back to `copyOneDoc`.
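To make the per-chunk decision concrete, here is a minimal, self-contained sketch. It is not Lucene's actual merge code: `ChunkMergePlanner`, `Action`, and `plan` are hypothetical names, and the chunk layout is assumed to be available as a simple array of per-chunk document counts alongside a live-docs bitset.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of the proposed per-chunk decision; not Lucene's merge code. */
final class ChunkMergePlanner {

  /** Per-chunk action, named after the two copy paths in the proposal. */
  enum Action { COPY_CHUNK, COPY_ONE_DOC }

  /**
   * @param chunkDocCounts number of documents stored in each chunk, in order (assumed input)
   * @param liveDocs bit i is set if document i is live (not deleted)
   * @return one action per chunk
   */
  static List<Action> plan(int[] chunkDocCounts, BitSet liveDocs) {
    List<Action> actions = new ArrayList<>(chunkDocCounts.length);
    int docBase = 0;
    for (int count : chunkDocCounts) {
      int end = docBase + count;
      // First deleted document at or after docBase; if it lies past the
      // chunk's end, the chunk contains no deletions.
      int firstDeleted = liveDocs.nextClearBit(docBase);
      actions.add(firstDeleted >= end ? Action.COPY_CHUNK : Action.COPY_ONE_DOC);
      docBase = end;
    }
    return actions;
  }

  public static void main(String[] args) {
    BitSet live = new BitSet(10);
    live.set(0, 10); // 10 documents, all live
    live.clear(5);   // delete document 5, which falls in the second chunk
    // prints [COPY_CHUNK, COPY_ONE_DOC, COPY_CHUNK]
    System.out.println(plan(new int[] {4, 4, 2}, live));
  }
}
```

In this example a single deletion forces only one of the three chunks onto the slower path, rather than the whole segment.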
Furthermore, we could use the segment's deletion ratio (e.g., a threshold below 5%) as the condition for enabling or disabling this optimized merge logic.
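The gating condition might look like the sketch below. It assumes the optimization is enabled when the deletion ratio is below the threshold, which is one reading of the proposal. `DeletionRatioGate`, `useChunkAwareMerge`, and the constant are hypothetical names, and 5% is only the example value from above.

```java
/** Hypothetical sketch of the deletion-ratio gate; names and default are illustrative. */
final class DeletionRatioGate {

  /** Example threshold taken from the proposal (5%). */
  static final double MAX_DELETION_RATIO = 0.05;

  /**
   * @param maxDoc total number of documents in the segment, including deleted ones
   * @param delCount number of deleted documents in the segment
   * @return true if the chunk-aware strategy should be used for this segment
   */
  static boolean useChunkAwareMerge(int maxDoc, int delCount) {
    if (maxDoc <= 0) {
      return false; // empty segment: nothing to merge
    }
    double ratio = (double) delCount / maxDoc;
    // Assumption: with few deletions most chunks are clean, so checking
    // each chunk is likely to pay off.
    return ratio < MAX_DELETION_RATIO;
  }
}
```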
Relevant code: lucene/lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsWriter.java, line 684 in 2dee50e