De-duplicator scenarios

When a processing node combines multiple Part 3 documents, it has to deal with the style and region definitions within each of the input documents. The first step is simply to preserve all of the definitions and to prefix their identifiers with the sequence number to ensure they are unique. Next, the de-duplicator examines the definitions of the styles and regions in the combined document. When it finds identical definitions, it merges them and then fixes any references to styles that have been merged.
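The three steps described above (prefix, merge, fix references) can be sketched roughly as follows. This is only an illustration that models styles as plain dictionaries. The function names `prefix_ids`, `dedupe` and `fix_references`, and the `seq<N>_` prefix format, are assumptions made for this example and are not taken from the actual implementation.

```python
from typing import Dict, FrozenSet, List, Tuple

Style = Dict[str, str]  # attribute name -> attribute value


def prefix_ids(seq_num: int, styles: Dict[str, Style]) -> Dict[str, Style]:
    """Step 1: make identifiers unique by prefixing the sequence number."""
    return {f"seq{seq_num}_{sid}": dict(attrs) for sid, attrs in styles.items()}


def dedupe(styles: Dict[str, Style]) -> Tuple[Dict[str, Style], Dict[str, str]]:
    """Step 2: merge identical definitions, keeping the first one seen.

    Returns the surviving styles and a map from every original id to the
    id of the definition that now stands in for it.
    """
    kept: Dict[str, Style] = {}
    first_seen: Dict[FrozenSet[Tuple[str, str]], str] = {}
    replaced_by: Dict[str, str] = {}
    for sid, attrs in styles.items():
        key = frozenset(attrs.items())  # attribute order does not matter
        if key in first_seen:
            replaced_by[sid] = first_seen[key]
        else:
            first_seen[key] = sid
            kept[sid] = attrs
            replaced_by[sid] = sid
    return kept, replaced_by


def fix_references(refs: List[str], replaced_by: Dict[str, str]) -> List[str]:
    """Step 3: point references at the surviving definitions."""
    return [replaced_by.get(ref, ref) for ref in refs]


# Two hypothetical input documents that each define the same style as "s1".
combined: Dict[str, Style] = {}
combined.update(prefix_ids(1, {"s1": {"tts:color": "lime", "tts:backgroundColor": "black"}}))
combined.update(prefix_ids(2, {"s1": {"tts:backgroundColor": "black", "tts:color": "lime"}}))

kept, replaced_by = dedupe(combined)
new_refs = fix_references(["seq2_s1", "seq1_s1"], replaced_by)
```

Here `kept` holds only `seq1_s1`, and both entries in `new_refs` point at it. The same approach would apply to regions.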

For example, the re-segmenter outputs a document with these (simplified) styles:

<style xml:id="seq1234_style1" tts:backgroundColor="black" tts:color="lime" tts:fontSize="1c 2c"/>
<style xml:id="seq1235_style2" tts:color="lime" tts:fontSize="1c 2c" tts:backgroundColor="#000000" />
<style xml:id="seq1236_style3" style="seq1234_style2" />
<style xml:id="seq1237_style4" tts:color="lime" tts:fontSize="10px 20px" tts:backgroundColor="black" />

In this case, the first three styles will be merged because their definitions are functionally identical. The fourth style will not be merged, even if its font size in pixels computes to the same size as the value specified in cells.
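One way to picture "functionally identical" is as a comparison of canonical keys. The sketch below ignores attribute order, normalizes named colours to hex, and resolves `style` references, but does not convert between units. It uses hypothetical ids `a` to `d` that mirror the four styles above. The `NAMED_COLORS` table, the helper names, and the assumption that references contain no cycles are all illustrative and are not taken from the actual implementation.

```python
from typing import Dict, FrozenSet, Tuple

Style = Dict[str, str]

# Assumed subset of named colours; a real implementation would need the
# full set of names that the specification allows.
NAMED_COLORS = {"black": "#000000", "white": "#ffffff", "lime": "#00ff00"}
COLOR_ATTRS = ("tts:color", "tts:backgroundColor")


def resolve(style_id: str, styles: Dict[str, Style]) -> Style:
    """Flatten a style by pulling in attributes from the styles it references.

    Assumes the references contain no cycles.
    """
    attrs = dict(styles[style_id])
    resolved: Style = {}
    for ref in attrs.pop("style", "").split():
        resolved.update(resolve(ref, styles))
    resolved.update(attrs)  # directly specified attributes win
    return resolved


def canonical_key(style_id: str, styles: Dict[str, Style]) -> FrozenSet[Tuple[str, str]]:
    """Build an order-independent key with colours normalized to hex."""
    normalized: Style = {}
    for name, value in resolve(style_id, styles).items():
        if name in COLOR_ATTRS:
            value = NAMED_COLORS.get(value.lower(), value.lower())
        normalized[name] = value  # lengths are kept as written: no unit conversion
    return frozenset(normalized.items())


styles = {
    "a": {"tts:backgroundColor": "black", "tts:color": "lime", "tts:fontSize": "1c 2c"},
    "b": {"tts:color": "lime", "tts:fontSize": "1c 2c", "tts:backgroundColor": "#000000"},
    "c": {"style": "b"},
    "d": {"tts:color": "lime", "tts:fontSize": "10px 20px", "tts:backgroundColor": "black"},
}

same_abc = canonical_key("a", styles) == canonical_key("b", styles) == canonical_key("c", styles)
same_ad = canonical_key("a", styles) == canonical_key("d", styles)
```

With these inputs `same_abc` is true and `same_ad` is false, because `1c 2c` and `10px 20px` are compared as written.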

This can get quite complicated and tricky to test, so for each scenario a separate Part 3 document was created. Below is a list of tested scenarios.

Test Doc	Example scenario	Expected Output
*NoStyleNoRegion		No changes
*NoStylesOneRegion		No changes
*OneStyleNoRegions		No changes
*OneStyleOneRegion		No changes
*1Sty1RegionWithOneStyleAttr		No changes
*3DupSty3DupRegAllAttsSpecified		1 style, 1 region
*6Sty3Dup6Reg3DupForeignNamespace		4 styles, 4 regions
*1Sty1Reg4DupAtts		1 style, 1 region with deduped style attributes
*NoDupStyNoDupReg		No changes
3DupSty3DupRegRefs		Content element references changed
