Document.JoinRunsWithSameFormatting Method
Namespace: Aspose.Words
This is an optimization method. Some documents contain adjacent runs with the same formatting, which usually happens when a document has been heavily edited by hand. Joining these runs can reduce the document size and speed up further processing.
The operation checks every Paragraph node in the document for adjacent Run nodes with identical properties. It ignores the unique identifiers used to track the editing sessions in which a run was created or modified. The first run in each sequence of joined runs accumulates all the text; the remaining runs are deleted from the document.
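The joining rule described above can be illustrated with a small, self-contained sketch. This is not the Aspose.Words implementation: SimpleRun, RunJoiner and JoinAdjacent are hypothetical names introduced only for illustration, formatting is reduced to a single comparable string, and the sketch assumes that joining N adjacent runs counts as N - 1 joins.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a run: its text plus a formatting key.
// This is NOT an Aspose.Words type.
public sealed class SimpleRun
{
    public string Text;
    public string Formatting; // simplified formatting key, e.g. "Arial;12;bold"

    public SimpleRun(string text, string formatting)
    {
        Text = text;
        Formatting = formatting;
    }
}

public static class RunJoiner
{
    // Merges adjacent runs that share the same formatting key, in place.
    // The first run of each sequence accumulates the text; the others are removed.
    // Returns the number of runs removed (assumption: N joined runs = N - 1 joins).
    public static int JoinAdjacent(List<SimpleRun> runs)
    {
        int joins = 0;
        int i = 0;
        while (i < runs.Count - 1)
        {
            if (runs[i].Formatting == runs[i + 1].Formatting)
            {
                runs[i].Text += runs[i + 1].Text; // first run absorbs the text
                runs.RemoveAt(i + 1);             // the absorbed run is deleted
                joins++;
            }
            else
            {
                i++; // formatting differs, so move on to the next run
            }
        }
        return joins;
    }
}

public static class Demo
{
    public static void Main()
    {
        // One "paragraph": three runs with the same formatting, then a bold one.
        var paragraph = new List<SimpleRun>
        {
            new SimpleRun("Hello, ", "Arial;12"),
            new SimpleRun("wor", "Arial;12"),
            new SimpleRun("ld", "Arial;12"),
            new SimpleRun("!", "Arial;12;bold"),
        };

        int joins = RunJoiner.JoinAdjacent(paragraph);

        Console.WriteLine("joins: {0}, runs left: {1}", joins, paragraph.Count);
        foreach (SimpleRun run in paragraph)
            Console.WriteLine("[{0}] \"{1}\"", run.Formatting, run.Text);
    }
}
```

In this model the three "Arial;12" runs collapse into one run containing "Hello, world", and the bold run is left untouched because its formatting differs. The real method compares actual run properties rather than a string key, and it ignores editing-session identifiers when doing so.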
// Let's load this particular document. It contains a lot of content that has been edited many times
// This means the document will most likely contain a large number of runs with duplicate formatting
Document doc = new Document(MyDir + "Rendering.docx");

// This is for illustration purposes only, remember how many run nodes we had in the original document
int runsBefore = doc.GetChildNodes(NodeType.Run, true).Count;

// Join runs with the same formatting. This is useful to speed up processing and may also reduce redundant
// tags when exporting to HTML which will reduce the output file size
int joinCount = doc.JoinRunsWithSameFormatting();

// This is for illustration purposes only, see how many runs are left after joining
int runsAfter = doc.GetChildNodes(NodeType.Run, true).Count;

Console.WriteLine("Number of runs before: {0}, after: {1}, joins: {2}", runsBefore, runsAfter, joinCount);

// Save the optimized document to disk
doc.Save(ArtifactsDir + "Document.JoinRunsWithSameFormatting.html");