public class ParagraphAbsorber extends Object
Represents an absorber object of page structure objects such as sections and paragraphs.
Performs search for sections and paragraphs of text and provides access to the rectangles and polygons that describe them in text coordinate space.
Also performs text segment search and provides access to the search results via TextFragments
collections grouped by structure elements.
The ParagraphAbsorber.PageMarkups collection contains PageMarkup objects that represent the page structure as collections of MarkupSection and MarkupParagraph objects.
The TextFragment object provides access to the text of a search occurrence and its text properties, and allows editing the text and changing the text state (font, font size, color, etc.).
Constructor and Description |
---|
ParagraphAbsorber()
Initializes a new instance of the
ParagraphAbsorber that performs search for sections/paragraphs of the document or page. |
ParagraphAbsorber(int sectionsSearchDepth)
Initializes a new instance of the
ParagraphAbsorber that performs search for sections/paragraphs of the document or page. |
Modifier and Type | Method and Description |
---|---|
List<PageMarkup> |
getPageMarkups()
Gets the collection of
PageMarkup objects that were absorbed. |
int |
getSectionsSearchDepth()
Gets the value that specifies how many sequential searches for finer structure elements will be performed.
|
void |
setSectionsSearchDepth(int value)
Sets the value that specifies how many sequential searches for finer structure elements will be performed.
|
void |
visit(Document doc)
Performs search for sections and paragraphs on the specified
Document . |
void |
visit(Page page)
Performs search on the specified
Page . |
public ParagraphAbsorber()
Initializes a new instance of the ParagraphAbsorber
that performs search for sections/paragraphs of the document or page.
public ParagraphAbsorber(int sectionsSearchDepth)
Initializes a new instance of the ParagraphAbsorber
that performs search for sections/paragraphs of the document or page.
sectionsSearchDepth
- Number of sequential searches for finer structure elements that will be performed. See the
ParagraphAbsorber.SectionsSearchDepth
property for more hints about the parameter.
public List<PageMarkup> getPageMarkups()
Gets the collection of PageMarkup
objects that were absorbed.
public int getSectionsSearchDepth()
Gets the value that specifies how many sequential searches for finer structure elements will be performed. The default search depth is 3, which means three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).
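The effect of the search depth on the granularity of the result can be illustrated with a toy model. The sketch below is not the library's algorithm and uses no Aspose types: the class `SearchDepthSketch`, its methods, and the `MIN_GUTTER` threshold are hypothetical names invented for illustration. It treats a page as lines of plain text and assumes that each depth step performs one search for horizontally divided regions (split at blank lines) followed by one search for vertically divided regions (split at a gutter of blank columns).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not the Aspose.PDF implementation.
final class SearchDepthSketch {

    // Minimum run of blank columns treated as a gutter between columns (assumed value).
    static final int MIN_GUTTER = 2;

    // One "horizontal" search: split a block into groups of lines separated by blank lines.
    static List<List<String>> splitRows(List<String> block) {
        List<List<String>> parts = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : block) {
            if (line.trim().isEmpty()) {
                if (!current.isEmpty()) {
                    parts.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) parts.add(current);
        return parts;
    }

    // True if no line has a non-space character at the given column.
    static boolean isBlankColumn(List<String> block, int col) {
        for (String line : block) {
            if (col < line.length() && line.charAt(col) != ' ') return false;
        }
        return true;
    }

    // Cuts columns [from, to) out of every line, clamping to each line's length.
    static List<String> slice(List<String> block, int from, int to) {
        List<String> out = new ArrayList<>();
        for (String line : block) {
            int a = Math.min(from, line.length());
            int b = Math.min(to, line.length());
            out.add(line.substring(a, b));
        }
        return out;
    }

    // One "vertical" search: split a block wherever at least MIN_GUTTER blank columns occur.
    static List<List<String>> splitColumns(List<String> block) {
        int width = 0;
        for (String line : block) width = Math.max(width, line.length());
        List<List<String>> parts = new ArrayList<>();
        int start = -1;   // first column of the band being built, -1 if none yet
        int lastInk = -1; // last non-blank column seen so far
        for (int col = 0; col < width; col++) {
            if (isBlankColumn(block, col)) continue;
            if (start >= 0 && col - lastInk > MIN_GUTTER) {
                parts.add(slice(block, start, lastInk + 1));
                start = col;
            }
            if (start < 0) start = col;
            lastInk = col;
        }
        if (start >= 0) parts.add(slice(block, start, lastInk + 1));
        return parts;
    }

    // Runs `depth` passes; each pass refines every region found by the previous pass.
    static List<List<String>> findRegions(List<String> page, int depth) {
        List<List<String>> regions = new ArrayList<>();
        regions.add(page);
        for (int pass = 0; pass < depth; pass++) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> region : regions) {
                for (List<String> rows : splitRows(region)) {
                    next.addAll(splitColumns(rows));
                }
            }
            regions = next;
        }
        return regions;
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList("Title", "", "aaa   xxx", "      yyy", "bbb   zzz");
        for (int depth = 0; depth <= 3; depth++) {
            System.out.println("depth " + depth + ": " + findRegions(page, depth).size() + " regions");
        }
    }
}
```

On this sample page the model finds 1 region at depth 0, 3 at depth 1, and 4 at depth 2 and 3. The blank line inside the left column only becomes visible once the columns have been separated, which is why a second pass yields a finer structure than the first, and why further passes stop helping once nothing is left to split.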
public void setSectionsSearchDepth(int value)
Sets the value that specifies how many sequential searches for finer structure elements will be performed. The default search depth is 3, which means three searches for horizontally divided sections (headers, paragraphs, etc.) and three searches for vertically divided ones (columns).
value
- int value
public void visit(Document doc)
Performs search for sections and paragraphs on the specified Document.
doc
- PDF document object.
public void visit(Page page)
Performs search on the specified Page.
page
- PDF document page object.