public final class TextFragmentAbsorber extends TextAbsorber
Represents an absorber of text fragments. Performs a text search and provides access to the
search results via the getTextFragments() collection.
The example demonstrates how to find text on the first page of a PDF document and replace the text and its font.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
com.aspose.pdf.Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Change text and font of the first text occurrence
absorber.getTextFragments().get_Item(1).setText("hi world");
absorber.getTextFragments().get_Item(1).getTextState().setFont(font);
// Save document
doc.save("D:\\Tests\\output.pdf");
The TextFragmentAbsorber object is mainly used in text search scenarios. When the search is
completed, each occurrence is represented by a TextFragment object in the collection returned by
getTextFragments(). A TextFragment object provides access to the text of the occurrence and its
text properties, and allows editing the text and changing the text state (font, font size, color, etc.).
Constructor | Description |
---|---|
TextFragmentAbsorber() | Initializes a new instance of TextFragmentAbsorber that searches all text segments of the document or page. |
TextFragmentAbsorber(Pattern regex) | Initializes a new instance of the TextFragmentAbsorber class for the specified regular expression (java.util.regex.Pattern object). |
TextFragmentAbsorber(Pattern regex, TextEditOptions textEditOptions) | Initializes a new instance of the TextFragmentAbsorber class for the specified regular expression and text edit options. |
TextFragmentAbsorber(Pattern regex, TextSearchOptions textSearchOptions) | Initializes a new instance of the TextFragmentAbsorber class for the specified regular expression and text search options. |
TextFragmentAbsorber(String phrase) | Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase. |
TextFragmentAbsorber(String phrase, TextEditOptions textEditOptions) | Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase and text edit options. |
TextFragmentAbsorber(String phrase, TextSearchOptions textSearchOptions) | Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase and text search options. |
TextFragmentAbsorber(String phrase, TextSearchOptions textSearchOptions, TextEditOptions textEditOptions) | Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase, text search options and text edit options. |
TextFragmentAbsorber(TextEditOptions textEditOptions) | Initializes a new instance of TextFragmentAbsorber with text edit options that searches all text segments of the document or page. |
Modifier and Type | Method | Description |
---|---|---|
void | applyForAllFragments(float fontSize) | Applies font size for all text fragments that were absorbed. |
void | applyForAllFragments(Font font) | Applies font for all text fragments that were absorbed. |
void | applyForAllFragments(Font font, float fontSize) | Applies font and size for all text fragments that were absorbed. |
List<TextExtractionError> | getErrors() | List of TextExtractionError objects. |
TextExtractionOptions | getExtractionOptions() | Gets text extraction options. |
String | getPhrase() | Gets phrase that the TextFragmentAbsorber searches on the PDF document or page. |
String | getText() | Gets extracted text that the TextAbsorber extracts on the PDF document or page. |
TextEditOptions | getTextEditOptions() | Gets text edit options. |
TextFragmentCollection | getTextFragments() | Gets collection of search occurrences that are presented with TextFragment objects. |
TextReplaceOptions | getTextReplaceOptions() | Gets text replace options. |
TextSearchOptions | getTextSearchOptions() | Gets search options. |
boolean | hasErrors_Fragment() | Value indicates whether errors were found during text extraction. |
void | removeAllText(Document document) | Removes all text from the document. |
void | removeAllText(Page page) | Removes all text from the specified page. |
void | removeAllText(Page page, Rectangle rect) | Removes text inside the specified rectangle from the specified page. |
void | reset() | Clears TextFragments collection of this TextFragmentAbsorber object. |
void | setExtractionOptions(TextExtractionOptions value) | Sets text extraction options. |
void | setPhrase(String value) | Sets phrase that the TextFragmentAbsorber searches on the PDF document or page. |
void | setTextEditOptions(TextEditOptions value) | Sets text edit options. |
void | setTextFragments(TextFragmentCollection value) | Sets collection of search occurrences that are presented with TextFragment objects. |
void | setTextReplaceOptions(TextReplaceOptions value) | Sets text replace options. |
void | setTextSearchOptions(TextSearchOptions value) | Sets search options. |
void | visit(IDocument pdf) | Performs search on the specified document. |
void | visit(Page page) | Performs search on the specified page. |
void | visit(XForm xForm) | Performs search on the specified form object. |
public TextFragmentAbsorber()
Initializes a new instance of TextFragmentAbsorber that searches all text segments of the document or page.
The example demonstrates how to find text on the first page of a PDF document and replace the text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object
TextFragmentAbsorber absorber = new TextFragmentAbsorber();
// Make the absorber search for all "hello world" text occurrences
absorber.setPhrase("hello world");
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Change text of the first text occurrence
absorber.getTextFragments().get_Item(1).setText("hi world");
// Save document
doc.save("D:\\Tests\\output.pdf");
Performs text search and provides access to the search results via the getTextFragments() collection.
public TextFragmentAbsorber(TextEditOptions textEditOptions)
Initializes a new instance of TextFragmentAbsorber with text edit options that searches all text segments of the document or page.
The example demonstrates how to find all text fragments on the first PDF document page and replace font for them.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object
TextFragmentAbsorber absorber = new TextFragmentAbsorber(
    new TextEditOptions(TextEditOptions.FontReplace.RemoveUnusedFonts));
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Find Courier font
Font font = FontRepository.findFont("Courier");
// Set the font for all the text fragments
for (TextFragment textFragment : (Iterable<TextFragment>) absorber.getTextFragments())
{
    textFragment.getTextState().setFont(font);
}
// Save document
doc.save("D:\\Tests\\output.pdf");
textEditOptions
- Text edit options (allows turning on some edit features).
Performs text search and provides access to the search results via the getTextFragments() collection.
public TextFragmentAbsorber(String phrase)
Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase.
The example demonstrates how to find text on the first page of a PDF document and replace the text and its font.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
com.aspose.pdf.Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Change text and font of the first text occurrence
absorber.getTextFragments().get_Item(1).setText("hi world");
absorber.getTextFragments().get_Item(1).getTextState().setFont(font);
// Save document
doc.save("D:\\Tests\\output.pdf");
phrase
- Phrase that the TextFragmentAbsorber searches for
Performs text search for the specified phrase and provides access to the search results via the getTextFragments() collection.
public TextFragmentAbsorber(Pattern regex)
Initializes a new instance of the TextFragmentAbsorber class for the specified java.util.regex.Pattern object.
// Open document
Document doc = new Document("input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all matches of the regular expression
TextFragmentAbsorber absorber = new TextFragmentAbsorber(Pattern.compile("h\\w*?o"));
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// We should find the word "hello" and replace it with "Hi"
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("output.pdf");
regex
- java.util.regex.Pattern object that the TextFragmentAbsorber searches for
Performs text search for the specified regular expression and provides access to the search results via the getTextFragments()/setTextFragments(TextFragmentCollection) collection.
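To see which strings the pattern h\w*?o from the example above selects, it can be tried against plain text with java.util.regex alone. The sketch below is only an illustration and does not use Aspose.PDF: the class name RegexPhraseSketch, the helper findAll and the sample sentence are assumptions made for this example, and the absorber's own matching over PDF text may differ in details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Aspose.PDF.
class RegexPhraseSketch {

    /** Returns every substring of text matched by the given regex, in order. */
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // In Java source the backslash is doubled: "h\\w*?o" is the regex h\w*?o.
        List<String> hits = findAll("h\\w*?o", "hello world, hi there, halo");
        System.out.println(hits); // prints [hello, halo]
    }
}
```

The pattern is not anchored to word boundaries, so with java.util.regex it also matches inside longer words (for example, "ho" in "who").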
public TextFragmentAbsorber(String phrase, TextSearchOptions textSearchOptions)
Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase and text search options.
The example demonstrates how to find text with a regular expression on the first page of a PDF document and replace the text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object that searches for all words starting with 'h' and ending with 'o' using a regular expression
TextFragmentAbsorber absorber = new TextFragmentAbsorber("h\\w*?o", new TextSearchOptions(true));
// We should find the word "hello" and replace it with "Hi"
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("D:\\Tests\\output.pdf");
phrase
- Phrase that the TextFragmentAbsorber searches for
textSearchOptions
- Text search options (allows turning on some search features, for example, search with a regular expression)
Performs text search for the specified phrase and provides access to the search results via the getTextFragments() collection.
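In the example above, the TextSearchOptions argument is what turns the phrase into a regular expression instead of literal text. The difference between the two interpretations can be sketched with the standard library alone. The class LiteralVsRegexSketch and both helper methods are hypothetical names for this illustration and are not part of Aspose.PDF; how the absorber itself handles each mode is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of literal versus regex interpretation of one phrase.
class LiteralVsRegexSketch {

    /** Counts non-overlapping occurrences of phrase taken as literal text. */
    static int countLiteral(String phrase, String text) {
        if (phrase.isEmpty()) {
            return 0; // avoid looping forever on an empty phrase
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(phrase, from)) >= 0) {
            count++;
            from += phrase.length();
        }
        return count;
    }

    /** Counts matches of phrase taken as a regular expression. */
    static int countRegex(String phrase, String text) {
        Matcher matcher = Pattern.compile(phrase).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String phrase = "h\\w*?o";
        String text = "hello halo";
        // As literal text the six characters h\w*?o never occur in the sample.
        System.out.println(countLiteral(phrase, text)); // prints 0
        // As a regular expression the phrase matches "hello" and "halo".
        System.out.println(countRegex(phrase, text));   // prints 2
    }
}
```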
public TextFragmentAbsorber(Pattern regex, TextSearchOptions textSearchOptions)
Initializes a new instance of the TextFragmentAbsorber class for the specified regular expression and text search options.
// Open document
Document doc = new Document("input.pdf");
// Create TextFragmentAbsorber object that searches for all words starting with 'h' and ending with 'o' using a regular expression
TextFragmentAbsorber absorber = new TextFragmentAbsorber(Pattern.compile("h\\w*?o"), new TextSearchOptions(true));
// We should find the word "hello" and replace it with "Hi"
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("output.pdf");
regex
- java.util.regex.Pattern object that the TextFragmentAbsorber searches for
textSearchOptions
- Text search options (allows turning on some search features)
Performs text search for the specified regular expression and provides access to the search results via the getTextFragments()/setTextFragments(TextFragmentCollection) collection.
public TextFragmentAbsorber(String phrase, TextSearchOptions textSearchOptions, TextEditOptions textEditOptions)
Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase, text search options and text edit options. The text edit options are not supported yet.
The example demonstrates how to find text with a regular expression on the first page of a PDF document and replace the text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object that searches for all words starting with 'h' and ending with 'o' using a regular expression
TextFragmentAbsorber absorber = new TextFragmentAbsorber("h\\w*?o", new TextSearchOptions(true));
// We should find the word "hello" and replace it with "Hi"
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("D:\\Tests\\output.pdf");
phrase
- Phrase that the TextFragmentAbsorber searches for
textSearchOptions
- Text search options (allows turning on some search features, for example, search with a regular expression)
textEditOptions
- Text edit options (allows turning on some edit features, for example, defining special behavior when a requested symbol cannot be written with the font). The parameter is not supported yet.
Performs text search for the specified phrase and provides access to the search results via the getTextFragments() collection.
public TextFragmentAbsorber(Pattern regex, TextEditOptions textEditOptions)
Initializes a new instance of the TextFragmentAbsorber class for the specified regular expression and text edit options.
regex
- java.util.regex.Pattern object that the TextFragmentAbsorber searches for
textEditOptions
- Text edit options (allows turning on some edit features).
Performs text search for the specified regular expression and provides access to the search results via the getTextFragments()/setTextFragments(TextFragmentCollection) collection.
public TextFragmentAbsorber(String phrase, TextEditOptions textEditOptions)
Initializes a new instance of the TextFragmentAbsorber class for the specified text phrase and text edit options.
phrase
- Phrase that the TextFragmentAbsorber searches for
textEditOptions
- Text edit options (allows turning on some edit features).
Performs text search for the specified phrase and provides access to the search results via the getTextFragments() collection.
public TextFragmentCollection getTextFragments()
Gets the collection of search occurrences, each represented by a TextFragment object.
The example demonstrates how to find text on the first page of a PDF document and replace all search occurrences with new text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Change text of all search occurrences
for (TextFragment textFragment : (Iterable<TextFragment>) absorber.getTextFragments())
{
    textFragment.setText("hi world");
}
// Save document
doc.save("D:\\Tests\\output.pdf");
public void setTextFragments(TextFragmentCollection value)
Sets the collection of search occurrences, each represented by a TextFragment object.
value
- TextFragmentCollection object
The example demonstrates how to find text on the first page of a PDF document and replace all search occurrences with new text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Accept the absorber for first page
doc.getPages().get_Item(1).accept(absorber);
// Change text of all search occurrences
for (TextFragment textFragment : (Iterable<TextFragment>) absorber.getTextFragments())
{
    textFragment.setText("hi world");
}
// Save document
doc.save("D:\\Tests\\output.pdf");
public String getPhrase()
Gets the phrase that the TextFragmentAbsorber searches for in the PDF document or page.
The example demonstrates how to search for text several times and replace it each time.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object to find all "hello" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello");
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Search for another word and replace it
absorber.setPhrase("world");
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("John");
// Save document
doc.save("D:\\Tests\\output.pdf");
public void setPhrase(String value)
Sets the phrase that the TextFragmentAbsorber searches for in the PDF document or page.
value
- String value
The example demonstrates how to search for text several times and replace it each time.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object to find all "hello" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello");
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Search for another word and replace it
absorber.setPhrase("world");
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("John");
// Save document
doc.save("D:\\Tests\\output.pdf");
public TextSearchOptions getTextSearchOptions()
Gets search options. The options enable search using regular expressions.
Overrides: getTextSearchOptions in class TextAbsorber
The example demonstrates how to search for text using a regular expression.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object
TextFragmentAbsorber absorber = new TextFragmentAbsorber();
// Make the absorber search for all words starting with 'h' and ending with 'o' using a regular expression
absorber.setPhrase("h\\w*?o");
absorber.setTextSearchOptions(new TextSearchOptions(true));
// We should find the word "hello" and replace it with "Hi"
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("D:\\Tests\\output.pdf");
public void setTextSearchOptions(TextSearchOptions value)
Sets search options. The options enable search using regular expressions.
Overrides: setTextSearchOptions in class TextAbsorber
value
- TextSearchOptions object
The example demonstrates how to search for text using a regular expression.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Create TextFragmentAbsorber object
TextFragmentAbsorber absorber = new TextFragmentAbsorber();
// Make the absorber search for all words starting with 'h' and ending with 'o' using a regular expression
absorber.setPhrase("h\\w*?o");
absorber.setTextSearchOptions(new TextSearchOptions(true));
// We should find the word "hello" and replace it with "Hi"
doc.getPages().get_Item(1).accept(absorber);
absorber.getTextFragments().get_Item(1).setText("Hi");
// Save document
doc.save("D:\\Tests\\output.pdf");
public TextEditOptions getTextEditOptions()
Gets text edit options. The options define special behavior for when a requested symbol cannot be written with the font.
public void setTextEditOptions(TextEditOptions value)
Sets text edit options. The options define special behavior for when a requested symbol cannot be written with the font.
value
- TextEditOptions object
public TextReplaceOptions getTextReplaceOptions()
Gets text replace options. The options define the behavior when a text fragment is replaced with shorter text.
public void setTextReplaceOptions(TextReplaceOptions value)
Sets text replace options. The options define the behavior when a text fragment is replaced with shorter text.
value
- TextReplaceOptions value
public boolean hasErrors_Fragment()
Indicates whether errors were found during text extraction. Errors are searched for only when TextSearchOptions.LogTextExtractionErrors is true, which may decrease performance.
public List<TextExtractionError> getErrors()
List of TextExtractionError objects. It contains information about errors found during text
extraction. Errors are searched for only when TextSearchOptions.LogTextExtractionErrors is true,
which may decrease performance.
Overrides: getErrors in class TextAbsorber
public String getText()
Gets the extracted text that the TextAbsorber extracts from the PDF document or page.
Overrides: getText in class TextAbsorber
The example demonstrates how to extract text from all pages of the PDF document.
// Open document
Document doc = new Document(inFile);
// Create TextAbsorber object to extract text
TextAbsorber absorber = new TextAbsorber();
// Accept the absorber for all pages of the document
doc.getPages().accept(absorber);
// Get the extracted text
String extractedText = absorber.getText();
public void visit(Page page)
Performs search on the specified page.
The example demonstrates how to find text on the first PDF document page and replace the text.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Visit the first page with the absorber
absorber.visit(doc.getPages().get_Item(1));
// Change text of all search occurrences
for (TextFragment textFragment : (Iterable<TextFragment>) absorber.getTextFragments())
{
    textFragment.setText("hi world");
}
// Save document
doc.save("D:\\Tests\\output.pdf");
Overrides: visit in class TextAbsorber
page
- PDF document page object.
public void visit(IDocument pdf)
Performs search on the specified document.
The example demonstrates how to find text in a PDF document and replace the text of the first search occurrence.
// Open document
Document doc = new Document("D:\\Tests\\input.pdf");
// Find font that will be used to change document text font
Font font = FontRepository.findFont("Arial");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Visit the whole document with the absorber
absorber.visit(doc);
// Change text of the first text occurrence
absorber.getTextFragments().get_Item(1).setText("hi world");
// Save document
doc.save("D:\\Tests\\output.pdf");
Overrides: visit in class TextAbsorber
pdf
- PDF document object.
public void applyForAllFragments(Font font)
Applies the font to all text fragments that were absorbed. This is faster than looping through the fragments when all fragments on the page(s) were absorbed; otherwise it performs about the same as looping.
font
- Font of the text.
public void applyForAllFragments(float fontSize)
Applies the font size to all text fragments that were absorbed. This is faster than looping through the fragments when all fragments on the page(s) were absorbed; otherwise it performs about the same as looping.
fontSize
- Font size of the text.
public void applyForAllFragments(Font font, float fontSize)
Applies the font and font size to all text fragments that were absorbed. This is faster than looping through the fragments when all fragments on the page(s) were absorbed; otherwise it performs about the same as looping.
font
- Font of the text.
fontSize
- Font size of the text.
public void reset()
Clears the TextFragments collection of this TextFragmentAbsorber object.
public void removeAllText(Page page)
Removes all text from the specified page.
page
- PDF document page object.
public final void removeAllText(Page page, Rectangle rect)
Removes text inside the specified rectangle from the specified page.
page
- PDF document page object.
rect
- Rectangle to remove text inside.
public void removeAllText(Document document)
Removes all text from the document.
document
- PDF document object.
public void visit(XForm xForm)
Performs search on the specified form object.
Overrides: visit in class TextAbsorber
xForm
- PDF form object.
public TextExtractionOptions getExtractionOptions()
Gets text extraction options.
Overrides: getExtractionOptions in class TextAbsorber
public void setExtractionOptions(TextExtractionOptions value)
Sets text extraction options.
Overrides: setExtractionOptions in class TextAbsorber
value
- TextExtractionOptions object