public final class OCRConfig extends Object
Configuration for the OCR system. OCRConfig provides methods for customizing image processing and recognition.
OcrEngine ocr = new OcrEngine();
ocr.getConfig().setAdjustRotation(AdjustRotationMode.UserDefined);
ocr.getConfig().setAdjustRotationAngle(90);
Constructor and Description |
---|
OCRConfig()
Creates an instance of
OCRConfig with default parameter values |
Modifier and Type | Method and Description |
---|---|
void |
addRecognitionBlock(IRecognitionBlock recognitionBlock)
Adds a rectangular block to the user-defined RecognitionBlocks.
|
void |
clearRecognitionBlocks()
Clears the recognition blocks array.
|
int |
getAdjustRotation()
Gets the rotation adjustment mode, one of the
AdjustRotationMode values. |
double |
getAdjustRotationAngle()
Gets a value of clockwise rotation angle, in degrees.
|
int |
getAdjustUpsideDownRotation()
Gets the upside-down rotation mode, one of the
AdjustUpsideDownRotationMode values. |
CorrectionFilters |
getCorrectionFilters()
Gets filters for correction before recognition
|
boolean |
getDeleteTableLines()
Gets a value that indicates whether table lines in the document should be found and deleted.
Don't enable this property unnecessarily: it may remove meaningful information from the image and increases total processing time.
|
boolean |
getDetectReadingOrder()
Gets a value that indicates whether the reading order detection operation is applied to text blocks.
The default value is TRUE.
If the OCR process takes too long or stops responding, try setting this property to FALSE.
|
boolean |
getDetectTextRegions()
Gets a value indicating whether automatic detection of text regions is used.
|
boolean |
getDoSpellingCorrection()
Gets a value indicating whether automatic spelling correction should be applied.
|
int |
getProbabilityRow()
Gets the row of probable symbols from which recognized symbols are returned.
|
boolean |
getProcessColoredBackground()
Gets a value indicating whether colored-background processing is enabled. Turn this option on if the input image has a complex colored background and the standard process performs poorly.
|
int |
getQuantizationPalleteSize()
Gets the number of colors in the image palette used during quantization. This is an advanced option that makes sense only if ProcessColoredBackground is true; the default value is 5.
|
List<IRecognitionBlock> |
getRecognitionBlocks()
Gets the user-defined blocks that determine the page layout.
|
boolean |
getRemoveNonText()
Gets a value indicating whether non-text components of the image (e.g. pictures, graphics) are detected and ignored during recognition.
|
boolean |
getSavePreprocessedImages()
Gets a value indicating whether saving of preprocessed images is enabled.
|
char[] |
getWhitelist()
Gets the whitelist of characters.
|
void |
setAdjustRotation(int value)
Sets the rotation adjustment mode, one of the
AdjustRotationMode values. |
void |
setAdjustRotationAngle(double value)
Sets a value of clockwise rotation angle, in degrees.
|
void |
setAdjustUpsideDownRotation(int value)
Sets the upside-down rotation mode, one of the
AdjustUpsideDownRotationMode values. |
void |
setCorrectionFilters(CorrectionFilters value)
Sets filters for correction before recognition
|
void |
setDeleteTableLines(boolean value)
Sets a value that indicates whether table lines in the document should be found and deleted.
Don't enable this property unnecessarily: it may remove meaningful information from the image and increases total processing time.
|
void |
setDetectReadingOrder(boolean value)
Sets a value that indicates whether the reading order detection operation is applied to text blocks.
The default value is TRUE.
If the OCR process takes too long or stops responding, try setting this property to FALSE.
|
void |
setDetectTextRegions(boolean value)
Sets a value indicating whether automatic detection of text regions is used.
|
void |
setDoSpellingCorrection(boolean value)
Sets a value indicating whether automatic spelling correction should be applied.
|
void |
setProbabilityRow(int value)
Sets the row of probable symbols from which recognized symbols are returned.
|
void |
setProcessColoredBackground(boolean value)
Sets a value indicating whether colored-background processing is enabled. Turn this option on if the input image has a complex colored background and the standard process performs poorly.
|
void |
setQuantizationPalleteSize(int value)
Sets the number of colors in the image palette used during quantization. This is an advanced option that makes sense only if ProcessColoredBackground is true; the default value is 5.
|
void |
setRemoveNonText(boolean value)
Set this parameter to TRUE if the image contains non-text components (e.g. pictures, graphics) that should be detected and ignored during recognition.
|
void |
setSavePreprocessedImages(boolean value)
Sets a value indicating whether saving of preprocessed images is enabled.
|
void |
setWhitelist(char[] value)
Sets the whitelist of characters.
|
public OCRConfig()
Creates an instance of OCRConfig
with default parameter values.
public void addRecognitionBlock(IRecognitionBlock recognitionBlock)
Adds a rectangular block to the user-defined RecognitionBlocks.
recognitionBlock
- the block to add
public void clearRecognitionBlocks()
Clears the recognition blocks array.
public int getAdjustRotation()
Gets the rotation adjustment mode, one of the AdjustRotationMode
values. Possible values:
AdjustRotationMode.Automatic
- the skew angle is determined automatically (this may take some time but improves recognition quality)
AdjustRotationMode.UserDefined
- the image is rotated by the angle defined in AdjustRotationAngle
AdjustRotationMode.Disabled
- no image rotation is applied
public double getAdjustRotationAngle()
Gets the clockwise rotation angle, in degrees. Used only when AdjustRotation is equal to AdjustRotationMode.UserDefined.
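As an illustration of what a clockwise angle in degrees means in image coordinates (y axis pointing down), here is a minimal sketch. It is not the library's implementation; the class `ClockwiseRotationSketch` and the method `rotateClockwise` are hypothetical names used only for this example.

```java
final class ClockwiseRotationSketch {
    /**
     * Rotates the point (x, y) around (cx, cy) by the given angle.
     * Coordinates follow the usual image convention: x grows to the right,
     * y grows downward, so a positive angle turns clockwise on screen.
     */
    static double[] rotateClockwise(double x, double y, double cx, double cy, double degrees) {
        double theta = Math.toRadians(degrees);
        double dx = x - cx;
        double dy = y - cy;
        double rx = dx * Math.cos(theta) - dy * Math.sin(theta);
        double ry = dx * Math.sin(theta) + dy * Math.cos(theta);
        return new double[] { cx + rx, cy + ry };
    }

    public static void main(String[] args) {
        // A point to the right of the origin moves to below the origin
        // after a 90 degree clockwise turn: approximately (0, 1).
        double[] p = rotateClockwise(1, 0, 0, 0, 90);
        System.out.println(p[0] + ", " + p[1]);
    }
}
```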
public int getAdjustUpsideDownRotation()
Gets the upside-down rotation mode, one of the AdjustUpsideDownRotationMode
values.
Set it to AdjustUpsideDownRotationMode.Flip
if the text on the image is upside down, so that the image is rotated by 180 degrees.
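A 180 degree rotation reverses both the row order and the order of pixels within each row. The following minimal sketch shows this on a plain pixel grid; it is only an illustration, not the library's code, and `UpsideDownFlipSketch` and `rotate180` are hypothetical names.

```java
import java.util.Arrays;

final class UpsideDownFlipSketch {
    /** Returns a copy of the pixel grid rotated by 180 degrees. */
    static int[][] rotate180(int[][] pixels) {
        int height = pixels.length;
        int[][] out = new int[height][];
        for (int r = 0; r < height; r++) {
            // The last source row becomes the first output row...
            int[] src = pixels[height - 1 - r];
            int width = src.length;
            out[r] = new int[width];
            for (int c = 0; c < width; c++) {
                // ...and each row is read from right to left.
                out[r][c] = src[width - 1 - c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] image = { { 1, 2, 3 }, { 4, 5, 6 } };
        // prints [[6, 5, 4], [3, 2, 1]]
        System.out.println(Arrays.deepToString(rotate180(image)));
    }
}
```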
public CorrectionFilters getCorrectionFilters()
Gets the correction filters that are applied before recognition.
public boolean getDeleteTableLines()
Gets a value that indicates whether table lines in the document should be found and deleted. Don't enable this property unnecessarily: it may remove meaningful information from the image and increases total processing time.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setDeleteTableLines(true);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
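To illustrate why this option can discard meaningful content, here is a minimal sketch of one naive way to erase horizontal table lines from a binarized image. It is not the library's algorithm; the class `TableLineRemovalSketch`, the method `eraseHorizontalLines`, and the `minRun` threshold are hypothetical names chosen for this example.

```java
import java.util.Arrays;

final class TableLineRemovalSketch {
    /**
     * Clears every horizontal run of ink pixels that is at least minRun long.
     * ink[row][col] is true where the binarized image is dark.
     */
    static void eraseHorizontalLines(boolean[][] ink, int minRun) {
        for (boolean[] row : ink) {
            int start = 0;
            while (start < row.length) {
                if (!row[start]) {
                    start++;
                    continue;
                }
                // Find the end of the current run of ink pixels.
                int end = start;
                while (end < row.length && row[end]) {
                    end++;
                }
                // Long runs are treated as table lines and erased.
                if (end - start >= minRun) {
                    Arrays.fill(row, start, end, false);
                }
                start = end;
            }
        }
    }

    public static void main(String[] args) {
        boolean[][] ink = {
            { true, true, true, true, true, true },    // a full-width ruling line
            { false, true, true, false, false, true }  // short strokes, e.g. parts of letters
        };
        eraseHorizontalLines(ink, 5);
        System.out.println(Arrays.deepToString(ink));
    }
}
```

Any sufficiently long horizontal stroke, such as an underline, is erased along with real table borders, which is the kind of information loss the warning above refers to.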
public boolean getDetectReadingOrder()
Gets a value that indicates whether the reading order detection operation is applied to text blocks. The default value is TRUE. If the OCR process takes too long or stops responding, try setting this property to FALSE.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setDetectReadingOrder(false);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public boolean getDetectTextRegions()
Gets a value indicating whether automatic detection of text regions is used. If this property is set to "true", manually set recognition blocks are ignored.
public boolean getDoSpellingCorrection()
Gets a value indicating whether automatic spelling correction should be applied. Use this option to improve the OCR result, but note that it increases total processing time.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setDoSpellingCorrection(true);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public int getProbabilityRow()
Gets the row of probable symbols from which recognized symbols are returned. The default value is 0, so only the most probable symbols are returned.
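One way to picture this setting, assuming that each recognized position carries a list of alternative symbols ordered from most to least probable, is the sketch below. This reading of "row" is an assumption, not something the reference states explicitly, and `ProbabilityRowSketch` and `row` are hypothetical names that are not part of the library.

```java
import java.util.Arrays;
import java.util.List;

final class ProbabilityRowSketch {
    /**
     * Builds the text for one row of alternatives.
     * candidates.get(i) holds the (non-empty) alternatives for position i,
     * most probable first; rowIndex 0 selects the most probable symbol.
     */
    static String row(List<char[]> candidates, int rowIndex) {
        StringBuilder text = new StringBuilder();
        for (char[] alternatives : candidates) {
            // Assumption: fall back to the least probable available symbol
            // when a position has fewer alternatives than rowIndex + 1.
            int index = Math.min(rowIndex, alternatives.length - 1);
            text.append(alternatives[index]);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<char[]> candidates = Arrays.asList(
            new char[] { 'O', '0' },
            new char[] { 'K' });
        System.out.println(row(candidates, 0)); // prints OK
        System.out.println(row(candidates, 1)); // prints 0K
    }
}
```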
public boolean getProcessColoredBackground()
Gets a value indicating whether colored-background processing is enabled. Turn this option on if the input image has a complex colored background and the standard process performs poorly.
public int getQuantizationPalleteSize()
Gets the number of colors in the image palette used during quantization. This is an advanced option that makes sense only if ProcessColoredBackground is true; the default value is 5.
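To show what limiting an image to a small palette means, here is a minimal sketch of uniform quantization of gray levels. The library's actual quantization method is not documented here and may differ; `QuantizationSketch` and `quantize` are hypothetical names used only for illustration.

```java
final class QuantizationSketch {
    /**
     * Maps a gray level in 0..255 to one of paletteSize evenly spaced levels.
     * paletteSize must be at least 2.
     */
    static int quantize(int gray, int paletteSize) {
        if (paletteSize < 2) {
            throw new IllegalArgumentException("paletteSize must be >= 2");
        }
        if (gray < 0 || gray > 255) {
            throw new IllegalArgumentException("gray must be in 0..255");
        }
        // Index of the bucket this gray level falls into: 0..paletteSize-1.
        int bucket = Math.min(gray * paletteSize / 256, paletteSize - 1);
        // Representative level of the bucket, spread evenly over 0..255.
        return bucket * 255 / (paletteSize - 1);
    }

    public static void main(String[] args) {
        // With a palette of 5 levels, every input maps to 0, 63, 127, 191 or 255.
        for (int gray : new int[] { 0, 100, 128, 200, 255 }) {
            System.out.println(gray + " -> " + quantize(gray, 5));
        }
    }
}
```

A smaller palette merges more distinct colors into the same level, which can simplify a complex background but can also merge text with its surroundings.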
public List<IRecognitionBlock> getRecognitionBlocks()
Gets the user-defined blocks that determine the page layout.
public boolean getRemoveNonText()
Gets a value indicating whether non-text components are removed. Set this parameter to TRUE if the image contains non-text components (e.g. pictures, graphics) that should be detected and ignored during the recognition process.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setRemoveNonText(true);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public boolean getSavePreprocessedImages()
public char[] getWhitelist()
Gets the whitelist of characters. This property lets you specify a new recognition alphabet, that is, restrict the set of characters that can be recognized. This is useful when, for example, the images you need to process contain only digits. A whitelist of digits can guarantee that the recognition result contains only digits and no other characters (without it, the digit ‘0’ may be recognized as the letter ‘O’, or ‘1’ as ‘I’).
OcrEngine ocr = new OcrEngine();
ocr.getConfig().setWhitelist(new char[] {'1', '2', '3', '4', '5', '6', '7', '8', '9', '0'});
ocr.setImage(com.aspose.ocr.ImageStream.fromFile("numbers.png"));
ocr.process();
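The effect of a whitelist on look-alike characters can be sketched as follows, assuming that the recognizer produces, for each position, alternatives ordered from most to least probable. This is an illustration only, not the library's implementation; `WhitelistSketch` and `recognize` are hypothetical names, and skipping positions that have no whitelisted alternative is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WhitelistSketch {
    /**
     * Picks, for each position, the most probable alternative that is allowed.
     * A null or empty whitelist means no restriction.
     */
    static String recognize(List<char[]> candidates, char[] whitelist) {
        Set<Character> allowed = new HashSet<>();
        if (whitelist != null) {
            for (char c : whitelist) {
                allowed.add(c);
            }
        }
        StringBuilder text = new StringBuilder();
        for (char[] alternatives : candidates) {
            for (char c : alternatives) {
                if (allowed.isEmpty() || allowed.contains(c)) {
                    text.append(c);
                    break; // keep only the first (most probable) allowed symbol
                }
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<char[]> candidates = Arrays.asList(
            new char[] { 'O', '0' },
            new char[] { 'I', '1' },
            new char[] { '7' });
        System.out.println(recognize(candidates, null));                       // prints OI7
        System.out.println(recognize(candidates, "0123456789".toCharArray())); // prints 017
    }
}
```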
public void setAdjustRotation(int value)
Sets the rotation adjustment mode, one of the AdjustRotationMode
values. Possible values:
AdjustRotationMode.Automatic
- the skew angle is determined automatically (this may take some time but improves recognition quality)
AdjustRotationMode.UserDefined
- the image is rotated by the angle defined in AdjustRotationAngle
AdjustRotationMode.Disabled
- no image rotation is applied
public void setAdjustRotationAngle(double value)
Sets the clockwise rotation angle, in degrees. Used only when AdjustRotation is equal to AdjustRotationMode.UserDefined.
public void setAdjustUpsideDownRotation(int value)
Sets the upside-down rotation mode, one of the AdjustUpsideDownRotationMode
values.
Set it to AdjustUpsideDownRotationMode.Flip
if the text on the image is upside down, so that the image is rotated by 180 degrees.
public void setCorrectionFilters(CorrectionFilters value)
Sets the correction filters that are applied before recognition.
public void setDeleteTableLines(boolean value)
Sets a value that indicates whether table lines in the document should be found and deleted. Don't enable this property unnecessarily: it may remove meaningful information from the image and increases total processing time.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setDeleteTableLines(true);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public void setDetectReadingOrder(boolean value)
Sets a value that indicates whether the reading order detection operation is applied to text blocks. The default value is TRUE. If the OCR process takes too long or stops responding, try setting this property to FALSE.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
// Resource file name
ocr.getLanguageContainer().addLanguage(LanguageFactory.load("Portuguese-RSC-HS-PB-ResourcesAllCharsNet.zip"));
ocr.getConfig().setDetectReadingOrder(false);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public void setDetectTextRegions(boolean value)
Sets a value indicating whether automatic detection of text regions is used. If this property is set to "true", manually set recognition blocks are ignored.
public void setDoSpellingCorrection(boolean value)
Sets a value indicating whether automatic spelling correction should be applied. Use this option to improve the OCR result, but note that it increases total processing time.
public void setProbabilityRow(int value)
Sets the row of probable symbols from which recognized symbols are returned. The default value is 0, so only the most probable symbols are returned.
public void setProcessColoredBackground(boolean value)
Sets a value indicating whether colored-background processing is enabled. Turn this option on if the input image has a complex colored background and the standard process performs poorly.
public void setQuantizationPalleteSize(int value)
Sets the number of colors in the image palette used during quantization. This is an advanced option that makes sense only if ProcessColoredBackground is true; the default value is 5.
public void setRemoveNonText(boolean value)
Set this parameter to TRUE if the image contains non-text components (e.g. pictures, graphics) that should be detected and ignored during the recognition process.
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
ocr.getConfig().setRemoveNonText(true);
if (ocr.process()) {
    System.out.println(ocr.getText());
}
public void setSavePreprocessedImages(boolean value)
value
- whether saving of preprocessed images is enabled
OcrEngine ocr = new OcrEngine();
ocr.setImage(ImageStream.fromFile("image.tiff"));
ocr.getConfig().setSavePreprocessedImages(true);
if (ocr.process()) {
    BufferedImage im = ocr.getPreprocessedImages().getBinarizedImage();
}
public void setWhitelist(char[] value)
Sets the whitelist of characters. If it is not null and not empty, OCR recognizes only the whitelisted characters.
OcrEngine ocr = new OcrEngine();
ocr.getConfig().setWhitelist(new char[] {'1', '2', '3', '4', '5', '6', '7', '8', '9', '0'});