Aspose::Pdf::Devices::TextDevice Class Reference (final)

Represents a class for converting PDF document pages into text.

Inherits Aspose::Pdf::Devices::PageDevice.

Public Member Functions

System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > get_ExtractionOptions () const
 Gets text extraction options.

void set_ExtractionOptions (System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > value)
 Sets text extraction options.

System::SharedPtr< System::Text::Encoding > get_Encoding () const
 Gets encoding of extracted text.

void set_Encoding (System::SharedPtr< System::Text::Encoding > value)
 Sets encoding of extracted text.

virtual void Process (System::SharedPtr< Page > page, System::SharedPtr< System::IO::Stream > output)
 Converts the page and saves it as a text stream.

 TextDevice (System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > extractionOptions)
 Initializes a new instance of the TextDevice with text extraction options.

 TextDevice ()
 Initializes a new instance of the TextDevice with the Raw text formatting mode and Unicode text encoding.

 TextDevice (System::SharedPtr< System::Text::Encoding > encoding)
 Initializes a new instance of the TextDevice for the specified encoding.

 TextDevice (System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > extractionOptions, System::SharedPtr< System::Text::Encoding > encoding)
 Initializes a new instance of the TextDevice for the specified encoding with text extraction options.
 
- Public Member Functions inherited from Aspose::Pdf::Devices::PageDevice
void Process (System::SharedPtr< Page > page, System::String outputFileName)
 Performs an operation on the given page and saves the result to a file.
 
- Public Member Functions inherited from System::Object
ASPOSECPP_SHARED_API Object ()
 Creates object. Initializes all internal data structures.

virtual ASPOSECPP_SHARED_API ~Object ()
 Destroys object. Frees all internal data structures.

ASPOSECPP_SHARED_API Object (Object const &x)
 Copy constructor. Doesn't copy anything; just initializes the new object and enables copy constructing subclasses.

Object & operator= (Object const &x)
 Assignment operator. Doesn't copy anything; just initializes the new object and enables copy constructing subclasses.

Object * SharedRefAdded ()
 Increments shared reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

int SharedRefRemovedSafe ()
 Decrements and returns shared reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

int RemovedSharedRefs (int count)
 Decreases shared reference count by the specified value.

Detail::SmartPtrCounter * WeakRefAdded ()
 Increments weak reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

void WeakRefRemoved ()
 Decrements weak reference count. Shouldn't be called directly; instead, use smart pointers or ThisProtector.

Detail::SmartPtrCounter * GetCounter ()
 Gets the reference counter data structure associated with the object.

int SharedCount () const
 Gets the current value of the shared reference counter.

ASPOSECPP_SHARED_API void Lock ()
 Implements C# lock() statement locking. Call directly or use LockContext sentry object.

ASPOSECPP_SHARED_API void Unlock ()
 Implements C# lock() statement unlocking. Call directly or use LockContext sentry object.

virtual ASPOSECPP_SHARED_API bool Equals (ptr obj)
 Compares objects using C# Object.Equals semantics.

virtual ASPOSECPP_SHARED_API int GetHashCode () const
 Analog of C# Object.GetHashCode() method. Enables hashing of custom objects.

virtual ASPOSECPP_SHARED_API String ToString () const
 Analog of C# Object.ToString() method. Enables converting custom objects to string.

virtual ASPOSECPP_SHARED_API ptr MemberwiseClone () const
 Analog of C# Object.MemberwiseClone() method. Enables cloning custom types.

virtual ASPOSECPP_SHARED_API const TypeInfo & GetType () const
 Gets the actual type of the object. Analog of C# System.Object.GetType() call.

virtual ASPOSECPP_SHARED_API bool Is (const TypeInfo &targetType) const
 Checks whether the object is an instance of the type described by targetType. Analog of C# 'is' operator.

virtual ASPOSECPP_SHARED_API void SetTemplateWeakPtr (unsigned int argument)
 Sets the n-th template argument to be a weak pointer (rather than shared). Allows switching pointers in containers to weak mode.
 
template<>
bool Equals (float const &objA, float const &objB)
 
template<>
bool Equals (double const &objA, double const &objB)
 
template<>
bool ReferenceEquals (String const &str, std::nullptr_t)
 
template<>
bool ReferenceEquals (String const &str1, String const &str2)
 

Additional Inherited Members

- Public Types inherited from System::Object
typedef SmartPtr< Object > ptr
 Alias for smart pointer type.

typedef System::Details::SharedMembersType shared_members_type
 Structure that keeps the list of shared pointers contained in the object.
 
- Static Public Member Functions inherited from System::Object
static bool ReferenceEquals (ptr const &objA, ptr const &objB)
 Compares objects by reference.

template<typename T >
static std::enable_if<!IsSmartPtr< T >::value, bool >::type ReferenceEquals (T const &objA, T const &objB)
 Compares objects by reference.

template<typename T >
static std::enable_if<!IsSmartPtr< T >::value, bool >::type ReferenceEquals (T const &objA, std::nullptr_t)
 Reference-compares value type object with nullptr.

template<typename T1 , typename T2 >
static std::enable_if< IsSmartPtr< T1 >::value && IsSmartPtr< T2 >::value, bool >::type Equals (T1 const &objA, T2 const &objB)
 Compares reference type objects in C# style.

template<typename T1 , typename T2 >
static std::enable_if<!IsSmartPtr< T1 >::value && !IsSmartPtr< T2 >::value, bool >::type Equals (T1 const &objA, T2 const &objB)
 Compares value type objects in C# style.

static const TypeInfo & Type ()
 Implements C# typeof(System.Object) construct.
 
- Protected Member Functions inherited from Aspose::Pdf::Devices::PageDevice
virtual void Process (System::SharedPtr< Page > page, System::SharedPtr< System::Drawing::Graphics > gr)
 Renders the page on the graphics.
 
- Protected Member Functions inherited from Aspose::Pdf::Devices::Device
System::SharedPtr< Aspose::Pdf::Document > get_Document () const
 Document which is processed by this device instance.

void set_Document (System::SharedPtr< Aspose::Pdf::Document > value)
 Document which is processed by this device instance.
 

Detailed Description

Represents a class for converting PDF document pages into text.

The TextDevice object is used to extract text from a PDF page.

The example demonstrates how to extract text from the first page of a PDF document.

Document doc = new Document(inFile);
string extractedText;
using (MemoryStream ms = new MemoryStream())
{
    // create text device
    TextDevice device = new TextDevice();
    // convert the page and save text to the stream
    device.Process(doc.Pages[1], ms);
    // use the extracted text
    ms.Close();
    extractedText = Encoding.Unicode.GetString(ms.ToArray());
}
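
The default TextDevice writes text in the Unicode encoding. Assuming this matches the .NET meaning of Encoding.Unicode (UTF-16, little-endian), the bytes in the output stream are two-byte code units rather than UTF-8. The following standard-library-only sketch shows how such bytes could be turned into a UTF-8 std::string once they have been copied out of the stream. Utf16LeToUtf8 is a hypothetical helper written for illustration; it is not part of Aspose.PDF, and it skips a leading byte order mark, which the .NET GetString call does not do.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an Aspose.PDF API): decodes UTF-16LE bytes into UTF-8.
// Unpaired surrogates become U+FFFD; a trailing odd byte is ignored.
std::string Utf16LeToUtf8(const std::vector<std::uint8_t>& bytes) {
    std::string out;

    // Appends one Unicode code point to 'out' as UTF-8.
    auto append = [&out](std::uint32_t cp) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    };

    std::size_t i = 0;
    // Skip a UTF-16LE byte order mark (FF FE) if one is present.
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE) {
        i = 2;
    }

    for (; i + 1 < bytes.size(); i += 2) {
        // Little-endian: low byte first.
        std::uint32_t unit = bytes[i] | (static_cast<std::uint32_t>(bytes[i + 1]) << 8);

        if (unit >= 0xD800 && unit <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 3 < bytes.size()) {
                std::uint32_t low = bytes[i + 2] | (static_cast<std::uint32_t>(bytes[i + 3]) << 8);
                if (low >= 0xDC00 && low <= 0xDFFF) {
                    append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
                    i += 2;  // the low surrogate has been consumed as well
                    continue;
                }
            }
            append(0xFFFD);  // unpaired high surrogate
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            append(0xFFFD);  // unpaired low surrogate
        } else {
            append(unit);
        }
    }
    return out;
}
```

If UTF-8 output is wanted from the start, constructing the device with a UTF-8 encoding, as in the get_Encoding() example, avoids this conversion step.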

Constructor & Destructor Documentation

◆ TextDevice() [1/4]

Aspose::Pdf::Devices::TextDevice::TextDevice ( System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > extractionOptions )

Initializes a new instance of the TextDevice with text extraction options.

Parameters
extractionOptions    Text extraction options.

◆ TextDevice() [2/4]

Aspose::Pdf::Devices::TextDevice::TextDevice ( )

Initializes a new instance of the TextDevice with the Raw text formatting mode and Unicode text encoding.

◆ TextDevice() [3/4]

Aspose::Pdf::Devices::TextDevice::TextDevice ( System::SharedPtr< System::Text::Encoding > encoding )

Initializes a new instance of the TextDevice for the specified encoding.

Parameters
encoding    Encoding of extracted text.

◆ TextDevice() [4/4]

Aspose::Pdf::Devices::TextDevice::TextDevice ( System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > extractionOptions,
System::SharedPtr< System::Text::Encoding > encoding
)

Initializes a new instance of the TextDevice for the specified encoding with text extraction options.

Parameters
extractionOptions    Text extraction options.
encoding    Encoding of extracted text.

Member Function Documentation

◆ get_Encoding()

System::SharedPtr<System::Text::Encoding> Aspose::Pdf::Devices::TextDevice::get_Encoding ( ) const

Gets encoding of extracted text.

The example demonstrates how to represent extracted text in UTF-8 encoding.

Document doc = new Document(inFile);
string extractedText;
// create text device
TextDevice device = new TextDevice(Encoding.UTF8);
// convert the page and save text to the file
device.Process(doc.Pages[1], outFile);
// use the extracted text
extractedText = File.ReadAllText(outFile, Encoding.UTF8);

◆ get_ExtractionOptions()

System::SharedPtr<Aspose::Pdf::Text::TextExtractionOptions> Aspose::Pdf::Devices::TextDevice::get_ExtractionOptions ( ) const

Gets text extraction options.

The example demonstrates how to extract text in raw order.

Document doc = new Document(inFile);
string extractedText;
// create text device
TextDevice device = new TextDevice(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
// convert the page and save text to the file
device.Process(doc.Pages[1], outFile);
// use the extracted text
extractedText = File.ReadAllText(outFile, Encoding.Unicode);

◆ Process()

virtual void Aspose::Pdf::Devices::TextDevice::Process ( System::SharedPtr< Page > page,
System::SharedPtr< System::IO::Stream > output
)

Converts the page and saves it as a text stream.

The example demonstrates how to extract text from the first page of a PDF document.

Document doc = new Document(inFile);
string extractedText;
using (MemoryStream ms = new MemoryStream())
{
    // create text device
    TextDevice device = new TextDevice();
    // convert the page and save text to the stream
    device.Process(doc.Pages[1], ms);
    // use the extracted text
    ms.Close();
    extractedText = Encoding.Unicode.GetString(ms.ToArray());
}

Parameters
page    The page to convert.
output    Result stream.

Implements Aspose::Pdf::Devices::PageDevice.

◆ set_Encoding()

void Aspose::Pdf::Devices::TextDevice::set_Encoding ( System::SharedPtr< System::Text::Encoding > value )

Sets encoding of extracted text.

The example demonstrates how to represent extracted text in UTF-8 encoding.

Document doc = new Document(inFile);
string extractedText;
// create text device
TextDevice device = new TextDevice(Encoding.UTF8);
// convert the page and save text to the file
device.Process(doc.Pages[1], outFile);
// use the extracted text
extractedText = File.ReadAllText(outFile, Encoding.UTF8);

◆ set_ExtractionOptions()

void Aspose::Pdf::Devices::TextDevice::set_ExtractionOptions ( System::SharedPtr< Aspose::Pdf::Text::TextExtractionOptions > value )

Sets text extraction options.

The example demonstrates how to extract text in raw order.

Document doc = new Document(inFile);
string extractedText;
// create text device
TextDevice device = new TextDevice(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
// convert the page and save text to the file
device.Process(doc.Pages[1], outFile);
// use the extracted text
extractedText = File.ReadAllText(outFile, Encoding.Unicode);