Cheshire3 Object Model: Class PdfToTxtPreParser

Module preParser :: Class PdfToTxtPreParser

Class PdfToTxtPreParser

Object Tree:
           object --+        
                    |        
configParser.C3Object --+    
                        |    
    baseObjects.PreParser --+
                            |
                           PdfToTxtPreParser

Convert PDF to text via the pdftotext utility.

Instance Methods

__init__(self, session, server, parent)
The constructors of all Cheshire3 objects take the same arguments:
    session: A Session object
    topNode: The <config> or <subConfig> domNode for the configuration
    parent: The object that provides the scope for this object
process_document(self, session, doc)
Take a Document, transform it and return a new Document object.

Inherited from configParser.C3Object: auth_function, get_config, get_default, get_object, get_path, get_setting, log_function, unauth_function, unlog_function

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__


Class Variables

inMimeType  
outMimeType  

Inherited from configParser.C3Object: configStore, defaults, functionLogger, id, name, objectType, objects, parent, paths, permissionHandlers, settings, subConfigs, unresolvedObjects

Inherited from object: __class__


Method Details

__init__(self, session, server, parent)
(Constructor)

The constructors of all Cheshire3 objects take the same arguments:
    session: A Session object
    topNode: The <config> or <subConfig> domNode for the configuration
    parent: The object that provides the scope for this object
Overrides: configParser.C3Object.__init__
(inherited documentation)

process_document(self, session, doc)

Take a Document, transform it and return a new Document object.
Overrides: baseObjects.PreParser.process_document
(inherited documentation)
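
The conversion step behind process_document might look roughly like the sketch below. It is not the Cheshire3 source: the helper name pdf_bytes_to_text, the temporary file names, and the injectable run parameter are assumptions made for illustration. The only external facts relied on are that pdftotext accepts an input PDF path followed by an output text path, and that it supports the -enc option.

```python
import os
import subprocess
import tempfile


def pdf_bytes_to_text(pdf_bytes, run=subprocess.run):
    """Hypothetical helper: convert raw PDF bytes to text with pdftotext.

    `run` defaults to subprocess.run; it is a parameter only so the
    command can be replaced when pdftotext is not installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pdf_path = os.path.join(tmp, "in.pdf")
        txt_path = os.path.join(tmp, "out.txt")

        # pdftotext works on files, so write the PDF data to disk first.
        with open(pdf_path, "wb") as fh:
            fh.write(pdf_bytes)

        # Positional arguments: input PDF, then output text file.
        run(["pdftotext", "-enc", "UTF-8", pdf_path, txt_path], check=True)

        # Read the extracted text back before the directory is removed.
        with open(txt_path, "r", encoding="utf-8", errors="replace") as fh:
            return fh.read()
```

A preParser along these lines would call such a helper on the incoming Document's data and wrap the returned string in a new Document.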

Class Variable Details

inMimeType

Value:
'application/pdf'

outMimeType

Value:
'text/plain'
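
One way to read the two class variables is as a contract: the preParser expects documents of inMimeType and produces documents of outMimeType. The sketch below illustrates that reading with stand-in classes. SketchDocument, SketchPreParser, the mime_type attribute, and the input check are all hypothetical; whether Cheshire3 itself validates the incoming MIME type is not stated in this page.

```python
class SketchDocument:
    """Stand-in for a Document: some data plus a MIME type."""

    def __init__(self, data, mime_type):
        self.data = data
        self.mime_type = mime_type


class SketchPreParser:
    """Stand-in preParser that mirrors the two class variables above."""

    inMimeType = 'application/pdf'
    outMimeType = 'text/plain'

    def __init__(self, convert):
        # `convert` is any callable mapping PDF bytes to a text string.
        self.convert = convert

    def process_document(self, session, doc):
        # Assumed behaviour: refuse input that is not of inMimeType.
        if doc.mime_type != self.inMimeType:
            raise ValueError("expected %s, got %s"
                             % (self.inMimeType, doc.mime_type))
        # Return a new document tagged with outMimeType.
        return SketchDocument(self.convert(doc.data), self.outMimeType)
```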