DynaPDF Manual - Page 61

Content parsing & editing
ParsePage
Syntax:
LBOOL psrParsePage(
   const PPDF* IPDF,                 // PDF instance pointer
   const IPSR* Ctx,                  // Parser instance pointer
   const void* UserData,             // Optional user data
   struct TPDFParseCallbacks* Funcs, // Optional callback functions
   UI32 PageNum,                     // Page number to parse
   TContentParseFlags Flags,         // See below
   struct TContentParseParms* Parms, // Optional parameters
   struct TContent* Out)             // Required output variable

typedef enum TContentParseFlags
{
   cpfDefault              = 0x00000000, // Nothing special to do.
   cpfComputeBBox          = 0x00000001, // Compute bounding boxes of all objects.
   cpfFlattenLayers        = 0x00000002, // Flatten layers.
   cpfSkipInvisibleObjects = 0x00000004, // Ignore invisible objects.
   cpfFullRecursive        = 0x00000008, // Parse all objects recursively.
   cpfNoInlineTemplate     = 0x00000010, // Do not resolve templates if reference count = 1.
   cpfCalcDeviceColors     = 0x00000020, // Compute device colors of all colors which are set in
                                         // the content streams.
   cpfImidiateMode         = 0x00000040, // Internal. This flag is always set if the
                                         // TPDFParseCallbacks structure is passed to
                                         // ParsePage(). It disables certain optimisations.
   cpfNewLinkNames         = 0x00000080, // Internal. Create new link names for all objects.
                                         // Used by Optimize() and CheckConformance().
   cpfEnableTextSelection  = 0x00000100, // This flag is required to enable text selection and
                                         // text extraction.
   cpfInitMatrix           = 0x00000200, // The transformation matrix must be set in the
                                         // TContentParseParms structure.
   cpfSkipClipPaths        = 0x00000400, // Useful for debugging purposes.
   cpfSkipImages           = 0x00000800, // Ignore all images. This flag is useful for text
                                         // extraction.
   cpfSkipShadings         = 0x00001000, // Useful for debugging purposes.
   cpfSkipText             = 0x00002000, // Useful for debugging purposes.
   cpfSkipVector           = 0x00004000  // Useful for debugging purposes. Exclude vector
                                         // graphics with exception of clipping paths.
}TContentParseFlags;

#pragma pack(1)
struct TContentOP
{
   BYTE  OP;    // Operator to execute.
   void* Param; // This pointer is set for operators which have parameters.
};
#pragma pack()

struct TContent
{
   UI32               Count; // Number of available operators.
   struct TContentOP* OP;    // Array of operators.
};

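The TContentParseFlags values are distinct bits, which suggests that several flags can be combined with a bitwise OR before they are passed in the Flags parameter. The following is a minimal sketch under that assumption. It uses a local copy of a few flag values so that it compiles without the DynaPDF headers, and the helper functions text_extraction_flags() and has_flag() are hypothetical names that are not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of a few flag values from the listing above, so that the
   sketch compiles on its own. Real code should use the library's own
   TContentParseFlags definition instead. */
enum
{
   cpfDefault             = 0x00000000,
   cpfComputeBBox         = 0x00000001,
   cpfEnableTextSelection = 0x00000100,
   cpfSkipImages          = 0x00000800
};

/* Hypothetical helper: a flag set for a text extraction pass.
   cpfEnableTextSelection is documented as required for text extraction,
   cpfSkipImages as useful for it. Combining them with OR is an assumption. */
static uint32_t text_extraction_flags(void)
{
   return cpfEnableTextSelection | cpfSkipImages;
}

/* Hypothetical helper: returns nonzero if every bit of 'flag' is set in 'flags'. */
static int has_flag(uint32_t flags, uint32_t flag)
{
   assert(flag != 0); /* cpfDefault is the absence of flags, not a testable bit */
   return (flags & flag) == flag;
}
```

A value built this way would be passed as the Flags argument of psrParsePage(). The flags marked "Internal" in the listing are described as being set by the library itself, so the sketch leaves them out.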
The function parses a page and stores the page contents in a C object structure (the TContent
output variable). Once a page has been parsed, various functions can be called, e.g. to extract the
text of a page, to find and replace text, or to delete arbitrary operators.
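
The resulting structure is a plain array of operators, so it can be inspected with an ordinary loop. The following is a minimal sketch, not library code. The typedefs for BYTE and UI32 are stand-ins (assumptions) for the definitions in the DynaPDF headers, the helper count_ops_with_params() is a hypothetical name, and the sketch assumes that Param is NULL for operators without parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs (assumption). The real ones come from the DynaPDF headers. */
typedef unsigned char BYTE;
typedef uint32_t      UI32;

/* Structures as shown in the syntax listing above. */
#pragma pack(1)
struct TContentOP
{
   BYTE  OP;    /* Operator to execute. */
   void* Param; /* Set for operators which have parameters. */
};
#pragma pack()

struct TContent
{
   UI32               Count; /* Number of available operators. */
   struct TContentOP* OP;    /* Array of operators. */
};

/* Hypothetical helper: counts the operators that carry a parameter block.
   Assumes Param is NULL when an operator has no parameters. */
static UI32 count_ops_with_params(const struct TContent* content)
{
   UI32 i, n = 0;
   assert(content != NULL);
   for (i = 0; i < content->Count; i++)
   {
      if (content->OP[i].Param != NULL) n++;
   }
   return n;
}
```

In real use the TContent variable would be filled by psrParsePage() through the Out parameter. The meaning of the individual OP values and the layout of the Param blocks are not described on this page, so the sketch does not interpret them.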
 
