
PdfConnector properties, methods, and events

Use the PdfConnector component in an automation to search for text, extract text and images, and annotate PDF files. You cannot use the PdfConnector component to view, create, or modify PDF files, other than to annotate them.

You can use this component to process PDF files with or without user interaction. For instance, you can use this component to process a PDF file before presenting it to the Runtime user.

Use this component with the PdfViewer component to bring annotations and highlights to the Runtime user's attention.

Note: In PDF files, left and right coordinates are offsets from the left of the page. Bottom and top coordinates are offsets from the bottom of the page.
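As a rough illustration of the note above, the following Python sketch converts a rectangle expressed in PDF coordinates into offsets from the top-left corner of the page, which is the convention many screen and image APIs use. The function name, the argument order, and the 792-point page height (US Letter) are assumptions made for this example and are not part of the PdfConnector API.

```python
def pdf_rect_to_top_left(left, right, top, bottom, page_height):
    """Convert a PDF rectangle to top-left-origin coordinates.

    Hypothetical helper for illustration only. All values are in points.
    In PDF coordinates, left/right are offsets from the left edge and
    top/bottom are offsets from the bottom edge of the page.
    """
    # Horizontal offsets are already measured from the left edge.
    # Vertical offsets are flipped: a larger PDF "top" value sits closer
    # to the top of the page, so it becomes a smaller y value.
    y_top = page_height - top
    y_bottom = page_height - bottom
    return left, right, y_top, y_bottom


# A box near the top of a page assumed to be 792 points tall.
print(pdf_rect_to_top_left(72.0, 200.0, 720.0, 700.0, 792.0))
```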

The PdfConnector component contains the following properties, methods, and events.

Properties

Property

Description

AnnotationCount

(Read-only) Displays the number of annotations in the document.

AutoSaveEnabled

(19.1.14 or later)

Enter True if changes to a PDF file should automatically be saved when the PDF file is closed, otherwise False.

DetectedDocumentType

(19.1.14 or later)

(Read-only) Displays the DocumentType object for the currently loaded document.

DetectedDocumentTypeName

(19.1.14 or later)

(Read-only) Displays the name of the currently loaded document type.

FileName

Specifies the name of the PDF file.

LineCount

(19.1.14 or later)

(Read-only) Displays the number of lines in the PDF file.

LineThreshold

The system compares the amount of white space between these points when comparing two pieces of text to determine if the text is on the same line:

· The amount of white space above the top of the line

· The amount of white space below the bottom of the line

Your entry in this property sets the threshold. If the white space is less than or equal to your entry, the system considers the text to be on the same line. If it is more than your entry, it considers the text to be on different lines.

The default is 2.0 points, with a point being equal to 1/72 of an inch.
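The following Python sketch shows one possible reading of the LineThreshold rule. It assumes that the connector compares the top offsets and the bottom offsets of two pieces of text and treats them as one line when both differences are within the threshold. The function name and the (top, bottom) tuple layout are hypothetical, and the actual comparison inside the component may differ.

```python
LINE_THRESHOLD_DEFAULT = 2.0  # points; 1 point = 1/72 of an inch


def on_same_line(piece_a, piece_b, threshold=LINE_THRESHOLD_DEFAULT):
    """Return True if two text pieces would be treated as one line.

    Each piece is a (top, bottom) pair in points, measured from the
    bottom of the page. This models an assumed interpretation of the
    LineThreshold property, not the component's real implementation.
    """
    top_difference = abs(piece_a[0] - piece_b[0])
    bottom_difference = abs(piece_a[1] - piece_b[1])
    # "Less than or equal to" matches the wording of the property text.
    return top_difference <= threshold and bottom_difference <= threshold


print(on_same_line((700.0, 688.0), (701.5, 689.0)))  # small vertical shift
print(on_same_line((700.0, 688.0), (685.0, 673.0)))  # clearly lower text
```

Raising the threshold makes the grouping more tolerant of superscripts or slightly misaligned text, at the cost of merging tightly spaced lines.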

HasFormFields

Indicates whether the open PDF file has form fields that can be written to.

HasSaved

(Read-only) Indicates that the PDF file has been saved.

IsDocXfaFormat

(Read-only) Indicates if the opened PDF file is in XFA (XML Forms Architecture) format. Support for XFA format PDF files is limited. You can use the PdfViewer component to display XFA-formatted files but you cannot edit them.

ImageCount

(Read-only) Indicates the number of images you can extract from the document.

IsOpen

(Read-only) Indicates if the PDF file has been successfully opened.

OutputName

Specifies the file name to assign to the output PDF file. Be sure to specify an output file name to avoid overwriting the original PDF during the design phase.

Pages

(Read-only) Provides a list of the PdfPage objects. These objects represent the pages in the document.

PageCount

(Read-only) Indicates the number of pages in the document.

SegmentCount

(19.1.14 or later)

(Read-only) Displays the number of segments in the PDF file.

SegmentThreshold

The system compares the amount of white space between these points when comparing two pieces of text to determine if they are part of the same segment of text:

· The amount of white space above the top of the segment

· The amount of white space below the bottom of the segment

Your entry in this property sets the threshold. If the white space is less than or equal to your entry, the system considers the text to be part of the same segment. If it is more than your entry, it considers the text to be in different segments.

The default is 10 points, with a point being equal to 1/72 of an inch.

TableCount

(19.1.14 or later)

(Read-only) Displays the number of tables in the PDF file.

Text

(Read-only) Returns all of the text in the document as a single value. The system omits comments and annotation text.

WordCount

(19.1.14 or later)

(Read-only) Displays the number of words in the PDF file.

WordThreshold

The system looks at the amount of white space between pieces of text to determine if the text comprises a single word or if the white space indicates there are two words.

Your entry in this property sets the threshold. If the space is less than or equal to your entry, the system considers the text to be part of the same word. If it is more than your entry, it considers the text to be different words.

The default is 2.2 points, with a point being equal to 1/72 of an inch.
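The WordThreshold rule can be sketched as a simple grouping pass over the pieces of text on one line. In this Python example, the (text, left, right) tuple layout, the function name, and the sample coordinates are assumptions for illustration. Only the 2.2-point default and the "less than or equal to" comparison come from the description above.

```python
WORD_THRESHOLD_DEFAULT = 2.2  # points; 1 point = 1/72 of an inch


def group_into_words(pieces, threshold=WORD_THRESHOLD_DEFAULT):
    """Join text pieces into words based on the horizontal gap.

    pieces: list of (text, left, right) tuples for one line, sorted
    from left to right, with offsets in points (hypothetical layout).
    """
    words = []
    current_text = ""
    previous_right = None
    for text, left, right in pieces:
        gap_is_large = previous_right is not None and left - previous_right > threshold
        if gap_is_large:
            # The white space exceeds the threshold: start a new word.
            words.append(current_text)
            current_text = ""
        current_text += text
        previous_right = right
    if current_text:
        words.append(current_text)
    return words


line = [("Inv", 10.0, 25.0), ("oice", 25.5, 45.0), ("Total", 50.0, 75.0)]
print(group_into_words(line))
```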

Methods

Method

Description

Return type

Annotate(AnnotationType typ, int pg, string tx, float lf, float rt, float tp, float bt, Color clr)

Adds an annotation on the page you specify, at the position defined by the left, right, top, and bottom coordinates.

Boolean

Annotate(PdfLine line, AnnotationType type, string annotationText, Color color)

Adds an annotation based on the line you specify.

Boolean

Annotate(PdfSegment segment, AnnotationType type, string annotationText, Color color)

Adds an annotation based on the segment you specify.

Boolean

Annotate(PdfWord word, AnnotationType type, string annotationText, Color color)

Adds an annotation based on the word you specify.

Boolean

Annotate(PdfPhrase phrase, AnnotationType type, string annotationText, Color color)

Adds an annotation based on the phrase you specify.

Boolean

AppendPages[fileToAppend]

(19.1.21 or later)

Appends the PDF file that you specify to the currently loaded PDF file.

Boolean

AppendPages[fileToAppend, inputFileName]

(19.1.21 or later)

Appends the PDF file that you specify using the fileToAppend parameter to the target PDF file, specified with the inputFileName parameter.

Boolean

Close()

Closes a PDF file.

Boolean

ConcatFiles[outputFileName, files]

(19.1.21 or later)

Combines the PDF files that you specify into a single PDF file.

You can identify the files with a string array, by typing the file names into the method block, or with a comma-separated list of the individual file names.

Boolean
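Because the files parameter of ConcatFiles can be supplied either as an array or as a comma-separated list, an automation that builds the list dynamically may want to normalize it first. The following Python sketch shows that idea only. The helper name is hypothetical, and the sketch assumes that file names do not themselves contain commas.

```python
def normalize_file_list(files):
    """Return a clean list of file names.

    Accepts either a list of names or one comma-separated string.
    Hypothetical helper; it does not call the PdfConnector component.
    """
    if isinstance(files, str):
        files = files.split(",")
    # Trim surrounding white space and drop empty entries such as the
    # one produced by a trailing comma.
    return [name.strip() for name in files if name.strip()]


print(normalize_file_list("a.pdf, b.pdf,c.pdf,"))
print(normalize_file_list(["a.pdf", " b.pdf "]))
```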

CombineTables(DataTable inTable1, DataTable inTable2, out DataTable outTable)

(19.1.14 or later)

Combines two data tables into a single table. The tables are not required to have the same schema. Column names are not retained.

Boolean

DeleteAnnotation(PdfAnnotation annotation)

Deletes the annotation you specify.

Boolean

ExtractPages[outputFileName, singleFile, pageList]

(19.1.21 or later)

Extracts pages from the currently loaded PDF file and saves those pages as one or more PDF files, using the name that you specify in the outputFileName parameter.

singleFile - Enter True to combine all extracted pages into a single PDF file, specified in the outputFileName parameter. Enter False to create a separate PDF file for each extracted page. Each file name is appended with -Page{pagenumber}.

pageList - List the pages that you want to extract, separated by commas. You can also specify a range of pages. The following is an example: 1,23,6-8,10.

Boolean
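The pageList format described above (single pages and ranges separated by commas) can be validated before it is passed to ExtractPages. The following Python sketch expands such a string into individual page numbers. The function name is hypothetical, and the sketch keeps the pages in the order written; whether the component reorders or de-duplicates them is not stated here.

```python
def parse_page_list(page_list):
    """Expand a string such as "1,23,6-8,10" into a list of page numbers."""
    pages = []
    for part in page_list.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
            if first > last:
                raise ValueError(f"Invalid page range: {part}")
            pages.extend(range(first, last + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_page_list("1,23,6-8,10"))
```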

ExtractPages[inputFileName, outputFileName, singleFile, pageList]

(19.1.21 or later)

Extracts pages from the PDF file that you specify in the inputFileName parameter and saves those pages as one or more PDF files, using the name that you specify in the outputFileName parameter.

singleFile - Enter True to combine all extracted pages into a single PDF file, specified in the outputFileName parameter. Enter False to create a separate PDF file for each extracted page. Each file name is appended with -Page{pagenumber}.

pageList - List the pages that you want to extract, separated by commas. You can also specify a range of pages. The following is an example: 1,23,6-8,10.

Boolean
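To anticipate which files an ExtractPages call produces, the naming rule for singleFile can be sketched as follows. This Python example assumes that the -Page{pagenumber} suffix is placed before the file extension; the description above does not say where the suffix goes, so treat that placement, the helper name, and the sample path as assumptions.

```python
import os


def expected_output_names(output_file_name, pages, single_file):
    """Predict output file names for a page extraction.

    Hypothetical helper. Assumes the -Page{pagenumber} suffix is added
    before the extension, which the documentation does not confirm.
    """
    if single_file:
        # All extracted pages go into the one file that was named.
        return [output_file_name]
    stem, extension = os.path.splitext(output_file_name)
    return [f"{stem}-Page{page}{extension}" for page in pages]


print(expected_output_names("C:/out/report.pdf", [2, 5], single_file=False))
```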

ExtractPagesWithText[outputFileName, singleFile, textToFind, adjacent, pagesBefore, pagesAfter]

(19.1.21 or later)

Extracts pages from the currently loaded PDF file that contain the text that you specify. This method then saves those pages as one or more PDF files, using the name that you specify in the outputFileName parameter.

singleFile - Enter True to combine all extracted pages into a single PDF file, specified in the outputFileName parameter. Enter False to create a separate PDF file for each extracted page. Each file name is appended with -Page{pagenumber}.

textToFind - Enter the text that you want the system to find as it identifies the pages to extract. Case does not matter.

pagesBefore - Specify how many pages before the page that contains the textToFind to extract. For example, if the system finds the textToFind on page 5 and you enter 3, the system extracts pages starting with page 2.

pagesAfter - Specify how many pages after the page that contains the textToFind to extract. For example, if the system finds the textToFind on page 5 and you enter 3, the system extracts pages 6-8.

Boolean
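The pagesBefore and pagesAfter examples above amount to a page window around the page where the text is found. The following Python sketch computes that window. The function name is hypothetical, the clamping to the first and last page of the document is an assumption, and the adjacent parameter is not modeled because its behavior is not described here.

```python
def pages_around_match(match_page, pages_before, pages_after, page_count):
    """Return the page numbers in the window around a matching page.

    Pages are numbered from 1. The window is clamped to the document
    bounds (an assumption, not documented behavior).
    """
    first = max(1, match_page - pages_before)
    last = min(page_count, match_page + pages_after)
    return list(range(first, last + 1))


# Text found on page 5 of a 20-page file, with 3 pages before and after.
print(pages_around_match(5, 3, 3, 20))
```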

ExtractPagesWithText[inputFileName, outputFileName, singleFile, textToFind, adjacent, pagesBefore, pagesAfter]

(19.1.21 or later)

Extracts pages from the PDF file that you specify in the inputFileName parameter that contain the text that you specify. This method then saves those pages as one or more PDF files, using the name that you specify in the outputFileName parameter.

singleFile - Enter True to combine all extracted pages into a single PDF file, specified in the outputFileName parameter. Enter False to create a separate PDF file for each extracted page. Each file name is appended with -Page{pagenumber}.

textToFind - Enter the text that you want the system to find as it identifies the pages to extract. Case does not matter.

pagesBefore - Specify how many pages before the page that contains the textToFind to extract. For example, if the system finds the textToFind on page 5 and you enter 3, the system extracts pages starting with page 2.

pagesAfter - Specify how many pages after the page that contains the textToFind to extract. For example, if the system finds the textToFind on page 5 and you enter 3, the system extracts pages 6-8.

Boolean

FindPage(string searchFor, out int pageNumber)

Finds the first page that contains the text you specify.

Boolean

FindPage(string searchFor, int startPage, out int pageNumber)

Finds the first page that contains the text you specify. The system starts the search on the page number you specify.

Boolean

FindPage(string searchFor, int startPage, int endPage, out int pageNumber)

Finds the first page that contains the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindPage(string searchFor, PdfLine searchAfter, out int pageNumber)

Finds the first page that contains the text you specify. The system starts the search after the line you specify.

Boolean

FindPage(string searchFor, PdfSegment searchAfter, out int pageNumber)

Finds the first page that contains the text you specify. The system starts the search after the segment you specify.

Boolean

FindPage(string searchFor, PdfWord searchAfter, out int pageNumber)

Finds the first page that contains the text you specify. The system starts the search after the word you specify.

Boolean

FindPages(string searchFor, out int[] pageNumbers)

Finds all pages that contain the text you specify.

Boolean

FindPages(string searchFor, int startPage, out int[] pageNumbers)

Finds all pages that contain the text you specify. The system starts the search at the page number you specify.

Boolean

FindPages(string searchFor, int startPage, int endPage, out int[] pageNumbers)

Finds all pages that contain the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindPages(string searchFor, PdfLine searchAfter, out int[] pageNumbers)

Finds all pages that contain the text you specify. The system starts the search after the line you specify.

Boolean

FindPages(string searchFor, PdfSegment searchAfter, out int[] pageNumbers)

Finds all pages that contain the text you specify. The system starts the search after the segment you specify.

Boolean

FindPages(string searchFor, PdfWord searchAfter, out int[] pageNumbers)

Finds all pages that contain the text you specify. The system starts the search after the word you specify.

Boolean
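The FindPage and FindPages overloads above share one pattern: a Boolean result that reports success, plus an out parameter that carries the match. The following Python sketch models that pattern over a plain list of page texts. It does not call the component; the function names, the 1-based page numbering, and the case-sensitive substring match are assumptions for illustration.

```python
def find_pages(page_texts, search_for, start_page=1, end_page=None):
    """Model of FindPages: return (found, page_numbers).

    page_texts: list of strings where index 0 holds the text of page 1.
    """
    if end_page is None:
        end_page = len(page_texts)
    matches = [
        number
        for number in range(start_page, end_page + 1)
        if 1 <= number <= len(page_texts) and search_for in page_texts[number - 1]
    ]
    return bool(matches), matches


def find_page(page_texts, search_for, start_page=1, end_page=None):
    """Model of FindPage: return (found, first_page_number).

    The page number is 0 when nothing is found (a placeholder choice).
    """
    found, matches = find_pages(page_texts, search_for, start_page, end_page)
    return found, (matches[0] if found else 0)


pages = ["Cover", "Invoice 1001", "Terms", "Invoice 1002"]
print(find_page(pages, "Invoice"))      # search the whole document
print(find_page(pages, "Invoice", 3))   # start the search on page 3
print(find_pages(pages, "Invoice"))     # collect every matching page
```

Checking the Boolean result before using the out value keeps an automation from acting on a page number that was never set.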

FindLine(string searchFor, out PdfLine line)

Finds the first line that contains the text you specify.

Boolean

FindLine(string searchFor, int startPage, out PdfLine line)

Finds the first line that contains the text you specify, starting the search at the page number you specify.

Boolean

FindLine(string searchFor, int startPage, int endPage, out PdfLine line)

Finds the first line that contains the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindLine(string searchFor, PdfLine searchAfter, out PdfLine line)

Finds the first line that contains the text you specify. The system starts the search after the line you specify.

Boolean

FindLine(string searchFor, PdfSegment searchAfter, out PdfLine line)

Finds the first line that contains the text you specify. The system starts the search after the segment you specify.

Boolean

FindLine(string searchFor, PdfWord searchAfter, out PdfLine line)

Finds the first line that contains the text you specify. The system starts the search after the word you specify.

Boolean

FindLines(string searchFor, out PdfLine[] lines)

Finds all of the lines that contain the text you specify.

Boolean

FindLines(string searchFor, int startPage, out PdfLine[] lines)

Finds all of the lines that contain the text you specify, starting the search at the page number you specify.

Boolean

FindLines(string searchFor, int startPage, int endPage, out PdfLine[] lines)

Finds all of the lines that contain the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindLines(string searchFor, PdfLine searchAfter, out PdfLine[] lines)

Finds all of the lines that contain the text you specify. The system starts the search after the line you specify.

Boolean

FindLines(string searchFor, PdfSegment searchAfter, out PdfLine[] lines)

Finds all of the lines that contain the text you specify. The system starts the search after the segment you specify.

Boolean

FindLines(string searchFor, PdfWord searchAfter, out PdfLine[] lines)

Finds all of the lines that contain the text you specify. The system starts the search after the word you specify.

Boolean

FindPhrase(string searchFor, out PdfPhrase phrase)

Finds the first occurrence of the text you specify.

Boolean

FindPhrase(string searchFor, int startPage, out PdfPhrase phrase)

Finds the first occurrence of the text you specify, starting the search at the page number you specify.

Boolean

FindPhrase(string searchFor, int startPage, int endPage, out PdfPhrase phrase)

Finds the first occurrence of the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindPhrases(string searchFor, out PdfPhrase[] phrases)

Finds all occurrences of the text you specify.

Boolean

FindPhrases(string searchFor, int startPage, out PdfPhrase[] phrases)

Finds all occurrences of the text you specify, starting the search at the page number you specify.

Boolean

FindPhrases(string searchFor, int startPage, int endPage, out PdfPhrase[] phrases)

Finds all occurrences of the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindRelativeLine(string searchFor, int occurrence, int relativeLineOffset, out PdfLine line)

Finds a specific occurrence of a line. The system returns a line relative to the line it finds.

Boolean
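A relative search is useful when a label and its value sit on different lines, for example a "Name:" label followed by the value on the next line. The following Python sketch models the idea behind FindRelativeLine over a list of line texts. It assumes that occurrence counts from 1 and that the offset may be negative to look backward; neither detail is confirmed by the description above, and the function name is hypothetical.

```python
def find_relative_line(lines, search_for, occurrence, relative_line_offset):
    """Return (found, line_text) for a line relative to a match.

    lines: list of line texts in document order.
    occurrence: which match to use, counted from 1 (assumed).
    relative_line_offset: lines to move from the match; negative values
    move toward the start of the document (assumed).
    """
    seen = 0
    for index, text in enumerate(lines):
        if search_for in text:
            seen += 1
            if seen == occurrence:
                target = index + relative_line_offset
                if 0 <= target < len(lines):
                    return True, lines[target]
                return False, None  # the offset points outside the document
    return False, None  # fewer matches than the requested occurrence


lines = ["Name:", "Ada", "Name:", "Grace", "Total"]
print(find_relative_line(lines, "Name:", 2, 1))
```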

FindRelativeSegment(string searchFor, int occur, int relSegOffset, out PdfSegment seg)

Finds a specific occurrence of a segment. The system returns a segment relative to the segment it found.

Boolean

FindSegment(string searchFor, out PdfSegment segment)

Finds the first segment that contains the text you specify.

Boolean

FindSegment(string searchFor, int startPage, out PdfSegment segment)

Finds the first segment that contains the text you specify, starting the search at the page number you specify.

Boolean

FindSegment(string searchFor, int startPage, int endPage, out PdfSegment segment)

Finds the first segment that contains the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindSegment(string searchFor, PdfLine searchAfter, out PdfSegment segment)

Finds the first segment that contains the text you specify. The system starts the search after the line you specify.

Boolean

FindSegment(string searchFor, PdfSegment searchAfter, out PdfSegment segment)

Finds the first segment that contains the text you specify. The system starts the search after the segment you specify.

Boolean

FindSegment(string searchFor, PdfWord searchAfter, out PdfSegment segment)

Finds the first segment that contains the text you specify. The system starts the search after the word you specify.

Boolean

FindSegments(string searchFor, out PdfSegment[] segments)

Finds all segments that contain the text you specify.

Boolean

FindSegments(string searchFor, int startPage, out PdfSegment[] segments)

Finds all segments that contain the text you specify, starting the search at the page number you specify.

Boolean

FindSegments(string searchFor, int startPage, int endPage, out PdfSegment[] segments)

Finds all segments that contain the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindSegments(string searchFor, PdfLine searchAfter, out PdfSegment[] segments)

Finds all segments that contain the text you specify. The system starts the search after the line you specify.

Boolean

FindSegments(string searchFor, PdfSegment searchAfter, out PdfSegment[] segments)

Finds all segments that contain the text you specify. The system starts the search after the segment you specify.

Boolean

FindSegments(string searchFor, PdfWord searchAfter, out PdfSegment[] segments)

Finds all segments that contain the text you specify. The system starts the search after the word you specify.

Boolean

FindWord(string searchFor, out PdfWord word)

Finds the first word that contains the text you specify.

Boolean

FindWord(string searchFor, int startPage, out PdfWord word)

Finds the first word that contains the text you specify, starting the search at the page number you specify.

Boolean

FindWord(string searchFor, int startPage, int endPage, out PdfWord word)

Finds the first word that contains the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindWord(string searchFor, PdfLine searchAfter, out PdfWord word)

Finds the first word that contains the text you specify. The system starts the search after the line you specify.

Boolean

FindWord(string searchFor, PdfSegment searchAfter, out PdfWord word)

Finds the first word that contains the text you specify. The system starts the search after the segment you specify.

Boolean

FindWord(string searchFor, PdfWord searchAfter, out PdfWord word)

Finds the first word that contains the text you specify. The system starts the search after the word you specify.

Boolean

FindWords(string searchFor, out PdfWord[] words)

Finds all words that contain the text you specify.

Boolean

FindWords(string searchFor, int startPage, out PdfWord[] words)

Finds all words that contain the text you specify, starting the search at the page number you specify.

Boolean

FindWords(string searchFor, int startPage, int endPage, out PdfWord[] words)

Finds all words that contain the text you specify. You must also specify the page numbers on which you want the search to start and end.

Boolean

FindWords(string searchFor, PdfLine searchAfter, out PdfWord[] words)

Finds all words that contain the text you specify. The system starts the search after the line you specify.

Boolean

FindWords(string searchFor, PdfSegment searchAfter, out PdfWord[] words)

Finds all words that contain the text you specify. The system starts the search after the segment you specify.

Boolean

FindWords(string searchFor, PdfWord searchAfter, out PdfWord[] words)

Finds all words that contain the text you specify. The system starts the search after the word you specify.

Boolean

FindRelativeWord(string searchFor, int occur, int relativeWordOffset, out PdfWord word)

Finds a specific occurrence of a word. The system returns a word relative to the word the system finds.

Boolean

FlattenFields

(19.1.126 or later)

Flattens the form fields in the document, so that you cannot edit any of the fields.

Boolean

GetAnnotation(out PdfAnnotation annotation)

Gets the first annotation the system finds.

Boolean

GetAnnotation(int startPage, out PdfAnnotation annotation)

Gets the first annotation the system finds, beginning with the page number you specify.

Boolean

GetAnnotation(int startPage, int endPage, out PdfAnnotation annotation)

Gets the first annotation the system finds within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetAnnotation(AnnotationType type, out PdfAnnotation annotation)

Gets the first annotation the system finds of the annotation type you specified.

Boolean

GetAnnotation(AnnotationType type, int startPage, out PdfAnnotation annotation)

Gets the first annotation the system finds of the annotation type you specified. The system starts the search at the page number you specify.

Boolean

GetAnnotation(AnnotationType type, int startPage, int endPage, out PdfAnnotation annot)

Gets the first annotation the system finds of the annotation type you specified within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetAnnotations(out PdfAnnotation[] annotations)

Gets all of the annotations in the PDF file.

Boolean

GetAnnotations(int startPage, out PdfAnnotation[] annotations)

Gets all of the annotations, starting at the page number you specify.

Boolean

GetAnnotations(int startPage, int endPage, out PdfAnnotation[] annotations)

Gets all of the annotations within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetAnnotations(AnnotationType type, out PdfAnnotation[] annotations)

Gets all of the annotations of the type you specify.

Boolean

GetAnnotations(AnnotationType type, int startPage, out PdfAnnotation[] annotations)

Gets all of the annotations of the type you specify, starting at the page number you specify.

Boolean

GetAnnotations(AnnotationType type, int start, int end, out PdfAnnotation[] annots)

Gets all of the annotations of the type you specify within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetImage(out Image image)

Extracts the first image the system finds.

Boolean

GetImage(int startPage, out Image image)

Extracts the first image the system finds, starting the search at the page number you specify.

Boolean

GetImage(int startPage, int endPage, out Image image)

Extracts the first image the system finds within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetImages(out Image[] images)

Extracts all images.

Boolean

GetImages(int startPage, out Image[] images)

Extracts all images, starting at the page number you specify.

Boolean

GetImages(int startPage, int endPage, out Image[] images)

Extracts all images within a range of pages. You must specify the page numbers on which you want the search to start and end.

Boolean

GetTable(out DataTable table, TableFill tableFill)

(19.1.14 or later)

Gets the first table found in the document.

Boolean

GetTable(out DataTable table, TableFill tableFill, Int32 startPage)

(19.1.14 or later)

Gets the first table found, starting with the startPage.

Boolean

GetTable(out DataTable table, TableFill tableFill, Int32 startPage, Int32 endPage)

(19.1.14 or later)

Gets the first table found between the startPage and the endPage.

Boolean

GetTable(out DataTable table, TableFill tableFill, string startText, Boolean canSpanPages, Int32 distanceFromBottom, string[] endText)

(19.1.14 or later)

Gets the first table found after the startText is located in the document and before one of the endText items is found. Set canSpanPages to True if the table spans pages.

Tables that span pages must be located consecutively in the document and be the last table on a page and the first table on the succeeding page.

The distanceFromBottom is the number of points (1/72 of an inch) where the table stops before continuing to the next page. Any data below this point is ignored. Entering zero (0) tells the system to ignore this setting and continue processing to the end of the page.

Boolean
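The distanceFromBottom rule can be pictured as a cutoff line near the foot of each page, which is handy for skipping footers when a table continues on the next page. The following Python sketch filters table rows against such a cutoff. The (bottom, cells) row layout and the function name are hypothetical, and keeping a row that sits exactly on the cutoff is an assumption.

```python
def rows_above_cutoff(rows, distance_from_bottom):
    """Keep the rows that lie at or above the cutoff.

    rows: list of (bottom, cells) tuples, where bottom is the row's
    offset in points from the bottom of the page.
    distance_from_bottom: cutoff in points; 0 disables the cutoff.
    """
    if distance_from_bottom == 0:
        # Zero means: ignore the setting and process to the end of the page.
        return [cells for _, cells in rows]
    return [cells for bottom, cells in rows if bottom >= distance_from_bottom]


rows = [
    (300.0, ["Widget", "4"]),
    (120.0, ["Gadget", "2"]),
    (40.0, ["Page 3 of 9"]),  # footer text below the table
]
print(rows_above_cutoff(rows, 72))  # 72 points = 1 inch from the bottom
```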

GetTable(out DataTable table, TableFill tableFill, Int32 startPage, string startText, Boolean canSpanPages, Int32 distanceFromBottom, string[] endText)

(19.1.14 or later)

Gets the first table found starting from the startPage and searching after the location where the startText is located in the document and before one of the endText items is found.

Set canSpanPages to True if the table spans pages. Tables that span pages must be located consecutively in the document and be the last table on a page and the first table on the succeeding page.

The distanceFromBottom is the number of points (1/72 of an inch) where the table stops before continuing to the next page. Any data below this point is ignored. Entering zero (0) tells the system to ignore this setting and continue processing to the end of the page.

Boolean

GetTable(out DataTable table, TableFill tableFill, Int32 startPage, Int32 endPage, string startText, Boolean canSpanPages, Int32 distanceFromBottom, string[] endText)

(19.1.14 or later)

Gets the first table found searching between the startPage and the endPage after the location where the startText is located in the document and before one of the endText items is found.

Set canSpanPages to True if the table spans pages. Tables that span pages must be located consecutively in the document and be the last table on a page and the first table on the succeeding page.

The distanceFromBottom is the number of points (1/72 of an inch) where the table stops before continuing to the next page. Any data below this point is ignored. Entering zero (0) tells the system to ignore this setting and continue processing to the end of the page.

Boolean

GetTables(out DataTable[] tables, TableFill tableFill)

(19.1.14 or later)

Gets all tables found in the document.

Boolean

GetTables(out DataTable[] tables, TableFill tableFill, Int32 startPage)

(19.1.14 or later)

Gets all the tables found starting on the startPage to the end of the document.

Boolean

GetTables(out DataTable[] tables, TableFill tableFill, Int32 startPage, Int32 endPage)

(19.1.14 or later)

Gets all the tables found between the startPage and the endPage in the document.

Boolean

GetValues(out DataTable resultTable, out string documentType)

(19.1.14 or later)

Gets a table with all Text and Optical Mark values in the current document type and returns the detected document type name.

Boolean

InsertPages[insertBeforePage, fileToInsert]

(19.1.21 or later)

Inserts pages in the currently loaded PDF file before the page that you specify.

insertBeforePage - Enter the page number. For example, if you enter 5, the system inserts the pages of the file that you specify in the fileToInsert parameter before page 5.

fileToInsert - Enter the name of the PDF file that you want to insert.

Boolean
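The effect of insertBeforePage on page order can be sketched with plain lists. In this Python example, each page is represented by a label, the function name is hypothetical, and the sketch assumes 1-based page numbers as in the example above. It does not read or write PDF files.

```python
def insert_pages(document_pages, insert_before_page, pages_to_insert):
    """Return a new page order with pages inserted before a given page.

    insert_before_page is 1-based; a value one past the last page
    appends at the end (an assumption for this sketch).
    """
    index = insert_before_page - 1
    if not 0 <= index <= len(document_pages):
        raise ValueError("insert_before_page is outside the document")
    return document_pages[:index] + pages_to_insert + document_pages[index:]


original = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(insert_pages(original, 5, ["x1", "x2"]))
```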

InsertPages[inputFileName, insertBeforePage, fileToInsert]

(19.1.21 or later)

Inserts pages in the PDF file that you specify using the inputFileName parameter before the page that you specify.

insertBeforePage - Enter the page number. For example, if you enter 5, the system inserts the pages of the file that you specify in the fileToInsert parameter before page 5.

fileToInsert - Enter the name of the PDF file that you want to insert.

Boolean

GetPage(int pageNumber)

Gets the PdfPage object that corresponds to the page number you specify.

PdfPage

Reconcile(out DataTable reconciledTable)

(19.1.14 or later)

Displays a user interface with the current PDF file and the values from the detected document type side by side. This interface allows a user to confirm that the data extracted from the PDF is correct and make corrections to it if necessary.

This method outputs a data table with the original data and the corrected data.

Boolean

Reconcile(Double zoomFactorIn, out DataTable reconciledTable, out Double zoomFactorOut)

(19.1.14 or later)

Displays a user interface with the current PDF file and the values from the detected document type side by side. This interface allows a user to confirm that the data extracted from the PDF is correct and make corrections to it if necessary.

This method accepts a zoomFactorIn value, which sets the zoom on the embedded PDF Viewer. It outputs a data table with the original data and the corrected data, plus a zoomFactorOut value, which is the zoom factor in effect when the dialog was dismissed. Save this value and use it as the zoomFactorIn on subsequent method calls.

If the user cancels the dialog, the result is False.

Boolean
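The advice to carry zoomFactorOut into the next call can be sketched as a loop that processes several documents in a row. The following Python example uses a stand-in function in place of the real Reconcile dialog, so everything about reconcile_stub (its name, its return shape, and the 25% zoom change it simulates) is hypothetical. Only the carry-over pattern and the cancel check reflect the description above.

```python
def reconcile_stub(zoom_factor_in):
    """Stand-in for the Reconcile dialog (hypothetical).

    Pretends the user confirmed the data and zoomed in by 25%.
    Returns (result, reconciled_table, zoom_factor_out).
    """
    reconciled_table = [{"field": "Total", "original": "10", "corrected": "10"}]
    return True, reconciled_table, zoom_factor_in * 1.25


def process_documents(document_names, initial_zoom=1.0):
    """Reconcile documents in order, reusing the user's last zoom."""
    zoom = initial_zoom
    zoom_used = []
    for name in document_names:
        result, _table, zoom_out = reconcile_stub(zoom)
        if not result:
            break  # the user cancelled the dialog
        zoom_used.append((name, zoom))
        zoom = zoom_out  # carry the zoom into the next call
    return zoom_used


print(process_documents(["a.pdf", "b.pdf"]))
```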

Redact

(19.1.126 or later)

Redacts all the information defined by redaction landmarks and flattens the form fields in the document.

Boolean

RedactRectangle

(19.1.126 or later)

Redacts the information inside a rectangle that you define on a specific page and flattens the form fields in the document.

Boolean

Save()

Saves a PDF file.

Boolean

SplitByText[inputFileName, outputFileName, textToFind]

(19.1.21 or later)

Splits the PDF file that you specify in the inputFileName parameter into one or more PDF files. Each time the system finds the text that you specify in the textToFind parameter, it creates a new PDF file.

outputFileName - Enter the file name that you want to assign to the newly created PDF files. Each file name is then appended with -{filenumber}.

textToFind - Enter the text that you want the system to find as it identifies where to split the PDF file. Case does not matter.

Boolean
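The splitting rule for SplitByText can be sketched over a list of page texts. This Python example assumes that the page on which the text is found becomes the first page of a new file and that any pages before the first match form their own file. Neither detail is stated above. The case-insensitive match follows the textToFind description, and the function name is hypothetical.

```python
def split_by_text(page_texts, text_to_find):
    """Group pages into files, starting a new group at each match.

    page_texts: list of page texts in document order.
    Returns a list of groups, each group being a list of page texts.
    """
    needle = text_to_find.lower()  # case does not matter
    groups = []
    for text in page_texts:
        starts_new_file = needle in text.lower()
        if starts_new_file or not groups:
            groups.append([])
        groups[-1].append(text)
    return groups


pages = ["INVOICE 1", "details", "Invoice 2", "details", "more"]
for file_number, group in enumerate(split_by_text(pages, "invoice"), start=1):
    print(file_number, group)
```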

Events

Event

Description

FileOpened

Occurs when a file is opened.

OutputSaved

Occurs when the PDF file is saved.

 

Related types

PdfPage

The PdfPage type has these properties and methods.

Properties

Property

Description

Text

(Read-only) Returns all of the text on the page as a single value.

PdfLines

(Read-only) Returns a list of the lines on the page.

PdfSegments

(Read-only) Returns a list of the segments on the page.

PdfWords

(Read-only) Returns a list of the words on the page.

Images

(Read-only) Returns a list of the images on the page.

Annotations

(Read-only) Returns a list of the annotations on the page.

PageNumber

(Read-only) Returns the page number of the page.

LineCount

(Read-only) Returns the number of lines on the page.

SegmentCount

(Read-only) Returns the number of segments on the page.

WordCount

(Read-only) Returns the number of words on the page.

ImageCount

(Read-only) Returns the number of images on the page.

AnnotationCount

(Read-only) Returns the number of annotations on the page.

Methods

Method

Description

Return type

FindLine(string searchFor, out PdfLine line)

Finds the first line on the page that contains the text you specify.

Boolean

FindLines(string searchFor, out PdfLine[] lines)

Finds all of the lines on the page that contain the text you specify.

Boolean

FindSegment(string searchFor, out PdfSegment segment)

Finds the first segment on the page that contains the text you specify.

Boolean

FindSegments(string searchFor, out PdfSegment[] segments)

Finds all of the segments on the page that contain the text you specify.

Boolean

FindWord(string searchFor, out PdfWord word)

Finds the first word on the page that contains the text you specify.

Boolean

FindWords(string searchFor, out PdfWord[] words)

Finds all of the words on the page that contain the text you specify.

Boolean

GetLine(int lineNumber)

Gets the line on the page that corresponds to the line number you specify.

PdfLine

GetSegment(int segmentNumber)

Gets the segment on the page that corresponds to the segment number you specify.

PdfSegment

GetWord(int wordNumber)

Gets the word on the page that corresponds to the word number you specify.

PdfWord

GetImage(int imageNumber)

Gets the nth image on the page. You specify the ordinal number of the image you want.

Image

GetAnnotation(int annotationNumber)

Gets the nth annotation on the page. You specify the ordinal number of the annotation you want.

PdfAnnotation

Annotate(AnnotationType type, string tx, float lft, float rt, float tp, float bt, Color color)

Adds an annotation to the page. You specify the position of the annotation as float values for its left, right, top, and bottom coordinates.

void

GetRelativePage(int relativePageOffset, out PdfPage resultPage)

Gets a page relative to this page.

Boolean
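
The Find methods in this table return a Boolean and deliver their result through an out parameter. The following Python sketch models that pattern with a returned pair instead. The Line and Page classes are hypothetical stand-ins, not the PdfPage type, and the sketch assumes a case-sensitive substring match and line numbers that start at 1. This page does not confirm either detail.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    text: str
    line_number: int


@dataclass
class Page:
    lines: list = field(default_factory=list)

    def find_line(self, search_for):
        """Model FindLine: return (True, first match) or (False, None)."""
        for line in self.lines:
            if search_for in line.text:
                return True, line
        return False, None

    def find_lines(self, search_for):
        """Model FindLines: return (True, all matches) or (False, [])."""
        matches = [line for line in self.lines if search_for in line.text]
        return bool(matches), matches


page = Page([
    Line("Invoice 1001", 1),
    Line("Total due: 40.00", 2),
    Line("Total paid: 0.00", 3),
])

found, line = page.find_line("Total")
if found:
    print(line.line_number, line.text)
```

Checking the Boolean before using the result mirrors how an automation would branch on the return value before reading the out parameter.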

 

PdfLine

The PdfLine type has these properties and methods.

Properties

Property

Description

Text

(Read-only) Returns the text of the line.

PageNumber

(Read-only) Returns the page number where the line is located.

LineNumber

(Read-only) Returns the line’s ordinal line number.

Left

(Read-only) Returns the line’s left position on the page.

Right

(Read-only) Returns the line’s right position on the page.

Top

(Read-only) Returns the line’s top position on the page.

Bottom

(Read-only) Returns the line’s bottom position on the page.
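
Because Top and Bottom are offsets from the bottom of the page, drawing a box in a coordinate system whose origin is the top-left corner requires flipping the vertical axis. The following Python sketch shows the arithmetic. The function name and the page_height input are hypothetical, since this page does not list a page-height property, and the value 792 is only an illustrative height.

```python
def to_top_left_rect(left, right, top, bottom, page_height):
    """Convert a bottom-origin box to (x, y, width, height) measured from the page top."""
    x = left
    y = page_height - top  # distance from the top edge of the page to the top of the box
    width = right - left
    height = top - bottom
    return (x, y, width, height)


# Illustrative values only: a box near the top of a page that is 792 units tall.
rect = to_top_left_rect(left=72, right=300, top=720, bottom=700, page_height=792)
```

The same conversion applies to the Left, Right, Top, and Bottom properties of the other related types on this page.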

 

Methods

Method

Description

Return type

GetRelativeLine(int relativeLineOffset, out PdfLine resultLine)

Gets a line that is relative to this line.

Boolean
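
The GetRelative methods on this page share one pattern: start from the current item, apply an offset, and report through the Boolean return value whether the target exists. The following Python sketch models that pattern on a plain list. The function name is hypothetical, and the sketch assumes that a negative offset moves toward earlier items and that a target outside the list returns false. This page does not confirm either detail.

```python
def get_relative(items, current_index, relative_offset):
    """Return (True, item) at current_index + relative_offset, or (False, None) if out of range."""
    target = current_index + relative_offset
    if 0 <= target < len(items):
        return True, items[target]
    return False, None


lines = ["Customer: A. Smith", "Account: 12345", "Balance: 40.00"]

# Starting from the "Account" line, read the line that follows it.
found, next_line = get_relative(lines, current_index=1, relative_offset=1)
```

A typical use is to find a label with one of the Find methods and then read the value on a neighboring line.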

 

PdfSegment

The PdfSegment type has these properties and methods.

Properties

Property

Description

Text

(Read-only) Returns the text of the segment.

PageNumber

(Read-only) Returns the page number where the segment is located.

SegmentNumber

(Read-only) Returns the segment’s ordinal segment number.

Left

(Read-only) Returns the segment’s left position on the page.

Right

(Read-only) Returns the segment’s right position on the page.

Top

(Read-only) Returns the segment’s top position on the page.

Bottom

(Read-only) Returns the segment’s bottom position on the page.

Methods

Method

Description

Return type

GetRelativeSegment(int relativeSegmentOffset, out PdfSegment resultSegment)

Gets a segment relative to this segment.

Boolean

 

PdfWord

The PdfWord type has these properties and methods.

Properties

Property

Description

Text

(Read-only) Returns the text of the word.

PageNumber

(Read-only) Returns the page number where the word is located.

WordNumber

(Read-only) Returns the word’s ordinal word number.

Left

(Read-only) Returns the word’s left position on the page.

Right

(Read-only) Returns the word’s right position on the page.

Top

(Read-only) Returns the word’s top position on the page.

Bottom

(Read-only) Returns the word’s bottom position on the page.

Methods

Method

Description

Return type

GetRelativeWord(int relativeWordOffset, out PdfWord resultWord)

Gets a word relative to this word.

Boolean

 

PdfPhrase

The PdfPhrase type has these properties.

Properties

Property

Description

Text

(Read-only) Returns the text of the phrase.

PageNumber

(Read-only) Returns the page number where the phrase is located.

Left

(Read-only) Returns the phrase’s left position on the page.

Right

(Read-only) Returns the phrase’s right position on the page.

Top

(Read-only) Returns the phrase’s top position on the page.

Bottom

(Read-only) Returns the phrase’s bottom position on the page.

 

PdfAnnotation

The PdfAnnotation type has these properties.

Properties

Property

Description

AnnotationType

(Read-only) Returns the type of annotation. The AnnotationType property has these values:

Unknown - The type of annotation is unknown.

Text - PDF viewers typically show a Text annotation as an icon at a specified position. Viewers typically provide a way to show the associated comment text, such as clicking the icon.

Highlight - PDF viewers typically show a Highlight annotation as an area of the page highlighted in a specified color. You can choose the color by entering RGB values or color names predefined on the Windows color palette. A Highlight annotation can also include a comment.

Text

(Read-only) Returns the comment text for the annotation.

PageNumber

(Read-only) Returns the page number where the annotation is located.

Left

(Read-only) Returns the annotation’s left position on the page.

Right

(Read-only) Returns the annotation’s right position on the page.

Top

(Read-only) Returns the annotation’s top position on the page.

Bottom

(Read-only) Returns the annotation’s bottom position on the page.

Color

(Read-only) Returns a value of datatype Color. This value can be stored in a variable of datatype Color, assigned to a property of datatype Color (like a button’s BackColor), passed as a parameter to a method that expects a color, and so on.

 

DocumentType annotations

In build 19.1.21 and later, you can use DocumentType annotations with the Annotation methods. These methods let you work with annotations that are associated with DocumentType values. You can annotate DocumentType values directly by using the following methods. DocumentType values that support annotation include the following:

For example, you can use these methods to add an annotation to the specified value.

Method

Description

Return type

Annotate(annotationType, annotationText, color)

Use this method to add an annotation.

Boolean

DeleteAnnotation(index)

Use this method to delete an annotation.

Boolean

DeleteAnnotation(annotation)

Use this method to delete an annotation. A PdfAnnotation proxy is the input for this method. You can use either GetAnnotation method to retrieve a PdfAnnotation proxy.

Boolean

GetAnnotation(index)

Use this method to retrieve an annotation.

Boolean

GetAnnotations(annotation)

Use this method to retrieve the annotation proxy that you specify in the Annotation parameter.

For example, if you enter GetAnnotation(1), the system returns a PdfAnnotation proxy that you can then pass to the DeleteAnnotation method.

Boolean
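
The retrieve-then-delete flow described above can be sketched in Python. The Annotation and AnnotatedValue classes are hypothetical stand-ins, not the product's types. The sketch assumes zero-based indexes and assumes that the delete operation returns false when the target is missing. Neither detail is confirmed by this page.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare annotations by identity, like distinct proxies
class Annotation:
    annotation_type: str
    text: str
    color: str


@dataclass
class AnnotatedValue:
    annotations: list = field(default_factory=list)

    def annotate(self, annotation_type, annotation_text, color):
        """Model Annotate: add an annotation and report success."""
        self.annotations.append(Annotation(annotation_type, annotation_text, color))
        return True

    def get_annotation(self, index):
        """Model GetAnnotation(index). Assumption: indexes are zero-based here."""
        if 0 <= index < len(self.annotations):
            return self.annotations[index]
        return None

    def delete_annotation(self, target):
        """Model both DeleteAnnotation forms: accept an index or an annotation object."""
        if isinstance(target, int):
            if not 0 <= target < len(self.annotations):
                return False
            del self.annotations[target]
            return True
        for position, existing in enumerate(self.annotations):
            if existing is target:
                del self.annotations[position]
                return True
        return False


value = AnnotatedValue()
value.annotate("Highlight", "Check amount", "Yellow")
value.annotate("Text", "Reviewed", "Blue")

proxy = value.get_annotation(0)           # retrieve the annotation first
deleted = value.delete_annotation(proxy)  # then pass it to the delete call
```

Deleting by object rather than by index avoids removing the wrong annotation if the list changes between the two calls.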

 

 

 



Updated: 01 July 2024

© 2016 - 2024 Pegasystems Inc.  Cambridge, MA All rights reserved.

 
