IdentifyPDFDocuments

A step based on the IdentifyPDFDocuments step template identifies the documents in one or more PDF files. For each document, the step determines page and sheet counts and the values of document properties. The step writes the properties and their values to a document properties file.
Input to the step can be:
  • A single PDF file.
  • Multiple PDF files packaged as a ZIP file.
  • One or more complete sets of files, with each set containing a PDF file and other supporting files.

    A set is a group of files that must be processed together, such as a data file, a job ticket (JDF file), and an overrides file.

When the step receives multiple PDF files as input, it combines them into a single PDF file.

When the step receives multiple sets of PDF and JDF files as input, it combines them into a single PDF file and a single JDF file. The step takes page exceptions for media, sides, and stapling in the JDF input files and adds them to the combined JDF file.

    Note:
  • For all documents in the job, the workflow sets the output bin and the finishing options for punching, folding, and binding.
  • When the step creates the combined JDF file, it includes only values that RICOH ProcessDirector supports. The step discards unsupported values.

You must specify a control file on the step. The default control file treats each PDF file as a document. If any PDF file has more than one document, you must provide a control file that you created with RICOH ProcessDirector Plug-in for Adobe Acrobat. The control file must contain a page group definition. If you need the step to extract the values of document properties, the control file must also map data in the documents to document properties.

IdentifyPDFDocuments sets the values of properties that are related to the document in its original job:

  • Sequence in child job
  • Original pages
  • Original sheets
  • Original input file for documents
  • Original first page
    Note:
  • Original input file for documents is not set if the input to the step is a single PDF file.
  • Original first page is not displayed in the user interface, but other steps use that property.
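As a rough illustration of how a sheet count relates to a page count, the sketch below assumes that every page in a document uses the same sides setting and that no JDF page exceptions apply. The function name sheets_for is hypothetical and is not part of RICOH ProcessDirector; the step's own calculation may differ.

```python
import math


def sheets_for(pages, duplex):
    """Estimate sheets for one document (illustrative assumption only).

    Duplex printing places two pages on each sheet, so an odd page
    count rounds up. Simplex printing uses one sheet per page.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return math.ceil(pages / 2) if duplex else pages
```

Under this assumption, a five-page document needs three sheets when Duplex is Yes and five sheets when Duplex is No, which is why changing the Duplex value can affect calculated sheet totals.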

Job property defaults

  • Duplex: Yes
  • Identify PDF control file: /aiw/aiw1/testfiles/Default.ctl (Linux) or C:\aiw\aiw1\testfiles\Default.ctl (Windows)
  • Auxiliary input file extension:
  • Headers file:
  • Page exceptions for sides: Replace with job value

Usage Notes

  • A step that is based on this step template cannot be used to process an encrypted PDF file.
  • If you submit multiple PDF files packaged as a ZIP file to a workflow with the IdentifyPDFDocuments step, the ZIP file must contain only PDF files. If it contains other files, the step goes into the error state.
  • When processing PDF files packaged as a ZIP file, the step adds the PDF files to the output PDF file based on their order or timestamps. For example:
    • You submit a ZIP file to the input device. The order of the PDF files in the output PDF file matches the order in which the PDF files were added to the ZIP file.
    • You specify the List batching method on the input device and set the Create .zip file property to Yes. The order of the PDF files in the output PDF file matches the order of the PDF file names in the list file.
    • You specify any batching method other than List on the input device and set the Create .zip file property to Yes. The order of the PDF files in the output PDF file is based on the timestamp of each PDF file in the ZIP file.
  • To submit one or more complete sets of files (with each set containing a PDF file and other supporting files) to a workflow with the IdentifyPDFDocuments step, specify one of these batching methods on the input device: Number of sets, Pages in sets, or Sets by time.
  • If the IdentifyPDFDocuments step produces a combined JDF file, we recommend that you run a step based on the OptimizeJDF step template to combine the page exceptions. Place the step after the BuildPDFFromDocuments step.
  • If you change the value of the Duplex property, the Total sheets property that RICOH ProcessDirector calculates might not match the total number of job sheets that actually stack at the printer.
  • If the PDF job has a JDF job ticket that specifies a combination of simplex and duplex pages, use the Page exceptions for sides property to set how the step combines the Duplex value for the job with the JDF sides settings.
  • If you plan to use the RICOH ProcessDirector viewer to search document properties to find specific documents in a PDF file, you must include an IdentifyPDFDocuments step in your workflow.
  • Place the IdentifyPDFDocuments step after all steps in the workflow that modify the PDF file. If you place the step before a step that modifies the PDF file, unexpected results can occur.
  • Do not modify the JDF file between the IdentifyPDFDocuments step and the BuildPDFFromDocuments step. Modifications made between those steps can cause the JDF file to be incorrect.
  • Process a PDF job with either the IdentifyPDFDocuments step or the IdentifyPDFDocumentsFromZip step, not both. We recommend that you use the IdentifyPDFDocuments step.
  • If you get unexpected results when you process a PDF 2.0 file with a step based on the IdentifyPDFDocuments step template, do one of the following:
    • Upgrade the control file to the latest version.
    • Place a step based on the OptimizePDF step template in the workflow before the IdentifyPDFDocuments step.
  • To update the control file, use the version of RICOH ProcessDirector Plug-in for Adobe Acrobat that is supplied with RICOH ProcessDirector version 3.6 or later. After you update the control file, copy it to the location where RICOH ProcessDirector reads it.
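
The usage notes on ZIP input can be checked before submission with a small script. The following is a minimal sketch, not a RICOH ProcessDirector tool: the function names are hypothetical, files are judged to be PDF files only by their .pdf extension, and the timestamp ordering assumes ascending order of the modification time stored in each ZIP entry, which might not be exactly how the step orders files.

```python
import zipfile


def non_pdf_members(zip_path):
    """Return names of ZIP entries that are not PDF files (by extension).

    A non-empty result suggests that the step would go into the error state.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir() and not info.filename.lower().endswith(".pdf")
        ]


def pdf_members_by_timestamp(zip_path):
    """Return PDF entry names sorted by the timestamp stored in the ZIP.

    Assumption: oldest first. Entries with equal timestamps keep archive order.
    """
    with zipfile.ZipFile(zip_path) as archive:
        pdfs = [
            info
            for info in archive.infolist()
            if not info.is_dir() and info.filename.lower().endswith(".pdf")
        ]
    return [info.filename for info in sorted(pdfs, key=lambda i: i.date_time)]
```

If non_pdf_members returns an empty list, the ZIP file contains only files with a .pdf extension. The sketch does not detect encrypted PDF files, which the step also cannot process.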