Optimizing BuildPDFFromDocuments steps

The BuildPDFFromDocuments step is a resource-intensive step that assembles a PDF job from documents that have been identified by RICOH ProcessDirector. If a job contains a large number of documents or if you use complex rules in your file, this step can take hours rather than minutes to complete. Advanced configuration options are available to reduce processing time in this step for very large or very complex jobs.
    Important:
  • This configuration is highly advanced. It is not recommended for all RICOH ProcessDirector implementations.

Your system is a good candidate for this configuration if all of these are true:

  • Processing a single PDF job faster is more important than total throughput.
  • Your PDF jobs contain many documents or use complex rules that cause processing to take a very long time.
  • You have (or can acquire) adequate hardware resources for the system that the step uses. The system must have many CPU cores, because each thread used for processing requires three or four background threads to manage memory usage and other activities. The system must also have at least 16GB of RAM available for this processing, either on your primary computer or on additional servers that you can set up as secondary servers.
  • Members of your team have the skills and tools necessary to monitor system performance. To tune the system correctly, you must be able to monitor:
    • Memory used by Java processes
    • Java garbage collection
    • System memory usage
    • CPU utilization
    • Resource swapping
    • System I/O

In this configuration, you set up your system to run the BuildPDFFromDocuments step using multiple threads. The configuration requires the step to run in a separate Java Virtual Machine (JVM). To run the step in its own JVM, create a secondary server and tune the BuildPDFFromDocuments step to run on that server. No other steps should be configured to run on that server.

You can create the secondary server on the primary computer (called a local secondary) or on a different system (called a remote secondary).

    Note:
  • To use a remote secondary, you must purchase the Secondary Server feature and install it on that system.

To optimize BuildPDFFromDocuments steps:

  1. Decide which configuration is best for your environment: creating a local secondary or using a remote secondary server.

    To determine whether a local secondary is feasible, review the topic Tuning Java memory allocation. We recommend allocating at least 16GB of RAM to the local secondary server. If your calculations from that topic indicate that you have 16GB of RAM available, or if you can install additional memory on the system, you can use a local secondary server. In that case, skip the remote secondary setup and continue with the step that creates a local secondary server.

    If you need to use a remote secondary, make sure you have 16GB RAM available on the system you plan to use, then continue with the next step.

  2. To set up a remote secondary server, complete the procedures in the section Installing remote secondary servers.
      Note:
    • You must export the /aiw file system so you can access it from the remote secondary server. These procedures provide instructions for configuring NFS and using it to do the export. If you prefer to use a different method, follow your internal procedures to configure the export.

    When you define the secondary server in RICOH ProcessDirector, set the In general server pool property to No and leave the Maximum resource intensive step set to 1.

  3. To set up a local secondary server, complete the procedure Defining secondary servers on the primary computer.
    Set the In general server pool property to No and leave the Maximum resource intensive step set to 1.
      Note:
    • Stop after you create the secondary server. Do not tune the BuildPDFFromDocuments step template yet.
  4. Update the JVM settings for this secondary server.
    1. Open $AIWDATA/config/jvmsettings.cfg in a text editor.
      By default, $AIWDATA is /aiw/aiw1.
    2. Copy the line for the primary server that looks like this:
      primary=-Xmx2048m -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true

      Paste it into a new line in the file.

      The value after primary=-Xmx is the maximum amount of heap memory that the Java runtime environment is allowed to use for the RICOH ProcessDirector primary process. In this example, the primary server can use 2048MB (2GB) of RAM for its heap.

    3. Update the line that you copied to change primary to the value of the Server name property for the secondary server.
    4. Update the value after secondary_server_name=-Xmx to the amount of memory that you have available for this process. The value should be at least 16g.
    5. If the line for your secondary server does not include the -XX:+UseG1GC setting, add it to the end of the line.
    6. Save and close the file.
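
The edits to jvmsettings.cfg in this step can be sketched as a small Python helper. This is only an illustration of the text transformation, not a RICOH-supplied tool: the function name secondary_jvm_line and the server name pdfbuild_secondary are hypothetical, and the sketch assumes the primary line has the form shown above.

```python
G1_FLAG = "-XX:+UseG1GC"


def secondary_jvm_line(primary_line: str, server_name: str, heap: str = "16g") -> str:
    """Derive a jvmsettings.cfg line for a secondary server from the primary line.

    server_name is assumed to match the Server name property of the
    secondary server; heap is the -Xmx value (at least 16g per this topic).
    """
    # Split only at the first "=", because the JVM options contain "=" too.
    name, sep, options = primary_line.strip().partition("=")
    if name != "primary" or not sep:
        raise ValueError("expected a line that starts with 'primary='")

    new_options = []
    for opt in options.split():
        if opt.startswith("-Xmx"):
            opt = f"-Xmx{heap}"  # replace the primary heap size
        new_options.append(opt)

    # Add a heap setting if the copied line had none.
    if not any(o.startswith("-Xmx") for o in new_options):
        new_options.insert(0, f"-Xmx{heap}")

    # Add the G1 garbage collector flag only if it is missing.
    if G1_FLAG not in new_options:
        new_options.append(G1_FLAG)

    return f"{server_name}=" + " ".join(new_options)


primary = "primary=-Xmx2048m -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true"
print(secondary_jvm_line(primary, "pdfbuild_secondary"))
# prints: pdfbuild_secondary=-Xmx16g -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC
```

The printed line is what you would paste into the file as a new line, leaving the original primary line unchanged.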
  5. Restart RICOH ProcessDirector using startaiw to apply the new settings.
  6. Enable the secondary server in the RICOH ProcessDirector interface.
  7. Tune the BuildPDFFromDocuments step template.
    We recommend making a copy of the step template to use in the multithreaded configuration.
    1. Open Workflow Step Templates.
    2. Find the BuildPDFFromDocuments step template, right-click and choose Copy.
    3. On the General tab, enter a value for the Name property.

      Choose a name that clearly shows that this is the multithreaded version of the step.

    4. On the Tuning tab:
      • Set Advanced tuning to Multithread this step.
      • Set Servers to use to Run on specific servers, then select only the secondary server you created.
    5. On the PDF tab, look at all of the Build PDF control file properties.
      • If your step uses one or more control files, verify that the Build library is set to Use default or control file.
      • If your step does not use any control files, set the Build library to PDF Java Toolkit.
        Note:
      • The values on the PDF tab can be different for each workflow. Update these values as needed after you add this step template to a workflow.
    6. Update any other settings as needed for your installation.
    7. Click OK.
  8. Enable the step template.
  9. Set the maximum number of threads that the step template is permitted to use.
    • Start with a number of threads that is approximately 25% of the number of cores in the machine. Each thread used by this step requires other threads to support it, so you cannot dedicate all of the cores to the step. For example, if you have a 16-core server, allocate 4 threads to this step.
    • You might have to experiment to determine the optimal number of threads for your installation.
    1. Open $AIWDATA/config/product.cfg in a text editor.
    2. Add this line to the bottom of the file, substituting the number of threads you want to use for number_of_threads:
      useBuildPdfMT_MaxThreads=number_of_threads
        Note:
      • We recommend starting with 4 threads.
    3. Save and close the file.
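
As a rough illustration of the 25% guideline in this step, the following Python sketch computes a starting thread count and formats the product.cfg line. The helper names are hypothetical. The sketch also assumes that the logical CPU count reported by the operating system is an acceptable stand-in for the number of cores, which might overstate the count on systems with hyper-threading.

```python
import os


def starting_thread_count(cores: int, fraction: float = 0.25) -> int:
    """Return roughly 25% of the cores, but never fewer than one thread."""
    return max(1, int(cores * fraction))


def product_cfg_line(threads: int) -> str:
    """Format the line to add to $AIWDATA/config/product.cfg."""
    return f"useBuildPdfMT_MaxThreads={threads}"


# os.cpu_count() reports logical CPUs and can return None on some platforms.
cores = os.cpu_count() or 1
print(product_cfg_line(starting_thread_count(cores)))
```

On a 16-core server, starting_thread_count(16) returns 4, which matches the example above. Treat the result as a starting point only and adjust it based on your test results.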
  10. Restart RICOH ProcessDirector using startaiw to apply the new setting.
  11. Test a workflow with the new step.
    We recommend copying the workflow whose performance you are trying to improve, then replacing the current BuildPDFFromDocuments step with the multithreaded version. Run a representative job through the workflow.

    During your testing, monitor the following:

    • Java process memory usage

      If the process reaches the maximum configured Java memory and performance is still unacceptable, consider increasing the memory allocation. Stay within the guidelines outlined in Tuning Java memory allocation. Restart the secondary server every time you update the jvmsettings.cfg file.

    • System memory usage
    • CPU usage
    • Resource swapping
    • Java garbage collection

      If you see a lot of garbage collection threads, enable logging for garbage collection. Update jvmsettings.cfg to add these parameters:

      • -verbose:gc
      • -XX:+PrintGCDetails
      • -XX:+PrintGCTimeStamps
      • -XX:+PrintGCDateStamps
      • -Xloggc:/aiw/aiw1/trace/gc_secondary.log
      • -XX:+UseGCLogFileRotation
      • -XX:NumberOfGCLogFiles=10

        Adjust this setting to meet your needs.

      • -XX:GCLogFileSize=8m

        Adjust this setting to meet your needs.

      You must restart RICOH ProcessDirector every time you update jvmsettings.cfg.

      Review the logs to determine the frequency and duration of full garbage collections. If a full garbage collection takes more than five to ten seconds to complete, it could affect performance.
    • I/O use
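
To help review the garbage collection logs described above, a short Python sketch such as the following can list full garbage collections that exceed a duration threshold. It is a minimal sketch under stated assumptions: the function name slow_full_gcs is hypothetical, and the log line shape in the sample (a "[Full GC ..., N secs]" entry) is an assumption about what the logging parameters listed above produce. Check the pattern against your own log files before relying on it.

```python
import re

# Assumed shape of a full collection entry in the GC log, for example:
# 2024-01-15T10:22:31.512+0000: 512.100: [Full GC (Allocation Failure)  15G->14G(16G), 7.2514310 secs]
FULL_GC = re.compile(r"\[Full GC\b.*?,\s*(\d+(?:\.\d+)?)\s*secs\]")


def slow_full_gcs(lines, threshold_secs=5.0):
    """Return (duration, line) pairs for full GCs longer than threshold_secs."""
    slow = []
    for line in lines:
        match = FULL_GC.search(line)
        if match and float(match.group(1)) > threshold_secs:
            slow.append((float(match.group(1)), line.strip()))
    return slow


# Illustrative sample lines, not output from a real system.
sample = [
    "2024-01-15T10:20:01.100+0000: 362.001: [GC pause (G1 Evacuation Pause) (young), 0.0412310 secs]",
    "2024-01-15T10:22:31.512+0000: 512.100: [Full GC (Allocation Failure)  15G->14G(16G), 7.2514310 secs]",
    "2024-01-15T10:25:02.004+0000: 662.592: [Full GC (Allocation Failure)  15G->9G(16G), 2.1000000 secs]",
]

for duration, line in slow_full_gcs(sample):
    print(f"{duration:.1f}s  {line}")
```

To scan a real log, pass an open file object instead of the sample list. The five-second default corresponds to the lower end of the five to ten second guideline above.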

  12. After each test, adjust settings to optimize processing. When you finish adjusting the step, update your production workflows to use it instead of the current BuildPDFFromDocuments step.