
Saturday, 2 February 2013

Implementing Parallel Processing in PeopleSoft - Part I/III

What is parallel processing?

You may have seen batch jobs that create or update millions of rows and take a long time to finish. Apart from poor design or inefficient SQL, the main cause is the sheer volume of data to be processed. Parallel processing can cut the run time of such a job substantially, often to a matter of minutes. The idea is to divide the data into smaller logical sets and run the job for each set at the same time. PeopleSoft supports parallel processing in Application Engine programs by means of temporary tables and a built-in mechanism for launching multiple instances.

Suppose I have a process that updates the salary records of all my employees (say 100,000). The employees belong to 5 business units (BUs), with 20,000 employees in each.

If I run the update for all employees without parallel processing, a single job has to update every employee's record; say it finishes in 10 minutes. What if I divide the employees by BU and run one instance of the same program per BU simultaneously? Each instance handles a fifth of the data, so it takes about 2 minutes, and since all instances run at the same time, the whole update finishes in about 2 minutes. The processing time drops from 10 minutes to 2 minutes (in a real scenario it may take somewhat longer than 2 minutes, but it will still beat the original 10). The following diagram illustrates running the process in parallel.
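The BU split described above can be sketched in plain Python. This is only a conceptual illustration, not PeopleSoft code: the function names, the 5% raise, and the BU1..BU5 labels are all made up for the example, and a thread pool stands in for the separate Application Engine instances that PeopleSoft would launch.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_bu(employees):
    """Group employee rows into one logical set per business unit."""
    partitions = defaultdict(list)
    for emp in employees:
        partitions[emp["bu"]].append(emp)
    return dict(partitions)


def update_salaries(rows, raise_pct=5):
    """Stand-in for the real salary update (hypothetical 5% raise)."""
    return [{**r, "salary": r["salary"] * (100 + raise_pct) // 100} for r in rows]


def run_in_parallel(employees):
    """Run one worker per business unit, all at the same time."""
    partitions = partition_by_bu(employees)
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = pool.map(update_salaries, partitions.values())
        return dict(zip(partitions, results))


# 100,000 employees spread evenly over 5 hypothetical business units.
employees = [
    {"emplid": i, "bu": f"BU{i % 5 + 1}", "salary": 1000}
    for i in range(100_000)
]
updated = run_in_parallel(employees)
```

Each worker sees only its own 20,000 rows, which is the point of the split. The sketch shows the shape of the approach rather than a realistic speedup, since Python threads do not behave like independent database sessions.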




There will be a delay between the actual processing start of the first instance and that of the second (and likewise for every subsequent instance). This is because each instance locks the data it is going to process in the base table. While one instance is working on the base table, the others have to wait until the database releases the lock.

Similarly, there will be a delay when the instances write the updated data back to the base table. But this delay is negligible compared to the overall gain in performance.
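To see why these delays stay small, here is a toy timing model in Python. It assumes, purely for illustration, that an instance holds the base table while it loads its data and again while it writes back, and that the work in between runs without blocking. The function name and the 5-second load and write-back times are hypothetical; real locking behaviour depends on the database and the SQL involved.

```python
def elapsed_seconds(instances, load, work, write_back):
    """Finish time of the last instance in a simple lock-queue model."""
    lock_free = 0.0   # time at which the base table is next available
    ready = []        # time at which each instance is ready to write back
    for _ in range(instances):
        lock_free += load               # loads queue up one after another
        ready.append(lock_free + work)  # work itself is not blocked
    finish = 0.0
    for t in ready:
        start = max(t, lock_free)       # wait if another instance holds the table
        lock_free = start + write_back
        finish = lock_free
    return finish


# 5 instances, 120 s of work each, 5 s to load and 5 s to write back.
print(elapsed_seconds(5, 5, 120, 5))  # 150.0
```

With these made-up numbers the five instances finish after 150 seconds, against 5 * (5 + 120 + 5) = 650 seconds if they ran one after another, so the waiting adds little compared to what parallel execution saves.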

Related: Implementing Parallel Processing in PeopleSoft - Part II/III

5 comments:

  1. Hi dear friend,

    I agree with what you say above. But what happens if 5 users run the same process (I set the instance count to 10),
    each with 50,000 rows of data?
    And may I call this parallel processing?

    Replies

    1. Hi Mr. Reddy,

      Could you please elaborate on your question? As I understand it, the answer is yes.

      Splitting the data and processing the pieces in parallel is the basic concept of parallel processing. In your example, running 50,000 rows per user will be far better than running 250,000 (50,000 * 5) rows in one go.

      I hope that helps.

  3. Hello Sir, nice article. A quick question: if I have to process 100K rows in an Application Engine program, do I need to run several instances of the program to get parallel processing, or can the data be spread automatically across a number of temp tables (say 10,000 rows in each of 10 temp tables, where 10,000 is arbitrary) in a single run of the program?

    My objective is to divide the data among temp table instances in a single run, without running many instances of the same program. Our program will be scheduled and must not require any user intervention, so no one will be there to enter run control parameters manually (in your example, the BU IDs).

    Our expected setup: one process -> one instance of the program -> 5 temp table instances used to insert the data simultaneously in a single run, with parallel processing for performance. How can we achieve this? Any help is greatly appreciated.

    Thanks in advance,

    Replies
    1. Hi Maverick,

      Yes, to use the parallel processing feature delivered by PeopleSoft, you have to run several instances of the same program, each with a different run control ID. You can schedule the five instances in any scheduler, such as Tidal or Process Scheduler. This is how most organizations implement it.

      However, if you need only one process to be scheduled, you can create a wrapper process that in turn creates the different run control IDs and schedules the desired process for each.
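The wrapper idea from the reply above can be outlined in Python. This is a sketch of the logic only, not PeopleSoft code: `RunRequest`, `build_requests`, `schedule_all`, the `SAL_UPD` process name, and the BU labels are all hypothetical, and the `submit` callable stands in for whatever scheduling interface your site actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRequest:
    """One scheduled instance: a run control ID plus the BU it should process."""
    run_control_id: str
    business_unit: str


def build_requests(process_name, business_units):
    """Create one distinct run control per BU so each instance gets its own data slice."""
    return [RunRequest(f"{process_name}_{bu}", bu) for bu in business_units]


def schedule_all(requests, submit):
    """Hand every request to the scheduler through the supplied callable."""
    return [submit(req) for req in requests]


# Demo: collect the requests in a list instead of calling a real scheduler.
submitted = []
requests = build_requests("SAL_UPD", ["BU1", "BU2", "BU3", "BU4", "BU5"])
schedule_all(requests, submitted.append)
```

Only the wrapper needs to be scheduled. It derives the run control IDs itself, so nobody has to enter the BU parameters by hand.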
