Thursday, April 17, 2014

How can we control a concurrent program to Run In Specific RAC/PCP node?

R12 provides an interesting feature to achieve this.

In the concurrent program definition window, we can define which node and database instance a concurrent program should connect to and run on. This is done through the concurrent program's session control.

Navigation: System Administrator Responsibility > Concurrent > Program > Define > Click 'Session Control' 

In this screen you can define the Target Node and Target Instance for that particular concurrent program.

Irrespective of which node the managers are running on, and whichever RAC instance they are connected to, this definition makes sure the program runs on the specified RAC instance and PCP node.

What is Target Node 

The target node specifies the node on which requests for this program will run. When requests for this program are submitted, they run on this node if it is available.

If no specification is made for the target node of a concurrent program, a request for it will be picked up by any manager available to run it. 

If a node specification is made for a concurrent program and the node is up, only available managers running on the specified node will pick up the request. 

What is Target Instance 

When requests for this program are submitted, they run on this database instance if it is available.

If no specification is made for the target instance of a concurrent program, a request for it will be picked up by the first manager available to run it and will be run in the instance where the manager is already connected.

If an instance specification is made for a concurrent program and the instance is up, it will be picked up by the first manager available to run it and the manager will run the request in the specified instance. 

However, if the target RAC instance is down, the manager will run the request in the instance where it is already connected and log an appropriate message.

Wednesday, April 16, 2014

All About Oracle Parallel Concurrent Processing (PCP)

1) What is PCP

   - Parallel Concurrent Processing (PCP) is an extension of the Concurrent Processing architecture. 

   - PCP allows concurrent processing activities to be distributed across multiple nodes, maximizing throughput and providing resilience to node failure.

2) How to Configure Parallel Concurrent Processing (PCP)

Below are the steps to configure PCP in Oracle Applications.

  A) Set Up PCP
  - Edit the applications context file via Oracle Applications Manager, and set the value of the variable APPLDCP to ON.
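  In the context file this setting typically looks like the following (a hedged sketch; `s_appldcp` is the standard OA context variable name in R12 context files, but verify against your own):

  ```xml
  <APPLDCP oa_var="s_appldcp">ON</APPLDCP>
  ```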

  - Execute AutoConfig by running the following command on all concurrent processing nodes:

  - $ $INST_TOP/admin/scripts/

  - Source the Applications environment.

  - Check the tnsnames.ora and listener.ora configuration files, located in $INST_TOP/ora/10.1.2/network/admin. Ensure that the required FNDSM and FNDFS entries are present for all other concurrent nodes.
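  The FNDSM entries generally follow the pattern below (a hedged sketch: host name `host2.example.com`, SID `PROD`, and applications listener port `1626` are hypothetical placeholders; your AutoConfig-generated tnsnames.ora will differ):

  ```
  FNDSM_host2_PROD =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = host2.example.com)(PORT = 1626))
      (CONNECT_DATA = (SID = FNDSM))
    )
  ```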

  - Restart the Applications listener processes on each application tier node. 

  - Log on to Oracle E-Business Suite Release 12 using the SYSADMIN account, and choose the System Administrator Responsibility. Navigate to Install > Nodes screen, and ensure that each node in the cluster is registered.

  - Verify that the Internal Monitor for each node is defined properly, with the correct primary node specification and work shift details. For example, Internal Monitor: Host1 must have host1 as its primary node. Also ensure that the Internal Monitor manager is activated; this can be done from Concurrent > Manager > Administer.

  - Set the $APPLCSF environment variable on all the Concurrent Processing nodes to point to a log directory on a shared file system.

  - Set the $APPLPTMP environment variable on all the CP nodes to the value of the UTL_FILE_DIR entry in init.ora on the database nodes. (This value should be pointing to a directory on a shared file system.)
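  The two environment settings above can be sketched as shell exports (the paths are hypothetical placeholders; in practice the values are driven by the context file and must resolve to the same shared-file-system locations on every CP node):

  ```shell
  # Hypothetical shared-file-system paths; substitute your own mounts.
  export APPLCSF=/shared/applcsf/log    # common log/out directory for all CP nodes
  export APPLPTMP=/shared/applptmp      # must match a UTL_FILE_DIR entry in init.ora

  echo "APPLCSF=$APPLCSF"
  echo "APPLPTMP=$APPLPTMP"
  ```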

  - Set the profile option 'Concurrent: PCP Instance Check' to OFF if database instance-sensitive failover is not required (for example, with a non-RAC database). When it is set to ON, a concurrent manager will fail over to its secondary application tier node if the database instance to which it is connected becomes unavailable.
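  The current value can be checked from SQL*Plus; this is a hedged sketch that assumes `CONC_PCP_INSTANCE_CHECK` is the internal name of the 'Concurrent: PCP Instance Check' profile option (verify the internal name in your instance first):

  ```sql
  -- Assumed internal profile option name; confirm in FND_PROFILE_OPTIONS.
  SELECT fnd_profile.value('CONC_PCP_INSTANCE_CHECK') AS pcp_instance_check
  FROM dual;
  ```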

  B) Set Up Transaction Managers  (Only R12)
  If you are already using transaction managers and wish to have them fail over, perform the steps below:

  - Shut down the application services (servers) on all nodes

  - Shut down all the database instances cleanly in the Oracle RAC environment, using the command: 
  - SQL>shutdown immediate;
  - Edit the $ORACLE_HOME/dbs/<context_name>_ifile.ora and add the following parameters:

  - Start the instances on all database nodes.

  - Start up the application services (servers) on all nodes.

  - Log on to Oracle E-Business Suite Release 12 using the SYSADMIN account, and choose the System Administrator responsibility. Navigate to Profile > System, change the profile option 'Concurrent: TM Transport Type' to 'QUEUE', and verify that the transaction manager works across the Oracle RAC instances.

  - Navigate to Concurrent > Manager > Define screen, and set up the primary and secondary node names for transaction managers.

  - Restart the concurrent managers.

  - If any of the transaction managers are in a deactivated status, activate them from Concurrent > Manager > Administer.

 C) Set Up Load Balancing on Concurrent Processing Nodes (Only Applicable in case of RAC)

  If you wish PCP to use the load-balancing capability of RAC, perform the steps below. Connections will be load balanced through the balance alias and will connect across all the RAC nodes.

  - Edit the applications context file through the Oracle Applications Manager interface, and set the value of Concurrent Manager TWO_TASK (s_cp_twotask) to the load balancing alias (<service_name>_balance).

  - Execute AutoConfig by running $INST_TOP/admin/scripts/ on all concurrent nodes.
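  A load-balancing TNS alias for s_cp_twotask generally has the shape below (a hedged sketch; the host names, port, and service name are hypothetical placeholders):

  ```
  PROD_BALANCE =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (LOAD_BALANCE = YES)
        (FAILOVER = YES)
        (ADDRESS = (PROTOCOL = TCP)(HOST = db1.example.com)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = db2.example.com)(PORT = 1521))
      )
      (CONNECT_DATA = (SERVICE_NAME = PROD))
    )
  ```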

3) Is RAC Mandatory to Implement PCP?

  - No, RAC is not mandatory for PCP. If you have two or more application nodes, you can enable PCP. But PCP works better in conjunction with RAC to handle all the failover scenarios.

4) How Does PCP Work with RAC?

 - In a RAC-enabled environment, PCP uses the cp_two_task environment variable to connect to a database RAC node. This can be set so that each CM node connects to a single RAC node, or so that it connects to all the RAC nodes in the cluster.

5) What happens when one of the RAC nodes goes down with PCP enabled?

 - When 'Concurrent: PCP Instance Check' is set to ON and cp_two_task is set to a SID (i.e. each CM node always connects to a single RAC node), if one database node goes down, PCP identifies the database failure and shifts all the concurrent managers to another application node where the database is available.

6) What happens when one of the PCP nodes goes down?

 - IMON identifies the failure and, through FNDSM (Service Manager), initiates the ICM on a surviving node (if the ICM was running on the failed node); the ICM then starts all the managers.

7) What are Primary and Secondary Nodes in PCP?

 - It is a requirement to define primary and secondary nodes to distribute load across the servers. If these are not defined, all the managers will by default start on the node where the ICM is running.

8) How Does Failback Happen in PCP?

 - Once the failed node comes back online, IMON detects it and the ICM fails back all the managers defined on that node.

9) What happens to requests running during failover in PCP?

 - It is important to note that TAF and FAN are not supported with E-Business Suite, and in-flight DML statements are not failed over by RAC and PCP.
 - When a request is running and the CM goes down, the request keeps a status of Running/Normal with no associated process ID. When the ICM starts on another node, it checks all Running/Normal requests and verifies their OS process IDs; if it does not find the process ID, it resubmits the request.

 - This behavior is the same even in a non-PCP environment.

 - The Internal Concurrent Manager (ICM) will only restart a request if the following conditions are met:

- The ICM got the manager's database lock for the manager that was running the request
- The phase of the request is "running" (phase_code = 'R')
- The program for this request is set to "restart on failure"
- All of the above requirements have been met AND at least one of the following:
         a. The ICM is just starting up (i.e. it has just spawned on a given node and is going through initialization code before the main loop)
         b. The node of the concurrent manager for which we got the lock is down
         c. The database instance (TWO_TASK) defined for the node of that concurrent manager is down (this is not applicable if that node uses a "balance" TWO_TASK)
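The ICM's check of Running/Normal requests can be approximated with a hedged SQL sketch (table and column names are taken from the standard FND schema; verify against your EBS version before relying on it):

```sql
-- Requests in Running/Normal phase, joined to the OS process of the
-- controlling manager; a missing OS process suggests a restart candidate.
SELECT r.request_id,
       p.concurrent_process_id,
       p.os_process_id
FROM   fnd_concurrent_requests  r,
       fnd_concurrent_processes p
WHERE  r.phase_code  = 'R'
AND    r.status_code = 'R'
AND    r.controlling_manager = p.concurrent_process_id;
```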

10) How Does PCP Identify When a Node Goes Down?

  - There are two types of failures that PCP recognizes.

a.) Is the node pingable?
PCP issues an operating system ping on the machine name, which either times out or succeeds.

b.) Is the database available?
PCP queries V$THREAD and V$INSTANCE for an open or closed status.
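The database check can be approximated from SQL*Plus (a hedged sketch; PCP's internal check may differ, but these are the standard dynamic performance views):

```sql
-- Thread status per RAC instance (OPEN/CLOSED).
SELECT thread#, status, enabled FROM v$thread;

-- Instance status (e.g. OPEN, MOUNTED, STARTED).
SELECT instance_name, status FROM v$instance;
```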

 - When either of the two failures above occurs, the following example illustrates the failover and failback of managers.

Primary node = HOST1 - managers assigned to the primary node are the ICM (FNDLIBR-cpmgr) and FNDCRM
Secondary node = HOST2 - manager assigned to the secondary node is the Standard Manager (FNDLIBR)

When HOST1 becomes unavailable, both ICM and FNDCRM are migrated over to HOST2.
This is viewable from Administer Concurrent Manager form in System Administrator Responsibility.
The $APPLCSF/log/.mgr logfile will also reflect that HOST1 has been added to the unavailable list.

On HOST2, after a PMON cycle, FNDICM, FNDCRM, and FNDLIBR are migrated and running.
(Note: FNDIMON and FNDSM run independently on each concurrent processing node. FNDSM
is not a persistent process, and FNDIMON is a persistent process local to each node)

Once HOST1 becomes available, FNDICM and FNDCRM are migrated back to the original primary 
node for successful failback.

In summary, in a successful failover and failback scenario, all managers should fail over to their secondary node; once the node or instance becomes available again, all managers should fail back to their primary node.