Hi Sundials Users,
I'm having some problems running CVODE in parallel when NLOCAL is not the same for all processors.
I'm running the cvode/fcmix_parallel/fcvDiag_kry_p.f example with pre-conditioning removed as a test case for developing a parallel code. I've removed much of the original source to simplify the problem as much as possible and am only running for a single time step (again for simplicity). I've attached the source code in case it's helpful.
I have a fixed number of equations, NEQ = 10, and want to run on a variety of processor counts (NPES) which may or may not divide exactly into NEQ. I set the value of NLOCAL (the no. of equations each proc gets) as follows:
NLOCAL = int(NEQ/NPES) for processors 0, 1, ..., NPES-2
NLOCAL = NEQ - (NPES-1)*int(NEQ/NPES) for processor NPES-1 (the last one)
So if NPES divides NEQ exactly, all processors get the same value of NLOCAL, e.g. if NPES = 2, NLOCAL = 5 for all processors. If NPES does not divide NEQ exactly, the last processor gets the largest number of equations, e.g. if NPES = 3 then NLOCAL = 3 for processors 0 & 1 and NLOCAL = 4 for processor 2.
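For reference, the split described above can be written out as a short sketch. It is plain Python rather than the Fortran of the example, and the function names local_sizes and global_offsets are made up for illustration. The offsets (the global index of each processor's first equation) are not part of the rule as stated; they are included only because, with an uneven split, the last processor's first equation is not at rank * NLOCAL when NLOCAL is that processor's own value.

```python
def local_sizes(neq, npes):
    """NLOCAL for each rank 0..npes-1: the last rank takes the remainder."""
    base = neq // npes                      # int(NEQ/NPES)
    sizes = [base] * (npes - 1)             # ranks 0 .. NPES-2
    sizes.append(neq - (npes - 1) * base)   # rank NPES-1
    return sizes


def global_offsets(sizes):
    """Zero-based global index of the first equation owned by each rank."""
    offsets = []
    start = 0
    for n in sizes:
        offsets.append(start)
        start += n
    return offsets


if __name__ == "__main__":
    for npes in (1, 2, 3, 5):
        sizes = local_sizes(10, npes)
        print(npes, sizes, global_offsets(sizes))
```

For NEQ = 10 and NPES = 3 this gives sizes [3, 3, 4] and offsets [0, 3, 6], whereas rank * NLOCAL evaluated on processor 2 with its own NLOCAL = 4 would be 8.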
Looking at the values of Y which the FCVODE call returns I find the following:
* If NPES divides NEQ exactly (i.e. NLOCAL is the same for all processors) then Y is identical to the single processor result. E.g. on 1 processor my Y values are:
On proc 0 Y = 0.36787956E+00 0.13533570E+00 0.49787809E-01 0.18316531E-01 0.67387952E-02 0.24794402E-02 0.91238078E-03 0.33579533E-03 0.12361789E-03 0.45523600E-04
On 5 processors my Y values are:
On proc 0 Y = 0.36787956E+00 0.13533570E+00
On proc 1 Y = 0.49787809E-01 0.18316531E-01
On proc 2 Y = 0.67387952E-02 0.24794402E-02
On proc 3 Y = 0.91238078E-03 0.33579533E-03
On proc 4 Y = 0.12361789E-03 0.45523600E-04
* However, if NPES does not divide NEQ exactly then the resulting Y is not the same as the single processor result. E.g. on 3 processors my Y values are:
On proc 0 Y = 0.36787952E+00 0.13533550E+00 0.49787415E-01
On proc 1 Y = 0.18316039E-01 0.67383183E-02 0.24790478E-02
On proc 2 Y = 0.12349553E-03 0.45450266E-04 0.16730071E-04 0.61596683E-05
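To make the comparison concrete, here is a minimal sketch (plain Python; the helper name first_mismatch and the relative tolerance of 1e-3 are arbitrary choices for illustration) that concatenates the per-processor values listed above in rank order and reports the first global index that disagrees with the single processor result.

```python
# Values copied from the output listed above.
serial = [0.36787956E+00, 0.13533570E+00, 0.49787809E-01, 0.18316531E-01,
          0.67387952E-02, 0.24794402E-02, 0.91238078E-03, 0.33579533E-03,
          0.12361789E-03, 0.45523600E-04]

five_procs = [[0.36787956E+00, 0.13533570E+00],
              [0.49787809E-01, 0.18316531E-01],
              [0.67387952E-02, 0.24794402E-02],
              [0.91238078E-03, 0.33579533E-03],
              [0.12361789E-03, 0.45523600E-04]]

three_procs = [[0.36787952E+00, 0.13533550E+00, 0.49787415E-01],
               [0.18316039E-01, 0.67383183E-02, 0.24790478E-02],
               [0.12349553E-03, 0.45450266E-04, 0.16730071E-04,
                0.61596683E-05]]


def first_mismatch(reference, per_rank, rtol=1e-3):
    """Return the zero-based global index of the first entry whose relative
    difference from the reference exceeds rtol, or None if all agree."""
    gathered = [v for chunk in per_rank for v in chunk]  # rank order
    if len(gathered) != len(reference):
        raise ValueError("gathered vector has the wrong length")
    for i, (ref, got) in enumerate(zip(reference, gathered)):
        if abs(ref - got) > rtol * abs(ref):
            return i
    return None


if __name__ == "__main__":
    print("5 procs:", first_mismatch(serial, five_procs))
    print("3 procs:", first_mismatch(serial, three_procs))
```

With this tolerance the 5 processor run matches everywhere, while the 3 processor run first disagrees at zero-based index 6, which is the first equation held by processor 2. The entries on processors 0 and 1 differ from the single processor values only in the last few digits.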
Is this correct? I had thought that the resulting Y should be the same regardless of the number of CPUs I run on.
Can anyone tell me what has gone wrong or suggest a solution to this?
Many thanks,
Fiona
fcvDiag_kry_p_noprec.f