PETSc CSR Matrix Generation Error when running in parallel

Hi everybody,

I’m trying to generate a highly sparse matrix for my eigenvalue problem. My script runs perfectly in serial, and it also runs fine on up to 4 cores.

However, when I try -np > 4, e.g. [mpirun -np 6 python3 mwe.py], it throws an error:

ValueError: size(I) is 1131, expected 905
  File "PETSc/petscmat.pxi", line 961, in petsc4py.PETSc.matsetvalues_csr
  File "PETSc/petscmat.pxi", line 961, in petsc4py.PETSc.matsetvalues_csr
  File "PETSc/petscmat.pxi", line 961, in petsc4py.PETSc.matsetvalues_csr
  File "PETSc/petscmat.pxi", line 961, in petsc4py.PETSc.matsetvalues_csr
  File "PETSc/petscmat.pxi", line 923, in petsc4py.PETSc.matsetvalues_ijv
  File "PETSc/petscmat.pxi", line 961, in petsc4py.PETSc.matsetvalues_csr
  File "PETSc/petscmat.pxi", line 923, in petsc4py.PETSc.matsetvalues_ijv
  File "PETSc/petscmat.pxi", line 923, in petsc4py.PETSc.matsetvalues_ijv
ValueError: size(I) is 1131, expected 871
  File "PETSc/petscmat.pxi", line 923, in petsc4py.PETSc.matsetvalues_ijv
  File "PETSc/petscmat.pxi", line 923, in petsc4py.PETSc.matsetvalues_ijv
ValueError: size(I) is 1131, expected 889
ValueError: size(I) is 1131, expected 905
ValueError: size(I) is 1131, expected 907
ValueError: size(I) is 1131, expected 891

Here is my MWE:

import dolfin as dolf
import numpy as np
from petsc4py import PETSc
import time 
start_time = time.time()

mesh = dolf.Mesh()
with dolf.XDMFFile('rijke.xdmf') as infile:
    infile.read(mesh)


degree = 2
CG = dolf.FiniteElement('CG', mesh.ufl_cell(), degree)
W = dolf.FunctionSpace(mesh, dolf.MixedElement([CG, CG]))

function_space = W

 
global_size = function_space.dim()
istart, iend = function_space.dofmap().ownership_range()
local_size = iend - istart  # local size

D_kj = dict()

for k in range(2):
    for j in range(2):

        row = [1019, 1019, 1019, 1029, 1029, 1029, 1031, 1031, 1031, 1033, 1033,
               1033, 1035, 1035, 1035, 1041, 1041, 1041, 1043, 1043, 1043, 1045,
               1045, 1045, 1047, 1047, 1047, 1049, 1049, 1049, 1051, 1051, 1051,
               1053, 1053, 1053, 1055, 1055, 1055, 1057, 1057, 1057, 1059, 1059,
               1059, 1061, 1061, 1061, 1063, 1063, 1063, 1065, 1065, 1065, 1067,
               1067, 1067, 1069, 1069, 1069, 1071, 1071, 1071, 1073, 1073, 1073,
               1075, 1075, 1075, 1077, 1077, 1077, 1079, 1079, 1079, 1081, 1081,
               1081, 1083, 1083, 1083, 1085, 1085, 1085, 1087, 1087, 1087, 1089,
               1089, 1089, 1091, 1091, 1091, 1093, 1093, 1093, 1095, 1095, 1095,
               1097, 1097, 1097, 1099, 1099, 1099, 1101, 1101, 1101, 1103, 1103,
               1103, 1105, 1105, 1105, 1107, 1107, 1107, 1113, 1113, 1113, 1115,
               1115, 1115, 1117, 1117, 1117, 1119, 1119, 1119, 1129, 1129, 1129]

        col = [1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151,
               1149, 1163, 1151, 1149, 1149, 1163, 1151, 1163, 1151, 1149, 1163,
               1151, 1149, 1163, 1151, 1149, 1149, 1163, 1151, 1149, 1151, 1163,
               1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151,
               1149, 1151, 1149, 1163, 1163, 1151, 1149, 1163, 1151, 1149, 1151,
               1163, 1149, 1163, 1151, 1149, 1151, 1149, 1163, 1151, 1149, 1163,
               1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151,
               1149, 1151, 1149, 1163, 1163, 1151, 1149, 1163, 1151, 1149, 1151,
               1163, 1149, 1163, 1151, 1149, 1151, 1149, 1163, 1149, 1163, 1151,
               1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151, 1149, 1163, 1151,
               1149, 1149, 1163, 1151, 1163, 1151, 1149, 1163, 1151, 1149, 1163,
               1151, 1149, 1151, 1163, 1149, 1151, 1163, 1149, 1151, 1163, 1149]
        
        val = [-0.99487415, -0.07232158,  1.06719573, -1.52841215, -0.11110669,
                1.63951884, -1.38100109, -0.10039076,  1.48139185, -2.3082298 ,
               -0.16779491,  2.47602471, -1.6869314 , -0.12263012,  1.80956152,
                1.58222925, -1.47500495, -0.1072243 , -1.62432548, -0.11807903,
                1.74240451, -2.97269167, -0.21609743,  3.18878909, -2.69805372,
               -0.19613284,  2.89418656,  4.23912078, -3.95184461, -0.28727616,
                2.19057301, -0.14845046, -2.04212256, -1.01884757, -0.0740643 ,
                1.09291187, -2.37983601, -0.17300026,  2.55283627, -3.17011761,
               -0.23044915,  3.40056675, -3.60327431, -0.26193713,  3.86521144,
               -0.23383981,  3.45060021, -3.2167604 , -2.28580025, -0.16616441,
                2.45196466, -4.02181434, -0.29236256,  4.3141769 , -0.14901075,
               -2.04983008,  2.19884083, -1.68775348, -0.12268988,  1.81044337,
               -0.29771   ,  4.3930851 , -4.0953751 , -0.18359276,  2.70914188,
               -2.52554912, -2.30055344, -0.16723688,  2.46779032, -3.37568394,
               -0.24539263,  3.62107657, -4.03339481, -0.29320439,  4.3265992 ,
               -1.69607425, -0.12329475,  1.819369  , -0.14557236,  2.1481031 ,
               -2.00253074, -3.83564025, -0.27882878,  4.11446903, -4.861238  ,
               -0.35338379,  5.21462179, -0.18475935, -2.54159701,  2.72635635,
               -2.74115057, -0.19926574,  2.94041631, -0.16786457,  2.47705273,
               -2.30918815,  1.03134417, -0.96145218, -0.06989199, -1.84270365,
               -0.13395386,  1.97665751, -3.51037475, -0.25518387,  3.76555862,
               -3.29470161, -0.23950568,  3.53420729, -3.2573685 , -0.23679178,
                3.49416028,  1.63611565, -1.52523958, -0.11087606, -1.35084021,
               -0.09819824,  1.44903845, -1.4815967 , -0.10770348,  1.58930018,
               -2.17569276, -0.15816023,  2.333853  , -0.10098036, -1.38911177,
                1.49009214, -0.11618485, -1.59826857,  1.71445342, -0.06687615,
               -0.91996547,  0.98684162]
                
        # Getting index pointer for CSR matrix
        indptr = np.bincount(row, minlength=local_size)
        indptr = np.insert(indptr, 0, 0).cumsum()
        indptr = indptr.astype(dtype='int32')
        
        # Matrix Generation
        mat = PETSc.Mat().create()
        mat.setSizes([(local_size, global_size), (local_size, global_size)])
        mat.setType('aij') 
        mat.setUp()
        mat.setValuesCSR(indptr, col, val)
        mat.assemblyBegin()
        mat.assemblyEnd()

        D_kj['{0}{1}'.format(k + 1, j + 1)] = mat

end_time = time.time()

print("Total execution time is: ",end_time-start_time)
 
D = [D_kj['11'], D_kj['12'], D_kj['21'], D_kj['22']]

The mesh file can also be obtained from this GitHub link.

Somehow, the index pointer (indptr) for the CSR matrix is computed incorrectly when I run in parallel. The same script works fine with the built-in mesh generators on any number of cores.
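
From the traceback, setValuesCSR seems to expect the index pointer to have local_size + 1 entries on each rank (905, 871, ... in my runs), while my bincount over the global row indices gives 1131 entries everywhere. A small diagnostic along these lines (untested in this exact form) shows the mismatch on each rank:

from mpi4py import MPI

comm = MPI.COMM_WORLD
# Compare what I build with what setValuesCSR apparently expects on this rank.
print("rank", comm.rank,
      "len(indptr) =", len(indptr),        # 1131 on every rank
      "local_size + 1 =", local_size + 1)  # 905, 871, ... depending on the rank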

I have no clue how to proceed. Any help would be much appreciated.

Thanks for your replies in advance.

What are you expecting to happen when you run this in parallel? Do you want copies of the same CSR matrix stored locally on each process? Or do you want one matrix distributed across processes?

If the former, you should work with COMM_SELF; if the latter, investigate the creation of mpiaij-type matrices with PETSc.
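
Roughly, the two options look like this; a sketch only, untested, reusing the local_size/global_size names from your MWE:

from petsc4py import PETSc

# Option 1: every process keeps its own full copy of the matrix.
mat_local = PETSc.Mat().createAIJ((global_size, global_size),
                                  comm=PETSc.COMM_SELF)
mat_local.setUp()
# indptr/col/val must then describe all rows 0 .. global_size - 1 on each rank.

# Option 2: one matrix distributed across processes (mpiaij).
mat_dist = PETSc.Mat().createAIJ(((local_size, global_size),
                                  (local_size, global_size)),
                                 comm=PETSc.COMM_WORLD)
mat_dist.setUp()
# Here setValuesCSR takes only the rows this process owns, in local numbering,
# so the index pointer must have local_size + 1 entries.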

I have tried COMM_SELF, but I couldn’t get stable matrix generation.

As you can guess, I want to make my calculations as fast as I can. This is a simple geometry, but I have more complex cylindrical geometries which contain multiple subdomains.

At this stage, which of your two recommendations do you think would be the most efficient and stable matrix generation approach for parallel runs?