Mapping mesh indices to DG1 DOFs

Hi all,

I want to map the shape functions of first-order discontinuous Galerkin elements (DG, 1) to the indices of the actual mesh, i.e. into a matrix of size (number of nodes x number of elements). I noticed I can do this by making use of the cell-to-vertex connectivity (c2v) mentioned in this post: Application of point forces, mapping vertex indices to corresponding DOFs

For the MWE shown below it works seamlessly in serial, although it is very slow for large-scale problems, which I understand can be improved by using the numba JIT.

from mpi4py import MPI
from petsc4py import PETSc
import numpy as np

from dolfinx.fem import form, functionspace
from dolfinx.fem.petsc import assemble_vector
from dolfinx.io import XDMFFile
from dolfinx.mesh import CellType, GhostMode, create_rectangle
from ufl import TestFunction, dx

dtype = PETSc.ScalarType  
comm = MPI.COMM_WORLD
rank = comm.rank

Len = 3;  Wid = 1
nn = 1
nx = 75*nn; ny = 25*nn

# Create a box Mesh:
domain = create_rectangle(MPI.COMM_WORLD, [np.array([0.0, 0.0]), np.array([Len, Wid])], [nx, ny],
                             CellType.triangle, ghost_mode=GhostMode.shared_facet)

# Save the mesh to XDMF file
with XDMFFile(MPI.COMM_WORLD, "mesh.xdmf", "w") as xdmf_file:
    xdmf_file.write_mesh(domain)

# Number of non-overlapping local elements
num_elements = domain.topology.index_map(2).size_local
num_nodes = domain.topology.index_map(0).size_local
# Summing up the counts from all processes in MPI_COMM_WORLD
num_elements_global = MPI.COMM_WORLD.allreduce(num_elements, op=MPI.SUM)
num_nodes_global = MPI.COMM_WORLD.allreduce(num_nodes, op=MPI.SUM)
# Number of local elements including the ghost elements
num_cells = domain.topology.index_map(domain.topology.dim).size_local + domain.topology.index_map(domain.topology.dim).num_ghosts
num_vertices = domain.topology.index_map(0).size_local + domain.topology.index_map(0).num_ghosts

print(f"Number of local non-overlapping elements: {num_elements}")
print(f"Number of local non-overlapping nodes: {num_nodes}")
print(f"Number of global elements: {num_elements_global}")
print(f"Number of global nodes: {num_nodes_global}")
print(f"Number of local elements including ghosts: {num_cells}")
print(f"Number of local nodes including ghosts: {num_vertices}")

# FEspace
V = functionspace(domain, ("CG", 1, (domain.geometry.dim,)))
Vh1 = functionspace(domain, ("CG", 1))
Vh1dc = functionspace(domain, ("DG", 1))
Vh0 = functionspace(domain, ("DG", 0))

# volume of the element
varea = TestFunction(Vh0)
cell_area_form = form(varea*dx)
cell_area = assemble_vector(cell_area_form)
cell_area.assemble()

# shape functions of discontinuous elements
etaf = TestFunction(Vh1dc)
Rhdc_form = form(etaf*dx)
Rhdc = assemble_vector(Rhdc_form)
Rhdc.assemble()
print(f"Length of Rhdc array : {len(Rhdc.array)}")

#connectivity
c2v = domain.topology.connectivity(domain.topology.dim, 0)
print(f"Length of c2v : {len(c2v)}")

Trf = np.zeros((num_nodes_global, num_elements_global))

num_non_zeros = 0
for cell in range(num_cells):
    vertices = c2v.links(cell)
    dofs = Vh1dc.dofmap.cell_dofs(cell)
    Trf[vertices[0]][cell] = Rhdc[dofs[0]]
    Trf[vertices[1]][cell] = Rhdc[dofs[1]]
    Trf[vertices[2]][cell] = Rhdc[dofs[2]]
    num_non_zeros += np.count_nonzero(Trf[:, cell])
print(num_non_zeros)

But my major concern is that when I parallelize, I get an error.
For example, when running on 2 processes, here is the reported error.

Number of local non-overlapping elements: 1873
Number of local non-overlapping nodes: 989
Number of global elements: 3750
Number of global nodes: 1976
Number of local elements including ghosts: 1898
Number of local nodes including ghosts: 1023
Length of Rhdc array : 5619
Length of c2v : 1898
    Trf[vertices[0]][cell] = Rhdc[dofs[0]]
                             ~~~~^^^^^^^^^
  File "petsc4py/PETSc/Vec.pyx", line 117, in petsc4py.PETSc.Vec.__getitem__
  File "petsc4py/PETSc/petscvec.pxi", line 446, in petsc4py.PETSc.vec_getitem
  File "petsc4py/PETSc/petscvec.pxi", line 398, in petsc4py.PETSc.vecgetvalues
petsc4py.PETSc.Error: error code 63
[1] VecGetValues() at /home/conda/feedstock_root/build_artifacts/petsc_1709043572817/work/src/vec/vec/interface/rvector.c:957
[1] VecGetValues_MPI() at /home/conda/feedstock_root/build_artifacts/petsc_1709043572817/work/src/vec/vec/impls/mpi/pdvec.c:823
[1] Argument out of range
[1] Can only get local values, trying 0

After analyzing the code, I think the issue arises from an incompatibility between Rhdc (which holds values for the local DOFs only) and c2v (which covers local + ghost elements).
So, is there any way to get c2v for the local elements only, build the local matrix, and then convert it to the global Trf matrix?
I would really appreciate any help. Thanks!

Check if the vertex number is less than domain.topology.index_map(0).size_local, and if it is not, simply continue the loop.
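
A minimal sketch of that check, assuming the variables from the MWE above (domain, c2v, Vh1dc, Rhdc, Trf) are in scope; the loop is also restricted to owned cells, so that the DG DOFs read from Rhdc.array are all owned by the process (note that vertex and cell are process-local indices here):

num_owned_vertices = domain.topology.index_map(0).size_local
num_owned_cells = domain.topology.index_map(domain.topology.dim).size_local

for cell in range(num_owned_cells):          # owned cells only
    vertices = c2v.links(cell)
    dofs = Vh1dc.dofmap.cell_dofs(cell)
    for k, vertex in enumerate(vertices):
        if vertex >= num_owned_vertices:     # ghost vertex: its row is owned by another process
            continue
        Trf[vertex, cell] = Rhdc.array[dofs[k]]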

Thanks Francesco for your reply.

It seems my issue was in indexing Rhdc. I had to use Rhdc.array[dofs[0]] instead of Rhdc[dofs[0]], since Rhdc is a PETSc vector, but I am still not getting the correct global matrix. I rewrote it in a very simple form to spot the issue.

from mpi4py import MPI
import numpy as np
from dolfinx.mesh import create_unit_square
from dolfinx.fem import form, functionspace
from dolfinx.fem.petsc import assemble_vector
from ufl import TestFunction, dx

comm = MPI.COMM_WORLD
rank = comm.rank

mesh = create_unit_square(MPI.COMM_WORLD, 2, 1)
Vh1dc = functionspace(mesh, ("DG", 1))
num_elements = mesh.topology.index_map(mesh.topology.dim).size_local
num_elements_global = MPI.COMM_WORLD.allreduce(num_elements, op=MPI.SUM)
num_nodes = mesh.topology.index_map(mesh.topology.dim-2).size_local
num_nodes_global = MPI.COMM_WORLD.allreduce(num_nodes, op=MPI.SUM)

print(f"Number of local non-overlapping elements: {num_elements}")
print(f"Number of local non-overlapping nodes: {num_nodes}")
print(f"Number of global elements: {num_elements_global}")
print(f"Number of global nodes: {num_nodes_global}")

block_size = Vh1dc.dofmap.index_map_bs
print(block_size)

c2v = mesh.topology.connectivity(mesh.topology.dim, 0)
Trfloc = np.zeros((num_nodes_global, num_elements_global))

etaf = TestFunction(Vh1dc)
Rhdc = assemble_vector(form(etaf*dx))

for i in range(num_elements):
    vertices = c2v.links(i)
    dofs = Vh1dc.dofmap.cell_dofs(i)
    Trfloc[vertices[0]][i] = Rhdc.array[dofs[0]]
    Trfloc[vertices[1]][i] = Rhdc.array[dofs[1]]
    Trfloc[vertices[2]][i] = Rhdc.array[dofs[2]]

print(f"Local matrix:\n{Trfloc}")

# Sum the local matrices over all processes
Trf = np.zeros((num_nodes_global, num_elements_global))
comm.Allreduce(Trfloc, Trf, op=MPI.SUM)

# Now Trf contains the assembled global matrix
print(f"Global matrix:\n{Trf}")

For a single process, the results are:

Local matrix:
[[0.08333333 0.08333333 0.         0.        ]
 [0.08333333 0.         0.         0.        ]
 [0.08333333 0.08333333 0.08333333 0.        ]
 [0.         0.08333333 0.08333333 0.08333333]
 [0.         0.         0.08333333 0.08333333]
 [0.         0.         0.         0.08333333]]
Global matrix:
[[0.08333333 0.08333333 0.         0.        ]
 [0.08333333 0.         0.         0.        ]
 [0.08333333 0.08333333 0.08333333 0.        ]
 [0.         0.08333333 0.08333333 0.08333333]
 [0.         0.         0.08333333 0.08333333]
 [0.         0.         0.         0.08333333]]

whereas for 2 processes, the results are:

Local matrix:
[[0.08333333 0.08333333 0.         0.        ]
 [0.08333333 0.         0.         0.        ]
 [0.         0.08333333 0.         0.        ]
 [0.08333333 0.08333333 0.         0.        ]
 [0.         0.         0.         0.        ]
 [0.         0.         0.         0.        ]]

Local matrix:
[[0.08333333 0.         0.         0.        ]
 [0.08333333 0.08333333 0.         0.        ]
 [0.         0.08333333 0.         0.        ]
 [0.08333333 0.08333333 0.         0.        ]
 [0.         0.         0.         0.        ]
 [0.         0.         0.         0.        ]]

Global matrix:
[[0.16666667 0.08333333 0.         0.        ]
 [0.16666667 0.08333333 0.         0.        ]
 [0.         0.16666667 0.         0.        ]
 [0.16666667 0.16666667 0.         0.        ]
 [0.         0.         0.         0.        ]
 [0.         0.         0.         0.        ]]

It looks like the issue is coming from

dofs = Vh1dc.dofmap.cell_dofs(i)

For 1 process, the indices are

[0 1 2]
[3 4 5]
[6 7 8]
[ 9 10 11]

Whereas for 2 processes, they are

[0 1 2]
[3 4 5]

[0 1 2]
[3 4 5]

Any suggestions are highly appreciated.

The dofs you get out of cell_dofs are numbered locally to the process. If you want to obtain the corresponding global index you need to query the local_to_global function of the IndexMap object.
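
For instance, a minimal sketch of that mapping, assuming the mesh, Vh1dc, c2v and num_elements from the MWE above (IndexMap.local_to_global takes an array of process-local indices and returns the corresponding global ones):

import numpy as np

dof_imap = Vh1dc.dofmap.index_map                       # DOF index map of the DG1 space
vertex_imap = mesh.topology.index_map(0)                # vertex index map
cell_imap = mesh.topology.index_map(mesh.topology.dim)  # cell index map

for cell in range(num_elements):
    global_dofs = dof_imap.local_to_global(Vh1dc.dofmap.cell_dofs(cell))
    global_vertices = vertex_imap.local_to_global(c2v.links(cell))
    global_cell = cell_imap.local_to_global(np.array([cell], dtype=np.int32))[0]
    # global_vertices and global_cell can now be used as row/column indices
    # of a globally numbered Trf matrix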


I can see that, but I would not mind keeping the Trf matrix completely local for now; in fact I would prefer that, so that I can perform the matrix-vector multiplication locally and convert the resulting vector from local to global.

But the thing that prevents me from keeping the Trf matrix local is the incompatibility, under MPI, of

c2v = mesh.topology.connectivity(mesh.topology.dim, 0)

and

etaf = TestFunction(Vh1dc)
Rhdc = assemble_vector(form(etaf*dx))

since c2v covers local + ghost elements, whereas Rhdc covers local DOFs only (no ghost entries).

----------------- RANK  0 ----------------- 
Number of local elements: 2
Number of local nodes: 3
Number of global elements: 4
Number of global nodes: 6
Number of local + ghost elements: 3
Number of local + ghost nodes: 5
Number of c2v: 3
<AdjacencyList> with 3 nodes
  0: [3 0 1 ]
  1: [3 2 1 ]
  2: [4 3 2 ]

Size of Rhdc array: 6
[0.08333333 0.08333333 0.08333333 0.08333333 0.08333333 0.08333333]
----------------- RANK  1 ----------------- 
Number of local elements: 2
Number of local nodes: 3
Number of global elements: 4
Number of global nodes: 6
Number of local + ghost elements: 3
Number of local + ghost nodes: 5
Number of c2v: 3
<AdjacencyList> with 3 nodes
  0: [0 1 3 ]
  1: [0 2 3 ]
  2: [1 3 4 ]

Size of Rhdc array: 6
[0.08333333 0.08333333 0.08333333 0.08333333 0.08333333 0.08333333]

Note: the Rhdc array holds the integrals of the basis functions over the local DG1 triangular elements.

Now, whether I run the loop over local + ghost elements or over local elements only, it goes out of bounds:

Trfloc = np.zeros((num_vertices, num_cells))
for cell in range(num_cells):
    vertices = c2v.links(cell)
    dofs = Vh1dc.dofmap.cell_dofs(cell)
    for k, vertex in enumerate(vertices):
        Trfloc[vertex, cell] = Rhdc.array[dofs[k]]

So, is it possible to get Rhdc on local + ghost DOFs from

etaf = TestFunction(Vh1dc)
Rhdc = assemble_vector(form(etaf*dx))

Or is it possible to restrict c2v to just the local elements?
c2v = mesh.topology.connectivity(mesh.topology.dim, 0)

I would use the DOLFINx assemblers, instead of the petsc assemblers, i.e.

from dolfinx.fem import assemble_vector

as this returns a dolfinx.la.Vector, whose underlying array (dolfinx.la.Vector.array) contains both local and ghost data.
It is possible to get these with PETSc as well (I do not remember how off the top of my head though).
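
For example, a minimal sketch along these lines, reusing the Vh1dc from the MWE above and assuming a recent DOLFINx where dolfinx.la.InsertMode is available; after a reverse scatter (accumulate ghost contributions at their owners) and a forward scatter (update ghost entries from their owners), the array holds the owned values followed by the ghost values:

from dolfinx import la
from dolfinx.fem import assemble_vector, form
from ufl import TestFunction, dx

etaf = TestFunction(Vh1dc)
Rhdc = assemble_vector(form(etaf * dx))   # dolfinx.la.Vector, not a PETSc Vec
Rhdc.scatter_reverse(la.InsertMode.add)   # send ghost contributions to their owners
Rhdc.scatter_forward()                    # update ghost entries from their owners

imap = Vh1dc.dofmap.index_map
bs = Vh1dc.dofmap.index_map_bs
print(len(Rhdc.array), (imap.size_local + imap.num_ghosts) * bs)  # equal lengths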


Thanks @dokken, can you please point me to a similar approach for the function spaces as well, so that they also give local + ghost data and I can keep everything consistent?

Right now, the function data I am looking at is limited to the locally owned part.
MWE shown below:

from dolfinx.fem import Function

Vh0 = functionspace(mesh, ("DG", 0))
density = Function(Vh0)
density.vector.set(1.0)
print("Size of Density array: ", len(density.vector.array))

Vh1 = functionspace(mesh, ("CG", 1))
phi = Function(Vh1)
phi.vector.set(0.5)
print("Size of phi array: ", len(phi.vector.array))

Results for 2 processes:

----------------- RANK 0 -----------------
Number of local elements: 2
Number of local nodes: 3
Number of global elements: 4
Number of global nodes: 6
Number of local + ghost elements: 3
Number of local + ghost nodes: 5
Number of c2v: 3
Size of Rhdc array: 9

Size of Density array: 2
Size of phi array: 3
----------------- RANK 1 -----------------
Number of local elements: 2
Number of local nodes: 3
Number of global elements: 4
Number of global nodes: 6
Number of local + ghost elements: 3
Number of local + ghost nodes: 5
Number of c2v: 3
Size of Rhdc array: 9

Size of Density array: 2
Size of phi array: 3

Further, if you could point me in the right direction for determining the global data from this local + ghost data using the IndexMap, that would be really helpful. Thanks!

I would suggest reading through this first: Parallel Computations with Dolfinx using MPI — MPI tutorial, then try to refine your question a bit, as I do not have time to repeat all the various posts I have written on the forum about this, including: What's the best way to assemble a form? - #3 by dokken, DofMap and ghost values - #3 by dokken, and many others.
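
A minimal sketch of the pattern described in those posts, assuming a Function u on some function space V (placeholder names, not from this thread): the array of a dolfinx.la.Vector stores the owned entries first, followed by the ghost entries, and the IndexMap gives the global numbering of both parts.

import numpy as np

imap = V.dofmap.index_map
bs = V.dofmap.index_map_bs

u.x.scatter_forward()                 # make ghost values consistent with their owners
num_owned = imap.size_local * bs

owned_values = u.x.array[:num_owned]  # locally owned part only
ghost_values = u.x.array[num_owned:]  # ghost part (owned by other processes)

# Global (block) indices of the owned and ghost parts
owned_global = np.arange(imap.local_range[0], imap.local_range[1])
ghost_global = imap.ghosts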

Thanks @dokken, this is really very helpful.

@dokken, can you please suggest the procedure for converting a (local + ghost) vector or function array to just the local (owned) vector?