Periodic Solver Divergence

I am trying to solve a flow problem on a complicated mesh. Occasionally the nonlinear solver reports that the solution failed to converge in 0 iterations. I have only observed this on meshes with a periodicity condition, and only when using more than a certain number of threads. An MWE is pasted below:


from dolfin import *

mesh = UnitCubeMesh(20, 20, 20)

# Setting up periodicity on the y and z faces
class PeriodicDomain(SubDomain):
    def inside(self, x, on_boundary):
        # True for boundary points on the y = 0 or z = 0 faces, excluding
        # the (y = 0, z = 1) and (y = 1, z = 0) edges
        return bool(
            (near(x[1], 0) or near(x[2], 0))
            and not (
                (near(x[1], 0) and near(x[2], 1.0))
                or (near(x[1], 1.0) and near(x[2], 0.0))
            )
            and on_boundary
        )

    def map(self, x, y):
        # Map points on the y = 1 and z = 1 faces back onto y = 0 and z = 0
        if near(x[1], 1.0) and near(x[2], 1.0):
            y[0] = x[0]
            y[1] = x[1] - 1.0
            y[2] = x[2] - 1.0
        elif near(x[1], 1.0):
            y[0] = x[0]
            y[1] = x[1] - 1.0
            y[2] = x[2]
        elif near(x[2], 1.0):
            y[0] = x[0]
            y[1] = x[1]
            y[2] = x[2] - 1.0
        else:
            # Points not on a mapped face are sent far outside the domain
            y[0] = -1000
            y[1] = -1000
            y[2] = -1000

pbc = PeriodicDomain()

#Spaces, function definitions
lin_space2 = FunctionSpace(mesh, "CG", 2, constrained_domain = pbc)

u2 = Function(lin_space2)
v2 = TestFunction(lin_space2)

f = Constant(1.0)

# Weak form (residual) and Jacobian
F2 = inner(grad(u2), grad(v2))*dx + inner(f, v2)*dx
J2 = derivative(F2, u2)
pde2 = NonlinearVariationalProblem(F2, u2, [], J2)
solver2 = NonlinearVariationalSolver(pde2)
solver2.solve()

I am running DOLFIN 2018.1.0 on a 28-core Ubuntu 18.04 system with 512 GB of memory. Using 20 or more threads results in the error


*** -------------------------------------------------------------------------
*** Error:   Unable to solve linear system using PETSc Krylov solver.
*** Reason:  Solution failed to converge in 0 iterations (PETSc reason DIVERGED_PCSETUP_FAILED, residual norm ||r|| = 0.000000e+00).
*** Where:   This error was encountered inside PETScKrylovSolver.cpp.
*** Process: 19
***
*** DOLFIN version: 2018.1.0
*** Git changeset:  unknown
*** -------------------------------------------------------------------------

In this MWE, increasing or decreasing the number of elements can remove the error, as does removing the periodicity condition; I only ever observe the error when the periodic constraint is in place. Does anyone know what might be the issue?

Are you sure this problem is well posed?

Hi Nate,
Thank you for the reply. The MWE I showed was not well-posed; it was a watered-down version of my own problem for the purposes of illustration. A similar, well-posed problem which still returns the error above is shown below:


from dolfin import *

mesh = UnitCubeMesh(20, 20, 20)

# Dirichlet boundaries: the x = 1 (Top) and x = 0 (Bottom) faces
class Top(SubDomain):
    def inside(self, x, on_boundary):
        return on_boundary and x[0] > 1.0 - DOLFIN_EPS

class Bottom(SubDomain):
    def inside(self, x, on_boundary):
        return on_boundary and x[0] < DOLFIN_EPS

# Setting up periodicity on the y and z faces
class PeriodicDomain(SubDomain):
    def inside(self, x, on_boundary):
        # True for boundary points on the y = 0 or z = 0 faces, excluding
        # the (y = 0, z = 1) and (y = 1, z = 0) edges
        return bool(
            (near(x[1], 0) or near(x[2], 0))
            and not (
                (near(x[1], 0) and near(x[2], 1.0))
                or (near(x[1], 1.0) and near(x[2], 0.0))
            )
            and on_boundary
        )

    def map(self, x, y):
        # Map points on the y = 1 and z = 1 faces back onto y = 0 and z = 0
        if near(x[1], 1.0) and near(x[2], 1.0):
            y[0] = x[0]
            y[1] = x[1] - 1.0
            y[2] = x[2] - 1.0
        elif near(x[1], 1.0):
            y[0] = x[0]
            y[1] = x[1] - 1.0
            y[2] = x[2]
        elif near(x[2], 1.0):
            y[0] = x[0]
            y[1] = x[1]
            y[2] = x[2] - 1.0
        else:
            # Points not on a mapped face are sent far outside the domain
            y[0] = -1000
            y[1] = -1000
            y[2] = -1000

pbc = PeriodicDomain()

#Spaces, function definitions
lin_space2 = FunctionSpace(mesh, "CG", 2, constrained_domain = pbc)

BC1 = DirichletBC(lin_space2, Constant(5.0), Top())
BC2 = DirichletBC(lin_space2, Constant(0.0), Bottom())
BCs = [BC1, BC2]

u2 = Function(lin_space2)
v2 = TestFunction(lin_space2)

f = Constant(0.0)

# Weak form (residual) and Jacobian
F2 = inner(grad(u2), grad(v2))*dx + inner(f, v2)*dx
J2 = derivative(F2, u2)
pde2 = NonlinearVariationalProblem(F2, u2, BCs, J2)
solver2 = NonlinearVariationalSolver(pde2)
solver2.solve()

As before, when the periodicity constraint is removed from the function space, there is no error.
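Concretely, by removing the periodicity I just mean constructing the space without the constrained_domain argument, with everything else unchanged:

# Same space without the periodic constraint; with this, the parallel runs complete without error
lin_space2 = FunctionSpace(mesh, "CG", 2)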

I don’t quite understand your problem. The new code snippet you’ve provided with DirichletBCs works fine for me…

Thank you for testing it; it also works for me. My problem is that when I call the script with 20 or more threads on my system, i.e. when I run it with mpirun -n 20 python3 FileName.py, the solver instantly returns the 'failed to converge in 0 iterations' error, even though serial runs and 1 to 19 threads work fine.
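For reference, a quick way to confirm how many processes the script actually sees is a check along these lines (a small sketch using legacy DOLFIN's MPI wrappers; not part of the MWE above):

from dolfin import *

# Print the MPI layout this run actually sees
comm = MPI.comm_world
print("Process %d of %d" % (MPI.rank(comm), MPI.size(comm)))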

This only occurs with some meshes, and only when I use the periodic boundary condition. I am unsure why this happens, or whether it is something that only occurs on my system (I do not have another cluster to test it on). However, if you don't observe this thread-dependent error either, I will just assume it's an issue with my system.

Running with up to 30 MPI processes (not concurrent threads) on the 20^3 mesh, I see no issues. I've attached an image of the mesh partitioning in case that helps. Additionally, I'm using MUMPS as the linear solver.
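For reference, MUMPS can be selected for the Newton solver's linear solves roughly as follows (a sketch against the snippet above, using the standard legacy DOLFIN Newton solver parameters):

solver2 = NonlinearVariationalSolver(pde2)
# Use the MUMPS direct solver for the linearised systems inside Newton's method
solver2.parameters["newton_solver"]["linear_solver"] = "mumps"
solver2.solve()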

Sorry for the late reply, but thank you for your help, Nate. I'm unsure what the issue is, as I'm using MUMPS too. Updating to the latest stable release hasn't fixed the problem for me either, and it's becoming a much bigger issue in my full problem.

If you have the time, I would appreciate a pointer to how I can probe the mesh to see how it's partitioned across processes (like you've shown). Otherwise, I'll just assume it's a problem with the system I'm on and try to find ways around the error when it appears.
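To be clear about what I mean by probing the partitioning, something along these lines is what I'm after (a rough sketch assuming legacy DOLFIN's MeshFunction and XDMFFile APIs; I haven't verified it reproduces a plot like yours):

from dolfin import *

mesh = UnitCubeMesh(20, 20, 20)

# Tag every local cell with the MPI rank that owns it
rank = MPI.rank(mesh.mpi_comm())
partition = MeshFunction("size_t", mesh, mesh.topology().dim(), rank)

# Write to XDMF and colour by this cell function in e.g. ParaView
# (run in parallel, e.g. mpirun -n 20, then open partition.xdmf)
XDMFFile(mesh.mpi_comm(), "partition.xdmf").write(partition)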