Cannot access a non-const vector for huge mesh problem

Hello all,

When I ran a simple linear elastic problem on a huge mesh (63 million nodes and 127 million elements) with "mpirun -n 40 python fenics_mpi.py", I got the error below:


Traceback (most recent call last):
  File "fenics_mpi.py", line 47, in <module>
    fe.solve(a == L, u, bc)
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 220, in solve
    _solve_varproblem(*args, **kwargs)
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 247, in _solve_varproblem
    solver.solve()
RuntimeError: 
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     fenics-support@googlegroups.com
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to access vector of degrees of freedom.
*** Reason:  Cannot access a non-const vector from a subfunction.
*** Where:   This error was encountered inside Function.cpp.
*** Process: 39
*** 
*** DOLFIN version: 2019.1.0
*** Git changeset:  77c758717da1cb8017bf95109363bb1a91a4f8e5
*** -------------------------------------------------------------------------

Here is my script that produces this error:

import fenics as fe
from ufl import nabla_div
model_name = 'modelname'  # name of model
domain = ((0, 0), (2560, 2560))  # the domain size
# read the mesh
fe_mesh = fe.Mesh()
with fe.XDMFFile(model_name + '.xdmf') as infile:
    infile.read(fe_mesh)

E, nu = fe.Constant(50), fe.Constant(0.3)
model = "plane_stress"
mu = E/2/(1+nu)
lmbda = E*nu/(1+nu)/(1-2*nu)
if model == "plane_stress":
    lmbda = 2*mu*lmbda/(lmbda+2*mu)
def eps(u):
    return 0.5*(fe.nabla_grad(u) + fe.nabla_grad(u).T)
def sigma(u):
    return lmbda*nabla_div(u)*fe.Identity(2) + 2 * mu * eps(u)

# boundaries
bounds = fe.MeshFunction('size_t', fe_mesh, fe_mesh.geometry().dim()-1, 9999)
bd_left = fe.CompiledSubDomain('near(x[0], ' + str(domain[0][0]) + ', 1e-8)').mark(bounds, 0)
bd_right = fe.CompiledSubDomain('near(x[0], ' + str(domain[1][0]) + ', 1e-8)').mark(bounds, 1)
bd_bot = fe.CompiledSubDomain('near(x[1], ' + str(domain[0][1]) + ', 1e-8)').mark(bounds, 2)
bd_top = fe.CompiledSubDomain('near(x[1], ' + str(domain[1][1]) + ', 1e-8)').mark(bounds, 3)

VFS = fe.VectorFunctionSpace(fe_mesh, 'P', 1)
du = fe.TrialFunction(VFS)
u_ = fe.TestFunction(VFS)
dx = fe.Measure('dx', domain=fe_mesh)
ds = fe.Measure('ds', domain=fe_mesh)
dsn = fe.Measure('ds', domain=fe_mesh, subdomain_data=bounds)
a = fe.inner(sigma(du), eps(u_))*dx

# Boundary conditions 
# Neumann BC
L = fe.inner(fe.Constant((0, 0)), u_)*dx
# DirichletBC
expr = fe.Expression(('0', 'x[1]/' + str(domain[1][1]-domain[0][1])), degree=1)
bc = [fe.DirichletBC(VFS, fe.Constant((0., 0.)), bounds, 2)]
bc += [fe.DirichletBC(VFS, fe.Constant((0., 1.)), bounds, 3)]
bc += [fe.DirichletBC(VFS, expr, bounds, 0)]
bc += [fe.DirichletBC(VFS, expr, bounds, 1)]

# Resolution
u = fe.Function(VFS, name="Displacement")
fe.solve(a == L, u, bc)

I first thought there might be some issue with my mesh, so I subdivided the huge mesh into four parts and solved each part individually. Done that way, I have no issue.
Adding solver_parameters={'linear_solver': 'mumps'} to the solve call did not help either.

Any suggestions are welcome.
Thank you.

Although it could be a matter of convention, you don't need nabla_grad and nabla_div here. For almost all elasticity applications, grad and div work just fine: sym(grad(u)) equals sym(nabla_grad(u)) since the two gradients are transposes of each other, and for a vector field nabla_div coincides with div. So:

from dolfin import sym, grad, tr, Identity

def eps(u):
    return sym(grad(u))

def sigma(u):
    return lmbda * tr(eps(u)) * Identity(len(u)) + 2*mu*eps(u)

should work. As for the solver: for large problems, try an iterative solver with a preconditioner, since direct solvers can run out of memory.
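For instance, something along these lines (a sketch only, reusing a, L, u and bc from your script; I have not run this on your problem):

# Sketch: preconditioned CG instead of a direct solver; 'cg' and 'amg'
# are solver/preconditioner names accepted by dolfin 2019.1.0.
fe.solve(a == L, u, bc,
         solver_parameters={'linear_solver': 'cg',
                            'preconditioner': 'amg'})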

Edit: printing a more verbose error log could help in pinning down the source of the error, for instance:

set_log_level(10)

I don't have the mesh, so I cannot reproduce your code. If your problem works on a coarser mesh, then try a Krylov solver (cg) with a suitable preconditioner such as amg.
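If you want more control over the Krylov solve (tolerances, convergence monitoring), you can also drive the solver object directly. Again only a sketch, under the same assumptions as above:

# Sketch: assemble once with the BCs applied symmetrically, then solve
# with CG + AMG while printing the residual at each iteration.
A, b = fe.assemble_system(a, L, bc)
solver = fe.PETScKrylovSolver("cg", "amg")
solver.parameters["relative_tolerance"] = 1e-8
solver.parameters["monitor_convergence"] = True
solver.solve(A, u.vector(), b)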

Hi Bhavesh,
Really appreciate your help. I have modified my eps and sigma functions. The error log with set_log_level(10) is shown below:

Process 0: Building distributed mesh
Process 0: Compute partition of cells across processes
Process 0: Build mesh dual graph
Process 0: Build local part of mesh dual graph
Process 1: Building distributed mesh
Process 1: Compute partition of cells across processes
Process 1: Build mesh dual graph
Process 1: Build local part of mesh dual graph
Process 0: Elapsed wall, usr, sys time: 32.8091, 30.24, 2.49 (Compute local part of mesh dual graph)
Process 0: Build nonlocal part of mesh dual graph
Process 1: Elapsed wall, usr, sys time: 33.1491, 30.35, 2.72 (Compute local part of mesh dual graph)
Process 1: Build nonlocal part of mesh dual graph
Process 1: Elapsed wall, usr, sys time: 0.0276186, 0.03, 0 (Compute non-local part of mesh dual graph)
Process 0: Elapsed wall, usr, sys time: 0.373477, 0.35, 0.01 (Compute non-local part of mesh dual graph)
Process 0: Compute graph partition using PT-SCOTCH
Process 1: Compute graph partition using PT-SCOTCH
Process 1: Elapsed wall, usr, sys time: 0.0247914, 0.02, 0 (SCOTCH: call SCOTCH_dgraphBuild)
Process 0: Elapsed wall, usr, sys time: 0.0247842, 0.02, 0 (SCOTCH: call SCOTCH_dgraphBuild)
Process 0: Elapsed wall, usr, sys time: 30.3988, 28.61, 1.73 (SCOTCH: call SCOTCH_dgraphPart)
Process 1: Elapsed wall, usr, sys time: 30.6138, 29.15, 1.39 (SCOTCH: call SCOTCH_dgraphPart)
Process 0: Elapsed wall, usr, sys time: 0.427098, 0.35, 0.08 (SCOTCH: call SCOTCH_dgraphHalo)
Process 0: Elapsed wall, usr, sys time: 1.571e-06, 0, 0 (Get SCOTCH graph data)
Process 1: Elapsed wall, usr, sys time: 0.415835, 0.33, 0.09 (SCOTCH: call SCOTCH_dgraphHalo)
Process 1: Elapsed wall, usr, sys time: 1.798e-06, 0, 0 (Get SCOTCH graph data)
Process 1: Elapsed wall, usr, sys time: 0.369449, 0.37, 0 (Extract partition boundaries from SCOTCH graph)
Process 0: Elapsed wall, usr, sys time: 0.376192, 0.37, 0 (Extract partition boundaries from SCOTCH graph)
Process 1: Elapsed wall, usr, sys time: 33.3443, 31.7, 1.56 (Compute graph partition (SCOTCH))
Process 0: Elapsed wall, usr, sys time: 33.7291, 31.58, 2.08 (Compute graph partition (SCOTCH))
Process 0: Distribute mesh (cell and vertices)
Process 1: Distribute mesh (cell and vertices)
Process 0: Distribute cells during distributed mesh construction
Process 1: Distribute cells during distributed mesh construction
Process 1: Elapsed wall, usr, sys time: 8.43924, 5.84, 2.57 (Distribute cells)
Process 0: Elapsed wall, usr, sys time: 8.63252, 6.12, 2.49 (Distribute cells)
Process 1: Distribute vertices during distributed mesh construction
Process 0: Distribute vertices during distributed mesh construction
Process 1: Build shared vertices during distributed mesh construction
Process 0: Build shared vertices during distributed mesh construction
Process 1: Elapsed wall, usr, sys time: 28.5588, 26.44, 2.05 (Distribute vertices)
Process 1: Elapsed wall, usr, sys time: 62.9803, 57.41, 5.41 (Distribute mesh (cells and vertices))
Process 1: Build local mesh during distributed mesh construction
Process 0: Elapsed wall, usr, sys time: 27.7168, 25.4, 2.26 (Distribute vertices)
Process 0: Elapsed wall, usr, sys time: 63.4013, 57.78, 5.47 (Distribute mesh (cells and vertices))
Process 0: Build local mesh during distributed mesh construction
Process 1: Elapsed wall, usr, sys time: 22.2812, 21.79, 0.44 (Build local part of distributed mesh (from local mesh data))
Process 0: Elapsed wall, usr, sys time: 22.4858, 22.03, 0.39 (Build local part of distributed mesh (from local mesh data))
Process 0: Computing mesh entities of dimension 1
Process 1: Computing mesh entities of dimension 1
Process 1: Elapsed wall, usr, sys time: 37.4393, 35.86, 1.48 (Compute entities dim = 1)
Process 1: Requesting connectivity 1 - 2.
Process 1: Requesting connectivity 2 - 1.
Process 1: Computing mesh connectivity 1 - 2 from transpose.
Process 0: Elapsed wall, usr, sys time: 38.9781, 37.42, 1.46 (Compute entities dim = 1)
Process 0: Requesting connectivity 1 - 2.
Process 0: Requesting connectivity 2 - 1.
Process 0: Computing mesh connectivity 1 - 2 from transpose.
Process 1: Elapsed wall, usr, sys time: 3.23042, 3.22, 0.01 (Compute connectivity 1-2)
Process 1: Number mesh entities for distributed mesh (for specified vertex ids).
Process 0: Elapsed wall, usr, sys time: 3.60629, 3.53, 0.08 (Compute connectivity 1-2)
Process 0: Number mesh entities for distributed mesh (for specified vertex ids).
Process 1: Compute ownership for mesh entities of dimension 1.
Process 0: Compute ownership for mesh entities of dimension 1.
Process 1: Elapsed wall, usr, sys time: 11.0693, 9.84, 1.19 (Compute mesh entity ownership)
Process 0: Elapsed wall, usr, sys time: 6.97748, 5.89, 1.07 (Compute mesh entity ownership)
Process 0: Elapsed wall, usr, sys time: 60.7043, 56.04, 4.51 (Number mesh entities for distributed mesh (for specified vertex ids))
Process 0: Elapsed wall, usr, sys time: 60.9086, 56.24, 4.51 (Number distributed mesh entities)
Process 1: Elapsed wall, usr, sys time: 62.9243, 57.82, 4.95 (Number mesh entities for distributed mesh (for specified vertex ids))
Process 1: Elapsed wall, usr, sys time: 63.2047, 58.05, 5 (Number distributed mesh entities)
Process 0: Elapsed wall, usr, sys time: 263.355, 245.48, 17.23 (Build distributed mesh from local mesh data)
Process 1: Elapsed wall, usr, sys time: 263.765, 245.52, 17.62 (Build distributed mesh from local mesh data)
Process 0: Computing sub domain markers for sub domain 0.
Process 1: Computing sub domain markers for sub domain 0.
Process 1: Computing sub domain markers [=======>                                ] 17.7%
Process 0: Computing sub domain markers [======>                                 ] 17.4%
Process 0: Computing sub domain markers [=============>                          ] 34.7%
Process 1: Computing sub domain markers [==============>                         ] 35.5%
Process 0: Computing sub domain markers [====================>                   ] 52.1%
Process 1: Computing sub domain markers [=====================>                  ] 53.2%
Process 1: Computing sub domain markers [============================>           ] 70.9%
Process 0: Computing sub domain markers [===========================>            ] 69.4%
Process 1: Computing sub domain markers [===================================>    ] 88.7%
Process 0: Computing sub domain markers [==================================>     ] 86.8%
Process 1: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers for sub domain 1.
Process 1: Computing sub domain markers for sub domain 1.
Process 0: Computing sub domain markers [=======>                                ] 19.5%
Process 1: Computing sub domain markers [=======>                                ] 19.9%
Process 0: Computing sub domain markers [===============>                        ] 39.1%
Process 1: Computing sub domain markers [===============>                        ] 39.9%
Process 0: Computing sub domain markers [=======================>                ] 58.6%
Process 1: Computing sub domain markers [=======================>                ] 59.8%
Process 0: Computing sub domain markers [===============================>        ] 78.1%
Process 1: Computing sub domain markers [===============================>        ] 79.8%
Process 1: Computing sub domain markers [=======================================>] 99.7%
Process 0: Computing sub domain markers [=======================================>] 97.6%
Process 1: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers for sub domain 2.
Process 1: Computing sub domain markers for sub domain 2.
Process 1: Computing sub domain markers [=======>                                ] 19.9%
Process 0: Computing sub domain markers [=======>                                ] 19.5%
Process 1: Computing sub domain markers [===============>                        ] 39.9%
Process 0: Computing sub domain markers [===============>                        ] 39.1%
Process 1: Computing sub domain markers [=======================>                ] 57.6%
Process 0: Computing sub domain markers [======================>                 ] 56.4%
Process 1: Computing sub domain markers [===============================>        ] 77.6%
Process 0: Computing sub domain markers [==============================>         ] 75.9%
Process 1: Computing sub domain markers [=======================================>] 97.5%
Process 0: Computing sub domain markers [======================================> ] 95.5%
Process 1: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers [========================================] 100.0%
Process 0: Computing sub domain markers for sub domain 3.
Process 1: Computing sub domain markers for sub domain 3.
Process 1: Computing sub domain markers [=======>                                ] 17.7%
Process 0: Computing sub domain markers [=======>                                ] 19.5%
Process 1: Computing sub domain markers [==============>                         ] 35.5%
Process 0: Computing sub domain markers [===============>                        ] 39.1%
Process 1: Computing sub domain markers [=====================>                  ] 53.2%
Process 0: Computing sub domain markers [=======================>                ] 58.6%
Process 1: Computing sub domain markers [============================>           ] 70.9%
Process 0: Computing sub domain markers [===============================>        ] 78.1%
Process 1: Computing sub domain markers [===================================>    ] 88.7%
Process 0: Computing sub domain markers [=======================================>] 97.6%
Process 0: Computing sub domain markers [========================================] 100.0%
Process 1: Computing sub domain markers [========================================] 100.0%
Process 0: Elapsed wall, usr, sys time: 1.307e-06, 0, 0 (Number distributed mesh entities)
Process 1: Elapsed wall, usr, sys time: 1.347e-06, 0, 0 (Number distributed mesh entities)
Process 0: Elapsed wall, usr, sys time: 6.54318, 6.3, 0.22 (Init dofmap from UFC dofmap)
Process 1: Elapsed wall, usr, sys time: 6.62935, 6.21, 0.4 (Init dofmap from UFC dofmap)
Process 0: Determining node ownership for parallel dof map
Process 1: Determining node ownership for parallel dof map
Process 1: Finished determining dof ownership for parallel dof map
Process 0: Finished determining dof ownership for parallel dof map
Process 1: Elapsed wall, usr, sys time: 0.0117398, 0.01, 0 (SCOTCH: call SCOTCH_graphBuild)
Process 0: Elapsed wall, usr, sys time: 0.0123547, 0.02, 0 (SCOTCH: call SCOTCH_graphBuild)
Process 0: Elapsed wall, usr, sys time: 7.4089, 7.35, 0.05 (SCOTCH: call SCOTCH_graphOrder)
Process 0: Elapsed wall, usr, sys time: 8.93795, 8.54, 0.38 (Compute SCOTCH graph re-ordering)
Process 1: Elapsed wall, usr, sys time: 8.4239, 8.36, 0.04 (SCOTCH: call SCOTCH_graphOrder)
Process 1: Elapsed wall, usr, sys time: 9.90392, 9.53, 0.36 (Compute SCOTCH graph re-ordering)
Process 1: Elapsed wall, usr, sys time: 48.7861, 46.76, 1.91 (Init dofmap)
Process 0: Elapsed wall, usr, sys time: 49.4835, 47.52, 1.84 (Init dofmap)
Process 0: Elapsed wall, usr, sys time: 0.000212399, 0, 0 (Apply (PETScVector))
Process 1: Elapsed wall, usr, sys time: 0.0412292, 0.02, 0.02 (Apply (PETScVector))
Process 0: Elapsed wall, usr, sys time: 0.000161234, 0, 0 (Apply (PETScVector))
Process 1: Elapsed wall, usr, sys time: 0.000541833, 0, 0 (Apply (PETScVector))
Process 0: Elapsed wall, usr, sys time: 1.73358, 1.7, 0.02 (Init dof vector)
Process 1: Elapsed wall, usr, sys time: 1.73167, 1.7, 0.02 (Init dof vector)
Process 0: Solving linear variational problem.
Process 1: Solving linear variational problem.
Traceback (most recent call last):
  File "fenics_mpi.py", line 142, in <module>
    fe.solve(a == L, u, bc, solver_parameters={'linear_solver': 'mumps'})
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 220, in solve
    _solve_varproblem(*args, **kwargs)
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 247, in _solve_varproblem
    solver.solve()
RuntimeError: 
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     fenics-support@googlegroups.com
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to access vector of degrees of freedom.
*** Reason:  Cannot access a non-const vector from a subfunction.
*** Where:   This error was encountered inside Function.cpp.
*** Process: 0
*** 
*** DOLFIN version: 2019.1.0
*** Git changeset:  77c758717da1cb8017bf95109363bb1a91a4f8e5
*** -------------------------------------------------------------------------
Traceback (most recent call last):
  File "fenics_mpi.py", line 142, in <module>
    fe.solve(a == L, u, bc, solver_parameters={'linear_solver': 'mumps'})
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 220, in solve
    _solve_varproblem(*args, **kwargs)
  File "/users/miniconda/envs/fenics/lib/python3.7/site-packages/dolfin/fem/solving.py", line 247, in _solve_varproblem
    solver.solve()
RuntimeError: 
*** -------------------------------------------------------------------------
*** DOLFIN encountered an error. If you are not able to resolve this issue
*** using the information listed below, you can ask for help at
***
***     fenics-support@googlegroups.com
***
*** Remember to include the error message listed below and, if possible,
*** include a *minimal* running example to reproduce the error.
***
*** -------------------------------------------------------------------------
*** Error:   Unable to access vector of degrees of freedom.
*** Reason:  Cannot access a non-const vector from a subfunction.
*** Where:   This error was encountered inside Function.cpp.
*** Process: 1
*** 
*** DOLFIN version: 2019.1.0
*** Git changeset:  77c758717da1cb8017bf95109363bb1a91a4f8e5
*** -------------------------------------------------------------------------

Any comments on this log? It feels like everything works well but then suddenly stops in the solver. I would like to solve this problem with a direct solver, and I can provide enough memory. If there is nothing wrong with my code and mesh and this is just a solver limitation, I may switch to an iterative solver.

Thank you.

Did you try to solve on a coarser mesh? That should help you make sure that the formulation is correct.
Also, are you sure that your mesh is marked properly? I would avoid marking with 0 in general:

import dolfin as fe
bounds = fe.MeshFunction("size_t", fe_mesh, fe_mesh.geometry().dim() - 1)
bounds.set_all(0)
bd_left = fe.CompiledSubDomain("near(x[0], left_pt)", left_pt=domain[0][0]).mark(bounds, 1)
bd_right = fe.CompiledSubDomain("near(x[0], right_pt)", right_pt=domain[1][0]).mark(bounds, 2)
...
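One way to check the marking is to write the MeshFunction to a file and look at it, for example in ParaView. A minimal sketch (the filename is just a placeholder):

# Write the facet markers so they can be inspected visually;
# XDMFFile can store a MeshFunction directly.
with fe.XDMFFile("bounds_check.xdmf") as out:
    out.write(bounds)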

Your formulation seems correct to me:

from dolfin import *
set_log_level(30)

msh  = RectangleMesh(Point((0,0)),Point((10,10)), 10, 10)
dmns = ((0,0), (10, 10))
bounds = MeshFunction("size_t", msh, msh.geometry().dim() - 1)
bounds.set_all(0)
left = CompiledSubDomain("near(x[0], 0) && on_boundary").mark(bounds, 1)
right = CompiledSubDomain("near(x[0], 10) && on_boundary").mark(bounds, 2)
top = CompiledSubDomain("near(x[1], 10) && on_boundary").mark(bounds, 3)
bottom = CompiledSubDomain("near(x[1], 0) && on_boundary").mark(bounds, 4)

V = VectorFunctionSpace(msh, "CG", 1)
du = TestFunction(V)
u_ = TrialFunction(V)
u = Function(V)

E, nu = Constant(50), Constant(0.3)
mu = E/2/(1+nu)
lmbda = 2*mu*nu/(1-2*nu)

eps_u = lambda x: sym(grad(x))
sigma_u = lambda x: 2*mu*eps_u(x) + lmbda * tr(eps_u(x)) * Identity(len(x))

vload = Expression(("0", "x[1]/val"), val=dmns[1][1] - dmns[0][1], degree=1)
bcb = DirichletBC(V, Constant((0,0)), bounds, 4)
bct = DirichletBC(V, Constant((0,1)), bounds, 3)
bcl = DirichletBC(V, vload, bounds, 1)
bcr = DirichletBC(V, vload, bounds, 2)

bcs = [bcb, bct, bcl, bcr]

a = inner(sigma_u(u_), eps_u(du))*dx
L = inner(Constant((0, 0)), du)*dx

solve(a==L, u, bcs)

runs fine.

Many thanks for your help. Now I strongly suspect that the error comes from my mesh. I will check that first.