Extreme Memory Usage for Legacy FEniCS Mixed Formulation Code

Hi everyone,
I'm trying to run some code using a mixed finite element formulation for incompressible flow in FEniCS, and although the code works fine, it uses an excessive amount of memory. The code uses Taylor-Hood elements for velocity and pressure, a "Real" element for a Lagrange multiplier, and an implicit BDF3 formulation for the time discretization. For a 1 million element 3D mesh, around 900 GB of RAM is needed on an HPC cluster, which is excessive. I wanted to see if anyone here knows why so much memory is being used and how I can fix it. I am currently using a direct solver and have tried switching to a few iterative solvers, but the memory usage is still excessive, and some of them gave completely different results. A simple use of the code for low Reynolds number flow through a rectangular domain is shown below for the sake of reproducibility. Even for this case I got a segmentation fault from running out of memory and had to run it on a cluster. The code is as follows:
Thank you!


from dolfin import *
from math import log
import numpy

parameters["std_out_all_processes"] = False;
parameters['ghost_mode'] = 'shared_facet'

########################
tol=1E-10

#######################
K=1 # Polynomial Order

################
dt=0.1 # Time Step
Time=30 # Total Time
n_u = 0.001568 # Increased viscosity from air to get low Re
rho = 1 # Approx. density of air
V_in = 0.05; # Inlet Velocity
V_initial = V_in # Initial Velocity
P_out = 0.0 # Outlet (atmospheric pressure)
P_initial = 1.0 # Initial Pressure

########################
count = 0 # File output counter

###################### Read Mesh in and define boundaries
el_x = 50;
el_y = el_x
mesh=RectangleMesh(Point(0.0, 0.0), Point(4.0, 1.0), el_x, el_y)
inlet = 'near(x[0],0)'
outlet = 'near(x[0], 4)'
walls = 'near(x[1], 0) || near(x[1], 1)'

###################### Define Function Spaces and Elements
U=VectorFunctionSpace(mesh,'CG',K+1) # for velocity
P=FunctionSpace(mesh,'CG',K) # for pressure
NU=FunctionSpace(mesh, 'Real', 0) # for lagrange multiplier

U_el=VectorElement("CG", mesh.ufl_cell(), K+1) ### CG element is chosen for velocity
P_el= FiniteElement("CG", mesh.ufl_cell(), K) ### CG space is chosen for pressure
Nu_el=FiniteElement("Real", mesh.ufl_cell(), 0) # Lagrange Multiplier
W_1=MixedElement([U_el, P_el, Nu_el])
W=FunctionSpace(mesh, W_1)

du=TrialFunction(W)
upnu=Function(W)
u,p,nu=split(upnu)
lg_n=Function(NU)
vqnv=TestFunction(W)
v,q,nv=split(vqnv)

####################### Initial Conditions
Pressure=Expression("P", P = P_initial, degree=1)
Velocity=Expression(("Vx", "0.0"), Vx = V_initial, degree=1)

u_1=Function(U)
u_2=Function(U)
u_3=Function(U)
p_1=Function(P)
p_2=Function(P)
p_3=Function(P)

lg_n= interpolate(Constant(1.0),NU)
u_1=interpolate(Velocity,U)
p_1=interpolate(Pressure,P)
u_2=interpolate(Velocity,U)
p_2=interpolate(Pressure,P)
u_3=interpolate(Velocity,U)
p_3=interpolate(Pressure,P)

############################################# BCs
UNoslip=DirichletBC(W.sub(0),Constant((0.0, 0.0)),walls) # Wall no-slip
UbcIn=DirichletBC(W.sub(0),Constant((V_in, 0.0)),inlet) # Inlet Velocity
PbcOut = DirichletBC(W.sub(1), Constant(P_out), outlet) # Outlet Pressure
bcs=[UNoslip, UbcIn, PbcOut]

############################################# Formulation Variables
h = CellDiameter(mesh) # if you want to use the edge diameter, use FacetArea
h_avg = (h('+') + h('-'))/2.0
n = FacetNormal(mesh)
C_a=6.0*(K+1.0)*(K+2.0)/2.0

#############################################
cons=6.0/11.0*dt # BDF3 hf Constant
iden=Identity(2) # Identity Matrix

########################
L_1=dot(u,v)*dx-18.0/11.0*dot(u_3,v)*dx+9.0/11.0*dot(u_2,v)*dx-2.0/11.0*dot(u_1,v)*dx

########################
L_2=-cons*div(u)*q*dx+cons*nv*(1/rho)*p*dx+cons*nu*q*dx

#########################
L_3=cons*dot(dot(u,nabla_grad(u)),v)*dx\
+cons*1/2*dot(div(u)*u,v)*dx\
-cons*dot(u('+'),n('+'))*dot(u('+')-u('-'),avg(v))*dS\
+0.5*cons*abs(dot(u('+'),n('+')))*dot(u('+')-u('-'),v('+')-v('-'))*dS\
-(1/rho)*cons*p*div(v)*dx ########### upwind flux

###########################
L_4=cons*n_u*inner(nabla_grad(u)+nabla_grad(u).T-2/3*div(u)*iden,nabla_grad(v))*dx\
-cons*n_u*dot(dot(avg(nabla_grad(u)+nabla_grad(u).T-2/3*div(u)*iden),n('+')),v('+')-v('-'))*dS\
-cons*n_u*dot(dot(avg(nabla_grad(v)+nabla_grad(v).T-2/3*div(v)*iden),n('+')),u('+')-u('-'))*dS\
+cons*n_u*C_a/h_avg*dot(u('+')-u('-'),v('+')-v('-'))*dS

##############################
L=L_1+L_2+L_3+L_4

t_i=2*dt
while(t_i<Time):

  t_i+=dt
  print(t_i)

  J= derivative(L,upnu,du)
  problem = NonlinearVariationalProblem(L,upnu,bcs,J)
  solver = NonlinearVariationalSolver(problem)
  prm = solver.parameters
  prm["newton_solver"]["absolute_tolerance"] = tol
  prm["newton_solver"]["relative_tolerance"] = tol
  prm["newton_solver"]["maximum_iterations"] = 25
  prm["newton_solver"]["relaxation_parameter"] = 1.0
  #prm["newton_solver"]["linear_solver"] = "gmres"
  #prm["newton_solver"]["linear_solver"] = "petsc"
  #prm["newton_solver"]["preconditioner"] = "jacobi"
  prm["newton_solver"]["linear_solver"] = "mumps" # most stable
  solver.parameters["newton_solver"]["error_on_nonconvergence"] = False
  solver.solve()

  _u,_p,_lg=upnu.split(True)

  u_1.assign(u_2)
  u_2.assign(u_3)
  u_3.assign(_u)
  p_1.assign(p_2)
  p_2.assign(p_3)
  p_3.assign(_p)
  lg_n=_lg

  if (count > 0):
    File("Velocity_"+"Time_"+str(t_i)+"_.pvd") << _u
    File("Pressure_"+"Time_"+str(t_i)+"_.pvd") << _p
  count = count + 1

quit()

Move your solver definition outside of the time loop. You are recreating matrices, KSP solvers, etc. at every timestep. With modern PETSc (>= ~3.16) these do not get destroyed properly by the Python garbage collector, leading to an accumulation of memory usage over time.
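
A minimal sketch of the pattern (reusing the names L, upnu, du, bcs, dt and Time from your code, and showing only one of the solver settings):

J = derivative(L, upnu, du)
problem = NonlinearVariationalProblem(L, upnu, bcs, J)
solver = NonlinearVariationalSolver(problem)
solver.parameters["newton_solver"]["linear_solver"] = "mumps"

t_i = 2*dt
while t_i < Time:
    t_i += dt
    # Only solve() is called inside the loop, so the matrix and
    # linear solver objects set up above are reused at every step.
    solver.solve()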

Hi @dokken,

Thanks for the help; I updated my code. Unfortunately, I am still running out of memory. I should have specified earlier that I typically run out of memory at the first time step, not at later time steps. Here is the updated code that I am still having trouble with:


L=L_1+L_2+L_3+L_4
t_i=2*dt

################ Solver set-up
J= derivative(L,upnu,du)
problem = NonlinearVariationalProblem(L,upnu,bcs,J)
solver = NonlinearVariationalSolver(problem)

prm = solver.parameters
prm["newton_solver"]["absolute_tolerance"] = tol
prm["newton_solver"]["relative_tolerance"] = tol
prm["newton_solver"]["maximum_iterations"] = 25
#prm["newton_solver"]["relaxation_parameter"] = 1.0
#prm["newton_solver"]["linear_solver"] = "bicgstab"
#prm["newton_solver"]["linear_solver"] = "petsc"
#prm["newton_solver"]["preconditioner"] = "icc"
prm["newton_solver"]["linear_solver"] = "mumps" #most stable
solver.parameters["newton_solver"]["error_on_nonconvergence"] = False

################## Loop over Time
while(t_i<Time):

   t_i+=dt
   print(t_i)
   
   solver.solve()
   
   _u,_p,_lg=upnu.split(True)
   
   u_1.assign(u_2)
   u_2.assign(u_3)
   u_3.assign(_u)
   p_1.assign(p_2)
   p_2.assign(p_3)
   p_3.assign(_p)
   lg_n=_lg
   
   if (count > 0):
     File("Velocity_"+"Time_"+str(t_i)+"_.pvd") << _u
     File("Pressure_" +"Time_"+str(t_i)+"_.pvd") << _p
    
   count = count + 1
   
quit()

Were you able to cure this issue? I am running into something similar with a moderately sized (128x128x128) elasticity problem.

@jzcrow Unfortunately, I never figured out a resolution in legacy FEniCS. However, I reimplemented my solver in FEniCSx and it worked fine there. Perhaps try that if you haven't yet.
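
For reference, the same create-once/solve-many structure in DOLFINx looks roughly like the sketch below. This is only a schematic, with a toy nonlinear Poisson problem standing in for the Navier-Stokes system and a DOLFINx version around 0.7/0.8 assumed (the function-space API differs between releases):

import numpy as np
from mpi4py import MPI
import ufl
from dolfinx import default_scalar_type, fem, mesh
from dolfinx.fem.petsc import NonlinearProblem
from dolfinx.nls.petsc import NewtonSolver

domain = mesh.create_rectangle(MPI.COMM_WORLD,
                               [np.array([0.0, 0.0]), np.array([4.0, 1.0])],
                               [50, 50])
V = fem.functionspace(domain, ("Lagrange", 1))

uh = fem.Function(V)
v = ufl.TestFunction(V)
f = fem.Constant(domain, default_scalar_type(1.0))
F = ufl.inner((1.0 + uh**2)*ufl.grad(uh), ufl.grad(v))*ufl.dx - f*v*ufl.dx

# Dirichlet condition on the left edge so the toy problem is well posed
left_dofs = fem.locate_dofs_geometrical(V, lambda x: np.isclose(x[0], 0.0))
bc = fem.dirichletbc(default_scalar_type(0.0), left_dofs, V)

# Problem and Newton solver are built exactly once ...
problem = NonlinearProblem(F, uh, bcs=[bc])
solver = NewtonSolver(MPI.COMM_WORLD, problem)
solver.rtol = 1e-10

# ... and only solve() is called repeatedly (e.g. once per time step),
# so the underlying PETSc matrices and KSP objects are reused.
for step in range(5):
    n_its, converged = solver.solve(uh)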


Alright, thank you. Wasn't the answer I was hoping for, but it is what it is lol