Implementing Gradient Descent in FEniCS

I want to implement a Gradient Descent Algorithm in FEniCS. I am working in 1D (if this works out I can move onto 2D).

My expression for the energy is ( I choose a simple one for now)
E[u]=\int_0^{L} (1-u)^2 dx

I present the MWE below, where i am trying to do the gradient descent step, ie,
u^{n+1}=u^n-\gamma \partial_u E[u],
where \gamma is the learning rate.

The MWE is presented below:

import dolfin
from dolfin import *
import ufl
import matplotlib.pyplot as plt
from petsc4py import PETSc

L=10
mesh = IntervalMesh(100,0,L)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.1)# Learning rate.

u = Function(V) 
u_up = Function(V)

phi = (1-u)**2  

# Total potential energy
Pi = phi*dx

#Defining derivative
F = ufl.derivative(Pi,u)

# Set up the non-linear problem
u_up = u - gamma*F

plot(u)
plt.title(r"$u(x)$",fontsize=26)
plt.show()
plot(u_up)
plt.title(r"$u_up(x)$",fontsize=26)
plt.show()

The error is

Traceback (most recent call last):
  File "/Users/DigoSen/Desktop/Books/Codes/Python/FEniCS/MWE/Discourse-Post4.py", line 25, in <module>
    u_up = u - gamma*F
           ~~^~~~~~~~~
  File "/Users/DigoSen/anaconda3/envs/fenics2019/lib/python3.12/site-packages/ufl/form.py", line 304, in __rsub__
    return other + (-self)
           ~~~~~~^~~~~~~~~
TypeError: unsupported operand type(s) for +: 'Function' and 'Form'

I understand that F and u are not the same type. But I don’t know how to convert them to the same type.

I am using dolfin version 2019.1.0, which i installed using conda-forge.

You Need to assemble Pi. Then you should choose an appropriate Riesz representation of the gradient to put it in V, and assign the sum to a function.

Hi, I don’t quite follow the series of commands that you mentioned.

  1. When you say assemble Pi, I’m guessing its something like
    Pi = assemble(phi*dx)

  2. “Then you should choose an appropriate Riesz representation of the gradient to put it in V.”

    Im not sure i understand your suggestion. Are you saying I should supply the variational derivative? I was hoping I could get the code to compute it itself. Please correct me if i misunderstood your suggestion.

Well, I wrote this a bit quickly on my phone. What you should do is first:

  1. Convert F from a symbolic representation into a numerical representation.
    This can be done with F_vector = assemble(F)
    However, F is in the dual space of u, and you can either use the values of the “vector” directly, or use a Riesz representation. For a Riesz representation choose a coersive bilinear form such that a(u, v) == F and solve that variational problem.
    If you do not want to do that, you can simply do
from dolfin import *

L = 10
mesh = IntervalMesh(100, 0, L)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.1)  # Learning rate.

u = Function(V)
u_up = Function(V)

phi = (1-u)**2

# Total potential energy
Pi = phi*dx

# Defining derivative
F = derivative(Pi, u)

F_vector = assemble(F)

# Set up the non-linear problem
u_up.vector()[:] = u.vector()[:] - gamma*F_vector[:]

1 Like

Worked like a charm. I have completed the “Loop” and used a separate example. I pasted the complete code here incase it helps anyone.

I have two examples below.

  1. \min_u\int_0^L(1-u)^2dx
  2. \min_u\int_0^L u^2 + (u')^2dx
    Note that both functions are convex with corresponding min u=1 and u=0 respectively.

import dolfin
print(f"DOLFIN version: {dolfin.__version__}")
from dolfin import *
import ufl
print(f" UFL version: {ufl.__version__}")
import matplotlib.pyplot as plt
from petsc4py import PETSc

L=10
mesh = IntervalMesh(100,0,L)
x = SpatialCoordinate(mesh)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.01)# Learning rate.
tol = float(input('tolerance?\n')) # Learning rate.
NN = int(input('Number of iterations?\n')) # Learning rate.

u = Function(V)
u_up = Function(V)

#phi = (1-u)**2  
phi = ( u**2 + inner( ufl.grad(u), ufl.grad(u)) )

# Total potential energy
Pi = phi*dx

#Defining derivative
F = derivative(Pi,u)


#Initializing the intial conditions
u0 = interpolate( Expression("x[0]/L",L=L, degree=1), V)#initial cond.
u_up.vector()[:] = u0.vector()[:]


for t in range(NN):
 u.vector()[:] = u_up.vector()[:]
 F_vec = assemble(F)
 u_up.vector()[:] = u.vector()[:] - gamma*F_vec[:]
 if (norm(F_vec)) < tol:
  break
 #print(F_vec.get_local()) # use this for ∆u when u blows up


print(F_vec.get_local()) # prints the vector.

plot(u0)
plt.title(r"$u_{0}(x)$",fontsize=26)
plt.show()
plot(u)
plt.title(r"$u(x)$",fontsize=26)
plt.show()
1 Like

This is a continuation of the question. Should I add this as a separate post?
I want to implement the Nesterov momentum method. Where instead of computing the gradient at the current point, the gradient at a different point is used.
To talk briefly about the method
u_{i+1}=u_i+v_{i+1},
v_{i+1}=\tau v_i -\eta\nabla f(u_i+\tau v_i),
\text{with}~v_0=0,~~\tau\in(0,1).
I was wondering what changes i would have to do to the line below.

# Defining derivative
F = derivative(Pi, u)
F_vector = assemble(F)

In the second line above, I end up with F_vector being evaluated as the gradient at the current iteration step. How do i enforce that it be evaluated at u_i+\tau v_i?

You should be replacing the values currently stored in u with those of the updated solution u + tau v. For instance, using assign(u, function_with_u_plus_tau_u) or, as you do in your previous code u.vector()[:] = u.vector()[:] + tau*v.vector()[:]

1 Like

Im trying to convert all my codes into FEniCSx. Is there a nice way to convert the above code into FEniCSx code?

It is not very hard to convert this.
Here is how I would do it:

from mpi4py import MPI
import dolfinx
import ufl
import basix.ufl

L = 10
mesh = dolfinx.mesh.create_interval(MPI.COMM_WORLD, 100, [0, L])
x = ufl.SpatialCoordinate(mesh)
V = dolfinx.fem.functionspace(mesh, ("Lagrange", 1))
gamma = float(0.01)  # Learning rate.
tol = float(input("tolerance?\n"))  # Learning rate.
NN = int(input("Number of iterations?\n"))  # Learning rate.

u = dolfinx.fem.Function(V)
u_up = dolfinx.fem.Function(V)

# phi = (1-u)**2
phi = u**2 + ufl.inner(ufl.grad(u), ufl.grad(u))

# Total potential energy
Pi = phi * ufl.dx

# Defining derivative
F = ufl.derivative(Pi, u)


# Initializing the intial conditions
u0 = dolfinx.fem.Function(V)
u0.interpolate(lambda x: x[0] / L)
u_up.x.array[:] = u0.x.array

F_compiled = dolfinx.fem.form(F)
for t in range(NN):
    u.x.array[:] = u_up.x.array
    F_vec = dolfinx.fem.assemble_vector(F_compiled)
    F_vec.scatter_reverse(dolfinx.la.InsertMode.add)
    F_vec.scatter_forward()
    u_up.x.array[:] = u.x.array[:] - gamma * F_vec.array[:]
    if (dolfinx.la.norm(F_vec)) < tol:
        break
    print(t, dolfinx.la.norm(F_vec))  # use this for ∆u when u blows up

import matplotlib.pyplot as plt
import numpy as np

coords = V.tabulate_dof_coordinates()[:, 0]
sort_dir = np.argsort(coords)

plt.plot(coords[sort_dir], u0.x.array[sort_dir], label="Initial")
plt.plot(coords[sort_dir], u.x.array[sort_dir], label="Updated")
plt.legend()
plt.grid()
plt.savefig("mwe.png")

I would recommend looking at the various demos and tutorials on DOLFINx to understand the fundamental changes in syntax.