Implementing Gradient Descent in FEniCS

I want to implement a Gradient Descent Algorithm in FEniCS. I am working in 1D (if this works out I can move onto 2D).

My expression for the energy is ( I choose a simple one for now)
E[u]=\int_0^{L} (1-u)^2 dx

I present the MWE below, where i am trying to do the gradient descent step, ie,
u^{n+1}=u^n-\gamma \partial_u E[u],
where \gamma is the learning rate.

The MWE is presented below:

import dolfin
from dolfin import *
import ufl
import matplotlib.pyplot as plt
from petsc4py import PETSc

L=10
mesh = IntervalMesh(100,0,L)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.1)# Learning rate.

u = Function(V) 
u_up = Function(V)

phi = (1-u)**2  

# Total potential energy
Pi = phi*dx

#Defining derivative
F = ufl.derivative(Pi,u)

# Set up the non-linear problem
u_up = u - gamma*F

plot(u)
plt.title(r"$u(x)$",fontsize=26)
plt.show()
plot(u_up)
plt.title(r"$u_up(x)$",fontsize=26)
plt.show()

The error is

Traceback (most recent call last):
  File "/Users/DigoSen/Desktop/Books/Codes/Python/FEniCS/MWE/Discourse-Post4.py", line 25, in <module>
    u_up = u - gamma*F
           ~~^~~~~~~~~
  File "/Users/DigoSen/anaconda3/envs/fenics2019/lib/python3.12/site-packages/ufl/form.py", line 304, in __rsub__
    return other + (-self)
           ~~~~~~^~~~~~~~~
TypeError: unsupported operand type(s) for +: 'Function' and 'Form'

I understand that F and u are not the same type. But I don’t know how to convert them to the same type.

I am using dolfin version 2019.1.0, which i installed using conda-forge.

You Need to assemble Pi. Then you should choose an appropriate Riesz representation of the gradient to put it in V, and assign the sum to a function.

Hi, I don’t quite follow the series of commands that you mentioned.

  1. When you say assemble Pi, I’m guessing its something like
    Pi = assemble(phi*dx)

  2. “Then you should choose an appropriate Riesz representation of the gradient to put it in V.”

    Im not sure i understand your suggestion. Are you saying I should supply the variational derivative? I was hoping I could get the code to compute it itself. Please correct me if i misunderstood your suggestion.

Well, I wrote this a bit quickly on my phone. What you should do is first:

  1. Convert F from a symbolic representation into a numerical representation.
    This can be done with F_vector = assemble(F)
    However, F is in the dual space of u, and you can either use the values of the “vector” directly, or use a Riesz representation. For a Riesz representation choose a coersive bilinear form such that a(u, v) == F and solve that variational problem.
    If you do not want to do that, you can simply do
from dolfin import *

L = 10
mesh = IntervalMesh(100, 0, L)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.1)  # Learning rate.

u = Function(V)
u_up = Function(V)

phi = (1-u)**2

# Total potential energy
Pi = phi*dx

# Defining derivative
F = derivative(Pi, u)

F_vector = assemble(F)

# Set up the non-linear problem
u_up.vector()[:] = u.vector()[:] - gamma*F_vector[:]

1 Like

Worked like a charm. I have completed the “Loop” and used a separate example. I pasted the complete code here incase it helps anyone.

I have two examples below.

  1. \min_u\int_0^L(1-u)^2dx
  2. \min_u\int_0^L u^2 + (u')^2dx
    Note that both functions are convex with corresponding min u=1 and u=0 respectively.

import dolfin
print(f"DOLFIN version: {dolfin.__version__}")
from dolfin import *
import ufl
print(f" UFL version: {ufl.__version__}")
import matplotlib.pyplot as plt
from petsc4py import PETSc

L=10
mesh = IntervalMesh(100,0,L)
x = SpatialCoordinate(mesh)
element = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
V = FunctionSpace(mesh, "CG", 1)
gamma = float(0.01)# Learning rate.
tol = float(input('tolerance?\n')) # Learning rate.
NN = int(input('Number of iterations?\n')) # Learning rate.

u = Function(V)
u_up = Function(V)

#phi = (1-u)**2  
phi = ( u**2 + inner( ufl.grad(u), ufl.grad(u)) )

# Total potential energy
Pi = phi*dx

#Defining derivative
F = derivative(Pi,u)


#Initializing the intial conditions
u0 = interpolate( Expression("x[0]/L",L=L, degree=1), V)#initial cond.
u_up.vector()[:] = u0.vector()[:]


for t in range(NN):
 u.vector()[:] = u_up.vector()[:]
 F_vec = assemble(F)
 u_up.vector()[:] = u.vector()[:] - gamma*F_vec[:]
 if (norm(F_vec)) < tol:
  break
 #print(F_vec.get_local()) # use this for ∆u when u blows up


print(F_vec.get_local()) # prints the vector.

plot(u0)
plt.title(r"$u_{0}(x)$",fontsize=26)
plt.show()
plot(u)
plt.title(r"$u(x)$",fontsize=26)
plt.show()
1 Like

This is a continuation of the question. Should I add this as a separate post?
I want to implement the Nesterov momentum method. Where instead of computing the gradient at the current point, the gradient at a different point is used.
To talk briefly about the method
u_{i+1}=u_i+v_{i+1},
v_{i+1}=\tau v_i -\eta\nabla f(u_i+\tau v_i),
\text{with}~v_0=0,~~\tau\in(0,1).
I was wondering what changes i would have to do to the line below.

# Defining derivative
F = derivative(Pi, u)
F_vector = assemble(F)

In the second line above, I end up with F_vector being evaluated as the gradient at the current iteration step. How do i enforce that it be evaluated at u_i+\tau v_i?

You should be replacing the values currently stored in u with those of the updated solution u + tau v. For instance, using assign(u, function_with_u_plus_tau_u) or, as you do in your previous code u.vector()[:] = u.vector()[:] + tau*v.vector()[:]

1 Like