Where to put Control(u)

Hi, I am solving an inverse problem using FEniCS, and I have a question about the proper way of setting this up. Do you know whether I should put control = Control(u) before defining J and calling forward(), or after?

# control parameter u, which is also the initial state for the dynamic model
u = Function(V)
u.vector()[:] = 0
# run forward simulation; "storage" stores the intermediate field values that will be used to compute the objective function
forward(u,storage)

# define the objective function and run the optimization
J = assemble(inner(storage["u_final"], storage["u_final"])*dx)
Jhat = ReducedFunctional(J, control)
problem = MinimizationProblem(Jhat)
solver = IPOPTSolver(problem, parameters=parameters)
u_opt = solver.solve()

You only need to define control = Control(parameter) before you create your reduced functional. It does not matter whether the control is defined before or after the forward solve or the definition of J.
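For instance, a minimal sketch (with placeholder names for the forward solve and the functional):

from dolfin import *
from dolfin_adjoint import *

mesh = UnitSquareMesh(8, 8)
V = FunctionSpace(mesh, "CG", 1)

f = Function(V)           # the parameter we want to control
f.vector()[:] = 1.0
uh = Function(V)          # the state

# control = Control(f)    # fine here ...
u, v = TrialFunction(V), TestFunction(V)
solve(inner(grad(u), grad(v))*dx + inner(u, v)*dx == inner(f, v)*dx, uh)
J = assemble(inner(uh, uh)*dx)
control = Control(f)      # ... or here, since f itself is never reassigned above

Jhat = ReducedFunctional(J, control)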

Hi Dokken, thanks for the help! In my case, placing control = Control(parameter) at different points leads to very different optimization results.

I think the issue might be that I update the parameter inside the forward function. Do you think that would matter in that case?

It should not matter. Please make a minimal example that illustrates the differences you observe.

I am sorry, it is hard to come up with a minimal example in my case. I will attach the code here and try to make it clear:

The forward model uses Chorin’s projection method to solve the Navier-Stokes equations:

def forward(u0,V,total_steps,objective,storage,points,mask,mesh,loss_fn):

    T = 10   # final time
    num_steps = 10000    # number of time steps
    dt = T / num_steps # time step size
    mu = 0.001
    nu = mu
    rho = 1            # density

    inflow   = 'near(x[0], -1)'
    outflow  = 'near(x[0], 4)'
    walls    = 'near(x[1], -0.5) || near(x[1], 0.5)'
    cylinder = 'on_boundary && not(near(x[0], -1)) && not(near(x[0], 4)) && not(near(x[1], -0.5)) && not(near(x[1], 0.5))'
    
    Q = FunctionSpace(mesh, "Lagrange", 1)
    V1 = FunctionSpace(mesh, "Lagrange", 1)
    
    # Define inflow profile
    inflow_profile = ('1', '0')

    # Define boundary conditions
    bcu_inflow = DirichletBC(V, Expression(inflow_profile, degree=2), inflow)
    bcu_walls = DirichletBC(V, Constant((0, 0)), walls)
    bcu_cylinder = DirichletBC(V, Constant((0, 0)), cylinder)

    bcp_outflow = DirichletBC(Q, Constant(0), outflow)
    bcu = [bcu_inflow,bcu_walls, bcu_cylinder]
    bcp = [bcp_outflow]

    # Define trial and test functions
    u = TrialFunction(V)
    v = TestFunction(V)
    p = TrialFunction(Q)
    q = TestFunction(Q)

    # Define functions for solutions at previous and current time steps
    u1 = Function(V)
    p1 = Function(Q)


    # Define expressions used in variational forms
    n  = FacetNormal(mesh)
    f  = Constant((0, 0))
    k  = Constant(dt)
    nu = Constant(nu)
    rho = Constant(rho)

    # Define variational problem for step 1
    F1 = inner(u - u0, v)*dx + k*inner(grad(u0)*u0, v)*dx + \
         k*nu*inner(grad(u), grad(v))*dx - inner(f, v)*dx
    a1 = lhs(F1)
    L1 = rhs(F1)

    # Define variational problem for step 2
    a2 = inner(grad(p), grad(q))*dx
    L2 = -(1/k)*div(u1)*q*dx

    # Define variational problem for step 3
    a3 = inner(u, v)*dx
    L3 = inner(u1, v)*dx - k*inner(grad(p1), v)*dx

    # Assemble matrices
    A1 = assemble(a1)
    A2 = assemble(a2)
    A3 = assemble(a3)

    # Use amg preconditioner if available
    prec = "amg" if has_krylov_solver_preconditioner("amg") else "default"

    # Use nonzero guesses - essential for CG with non-symmetric BC
    parameters['krylov_solver']['nonzero_initial_guess'] = True

    # Time-stepping
    t = 0
    for n in range(total_steps):

        # Update current time
        t += dt

        # Step 1: Tentative velocity step
        b1 = assemble(L1)
        [bc.apply(A1, b1) for bc in bcu]
        solve(A1, u1.vector(), b1, 'gmres', 'default')

        # Step 2: Pressure correction step
        b2 = assemble(L2)
        [bc.apply(A2, b2) for bc in bcp]
        [bc.apply(p1.vector()) for bc in bcp]
        solve(A2, p1.vector(), b2, 'gmres', 'default')

        # Step 3: Velocity correction step
        b3 = assemble(L3)
        [bc.apply(A3, b3) for bc in bcu]
        solve(A3, u1.vector(), b3, 'gmres', 'default')
        # Plot solution

        # Update previous solution
        u0.assign(u1)
        if n%10==0:
            d_observe = Function(V, name="data")
            d_observe.vector()[:] = storage['observation'][n+1,:]
            if objective is not None:
                objective(u0, d_observe, n,points,loss_fn=loss_fn)
   

I also have this storage dictionary and objective function; they can mostly be ignored, but are attached here in case they are needed:


def objective_point_wise(storage, u=None, d_observe=None,idx=None,points=None,finalize=False,obversation_steps=None):
    if finalize:
        count = 0
        for idx in range(obversation_steps.shape[0]):
            step = obversation_steps[idx]
            if idx==0:
                loss = storage['functional'][step]
                count = count + 1
            else:
                if step>=len(storage['functional']):
                    continue
                else:
                    loss += storage['functional'][step]
                    count = count + 1
        storage["loss"] = loss/count
        return loss/count
    if idx==0:
        storage["functional"] = []
        storage["estimate_field"]=[]
        storage["gt_observation"]=[]
        storage["gt_field"]=[]
        storage["loss"] = 0
        loss = 0
    observation = [d_observe(p) for p in points]
    prediction = [u(p) for p in points]
    storage["functional"].append(sum([(observation[i][0]-prediction[i][0])**2+(observation[i][1]-prediction[i][1])**2 for i in range(len(points))])/len(points))

Then the optimization goes like this:

storage = {"functional": [],'solution':solution,'observation':observation, "time": 0, "estimate_field": [], "gt_observation": [],'gt_field':[],'loss':0}
obj = lambda u, d, idx,observed_coords,loss_fn, Q: objective_point_wise(storage,u, d, idx,observed_coords,loss_fn, Q)
observation_setup = {"start_time":500,"timestep":10,"endtime":580}
obversation_steps = np.arange(observation_setup["endtime"]-observation_setup["start_time"])[0:observation_setup["endtime"]-observation_setup["start_time"]:observation_setup["timestep"]]
obversation_steps = np.floor(obversation_steps/observation_setup["timestep"]).astype(int)


# initialize u at 0
V = VectorFunctionSpace(mesh, 'P', 2)
u = Function(V)
u.vector()[:] = 0

forward(u,V=V,total_steps=observation_setup["endtime"]-observation_setup["start_time"],
            objective=obj,storage=storage,points=observed_coords,mask=mask,mesh=mesh,loss_fn=loss_fn)
control = Control(u)

J = objective_point_wise(storage, finalize=True,obversation_steps=obversation_steps)
print("J:{}".format(J))
Jhat = ReducedFunctional(J, control, eval_cb_post=eval_cb)
problem = MinimizationProblem(Jhat)
parameters = {'maximum_iterations': 100}

solver = IPOPTSolver(problem, parameters=parameters)
u_opt = solver.solve()

Depending on where I put control = Control(u), I get different results. I am also wondering whether this is the correct way of doing this initial-state optimization task.

I cannot really help you if I cannot execute the code.
There are always ways of making a minimal example: start by solving your problem on a built-in mesh, where you create some dummy data for your observations.
Then solve for only one time step instead of multiple steps.

You could even check that your functional is working using a coupled Stokes solver instead of a Chorin scheme. All these things would vastly simplify your code.
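As a rough sketch (with placeholder boundary conditions and a dummy body-force control, not your exact setup), checking the functional with a coupled Taylor-Hood Stokes solve could look like:

from dolfin import *
from dolfin_adjoint import *

mesh = UnitSquareMesh(16, 16)
P2 = VectorElement("Lagrange", mesh.ufl_cell(), 2)
P1 = FiniteElement("Lagrange", mesh.ufl_cell(), 1)
W = FunctionSpace(mesh, P2 * P1)

Vf = VectorFunctionSpace(mesh, "Lagrange", 2)
f = Function(Vf)                    # placeholder control (body force)
control = Control(f)

(u, p) = TrialFunctions(W)
(v, q) = TestFunctions(W)
a = inner(grad(u), grad(v))*dx - div(v)*p*dx - q*div(u)*dx
L = inner(f, v)*dx

# Inflow and walls; the outflow is left as a do-nothing boundary
bcs = [DirichletBC(W.sub(0), Constant((1, 0)), "near(x[0], 0)"),
       DirichletBC(W.sub(0), Constant((0, 0)), "near(x[1], 0) || near(x[1], 1)")]

w = Function(W)
solve(a == L, w, bcs)
uh, ph = w.split()

J = assemble(inner(uh, uh)*dx)      # the functional you want to sanity-check
Jhat = ReducedFunctional(J, control)
print(Jhat(f))                      # re-evaluate the functional through the tape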

I also note that you are using an unstable finite element pair, P1-P1.
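A stable Taylor-Hood (P2-P1) pair would, for instance, be:

V = VectorFunctionSpace(mesh, "Lagrange", 2)  # velocity: piecewise quadratic
Q = FunctionSpace(mesh, "Lagrange", 1)        # pressure: piecewise linear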


I came up with a minimal example! Here is the code. If I place control = Control(u_init) before or after forward, I get different optimization results.


from dolfin import *
from dolfin_adjoint import *
import numpy as np
def forward(u_init,V,total_steps,objective,storage,points,mesh,timesteps=1,store=False):

    dt = 0.001
    mu = 0.1
    nu = mu
    rho = 1            # density

    inflow   = 'near(x[0], 0)'
    outflow  = 'near(x[0], 1)'
    walls    = 'near(x[1], 0) || near(x[1],1)'

    Q = FunctionSpace(mesh, "Lagrange", 1)
    
    # Define inflow profile
    inflow_profile = ('1', '0')

    # Define boundary conditions
    bcu_inflow = DirichletBC(V, Expression(inflow_profile, degree=2), inflow)
    bcu_walls = DirichletBC(V, Constant((0, 0)), walls)

    bcp_outflow = DirichletBC(Q, Constant(0), outflow)
    bcu = [bcu_inflow,bcu_walls]
    bcp = [bcp_outflow]

    # Define trial and test functions
    u = TrialFunction(V)
    v = TestFunction(V)
    p = TrialFunction(Q)
    q = TestFunction(Q)

    # Define functions for solutions at previous and current time steps
    u0 = u_init
    u1 = Function(V)
    p1 = Function(Q)


    # Define expressions used in variational forms
    n  = FacetNormal(mesh)
    f  = Constant((0, 0))
    k  = Constant(dt)
    nu = Constant(nu)
    rho = Constant(rho)

    # Define variational problem for step 1
    F1 = inner(u - u0, v)*dx + k*inner(grad(u0)*u0, v)*dx + \
         k*nu*inner(grad(u), grad(v))*dx - inner(f, v)*dx
    a1 = lhs(F1)
    L1 = rhs(F1)

    # Define variational problem for step 2
    a2 = inner(grad(p), grad(q))*dx
    L2 = -(1/k)*div(u1)*q*dx

    # Define variational problem for step 3
    a3 = inner(u, v)*dx
    L3 = inner(u1, v)*dx - k*inner(grad(p1), v)*dx

    # Assemble matrices
    A1 = assemble(a1)
    A2 = assemble(a2)
    A3 = assemble(a3)

    # Use nonzero guesses - essential for CG with non-symmetric BC
    parameters['krylov_solver']['nonzero_initial_guess'] = True

    # Time-stepping
    t = 0
    u_solution_dof = np.zeros([total_steps+1,int(V.dim())])
    if store:
        u_solution = np.zeros([total_steps+1,2, Q.dim()])
        u_solution_dof = np.zeros([total_steps+1,int(V.dim())])
        flux_x, flux_y = u1.split(deepcopy=True)
        u_solution[0,0, :] = flux_x.compute_vertex_values()
        u_solution[0,1, :] = flux_y.compute_vertex_values()
        u_solution_dof[0,:] = u1.vector().get_local()
        p_solution = np.zeros([total_steps+1, Q.dim()])
        p_solution[0, :] = p1.compute_vertex_values()
        
    for n in range(total_steps):
        # Update current time
        t += dt

        # Step 1: Tentative velocity step
        b1 = assemble(L1)
        [bc.apply(A1, b1) for bc in bcu]
        solve(A1, u1.vector(), b1, 'gmres', 'default')

        # Step 2: Pressure correction step
        b2 = assemble(L2)
        [bc.apply(A2, b2) for bc in bcp]
        [bc.apply(p1.vector()) for bc in bcp]
        solve(A2, p1.vector(), b2, 'gmres', 'default')

        # Step 3: Velocity correction step
        b3 = assemble(L3)
        [bc.apply(A3, b3) for bc in bcu]
        solve(A3, u1.vector(), b3, 'gmres', 'default')
        # Plot solution
        
        # Update previous solution
        u0.assign(u1)
        if not store:
            if n%timesteps==0 and (n!=0 or total_steps==1):
                print("save:{} timestep".format(n))
                d_observe = Function(V, name="data")
                d_observe.vector()[:] = storage['observation'][n+1,:]
                if objective is not None:
                    objective(u0, d_observe, n,points)

        if store:
            flux_x, flux_y = u1.split(deepcopy=True)
            u_solution[n+1,0, :] = flux_x.compute_vertex_values()
            u_solution[n+1,1, :] = flux_y.compute_vertex_values()     
            u_solution_dof[n+1, :] = u1.vector().get_local()
            p_solution[n+1,:] = p1.compute_vertex_values()
    
    return u_solution_dof



def objective_point_wise(storage, u=None, d_observe=None,idx=None,points=None,finalize=False):
    if finalize:
        for loss in storage['functional']:
            storage["loss"] = storage["loss"]+loss
        loss = storage["loss"]
        return loss
    if idx==0:
        storage["functional"] = []
        storage["gt_field"]=[]
        storage["loss"] = 0
    observation = [d_observe(p) for p in points]
    prediction = [u(p) for p in points]
    
    storage["functional"].append(sum([(observation[i][0]-prediction[i][0])**2+(observation[i][1]-prediction[i][1])**2 for i in range(len(points))])/len(points))
    
    storage["time"]=idx

def eval_cb(j, u):
    print(j)

    

## get observations
mesh = UnitSquareMesh(10, 10)
V = VectorFunctionSpace(mesh, 'P', 2)
u0 = Function(V)
u0.vector()[:] = 1
u_solution_dof = forward(u0,V,total_steps=2,objective=None,storage=None,points=None,mesh=mesh,store=True)
observed_coords = (V.tabulate_dof_coordinates())[::2,:]

storage = {"functional": [],'solution':u_solution_dof,'observation':u_solution_dof, "time": 0, "estimate_field": [], "gt_observation": [],'gt_field':[],'loss':0}

obj = lambda u, d, idx,observed_coords: objective_point_wise(storage,u, d, idx,observed_coords)
observation_setup = {"start_time":0,"timestep":1,"endtime":1}
obversation_steps = np.arange(observation_setup["endtime"]-observation_setup["start_time"])[0:observation_setup["endtime"]-observation_setup["start_time"]:observation_setup["timestep"]]
obversation_steps = np.floor(obversation_steps/observation_setup["timestep"]).astype(int)
print(obversation_steps)

## start recover init state
V = VectorFunctionSpace(mesh, 'P', 2)
u_init = Function(V)
control = Control(u_init)
forward(u_init,V=V,total_steps=observation_setup["endtime"]-observation_setup["start_time"],
            objective=obj,storage=storage,points=observed_coords,mesh=mesh)
# control = Control(u_init)

J = objective_point_wise(storage, finalize=True)
print("J:{}".format(J))
Jhat = ReducedFunctional(J, control, eval_cb_post=eval_cb)
problem = MinimizationProblem(Jhat)
parameters = {'maximum_iterations': 100}

solver = IPOPTSolver(problem, parameters=parameters)
u_init_estimates = solver.solve()

Besides, I use the Chorin scheme as I don’t want the initial state to depend on p. I am using P2-P1 now.

The whole point is that it is rather tricky for anyone to help you when your code contains so many different aspects.
As I said, this is not the intended behaviour, but all the different aspects of your code make it tricky to find the culprit.
For instance, the following minimal example does not exhibit the same issue:

from dolfin import *
from dolfin_adjoint import *

def forward(uh, f):
    V = uh.function_space()
    u = TrialFunction(V)
    v = TestFunction(V)
    a = inner(grad(u), grad(v)) * dx + inner(u, v) * dx
    L = inner(f, v) * dx
    solve(a==L, uh)
    return assemble(inner(uh,uh)*dx + inner(grad(uh), grad(uh))*dx)

mesh = UnitSquareMesh(10, 10)
V = FunctionSpace(mesh, "CG", 1)
u = Function(V)
f = Function(V)
f.vector()[:] = 1

control = Control(f)
J = forward(u, f)
#control = Control(f)


Jhat = ReducedFunctional(J, control)
problem = MinimizationProblem(Jhat)
parameters = {'maximum_iterations': 100}
solver = IPOPTSolver(problem, parameters=parameters)
u_init_estimates = solver.solve()
print(u_init_estimates.vector().norm("l2"))

So this problem illustrates why we use ReducedFunctional(J, Control(f)) and not just ReducedFunctional(J, f). Basically, some operations are done in place, such as solve and Function.assign. When you use these functions you change the value of your function (say f), and so you might be interested in controlling either the first or the second “version” of this function.
We could have just always chosen the leaf node, i.e. the first “version” of f (which is how it behaved before Control was introduced), but that would make things like this invalid:

f = Function(V) # This is the first version of f
f.assign(Constant(1)) # This implicitly creates a new version of f: the result of assigning 1 to it.
forward(f)
...

If you have a time-dependent problem, where u is overwritten for every time-step, then the difference between

control = Control(u)
forward(u)

and

forward(u)
control = Control(u)

is that in the first case you control (or compute gradients with respect to) the initial condition, and in the second case you control the solution at the last time step. Effectively this means you just compute the partial derivative of the functional J with respect to u at the last time step (i.e. no PDEs).
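Here is a rough sketch of the two placements on a toy time-dependent problem (not your Navier-Stokes code):

from dolfin import *
from dolfin_adjoint import *

mesh = UnitSquareMesh(8, 8)
V = FunctionSpace(mesh, "CG", 1)

u = Function(V)                  # the state, overwritten every time step below
u.vector()[:] = 1.0
control = Control(u)             # here: controls the initial condition
# (moving this line to just before ReducedFunctional would instead control
#  the last "version" of u, i.e. the solution at the final time step)

du, v = TrialFunction(V), TestFunction(V)
u_next = Function(V)
dt = 0.01
for n in range(5):
    # one implicit Euler step of a heat equation
    solve(inner(du, v)*dx + dt*inner(grad(du), grad(v))*dx == inner(u, v)*dx, u_next)
    u.assign(u_next)             # in-place update: creates a new "version" of u

J = assemble(inner(u, u)*dx)
Jhat = ReducedFunctional(J, control)
dJ = Jhat.derivative()           # gradient with respect to the initial condition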


I see, thanks sebastkm, so basically I should have

control = Control(u)
forward(u)

since I want to optimize the initial state.

Yes. That would be correct.


This is awesome, thanks sebastkm. Just a follow-up question: this won’t matter if the parameter doesn’t get updated inside forward(), right?

Correct. Which is why most examples will just write ReducedFunctional(J, Control(f)).


Hi sebastkm, I found that using this code to solve the inverse problem is super slow. Do you think this is expected, or should the gradient calculation be roughly as fast as the forward model, and hence not this slow?

In this case you could get a bit of a speedup by defining a KrylovSolver (or three) outside of your time loop. But that will only be a general speedup, and won’t fix your problem.
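Something along these lines (a sketch reusing the names from your forward(); it assumes your dolfin_adjoint version annotates KrylovSolver.solve, otherwise keep using solve(A, x, b) and this only illustrates the idea):

# Build the Krylov solvers once, outside the time loop, and reuse them.
solver1 = KrylovSolver("gmres", "default")
solver2 = KrylovSolver("gmres", "default")
solver3 = KrylovSolver("gmres", "default")

[bc.apply(A1) for bc in bcu]   # the matrices do not change between steps
[bc.apply(A2) for bc in bcp]
[bc.apply(A3) for bc in bcu]
solver1.set_operator(A1)
solver2.set_operator(A2)
solver3.set_operator(A3)

for n in range(total_steps):
    b1 = assemble(L1)
    [bc.apply(b1) for bc in bcu]
    solver1.solve(u1.vector(), b1)
    # ... steps 2 and 3 analogously ...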

The reason your gradient computations are slow is that you are using the pointwise evaluation feature (i.e. u(p)). The adjoint implementation for this is really slow at the moment.
You could look here for some suggestions, but there are currently no very efficient approaches that are not quite restrictive in the choice of measurement points.
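One such (restrictive) workaround, sketched here under the assumption that your observations can be written into a Function in the same space V as the state, is to use an integral misfit instead of point evaluations:

# Sketch: write the observed dof values into a Function and use an integral
# misfit, which avoids the slow pointwise-evaluation adjoint. Only valid if the
# observations can be represented as a Function in the same space as u0.
d_observe = Function(V, name="data")
d_observe.vector()[:] = storage['observation'][n+1, :]

# annotated scalar; sum these contributions over the observation steps to get J
J_step = assemble(inner(u0 - d_observe, u0 - d_observe)*dx)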