How to improve performance on assemble problem

Dear all,

In the code posted below I try to use (mis-use?) FEniCS as a deluxe numerical integration tool: I integrate some expressions over a given geometry. The attached code is functional and does not throw any errors (on my machine).

However, as I need to execute this operation rather often on changing geometries, I am looking for performance improvements.

My experience with FEniCS is limited, so I am grateful for suggestions on the given code as well as on the most appropriate installation. Currently I am running FEniCS 2019.1.0 installed via conda-forge on a Mac. Admin access is not an issue, and Linux or Docker would be options as well.

Please let me know if relevant information is missing.

Thanks!

import dolfin as df
from mshr import Rectangle, generate_mesh, Circle

def create_mu_pq_ex(p, q, x_cg, y_cg):
    return "pow(x[0]-({}), {})*pow(x[1]-({}), {})".format(x_cg, p, y_cg, q)

domain = (Rectangle(df.Point(-0.1, -0.2), df.Point(1.1, 0.2))
          - Circle(df.Point(0.1, 0.1), 0.2, 20))
mesh = generate_mesh(domain, 30)
                
P1 = df.FiniteElement("P", "triangle", 1)
f = df.Expression("x[0]", element=P1)
x_cg = df.assemble(f*df.dx(mesh))
f = df.Expression("x[1]", element=P1)
y_cg = df.assemble(f*df.dx(mesh))
f = df.Expression("1", element=P1)
A = df.assemble(f*df.dx(mesh))
x_cg /= A
y_cg /= A

pqs = [
    (0, 0),
    (1, 0),
    (0, 1),
    (2, 0),
    (0, 2),
    (2, 1),
    (1, 2),
    (3, 0),
    (0, 3),
]

mus = []
for p, q in pqs:
    f = df.Expression(create_mu_pq_ex(p, q, x_cg, y_cg), element=P1)
    mus.append(df.assemble(f*df.dx(mesh)))

print(mus)

Do you want to reduce compilation time (indicated by a message similar to “calling JIT compiler”) or improve assembly speed?

Honestly, every improvement is valuable.

Currently, my code does not produce any notifications that it is calling JIT. However, I have seen them in the past.

Are there any suggestions for better understanding what's happening under the hood? Any tests I can run?

I guess that assembly speed is the issue… If you can point out which sections to time separately, I will be happy to measure some runtimes.
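
One way to tell the two costs apart is to time the first call of each step separately from repeated calls. If the first call is much slower than the later ones, one-off work such as JIT compilation is a likely suspect. If all calls take about the same time, the step itself (for example assembly or mesh generation) dominates. Below is a minimal sketch using only the Python standard library. `time_first_and_repeat` is a hypothetical helper name, and the dolfin calls in the trailing comment are copied from the code posted above and are not executed here.

```python
import time


def time_first_and_repeat(fn, repeats=5):
    """Time the first call of fn separately from later calls.

    Returns (first_seconds, mean_repeat_seconds, last_result).
    """
    # First call: would include any one-off cost such as compilation.
    start = time.perf_counter()
    result = fn()
    first = time.perf_counter() - start

    # Later calls: the steady-state cost of the step itself.
    repeat_total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        repeat_total += time.perf_counter() - start

    mean_repeat = repeat_total / repeats if repeats else 0.0
    return first, mean_repeat, result


# Stand-in workload so the sketch runs without FEniCS installed.
first, mean_repeat, value = time_first_and_repeat(lambda: sum(range(1000)))
print("first call: {:.6f} s, mean repeat: {:.6f} s".format(first, mean_repeat))

# Assumed usage with the posted code (not executed here):
#   time_first_and_repeat(lambda: generate_mesh(domain, 30))
#   time_first_and_repeat(lambda: df.assemble(f*df.dx(mesh)))
```

Wrapping the creation of each `df.Expression` in the same helper would show whether building the expression is expensive. Since the posted code formats `x_cg` and `y_cg` into the expression string, a new geometry produces a new string, so it may be worth checking whether that step is slow again for every new geometry.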