Heat flux over boundary

Hey,

I have a heat equation and want to calculate the heat flux over a single boundary (in this example the right boundary) after every time-step.

I want to calculate the heat-flux over the (right) boundary by assembling

G = (inner((unew - uold)/dt, vtest)*dx + inner(grad(unew), grad(vtest))*dx),

rather than calculating grad(unew)*n, since this lacks the time-dependency. The final goal is to have a vector that evaluates the heat-flux over the boundary in a set of points, which I currently do via

x = Function(V)
x.vector()[:] = assemble(G)
y = np.array([x(1, i*dxx) for i in range(N+1)]).

This does work, but it seems like a computational overkill (the assembly over the whole domain) and I would like to reduce this, since this flux computation needs to be done very many times.

I tried to define a suitable subdomain and measure to reduce the region over which G is assembled, but this does not appear to reduce computational time, it is actually a little slower.

Any help or pointers would be appreciated.
Full minimal example below (v.2018.1.0):

import numpy as np
import dolfin as dol

N = 20
dt = 0.1
dx = 1/N

mesh = dol.UnitSquareMesh(N, N)
V = dol.FunctionSpace(mesh, 'CG', 1)

unew = dol.TrialFunction(V)
uold = dol.Function(V)
uold.interpolate(dol.Expression("sin(x[0]*M_PI)*sin(x[1]*M_PI)", degree = 2))
vtest = dol.TestFunction(V)

def BC_DIR(x, on_boundary):
    return on_boundary and (x[0] > 1 - dol.DOLFIN_EPS or x[0] < dol.DOLFIN_EPS and x[1] < dol.DOLFIN_EPS or x[1] > 1 - dol.DOLFIN_EPS)
bc = dol.DirichletBC(V, dol.Constant(1.0), BC_DIR)

F = (dol.inner((unew - uold)/dt, vtest)*dol.dx + dt*dol.inner(dol.grad(unew), dol.grad(vtest))*dol.dx)

usol = dol.Function(V)
dol.solve(dol.lhs(F) == dol.rhs(F), usol, bc)

G1 = (dol.inner((usol - uold)/dt, vtest)*dol.dx + dol.inner(dol.grad(usol), dol.grad(vtest))*dol.dx)

class subdom(dol.SubDomain):
    def inside(self, x, on_boundary):
        return x[0] >= 1 - dx - dol.DOLFIN_EPS
subdomains = dol.MeshFunction('size_t', mesh, dim = mesh.topology().dim())
subdomains.set_all(0)
subdom().mark(subdomains, 1)
dxsub = dol.Measure('dx')[subdomains]

G2 = (dol.inner((usol - uold)/dt, vtest)*dxsub + dol.inner(dol.grad(usol), dol.grad(vtest))*dxsub)

sol1 = dol.assemble(G1).get_local()
sol2 = dol.assemble(G2).get_local()
print(np.linalg.norm(sol1 - sol2, 2)) # 0.0

x = dol.Function(V)
x.vector()[:] = sol1
y = np.array([x(1, i*dx) for i in range(N+1)])
print(y)

M = 1000
import time
tstart = time.time()
for i in range(M):
    dol.assemble(G1)
print(time.time() - tstart)

tstart = time.time()
for i in range(M):
    dol.assemble(G2)
print(time.time() - tstart)