Mass Lumping via Quadrature Rule in a TensorFunctionSpace

Hi, I would like to assemble the form

a(Q, \tilde Q) = \int_\Omega \mathcal{I}_h(Q:\tilde Q)\, dx := \sum_{z\in\mathcal{N}_h} Q(z):\tilde Q(z)\int_\Omega \varphi_z\, dx

where Q, \tilde Q: \Omega \to \mathbb{R}^{2\times2} are my trial and test functions, \mathcal{N}_h are the nodes in my mesh, and \varphi_z is the corresponding piecewise linear basis function (strictly speaking, the basis functions for the space are E_{ij}\varphi_z where E_{ij} = \delta_{ij}). Also, Q(z):\tilde Q(z) := \sum_{i,j=1}^2 Q(z)_{ij}\tilde Q(z)_{ij}. In other words, this is the integral of a linear interpolation of the Frobenius inner product of the trial and test functions evaluated at the nodes in \mathcal{N}_h.

I’ve been having trouble defining this in FEniCS. I tried defining it as

A = inner(Q, Q_tilde)*dx(metadata={'quadrature_degree': 1, 'quadrature_scheme': 'vertex'})

but this does not seem correct because assemble(A).array() is not diagonal as one would expect. What would be the best way (or any way) to define this in FEniCS? Here is a minimal working example:

from dolfin import *
import numpy as np

mesh = RectangleMesh(Point(0, 0), Point(2, 2), 30, 30)
V = TensorFunctionSpace(mesh, "Lagrange", 1, shape=(2,2))
Q = TrialFunction(V)
Q_tilde = TestFunction(V)

A = inner(Q, Q_tilde)*dx(metadata={'quadrature_degree': 1, 'quadrature_scheme': 'vertex'})

arr = assemble(A).array()
print(np.sum(np.abs(arr - np.diag(np.diagonal(arr)))), arr, arr - np.diag(np.diagonal(arr)))