I am not sure whether this is related to the cache issue, but see if the following post helps:
I use FEniCSx and once submitted 100 jobs simultaneously on an HPC cluster. As a result, each job ran much slower than a single job submitted on its own. As I gradually killed jobs, the remaining ones gradually sped up. This is because the cluster uses a shared hard drive for the HOME folder. The default cache is stored in the home folder, so it is shared and used by all runs. If you redirect the cache to a unique location for each job, the slowdown goes away. But I am not sure whether this also alleviates the suspected memory issue.
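
Here is a minimal sketch of what redirecting the cache per job could look like. It rests on a few assumptions that are not from the post above: that the scheduler is Slurm and exports `SLURM_JOB_ID`, that `TMPDIR` points to node-local storage, and that your DOLFINx version places its default JIT cache under `$XDG_CACHE_HOME/fenics` (falling back to `~/.cache/fenics`), which is how I remember it but is worth checking for your installation. The helper name `use_per_job_cache` is made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def use_per_job_cache(base_dir=None):
    """Point XDG_CACHE_HOME at a directory unique to this job (hypothetical helper)."""
    # Assumption: the scheduler exports SLURM_JOB_ID; otherwise fall back to the process ID.
    job_id = os.environ.get("SLURM_JOB_ID", str(os.getpid()))
    # Assumption: TMPDIR (or the system temp directory) is on node-local storage,
    # not on the shared HOME file system.
    base = Path(base_dir or os.environ.get("TMPDIR", tempfile.gettempdir()))
    cache_home = base / f"fenics-cache-{job_id}"
    cache_home.mkdir(parents=True, exist_ok=True)
    os.environ["XDG_CACHE_HOME"] = str(cache_home)
    return cache_home


# Call this before importing dolfinx, so the default cache path is computed
# from the redirected XDG_CACHE_HOME.
cache_home = use_per_job_cache()
# import dolfinx  # uncomment in the actual job script
```

With this, each job compiles into its own directory, so the forms are recompiled once per job instead of being read from the shared cache. If you would rather not touch the environment, I believe the cache location can also be set through the `cache_dir` entry of the JIT options, but check the documentation for your version.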