Thinking about it more, I’m increasingly certain your BLAS is behind it. PETSc builds against pthreads, but there are the lower-level libraries to consider as well: MUMPS, ScaLAPACK and BLAS.
There are several implementations of BLAS: OpenBLAS, BLIS and others (ATLAS, and others still). OpenBLAS can be built against either pthreads or OpenMP (or without threading). There are also 64-bit builds (64-bit integer indexing), but the rest of the stack is not using them yet.
If you haven’t specified which BLAS to install on your system then you probably have the reference implementation, libblas-dev, which performs poorly. It looks like a thread-optimised implementation has been installed on your workstation. You’ll want to choose one that works well for your system. OpenBLAS is probably a good generic choice. ATLAS only performs well when compiled with the specific flags that properly optimise for your CPU. BLIS is newer; maybe it’s OK. Intel’s MKL is also an option.
The various BLAS alternatives can be installed at the same time, and linking to them is dynamic (at runtime) via libblas.so.3. You can choose your preferred alternative with

sudo update-alternatives --config libblas.so-x86_64-linux-gnu

and

sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu

(the Debian BLAS developer hasn’t chosen to use simple names like “blas” for the alternatives, alas).
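If it helps to check which implementation is actually in use before and after switching, here is a minimal sketch. It assumes the Debian/Ubuntu multiarch path /usr/lib/x86_64-linux-gnu/libblas.so.3 (adjust for other architectures), and blas_target is just a helper name made up for illustration.

```shell
#!/bin/sh
# Report which file the libblas.so.3 symlink finally resolves to.
# Assumption: Debian/Ubuntu multiarch layout; the default path is for x86_64.
blas_target() {
    # $1: path to the libblas.so.3 symlink (optional, hypothetical default below)
    link="${1:-/usr/lib/x86_64-linux-gnu/libblas.so.3}"
    if [ -e "$link" ]; then
        # Follow the whole chain of alternatives symlinks to the real library.
        readlink -f "$link"
    else
        echo "not found: $link" >&2
        return 1
    fi
}

# Print the current target; do not abort if BLAS is not installed here.
blas_target || true
```

A resolved path containing "openblas" would suggest OpenBLAS is selected, while one under a plain blas/ directory would suggest the reference implementation. Running update-alternatives --display libblas.so.3-x86_64-linux-gnu should show the same selection together with the other installed candidates.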
tl;dr: running
sudo apt-get install libopenblas-pthread-dev
on your laptop might help.