Assembling in Parallel and Solving in Serial

I am working on a problem where assembling the vector u in Ax = u takes a long time. Running the assembly in parallel reduces the assembly time considerably, but it also increases the time needed to solve the system. Is it possible to assemble the vector u in parallel and then solve on a single core, or is there something else I should be trying?

Does the solution from this thread help?

That did help: the solve time no longer increases as much, although it still goes up slightly. This is very helpful, but I still wonder whether there is a way to assemble in parallel and solve in serial?
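
For what it's worth, the general pattern being asked about (assemble the pieces of u concurrently, collect them in one place, then run a single serial solve) can be sketched outside of any finite element library. The snippet below is only an illustration under stated assumptions: it uses plain NumPy and a thread pool standing in for the parallel workers, the function names (assemble_chunk, assemble_parallel, solve_serial) are made up for this example, and the matrix A and the per-entry work are placeholders rather than anything from the actual problem. In an MPI-based code the equivalent step would be gathering the distributed vector onto one rank before solving there.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def assemble_chunk(start, stop):
    """Hypothetical stand-in for the expensive work on rows [start, stop) of u."""
    idx = np.arange(start, stop, dtype=float)
    return np.sin(idx) + 1.0


def assemble_parallel(n, workers=4):
    """Split the rows of u into contiguous chunks and assemble them concurrently."""
    bounds = np.linspace(0, n, workers + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so the chunks line up.
        chunks = pool.map(assemble_chunk, bounds[:-1], bounds[1:])
        return np.concatenate(list(chunks))


def solve_serial(A, u):
    """Solve A x = u on the calling thread only (dense direct solve)."""
    return np.linalg.solve(A, u)


n = 8
# Placeholder system matrix: a small, diagonally dominant tridiagonal matrix.
A = 2.0 * np.eye(n) - 0.5 * np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
u = assemble_parallel(n)  # assembled by several workers
x = solve_serial(A, u)    # solved in one place
```

Whether this split pays off depends on how the cost of collecting u compares with the solve itself, so it is worth timing both variants on the real problem.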