The standard answer, of course, is not to use GDI+ at all, since it isn't meant for rapid, game-style drawing. If you do want to speed up GDI+, though, the answers can be quite involved and subjective.
My own testing of GDI+ drawing methods has led me to conclude that texture brushes and the BufferedGraphics class give the best boost for sprite-type (i.e. arcade-style) games, within the limits of GDI+.
On the other hand, if you are doing a lot of pixel manipulation, such as in a paint application, computer-aided design (CAD), a fireworks simulation, or fractals, then processing pixel values in a memory array and rendering from there would probably be the quickest approach.
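To make the memory-array idea concrete, here is a minimal sketch in Python rather than actual GDI+ code. The helper names (make_buffer, set_pixel, fill_rect), the canvas size, and the 4-bytes-per-pixel blue/green/red/alpha layout are all assumptions for illustration. In .NET the finished array would typically be copied into a bitmap in one go (for example through Bitmap.LockBits) instead of making one drawing call per pixel.

```python
WIDTH, HEIGHT = 64, 48      # hypothetical canvas size
BYTES_PER_PIXEL = 4         # assumed layout: blue, green, red, alpha


def make_buffer(width, height):
    """Allocate a zeroed (transparent black) pixel array."""
    return bytearray(width * height * BYTES_PER_PIXEL)


def set_pixel(buf, width, x, y, bgra):
    """Write one pixel straight into the array; no drawing API involved."""
    offset = (y * width + x) * BYTES_PER_PIXEL
    buf[offset:offset + BYTES_PER_PIXEL] = bytes(bgra)


def fill_rect(buf, width, x0, y0, w, h, bgra):
    """Fill a rectangle one row at a time using slice assignment."""
    row = bytes(bgra) * w
    for y in range(y0, y0 + h):
        start = (y * width + x0) * BYTES_PER_PIXEL
        buf[start:start + len(row)] = row


buf = make_buffer(WIDTH, HEIGHT)
fill_rect(buf, WIDTH, 10, 10, 20, 5, (255, 0, 0, 255))   # opaque blue block
set_pixel(buf, WIDTH, 0, 0, (0, 0, 255, 255))            # one opaque red pixel
# At this point the whole frame would be handed to the renderer in one copy.
```

All the per-pixel work happens on plain memory, and the drawing API is touched only once per frame, which is where the speed-up would be expected to come from.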
As for your second question, I did some experiments with two threads rendering a number of items into two different backbuffers, but found that the total number of items that could be rendered per second did not go up. It appears that even when two threads use two different Graphics objects to write into two different backbuffers, the GDI+ calls themselves may be serialized to some extent, at least within a given process, so the net result was the same as, or worse than, doing all the rendering in one thread. I assume that if you had a large amount of calculation to do alongside the drawing, there could be a substantial gain from separating those two tasks into two threads.
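The split between calculation on one thread and all drawing on another can be sketched as a simple producer/consumer pair. This is a Python illustration of the pattern only, not GDI+ code and not a benchmark. The names compute_frames and render_all, the movement formula, and the list standing in for the backbuffer are all hypothetical.

```python
import queue
import threading

DONE = None  # sentinel telling the renderer that no more frames are coming


def compute_frames(out_q, frame_count, sprite_count):
    """Worker thread: does the per-frame calculation, never draws."""
    for frame in range(frame_count):
        # Stand-in for real game logic: move each sprite 2 px per frame.
        positions = [(i * 10 + frame * 2, i * 5) for i in range(sprite_count)]
        out_q.put(positions)
    out_q.put(DONE)


def render_all(in_q, draw):
    """Render thread: the only place drawing calls are made."""
    frames = 0
    while True:
        positions = in_q.get()
        if positions is DONE:
            return frames
        for x, y in positions:
            draw(x, y)
        frames += 1


drawn = []                       # stands in for the backbuffer
q = queue.Queue(maxsize=2)       # small bound so the worker stays just ahead
worker = threading.Thread(target=compute_frames, args=(q, 5, 3))
worker.start()
frames_rendered = render_all(q, lambda x, y: drawn.append((x, y)))
worker.join()
```

Keeping every drawing call on a single thread fits the observation that rendering from two threads gave no gain; only the calculation is moved off the render thread.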
One series of experiments with GDI+ drawing can be found in the first 12 posts of this thread.
It starts off fairly traditionally, drawing from the Paint event, but quickly moves to using the BufferedGraphics object and texture brushes to increase both drawing speed and backbuffer "flipping" speed.