I'm especially interested in Cave SH3 games, some of which are not very well optimized, so I don't get a constant 60 fps on my old Core2Duo.
This statement is a little offensive.
When I got the driver to work on, it was 'not very well optimized' and ran at 5-10% speed (at most) on a 3 GHz C2D. I spent over a week carefully optimizing a lot of it so that it runs in excess of 300% in places, dipping a bit below 100% in some situations where it was basically unavoidable.
As drivers go, it is heavily optimized, and optimizing it was an interesting challenge: analysing the generated code and understanding how GCC handles certain things, including cases where it outputs far less efficient code when you'd expect it to output better code.
You've cross-posted this question elsewhere anyway, and already got an answer there. You also seem dismissive of the advice to run a 64-bit OS and 64-bit MAME, which gives this driver a significant boost (~25%) without doing anything else.