There's nothing that says it has to be one way or the other.
In practice, I'd think the easiest approach is to make the screensaver react to a Windows message, DDE, a COM call, etc., and simply play a new animation on command.
That way, when/if someone gets around to building an actual music detection/timing app (not trivial), it can hook into the screensaver using the exact same method.
You end up with something people can make use of right away, and something that can be expanded upon in the future.
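
To make the idea concrete, here's a rough, platform-neutral sketch in Python. Everything in it is made up for illustration: the Screensaver class, the post/pump names, and the animation names are all hypothetical, and the in-process queue is only a stand-in for whatever transport actually gets used (Windows message, DDE, COM). The point is just that a manual trigger today and a music detector later can both go through the same entry point.

```python
import queue


class Screensaver:
    """Hypothetical screensaver that switches animation on command."""

    def __init__(self, animations):
        self.animations = set(animations)
        # Stand-in for the real transport (Windows message, DDE, COM, ...).
        self.commands = queue.Queue()
        self.current = None

    def post(self, name):
        """Entry point for any sender: request an animation by name."""
        self.commands.put(name)

    def pump(self):
        """Drain pending commands, roughly what a message loop would do."""
        while True:
            try:
                name = self.commands.get_nowait()
            except queue.Empty:
                break
            # Unknown animation names are ignored rather than treated as errors.
            if name in self.animations:
                self.current = name
        return self.current


def beat_detector_stub(saver):
    """Placeholder for a future music detection app; it only posts a command."""
    saver.post("headbang")


saver = Screensaver(["idle", "dance", "headbang"])

# Today: something simple (a hotkey, a script) asks for an animation.
saver.post("dance")
played_now = saver.pump()

# Later: a detector uses the exact same call, with no screensaver changes.
beat_detector_stub(saver)
played_later = saver.pump()
```

The screensaver never needs to know who sent the command, so the detection side can be written (or rewritten) independently.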
Just my 2c