Tuesday, 5 February 2008

Re: Bach fugue project. Madsen and Widmer 06: Separating voices in MIDI

This method works from left to right, from the beginning of the piece to the end. It uses a cost function based on the difference in pitch between consecutive notes in a voice, and tries to minimise this difference (using Euclidean distances). Each candidate voice allocation therefore receives a graded cost rather than a definite discrete classification. This would be useful in resolving cases where simultaneous notes compete for the same voice (assuming monophonicity of voices).
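To make the idea concrete for myself, here is a minimal sketch of a left-to-right allocation driven by pitch distance. It is not Madsen and Widmer's implementation: the names `separate_voices`, `pitch_cost` and `START_COST` are my own, the fixed penalty for starting an empty voice is an assumption, and note durations are ignored, so monophonicity is only enforced among notes that share an onset.

```python
from itertools import groupby, permutations

# Assumed penalty for placing a note in a voice that has no notes yet.
START_COST = 12


def pitch_cost(voice, note):
    """Cost of appending `note` (onset, midi_pitch) to `voice`.

    In one dimension the Euclidean distance is just the absolute
    difference between the new pitch and the voice's last pitch.
    """
    if not voice:
        return START_COST
    return abs(note[1] - voice[-1][1])


def separate_voices(notes, n_voices):
    """Greedily allocate notes to voices, one onset at a time.

    `notes` is a list of (onset, midi_pitch) pairs. Notes sharing an
    onset are assigned to distinct voices so each voice stays monophonic.
    """
    voices = [[] for _ in range(n_voices)]
    for _, group in groupby(sorted(notes), key=lambda n: n[0]):
        chord = list(group)
        if len(chord) > n_voices:
            raise ValueError("more simultaneous notes than voices")
        best_cost, best_ids = None, None
        # Try every way of giving the simultaneous notes distinct voices
        # and keep the allocation with the lowest total cost.
        for voice_ids in permutations(range(n_voices), len(chord)):
            cost = sum(pitch_cost(voices[v], n)
                       for v, n in zip(voice_ids, chord))
            if best_cost is None or cost < best_cost:
                best_cost, best_ids = cost, voice_ids
        for v, n in zip(best_ids, chord):
            voices[v].append(n)
    return voices


notes = [(0, 60), (0, 72), (1, 62), (1, 71), (2, 64), (2, 69)]
lower, upper = separate_voices(notes, 2)
print(lower)  # [(0, 60), (1, 62), (2, 64)]
print(upper)  # [(0, 72), (1, 71), (2, 69)]
```

Because every candidate allocation gets a numeric cost, a clash between two simultaneous notes is settled by comparing totals rather than by a hard rule. Trying all permutations is only practical for the handful of voices found in a fugue.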

This cost function is based on predetermined rules inspired by Temperley's preference rules for voice-leading harmonic structure. These rules are an example of how the implementors add human knowledge into the system for it to work correctly. Madsen and Widmer have experimented manually with the setting of this cost function, using their musical knowledge and observations of the performance of the system to improve voice identification success.

Although some human-driven bias is inevitable when capturing a real-world system in a computational model, due to decisions made when structuring the model, I would prefer to keep the influence of human knowledge to a minimum. Instead, the system should learn as much as possible about how to construct the individual voices from the notes in the piece itself, rather than rely on human-devised rules to guide it and later correct it.

Madsen and Widmer's method heavily segments the voices so that small patterns of notes (cf. Chew and Wu's contigs) are detected. They make a small post-processing effort to join these voice segments together, but in general their main focus is on local voicings of notes rather than on identifying a global voice that is present throughout the piece.

They only test on Bach fugues. I would like to see how their work generalises to other voice-driven music such as quartets or orchestral works.
