Wednesday, 20 February 2008

Ongoing projects

One of my worries before starting this DPhil course was that I wouldn't have much variety in my work, and would instead spend all my time reading and trying to find suitable material to read.

In actual fact, it has turned out that I seem to have a million and one different projects on the go, to the point that I am finding it hard to keep track of everything. The reading pile is definitely there on my desk, waiting for me to plough through more of it. In addition, though, I have papers to write, funding applications to prepare, courses to sit in on, interesting seminars to attend, talks to plan and give, and coursework for this ATC course (including some specific blog entries that I really should get around to soon). And then the actual end-of-year report will start to loom, by which point I am supposed to finally know what I am going to be doing in my DPhil research...

So to try and get a handle on what I'm currently working on, I am going to make a list here, for my own reference.

I have submitted applications to a couple of charitable funds now, and the main funding application I am working on is to the AHRC. Because my work is so multi-disciplinary (computational models of musical creativity), I am also hoping to be considered for EPSRC funding through the department (which I am not hopeful of getting, as Music Informatics already has one departmentally funded student). There is also a very tenuous case for submitting an application to the ESRC Open competition, although I think this may not really be worth pursuing.

I will never have time to do all the reading I want to do! But this is a collection of what I would like to at least skim through:

A number of papers; in particular I need to look through the publications lists of Wiggins, Cambouropoulos, Patel, Bharucha, Mozer, Johnson-Laird, Miranda, Dowling and Huron, to name but a few!

Books (a small selection, in no particular order):
- Margaret Boden: The Creative Mind and Dimensions of Creativity
- Sternberg (ed.): Handbook of Creativity
- Lerdahl and Jackendoff: A Generative Theory of Tonal Music
- Sloboda: The Musical Mind
- Bregman: Auditory Scene Analysis
- Roads: The Computer Music Tutorial
- Koestler: The Act of Creation
- Csikszentmihalyi: Creativity
- Wiggins/Deliège (eds): Musical Creativity in Theory and Practice
- Peretz/Zatorre (eds): The Cognitive Neuroscience of Music
- Todd/Loy (eds): Music and Connectionism
- Cope: can't remember the title, but it is about his creative music system EMI

Conference proceedings: AISB 2002, ESCOM (?), a workshop on creativity in Edinburgh in the 90s, and the workshop on computational creativity 2007.

I need to turn my MSc thesis into a paper that matches the extended abstract I submitted for an Interdisciplinary Musicology Conference in Greece, which I will be going to this summer. As I am writing this paper jointly with my old supervisor in Edinburgh, we need to work out how we are going to co-author it. I think the ideal scenario for both of us is that I write the bulk of the copy, with a lot of editing input from my supervisor, but I need to check this with him.

Also, I have jointly (with my current supervisor) submitted an abstract to a language and cognition conference in Brighton this summer. It is touch and go whether that will get accepted, I think, as it is slightly off topic for the conference; we are applying linguistic models of creativity to music, acknowledging the parallels between music and language, but the focus of this conference is on language. However this is work that I really want to investigate further, so I am going to carry on with the work anyway. If it doesn't get accepted at Brighton, then it will be useful for my DPhil anyway and might be publishable elsewhere.

The final paper is the one I submitted to ICMC earlier this month. Should it get accepted, I am sure there will be some revision I will need to do...

The technical communications course that this blog has been created for is the only course requiring me to do coursework outside of my DPhil study. Although, to be honest, it is a little frustrating to have to think about deadlines for this course when I am more concerned that I still don't know what my research questions for my DPhil are going to be, the course is proving very useful and will be worth the effort, I think. I have to prepare a poster and a talk for a mock conference in March, for which I will use my ICMC project. I'm hoping to reuse the talk at ICMC (if accepted) and possibly at my old university, Warwick, as a researcher there is interested in my work and would like me to go and speak there.

There are so many seminars on at the moment that I would love to sit in on, but I have so little time. Still, I am keeping my eyes open. There is one in E-intentionality coming up on the link between creativity and novelty, which I will definitely go to, and a few others that I will try to get to as well.

At the moment, there are courses in Generative Creativity, Data Mining, Music Analysis, Neural Networks and Computational Music, and I am trying to sit in on at least some of the lectures for each (whether I go depends on what that specific lecture covers).

Actually these lectures are serving a double purpose for me at the moment: seeing as three of them start at 9am and one at 10am, they are giving me a good reason to get myself out of bed and onto campus at the beginning of the day! Then after the lecture I feel like my brain has been woken up, and I'm ready to get going with whatever else the day has in store for me.

Sussex offers a lot of training courses at the moment. Currently I am doing the Profolio course for first-year DPhils, which takes a few hours every few weeks, as well as some skills courses offered by SP2, the skills training programme at Sussex. It seems a shame to waste these opportunities, and they are all helpful in some way, some more than others.

As I mentioned, I am in the middle of arranging to give a talk at Warwick. Edinburgh have also booked me to give a talk at the end of April, on the content that I am submitting to the Greek conference, based on my MSc thesis (artificially intelligent piano accompaniment). That should do me for now...!

No teaching duties for me this term, although last term a 2-hours-a-week tutorial job took up a lot more of my time than anticipated. I do, however, have marking commitments, which average about 5 hours at a time, roughly every 2 weeks, and some monitoring work for the department which takes half an hour a week.

This comes at the end of the list, even though it is the most important for me. At some point I need to finalise exactly what I will be doing the next year. Currently I know that I want to make models of some creative processes in music. So the lecture courses will help me to identify what models I could use and what history of creativity theory already exists, as will my reading. I need to pin down some specifics now.

Phew. That took a while to write down, and I've only included the academic projects I have ongoing, not the out-of-work projects such as music rehearsals, getting more gigs, getting to know more people in Brighton generally, family issues and running the London Marathon, all of which are time-consuming but important to me. I think it's been worth the time though; it's good to have a list down in black and white that encapsulates what I am doing at the moment.
I'd better get on with it now then!

Tuesday, 12 February 2008

Trip to Greece

Following on from the last post...

Got some good news about an abstract I tentatively submitted to a conference in Thessaloniki in July, based on my master's thesis work. The conference is called the 'Conference on Interdisciplinary Musicology' and it's run by someone whose publications interest me a lot, Emilios Cambouropoulos. The submission was on an artificially intelligent musical accompaniment system that I worked on as my master's project last year.

I co-authored an abstract for this with my old supervisor at Edinburgh (where I did my degree), and our abstract got accepted for the conference. So I'm off to Greece in the summer!

I don't think it hurt our chances of acceptance that my old supervisor also supervised Emilios Cambouropoulos's PhD thesis at Edinburgh about ten years ago... The review comments were also interesting to read. Two reviews were given, based on academic merit, relevance to the conference and relevance to the conference theme (musical structure). One reviewer thought our work was generally OK to good, based on the sound application of computational methods to a musical problem, but questioned the contribution back to music theory (a weak point of the work for this particular conference - I'll have to think about this a bit more). The other reviewer, however, loved the work, rating it as excellent on all the review criteria. That was very satisfying to read! I wonder if they would still have rated it as excellent on the academic criteria had they known it came from a master's thesis... :)

Mission complete

Thoughts collected, ideas collated, diagrams drawn, paper written.

It feels pretty good to look back on the paper I've produced for this music analysis project. Even if it doesn't get accepted for the conference I am aiming it at (ICMC, the International Computer Music Conference), I still have a 4-page paper describing the work I have managed to do on this project in the last few months. It has pretty pictures and music notation and everything
... :)

At first I wasn't that motivated to get this music analysis project working properly. It didn't quite fit in with the 'musical creativity' research I really wanted to read about. I am also a bit nervous about the whole concept of presenting a paper on this at ICMC, which is a big conference for me. I've only been looking into this for a few months, and not exactly a solid few months at that, so to present that work to a bunch of academics who really know their stuff... quite a daunting experience. So if I didn't finish the paper in time - oh well!

One deadline extension later, though, and I suddenly had a couple of weeks more than I thought to really get my project working and to understand what I was bringing to this area of research. Having put down my ideas on how to analyse these Bach fugues and separate them into their different voices, I got something working in Matlab. It didn't work 100%, but seeing something more or less working was a massive incentive to carry on and refine the work - to the point where I had to define for myself a point at which I would stop trying to make more and more improvements and instead knuckle down to writing the work up.

The researcher's nightmare of finding some related papers quite late in the project happened to me here. It was a real blow at first, as I thought: 'oh no, all my beautiful ideas have been done before!' But I started noticing aspects of my work that were genuinely different from the work already done. I guess it was at this point that I realised that even though I'd only been tackling this fugue analysis problem for a few months, I still had valid ideas to bring to it, and I had proposed a solution with significant aspects that hadn't been done before. At least I hope they haven't... I've followed the research trail a little further and haven't found anything yet...

So now what happens? Well, I now have a four-page paper under my belt; even if it doesn't get accepted for ICMC, I think I will try to get it published somewhere else (taking any review comments into consideration), as I think I've managed to get some good results out of this work and I want to get it into the academic domain somehow. Let's see what ICMC say anyway, and go from there.

Tuesday, 5 February 2008

that'll do...

OK, so I think I've identified where my work on voice segregation fits in with the related work that has been published recently. There has been a lot of good progress on this problem in the last five years or so, but it is by no means solved and there is a lot of work still to do. My machine learning/data mining approach does not encode much human knowledge about voice-leading, but performs well at the voice separation task thanks to what it has learnt during training. Now to write up my work for ICMC and see if people agree that it makes a good contribution.

Re: Bach fugue project. Kirlin and Utgoff 2005: Learning to segregate voices

As the only paper I have found that incorporates some form of learning how to segregate voices by training on examples, rather than by following rules, this is the most similar to my work. However, the authors train their system on only very small amounts of data; they use one piece (Bach's Ciaccona) for both training and testing, choosing small sections (circa 4-8 bars) from this piece to train the system on. The training sections are selected for their similarity to the testing sections chosen from the piece. In the later experimentation there is some effort to combine the earlier training sections into one large training section. However, this still uses only a relatively small amount of training music, all from the same piece, so there is a constant underlying tonality, style and set of basic thematic ideas. Some overfitting to that one specific piece (the Ciaccona) may well occur. How would this generalise to other similar works by Bach? Or other Baroque pieces? Or more widely across the musical spectrum? Could their system cope with larger amounts of training data, and if so, why haven't they used more?

It is good to see the authors state that they have found no other voice segregation methods that use automatic learning techniques - neither have I (yet).

This method learns how to identify what voice a note belongs to by observing its pitch relationship with the note in the previous time-slot. They examine the piece in smaller windows rather than in one large set, using a fixed window size (the exact window size is varied across experimentation to see which gives the best results).

Also: this system uses decision trees, so the answer given to the question "does this note belong to voice v?" is always a single, discrete one. There is no measure of the likelihood of this answer being correct, not even a rough, non-probabilistic estimate. What happens in this system if two simultaneous notes clash by being assigned to the same (monophonic) voice? The authors do not address this.
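
To make the contrast concrete, here is a toy sketch in Python (my own simplification, not Kirlin and Utgoff's actual feature set or classifier) of learning pitch relationships between adjacent time-slots from annotated data, and of how even a simple frequency count could supply the graded likelihood that a decision tree lacks:

```python
from collections import Counter

def interval_examples(prev_slot, cur_slot, voice_of):
    """Build (interval, same-voice?) training examples from two adjacent
    time-slots: one example per pair of notes, labelled with whether the
    two notes belong to the same voice in the annotated training data."""
    return [(abs(c - p), voice_of[c] == voice_of[p])
            for p in prev_slot for c in cur_slot]

def same_voice_probability(examples, interval):
    """A frequency estimate of how likely two notes a given interval apart
    are to share a voice - a graded answer, where a decision tree would
    only give a hard yes/no."""
    labels = Counter(label for iv, label in examples if iv == interval)
    total = labels[True] + labels[False]
    return labels[True] / total if total else 0.5  # no evidence: agnostic

# Two annotated slots: an alto line moving 60->62 and a soprano 72->74
# (MIDI pitch numbers; toy data assumes each pitch occurs in one voice).
voice_of = {60: 'alto', 62: 'alto', 72: 'soprano', 74: 'soprano'}
examples = interval_examples([60, 72], [62, 74], voice_of)
print(same_voice_probability(examples, 2))  # -> 1.0
```

In this fragment a step of 2 semitones always stayed within one voice in the training data, so the estimate is 1.0; an unseen interval returns the agnostic 0.5 rather than a forced yes/no.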

Thinking about Cambouropoulos's points (Voice identification, 2006), it is interesting to see that this system makes no use of "vertical" information in its process (the harmonic structure at any particular time-point, cf. the middle-ground level of Schenkerian analysis). So they make no use of observations about which notes are sounding at the same time to guide their workings.

Another difference between this system and mine is in the end goal. My focus is on identifying the route that each voice takes throughout the entire course of the piece, assuming that voices are present (but not necessarily active) for the entirety of the music. K+U, on the other hand, take a lower-level approach, identifying fragments of the voices in selected bars, but allowing the voices to vary throughout the course of the piece.

Re: Bach fugue project. Madsen and Widmer 06: Separating voices in MIDI

This method works from left to right, from the beginning of the piece to the end. A cost function based on the difference in pitch between consecutive notes in a voice is used, and the method tries to minimise this difference (using Euclidean distances). So each candidate voice allocation is given a degree of correctness, rather than a definite discrete classification. This would be useful in reconciling cases where there is a clash in assigning simultaneous notes to the same voice (assuming monophonicity of voices).
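
As a toy illustration of the cost-function idea (my own sketch, not Madsen and Widmer's actual algorithm), minimising total pitch distance resolves exactly the kind of clash mentioned above, by choosing the lowest-cost assignment of simultaneous notes to voices:

```python
from itertools import permutations

def assignment_cost(last_pitches, assignment):
    """Total squared pitch distance between each voice's previous note and
    the new note assigned to it (lower cost = smoother voice-leading)."""
    return sum((last - new) ** 2 for last, new in zip(last_pitches, assignment))

def assign_chord(last_pitches, chord):
    """Pick the assignment of the chord's notes to voices that minimises
    the total cost. Assumes one new note per monophonic voice; the real
    method handles more general cases."""
    best = min(permutations(chord),
               key=lambda a: assignment_cost(last_pitches, a))
    return list(best)

# Three voices whose previous notes were MIDI pitches 76, 60 and 48: the
# chord {50, 62, 74} is split so that each note joins the nearest voice.
print(assign_chord([76, 60, 48], [50, 62, 74]))  # -> [74, 62, 50]
```

Note that the exhaustive search over permutations is only practical for a handful of voices; it is just the shortest way to write down "take the cheapest clash-free assignment".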

This cost function is based on predetermined rules inspired by Temperley's preference rules for voice-leading harmonic structure. These rules are an example of how the implementors add human knowledge into the system for it to work correctly. Madsen and Widmer have experimented manually with the setting of this cost function, using their musical knowledge and observations of the performance of the system to improve voice identification success.

Although some human-driven bias is inevitable when capturing a real-world system in a computational model, due to decisions made when structuring the model, I would prefer to keep the influence of human knowledge to a minimum. Instead, the system should learn as much as possible about how to construct the individual voices from the notes in the piece, rather than using human-devised rules to guide and later correct it.

Madsen and Widmer's method heavily segments the voices, so that small patterns of notes (cf. Chew and Wu's contigs) are detected. They make a small post-processing effort to join these voice segments together, but in general their main focus is on local voicings of notes rather than on identifying a global voice that is present throughout the piece.

They only test on Bach fugues. I would like to see how their work generalises to other voice-driven music such as quartets or orchestral works.

Monday, 4 February 2008

Re: Bach fugue project. Szeto and Wong: A Stream Segregation Algorithm for Polyphonic Music Databases

Many patterns can be perceived in music, but the only ones noticeable to us are those contained entirely within a single stream (a "grouping of musical notes"). So the authors want to separate polyphonic music into streams, for music information retrieval tasks such as indexing a database of musical pieces. Their use of 'stream' seems to correspond exactly to what I refer to as a voice.

There is a useful discussion of the psychology behind why we perceive music patterns monophonically - i.e. one note at a time rather than a chord.

They separate music by identifying clusters of notes that together represent a musical event. Although I understand how they differentiate between simultaneous events (happening at the same time) and sequential events (happening at completely separate points in time, with no temporal crossover), I can't quite see how they then link the different events together to form a stream. I think I need to skim this paper again to make sure I understand it.

Re: Bach fugue project. Chew and Wu: Contig Mapping paper

Very similar work to what I want to do. Their key assumption is that voices don't cross over each other. They use this to allocate voices at points where all voices are sounding.

I disagree with their key assumption! But are they right? It seems a dangerous assumption on which to found their entire work...

Surely a major problem can occur if you base your whole program on a key assumption from your own domain knowledge, which then turns out to be incorrect some of the time? Better to base the program on objective facts and observations?

But this approach is quite useful in that they significantly cut down the amount of fugue they are looking at, by using marker points in time at which they are sure they have successfully allocated notes to voices. Is there some key assumption that I could use in a similar way? For example, if all voices are present, the notes are very far apart (how would I quantify this?) and each is in the correct range for its voice (ditto).
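
A minimal sketch (names and data my own, not Chew and Wu's) of how the no-crossing assumption yields safe allocations at "full" time-slots, where the number of sounding notes equals the number of voices:

```python
def allocate_full_slices(slices, n_voices):
    """For each time-slot where all voices sound, assign notes to voices
    by pitch order: under the no-crossing assumption the highest note
    always belongs to the top voice, the lowest to the bottom voice."""
    allocations = {}
    for t, pitches in enumerate(slices):
        if len(pitches) == n_voices:          # a 'full' slot: safe to allocate
            ordered = sorted(pitches, reverse=True)
            allocations[t] = {voice: p for voice, p in enumerate(ordered)}
    return allocations

# Slots 0 and 2 contain all three voices; slot 1 is ambiguous and skipped.
slices = [[60, 67, 72], [64, 72], [59, 65, 74]]
print(allocate_full_slices(slices, 3))
# -> {0: {0: 72, 1: 67, 2: 60}, 2: {0: 74, 1: 65, 2: 59}}
```

The interesting (and hard) part, which this sketch ignores, is then working outwards from these anchor points to allocate the ambiguous slots in between.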

Their algorithm seems very efficient - O(n^2).

They make several mentions of rules - to what extent is their approach rule-based?

One major thing I learnt from this paper: if using unfamiliar jargon in a paper, define it early! They don't define 'contig' till p7/20!

writing up a term's worth of work

Not the most normal first post for a blog, I guess... but the main purpose of this blog is to help me collect my thoughts together in my doctoral research. With that in mind, the next few posts are going to help me collect my thoughts on a project I'm working on.

I'm keeping details of the work I'm doing in a separate diary, but (stupidly) haven't kept much record of the papers I read in the build-up to carrying out this project.

So - I'm going to write short summaries of some papers relating to this project, including my reactions to them and how the work in each paper is relevant to my project.

A good starting point for this is to define the problem I am currently working on. Let's see if I can describe it (hopefully this will form part of the introduction for the paper I am writing):

Musical pieces can be made up of several melodic lines interwoven together. These melodic lines are commonly known as voices, although they are not restricted to vocal music but are also found in music written for instruments. What is important about the voices is that each can be considered a standalone melodic pattern, complete and interesting in its own right. Several related voices, combined to form one piece of polyphonic music, can generate additional harmonic qualities that enhance the voices.

Fugues are a perfect example of this compositional technique in action, being constructed solely from a number of different melodic voices. J. S. Bach was a fundamentally important composer in the history of fugue composition; in particular, his highly influential work The Well-Tempered Clavier comprises 48 fugues. Bach wrote two fugues in every key, to illustrate the full potential of each key in comparison to the others.

In analysing a Bach fugue, the musicologist first identifies each individual voice; this enables them to perform more advanced analysis of the melodic content, such as the re-use of a melodic pattern in different voices. The musical score usually gives the musicologist much help in identifying each voice, as each voice is notated slightly differently (the direction of the note stems indicates which voice each note belongs to).

Identifying each voice would be considerably harder, however, if these notational clues were removed. The musicologist would have to rely on information within the piece itself, such as the pitch of the notes and the rhythmic structure, used in conjunction with their knowledge of how Bach typically structured fugal voices.

Can a computer learn to perform the same task of extracting the constituent voices from a Bach fugue? I suggest that, given minimal (if any) human assistance and a training set of fugues with the fugal voices already identified, patterns of voice movement can be identified and learnt. Such patterns can then be used as background knowledge of Bach's fugal voice-writing, to assist the computer in identifying the individual voices in a previously unseen Bach fugue.
I'm sure that introduction to the problem area will get extensively re-written but currently I am at the stage of a first draft of a paper describing my solution to this task.
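
To make the idea concrete for myself, the learning step might be caricatured in a few lines: count the melodic intervals in voice-annotated training music, then assign each new note to the voice whose implied interval is most frequent in the training data. This is purely a hypothetical sketch of my own, not a worked-out method:

```python
from collections import Counter

def learn_interval_model(training_voices):
    """Count melodic intervals within annotated voices, giving a crude
    frequency model of how fugal voices tend to move."""
    counts = Counter()
    for voice in training_voices:
        for a, b in zip(voice, voice[1:]):
            counts[b - a] += 1
    total = sum(counts.values())
    return {iv: n / total for iv, n in counts.items()}

def assign_note(model, voice_last_pitches, new_pitch):
    """Assign a new note to the voice whose resulting melodic interval is
    most frequent in the training data (unseen intervals score zero)."""
    scores = [model.get(new_pitch - last, 0.0) for last in voice_last_pitches]
    return scores.index(max(scores))

# Train on two tiny annotated voices that move only in whole-tone steps,
# then assign a new note (MIDI 62) given the voices' last pitches 64 and 48.
model = learn_interval_model([[60, 62, 64, 62, 60], [48, 50, 48]])
print(assign_note(model, [64, 48], 62))  # -> 0 (a familiar step of -2)
```

A real version would obviously need far richer features than bare intervals (rhythm, range, vertical context), but even this caricature shows the shape of the approach: the "rules" come out of the training data rather than going in by hand.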