Yes, this is the page where you can make program suggestions for the ISMIR 2012 "Demos and Late-breaking" track

(go here for more general information about the ISMIR 2012 Demos and Late-breaking track)

Put your program suggestions below. If a category is missing, feel free to add it.
Don't forget to link your name to your email or twitter account!

How to edit?
Simple: click on the "edit" button on the top right corner of this page, type away, and save. Upon submission, you'll be asked to complete a quick form.

Be sure to save your work and check back as simultaneous edits cancel each other!
To propose a topic/session, put a tentative title in bold, then a quick paragraph explaining what you're planning. End it with "proposed by [yourname]" with a link to your email, twitter or some other way we can contact you.
To comment on a session (add to its scope, say you're interested, etc.), add a bullet point after the proposal's main paragraph (see examples below). End your comment with your name, as above.
If you have any questions or problems, ask JJ Aucouturier for help.

10th October: We've assembled a preliminary program, based on your suggestions below. It's all open of course, and it can (and will likely) change until the last minute (and even during the event, if topics need to branch out, or if new ideas come up)
The program is editable here:

12th October: The event was a real blast, and participation exceeded even our wildest hopes (and boy, they were wild). Thanks to all involved. Here's a quick snap of the event's program at 4pm, immediately after the program building session and before the first session batch. All empty slots quickly filled up over the remainder of the afternoon, to the point of lacking space at the end of the day. Watch this space for more pictures of the complete event, and updates about session debriefs, reviews, etc. JJ Aucouturier

Picture by Mohamed Sordo (@neomoha)

UPDATE 19th December.

Links to abstracts published after the event, on

**Using Linked Open Data for Novel Artist Recommendations**
Stephan Baumann and Rafael Schirru
German Research Center for Artificial Intelligence
**Chordify: Chord transcription for the masses**
W. Bas de Haas1,3, José Pedro Magalhães2,3, Dion ten Heggeler3, Gijs Bekenkamp3, Tijmen Ruizendaal3
1Department of Information and Computing Sciences, Utrecht University, 2Department of Computer Science, University of Oxford, 3Chordify
**A Music similarity game prototype using the CASIMIR API**
Daniel Wolff1, Guillaume Bellec2
1City University London, School of Informatics, Department of Computing, 2ENSTA ParisTech
**Notes from the ISMIR12 Late-Breaking session on evaluation in music information retrieval**
Geoffroy Peeters1, Julián Urbano2, Gareth J. F. Jones3
1STMS IRCAM-CNRS-UPMC, 2University Carlos III of Madrid, 3Dublin City University
**Infrastructures and Interfaces for data collection in MIR**
Tillman Weyde and Daniel Wolff
Department of Computing, City University London
**Music Imagery IR: Bringing the song on your mind back to your ears**
Sebastian Stober1, Jessica Thompson2
1Data & Knowledge Engineering Group, Otto-von-Guericke-Universität Magdeburg, 2Bregman Music and Auditory Research Studio, Dartmouth College
**Late-break session on Music Structure Analysis**
Bruno Rocha1, Jordan B. L. Smith2, Geoffroy Peeters3, Joe Cheri Ross4, Oriol Nieto5, Jan Van Balen6
1University of Amsterdam, 2Queen Mary University of London, 3IRCAM-CNRS STMS, 4Indian Institute of Technology, Bombay, 5New York University, 6Utrecht University
**MIReS Roadmap: Challenges for Discussion**
MIReS consortium
**Shared Open Vocabularies and Semantic Media**
Gyorgy Fazekas, Sebastian Ewert, Alo Allik, Simon Dixon, Mark Sandler
Centre for Digital Music, Queen Mary University of London
**Teaching MIR: educational resources related to MIR**
Emilia Gómez
Music Technology Group, Universitat Pompeu Fabra
**Past, Present and Future in Ethnomusicology: the computational challenge**
Sergio Oramas1, Olmo Cornelis2
1Polytechnic University of Madrid, 2University College Ghent

MIR and Impact Factor

How can we improve MIR-related journals' Impact Factor?

(Mohamed Sordo)

I wanted to share with you a question that I know many of you have already thought about, still I think it would be really nice to have a discussion on it.

The main topic would be:

  • What steps, what guidelines, what procedures should we follow so that our MIR-related, ISI-indexed, peer-reviewed journals have a higher impact factor?

We could discuss subjects such as:

- What is Impact Factor and how is it computed?
- Practical issues of Impact Factor in researchers' careers.
- Are we aware of all the MIR publications in Journals?
- Should we have a unifying Journal?
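As a starting point for the first question: the standard two-year Impact Factor of a journal for year Y is the number of citations received during Y by items the journal published in Y-1 and Y-2, divided by the number of citable items it published in those two years. A minimal sketch of that arithmetic follows; the journal and all the numbers are made up for illustration.

```python
def impact_factor(citations_by_pub_year, citable_items_by_pub_year, year):
    """Two-year impact factor for `year`.

    citations_by_pub_year[y]: citations received during `year`
        by items the journal published in year y.
    citable_items_by_pub_year[y]: citable items published in year y.
    """
    window = (year - 1, year - 2)
    citations = sum(citations_by_pub_year.get(y, 0) for y in window)
    items = sum(citable_items_by_pub_year.get(y, 0) for y in window)
    if items == 0:
        raise ValueError("no citable items in the two-year window")
    return citations / items


# Hypothetical journal, numbers for illustration only.
citations_in_2012 = {2011: 90, 2010: 150}
citable_items = {2011: 60, 2010: 100}
print(impact_factor(citations_in_2012, citable_items, 2012))  # 240 / 160 = 1.5
```

Note that only citations made in indexed venues are counted, which is one reason the choice of where MIR work is published matters for this discussion.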

Chordroid: Real-time Chord Recognition
I've recently been in discussions with a master's student here at C4DM, Samuel Myer, whose project work on real-time chord recognition, supervised by Dawn Black, has resulted in a fine real-time chord recogniser: Chordroid. In fact, it's an Android app.

In the Demo and Late-breaking session, I'll demo the chord recogniser (a guitar will be available for anyone who wants to try it) in order to
  • show you how well it works
  • get feedback and suggestions from you about situations in which you would like to use such an app.

This last point is the more important one to us: while the app is a nice demonstration of MIR tech, its practical use is not entirely obvious, and any good suggestions could guide the further development of the application.
- proposed by Matthias Mauch (C4DM, Queen Mary)

Chord Extraction for the Masses
Always wanted to play along with your favourite tracks? Chordify is a music player that extracts chords from musical sources like Soundcloud, Youtube or your own files, and shows you which chord to play when.

The aim of Chordify is to make some of the technology that has been available within the MIR community for some time accessible to a broader audience. Our interface is designed to be extremely simple: generally, everyone who can hold a musical instrument should be able to use it. At the "Demos and Late-breaking" session you can play around with the website, and we will be happy to answer any question that you might have about Chordify. What we hope to gain from our demo are suggestions, comments or criticism that help us improve our service.

For more information or a sneak preview see
- proposed by W. Bas de Haas (Utrecht University)

I am working on a user interface for computer-aided motivic analysis on symbolic data. The system employs a pattern-detection algorithm presented at this year's ICMC and aims at making the results easily accessible.
I would like to present the system and invite you to general discussions on pattern-detection and user-interfaces for computer-aided music analysis.
- proposed by V. Thomas

Large-scale symbolic music data sets, data-mining and search

Peachnote is a search engine and analysis platform for sheet music [paper]. We have automatically OMRed and indexed 2,000,000 music sheets from the IMSLP, Library of Congress, the web and other sources.
There are multiple datasets available for researchers and we are looking for suggestions from anybody interested in symbolic music data for sharing further useful data with the community.
The applications built using these data can be easily put in front of 1.5 million users a month, so the research based on these data can have quite high real-world impact and profile.

Possible applications include studies on
  • evolution of music
  • music similarity and clustering
  • language models for music
  • structural analysis of music

The data can be used to support
  • audio transcription
  • algorithmic composition
  • music analysis, etc.

If symbolic data can benefit your projects, or if you are excited about serving a large audience of musicians and amateurs, let's meet.

- proposed by Vladimir Viro

MIR (meta)Evaluation, ground truth

MIR (meta)Evaluation

This ISMIR we're presenting a paper on the current challenges in evaluating melody extraction (predominant F0 estimation) algorithms. Naturally, the topic is relevant to those in the MIR community working on melody extraction. BUT - I think that the discussion could consider melody extraction as a "use case" to discuss MIR evaluation procedures more in general, thus making it a relevant discussion for the wider MIR community. For example, in our study we consider the evaluation of the MIREX AME task and discuss issues related to the annotation procedure, the length of the excerpts used for evaluation and the size/content of the music collections. As you may have guessed, we found some important issues that require our (MIR community) attention. Since MIREX is tied with ISMIR, I think this would be the perfect forum to discuss issues related to MIR evaluation procedures and the importance of getting it right!
- proposed by Justin Salamon (MTG)
  • This could be a great session! Maybe people from different tasks discussing potential lines for research by pointing out possible weaknesses in the evaluation practices. Julián Urbano
  • I agree! Much needed. Does anyone feel up for giving a short primer on effect sizes in statistics, and how it relates to meta-evaluation? (I'm not remotely qualified, but would definitely be interested) JJ Aucouturier
  • There is A LOT on evaluation at this ISMIR: a special session, many papers... Importantly, in the MIReS wiki (see proposed late-break session "Grand Challenges in MIR research" below), we included "Evaluation" as a relevant challenge... it being a wiki, it is awaiting your contributions... Fabien Gouyon
  • We have done (unpublished) work on the evaluation of chord recognition systems, trying to account for subjectivity and biases in ground truth annotations, which we think are important confounding factors in all MIR evaluation tasks. As part of this study we investigated how well human annotators do when scored against each other (could be an upper limit on what to expect of a machine), and we developed a method to combine several 'ground truths' from several annotators into one single one that will be less biased. These ideas are in principle independent of the task and could also be used for AME and other tasks. So the topic of this proposed session is of great interest to us and we'd be happy to contribute! Ni Yizhao, Matt McVicar, Raul Santos-Rodriguez, Tijl De Bie (Raul and Tijl are present at ISMIR).
  • Happy to gather people for this! There is a panel on evaluation on Friday morning. Please try discussing some of these issues with the wider audience in the morning too. If there are more people interested, we'll continue the discussion during the afternoon. Julián Urbano
  • As Fabien and Julian said, there is a whole session on evaluation on Friday morning, including a round table (links here:). In this link you will find the preliminary set of questions that will be addressed by the panel members. If you're thinking about other important topics to be discussed, please indicate it here, drop me an email, or just tell me; we could then discuss this during the panel and during the late-breaking session. Geoffroy

The CASIMIR API with an example Game and Survey

We would like to present a game with a purpose and a general web API that we are currently developing for collecting music similarity data. The API defines the data formats and the tasks to be solved by the user. It also takes charge of storing survey and user data and of the statistical selection of samples. Parameters such as genre combinations can still be set. The API provides information and audio URLs for clips from the MagnaTagATune and Million Song datasets. It is currently being tested within a music comparison survey. Our intention is to encourage further surveys where efforts can be focused on the user interface design, leaving the sample management to the API. Daniel Wolff @ City University, Guillaume Bellec, Tillman Weyde

Infrastructures and Interfaces for Data Collection in Music Information Retrieval

Supervised training of MIR models on ground truth data is a common technique in our community. Yet little data is available on users' perception of a given piece of audio or music. How can we get such data for song similarity or emotion? How can we make it accessible and make results comparable to others? Recently such information has been gathered using Games With A Purpose. We use such data in our paper at this ISMIR. We would like to discuss possible infrastructures and standardised building blocks that help improve data collection and support sharing and extensibility of datasets. Tillman Weyde, Daniel Wolff @ City University

We would like to see further presentations and discussions on the topic of collecting user ground truth data, including all relevant disciplines and evaluations like MIREX which already have defined some standards.

Melody Extraction

New Melody Extraction Vamp Plug-in:

On a more personal note, I'm happy to say that this ISMIR I'll be releasing the first version of my melody extraction algorithm as a Vamp plug-in (by melody extraction I mean predominant F0 estimation from polyphonic music). The plug-in provides not only the final melody estimated by the algorithm but also visualisations of the signal's pitch content (salience function) and all pitch contours estimated from the signal, out of which the melody is later selected. I'm hoping the plug-in will serve as a useful tool for the research community, both for those wishing to use predominant F0 estimation for more high-level tasks (transcription, pattern/motif detection, classification, etc.) and for those wishing to use the algorithm as a benchmark against which to compare their own melody extraction algorithms. If possible, I would be very much interested in presenting the plug-in during a demo session at ISMIR!
- proposed by Justin Salamon (MTG)
  • Yet another great proposal! [Gustavo Martins (CITAR / Católica Porto)]
  • I'd be interested in this, too. Matthias Mauch
  • UPDATE: I'm very happy to see there's interest in this demo session. The plug-in is now available for download online, in case you wish to try it out before the session: [Justin]

Beat-tracking, Pulse

Beat-tracking in non-Western music:

I have been working recently with a dataset of North Indian (Hindustani) Classical music, studying links between emotions and tempo. It occurred to me informally that 1) Indian (my co-author) and Western (myself) listeners had very different ways to assess the tempi of these pieces, and that 2) typical beat-tracking algorithms hit neither target, really. I believe non-Western rhythms violate assumptions that we have built into our algorithms - it's not a matter of improving them, but rather of radically rethinking what a tempo is. Has anyone had similar experiences? Shouldn't we just gather and share insights? proposed by JJ Aucouturier

  • We are currently having similar experiences in the context of African music, and the future direction of my personal research focuses on makam music, where I will encounter similar issues again. I would be very interested in a discussion related to that. Andre Holzapfel
  • Tempo perception is a very individual experience, especially when it concerns music you're not familiar with. For African music we noticed quite some ambiguity, as in tempo octaves (which happens in Western music as well), but also a large number of binary/ternary relationships. This makes the topic worthwhile to investigate and discuss. Anyway, in general, I am keen on sessions on ethnic music and its specific problems for computational analysis. I think there are quite some challenges for MIR in this field! proposed by [Olmo Cornelis]
  • When we compare tempo estimation across different datasets (for example the MIREX2004 Ballroom and Songs datasets, the dataset provided by Hainsworth, an African music dataset, and a Beatles dataset), the best score we have recently reached on the African music dataset is 82.4% ACC2, which is the lowest score among these datasets. Based on these numbers, African music deserves more attention because it is harder than the other datasets, although different genres within the other datasets also perform quite differently. We agree that natural sounds are another interesting kind of audio to study. Fu-Hai Frank Wu
  • Regarding research in processing non-Western music in general, a meeting is going to happen on Wednesday 10.10. at 19.30. If you want to participate, please send a mail to me (hannover at csd dot uos dot gr), because we will have to make a reservation for the restaurant where we will go. Meeting point is at the entrance of the venue. Andre Holzapfel
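
For readers unfamiliar with the ACC2 figure quoted in the comments above: tempo estimators are commonly scored with two accuracies. The sketch below assumes the usual convention, which the comment does not spell out: ACC1 counts an estimate as correct when it lies within 4% of the annotated tempo, while ACC2 also accepts estimates within 4% of 2, 3, 1/2 or 1/3 times the annotation, forgiving the octave and binary/ternary confusions discussed in this thread. The tempi are made-up values, not results from any of the datasets mentioned.

```python
ACC1_FACTORS = (1.0,)
ACC2_FACTORS = (1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)


def tempo_correct(estimate, reference, factors, tolerance=0.04):
    """True if `estimate` is within `tolerance` of any allowed multiple of `reference`."""
    return any(
        abs(estimate - f * reference) <= tolerance * f * reference
        for f in factors
    )


def accuracy(estimates, references, factors):
    """Fraction of excerpts whose tempo estimate counts as correct."""
    hits = sum(tempo_correct(e, r, factors) for e, r in zip(estimates, references))
    return hits / len(references)


# Hypothetical annotated and estimated tempi in BPM.
references = [120.0, 90.0, 60.0, 100.0]
estimates = [121.0, 180.0, 20.5, 140.0]

print(accuracy(estimates, references, ACC1_FACTORS))  # 0.25
print(accuracy(estimates, references, ACC2_FACTORS))  # 0.75
```

The gap between the two numbers shows how much of an algorithm's error is metrical-level confusion rather than outright failure, which is exactly where listeners from different musical cultures may disagree with the annotation as well.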

Can you find a pulse in this music?

According to Clayton (Clayton, 1996) there are at least 70 music genres which have no meter, but as he himself writes: "[There is] a doubt as to how much music (if any) is completely free of pulsation, even if this pulsation is often indistinct or discontinuous, or only perceived by a proportion of listeners. [..] So strong is the urge to perceive pulse in music (and, perhaps to generate a pulse in performance), that there may be very little music which at no point suggests pulsation."

Can we contribute to this musicological discussion from an MIR perspective? Do we have computational tools to identify a pulsation in unmetered music? Here I attach musical fragments (from that are considered to have no meter. Can you find a pulse in them? Do you have any music fragment to share with everyone (ideally using CC license) of relevance to this topic?

Koran Recitation:(more examples)
Carnatic Alap: (more examples)
Hindustani Alap: (more examples)

At ISMIR we could have a group discussion on possible approaches to identify the pulse in this type of music and maybe share some initial results. Proposed by Xavier Serra

  • Very interested. We're working on a dataset of Hindustani Alap music, which I'd be happy to share. JJ Aucouturier
  • I would like to add Cretan rizitika as another interesting form to analyze. Andre Holzapfel
  • Those clips are quite different from other clips that our algorithm has processed before. There is little or even no binary/ternary relationship between the estimated tempo values in beats per minute (BPM). The most likely ones are close to 120 BPM because a perception model is used in our algorithm. Fu-Hai Frank Wu
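
One simple computational starting point for the question above is to look for periodicity in an onset-strength envelope via its autocorrelation. The following is a minimal sketch only, not any participant's method: it assumes an envelope has already been computed (one value per analysis frame), uses a synthetic envelope, and reports a single global period, whereas the pulse in unmetered music may be discontinuous and would need a windowed analysis.

```python
def autocorrelation(envelope, max_lag):
    """Unnormalised autocorrelation of the mean-removed envelope for lags 0..max_lag."""
    n = len(envelope)
    mean = sum(envelope) / n
    centred = [v - mean for v in envelope]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def strongest_period(envelope, min_lag, max_lag):
    """Return (lag, strength): the lag with the highest autocorrelation and
    its value relative to lag 0 (close to 1 means a very regular pulse)."""
    ac = autocorrelation(envelope, max_lag)
    lag = max(range(min_lag, max_lag + 1), key=lambda l: ac[l])
    return lag, ac[lag] / ac[0]


# Synthetic envelope: one onset every 5 frames.
envelope = [1.0 if i % 5 == 0 else 0.0 for i in range(100)]
lag, strength = strongest_period(envelope, min_lag=2, max_lag=20)
print(lag, round(strength, 2))  # 5 0.95
```

A low relative strength at every lag would be one crude indication that a clip suggests no steady pulse, which could then be compared with what listeners report.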

VRAPS: Visual Rhythm-based Audio Playback System

As traditionally defined in the context of music, a beat represents a distinctive musical event such as the hitting of a drum or the start of a new melodic note. Similarly, we define a visual beat to indicate distinctive visual events such as rhythmic gestures and dance motions, and propose a method for automatically detecting visual beats and tempo from a video signal (as described in this article in IEEE Magazine). In this demo, we present an interactive, real-time audio playback system that continuously adjusts the playback speed of an audio signal based on the visual rhythm detected from a real-time video capture device.
Ching-Wei Chen and Oscar Celma

Music recommendation

MIR and music recommendation:

Recommendation technologies have been used more and more in industry, and requests for better, faster and domain-specific recommendation engines have been formulated by companies but also by users. While other research communities, like the ones from the ACM RecSys and KDD conferences, are actively working on that hot topic, the MIR community does not seem to be putting as much effort into solving the still largely unexplored problem of music recommendation and discovery. Indeed the number of papers covering that topic (and associated ones such as similarity) at ISMIR has been decreasing or at best stagnating for the past 6 years. So what is it? Are we done? Is Collaborative Filtering the best solution out there even for music recommendation? Is it too hard to use audio-extracted information and music metadata to significantly improve music recommendation? What about users? Are they happy with current commercial music recommendation engines? What can we do for them? proposed by Amélie Anglade
  • What if we think of recommendation not only on full songs but also on song sections? Could it be relevant to users? We think user adapted normalization could also be an interesting topic, as many users already have huge libraries and maybe would like to discover similarities inside their libraries. Bruno Rocha, Aline Honingh and Niels Bogaards
    • Count me in! (Mohamed Sordo) I'm really interested in the idea of working with song segments. Actually, a good application of that would be to help DJs in the selection of different segments for mixing, mashups, etc.
  • One question could be: do we know our users? They are the ones who make a music recommendation system fail or succeed (Mohamed Sordo)
  • For a subjective music recommendation system, what kind of user information do we need to achieve better recommendation results? (Yading Song)
  • After getting a playlist from the music recommender, how does the order of the playlist influence the user's listening experience? (Yading Song)
  • Finding the right music track for film, television, radio or other media productions is a challenging task. Within the Making Musical Mood Metadata project with the BBC and I Like Music, we investigate ways to recommend music tracks from commercial and production music catalogues using content-based analysis and/or metadata. A first prototype of the recommendation system, to be assessed by producers, as part of the BBC Desktop Jukebox, will be demoed. (Mathieu Barthet, Centre for Digital Music, Queen Mary University of London)
Music Similarity:
We are developing a new music similarity model entirely based on audio content analysis. The goal is to provide the users the possibility to navigate their music libraries along different axes of similarity. This can be done at the "song-level" or at a smaller "segment-level". We would like to present this work at the demo session. Proposed by Bruno Rocha, Aline Honingh and Niels Bogaards

MixMatcher - a project to join music lovers
We have an ongoing project that focuses on bringing together people with similar tastes in music. It is an Android application that navigates through users' music libraries and tries to connect them, based on a music similarity API. The application is meant to be used in a confined area, as it is an offline application that connects people via Bluetooth. A service wakes up when the user switches on the internet (necessary to call the API services) and builds a music similarity database, so that when the user runs the application he/she no longer needs internet access. We are also studying the possibility of connecting people in an online network (Henrique Lima, Rui Silva).

Holistic / Hybrid Recommendation and Explanation:
We have been working on hybrid recommendation strategies since 2003. Recent interest from commercial music streaming services and internet radio stations has enabled us to work closely with real-world use cases. Nevertheless, the issue is tough. In our opinion, users demand explanations. For this reason we are currently developing a hybrid recommender consisting of CF (user plays), social facets (Facebook, etc.) and semantic metadata (Freebase, MusicBrainz; hopefully using the graph structure). In a joint project with a German music streaming service we will try to evaluate different strategy settings (i.e. the weightings of the several components) and explanation interfaces with real-world users in 2013. We would be very glad to discuss challenges, joint work, etc. in this session. Music recommendation is NOT DEAD! :)
Stephan Baumann
Rafael Schirru
Christian Reuschling
Björn Forcher
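
The weighting of components mentioned in this proposal can be pictured as a weighted sum of per-component scores. The sketch below is only an illustration under that assumption: the component names, scores and weights are hypothetical, and it is not the recommender described above. Returning each component's contribution alongside the final score gives a simple hook for the kind of explanation users are said to demand.

```python
def hybrid_score(component_scores, weights):
    """Combine per-component scores in [0, 1] into one score.

    Returns (score, contributions), where contributions[name] is the share
    of the final score that came from that component.
    """
    total_weight = sum(weights.values())
    contributions = {
        name: weights[name] * component_scores.get(name, 0.0) / total_weight
        for name in weights
    }
    return sum(contributions.values()), contributions


# Hypothetical weights and scores for one candidate track.
weights = {"cf": 0.5, "social": 0.25, "semantic": 0.25}
candidate = {"cf": 0.8, "social": 0.4, "semantic": 0.6}

score, why = hybrid_score(candidate, weights)
main_reason = max(why, key=why.get)
print(round(score, 2), main_reason)  # 0.65 cf
```

Evaluating different strategy settings then amounts to varying `weights` and comparing how real users respond to the resulting rankings and explanations.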

Visual Music Discovery using Personalized Taste Clusters

With the advent of unlimited streaming music services, we now have the ability to listen to over 10 million tracks, whenever and wherever we want. The irony is that it hasn't gotten much easier for the listener to find music they will enjoy - navigating these massive music catalogs still requires typing in specific artists or tracks, or scrolling through lists of new releases, charts, or hand-picked playlists. Personalization of music discovery is generally limited to presenting the user with content similar to individual tracks or artists the user has listened to before. People's musical tastes are actually more complex - they usually enjoy several different types of music, and they may categorize these different tastes in terms of broader groups of musical qualities, rather than by individual artists. In this session, Gracenote and Spectralmind will present the results of a joint project to create a personalized visual interface for music discovery on streaming music services, as demonstrated by an app written for the Spotify platform. We begin by analyzing the musical attributes of songs from a user's listening history, and then create several taste clusters that describe the different types of music a user enjoys, expressed in terms of a combination of musical attributes such as genre, mood, era, tempo, and more. (For example "Evocative Rock from the 70's" or "Edgy Rap from New York City"). Each of these taste clusters is displayed as a "bubble" in a visual user interface - by selecting a bubble, the user can easily get a playlist of songs from the entire Spotify catalog that match the attributes of that cluster. By zooming in on a bubble, the user can explore sub-clusters of that taste. This interface makes it easy for listeners to explore and discover music that they will enjoy from the massive catalog of music they have available.
Proposed by Ching-Wei Chen, Oscar Celma, Ewald Peiszer and Thomas Lidy

Source Separation

Qualitative Evaluation of Source Separation Algorithms

Recently, we put some thought into the problem of qualitative evaluation of source separation algorithms. We came up with the idea of splitting up a given recording of a musical piece into all its contained notes. To easily access, observe, and manipulate these notes, we created an audio player-like MATLAB interface (check out our wiki page here). To start a first experiment, we used an implementation of the NMF-based score-informed source separation algorithm by Ewert and Mueller (PDF file). Using our interface, we now try to detect positions in the audio recording where common source separation errors occur. We would like to present our interface at the demo session.
Proposed by Harald G. Grohganz, Thomas Praetzlich and Jonathan Driedger
  • really nice! I would definitely attend this one! [Gustavo Martins (CITAR / Católica Porto)]

Subjective and Objective Evaluation of Source Separation Algorithms

Recently, our group proposed a subjective listening test protocol and a set of objective metrics (implemented in a Matlab software package called PEASS) for perceptual evaluation of source separation algorithms (PDF paper). These metrics are starting to be widely used in the source separation community (e.g., within the SiSEC evaluation campaign) but they are perhaps less well known to the music signal processing community. Since the main authors of this work will not attend the conference, we will not be able to discuss the details of our approach in depth. Still, would you be interested in an overview and some audio examples?
Proposed by Joachim Thiemann
  • really nice! I would definitely attend this one! [Gustavo Martins (CITAR / Católica Porto)]


We had quite a good discussion. Some felt that often, objective results were not "harsh" enough - even after using PEASS.
Some questions anyone doing Source Separation evaluations should ask themselves:
  • What purpose is the metric being used for? Should there be different metrics for different applications?
  • For subjective tests, what questions should be asked of the participants?

As a general comment for subjective tests, it's always a good idea to run pilot tests before full-scale real ones: often naive listeners hear things differently than what you think.

Feature Learning

(Deep) Feature Learning from Music Signals

More and more MIR researchers are starting to question the traditional approach of using off-the-shelf features or hand-crafting new features to solve problems in audio-based MIR, and propose to learn features from data instead. While one of this year's MIRrors papers promotes this idea very well, the MIRrors Session only leaves limited room for discussions.
Thus, I propose to have a collaborative late-breaking session, inviting everyone working on or interested in Deep Learning or Feature Learning from audio signals to drop by and exchange our results (whether published or immature) in the form of short demos. This will help us to see what kind of features other methods come up with, get a feeling for what you need for different tasks, and how to move on. Plus, it will allow us to get and keep in touch on each other's progress and coordinate our work, so that we don't end up independently doing the same experiments at the same time -- after all, there's only a small flock of people doing Deep Learning in MIR.
  • Great idea! (Well, it was mine.) I would demo feature learning with mcRBMs from spectral patches (not just frames), including some silly examples of pop music excerpts generated by a mcRBM, and unsupervisedly learned speech and music detectors. Jan Schlüter
  • I second it being a good idea! (but I too am biased :oD ) We recently got a paper accepted to the upcoming ICMLA conference on CNNs for Automatic Chord ID, so I could show off some of the kernels / learned features, as well as the output surface learned over songs. We've also been crafting a Python MIR/Deep Learning library (see below), so I could demonstrate a lot of this work in the context of that codebase. Eric Humphrey
  • This is an awesome idea. Thank you for initiating and organizing the session! Unfortunately, I cannot attend this ISMIR due to some visa problem. Instead, my colleague, Jorge Herrera will present our ISMIR paper (music annotation/retrieval using unsupervised feature learning) and share our experiences with you.
    Other than the paper, we have worked on a visualization demo because people tend to get bored with numbers (F-score, AROC....). He will show an awesome real-time spectrogram/hidden-layer activation/tag-cloud generation :-). Also, we could demonstrate some audio examples (reconstructed from learned features in our setting). The quality is actually pretty good. Juhan Nam
  • I agree, this is an awesome idea. It will be my pleasure to attend and discuss. I am eager to hear about your most recent ideas. However, I don't have anything to present that comes to mind. I also have a cool demo, but I will already present it during my oral presentation. If I think of something else, I'll let you know. Philippe Hamel
  • Very nice idea, I am also looking forward to meeting and discussing. I won't be able to prepare a demo though. Jan Wuelfing

General (research practice, etc.)

Open-access publishing in the MIR community:

I have been very interested in Open Access publishing lately, and would be interested in sharing experiences of MIR authors who went this (relatively) new route. What OA journals are suitable for MIR-like publications? Have people noticed a different impact/follow-up to OA-published results compared to traditional publishing? How do you bear the cost when it's author-supported - write it down in the grant? proposed by JJ Aucouturier
  • Hello! I would also be very interested in this topic. It would be nice to know which OA journals could suit MIR research and whether the MIR community could start some initiative in this direction. It's sometimes difficult in the academic world to get some of these journals recognized as having an impact, even though they do. Emilia Gómez@MTG-UPF

MIR & Teaching:
Many researchers in our community have introduced the MIR word in undergraduate and graduate degrees. Would anyone be interested in sharing opinions about methodologies, tools and resources for introducing MIR techniques to students in different fields (engineering, musicology, computer science, psychology)? Proposed by Emilia Gómez @ MTG-UPF

emilia: to prepare this session, I am gathering information about courses related to MIR. If you are teaching MIR-related stuff, please fill in this web form! Thanks!

Shifting from Matlab to Python (a wishlist)

I wouldn't do this myself, but I'd love to see (and attend) a session by an experienced Python user on how to make the shift from Matlab: what libraries, what install, what tips, matplotlib and whatnot. Could also be a forum to gather best practice and raise awareness of this interesting (and free) opportunity. Anyone? (please! please!) JJ Aucouturier
  • Awesome idea! T. Bertin-Mahieux
  • I would be interested in this topic! Srikanth Cherla
  • I like this! [Gustavo Martins (CITAR / Católica Porto)]
  • I like this too. I won't be at ISMIR but might participate remotely, this is interesting. (I've been getting more into Python this year. FWIW a recent blog article by me: Some things I have learnt about optimising Python) [Dan Stowell (QMUL)]
  • Great idea! I dropped Matlab almost entirely and use Python+Numpy+Scipy+Matplotlib instead since it allows for mixing numerical methods with generic things you can expect from a more general purpose language, as well as the use of Web (and Semantic Web) related libraries e.g. Web clients, servers, or JSON parsers. As a result, I ported other people's Matlab code to Python several times before. I'd be glad to contribute to this session but probably can't commit to do it all on my own at this point. (George Fazekas, QMUL).
  • Count me in (hope it doesn't conflict with the Source Separation session!) Very interested in doing more sound processing with NumPy and SciPy. Joachim Thiemann (INRIA/IRISA)
  • Count me in! Gopala Krishna Koduri
  • Sounds like somebody should propose a tutorial for next year's ISMIR... I know I would go. Maybe a late-break session on the topic would help give shape to such a tutorial... Fabien Gouyon
  • I'd say that Python has already been partially used in a tutorial at ISMIR 2009, but in this case it was mainly for web mining of music-related data. Anyway, I agree that a whole tutorial on how to use Python for many MIR tasks would be great. (Mohamed Sordo)
  • Gasp! I was going to propose (possibly leading) an MIR with Python demo. A few thoughts. I've been cultivating an MIR / Deep Learning codebase in Python for some time now, and have started to migrate it to a publicly available (LGPL'ed) library [MARLib @PyPI @BitBucket] with the help of other folks at NYU. There are other Python audio libraries floating about (Michael Casey's Bregman comes to mind), but we're trying to accomplish two specific things with this one:
    • diminish the barrier-to-entry for Matlab-ers to make the switch as painless as possible, and
    • provide a seamless interface for sharing trained deep learning machines, so that you can use it like you would any old function (just like the FFT)
  • [sorry, continued] I'm planning to split my time between this and the Deep Learning camp at the same time, but if other Python evangelists want to help orchestrate an impromptu "how I learned to quit Matlab and love Python" I am 100% on board. Based on my experience trying to get new Python users up to speed, this usually manifests itself as two distinct tasks: Installation / Emotional Support Group and Getting Familiar with Python (Syntax & Such). I'm content to throw myself in the ring as a co-organizer, and maybe in a few days those that are interested to lead could connect via email / make a google doc? I'm not entirely sure from all the comments above who's interested in organizing versus attending, so maybe a table can help disambiguate (I've made an attempt, but please correct as needed). Also, I second Fabien's sentiments on a tutorial next year. Eric Humphrey
  • I guess we can approach this session in two ways:
    • Introduction to Python, presenting the main differences with Matlab (following a side-by-side comparison table, but including the most important signal processing stuff like fft, spectrogram, etc).
    • Assume that the attendees already have some experience with Python, but they want to learn more MIR-related stuff with Python, so that we can go more deeply into, e.g., how to extract features from audio. Oriol Nieto
Python > Matlab Demo Session

Love to facilitate!
Love to attend!
Eric Humphrey
George Fazekas
Oriol Nieto
Srikanth Cherla
Gustavo Martins
Joachim Thiemann
Gopala Krishna Koduri
JJ Aucouturier
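
To give a flavour of the Matlab-to-Python mapping discussed in the comments above, here is a minimal sketch that assumes only NumPy is installed. The helper name `spectrogram`, the test tone and the window settings are illustrative choices, not something proposed by the session organizers, and the result is only roughly comparable to what Matlab's own spectrogram function returns (defaults differ).

```python
import numpy as np

def spectrogram(x, n_fft=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (hypothetical helper)."""
    window = np.hanning(n_fft)                       # Matlab: hann(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # One column per frame, one row per frequency bin
    return np.abs(np.fft.rfft(frames, axis=1)).T

sr = 8000                                            # sample rate in Hz (assumed)
t = np.arange(sr) / sr                               # Matlab: t = (0:sr-1)/sr;
x = np.sin(2 * np.pi * 1000 * t)                     # Matlab: x = sin(2*pi*1000*t);

X = np.fft.fft(x)                                    # Matlab: X = fft(x);
# Note: Python indices start at 0, so bin k corresponds to k * sr / len(x) Hz
peak_hz = np.argmax(np.abs(X[:sr // 2])) * sr / len(x)

S = spectrogram(x)
print(peak_hz, S.shape)
```

The main habits to unlearn are 1-based indexing, the inclusive `a:b` range, and the implicit matrix semantics of `*`; in NumPy, `*` is element-wise and matrix products use `@`.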

Future Research Directions and Upcoming Projects

This year's ISMIR will be the thirteenth of its kind. With increasing industrial involvement and music-focused sessions at many major signal processing/machine learning/cognition conferences, the importance of MIR and related fields becomes more and more apparent. However, as the field gradually approaches maturity, we also see more and more worn-out paths and approaches in MIR. Therefore, it makes sense to discuss which future research directions might be interesting and what will happen in the next few years.

Semantic Media project

In this context, we would like to discuss the Semantic Media project. Within this project we want to focus on investigating novel ways to empower users to find relevant content in large collections of media documents by exploring how metadata can be already generated during the production of content, how linked data technology can be used to efficiently store this metadata and to incorporate external knowledge, and how industry and universities can work together in this field. Sebastian Ewert
  • This sounds great, and relates to some work that we (myself, Kevin Page and Dave De Roure) have been doing recently on linked data publication of metadata for a corpus (the Live Music Archive or etree). This also relates to questions of shared open vocabularies -- we've been using the music ontology for our markup. Sean Bechhofer

Music Imagery Information Retrieval (MIIR)

Next year, I want to start a new research project trying to build brain-computer interfaces (BCIs) based on music imagery - i.e. deliberately imagining well-known music by recreating the perceptual experience in one's mind. (One of the first steps will be to collect high-quality ground truth data comprising EEG recordings of music perception and imagery aligned to the corresponding audio recordings.)
I have prepared a short paper to outline my ideas and would like to discuss them with others who are interested in this subject. Maybe we can form a special interest group on music imagery. Sebastian Stober
  • here's a really unusual topic! (reading Oliver Sacks' Musicophilia, which has quite a few chapters about this, as I write) I'm interested! JJ Aucouturier

Shared Open Vocabulary for Audio Research and Retrieval
While there has been tremendous work in the MIR community to create easy to use feature extractor tools (e.g. Marsyas, jMIR, MIR toolbox, Vamp plugins to name a few), it remains difficult to know whether a feature computed by one tool is the same as (or compatible/replaceable with) a feature computed by another tool. Moreover, if different tools were used in the same experiment, their outputs typically need conversion to some sort of common format, and for reproducibility, this glue code needs to evolve with the changes of the tools themselves. Similar problems arise with the release of data sets, like MSD or SALAMI, in a variety of different formats, as well as in the use of various Web APIs.

The goal of the Shared Open Vocabulary for Audio Research and Retrieval (SOVARR) project is to investigate if and how audio research communities would benefit from using interoperable file formats, data structures, vocabularies or ontologies, what are the primary needs of MIR researchers, and what are the main barriers to the uptake of shared vocabularies.

The project is due to start on the 1st of October and it aims to be highly community focussed. As part of this effort, we would like to invite everyone interested for a discussion along the lines of questions such as: Is your research code sustainable? Are your results (and the way they were derived) sufficiently described and easily reproducible? Are you using interoperable tools that allow plugging different components into existing methods/algorithms for flexible experimentation and efficient research workflows?

We would also like to consider practical problems such as the need for describing very large data sets in a compact format, or the potential complexity of using shared and globally unique identifiers to maintain the meaning of data across different tools or over long periods of time. The project finally aims to revise and integrate existing vocabularies and research tools after reflecting on our findings. Some initial ideas/suggestions on requirements for these would also be greatly appreciated. George Fazekas, QMUL and Alo Allik

Grand challenges of MIR research

By expanding its context and addressing challenges such as multimodal information, multiculturalism and multidisciplinarity, MIR has the potential for a major impact on the future economy, the arts and education, not merely through applications of technical components, but also by evolving to address questions of fundamental human understanding, and build upon ideas of personalisation, interpretation, embodiment, findability and community.

What are the most important challenges facing the MIR community in the coming years?

In the MIReS project we have begun to identify several areas for future investigation by considering technical, as well as social and exploitation aspects of MIR research. Amongst the many topics are musically-relevant data, knowledge-driven methodologies, interface and interaction aspects, evaluation of research results, social aspects, culture specificity, industrial, artistic, and educational applications.

The current list of challenges is a work in progress and can be found on the MIReS wiki.

We warmly welcome suggestions and additions to this list by the MIR community, and are very interested to hear particularly from researchers who may have already begun to address some of these new challenges. Feel free to add comments or additional challenges to our wiki and highlight the challenge you think deserves a longer discussion. An ISMIR session would enable the community to participate in a lively discussion over the future of the field. Michela Magas

  • I have been involved in writing these challenges but I think it is very important that people with other perspectives give their opinion on this. I am sure we have missed many things. Xavier Serra
  • The ISMIR 2012 program also includes other occasions to start everybody thinking about MIR challenges and get ready for the late-break session: e.g. the "MIRrors" papers, and the Panel and Poster sessions on Evaluation Initiatives in MIR. Fabien Gouyon
  • We have put together a questionnaire to get opinions on the challenges presented in the MIReS wiki. If enough answers come in before the session, we could discuss them. Xavier Serra

Multi-timescale representations of music audio

We had a discussion about the different multi-timescale modelling techniques for music audio. Here are a few notes I took during the session. Philippe Hamel

Types of multiscale representations:
cascaded wavelets
modulation spectra -- gives more information about lowest bins of the fft
multi time-frequency scales
convolutional deep networks
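
As an illustration of the modulation spectra item in the list above, here is a minimal sketch assuming NumPy. It takes a second Fourier transform along the time axis of a magnitude spectrogram, so each frequency band gets its own spectrum of slow amplitude fluctuations. The function names, the amplitude-modulated test tone and all parameter values are assumptions made for the example, not something presented at the session.

```python
import numpy as np

def stft_mag(x, n_fft=256, hop=64):
    """Magnitude STFT, shape (n_fft // 2 + 1, n_frames). Hypothetical helper."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def modulation_spectrum(S):
    """FFT along time of each band envelope, with the band mean removed."""
    env = S - S.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(env, axis=1))

sr, hop = 8000, 64                                   # assumed sample rate and hop
t = np.arange(2 * sr) / sr                           # two seconds of audio
# 1000 Hz carrier whose amplitude fluctuates 4 times per second
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

S = stft_mag(x, hop=hop)
M = modulation_spectrum(S)
# Frames arrive every hop / sr seconds, which sets the modulation frequency axis
mod_freqs = np.fft.rfftfreq(S.shape[1], d=hop / sr)

band = 32                                            # bin holding the 1000 Hz carrier
peak_mod_hz = mod_freqs[np.argmax(M[band])]
print(peak_mod_hz)                                   # expected to land close to 4 Hz
```

The short-time window resolves the carrier, while the second transform exposes a timescale several orders of magnitude slower, which is the sense in which the representation is multi-timescale.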

Why don't we use wavelets?
-- too slow?
-- too complicated (not a pretty matrix)?

What do we want to get out of multi-timescale?
model time dynamics
we do not know which timescale is needed to solve each task

What tasks could benefit from multi-timescale representations?
chord recognition
music description (transcription)
pulse modelling
music similarity
music emotion

How to model longer timescales?
Is the spectral domain a good way to do this, or should we use more complex machine learning techniques?
Or, modelling feature evolution.