Yes, this is the page where you can make program suggestions for the ISMIR 2012 "Demos and Late-breaking" track (see the track's main page for more general info about the ISMIR 2012 Demos and Late-breaking track).
Put your program suggestions below. If a category is missing, feel free to add it.
Don't forget to link your name to your email or twitter account!
How to edit?
Simple: click on the "edit" button on the top right corner of this page, type away, and save. Upon submission, you'll be asked to complete a quick form.
Be sure to save your work and check back often, as simultaneous edits can overwrite each other!
To propose a topic/session, put a tentative title in bold, then a quick paragraph explaining what you're planning. End it with "proposed by [yourname]" with a link to your email, twitter or some other way we can contact you.
To comment on a session (add up to its scope, say you're interested, etc.), add a bullet point after the proposal's main paragraph (see examples below). End your comment with your name, as above.
If you have any questions or problems, just ask.
10th October: We've assembled a preliminary program, based on your suggestions below. It's all open of course, and it can (and will likely) change until the last minute (and even during the event, if topics need to branch out, or if new ideas come up)
The program is editable here:
12th October: The event was a real blast, and participation exceeded even our wildest hopes (and boy, they were wild). Thanks to all involved. Here's a quick snap of the event's program at 4pm, immediately after the program building session and before the first session batch. All empty slots quickly filled up in the remainder of the afternoon, to the point of lacking space at the end of the day. Watch this space for more pictures of the complete event, and updates about session debriefs, reviews, etc.
Picture by Mohamed Sordo (@neomoha)
UPDATE 19th December.
Links to abstracts published after the event, on ismir.net:
**Using Linked Open Data for Novel Artist Recommendations**
Stephan Baumann and Rafael Schirru
German Research Center for Artificial Intelligence
**Chordify: Chord transcription for the masses**
W. Bas de Haas1,3, José Pedro Magalhães2,3, Dion ten Heggeler3, Gijs Bekenkamp3, Tijmen Ruizendaal3
1Department of Information and Computing Sciences, Utrecht University, 2Department of Computer Science, University of Oxford, 3Chordify
**A Music similarity game prototype using the CASIMIR API**
Daniel Wolff1, Guillaume Bellec2
1City University London, School of Informatics, Department of Computing, 2ENSTA ParisTech
**Notes from the ISMIR12 Late-Breaking session on evaluation in music information retrieval**
Geoffroy Peeters1, Julián Urbano2, Gareth J. F. Jones3
1STMS IRCAM-CNRS-UPMC, 2University Carlos III of Madrid, 3Dublin City University
**Infrastructures and Interfaces for data collection in MIR**
Tillman Weyde and Daniel Wolff
Department of Computing, City University London
**Music Imagery IR: Bringing the song on your mind back to your ears**
Sebastian Stober1, Jessica Thompson2
1Data & Knowledge Engineering Group, Otto-von-Guericke-Universität Magdeburg, 2Bregman Music and Auditory Research Studio, Dartmouth College
**Late-break session on Music Structure Analysis**
Bruno Rocha1, Jordan B. L. Smith2, Geoffroy Peeters3, Joe Cheri Ross4, Oriol Nieto5, Jan Van Balen6
1University of Amsterdam, 2Queen Mary University of London, 3IRCAM-CNRS STMS, 4Indian Institute of Technology, Bombay, 5New York University, 6Utrecht University
**MIReS Roadmap: Challenges for Discussion**
**Shared Open Vocabularies and Semantic Media**
Gyorgy Fazekas, Sebastian Ewert, Alo Allik, Simon Dixon, Mark Sandler
Centre for Digital Music, Queen Mary University of London
**Teaching MIR: educational resources related to MIR**
Music Technology Group, Universitat Pompeu Fabra
**Past, Present and Future in Ethnomusicology: the computational challenge**
Sergio Oramas1, Olmo Cornelis2
1Polytechnic University of Madrid, 2University College Ghent
MIR and Impact Factor
How can we improve MIR-related journals' Impact Factor?
I wanted to share with you a question that I know many of you have already thought about; still, I think it would be really nice to have a discussion on it.
The main topic would be:
What steps, what guidelines, what procedures should we follow so that our MIR-related, ISI-indexed, peer-reviewed journals have a higher impact factor?
We could discuss subjects such as:
- What is Impact Factor and how is it computed?
- Practical issues of Impact Factor in researchers' careers.
- Are we aware of all the MIR publications in Journals?
- Should we have a unifying Journal?
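To ground the first bullet, the standard two-year Impact Factor is simple arithmetic; the numbers below are invented purely for illustration:

```python
# Illustrative two-year Impact Factor arithmetic (the standard JCR definition):
# IF(Y) = citations received in year Y to items published in Y-1 and Y-2,
#         divided by the number of citable items published in Y-1 and Y-2.

def impact_factor(citations_in_year, citable_items):
    """citations_in_year: citations received this year to the two previous
    volumes; citable_items: articles published in those two volumes."""
    return citations_in_year / citable_items

# A hypothetical journal: 95 citations in 2012 to papers from 2010-2011,
# which together contained 50 citable articles.
print(impact_factor(95, 50))  # -> 1.9
```

Part of the discussion could then be which of these two numbers our community can realistically influence.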
Chordroid: Real-time Chord Recognition
I've recently been in discussions with a master's student here at C4DM, Samuel Myer, whose project work on real-time chord recognition, supervised by Dawn Black, has resulted in a fine real-time chord recogniser.
In fact, it's an Android app. In the Demo and Late-breaking session, I'll demo the chord recogniser (a guitar will be available for anyone who wants to try it) in order to:
- show you how well it works
- get feedback and suggestions from you about situations in which you would like to use such an app.
This last point is the more important one to us, because it's not quite obvious in what situations the app would be useful. While the app is a nice demonstration of MIR technology, its practical use is not self-evident, and good suggestions could guide the further development of the application.
- proposed by
Matthias Mauch (C4DM, Queen Mary)
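For anyone curious what a minimal chord recogniser even looks like, here is a generic chroma-template baseline sketch (this is not Chordroid's actual algorithm, just the textbook starting point):

```python
import numpy as np

# Generic template-matching chord recognition sketch: correlate a 12-bin
# chroma vector against binary major/minor triad templates.

NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def make_templates():
    """Binary templates for all 24 major/minor triads."""
    templates = {}
    for root in range(12):
        for quality, third in (('maj', 4), ('min', 3)):
            t = np.zeros(12)
            t[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            templates[NOTE_NAMES[root] + ('' if quality == 'maj' else 'm')] = t
    return templates

def recognise(chroma, templates):
    """Return the chord label whose template best correlates with chroma."""
    chroma = np.asarray(chroma, dtype=float)
    scores = {name: float(np.dot(chroma, t)) for name, t in templates.items()}
    return max(scores, key=scores.get)

# A chroma frame with energy on C, E and G should come out as C major.
templates = make_templates()
print(recognise([1, 0, 0, 0, 0.9, 0, 0, 0.8, 0, 0, 0, 0], templates))  # -> C
```

A real-time system of course also needs smoothing over frames (e.g. an HMM), which is where most of the engineering lives.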
chordify.net: Chord Extraction for the Masses
Always wanted to play along with your favourite tracks? Chordify is a music player that extracts chords from musical sources like Soundcloud, Youtube or your own files, and shows you which chord to play when.
The aim of Chordify is to make some of the technology that has been available within the MIR community for some time accessible to a broader audience. Our interface is designed to be extremely simple: anyone who can hold a musical instrument should be able to use it. At the "Demos and Late-breaking" session you can play around with the website, and we will be happy to answer any questions that you might have about Chordify. What we hope to gain from our demo are suggestions, comments or criticism that help us improve our service.
For more information or a sneak preview see
- proposed by
W. Bas de Haas
I am working on a user-interface for computer-aided motivic analysis on symbolic data. The system employs a pattern-detection algorithm presented at this year's conference and aims at making the results easily accessible.
I would like to present the system and invite you to general discussions on pattern-detection and user-interfaces for computer-aided music analysis.
- proposed by
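As a toy illustration of what pattern detection on symbolic data can mean (this is not the algorithm of the paper above), here is a transposition-invariant search for repeated interval n-grams:

```python
from collections import defaultdict

# Minimal motivic pattern detection sketch: collect every interval n-gram of
# a melody and report those occurring more than once. Working on intervals
# rather than pitches makes the match transposition-invariant.

def repeated_motifs(pitches, n=3):
    """Return {interval n-gram: [start positions]} for n-grams seen twice+."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    seen = defaultdict(list)
    for i in range(len(intervals) - n + 1):
        seen[tuple(intervals[i:i + n])].append(i)
    return {gram: pos for gram, pos in seen.items() if len(pos) > 1}

# C-D-E-C followed by the same figure a fifth higher (G-A-B-G):
# both share the interval pattern (2, 2, -4).
melody = [60, 62, 64, 60, 67, 69, 71, 67]
print(repeated_motifs(melody))  # -> {(2, 2, -4): [0, 4]}
```

Real systems add tolerance for rhythmic variation and ornamentation; the interesting part of a user interface is how to present the resulting (usually very many) candidate patterns.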
Large-scale symbolic music data sets, data-mining and search
We run a search engine and analysis platform for sheet music. We have automatically processed and indexed sheet music from the IMSLP, the Library of Congress, the web and other sources. There are multiple resources available for researchers, and we are looking for suggestions from anybody interested in symbolic music data for sharing with the community. Applications built on these data can easily be put in front of 1.5 million users a month, so research based on them can have considerable real-world impact and profile. Possible applications include studies on:
- evolution of music
- music similarity and clustering
- language models for music
- structural analysis of music
The data can also be used to support music analysis, etc.
If symbolic data can benefit your projects, or if you are excited about serving a large audience of musicians and amateurs, let's meet.
- proposed by
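As one concrete illustration of what such data enables, here is a hedged sketch (not the platform's actual implementation) of transposition-invariant n-gram search over symbolic melodies, the kind of inverted index a symbolic search engine might build:

```python
from collections import defaultdict

# Sketch of symbolic music search via an inverted index over pitch-interval
# n-grams: a query melody matches a piece regardless of transposition.

def intervals(pitches):
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))

def build_index(corpus, n=3):
    """corpus: {piece_id: pitch list}. Returns {interval n-gram: set of ids}."""
    index = defaultdict(set)
    for piece_id, pitches in corpus.items():
        iv = intervals(pitches)
        for i in range(len(iv) - n + 1):
            index[iv[i:i + n]].add(piece_id)
    return index

def search(index, query_pitches, n=3):
    """Return ids of pieces containing every interval n-gram of the query."""
    iv = intervals(query_pitches)
    grams = [iv[i:i + n] for i in range(len(iv) - n + 1)]
    if not grams:
        return set()
    hits = [index.get(g, set()) for g in grams]
    return set.intersection(*hits)

corpus = {'ode_to_joy': [64, 64, 65, 67, 67, 65, 64, 62],
          'scale':      [60, 62, 64, 65, 67, 69]}
index = build_index(corpus)
print(search(index, [52, 52, 53, 55]))  # query transposed down an octave
```

At web scale the same idea holds, but the index lives in a proper datastore and ranking (partial matches, rhythm) matters as much as retrieval.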
MIR (meta)Evaluation, ground truth
This ISMIR we're presenting a paper on the current challenges in evaluating melody extraction (predominant F0 estimation) algorithms. Naturally, the topic is relevant to those in the MIR community working on melody extraction. BUT - I think that the discussion could consider melody extraction as a "use case" to discuss MIR evaluation procedures more generally, thus making it a relevant discussion for the wider MIR community. For example, in our study we consider the evaluation of the MIREX AME task and discuss issues related to the annotation procedure, the length of the excerpts used for evaluation and the size/content of the music collections. As you may have guessed, we found some important issues that require our (MIR community) attention. Since MIREX is tied to ISMIR, I think this would be the perfect forum to discuss issues related to MIR evaluation procedures and the importance of getting it right!
- proposed by
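For readers unfamiliar with AME evaluation, here is a simplified sketch of two of the standard measures (voicing recall and raw pitch accuracy). The real MIREX definitions have more detail, e.g. scoring pitch on frames the algorithm marked unvoiced; this version only counts frames voiced in both:

```python
import numpy as np

# Simplified melody-extraction measures: voicing recall, and raw pitch
# accuracy within a half-semitone (50 cent) tolerance on voiced frames.

def evaluate_melody(ref_hz, est_hz, tol_cents=50.0):
    """ref_hz/est_hz: per-frame F0 in Hz, 0 meaning unvoiced."""
    ref = np.asarray(ref_hz, float)
    est = np.asarray(est_hz, float)
    ref_voiced = ref > 0
    est_voiced = est > 0
    voicing_recall = float(np.mean(est_voiced[ref_voiced]))
    both = ref_voiced & est_voiced
    cents = 1200.0 * np.abs(np.log2(est[both] / ref[both]))
    raw_pitch = float(np.sum(cents <= tol_cents) / np.sum(ref_voiced))
    return voicing_recall, raw_pitch

ref = [220.0, 220.0, 440.0, 0.0]
est = [221.0, 0.0, 445.0, 0.0]   # one missed frame, two near-correct ones
print(evaluate_melody(ref, est))
```

Most of the meta-evaluation questions raised above (excerpt length, annotation procedure, collection content) sit *around* these formulas rather than inside them.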
This could be a great session! Maybe people from different tasks discussing potential lines for research by pointing out possible weaknesses in the evaluation practices.
I agree! Much needed. Anyone feel up for giving a short primer on effect sizes in statistics, and how they relate to meta-evaluation? (I'm not remotely qualified, but would definitely be interested)
There is A LOT on evaluation at this ISMIR: a special session, many papers... Importantly, in the MIReS roadmap (see the proposed late-break session "Grand Challenges in MIR research" below), we included "Evaluation" as a relevant challenge... it being a wiki, it is awaiting your contributions...
We have done (unpublished) work on the evaluation of chord recognition systems, trying to account for subjectivity and biases in ground truth annotations, which we think are important confounding factors in all MIR evaluation tasks. As part of this study we investigated how well human annotators score against each other (which could be an upper bound on what to expect of a machine), and we developed a method to combine several 'ground truths' from several annotators into a single, less biased one. These ideas are in principle independent of the task and could also be used for AME and other tasks. So the topic of this proposed session is of great interest to us and we'd be happy to contribute! Ni Yizhao, Matt McVicar,
Tijl De Bie
(Raul and Tijl are present at ISMIR).
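A much simpler baseline than the method described above, just to make the idea of merging annotators concrete, is per-frame majority voting over aligned label sequences:

```python
from collections import Counter

# Baseline for merging several annotators' ground truths: per-frame majority
# voting over aligned chord label sequences (ties go to the first annotator
# encountered, via Counter's insertion order).

def majority_ground_truth(annotations):
    """annotations: list of equal-length label sequences, one per annotator."""
    merged = []
    for frame_labels in zip(*annotations):
        label, _count = Counter(frame_labels).most_common(1)[0]
        merged.append(label)
    return merged

a1 = ['C', 'C', 'G', 'G', 'Am']
a2 = ['C', 'F', 'G', 'G', 'Am']
a3 = ['C', 'C', 'G', 'Em', 'Am']
print(majority_ground_truth([a1, a2, a3]))  # -> ['C', 'C', 'G', 'G', 'Am']
```

The interesting research question is exactly what majority voting throws away: systematic annotator biases and genuinely ambiguous passages, which a principled combination method would model instead.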
Happy to gather people for this! There is a panel on evaluation on Friday morning. Please try discussing some of these issues with the wider audience in the morning too. If there are more people interested, we'll keep the discussion going during the afternoon.
As said by Fabien and Julián, there is a whole session on evaluation Friday morning, including a round-table (links here:
); at this link you'll find the preliminary set of questions that will be addressed by the panel members; if you're thinking about other important topics to be discussed, please indicate them here, drop me an email, or just tell me; we could then discuss this during the panel and during the late-breaking session. Geoffroy
The CASIMIR API with an example Game and Survey
We would like to present a game with a purpose and a general web API we are currently developing for collecting music similarity data. Here, the API defines the data formats and the tasks to be solved by the user. It also takes charge of storing survey and user data and of the statistical selection of samples. Still, parameters like genre combinations can be set. The API provides information and audio URLs for clips from the MagnaTagATune and Million Song datasets. It is currently being tested within a music comparison survey. Our intention is to encourage further surveys where efforts can be focused on the user interface design, leaving the sample management to the API.
Daniel Wolff@City University
Infrastructures and Interfaces for Data Collection in Music Information Retrieval
Supervised training of MIR models on ground truth data is a common technique in our community. Yet there is little data out there on users' perception of a given bit of audio / music. How can we get such data for song similarity or emotion? How can we make it accessible and the results comparable to others'? Recently such information has been gathered using Games With A Purpose. We use such data in our
paper at this ISMIR
. We would like to discuss possible infrastructures and standardised building blocks that help improve data collection and support sharing and extensibility of datasets.
Daniel Wolff@City University
We would like to see further presentations and discussions on the topic of collecting user ground truth data, including all relevant disciplines and evaluations such as MIREX, which has already defined some standards.
New Melody Extraction Vamp Plug-in:
On a more personal note, I'm happy to say that this ISMIR I'll be releasing the first version of my melody extraction algorithm as a Vamp plug-in (by melody extraction I mean predominant F0 estimation from polyphonic music). The plug-in provides not only the final melody estimated by the algorithm but also visualisations of the signal's pitch content (salience function) and all pitch contours estimated from the signal, out of which the melody is later selected. I'm hoping the plug-in will serve as a useful tool for the research community, both for those wishing to use predominant F0 estimation for more high-level tasks (transcription, pattern/motif detection, classification, etc.) and for those wishing to use the algorithm as a benchmark against which to compare their own melody extraction algorithms. If possible, I would be very much interested in presenting the plug-in during a demo session at ISMIR!
- proposed by
Yet another great proposal! [Gustavo Martins (CITAR / Católica Porto)]
I'd be interested in this, too. Matthias Mauch
UPDATE: I'm very happy to see there's interest in this demo session. The plug-in is now available for download online, in case you wish to try it out before the session:
Beat-tracking in non-western music:
I have been working recently with a dataset of North Indian (Hindustani) classical music, studying links between emotions and tempo. It occurred to me informally that 1) Indian (my co-author) and Western (myself) listeners had very different ways of assessing the tempi of these pieces, and that 2) typical beat-tracking algorithms were on neither target, really. I believe non-Western rhythms violate assumptions that we have built into our algorithms - it's not a matter of improving them, but rather of radically rethinking what a tempo is. Anyone have similar experiences? Shouldn't we just gather and share insights? proposed by
We are currently having similar experiences in the context of African music, and the future direction of my personal research focuses on makam music, where I will encounter similar issues again. I would be very interested in a discussion related to that.
Tempo perception is a very individual experience, especially when it concerns music you're not familiar with. For African music we noticed quite some ambiguity, as in tempo octaves (which happen in Western music as well), but also a large number of binary/ternary relationships. It makes this topic worthwhile to investigate and discuss. Anyway, in general, I am keen on sessions on ethnic music and its specific problems for computational analysis. I think there are quite some challenges for MIR in this field! proposed by [
When we compare tempo estimation across different datasets - for example the MIREX 2004 Ballroom and Songs datasets, the dataset provided by Hainsworth, an African music dataset, and the Beatles dataset - the best score we could recently reach on the African music dataset was an ACC2 of 82.4%, which is the lowest among those datasets. Based on these numbers, African music deserves more attention because it is harder than the other datasets, although different genres within the other datasets also perform quite differently. We agree that natural sounds are another interesting kind of acoustics to study.
Fu-Hai Frank Wu
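For reference, the ACC2 measure mentioned above can be sketched as follows (an estimate counts as correct if it is within 4% of the ground-truth tempo or of its double, half, triple or third, which is exactly the octave/ternary tolerance discussed in this thread):

```python
# ACC2-style tempo accuracy check, as used in tempo-estimation evaluation.

def acc2_correct(estimated_bpm, reference_bpm, tol=0.04):
    """True if the estimate is within tol of the reference tempo,
    or of its 2x, 1/2x, 3x or 1/3x metrical relatives."""
    for factor in (1.0, 2.0, 0.5, 3.0, 1.0 / 3.0):
        target = reference_bpm * factor
        if abs(estimated_bpm - target) <= tol * target:
            return True
    return False

print(acc2_correct(60.0, 120.0))   # half-tempo -> accepted
print(acc2_correct(80.0, 120.0))   # a 2:3 relation is NOT covered by ACC2
```

Note that the 2:3 case in the second call is precisely the kind of ambiguity reported above for African music, which ACC2 still counts as an error.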
Regarding research in processing non-Western music in general, a meeting is going to happen Wednesday 10.10. at 19.30. If you want to participate, please send a mail to me (hannover at csd dot uos dot gr), because we will have to do a reservation for the restaurant where we will go. Meeting point is at the entrance of the venue.
Can you find a pulse in this music?
According to Clayton (
), there are at least 70 music genres which have no meter, but as he himself writes: "[There is] a doubt as to how much music (if any) is completely free of pulsation, even if this pulsation is often indistinct or discontinuous, or only perceived by a proportion of listeners. [..] So strong is the urge to perceive pulse in music (and, perhaps to generate a pulse in performance), that there may be very little music which at no point suggests pulsation."
Can we contribute to this musicological discussion from an MIR perspective? Do we have computational tools to identify a pulsation in unmetered music? Here I attach musical fragments (from
) that are considered to have no meter. Can you find a pulse in them? Do you have any music fragment to share with everyone (ideally using CC license) of relevance to this topic?
At ISMIR we could have a group discussion on possible approaches to identify the pulse in this type of music and maybe share some initial results. Proposed by
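One simple computational probe for this question, assuming an onset-strength envelope has already been extracted from the audio, is to look for a salient autocorrelation peak: a clear peak suggests a latent pulse, a flat autocorrelation suggests none.

```python
import numpy as np

# Pulse probe: autocorrelate an onset-strength envelope and report the most
# salient lag. The salience value (normalised so lag 0 == 1) gives a rough
# measure of how pulse-like the signal is.

def pulse_lag(onset_env, min_lag=2):
    """Return (best lag, peak salience) of the autocorrelation of onset_env."""
    x = np.asarray(onset_env, float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]
    ac /= ac[0]                      # normalise so lag 0 == 1
    lag = min_lag + int(np.argmax(ac[min_lag:]))
    return lag, float(ac[lag])

# Synthetic example: an impulse every 8 frames, i.e. a perfectly regular pulse.
regular = np.zeros(64)
regular[::8] = 1.0
lag, salience = pulse_lag(regular)
print(lag, salience)  # -> lag 8, with a strong peak
```

For the "unmetered" fragments attached above, the interesting outcome would be intermediate salience values, and how much they vary across listeners' tapped annotations.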
Very interested. We're working on a dataset of Hindustani Alap music, which I'd be happy to share.
I would like to add Cretan rizitika as another interesting form to analyze.
Those clips are quite different from the other clips our algorithm has processed before. There is little or even no binary/ternary relationship between the estimated tempo (BPM) values. And the most likely ones are close to 120 BPM, because a perception model is used in our algorithm.
Fu-Hai Frank Wu
VRAPS: Visual Rhythm-based Audio Playback System
As traditionally defined in the context of music, a beat represents a distinctive musical event such as the hitting of a drum or the start of a new melodic note. Similarly, we define a visual beat to indicate distinctive visual events such as rhythmic gestures and dance motions, and propose a method for automatically detecting visual beats and tempo from a video signal (as described in
an article in IEEE Magazine). In this demo, we present an interactive, real-time audio playback system that continuously adjusts the playback speed of an audio signal based on the visual rhythm detected from a real-time video capture device.
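A hedged sketch of the core idea (this is not the authors' actual method, just one plausible reduction of it): frame differencing turns a video into a per-frame motion envelope, whose autocorrelation yields a visual tempo.

```python
import numpy as np

# Visual-tempo sketch: reduce a video to a motion envelope via frame
# differencing, then estimate the dominant period from its autocorrelation.

def motion_envelope(frames):
    """frames: array (n_frames, h, w). Mean absolute inter-frame difference."""
    frames = np.asarray(frames, float)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

def visual_tempo(envelope, fps, min_lag=2):
    env = envelope - envelope.mean()
    ac = np.correlate(env, env, mode='full')[len(env) - 1:]
    lag = min_lag + int(np.argmax(ac[min_lag:]))
    return 60.0 * fps / lag        # "visual beats" per minute

# Synthetic "dancer": a pixel block flashing every 10 frames at 30 fps.
frames = np.zeros((61, 8, 8))
frames[::10, 2:6, 2:6] = 1.0
env = motion_envelope(frames)
print(visual_tempo(env, fps=30))  # one move every 10 frames -> 180.0 BPM
```

The hard parts the demo actually solves (camera noise, non-rigid motion, real-time constraints) are exactly what this sketch leaves out.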
MIR and music recommendation:
Recommendation technologies have been used more and more in industry, and requests for better, faster and domain-specific recommendation engines have been formulated by companies and users alike. While other research communities, like those around the ACM RecSys and KDD conferences, are actively working on that hot topic, the MIR community does not seem to be putting as much effort into solving the still largely unexplored problem of music recommendation and discovery. Indeed, the number of papers covering that topic (and associated ones such as similarity) at ISMIR has been decreasing, or at best stagnating, for the past 6 years. So what is it? Are we done? Is collaborative filtering the best solution out there, even for music recommendation? Is it too hard to use audio-extracted information and music metadata to significantly improve music recommendation? What about users? Are they happy with current commercial music recommendation engines? What can we do for them? proposed by
What if we think of recommendation not only on full songs but also on song sections? Could it be relevant to users? We think user-adapted normalization could also be an interesting topic, as many users already have huge libraries and might like to discover similarities inside them.
Count me in! (
) I'm really interested in the idea of working with song segments. Actually, a good application of that would be to help DJs in the selection of different segments for mixing, mashups, etc.
One question could be: do we know our users? It's they who make a music recommendation system fail or succeed (
For a subjective music recommendation system, what kind of user information do we need to achieve better recommendation results? (
After getting a playlist from the music recommender, how does the sequence of the playlist influence the user's listening experience? (
Finding the right music track for film, television, radio or other media productions is a challenging task. Within the Making Musical Mood Metadata project with the BBC and I Like Music, we investigate ways to recommend music tracks from commercial and production music catalogues using content-based analysis and/or metadata. A first prototype of the recommendation system, to be assessed by producers as part of the BBC Desktop Jukebox, will be demoed. (
, Centre for Digital Music, Queen Mary University of London)
We are developing a new music similarity model entirely based on audio content analysis. The goal is to give users the possibility to navigate their music libraries along different axes of similarity. This can be done at the "song level" or at a smaller "segment level". We would like to present this work at the demo session. Proposed by
MixMatcher - a project to join music lovers
We have an ongoing project that focuses on connecting people with similar tastes in music. It is an Android application that navigates through users' music libraries and tries to connect them, based on music similarity from the Last.fm API. The application is meant to be used in a confined area, as it is an offline application that connects people by Bluetooth. There is a service that wakes up when the user switches on the internet (necessary to call the Last.fm API services) and builds a music similarity database, so when the user runs the application he/she doesn't need internet anymore. We are also studying the possibility of connecting people in an online network (
Holistic / Hybrid Recommendation and Explanation:
We have worked on hybrid recommendation strategies since 2003. Recent interest from commercial music streaming services and internet radio stations has enabled us to work closely together on real-world use-cases. Nevertheless, the issue is tough. In our opinion, users demand explanations. For this reason we are currently developing a hybrid recommender consisting of CF (user plays), social facets (Facebook, etc.) and semantic metadata (hopefully using the graph structure) (Freebase, Musicbrainz). In a joint project with a German music streaming service, we will try to evaluate different strategy settings (i.e. the weightings of the several components) and explanation interfaces with real-world users in 2013. We would be very glad to discuss challenges, joint work, etc. in this session. Music recommendation is NOT DEAD! :)
Visual Music Discovery using Personalized Taste Clusters
With the advent of unlimited streaming music services, we now have the ability to listen to over 10 million tracks, whenever and wherever we want. The irony is that it hasn't gotten much easier for the listener to find music they will enjoy - navigating these massive music catalogs still requires typing in specific artists or tracks, or scrolling through lists of new releases, charts, or hand-picked playlists. Personalization of music discovery is generally limited to presenting the user with content similar to individual tracks or artists the user has listened to before. People's musical tastes are actually more complex - they usually enjoy several different types of music, and they may categorize these different tastes in terms of broader groups of musical qualities, rather than by individual artists. In this session, Gracenote and Spectralmind will present the results of a joint project to create a personalized visual interface for music discovery on streaming music services, as demonstrated by an app written for the Spotify platform. We begin by analyzing the musical attributes of songs from a user's listening history, and then create several taste clusters that describe the different types of music a user enjoys, expressed in terms of a combination of musical attributes
such as genre, mood, era, tempo, and more. (For example "Evocative Rock from the 70's" or "Edgy Rap from New York City"). Each of these taste clusters is displayed as a "bubble" in a visual user interface - by selecting a bubble, the user can easily get a playlist of songs from the entire Spotify catalog that match the attributes of that cluster. By zooming in on a bubble, the user can explore sub-clusters of that taste. This interface makes it easy for listeners to explore and discover music that they will enjoy from the massive catalog of music they have available.
, Ewald Peiszer and
Qualitative Evaluation of Source Separation Algorithms
Recently, we put some thought into the problem of qualitative evaluation of source separation algorithms. We came up with the idea of splitting up a given recording of a musical piece into all its contained notes. To easily access, observe, and manipulate these notes, we created an audio-player-like MATLAB interface (check out our wiki page). To start a first experiment, we used an implementation of the NMF-based score-informed source separation algorithm by Ewert and Müller. Using our interface, we now try to detect positions in the audio recording where common source separation errors occur. We would like to present our interface at the demo session.
Harald G. Grohganz
really nice! I would definitely attend this one! [Gustavo Martins (CITAR / Católica Porto)]
Subjective and Objective Evaluation of Source Separation Algorithms
We proposed a subjective listening test protocol and a set of objective metrics (implemented in a Matlab toolkit called PEASS) for the perceptual evaluation of source separation algorithms. These metrics are starting to be widely used in the source separation community (e.g., within a dedicated evaluation campaign), but they are perhaps less well known to the music signal processing community. Since the main authors of this work will not attend the conference, we will not be able to discuss the details of our approach in depth. Still, would you be interested in an overview and some audio examples?
really nice! I would definitely attend this one! [Gustavo Martins (CITAR / Católica Porto)]
We had quite a good discussion. Some felt that often, objective results were not "harsh" enough - even after using PEASS.
Some questions anyone doing Source Separation evaluations should ask themselves:
What purpose is the metric being used for? Should there be different metrics for different applications?
For subjective tests, what questions should be asked of the participants?
As a general comment for subjective tests, it's always a good idea to run pilot tests before full-scale real ones: often naive listeners hear things differently than what you think.
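For anyone wanting a concrete starting point, here is the plain signal-to-distortion ratio, deliberately much simpler than the projection-based BSS Eval / PEASS metrics discussed above (those decompose the error into interference, artifacts, etc.):

```python
import numpy as np

# Plain SDR: 10*log10(||s||^2 / ||s - s_hat||^2). This is the simplest member
# of the source-separation metric family, shown only to make the discussion
# concrete; BSS Eval / PEASS are considerably more refined.

def sdr_db(reference, estimate):
    """Signal-to-distortion ratio of an estimated source, in dB."""
    reference = np.asarray(reference, float)
    estimate = np.asarray(estimate, float)
    err = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))

# A sine "source" corrupted by 10% additive noise scores about 17 dB.
t = np.linspace(0, 1, 8000, endpoint=False)
source = np.sin(2 * np.pi * 440 * t)
estimate = source + 0.1 * np.random.default_rng(0).standard_normal(t.size)
print(sdr_db(source, estimate))
```

The "not harsh enough" complaint above applies here with full force: a 17 dB estimate can still contain clearly audible, musically annoying artifacts.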
(Deep) Feature Learning from Music Signals
More and more MIR researchers are starting to question the traditional approach of using off-the-shelf features or hand-crafting new features to solve problems in audio-based MIR, and propose to learn features from data instead. While one of this year's MIRrors papers promotes this idea very well, the MIRrors session only leaves limited room for discussion.
Thus, I propose to have a collaborative late-breaking session, inviting everyone working on or interested in Deep Learning or Feature Learning from audio signals to drop by and exchange results (whether published or immature) in the form of short demos. This will help us see what kind of features other methods come up with, get a feeling for what is needed for different tasks, and how to move on. Plus, it will allow us to get and keep in touch on each other's progress and coordinate our work, so that we don't end up independently doing the same experiments at the same time -- after all, there's only a small flock of people doing Deep Learning in MIR.
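As a lowest-common-denominator example of what "feature learning" can mean (this is not any participant's actual method, just the classic k-means baseline on spectral patches), the whole pipeline fits in a few lines:

```python
import numpy as np

# Tiny feature-learning sketch: learn a dictionary of spectral-patch features
# with k-means, then encode new patches by their activation of each feature.

def learn_features(patches, k=4, iters=20):
    """patches: (n, d) array. Returns a (k, d) feature dictionary via k-means,
    initialised deterministically from evenly spaced patches."""
    idx = np.linspace(0, len(patches) - 1, k).astype(int)
    centroids = np.asarray(patches, float)[idx].copy()
    for _ in range(iters):
        d2 = ((patches[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = patches[assign == j].mean(axis=0)
    return centroids

def encode(patch, centroids):
    """Soft activations: similarity of the patch to each learned feature."""
    return np.exp(-np.linalg.norm(centroids - patch, axis=1))

# Toy "spectrogram patches": two clusters (low-band vs high-band energy).
rng = np.random.default_rng(1)
low = rng.normal([1, 1, 0, 0], 0.05, (50, 4))
high = rng.normal([0, 0, 1, 1], 0.05, (50, 4))
centroids = learn_features(np.vstack([low, high]), k=2)
print(np.round(centroids, 1))
```

mcRBMs, CNNs and the other methods in this thread replace the k-means step with far richer models, but the dictionary-plus-encoding structure is the common denominator worth comparing across demos.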
Great idea! (Well, it was mine.) I would demo feature learning with mcRBMs from spectral patches (not just frames), including some silly examples of pop music excerpts generated by a mcRBM, and unsupervisedly learned speech and music detectors.
I second it being a good idea! (but I too am biased :oD ) We recently got a paper accepted to the upcoming ICMLA conference on CNNs for Automatic Chord ID, so I could show off some of the kernels / learned features, as well as the output surface learned over songs. We've also been crafting a Python MIR/Deep Learning library (see below), so I could demonstrate a lot of this work in the context of that codebase.
This is an awesome idea. Thank you for initiating and organizing the session! Unfortunately, I cannot attend this ISMIR due to some visa problem. Instead, my colleague, Jorge Herrera will present our ISMIR paper (music annotation/retrieval using unsupervised feature learning) and share our experiences with you.
Other than the paper, we have worked on a visualization demo, because people tend to get bored with numbers (F-score, AROC...). He will show an awesome real-time spectrogram/hidden-layer activation/tag-cloud generation :-). Also, we could demonstrate some audio examples (reconstructed from learned features in our setting). The quality is actually pretty good.
I agree, this is an awesome idea. It will be my pleasure to attend and discuss. I am eager to hear about your most recent ideas. However, I don't have anything to present that comes to mind. I also have a cool demo, but I will already present it during my oral presentation. If I think of something else, I'll let you know.
Very nice idea, I am also looking forward to meeting and discussing; I won't be able to prepare a demo though.
General (research practice, etc.)
Open-access publishing in the MIR community:
I have been very interested in Open Access publishing lately, and would be interested in sharing experiences of MIR authors who went this (relatively) new route. What OA journals are suitable for MIR-like publications? Have people noticed a different impact/follow-up to OA-published results compared to traditional publishing? How do you bear the cost when it's author-supported - write it down in the grant? proposed by
Hello! I would also be very interested in this topic. It would be nice to know which OA journals could suit MIR research, and whether the MIR community could start some initiative in this direction. It's sometimes difficult in the academic world to get some of these journals recognized as having an impact, even when they do.
MIR & Teaching:
Many researchers in our community have introduced MIR into undergraduate and graduate degrees. Would anyone be interested in sharing opinions about methodologies, tools and resources for introducing MIR techniques to students in different fields (engineering, musicology, computer science, psychology, etc.)? Proposed by
Emilia Gómez @ MTG-UPF
Great idea! I'd love to attend such a session!
I'm definitely attending this one. This is a great idea!
emilia: to prepare this session, I am gathering information about courses related to MIR. If you are teaching MIR-related stuff, please fill in this
Shifting from Matlab to Python (a wishlist)
I wouldn't do this myself, but I'd love to see (and attend) a session by an experienced Python user on how to make the shift from Matlab: what libraries, what install, what tips, matplotlib and whatnot. Could also be a forum to gather best practices and raise awareness of this interesting (and free) opportunity. Anyone? (please! please!)
I would be interested in this topic! Srikanth Cherla (firstname.lastname@example.org)
I like this! [Gustavo Martins (CITAR / Católica Porto)]
I like this too. I won't be at ISMIR but might participate remotely, this is interesting. (I've been getting more into Python this year. FWIW a recent blog article by me:
Some things I have learnt about optimising Python
) [Dan Stowell (QMUL)]
Great idea! I dropped Matlab almost entirely and use Python+Numpy+Scipy+Matplotlib instead since it allows for mixing numerical methods with generic things you can expect from a more general purpose language, as well as the use of Web (and Semantic Web) related libraries e.g. Web clients, servers, or JSON parsers. As a result, I ported other people's Matlab code to Python several times before. I'd be glad to contribute to this session but probably can't commit to do it all on my own at this point. (
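For those considering the switch, a few one-to-one Matlab-to-NumPy translations already cover most of the first day; the classic gotchas are 0-based indexing, exclusive range bounds, and `*` being elementwise:

```python
import numpy as np

# Matlab expression on the left (in comments), NumPy equivalent below it.

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# zeros(2,3)
Z = np.zeros((2, 3))
# A'                     (transpose)
At = A.T
# A .* A                 (elementwise product; * is elementwise in NumPy)
sq = A * A
# A * A                  (matrix product)
prod = np.dot(A, A)
# A(1,2)                 (NumPy indices are 0-based!)
a12 = A[0, 1]
# 1:5                    (upper bound is exclusive in arange)
r = np.arange(1, 6)
# find(A > 2)
idx = np.nonzero(A > 2)
# mean(A(:))
m = A.mean()

print(prod)   # matrix square of A: [[7, 10], [15, 22]]
print(m)      # 2.5
```

matplotlib's `pyplot` then gives near-Matlab plotting (`plot`, `imshow`, `subplot`), which usually removes the last excuse.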
Count me in (hope it doesn't conflict with the Source Separation session!) Very interested in doing more sound processing with NumPy and SciPy.
Joachim Thiemann (INRIA/IRISA)
Count me in!
Gopala Krishna Koduri
Sounds like somebody should propose a tutorial for next year's ISMIR... I know I would go. Maybe a late-break session on the topic would help give shape to such a tutorial...
I'd say that Python was already partially used in a tutorial at ISMIR 2009 (
), but in that case it was mainly for web mining of music-related data. Anyway, I agree that a whole tutorial on how to use Python for many MIR tasks would be great. (
Gasp! I was going to propose (possibly leading) an MIR with Python demo. A few thoughts. I've been cultivating an MIR / Deep Learning codebase in Python for some time now, and have started to migrate it to a publicly available (LGPL'ed) library [MARLib
] with the help of other folks at NYU. There are other Python audio libraries floating about (Michael Casey's
comes to mind), but we're trying to accomplish two specific things with this one:
lower the barrier to entry for Matlab users, to make the switch as painless as possible, and
provide a seamless interface for sharing trained deep learning machines, so that you can use it like you would any old function (just like the FFT)
[sorry, continued] I'm planning to split my time between this and the Deep Learning camp at the same time, but if other Python evangelists want to help orchestrate an impromptu "how I learned to quit Matlab and love Python" I am 100% on board. Based on my experience trying to get new Python users up to speed, this usually manifests itself as two distinct tasks:
Installation / Emotional Support Group
Getting Familiar with Python (Syntax & Such)
I'm happy to throw myself in the ring as a co-organizer, and maybe in a few days those interested in leading could connect via email / make a Google doc? I'm not entirely sure from all the comments above who's interested in organizing versus attending, so maybe a table can help disambiguate (I've made an attempt, but please correct as needed). Also, I second Fabien's sentiments on a tutorial next year.
I guess we can approach this session in two ways:
Introduce Python and present the main differences from Matlab (following this type of table:
, but including the most important signal-processing functions such as fft, spectrogram, etc.).
Assume that the attendees already have some experience with Python but want to learn more MIR-related material, so that we can go deeper into, e.g., how to extract features from audio.
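As a tiny illustration of how the two approaches could meet in the middle (purely my own sketch, not agreed session material; the signal and parameters below are made up for the example): the Matlab call `spectrogram(x, hann(1024), 512, 1024, fs)` maps almost one-to-one onto `scipy.signal.spectrogram`, and a frame-level feature such as the spectral centroid falls out in a couple of NumPy lines.

```python
import numpy as np
from scipy import signal

# Stand-in signal: one second of a 440 Hz sine at a 22.05 kHz sample rate.
fs = 22050
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 440 * t)

# Matlab: [S, F, T] = spectrogram(x, hann(1024), 512, 1024, fs);
f, frames, S = signal.spectrogram(x, fs=fs, window='hann',
                                  nperseg=1024, noverlap=512)

# A simple per-frame feature: the spectral centroid (power-weighted
# mean frequency of each column of the spectrogram).
centroid = (f[:, None] * S).sum(axis=0) / S.sum(axis=0)

print(S.shape)              # (freq bins, frames)
print(centroid.mean())      # should sit near 440 Hz for a pure sine
```

The point of a demo like this would be that the NumPy broadcasting line replaces an explicit Matlab loop or `repmat` call, which is exactly the kind of idiom shift the session could walk through.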
Python > Matlab Demo Session
Love to facilitate!
Love to attend!
Gopala Krishna Koduri
Future Research Directions and Upcoming Projects
This year's ISMIR will be the thirteenth of its kind. With increasing industrial involvement, and with music-focused sessions in many major signal processing, machine learning, and cognition conferences, the importance of MIR and related fields becomes more and more apparent. However, as the field gradually approaches maturity, we also see more and more worn-out paths and approaches in MIR. It therefore makes sense to discuss which future research directions might be interesting and what will happen in the coming years.
Semantic Media project
In this context, we would like to discuss the Semantic Media project. Within this project we want to investigate novel ways to empower users to find relevant content in large collections of media documents: exploring how metadata can be generated already during content production, how linked-data technology can be used to store this metadata efficiently and to incorporate external knowledge, and how industry and universities can work together in this field.
This sounds great, and relates to some work that we (myself, Kevin Page and Dave De Roure) have been doing recently on linked data publication of metadata for a corpus (the Live Music Archive or etree). See
. This also relates to questions of shared open vocabularies -- we've been using the music ontology for our markup.
Music Imagery Information Retrieval (MIIR)
Next year, I want to start a new research project trying to build brain-computer interfaces (BCIs) based on music imagery - i.e. deliberately imagining well-known music by recreating the perceptual experience in one's mind. (One of the first steps will be to collect high-quality ground truth data comprising EEG recordings of music perception and imagery aligned to the corresponding audio recordings.)
I have prepared a
short paper to outline my ideas
and would like to discuss them with others who are interested in this subject. Maybe we can form a special interest group on music imagery.
here's a really unusual topic! (reading Oliver Sacks' Musicophilia, which has quite a few chapters about this, as I write) I'm interested!
Shared Open Vocabulary for Audio Research and Retrieval
While there has been tremendous work in the MIR community to create easy-to-use feature extraction tools (e.g. Marsyas, jMIR, MIR Toolbox, Vamp plugins, to name a few), it remains difficult to know whether a feature computed by one tool is the same as (or compatible/interchangeable with) a feature computed by another. Moreover, if different tools are used in the same experiment, their outputs typically need conversion to some common format, and for reproducibility this glue code needs to evolve with the tools themselves. Similar problems arise with the release of data sets, like MSD or SALAMI, in a variety of different formats, as well as in the use of various Web APIs.
The goal of the Shared Open Vocabulary for Audio Research and Retrieval (SOVARR) project is to investigate if and how audio research communities would benefit from using interoperable file formats, data structures, vocabularies or ontologies, what are the primary needs of MIR researchers, and what are the main barriers to the uptake of shared vocabularies.
The project is due to start on the 1st of October and it aims to be highly community focussed. As part of this effort, we would like to invite everyone interested for a discussion along the lines of questions such as: Is your research code sustainable? Are your results (and the way they were derived) sufficiently described and easily reproducible? Are you using interoperable tools that allow plugging different components into existing methods/algorithms for flexible experimentation and efficient research workflows?
We would also like to consider practical problems such as the need for describing very large data sets in a compact format, or the potential complexity of using shared and globally unique identifiers to maintain the meaning of data across different tools or over long periods of time. The project finally aims to revise and integrate existing vocabularies and research tools after reflecting on our findings. Some initial ideas/suggestions on requirements for these would also be greatly appreciated.
George Fazekas, QMUL
Grand challenges of MIR research
By expanding its context and addressing challenges such as multimodal information, multiculturalism and multidisciplinarity, MIR has the potential for a major impact on the future economy, the arts and education, not merely through applications of technical components, but also by evolving to address questions of fundamental human understanding, and build upon ideas of personalisation, interpretation, embodiment, findability and community.
What are the most important challenges facing the MIR community in the coming years?
We have begun to identify several areas for future investigation by considering technical, as well as social and exploitation, aspects of MIR research. Among the many topics are musically relevant data, knowledge-driven methodologies, interface and interaction aspects, evaluation of research results, social aspects, culture specificity, and industrial, artistic, and educational applications.
The current list of challenges is a work in progress and can be found on the
We warmly welcome suggestions and additions to this list by the MIR community, and are very interested to hear particularly from researchers who may have already begun to address some of these new challenges. Feel free to add comments or additional challenges to our wiki and highlight the challenge you think deserves a longer discussion. An ISMIR session would enable the community to participate in a lively discussion over the future of the field.
I have been involved in writing these challenges but I think it is very important that people with other perspectives give their opinion on this. I am sure we have missed many things.
The ISMIR 2012 program also includes other occasions to start everybody thinking about MIR challenges and get ready for the late-break session: e.g. the
papers, and the Panel and Poster sessions on
We have put together a questionnaire to get opinions on the challenges presented in
. The questionnaire is in
. If enough answers come before the session we could discuss them.
Multi-timescale representations of music audio
We had a discussion about different multi-timescale modelling techniques for music audio. Here are a few notes I took during the session.
Types of multiscale representations:
modulation spectra -- give more information about the lowest bins of the FFT
multi time-frequency scales
convolutional deep networks
why don't we use wavelets?
-- too slow?
-- too complicated (not a pretty matrix)?
what do we want to get out of multi-timescale?
model time dynamics
we do not know which timescale is used to solve for different tasks
What tasks could benefit from multi-timescale representations?
music description (transcription)
How to model longer timescales?
Is the spectral domain a good way to do this, or should we use more complex machine learning techniques?
Or should we model feature evolution?
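A minimal sketch of the "multi time-frequency scales" idea from the notes above (my own illustration, with made-up signal and window sizes, not something presented in the session): compute the spectrogram of the same signal at several window lengths and keep the whole family, trading time resolution for frequency resolution at each scale.

```python
import numpy as np
from scipy import signal

# Toy signal: a 220 Hz tone amplitude-modulated at 3 Hz, so there is
# structure at both short and long timescales.
fs = 16000
t = np.arange(0, 1.0, 1.0 / fs)
x = (1.0 + 0.5 * np.sin(2 * np.pi * 3 * t)) * np.sin(2 * np.pi * 220 * t)

# One spectrogram per analysis-window length (short, medium, long).
multiscale = {}
for nperseg in (256, 1024, 4096):
    f, frames, S = signal.spectrogram(x, fs=fs, nperseg=nperseg,
                                      noverlap=nperseg // 2)
    multiscale[nperseg] = S

for n, S in multiscale.items():
    # Longer windows: more frequency bins, fewer time frames.
    print(n, S.shape)
```

A stack like this is one concrete starting point for the "which timescale does a given task need?" question: a downstream model can be fed all scales and left to weight them, rather than committing to a single window length up front.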