Research Fellow in Machine Learning
This project investigates how co-creative AI systems for music composition can be designed in collaboration with practising musicians. Rather than evaluating a finished AI tool, the work takes a co-design approach: it identifies musicians' needs, values, and concerns regarding AI in music, then uses those insights to develop a musical AI tool alongside them. The research is aimed primarily at HCI researchers, AI and music technology developers, and designers of co-creative AI systems. The core research questions driving this work are:
The research follows a co-design case study involving 13 practising musicians across two workshops and a two-week ecological evaluation, exploring how musicians compose and how AI might support their creative process.
The first workshop explored musicians' composition processes and their perceptions of AI in music. Participants composed a piece beforehand and reflected on their process, then took part in a human-human collaboration task and a human-AI task in which Music Transformer reharmonised their material. A key observation was that music is seen as a deeply human, emotional, embodied practice, and there was strong resistance to framing AI as a collaborator. Musicians were, however, open to AI as a tool, particularly one that generates musical variations, supports ideation rather than replacing authorship, and allows adjustable levels of agency. This led to the proposal of a musical variation tool.
The second workshop tested whether a variation tool would fit musicians' workflows and identified desired features. Participants used a prototype based on Music Transformer, uploading MIDI files and generating harmonic variations. Musicians confirmed that a variation tool is especially valuable during the ideation phase and that it should allow control over both the degree and the type of variation (pitch, rhythm, harmony), possibly support training on a personal style, preserve expressive MIDI qualities, and integrate into DAWs (Digital Audio Workstations).
Based on the workshop insights, a new system was built around MusicBERT, a model trained with masked prediction on MIDI sequences in Octuple encoding, where each note is represented by eight attributes (bar, instrument, position, pitch, duration, velocity, time signature, tempo). Users mask selected notes or individual attributes and the model predicts replacements; the more that is masked, the more the output varies. The system also supports adding new notes, bar control for choosing which bars to vary, and bar-level masking for more cohesive variations. The shift from Music Transformer to MusicBERT improved controllability, directly responding to musician feedback.
Six musicians used the system in their own creative environments for two weeks, keeping journals and concluding with a focus group discussion. Soundscapes (MIDI compositions) were generated through selective masking of musical attributes, chaining variations, adding new notes, and iterative filtering and refinement. Evaluation was qualitative, with musicians selecting useful fragments, editing and refining generated material, and primarily using outputs as inspiration rather than final content.
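The chaining workflow described above (feeding each generated variation back in as input for the next) can be sketched as a simple loop. The `vary` function here is a hypothetical stub that nudges pitches at random; a real system would instead ask the model to infill masked attributes. Only the data flow is illustrative:

```python
import random

def vary(notes, strength, rng):
    """Stub for one model infilling step: alter the pitch of a random subset
    of notes. Stands in for masking + model prediction."""
    out = []
    for note in notes:
        if rng.random() < strength:
            note = {**note, "pitch": note["pitch"] + rng.choice([-2, -1, 1, 2])}
        out.append(note)
    return out

def chain_variations(seed_notes, steps, strength=0.5, seed=0):
    """Chaining: each variation becomes the input for the next, so the
    material drifts further from the seed with every step."""
    rng = random.Random(seed)
    history = [seed_notes]
    for _ in range(steps):
        history.append(vary(history[-1], strength, rng))
    return history

seed = [{"pitch": p} for p in (60, 64, 67)]
chain = chain_variations(seed, steps=3)  # seed plus three chained variations
```

Keeping the whole `history` mirrors the musicians' iterative filtering: they could audition each step and pick useful fragments rather than committing to the final output.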
This work demonstrates that co-design can meaningfully shape AI music tools from the earliest stages of development, and that practising musicians are open to AI under specific conditions: framing the system as a tool rather than a collaborator, preserving creative ownership, providing fine-grained control, and integrating with existing workflows. The variation-based AI system offers a productive model for co-creative interaction. The following design insights emerged for future co-creative AI systems aimed at practising musicians: