Using markerless motion capture to drive music generation
Contributor: Louis Martinez
Mentors: Suresh, Yohaï-Eliel BERREBY
Gesture-based music generation has existed for some years now, thanks to software such as Wekinator. However, these tools rely on machine learning methods that have since been overtaken by recent Deep Learning innovations, which limits the studies that can be carried out on musical creativity and sound/movement interaction. The aim of this project is to use modern Deep Learning methods to generate music driven by markerless motion capture. The framework consists of three distinct modules. First, user gestures are detected using landmarks generated by Google's MediaPipe. Second, these landmarks are used to recognize the gesture performed by the user and to generate Open Sound Control (OSC) signals. Finally, the OSC signals are converted into music using Max/MSP. In addition, new gestures can be dynamically added to the framework from just a few examples and mapped to new sounds. A minimal sketch of the landmark-to-OSC pipeline is included at the end of this description.

Deliverables:

- Experiment to estimate the appropriate delay between a gesture and the resulting sound/effect.
- Framework based on LivePose or Pose2Art using only the gestures predefined in MediaPipe, able to interface the chosen camera and the sound card.
- Model based on the MediaPipe holistic model to classify dynamic gestures.
- Framework to map gestures to sounds through OSC signals.
- Few-Shot Learning-based model to dynamically add new gestures.
![](/sites/default/files/gbb-uploads/incf-wg-gsoc-image.jpg)
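As a rough illustration of how the first and third modules could connect, the sketch below extracts MediaPipe holistic landmarks from a webcam frame and forwards one of them as an OSC message. It assumes the `mediapipe`, `opencv-python`, and `python-osc` packages; the camera index, OSC port (7400), and `/gesture/wrist` address are placeholders to be matched in the Max/MSP patch, and forwarding the right-wrist coordinates stands in for the gesture classifier, which is not shown.

```python
import cv2
import mediapipe as mp
from pythonosc.udp_client import SimpleUDPClient

# Placeholder OSC destination: e.g. a Max/MSP patch listening with [udpreceive 7400].
OSC_HOST, OSC_PORT = "127.0.0.1", 7400
client = SimpleUDPClient(OSC_HOST, OSC_PORT)

mp_holistic = mp.solutions.holistic
RIGHT_WRIST = mp_holistic.PoseLandmark.RIGHT_WRIST

cap = cv2.VideoCapture(0)  # chosen camera; index 0 is an assumption
with mp_holistic.Holistic(min_detection_confidence=0.5) as holistic:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB frames; OpenCV delivers BGR.
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            wrist = results.pose_landmarks.landmark[RIGHT_WRIST]
            # Forward normalized wrist coordinates as a continuous controller;
            # the /gesture/wrist address is a placeholder.
            client.send_message("/gesture/wrist", [wrist.x, wrist.y])
cap.release()
```

In the full framework, the landmark stream would feed the dynamic-gesture classifier rather than being sent to Max/MSP directly, and the OSC addresses would carry recognized gesture labels in addition to raw coordinates.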