Qosmo Music & Sound AI

Video2Music

AI selects songs that “feel right” for the video

FEATURES

  • Music that "feels right" for a given video is selected instantly from a target music library with millions of songs, with no limitation on musical style

  • In addition to the “video→music” search, a “music→video” search is also available, improving the UX of cross-modal content search

  • Combined with Qosmo’s other search algorithms, you can build a wide range of search services, including similar-song suggestions

USE CASE

  • Boost purchases at stock music/video services

    Stock music/video services conventionally rely on tags and keywords to aid search, but this requires users to know what they are looking for. Video2Music solves this problem directly by letting users ask, “find music that fits this video.”

  • Music selection requests from customers in video production

    Song selection has depended on expert librarians at record companies and elsewhere, who frequently receive requests for matching music. Video2Music allows those less familiar with the library to narrow the candidates down to a few selections effectively.

  • Feature integration with movie editing software

    Many movie editing software products offer a music library for users to choose from. It would greatly improve the UX if the product could recommend a timely selection of music that fits the content under production.

IMPLEMENTATION

TECHNOLOGY

  • Video2Music uses a deep-learning architecture called the Transformer to convert video and music inputs into mutually comparable latent feature vectors. By training the model on a large number of online movie contents* with the Contrastive Learning technique, we can quantitatively calculate the fitness between a video and a song, two distinct forms of media. The pre-trained model provided with the product license already supports a wide range of input videos and music styles, but it can be re-trained with additional data to improve accuracy for specific applications.

  • *Training machine learning models on copyrighted materials is permitted under the copyright law of Japan
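
As a rough illustration of how contrastive learning can turn paired video and music vectors into a fitness score, the sketch below computes a symmetric InfoNCE-style loss over a small batch. It is a minimal sketch under stated assumptions, not Qosmo's implementation: the page does not specify the loss function, the similarity measure, or the temperature, so cosine similarity, `contrastive_loss`, and `temperature=0.1` are all illustrative choices, and the short hard-coded vectors stand in for the Transformer encoders' outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity, used here as the video/song "fitness" score (an assumption)."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed in a numerically stable way."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def contrastive_loss(video_vecs, music_vecs, temperature=0.1):
    """Symmetric InfoNCE-style loss: video i and song i are the matching pair,
    every other song (or video) in the batch acts as a negative example."""
    n = len(video_vecs)
    sim = [[cosine(v, m) / temperature for m in music_vecs] for v in video_vecs]
    # video -> music direction: each row should peak on the diagonal
    v2m = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    # music -> video direction: each column should peak on the diagonal
    m2v = sum(cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)) / n
    return (v2m + m2v) / 2


# Toy embeddings (hypothetical): two videos and two songs in a 2-D latent space.
videos = [[1.0, 0.0], [0.0, 1.0]]
matched_songs = [[0.9, 0.1], [0.1, 0.9]]     # song i lies close to video i
mismatched_songs = [[0.1, 0.9], [0.9, 0.1]]  # pairs are swapped

loss_matched = contrastive_loss(videos, matched_songs)
loss_mismatched = contrastive_loss(videos, mismatched_songs)
```

Training on such a loss pulls matching video/song pairs together in the latent space and pushes non-matching pairs apart, which is why the matched batch above yields a lower loss than the swapped one.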

TECH SPEC

  • Pricing

    Initial fee (initial library indexing, system integration, etc.)

    Monthly fee (fixed rate up to a specified number of API calls)

  • Input/Output

    Input: Video (30 seconds or longer)

    Output: Song candidates (can reverse input and output)

  • Operating Environment

    Cloud: REST API

    On-premise: Linux-GPU environment

  • Processing speed

    Indexing: < 3 seconds (per song)

    Matching: < 1 second (per search)
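
The split between a one-time indexing step per song and a fast matching step per search, along with the ability to reverse input and output, can be sketched as a small in-memory vector index. This is an illustrative sketch only: `EmbeddingIndex`, its methods, the item IDs, and the 2-D vectors are hypothetical and are not part of the Video2Music API. It also assumes that embeddings have already been produced by the model, and it scans every entry, whereas a library of millions of songs would presumably use an approximate nearest-neighbor index (the page does not say which).

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


class EmbeddingIndex:
    """Hypothetical in-memory index over latent vectors (songs or videos)."""

    def __init__(self):
        self.ids = []
        self.vectors = []

    def add(self, item_id, vector):
        """Indexing step: store the normalized embedding once per item."""
        self.ids.append(item_id)
        self.vectors.append(normalize(vector))

    def search(self, query_vector, top_k=3):
        """Matching step: rank stored items by cosine similarity to the query."""
        query = normalize(query_vector)
        scored = [
            (sum(a * b for a, b in zip(query, vec)), item_id)
            for item_id, vec in zip(self.ids, self.vectors)
        ]
        scored.sort(reverse=True)
        return [(item_id, score) for score, item_id in scored[:top_k]]


# video -> music: index songs once, then query with a video embedding.
song_index = EmbeddingIndex()
song_index.add("song_a", [1.0, 0.0])
song_index.add("song_b", [0.0, 1.0])
song_index.add("song_c", [1.0, 1.0])
song_candidates = song_index.search([0.9, 0.1], top_k=2)

# music -> video: the same structure works in reverse because both media
# types share one latent space.
video_index = EmbeddingIndex()
video_index.add("video_x", [0.2, 0.8])
video_index.add("video_y", [0.8, 0.2])
video_candidates = video_index.search([0.0, 1.0], top_k=1)
```

Normalizing at indexing time keeps the per-search work down to dot products, which mirrors the spec's distinction between the slower per-song indexing cost and the faster per-search matching cost.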

Get in touch with us here!

CONTACT
