Improving (Re-)Usability of Musical Datasets: An Overview of the DOREMUS Project

Read original: arXiv:2405.03382 - Published 5/7/2024 by Pasquale Lisena (WEB3), Manel Achichi (WEB3), Pierre Choffé (BnF), Cécile Cecconi (WEB3), Konstantin Todorov (WEB3), Bernard Jacquemin (GERIICO), Raphael Troncy

Overview

  • The DOREMUS project aims to improve the description and exploration of music data from three French institutions.
  • The paper presents the data model based on FRBRoo, the conversion and linking processes using linked data technologies, and the prototypes created to make the data more accessible to web users.

Plain English Explanation

The DOREMUS project builds tools for describing and exploring music data. It takes information from three French institutions and links it together so that people can find and use it more easily.

The project is using a data model called FRBRoo to structure the information, which helps connect different pieces of data about the same musical works or performances. They are also using linked data technologies to make the data more accessible and interlinked on the web.

The team has developed some prototypes, or early versions, of tools that allow web users to more easily search, browse, and interact with the combined music data from the three institutions. This could help researchers, musicians, and the general public better understand and explore the rich musical heritage represented in these collections.

Technical Explanation

The DOREMUS project focuses on improving the description and exploration of music data from three French institutions: the Bibliothèque nationale de France, the Philharmonie de Paris, and Radio France.

The researchers have developed a data model based on FRBRoo, the object-oriented formulation of the FRBR bibliographic model, harmonized with the CIDOC CRM ontology. FRBRoo structures information about bibliographic entities such as books, musical works, and performances. This lets the model represent the relationships between different aspects of a musical work, such as the composition, its individual performances, and the recordings of those performances.
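To make the idea of relating a work to its performances and recordings concrete, here is a minimal Python sketch. It is not the FRBRoo ontology or the DOREMUS model itself: the identifiers, the predicate names (`performs`, `records`), and the sample data are simplified assumptions chosen only for illustration.

```python
from collections import defaultdict

# Illustrative (subject, predicate, object) triples.
# All identifiers and predicate names are hypothetical; they are NOT
# actual FRBRoo or DOREMUS terms, and the data is made up.
TRIPLES = [
    ("work/1", "type", "Work"),
    ("work/1", "title", "Example Sonata"),
    ("performance/1", "type", "Performance"),
    ("performance/1", "performs", "work/1"),
    ("performance/2", "type", "Performance"),
    ("performance/2", "performs", "work/1"),
    ("recording/1", "type", "Recording"),
    ("recording/1", "records", "performance/1"),
]


def build_index(triples):
    """Index subjects by (predicate, object) for reverse lookups."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].append(subject)
    return index


def recordings_of_work(work_id, triples):
    """Follow work <- performance <- recording and return recording ids."""
    index = build_index(triples)
    recordings = []
    for performance in index[("performs", work_id)]:
        recordings.extend(index[("records", performance)])
    return recordings


if __name__ == "__main__":
    print(recordings_of_work("work/1", TRIPLES))
```

The point of the sketch is that a question such as "which recordings exist for this work?" becomes a short traversal once works, performances, and recordings are separate, explicitly linked entities.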

To bring together the data from the three institutions, the team has employed linked data technologies. This involves converting the existing data into a common format, establishing links between related entities, and publishing the resulting linked dataset on the web. The prototypes they have built demonstrate how this linked data can be consumed by web users through search, browsing, and exploration interfaces.
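The linking step can be pictured with a small Python sketch. This summary does not describe how DOREMUS actually aligns records, so the approach here (exact matching on a normalized composer and title key) is a deliberately naive assumption, and the record ids, field names, and the `same_as` label are hypothetical.

```python
import unicodedata

# Made-up records standing in for two institutions' catalogs.
# Ids and field names are hypothetical.
SOURCE_A = [
    {"id": "a:1", "composer": "Claude Debussy", "title": "La Mer"},
    {"id": "a:2", "composer": "Gabriel Fauré", "title": "Requiem"},
]
SOURCE_B = [
    {"id": "b:7", "composer": "claude debussy", "title": "La  mer"},
    {"id": "b:9", "composer": "Gabriel Faure", "title": "Requiem"},
    {"id": "b:10", "composer": "Maurice Ravel", "title": "Boléro"},
]


def normalize(text):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def record_key(record):
    """Build the comparison key used to decide that two records match."""
    return (normalize(record["composer"]), normalize(record["title"]))


def link_records(source_a, source_b):
    """Return (id_a, 'same_as', id_b) links for records sharing a key."""
    ids_by_key = {}
    for record in source_b:
        ids_by_key.setdefault(record_key(record), []).append(record["id"])

    links = []
    for record in source_a:
        for other_id in ids_by_key.get(record_key(record), []):
            links.append((record["id"], "same_as", other_id))
    return links


if __name__ == "__main__":
    for link in link_records(SOURCE_A, SOURCE_B):
        print(link)
```

Exact key matching misses variant titles, translations, and catalog-number differences, which is why real alignment work typically needs fuzzier comparison and manual validation; the sketch only shows where such links come from and what shape they take.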

Critical Analysis

The DOREMUS project tackles an important challenge in the digital humanities by aiming to provide better tools for understanding and working with music data from multiple sources. The use of a robust data model like FRBRoo and the adoption of linked data principles are positive steps towards achieving this goal.

One potential limitation of the work is its scope: it covers data from just three French institutions. Expanding the project to incorporate music data from a wider range of sources, both nationally and internationally, could further enhance the value and impact of the research.

Additionally, while the prototypes demonstrate the potential of the linked data approach, more user testing and evaluation would be helpful to ensure the tools are meeting the needs of the target audiences, which include researchers, musicians, and the general public.

Further research could also explore ways to automatically extract and link music-related data from web sources, such as [video](https://aimodels.fyi/papers/arxiv/nes-video-music-database-dataset-symbolic-video) and [text](https://aimodels.fyi/papers/arxiv/musilingo-bridging-music-text-pre-trained-language), or to develop advanced music analysis and fingerprinting techniques that could enhance the data and its discoverability.

Conclusion

The DOREMUS project demonstrates the potential of linked data and semantic technologies to improve the description and exploration of music collections. By connecting data from multiple sources using a robust data model, the team has laid the groundwork for more powerful tools that can help researchers, musicians, and the public better understand and engage with our musical heritage.

While the current focus is on French institutions, expanding the scope and continuing to refine the user experience could lead to significant advancements in the field of digital musicology and the preservation of cultural knowledge.



