Making the most of precious data: experts weigh in on neuroscience data sharing
Whether it’s modelling or simulation, high-definition images or spreadsheets, current neuroscience research generates a lot of data. But are researchers making the most of all this raw information, which often takes a lot of computational time and effort to produce? How much time and effort could be saved by reusing datasets previously produced by other researchers? This was the topic of a dedicated discussion at the 2022 FENS forum, where experts in the field of neuroscience data pointed out that very few laboratories are effectively sharing and reusing data.
“Data sharing in neuroscience is facing three main challenges”, explained Maaike van Swieten from the University of Oslo. Van Swieten is a researcher in neuroinformatics and a data curator for the EBRAINS platform. “First, we must bring seemingly unrelated data together, integrating multiscale, multi-model data into a unified database. Secondly, we must all speak a common language, preferably computer-based and machine-actionable, to ensure our datasets are compatible. And thirdly, we must support researchers with practical expertise on how to share their data in the best possible way”.
“Keeping data open for research is a non-negligible expense. Is that data reaching its full potential?” said Stephanie Albin of The Kavli Foundation, a private foundation that promotes open data use. “Unfortunately, current data often has little purview outside the research it was produced for. Data standardization, for example the use of unified data languages such as NWB (Neurodata Without Borders), is a possible answer”.
“We often tell researchers ‘go out and share your data’, but how are they supposed to do it? We need to create environments conducive to data sharing and reuse”, said Mathew Birdsall Abrams, Director of INCF, the International Neuroinformatics Coordinating Facility. INCF and HBP/EBRAINS collaborate on KnowledgeSpace, a collaborative data-driven search engine for neuroscience. “It’s both a Wikipedia and a data repository”, explained Abrams. “You look for a neuroscience concept and you can access all datasets related to that concept with their sources. It is one of the few places where you can easily find large datasets related to international brain initiatives”.
“A recent survey shows that 46% of researchers struggle with organising their data in a presentable way, 37% are unsure about copyright issues, and 33% don’t know which repository to use”, said Van Swieten. “EBRAINS services provide curation and tools that address these practical challenges”. The Knowledge Graph is a structure for metadata representation that allows you to connect different datasets together, creating and organising a network of preexisting data relevant to your research.
The structure is dictated by the openMINDS community-driven annotation models. In addition, EBRAINS offers curation services provided by experts to help researchers navigate the intricacies of the process. “For example, you can extract all the relevant information from an abstract and categorize it in a structured format. That is turned into advanced metadata and then becomes more effectively accessible through the Knowledge Graph”, said Van Swieten. “The three main challenges (bringing data together, shared language, and practical issues) are addressed by the Knowledge Graph, openMINDS, and curation services respectively”.