EUROPEAN RARE DISEASE REGISTRY
WHITE PAPER

Written by Riku Rinta-Jouppi (MA, MSc), Attorney-at-Law, Partner

Abundant data that is findable, accessible, interoperable and reusable (FAIR), together with evidence-based medicine (EBM), forms the cornerstone of modern data-driven medical research. The rapid pace of technological progress (Moore’s law) has enabled the collection, governance and analysis of health data at scale. Carefully formulated research questions are posed and clinical trials are carried out so that answers can be found and insights gained from the data. Medical research runs on data. Machine learning is only as good as the quality and quantity of the data that goes in. This is a major problem in fields of medicine that have only small data sets or no data at all. Some research fields are lucky to have oceans of data, enabling wide-scale use of the latest data-hungry research methods. Other research groups may be able to develop the best algorithms in the world, but these are of no use if the relevant data is simply not available to them. Many healthcare institutions are struggling with data governance: how to use the data they hold in the best interests of patients. Improved access to aggregated big data sets for research use is a human rights issue for rare disease patients, so essential have such data sets become in informing ever faster and more accurate diagnosis and better treatment.

As ERDR generates rare disease data and makes it more available for research, the data models informing diagnosis will become more accurate. The ability to search for data and to recruit patients for further research through ERDR at the European level will remove many of the barriers to data access that exist at the national level. Essentially, ERDR will create a fully liquid and fair market for fit-for-purpose rare disease research data. This can be done without a loss of privacy, as data donors will be able to decide on the level of access to their data that they want to provide. They will also be compensated for their efforts, perhaps not in fiat money but through state-of-the-art token economics. A new economic model needs to be built to support the data-altruistic rare disease research effort and to give individual patients and the patient community new revenue streams.

Data is sometimes compared to gold due to its rarity and value. Yet health data cannot be found in nature. It is generated by us humans in a specific format, usually for some specific purpose. Unlike gold, data as a digital asset is essentially non-rival: it can be copied at near-zero cost, used repeatedly for a variety of purposes and used to build a wide variety of high added-value data products. Value can be created from data in particular through aggregation and integration across various data types and between different types of organisations and stakeholders.

In practice, health data is also subject to various costs and liabilities due to the high regulatory and legal requirements placed on it in the European Union. Personally identifiable information (PII) in particular is to be minimised by design and by default, as it is considered a particularly sensitive class of information. This has led to a “better safe than sorry” culture in public health institutions, where data tends to be locked in fragmented, high-security silos with a very low rate of utilisation for research. Thus, the promise of the data economy remains unrealised in the rare disease area. What ERDR sets out to do is to “break the silos”.

The planned increase in data sharing and data donations requires trust on the part of data donors that their data will be protected from abuse. Yet data breaches and cybersecurity attacks happen every day.

Why is ERDR needed? Because without big data there is no hope for the 95% of rare diseases that lack treatments, or for the rare diseases with unreliable diagnostic tests. The data problem has to be solved first.

Big-data-based analytics, or “omics”, requires vast quantities of data in order to provide reliable results. A single study can involve as many as a million variables, and the contribution of any single variable is very small. Multiomics refers to:

Genomics

Transcriptomics

Proteomics

Metabolomics

Lipidomics

Multivariate analytics is particularly important for multigene diseases. Hundreds or even thousands of gene variants can contribute to an individual’s genetic risk profile. However, 70% of rare diseases are estimated to be based on a single genetic defect: an addition, omission or deletion within a single gene. In rare diseases it is therefore not necessary to have data sets of hundreds of thousands of participants with tens of thousands of patients, as in some common diseases. We can do with less, but not completely without data. Broad international co-operation is vital in collecting samples and in federating research results.

Finland has 11 biobanks that store over 500 000 blood or DNA samples and millions of tissue samples. Biobanks have been established to serve university hospitals, universities, THL, the Red Cross Blood Service and Terveystalo. The biobanks cooperate under the name FINBB.

Biobanks are an essential part of modern biomedical research. If the national research infrastructure is well maintained, biobanks are attractive partners for pharmaceutical companies looking for clinical trial participants with particular biomarkers.