Data Integration and Networking

We facilitate and support the stratification and visualization of biomedical data by integrating and harmonizing it efficiently across institutional and national borders. One example is the web-based, case- and sample-specific search for human biosamples and their associated data in biobank networks and metabiobanks.

Research and Development Services:

  • Secure data exchange between research institutions, external partners and data banks
  • Cutting-edge technologies for efficient integration, annotation, harmonization, anonymization, stratification and visualization of biomedical data
  • Automated knowledge extraction from free text/text mining with CRIP.CodEx
  • Tailor-made, user-friendly solutions for developing metabiobanks or the integration of existing biobanks into our metabiobank portals
  • Applications from the Fraunhofer CRIP Toolbox – our »Swiss Army Knife« – for biomedical research platforms

CRIP: Central Research Infrastructure for Molecular Pathology

CRIP has been online since 2006 and is the prototype for all of the Working Group’s other metabiobanks. Developed in collaboration with the CRIP advisory board and founding partners, the CRIP Privacy Regime is a groundbreaking concept – legally approved in terms of data protection in Germany since 2006.

Integrating biobanks into so-called metabiobanks enables web-based, case- and sample-specific searches for human biosamples and the associated data across institutions and national borders. In this way, samples collected during diagnostics and treatment (e.g. blood, serum, tissue) and the associated data can be made available for research quickly and in statistically relevant numbers. CRIP has been successfully evaluated internationally.
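The cross-institutional search described above can be pictured with a minimal sketch. It assumes that every biobank already provides harmonized sample records and that the portal reports only hit counts per biobank. The names `Sample`, `search_biobank` and `federated_search`, the use of ICD-10 codes and the test data are all hypothetical illustrations, not the CRIP implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Harmonized sample record (hypothetical schema)."""
    material: str   # e.g. "tissue", "serum", "blood"
    diagnosis: str  # harmonized diagnosis code, here an ICD-10 code


def search_biobank(samples, material, diagnosis_prefix):
    """Count the samples of one biobank that match the query."""
    return sum(
        1
        for s in samples
        if s.material == material and s.diagnosis.startswith(diagnosis_prefix)
    )


def federated_search(biobanks, material, diagnosis_prefix):
    """Run the same query against every biobank and keep those with hits.

    Only aggregated counts are returned, never individual records
    (an assumption made for this sketch).
    """
    hits = {
        name: search_biobank(samples, material, diagnosis_prefix)
        for name, samples in biobanks.items()
    }
    return {name: count for name, count in hits.items() if count > 0}


# Invented test data for three partner biobanks.
BIOBANKS = {
    "Biobank A": [
        Sample("tissue", "C18.7"),
        Sample("serum", "C18.7"),
        Sample("tissue", "C50.9"),
    ],
    "Biobank B": [Sample("tissue", "C18.2"), Sample("tissue", "C18.9")],
    "Biobank C": [Sample("blood", "C91.0")],
}

if __name__ == "__main__":
    # Tissue samples with a colon cancer diagnosis (ICD-10 C18.x).
    print(federated_search(BIOBANKS, "tissue", "C18"))
```

A researcher would then contact the biobanks listed in the result to request the samples; how access is granted is outside the scope of this sketch.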

Over the years, the software originally developed for CRIP has evolved into the modular CRIP Toolbox, a software portfolio for the flexible, tailor-made setup of biobank portals.

CRIP.CodEx: Information Extraction from Medical Free-Text Results

Today, medical research draws on a variety of technologies, databases and networked platforms, such as Arevir, CRIP and p-BioSPRE, to process the growing body of knowledge and measurement data and to use it for improving drug compatibility, supporting decisions and performing statistical analyses. However, the information contained in free-text results usually remains unavailable for research or personalized medicine unless it is prepared appropriately and made available in a well-structured format.

Integrating this information into various research approaches requires that it can be extracted reliably from medical free-text results and presented in a structured format. The CRIP.CodEx method makes this extraction user-friendly, fast and efficient and presents the extracted information in a well-structured format (e.g. ICD codes, TNM system). CRIP.CodEx identifies word relationships, negations and their scope in free text without requiring access to internal or external databases or other resources. The extraction rules are »learned« automatically through the one-time import of a coding guide – no annotated training set or manual entry of rules is required. Dictionaries and rules can easily be added as required.
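To illustrate the general idea of negation-aware code extraction, here is a deliberately simplified sketch. It is not the CRIP.CodEx algorithm: it assumes a tiny hand-written coding guide, a fixed list of English negation cues and a scope that simply ends at the next punctuation mark or »but«. The names `CODING_GUIDE`, `learn_rules`, `negated_spans` and `extract_codes` are hypothetical.

```python
import re

# Hypothetical mini "coding guide": term -> ICD-O-3 morphology code.
CODING_GUIDE = {
    "adenocarcinoma": "8140/3",
    "squamous cell carcinoma": "8070/3",
}

# Simplifying assumptions: a fixed cue list, and a negation scope that
# runs from the cue to the next punctuation mark or the word "but".
NEGATION_CUES = ("no evidence of", "without", "no")
SCOPE_END = re.compile(r"[.;,]|\bbut\b")


def learn_rules(coding_guide):
    """Derive one (pattern, code) rule per coding guide entry."""
    return [
        (re.compile(r"\b" + re.escape(term) + r"\b"), code)
        for term, code in coding_guide.items()
    ]


def negated_spans(text):
    """Return (start, stop) character spans covered by a negation cue."""
    spans = []
    for cue in NEGATION_CUES:
        for match in re.finditer(r"\b" + re.escape(cue) + r"\b", text):
            end = SCOPE_END.search(text, match.end())
            spans.append((match.end(), end.start() if end else len(text)))
    return spans


def extract_codes(text, rules):
    """Map terms found in the text to codes and flag negated mentions."""
    text = text.lower()
    spans = negated_spans(text)
    results = []
    for pattern, code in rules:
        for match in pattern.finditer(text):
            negated = any(start <= match.start() < stop for start, stop in spans)
            results.append({"code": code, "negated": negated})
    return results


if __name__ == "__main__":
    rules = learn_rules(CODING_GUIDE)
    report = (
        "Adenocarcinoma of the colon. "
        "No evidence of squamous cell carcinoma, margins clear."
    )
    for finding in extract_codes(report, rules):
        print(finding)
```

In this example the adenocarcinoma mention is reported as affirmed, while the squamous cell carcinoma mention falls inside the scope of »no evidence of« and is flagged as negated. Real reports need far richer handling of word relationships and scope than this sketch provides.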

CRIP.CodEx was developed on the basis of pathology report texts and achieves a hit rate of 97 to 99% with an accuracy of 94 to 98% (German pathology results; ICD-O-3). However, the software can also be used for free-text results from other areas of medicine and can be combined with other systems (e.g. MOTS from the University of Potsdam).


p-BioSPRE: Biobank Portal of the p-medicine Consortium

As part of the EU-funded »p-medicine« project, the p-BioSPRE platform was developed together with the Fraunhofer IBMT from 2011 to 2015. It grants authorized researchers access to highly annotated biobanks for acute lymphoblastic leukemia (ALL), the most common form of childhood leukemia, and for Wilms’ tumors (nephroblastoma), including information on the related patient consent declarations. p-BioSPRE can be flexibly expanded at any time to include additional tumor entities and other biobank partners.

Demonstrator with test data for numerous tumor entities

P2B2: Project Portal in the German Biobank Registry

The project portal of the German Biobank Registry (P2B2) offers a way to simultaneously search for specific samples or cases in a wide range of different biobanks. Developed between 2010 and 2012, the BMBF-funded project P2B2 is the proof-of-concept for establishing a project portal for all kinds of medical biobanks on the basis of CRIP, which was originally developed for pathology tissue banks. P2B2 is accessible via single sign-on together with the German Biobank Registry.

Modules of the Fraunhofer CRIP Toolbox:

  • CRIP.CodEx – knowledge extraction from medical free-text sources
  • CRIP.IDB/.IANUS – data integration and harmonization
  • CRIP.Anon – anonymization of medical patient and case data
  • CRIP.Webservice – M2M access to shared infrastructure components
  • CRIP.Searchtool – portal interface for searches, project queries and research

We work with more than a dozen university clinics/biobank partners and have outstanding expertise in the areas of

  • semantic integration of a wide range of case- and sample-relevant data from medical research and care,
  • contractual arrangements for the data transfer in terms of the relevant ethical and legal regulations, and
  • governance of cross-institutional biobank networks.

  • Gros O, Thasler R (2016): »Diagnostic Free Text Analysis in Biobanks with CRIP.CodEx: Automated Matching of Classifications«, in Proceedings of the 7th International Symposium on Semantic Mining in Biomedicine, pp. 70–74, urn:nbn:de:0074-1650-6
  • Weiler G, Schröder C, Schera F, Dobkowicz M, Kiefer S, Heidtke K R, Hänold S, Nwankwo I, Forgo N, Stanulla M, Eckert C, Graf N. p-BioSPRE – an information and communication technology framework for transnational biomaterial sharing and access. ecancer medical science 2014(8).
  • Gros O (2014). Computergestützte Wissensextraktion aus Befundtexten der Pathologie. Dissertation Mathemat-naturw. Fakultät der Universität Potsdam, 2014.
  • Gros O, Stede M (2013): »Determining Negation Scope in German and English medical diagnoses«, in M. Taboada and R. Trnavac (eds.): Nonveridicality and Evaluation – Theoretical, Computational and Corpus Approaches, Studies in Pragmatics 11, BRILL. ISBN: 9789004258167
  • Schröder C, Heidtke K R, Zacherl N, Zatloukal K, Taupitz J (2010). Safeguarding donors' personal rights and biobank autonomy in biobank networks: the CRIP privacy regime. Cell Tissue Bank 12(3): 233 – 240; doi: 10.1007/s10561-010-9190-8. Epub 2010 Jul 15.