Today, medical research draws on a variety of technologies, databases and networked platforms, such as Arevir, CRIP and p-BioSPRE, to process the growing body of knowledge and the wide range of measurement data, and to use them to improve drug compatibility, support decision-making and carry out statistical analyses. However, the information contained in free-text results usually remains unavailable for research or personalized medicine unless the data is prepared appropriately and made available in a well-structured format.
Integrating this information into different research approaches requires that the content of medical free-text results can be extracted reliably from the text and presented in a structured format. The CRIP.CodEx method makes this extraction user-friendly, fast and efficient, and presents the extracted information in a well-structured format (e.g. ICD codes, TNM system). CRIP.CodEx identifies word relationships, negations and their scope in free text without requiring access to internal or external databases or other resources. The extraction rules are "learned" automatically through the one-time import of a coding guide; no annotated training set or manual entry of rules is required. Dictionaries and rules can easily be added as needed.
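To make the idea of negation scope more concrete, the following is a minimal sketch of dictionary-based code extraction with a simple negation check. It is not the CRIP.CodEx algorithm, whose internals are not described here. It assumes a much simpler, cue-word approach in which a negation cue negates every term up to the next scope-ending token. The names `TERM_TO_CODE`, `NEGATION_CUES`, `SCOPE_ENDERS`, `tokenize` and `extract_codes` are hypothetical, and the two-entry dictionary is a toy stand-in for what a coding guide would supply.

```python
import re

# Toy dictionary (assumption): term -> ICD-O-3 morphology code.
# A real system would derive such entries from an imported coding guide.
TERM_TO_CODE = {
    "adenocarcinoma": "8140/3",
    "squamous cell carcinoma": "8070/3",
}

# Assumed cue words that open a negation and tokens that close its scope.
NEGATION_CUES = {"no", "not", "without"}
SCOPE_ENDERS = {"but", "however", ".", ";"}


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    return re.findall(r"[a-z0-9]+|[.;,]", text.lower())


def extract_codes(text):
    """Return (code, negated) pairs in text order for each dictionary term."""
    tokens = tokenize(text)

    # Mark every token that lies inside the scope of a negation cue.
    negated = [False] * len(tokens)
    in_scope = False
    for i, tok in enumerate(tokens):
        if tok in SCOPE_ENDERS:
            in_scope = False
        elif tok in NEGATION_CUES:
            in_scope = True
        negated[i] = in_scope

    # Look up each (possibly multi-word) term at every token position.
    hits = []
    for term, code in TERM_TO_CODE.items():
        words = term.split()
        for start in range(len(tokens) - len(words) + 1):
            if tokens[start:start + len(words)] == words:
                hits.append((start, code, negated[start]))

    return [(code, neg) for _, code, neg in sorted(hits)]


if __name__ == "__main__":
    report = ("No evidence of squamous cell carcinoma, "
              "but invasive adenocarcinoma is present.")
    print(extract_codes(report))
    # [('8070/3', True), ('8140/3', False)]
```

In this sketch the word "but" ends the scope opened by "No", so only the first finding is flagged as negated. A production system would need far richer handling of word relationships than this cue-word heuristic provides.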
CRIP.CodEx was developed using the texts of pathology reports and achieves a hit rate of 97 to 99% with an accuracy of 94 to 98% (German pathology results; ICD-O-3). However, the software can also be applied to free-text results from other areas of medicine and combined with other systems (e.g. MOTS from the University of Potsdam).