Design and Implementation of a Structured Electronic Form for Celiac Disease Pathology Reports: A Text Mining Approach

Abstract:
Introduction
Pathology reports generally use an unstructured text format and contain a complex web of relations between medical concepts. In order to enable computers to understand and analyze the reports’ free text, we aimed to convert these concepts and their relations into a structured format.
Methods
The training, validation, and evaluation of this implementation study were based on a corpus of 258 pathology reports with a positive diagnosis of celiac disease, randomly selected from the records of 2 pathology laboratories. Our proposed system consisted of 3 phases: standardization of celiac disease pathology reports using the Delphi technique with 3 experts; information extraction from free-text reports with text mining techniques using the Stanford Parser; and automatic classification of celiac disease stages in the Marsh system using a decision tree classifier (the J48 algorithm).
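The third phase maps structured fields to a Marsh stage. The sketch below illustrates the general idea only: it uses hand-written rules based on the commonly cited modified Marsh (Marsh-Oberhuber) criteria, not the J48 tree that the study learned from data. The field names and the `PathologyFields` record are hypothetical, since the abstract does not list the fields of the standardized template.

```python
from dataclasses import dataclass


@dataclass
class PathologyFields:
    """Hypothetical structured fields; the study's real template is not given."""
    iel_increased: bool      # increased intraepithelial lymphocytes reported
    crypt_hyperplasia: bool  # crypt hyperplasia reported
    villous_atrophy: str     # "none", "partial", "subtotal", or "total"


def marsh_stage(fields: PathologyFields) -> str:
    """Assign a modified Marsh stage with fixed rules (assumption: not the learned J48 tree)."""
    atrophy_to_stage = {"partial": "3a", "subtotal": "3b", "total": "3c"}
    # Any degree of villous atrophy places the biopsy in stage 3.
    if fields.villous_atrophy in atrophy_to_stage:
        return atrophy_to_stage[fields.villous_atrophy]
    # Without atrophy, crypt hyperplasia separates stage 2 from stage 1.
    if fields.iel_increased and fields.crypt_hyperplasia:
        return "2"
    if fields.iel_increased:
        return "1"
    return "0"
```

A learned decision tree has the same shape as these nested conditions, but its splits are chosen from the training reports rather than written by hand.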
Results
We successfully extracted information from free-text pathology reports and assigned each piece of information to its associated predefined field in the standardized template form with an accuracy of 76%. After determining the Marsh stage for each report in the third phase, our system showed an average overall accuracy of 62%. Evaluation of the third phase as an independent system with manually corrected, gold-standard input achieved an accuracy of greater than 84%.
Conclusion
The benefits of standardized synoptic pathology reporting include enhanced completeness and improved consistency, avoidance of confusion and error, and facilitation of the faster and safer transmission of critical pathological data in comparison with narrative reports.
Language:
Persian
Published:
Health Information Management, Volume:13 Issue: 1, 2016
Page:
19
https://www.magiran.com/p1546262