Electronic supplementary material
The online version of this article (doi:10.1186/1472-6947-12-9) contains supplementary material, which is available to authorized users.
The authors declare that they have no competing interests.
JL and ZL conceived the whole study, participated in its design, analyzed the results and wrote the manuscript. JL implemented methods and performed the experiments. All authors read and approved the final manuscript.
Each day, millions of health consumers seek drug-related information on the Web. Despite some efforts to link related resources, drug information remains largely scattered across a wide variety of websites of differing quality and credibility.
As a step toward providing users with integrated access to multiple trustworthy drug resources, we aim to develop a method that identifies a drug's dosage form in addition to recognizing its name. We developed rules and patterns for identifying dosage forms in different sections of full-text drug monographs, and then normalized the identified forms to standardized RxNorm dosage forms.
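To make the idea of pattern-based identification followed by normalization concrete, here is a minimal sketch in Python. It is not the authors' rule set: the synonym table, the function names, and the sample sentence are all hypothetical, and the mapping covers only a handful of dosage forms for illustration.

```python
import re

# Hypothetical surface-form -> normalized dosage form table.
# Illustrative only: a real system would need a far larger vocabulary
# and section-specific rules for drug monographs.
DOSE_FORM_SYNONYMS = {
    "tablet": "Oral Tablet",
    "tablets": "Oral Tablet",
    "capsule": "Oral Capsule",
    "capsules": "Oral Capsule",
    "oral solution": "Oral Solution",
    "topical cream": "Topical Cream",
}


def build_pattern(synonyms):
    """Compile one case-insensitive pattern, trying longer phrases first."""
    alternatives = sorted(synonyms, key=len, reverse=True)
    body = "|".join(re.escape(a) for a in alternatives)
    return re.compile(r"\b(" + body + r")\b", re.IGNORECASE)


PATTERN = build_pattern(DOSE_FORM_SYNONYMS)


def find_dosage_forms(text):
    """Return the sorted set of normalized dosage forms mentioned in text."""
    found = {DOSE_FORM_SYNONYMS[m.group(1).lower()] for m in PATTERN.finditer(text)}
    return sorted(found)


sample = "Ibuprofen is supplied as 200 mg tablets and as an oral solution."
forms = find_dosage_forms(sample)
```

A plain dictionary lookup like this is closer in spirit to a baseline approach; rules that take the monograph section and surrounding context into account would sit on top of it.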
Our method represents a significant improvement over a baseline lookup approach, achieving an overall macro-averaged precision of 80%, recall of 98%, and F-measure of 85%.
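As a reminder of how macro-averaged scores are typically computed, the sketch below averages per-group precision, recall, and F-measure with equal weight. The grouping unit (for example, per resource or per dosage form) is an assumption here, since the text above does not specify it, and the counts are toy values rather than the study's data. Note that a macro-averaged F-measure computed this way need not equal the harmonic mean of the macro-averaged precision and recall.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one group from raw counts."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


def macro_average(counts):
    """Average each metric over groups, giving every group equal weight."""
    scores = [precision_recall_f1(tp, fp, fn) for tp, fp, fn in counts]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Toy (tp, fp, fn) counts for two hypothetical groups; not the paper's data.
toy_counts = [(9, 1, 0), (1, 1, 0)]
macro_p, macro_r, macro_f = macro_average(toy_counts)

# Harmonic mean of the macro-averaged precision and recall, for comparison.
harmonic_of_macros = 2 * macro_p * macro_r / (macro_p + macro_r)
```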
We successfully developed an automatic approach for drug dosage form identification, which is critical for building links between different drug-related resources.