Electronic supplementary material
The online version of this article (doi:10.1186/1472-6947-12-72) contains supplementary material, which is available to authorized users.
The author(s) declare that they have no competing interests.
JLW collected the corpus, designed the experiment, and contributed to writing the paper. LCY designed the study, interpreted the experimental results, and contributed to writing the paper. PJC restructured the paper and contributed to writing the paper. All authors read and approved the final manuscript.
Online psychiatric texts are natural language texts expressing depressive problems, published by Internet users via community-based web services such as web forums, message boards and blogs. Understanding the cause-effect relations embedded in these psychiatric texts can provide insight into the authors’ problems, thus increasing the effectiveness of online psychiatric services.
Previous studies have proposed the use of word pairs extracted from a set of sentence pairs to identify cause-effect relations between sentences. A word pair is made up of two words, one from the cause text span and the other from the effect text span. Analyzing the relationship between these words captures individual word associations between cause and effect sentences. For instance, (broke up, life) and (boyfriend, meaningless) are two word pairs extracted from the sentence pair: “I broke up with my boyfriend. Life is now meaningless to me”. The major limitation of word pairs is that individual words in sentences usually cannot reflect the exact meaning of the cause and effect events, and may therefore produce semantically incomplete word pairs, as the previous examples show. This study therefore proposes the use of inter-sentential language patterns such as <<broke up, boyfriend>, <life, meaningless>> to detect causality between sentences. Inter-sentential language patterns capture associations among multiple words both within and between sentences, and can thus provide more precise information than word pairs. To acquire inter-sentential language patterns, we develop a text mining framework that extends the classical association rule mining algorithm so that it can discover frequently co-occurring patterns across the sentence boundary.
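The contrast between word pairs and inter-sentential language patterns can be illustrated with a minimal Python sketch. It is not the framework developed in this study: it assumes pre-tokenized sentence pairs, fixes the number of words per side of a pattern, and uses a plain support count in place of a full association rule mining procedure. The function names, the `min_support` threshold and the toy corpus are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations, product


def word_pairs(cause, effect):
    """Return every (cause word, effect word) pair for one sentence pair."""
    return set(product(set(cause), set(effect)))


def frequent_word_sets(sentences, size, min_support):
    """Word sets of a fixed size that occur in at least min_support sentences."""
    counts = Counter()
    for words in sentences:
        for combo in combinations(sorted(set(words)), size):
            counts[combo] += 1
    return {combo for combo, n in counts.items() if n >= min_support}


def inter_sentential_patterns(pairs, size=2, min_support=2):
    """Combine frequent within-sentence word sets across the sentence boundary.

    A pattern (cause_set, effect_set) is kept when both sets co-occur in at
    least min_support sentence pairs (a simplifying assumption).
    """
    cause_sets = frequent_word_sets([c for c, _ in pairs], size, min_support)
    effect_sets = frequent_word_sets([e for _, e in pairs], size, min_support)
    counts = Counter()
    for cause, effect in pairs:
        cause_words, effect_words = set(cause), set(effect)
        for cs in cause_sets:
            if not cause_words.issuperset(cs):
                continue
            for es in effect_sets:
                if effect_words.issuperset(es):
                    counts[(cs, es)] += 1
    return {pattern: n for pattern, n in counts.items() if n >= min_support}


# Hypothetical toy corpus of (cause tokens, effect tokens); not study data.
corpus = [
    (["broke up", "boyfriend"], ["life", "meaningless"]),
    (["broke up", "boyfriend", "yesterday"], ["life", "meaningless", "now"]),
    (["lost", "job"], ["feel", "hopeless"]),
]

pairs_first = word_pairs(*corpus[0])
patterns = inter_sentential_patterns(corpus)
```

On this toy corpus, `pairs_first` contains fragments such as ("broke up", "life"), whereas `patterns` keeps only the combination of the word sets {boyfriend, broke up} and {life, meaningless}, which corresponds to the pattern <<broke up, boyfriend>, <life, meaningless>> given above.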
Performance was evaluated on a corpus of texts collected from PsychPark (http://www.psychpark.org), a virtual psychiatric clinic maintained by a group of volunteer professionals from the Taiwan Association of Mental Health Informatics. Experimental results show that the use of inter-sentential language patterns outperformed the use of word pairs proposed in previous studies.
This study demonstrates the acquisition of inter-sentential language patterns for causality detection from online psychiatric texts. Such semantically more complete and precise features can improve causality detection performance.