Journal of Medical Systems

1 October 2011 | Original Paper

Evaluating Cluster Preservation in Frequent Itemset Integration for Distributed Databases

Authors: Sumeet Dua, Michael P. Dessauer, Prerna Sethi

Published in: Journal of Medical Systems | Issue 5/2011

Abstract

Medical sciences are rapidly emerging as a data-rich discipline in which the number of databases and their dimensionality increase exponentially with time. Data integration algorithms often rely upon discovering embedded, useful, and novel relationships between the feature attributes that describe the data. Such algorithms require data integration prior to knowledge discovery, which can undermine the timeliness, scalability, robustness, and reliability of the discovered knowledge. Knowledge integration algorithms offer pattern discovery on segmented and distributed databases but require sophisticated methods for merging patterns and evaluating integration quality. We propose a unique computational framework for discovering and integrating frequent sets of features from distributed databases and then exploiting them for unsupervised learning from the integrated space. Assorted indices of cluster quality are used to assess the accuracy of knowledge merging. The approach preserves significant cluster quality under various cluster distributions and noise conditions. Exhaustive experimentation is performed to further evaluate the scalability and robustness of the proposed methodology.
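
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each site holds a list of transactions (sets of feature items), merges itemsets by summing raw per-site counts (a simple exact scheme chosen here for illustration), and uses the Rand index as one example of a cluster quality index. The function names, the `max_size` limit, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_itemset_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items in one site's transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def integrate_frequent_itemsets(sites, min_support, max_size=2):
    """Sum per-site counts and keep itemsets whose global support is high enough.

    Assumption: sites exchange raw counts, so the merged support is exact.
    """
    total = Counter()
    n_transactions = 0
    for transactions in sites:
        total.update(local_itemset_counts(transactions, max_size))
        n_transactions += len(transactions)
    return {
        itemset: count / n_transactions
        for itemset, count in total.items()
        if count / n_transactions >= min_support
    }


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    Assumes both label lists have the same length and at least two points.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: two sites, each with a few transactions.
sites = [
    [["fever", "cough"], ["fever", "rash"], ["cough"]],
    [["fever", "cough"], ["fever", "cough", "rash"]],
]
merged = integrate_frequent_itemsets(sites, min_support=0.5)
```

In a cluster-preservation check of this kind, `rand_index` would compare cluster labels obtained from the integrated itemsets against labels obtained from the pooled data; a value near 1 indicates that the two clusterings group point pairs the same way.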

Title: Evaluating Cluster Preservation in Frequent Itemset Integration for Distributed Databases
Authors: Sumeet Dua
Michael P. Dessauer
Prerna Sethi
Publication date: 1 October 2011
Publisher: Springer US
Published in: Journal of Medical Systems / Issue 5/2011
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI: https://doi.org/10.1007/s10916-010-9512-1

