skip to main content
research-article

Parallel Colt: A High-Performance Java Library for Scientific Computing and Image Processing

Published:01 September 2010Publication History
Skip Abstract Section

Abstract

Major breakthroughs in chip and software design have been observed for the last nine years. In October 2001, IBM released the world’s first multicore processor: POWER4. Six years later, in February 2007, NVIDIA made a public release of CUDA SDK, a set of development tools to write algorithms for execution on Graphic Processing Units (GPUs). Although software vendors have started working on parallelizing their products, the vast majority of existing code is still sequential and does not effectively utilize modern multicore CPUs and manycore GPUs.

This article describes Parallel Colt, a multithreaded Java library for scientific computing and image processing. In addition to describing the design and functionality of Parallel Colt, a comparison to MATLAB is presented. Two ImageJ plugins for iterative image deblurring and motion correction of PET brain images are described as typical applications of this library. Performance comparisons with MATLAB, including GPU computations via AccelerEyes’ Jacket toolbox are also given.

References

  1. }}AccelerEyes. 2009. Jacket. http://accelereyes.com/.Google ScholarGoogle Scholar
  2. }}Ahmed, N., Natarajan, T., and Rao, K. R. 1974. Discrete cosine transfom. Trans. Comput. C-23, 1, 90--93. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. }}Amerdo, B., Bodnartchouk, V., Caromel, D., Delbé, C., Huet, F., and Taboada, G. L. 2008. Current state of Java for HPC. Tech. rep. inria-00312039, INRIA.Google ScholarGoogle Scholar
  4. }}Anderson, E., Bai, Z., Bischof, C., Blackford, L. S., Demmel, J., Dongarra, J. J., Croz, J. D., Hammarling, S., Greenbaum, A., McKenney, A., and Sorensen, D. 1999. LAPACK Users’ Guide, 3rd Ed. SIAM, Philadelphia, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. }}Arndt, H., Bundschus, M., and Nägele, A. 2009. Towards a next-generation matrix library for Java. In Proceedings of the 33rd Annual IEEE International Computer Software and Applications Conference. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. }}Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., and der Vorst, H. V. 1994. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods 2nd Ed. SIAM, Philadelphia, PA.Google ScholarGoogle Scholar
  7. }}Bertero, M. and Boccacci, P. 1998. Introduction to Inverse Problems in Imaging. IOP Publishing Ltd., London.Google ScholarGoogle Scholar
  8. }}Blackford, L. S., Demmel, J., Dongarra, J., Duff, I., Hammarling, S., Henry, G., Heroux, M., Kaufman, L., Lumsdaine, A., Petitet, A., Pozo, R., Remington, K., and Whaley, R. C. 2002. An updated set of Basic Linear Algebra Subprograms (BLAS). ACM Trans. Math. Softw. 28, 2, 135--151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. }}Bluestein, L. I. 1968. A linear filtering approach to the computation of the discrete Fourier transform. Northeast Electronics Research and Engineering Meeting Record 10, 218--219.Google ScholarGoogle Scholar
  10. }}Buttari, A., Langou, J., Kurzak, J., and Dongarra, J. 2007. A class of parallel tiled linear algebra algorithms for multicore architectures. Tech. rep., Innovative Computing Laboratory.Google ScholarGoogle Scholar
  11. }}Byous, J. 2003. Java technology: The early years. http://java.sun.com/features/1998/05/birthday.html.Google ScholarGoogle Scholar
  12. }}Chung, J., Nagy, J. G., and O’Leary, D. P. 2008. A weighted GCV method for Lanczos hybrid regularization. Elec. Trans. Numer. Anal. 28, 149--167.Google ScholarGoogle Scholar
  13. }}Cooley, J. W. and Tukey, J. W. 1965. An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19, 90, 297--301.Google ScholarGoogle ScholarCross RefCross Ref
  14. }}Dautelle, J.-M. 2007. JScience Project. http://jscience.org/.Google ScholarGoogle Scholar
  15. }}Davis, T. 2009. The University of Florida sparse matrix collection. http://www.cise.ufl.edu/research/sparse/matrices/.Google ScholarGoogle Scholar
  16. }}Davis, T. A. 2006. Direct Methods for Sparse Linear Systems. SIAM, Philadelphia, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. }}Doolin, D. M., Dongarra, J., and Seymour, K. 1999. JLAPACK - Compiling LAPACK Fortran to Java. Sci. Program. 7, 2, 111--138. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. }}Dougherty, R. 2005. Extensions of DAMAS and benefits and limitations of deconvolution in beamforming. In Proceedings of the 11th AIAA/CEAS Aeroacoustics Conference.Google ScholarGoogle ScholarCross RefCross Ref
  19. }}DRA Systems. 2000. OR-Objects. http://opsresearch.com/OR-Objects/.Google ScholarGoogle Scholar
  20. }}Faber, T. L., Raghunath, N., Tudorascu, D., and Votaw, J. R. 2009. Motion correction of PET brain images through deconvolution: I. Theoretical development and analysis in software simulations. Phys. Med. Biol. 54, 3, 797--811.Google ScholarGoogle ScholarCross RefCross Ref
  21. }}Frigo, M. and Johnson, S. G. 2005. The design and implementation of FFTW3. Proc. IEEE 93, 2, 216--231.Google ScholarGoogle ScholarCross RefCross Ref
  22. }}Frigo, M. and Johnson, S. G. 2009. FFT benchmark methodology. http://www.fftw.org/speed/method.html.Google ScholarGoogle Scholar
  23. }}Golub, G. H. and Loan, C. F. V. 1996. Matrix Computations (Johns Hopkins Studies in Mathematical Sciences). The Johns Hopkins University Press.Google ScholarGoogle Scholar
  24. }}Gomez, R. 2009. JQuantLib project. http://www.jquantlib.org/index.php/Main_Page.Google ScholarGoogle Scholar
  25. }}Hansen, P. C. 1997. Rank-Deficient and Discrete Ill-Posed Problems. SIAM, Philadelphia, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. }}Hartley, R. 1942. A more symmetrical Fourier analysis applied to transmission problems. In Proceedings of the Institute of Radio Engineer (IRE).Google ScholarGoogle ScholarCross RefCross Ref
  27. }}Heimsund, B.-O. 2007. Matrix toolkits for Java. http://ressim.berlios.de/.Google ScholarGoogle Scholar
  28. }}Henriksson, J. 2009. Endrov project. http://www.endrov.net/index.php/Main_Page.Google ScholarGoogle Scholar
  29. }}Hicklin, J., Moler, C., Webb, P., Boisvert, R. F., Miller, B., Pozo, R., and Remington, K. 2005. JAMA: A Java matrix package. http://math.nist.gov/javanumerics/jama/.Google ScholarGoogle Scholar
  30. }}Hoschek, W. 2000. Uniform, versatile and efficient dense and sparse multi-dimensional arrays. http://acs.lbl.gov/%7Ehoschek/publications/ACMJava2000.pdf.Google ScholarGoogle Scholar
  31. }}Hoschek, W. 2004. Colt project. http://dsd.lbl.gov/%7Ehoschek/colt/index.html.Google ScholarGoogle Scholar
  32. }}Hudson, M. and Larkin, R. 1994. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans. Med. Imag. 13, 601--609.Google ScholarGoogle ScholarCross RefCross Ref
  33. }}JCuda. 2009. JCuda project. http://www.jcuda.org/jcuda/JCuda.html.Google ScholarGoogle Scholar
  34. }}JUnit Community. 2009. JUnit Project. http://www.junit.org/.Google ScholarGoogle Scholar
  35. }}Kahan, W. and Darcy, J. D. 1998. How Java’s floating-point hurts everyone everywhere. http://www.cs.berkeley.edu/%7Ewkahan/JAVAhurt.pdf.Google ScholarGoogle Scholar
  36. }}Liebke, D. 2009. Incanter project. http://wiki.github.com/liebke/incanter.Google ScholarGoogle Scholar
  37. }}Menke, M. A. and Buckley, K. 1996. Compensation methods for head motion detected during PET imaging. IEEE Trans. Nucl. Sci. 43, 310--317.Google ScholarGoogle ScholarCross RefCross Ref
  38. }}Mcintosh, A. R., Bookstein, F. L., Haxby, J. V., and Grady, C. L. 1996. Spatial pattern analysis of functional brain images using partial least squares. Neuroimage 3, 143--157.Google ScholarGoogle ScholarCross RefCross Ref
  39. }}Messaoudi, C., Boudier, T., Sorzano, C., and Marco, S. 2007. TomoJ: Tomography software for three-dimensional reconstruction in transmission electron microscopy. BMC Bioinf. 8, 1, 288.Google ScholarGoogle ScholarCross RefCross Ref
  40. }}Nagy, J. G., Palmer, K., and Perrone, L. 2004. Iterative methods for image deblurring: A MATLAB object-oriented approach. Numer. Algo. 36, 1, 73--93.Google ScholarGoogle ScholarCross RefCross Ref
  41. }}Nagy, J. G. and StrakošZ. 2000. Enforcing nonnegativity in image reconstruction algorithms. In Proceedings of the Mathematical Modeling, Estimation, and Imaging, D.C. Wilson et al. Ed., Vol. 4121. 182--190.Google ScholarGoogle Scholar
  42. }}NVIDIA Corporation. 2009. CUDA zone. http://www.nvidia.com/object/cuda_home.html.Google ScholarGoogle Scholar
  43. }}Ooura, T. 2006. General purpose FFT (fast Fourier/cosine/sine transform) package. http://www.kurims.kyoto-u.ac.jp/%7Eooura/fft.html.Google ScholarGoogle Scholar
  44. }}Picard, Y. and Thompson, C. J. 1997. Motion correction of PET images using multiple acquisition frames. IEEE Trans. Med. Imag. 16, 137--144.Google ScholarGoogle ScholarCross RefCross Ref
  45. }}Rasband, W. S. 2009. ImageJ, U.S. National Institutes of Health, Bethesda, MD. http://rsb.info.nih.gov/ij/.Google ScholarGoogle Scholar
  46. }}Saad, Y. 1994. ILUT: A dual threshold incomplete ILU factorization. Numer. Linear Algebr. Appl. 1, 387--402.Google ScholarGoogle ScholarCross RefCross Ref
  47. }}Schatzman, J. C. 1996. Accuracy of the discrete Fourier transform and the fast Fourier transform. SIAM J. Sci. Comput. 17, 5, 1150--1166. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. }}Strother, S., McIntosh, A. R., Somji, I., Oder, A., Spreng, N., Waller, J., Wong, F., Wright, D., Yourganov, G., and Zhao, R. 2009. PLS/NPAIRS-J project. http://code.google.com/p/plsnpairs/.Google ScholarGoogle Scholar
  49. }}Strother, S. C., Anderson, J., and Hansen, L. K. 2002. The quantitative evaluation of functional neuroimaging experiments: The NPAIRS. NeuroImage 15, 4, 747--771.Google ScholarGoogle ScholarCross RefCross Ref
  50. }}Sun Microsystems. 2004. New features and enhancements J2SE 5.0. http://java.sun.com/j2se/1.5.0/docs/relnotes/features.html.Google ScholarGoogle Scholar
  51. }}Swarztrauber, P. N. 2004. FFTPACK5. http://www.cisl.ucar.edu/css/software/fftpack5/.Google ScholarGoogle Scholar
  52. }}Vanek, P., Mandel, J., and Brezina, M. 1996. Algebraic multigrid based on smoothed aggregation for second and fourth order problems. Comput. 56, 179--196.Google ScholarGoogle ScholarCross RefCross Ref
  53. }}Verrill, S. 2009. Nonlinear optimization Java package. http://www1.fpl.fs.fed.us/optimization.html.Google ScholarGoogle Scholar
  54. }}Vogel, C. R. 2002. Computational Methods for Inverse Problems. SIAM, Philadelphia, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. }}Wendykier, P. 2009a. CSparseJ project. http://sites.google.com/site/piotrwendykier/software/csparsej.Google ScholarGoogle Scholar
  56. }}Wendykier, P. 2009b. JPlasma project. http://sites.google.com/site/piotrwendykier/software/jplasma.Google ScholarGoogle Scholar
  57. }}Wendykier, P. 2009c. JTransforms project. http://sites.google.com/site/piotrwendykier/software/jtransforms.Google ScholarGoogle Scholar
  58. }}Wendykier, P. 2009d. Parallel Colt project. http://sites.google.com/site/piotrwendykier/software/parallelcolt.Google ScholarGoogle Scholar
  59. }}Wendykier, P. 2009e. Parallel HRRT Deconvolution project. http://sites.google.com/site/piotrwendykier/software/deconvolution/parallelhrrtdeconvolution.Google ScholarGoogle Scholar
  60. }}Wendykier, P. 2009f. Parallel iterative deconvolution project. http://sites.google.com/site/piotrwendykier/software/deconvolution/paralleliterativedeconvolution.Google ScholarGoogle Scholar
  61. }}Whaley, R. C. and Dongarra, J. J. 1998. Automatically tuned linear algebra software ATLAS. In Proceedings of Supercomputing Conference. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. }}Yip, P. and Rao, K. R. 1980. A fast computational algorithm for the discrete sine transform. IEEE Trans. Comm. 28, 2, 304 -- 307.Google ScholarGoogle ScholarCross RefCross Ref
  63. }}Zhang, B. 2005. Java FFTPack project. http://jfftpack.sourceforge.net/.Google ScholarGoogle Scholar

Index Terms

  1. Parallel Colt: A High-Performance Java Library for Scientific Computing and Image Processing

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Mathematical Software
        ACM Transactions on Mathematical Software  Volume 37, Issue 3
        September 2010
        296 pages
        ISSN:0098-3500
        EISSN:1557-7295
        DOI:10.1145/1824801
        Issue’s Table of Contents

        Copyright © 2010 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 September 2010
        • Accepted: 1 April 2010
        • Revised: 1 January 2010
        • Received: 1 December 2008
        Published in toms Volume 37, Issue 3

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader