A Brief of Distributed Data Processing


  • Clemente Izurieta, Gianforte School of Computing, Montana State University
  • Nate Woods, Gianforte School of Computing, Montana State University
  • Ann Marie Reinhold, Gianforte School of Computing, Montana State University


mapping study, data processing, distributed processing systems


Distributed data processing is a cornerstone of modern cloud and edge computing environments because it can handle volumes of information that would overwhelm a single computer. However, the ontogeny of research in the field of distributed data processing remains poorly characterized. Therefore, we reviewed 70 publications discussing distributed data processing. Distributed processing systems are an active area of research, with the number of publications increasing since the early 2000s. The most salient topics in distributed processing systems were associated with system architecture and programming paradigms. However, researchers lack standard metrics for reporting throughput, which hampers comparison of existing studies. This study is a first step toward characterizing this field of research and identifying important areas of opportunity.

