Distributed, Parallel, and Cluster Computing · Computer Science
A Benchmarking Study to Evaluate Apache Spark on Large-Scale Supercomputers
George K. Thiruvathukal, Cameron Christensen, Xiaoyong Jin, François Tessier +1
2019-10-01
Distributed, Parallel, and Cluster Computing · Computer Science
Declarative Data Pipeline for Large Scale ML Services
Yunzhao Yang, Runhui Wang, Xuanqing Liu, Adit Krishnan +13
2025-11-07
Distributed, Parallel, and Cluster Computing · Computer Science
High-Dimensional Data Processing: Benchmarking Machine Learning and Deep Learning Architectures in Local and Distributed Environments
Julian Rodriguez, Piotr Lopez, Emiliano Lerma, Rafael Medrano +1
2025-12-12
Databases · Computer Science
A Big Data Analysis Framework Using Apache Spark and Deep Learning
Anand Gupta, Hardeo Thakur, Ritvik Shrivastava, Pulkit Kumar +1
2017-11-28
Distributed, Parallel, and Cluster Computing · Computer Science
Real-time Text Analytics Pipeline Using Open-source Big Data Tools
Hassan Nazeer, Waheed Iqbal, Fawaz Bokhari, Faisal Bukhari +1
2017-12-13
Distributed, Parallel, and Cluster Computing · Computer Science
Reproducible Experiments for Comparing Apache Flink and Apache Spark on Public Clouds
Shelan Perera, Ashansa Perera, Kamal Hakimzadeh
2016-10-17
Databases · Computer Science
On the Evaluation of RDF Distribution Algorithms Implemented over Apache Spark
Olivier Curé, Hubert Naacke, Mohamed-Amine Baazizi, Bernd Amann
2015-07-10
Distributed, Parallel, and Cluster Computing · Computer Science
Distributed Streaming Analytics on Large-scale Oceanographic Data using Apache Spark
Janak Dahal, Elias Ioup, Shaikh Arifuzzaman, Mahdi Abdelguerfi
2019-08-02
Distributed, Parallel, and Cluster Computing · Computer Science
A Spark ML driven preprocessing approach for deep learning based scholarly data applications
Samiya Khan, Xiufeng Liu, Mansaf Alam
2019-11-19
Distributed, Parallel, and Cluster Computing · Computer Science
Mining Area Skyline Objects from Map-based Big Data using Apache Spark Framework
Chen Li, Ye Zhu, Yang Cao, Jinli Zhang +3
2024-04-05
Machine Learning · Computer Science
Imbalanced Big Data Oversampling: Taxonomy, Algorithms, Software, Guidelines and Future Directions
William C. Sleeman, Bartosz Krawczyk
2022-11-16
Databases · Computer Science
InferSpark: Statistical Inference at Scale
Zhuoyue Zhao, Jialing Pei, Eric Lo, Kenny Q. Zhu +1
2017-10-10
Distributed, Parallel, and Cluster Computing · Computer Science
Nuova frontiera della classificazione testuale: Big data e calcolo distribuito [New Frontier of Text Classification: Big Data and Distributed Computing]
Marco Covelli, Massimiliano Morrelli
2019-08-22
Databases · Computer Science
Benchmarking Distributed Stream Data Processing Systems
Jeyhun Karimov, Tilmann Rabl, Asterios Katsifodimos, Roman Samarev +2
2019-06-27
Distributed, Parallel, and Cluster Computing · Computer Science
Understanding and Optimizing the Performance of Distributed Machine Learning Applications on Apache Spark
Celestine Dünner, Thomas Parnell, Kubilay Atasu, Manolis Sifalakis +1
2018-06-21
Distributed, Parallel, and Cluster Computing · Computer Science
A Survey on Spark Ecosystem for Big Data Processing
Shanjiang Tang, Bingsheng He, Ce Yu, Yusen Li +1
2020-12-17
Distributed, Parallel, and Cluster Computing · Computer Science
Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics
Matteo Migliorini, Riccardo Castellotti, Luca Canali, Marco Zanetti
2020-06-17
Artificial Intelligence · Computer Science
Large-Scale Intelligent Microservices
Mark Hamilton, Nick Gonsalves, Christina Lee, Anand Raman +7
2022-03-17
Distributed, Parallel, and Cluster Computing · Computer Science
Rethinking Storage Management for Data Processing Pipelines in Cloud Data Centers
Ubaid Ullah Hafeez, Martin Maas, Mustafa Uysal, Richard McDougall
2022-11-07
Machine Learning · Computer Science
MLlib: Machine Learning in Apache Spark
Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks +12
2015-05-27