A Distributed Learning Architecture for Scientific Imaging Problems

doi:10.48550/arXiv.1809.05956

Current trends in scientific imaging are challenged by the emerging need to integrate sophisticated machine learning with Big Data analytics platforms. This work proposes an in-memory distributed learning architecture that enables sophisticated learning and optimization techniques on scientific imaging problems, which are characterized by the combination of variant information from different origins. We apply the resulting Spark-compliant architecture to two emerging use cases from the scientific imaging domain: (a) the space-variant deconvolution of galaxy imaging surveys (astrophysics), and (b) super-resolution based on coupled dictionary training (remote sensing). We conduct evaluation studies on relevant datasets, and the results report at least a 60% improvement in response time over conventional computing solutions. Finally, the discussion offers practical insights into the impact of key Spark tuning parameters on the achieved speedup and on the memory/disk footprint.

Publication:

arXiv e-prints

Pub Date:

September 2018

DOI:

10.48550/arXiv.1809.05956

arXiv:

arXiv:1809.05956

Bibcode:

2018arXiv180905956P

Keywords:

Computer Science - Distributed, Parallel, and Cluster Computing
