no code implementations • 12 Dec 2017 • Salman Salloum, Yulin He, Joshua Zhexue Huang, Xiaoliang Zhang, Tamer Z. Emara, Chenghao Wei, Heping He
In this paper, we propose the random sample partition (RSP) data model to represent a big data set as a set of non-overlapping data subsets, called RSP data blocks, where each RSP data block has a probability distribution similar to the whole big data set.