DisCo: Distributed Co-clustering with Map-Reduce: A Case Study towards Petabyte-Scale End-to-End Mining, 2008, ICDM
PDF http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.163.1677&rep=rep1&type=pdf
Huge datasets are becoming prevalent; even as researchers, we now routinely have to work with datasets that are up to a few terabytes in size. Interesting real-world applications produce huge volumes of messy data. The mining process involves several steps, starting from pre-processing the raw data to estimating the final models. As data become more abundant, scalable and easy-to-use tools for distributed processing are also emerging. Among those, Map-Reduce has been widely embraced by both academia and industry.
gnuplotでeps
12 年前
0 件のコメント:
コメントを投稿