鈴村グループ (旧・鈴村研究室) : A Code Generation Approach to Optimizing High-Performance Distributed Data Stream Processing

2010年1月5日火曜日

A Code Generation Approach to Optimizing High-Performance Distributed Data Stream Processing

現在の System S/SPADE の中身を最も忠実に反映している論文です。

A Code Generation Approach to Optimizing High-Performance Distributed Data Stream Processing(PDF) , Bu˘ gra Gedik

ACM CIKM 2009 (http://www.comp.polyu.edu.hk/conference/cikm2009/)
Conference on Information and Knowledge Management archive
Proceeding of the 18th ACM conference on Information and knowledge management table of contents

We present a code-generation-based optimization approach to　bringing performance and scalability to distributed stream processing　applications. We express stream processing applications using　an operator-based, stream-centric language called SPADE, which　supports composing distributed data flow graphs out of toolkits of　type-generic operators. A major challenge in building such applications　is to find an effective and flexible way of mapping the logical graph of operators into a physical one that can be deployed on a set of distributed nodes. This involves finding how best operators map to processes and how best processes map to computing nodes. In this paper, we take a two-stage optimization approach, where an instrumented version of the application is first generated by the SPADE compiler to profile and collect statistics about the processing and communication characteristics of the operators within the application. In the second stage, the profiling information is fed to an optimizer to come up with a physical data flow graph that is deployable across nodes in a computing cluster. This approach not only creates highly optimized applications that are tailored to the underlying computing and networking infrastructure, but also makes it possible to re-target the application to a different hardware setup by simply repeating the optimization step and re-compiling
the application to match the physical flow graph produced by the optimizer. Using real-world applications, from diverse domains such as finance and radio-astronomy, we demonstrate the effectiveness of our approach on System S – a large-scale, distributed stream
processing platform.

鈴村グループ (旧・鈴村研究室)

2010年1月5日火曜日

A Code Generation Approach to Optimizing High-Performance Distributed Data Stream Processing

0 件のコメント:

コメントを投稿

メニュー

マイブログリスト

フォロワー

ブログアーカイブ

参加ユーザー

鈴村グループ (旧・鈴村研究室)

2010年1月5日火曜日

A Code Generation Approach to Optimizing High-Performance Distributed Data Stream Processing

0 件のコメント:

コメントを投稿

メニュー

マイブログ リスト

フォロワー

ブログ アーカイブ

参加ユーザー

マイブログリスト

ブログアーカイブ