Hadoop setup
- CentOS 5.6, Java SE 6 Update 27, Hadoop 0.20.203.0
- For installation and setup I followed the page below
- http://d.hatena.ne.jp/yokkuns/20110426/1303781488
- Pseudo-distributed mode on a single physical machine
- The CPU is an AMD Athlon 64 X2
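Running the Pi calculation example
- The CPU has 2 cores, so I expected 2 processes to be the fastest and tried it out
- The timings vary more than I expected, so it would probably be better to measure each setting 10 times and take the average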
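The "measure 10 times and average" idea could be sketched roughly as follows. This is only a minimal sketch, not something that was run for this note: it assumes Python 3.5+ is available on the machine, that it is started from the same directory as the `./hadoop` command shown in the runs, and that the job prints a `Job Finished in ... seconds` line as in those runs. The names `parse_elapsed`, `run_pi`, `average_elapsed` and the `runner` parameter are made up for illustration.

```python
import re
import subprocess
from statistics import mean

# Matches the "Job Finished in 39.313 seconds" line printed by the pi example.
ELAPSED_RE = re.compile(r"Job Finished in ([0-9.]+) seconds")


def parse_elapsed(output):
    """Return the elapsed seconds reported in the job output, or None if absent."""
    match = ELAPSED_RE.search(output)
    return float(match.group(1)) if match else None


def run_pi(maps, samples=10000):
    """Run the pi example once and return the elapsed seconds it reports."""
    cmd = ["./hadoop", "jar", "../hadoop-examples-0.20.203.0.jar",
           "pi", str(maps), str(samples)]
    # Merge stderr into stdout so the timing line is found wherever it is printed.
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_elapsed(result.stdout)


def average_elapsed(maps, repeats=10, runner=run_pi):
    """Run the job `repeats` times and return the mean elapsed seconds.

    `runner` is a hypothetical hook so the averaging can be tried without Hadoop.
    Runs whose output has no timing line are skipped.
    """
    times = [runner(maps) for _ in range(repeats)]
    times = [t for t in times if t is not None]
    return mean(times) if times else None


# Example call (commented out because it needs a working Hadoop install):
# for maps in (1, 2, 4):
#     print(maps, average_elapsed(maps))
```

Reading the time from the job's own output rather than timing the whole command keeps the numbers comparable with the single-run figures in this note.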
1 process
$ ./hadoop jar ../hadoop-examples-0.20.203.0.jar pi 1 10000
Number of Maps = 1
Samples per Map = 10000
Wrote input for Map #0
Starting Job
< omitted >
Job Finished in 39.313 seconds
Estimated value of Pi is 3.14080000000000000000
2 processes
$ ./hadoop jar ../hadoop-examples-0.20.203.0.jar pi 2 10000
Number of Maps = 2
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Starting Job
< omitted >
Job Finished in 38.043 seconds
Estimated value of Pi is 3.14280000000000000000
4 processes
$ ./hadoop jar ../hadoop-examples-0.20.203.0.jar pi 4 10000
Number of Maps = 4
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Starting Job
< omitted >
Job Finished in 42.332 seconds
Estimated value of Pi is 3.14140000000000000000
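To put the three timings side by side, the small sketch below computes the speedup relative to the 1-map run. The numbers are copied from the runs above; the variable names are only illustrative. Each setting was measured once, so the ratios are rough at best. They may simply reflect fixed job startup cost at this sample size, which would need repeated runs to confirm.

```python
# Elapsed seconds copied from the three runs above, keyed by number of maps.
elapsed = {1: 39.313, 2: 38.043, 4: 42.332}

# Speedup relative to the 1-map run (values above 1.0 mean faster than 1 map).
baseline = elapsed[1]
speedup = {maps: baseline / secs for maps, secs in elapsed.items()}

for maps in sorted(speedup):
    print("maps=%d elapsed=%.3fs speedup=%.2fx"
          % (maps, elapsed[maps], speedup[maps]))
```

With these figures the 2-map run comes out only about 3% faster than 1 map, and the 4-map run is slower than both.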