Advances, Systems and Applications
From: Performance characterization and analysis for Hadoop K-means iteration
Workloads | System Resource Utilization |
---|---|
WordSort | Sort phase: IO-bound. Reduce phase: communication-bound. |
Word Count | CPU-bound |
TeraSort | Map stage: CPU-bound. Reduce stage: IO-bound. |
NutchIndexing | IO-bound, with high CPU utilization in the map stage. This workload is mainly used for web searching. |
K-means | CPU-bound in the iteration stage, IO-bound in the clustering stage. It is used for machine learning and data mining. |