Advances, Systems and Applications

Table 1 Workloads based on Hadoop framework: System Resource Utilization

From: Performance characterization and analysis for Hadoop K-means iteration

| Workloads | System Resource Utilization |
|---|---|
| WordSort | Sort phase: IO-bound. Reduce phase: communication-bound. |
| Word Count | CPU-bound. |
| TeraSort | Map stage: CPU-bound. Reduce stage: IO-bound. |
| NutchIndexing | IO-bound, with high CPU utilization in the map stage. This workload is mainly used for web searching. |
| K-means | CPU-bound in the iteration, IO-bound in the clustering. It is used for machine learning and data mining. |