Advances, Systems and Applications

Table 5 Number of MapReduce tasks and sizes of HDFS files read and written by Hive and SharedHive (S.Hive) for 100 GB TPC-H data warehouse query sets with different correlation levels

From: Improving the performance of Hadoop Hive by sharing scan and computation tasks

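| Query set (correlation level) | # Map tasks (Hive) | # Map tasks (S.Hive) | # Reduce tasks (Hive) | # Reduce tasks (S.Hive) | Read (GB) (Hive) | Read (GB) (S.Hive) | Written (GB) (Hive) | Written (GB) (S.Hive) |
|---|---|---|---|---|---|---|---|---|
| 11,12 (set 1) (none) | 463 | 463 | 116 | 116 | 123 | 123 | 20 | 20 |
| 6,17 (set 5) (partial) | 1,568 | 663 | 326 | 89 | 51,561 | 51,561 | 504,975 | 34,806 |
| 14,19 (set 7) (full) | 668 | 334 | 280 | 84 | 356 | 171 | 168 | 2 |
| 1,3,11,14,17,19 (set 11) (mixed) | 2,149 | 1,150 | 452 | 288 | 636 | 404 | 291 | 206 |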
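The task counts in Table 5 can be turned into relative savings with a little arithmetic. The following is a minimal sketch, not part of the original article: the helper `percent_reduction` and the dictionary names are hypothetical, and only the map and reduce task counts are copied from the table.

```python
# Task counts copied from Table 5 as (Hive, SharedHive) pairs per query set.
# The dictionary and function names are illustrative assumptions.
MAP_TASKS = {
    "set 1 (none)": (463, 463),
    "set 5 (partial)": (1568, 663),
    "set 7 (full)": (668, 334),
    "set 11 (mixed)": (2149, 1150),
}

REDUCE_TASKS = {
    "set 1 (none)": (116, 116),
    "set 5 (partial)": (326, 89),
    "set 7 (full)": (280, 84),
    "set 11 (mixed)": (452, 288),
}


def percent_reduction(hive: int, shared_hive: int) -> float:
    """Return the percentage of tasks saved by SharedHive relative to Hive."""
    return (hive - shared_hive) / hive * 100.0


if __name__ == "__main__":
    for query_set in MAP_TASKS:
        map_saving = percent_reduction(*MAP_TASKS[query_set])
        reduce_saving = percent_reduction(*REDUCE_TASKS[query_set])
        print(f"{query_set}: map tasks -{map_saving:.1f}%, "
              f"reduce tasks -{reduce_saving:.1f}%")
```

Under these numbers, the uncorrelated set shows no saving, while the fully correlated set halves the number of map tasks.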