This is a simple test comparing Presto and Impala, two typical in-memory SQL engines of the same kind as Spark SQL. Engines like these show their advantages when the data volume is large and queries join multiple tables. Below is a set of performance comparisons between Impala and Presto:

Environment: 1 machine with 32 GB of memory and 2 machines with 16 GB of memory; memory was not configured to full saturation

Test data: 3 Hive tables with 20 million rows each

Cluster: Impala and Presto are both deployed on the 3 machines

Presto version: presto-server-0.191
(Presto installation:

Impala version: 2.8.0-cdh5.11.0

1. Single-table aggregation


1 s (Presto currently reports elapsed time only to whole seconds, so anything under 1 s is also shown as 1 s)




Three runs: 4, 3, 3 (s)


Three runs: 0.74, 0.75, 0.76 (s)
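Because Presto reports elapsed time only to whole seconds, one way to get comparable sub-second numbers for both engines is to time each query on the client side. The following is a minimal sketch, not the method used for the results in this post: `time_runs` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that you would replace with a call that submits the SQL through your own Presto or Impala client. Client-side timing also includes network and client overhead, so it will read slightly higher than the engine's own figure.

```python
import statistics
import time
from typing import Callable, List


def time_runs(run_query: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call run_query `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def fake_query() -> None:
    # Hypothetical stand-in for a real query call; replace with your client code.
    time.sleep(0.01)


if __name__ == "__main__":
    runs = time_runs(fake_query, repeats=3)
    print("runs (s):", [round(d, 3) for d in runs])
    print("mean (s):", round(statistics.mean(runs), 3))
```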

2. Single-value query

Presto: query the record for a single ID

Three runs: 6, 5, 6 (s)


All three runs took about 1.7 s

3. Two-table join (joining 2 tables of 20 million rows each)


Three runs: 9, 11, 9 (s)


All three runs took about 7 s

4. Three-table join (joining 3 tables of 20 million rows each)


Four runs: 13, 11, 15, 12 (s)


All three runs took about 8.9 s
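To make the four scenarios easier to compare, the timings reported above can be averaged and expressed as a ratio. The sketch below rests on two assumptions that the text does not state outright: in each scenario the first set of timings is taken to be Presto and the second to be Impala, and where only "about X s" is given, X is used for every run. The standalone 1 s figure in section 1 is left out because it is not clear which query it belongs to.

```python
from statistics import mean

# Timings in seconds, copied from the runs reported above.
# ASSUMPTION: the first set in each scenario is Presto, the second is Impala.
# ASSUMPTION: "about X s" is recorded as X for each of the three runs.
TIMINGS = {
    "single-table aggregation": {"presto": [4, 3, 3], "impala": [0.74, 0.75, 0.76]},
    "single-value query": {"presto": [6, 5, 6], "impala": [1.7, 1.7, 1.7]},
    "two-table join": {"presto": [9, 11, 9], "impala": [7, 7, 7]},
    "three-table join": {"presto": [13, 11, 15, 12], "impala": [8.9, 8.9, 8.9]},
}


def summarize(timings):
    """Return (scenario, presto_mean, impala_mean, presto/impala ratio) rows."""
    rows = []
    for scenario, engines in timings.items():
        presto_mean = mean(engines["presto"])
        impala_mean = mean(engines["impala"])
        rows.append((scenario, presto_mean, impala_mean, presto_mean / impala_mean))
    return rows


if __name__ == "__main__":
    for scenario, p, i, ratio in summarize(TIMINGS):
        print(f"{scenario:26s} presto={p:6.2f}s  impala={i:6.2f}s  ratio={ratio:4.1f}x")
```

Under these assumptions the gap is widest on the single-table queries and narrows on the joins.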

Summary: This compares query efficiency in only a few scenarios, and the data volume is not large, but some issues are already visible. What the two engines have in common is heavy memory consumption; with enough memory and an appropriately sized cluster, performance should of course be better. As the results above show, Impala is slightly ahead of Presto in performance, but Presto supports a rich set of data sources, including Hive, graph databases, traditional relational databases, Redis, and so on.

Drawbacks: Neither engine supports HBase well, and Presto does not support it at all. Both are highly compatible with HDFS and Hive, which is only natural. How the data sources are handled is therefore very important. For secondary-index queries on HBase, Phoenix is also a decent option.