Here is a simple test comparison of Presto and Impala, two typical in-memory SQL query engines, similar in kind to Spark SQL. Engines like these show their advantages when the data volume is large and queries join multiple tables. Below is a set of performance comparisons between Impala and Presto:



Environment: 1 machine with 32 GB of memory and 2 machines with 16 GB each; memory was not allocated to full capacity

Test data: 3 Hive tables with 20 million rows each

Cluster: Impala and Presto were both deployed on the 3 machines



Presto version: presto-server-0.191
(Presto installation: http://blog.csdn.net/u012551524/article/details/79013194)

Impala version: 2.8.0-cdh5.11.0




1. Single-table aggregation




Presto: count







1s (Presto currently reports times only to whole seconds, so anything under 1s is also shown as 1s)




Impala: count




0.24s




Presto: count distinct





Three runs: 4, 3, 3 (s)







Impala: count distinct




Three runs: 0.74, 0.75, 0.76 (s)
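Because Presto reports only whole seconds, sub-second queries are easier to compare when timed from the client side. The following is a minimal sketch of such a repeat-and-time loop, not the method used for the numbers above. `time_runs` and `fake_query` are hypothetical names; `fake_query` only sleeps and stands in for whatever call actually submits a query to Presto or Impala.

```python
import time
from statistics import mean


def time_runs(run_query, runs=3):
    """Call run_query() `runs` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def fake_query():
    # Hypothetical stand-in for submitting a query to an engine; it only sleeps.
    time.sleep(0.01)


timings = time_runs(fake_query, runs=3)
print([round(t, 3) for t in timings], "mean:", round(mean(timings), 3))
```

Client-side timing includes network and result-fetch overhead, so it will not match the engine's own reported time exactly.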




2. Single-value query



Presto: query the records for a single ID


Three runs: 6, 5, 6 (s)




Impala:




All three runs took about 1.7s




3. Two-table join (joining 2 tables of 20 million rows each)

Presto:




Three runs: 9, 11, 9 (s)




Impala:




All three runs took about 7s




4. Three-table join (joining 3 tables of 20 million rows each)

Presto:




Four runs: 13, 11, 15, 12 (s)




Impala:







All three runs took about 8.9s





Summary: This compares query efficiency in only a few scenarios, and the data volume is not large, but some patterns are still visible. What the two engines have in common is that they are memory-hungry; with enough memory and a cluster of appropriate size, performance should be better. As the results above show, Impala is slightly ahead of Presto in performance. Presto, however, supports a rich set of data sources, including Hive, graph databases, traditional relational databases, Redis, and others.
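To put the individual runs side by side, the reported timings can be averaged per test and expressed as a Presto-to-Impala ratio. This is only a sketch over the figures listed above, with two assumptions: the Presto count time is taken as 1s even though the true value was below that, and the Impala results given as "about X" are treated as single values. The `results` and `summarize` names are illustrative.

```python
from statistics import mean

# Timings in seconds, copied from the runs reported above.
# Assumptions: Presto "count" is entered as 1 (displayed rounded up to 1s),
# and Impala figures reported as "about X" are entered as one value each.
results = {
    "count": {"presto": [1], "impala": [0.24]},
    "count distinct": {"presto": [4, 3, 3], "impala": [0.74, 0.75, 0.76]},
    "single-value query": {"presto": [6, 5, 6], "impala": [1.7]},
    "two-table join": {"presto": [9, 11, 9], "impala": [7]},
    "three-table join": {"presto": [13, 11, 15, 12], "impala": [8.9]},
}


def summarize(results):
    """Return (test name, Presto mean, Impala mean, Presto/Impala ratio) per test."""
    rows = []
    for name, timings in results.items():
        presto_mean = mean(timings["presto"])
        impala_mean = mean(timings["impala"])
        rows.append((name, presto_mean, impala_mean, presto_mean / impala_mean))
    return rows


summary = summarize(results)
for name, presto_mean, impala_mean, ratio in summary:
    print(f"{name:20s} presto={presto_mean:6.2f}s impala={impala_mean:5.2f}s ratio={ratio:4.2f}x")
```

With so few runs per test, and whole-second rounding on the Presto side, these ratios are rough indications rather than precise measurements.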




Drawbacks: Neither engine has good support for HBase, and Presto does not support it at all. Both are well compatible with HDFS and Hive, which is only natural, so how the data source is handled matters a great deal. For secondary-index queries on HBase, Phoenix is also a decent option.