Big data ecosystem
1: Storage
Hadoop HDFS
2: Compute engines
MapReduce v1
MapReduce v2 (MapReduce on YARN)
Tez
Spark
3: Impala, Presto, and Drill run directly on HDFS
Pig (script-based) and Hive (SQL language) run on MapReduce
Hive on Tez / Spark SQL
4: Stream processing - Storm
5: KV store
Cassandra, MongoDB, HBase
6: TensorFlow, Mahout
7: ZooKeeper, Protobuf
8: Sqoop, Kafka, Flume
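
As a rough illustration of the map/reduce model listed under the compute engines, here is a minimal in-memory word count in plain Python. It does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are made up for this sketch, and a real job would run the same three steps distributed across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all emitted values by key, as the framework
    # would do between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce step: collapse the values for one key into a single count.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, shuffle the pairs, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input, chosen only for illustration.
counts = word_count(["hdfs spark hive", "spark storm", "hive spark"])
print(counts)
```

Pig and Hive, as noted above, generate jobs of this shape from a script or SQL query, so the map and reduce functions do not have to be written by hand.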