Big Data Ecosystem

1: Storage

Hadoop HDFS

2: Compute engines

Map/Reduce v1

Map/Reduce v2 (Map/Reduce on YARN)

Tez

Spark
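
As a rough illustration of the map/reduce model that the engines above implement, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real engine distributes each phase across many machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three steps in order on an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

Pig scripts and Hive queries are compiled down to jobs of roughly this shape when they run on Map/Reduce.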

3: Impala, Presto, Drill (run directly on HDFS)

Pig (scripting), Hive (SQL) (run on Map/Reduce)

Hive on Tez / Spark SQL

4: Stream processing: Storm

5: KV stores

Cassandra, MongoDB, HBase

6: TensorFlow, Mahout

7: ZooKeeper, Protobuf

8: Sqoop, Kafka, Flume