kettle_Hbase
☀ HBase Study Notes
Read a file from HDFS and save the rows whose sal is greater than 1000 to HBase
Prerequisites:
1. Configure the Hadoop connection: copy /usr/local/soft/hbase-1.4.6/conf/hbase-site.xml from the cluster into the following Kettle directory:
Kettle\pdi-ce-8.2.0.0-342\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp26
2. In the Hadoop Cluster configuration, set the ZooKeeper hostname to master and the port to 2181
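The ZooKeeper host and port entered in Kettle should agree with what the copied hbase-site.xml declares. As a rough sketch of how to check this, the snippet below reads the two standard HBase properties `hbase.zookeeper.quorum` and `hbase.zookeeper.property.clientPort` from the XML text. The sample file content and the helper name `zookeeper_settings` are assumptions made for illustration; a real hbase-site.xml will contain more properties.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml content, used only for illustration.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
"""


def zookeeper_settings(xml_text):
    """Return (quorum, client_port) read from hbase-site.xml text."""
    root = ET.fromstring(xml_text)
    # Collect every <property> as a name -> value pair.
    props = {}
    for prop in root.findall("property"):
        props[prop.findtext("name")] = prop.findtext("value")
    quorum = props.get("hbase.zookeeper.quorum")
    # 2181 is the default client port when the property is absent.
    port = int(props.get("hbase.zookeeper.property.clientPort", "2181"))
    return quorum, port


if __name__ == "__main__":
    print(zookeeper_settings(SAMPLE_HBASE_SITE))
```

To check a real file, pass the contents of the copied hbase-site.xml instead of the sample string and compare the result with the values typed into the Hadoop Cluster dialog.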
1. Create a people table in HBase with a single column family, info
hbase(main):004:0> create 'people','info'
2. Build the transformation in Kettle with the following steps
- Text file input
- Configure the Filter rows step
- Configure the HBase output step
Edit the Hadoop connection and set the ZooKeeper address
- Run the transformation
- Check the data in the HBase people table
scan 'people'
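For readers who want to see the logic of the transformation outside of Kettle, here is a minimal Python sketch of the same three stages: read delimited text, keep rows with sal greater than 1000, and map each remaining row to HBase-style cells in the `info` column family. The sample data, the `name` column, and the choice of `name` as the row key are assumptions; the source does not show the real input file or the key mapping. Nothing here talks to HDFS or HBase, it only mirrors what the steps compute.

```python
import csv
import io

# Hypothetical sample input; the real HDFS file is not shown in the notes.
SAMPLE_TEXT = """name,sal
alice,800
bob,1500
carol,1000
dave,3200
"""


def filter_rows(text, threshold=1000):
    """Keep rows whose sal is strictly greater than the threshold."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if float(row["sal"]) > threshold]


def to_hbase_cells(rows, family="info", key_field="name"):
    """Map each row to (row_key, 'family:qualifier', value) cells."""
    cells = []
    for row in rows:
        row_key = row[key_field]  # assumed row key column
        for field, value in row.items():
            if field != key_field:
                cells.append((row_key, f"{family}:{field}", value))
    return cells


if __name__ == "__main__":
    for cell in to_hbase_cells(filter_rows(SAMPLE_TEXT)):
        print(cell)
```

Note that the comparison is strict, so a row with sal exactly 1000 is dropped, which matches the "greater than 1000" condition stated at the top of these notes.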
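Note: if the transformation fails with an error about lacking permission to write files to HDFS, add the following parameters at line 119 of Spoon.bat: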
"-DHADOOP_USER_NAME=root" "-Dfile.encoding=UTF-8"