Detailed Steps for Integrating Ranger with CDH 6.3.1

CDH-ranger

Base environment:

jdk: 1.8

maven: mvn -version

Apache Maven 3.9.4

brew search bzip2

brew install bzip2

brew list

yum install git -y

yum install -y python3

yum install -y bzip2

yum -y install fontconfig-devel

pip3 install requests

Download the Ranger source code:

git clone --branch release-ranger-2.1.0 https://github.com/apache/ranger.git

########################################POM file changes##################################

Modify the Ranger POM file:

vi pom.xml

1) Add the following to the repositories section to speed up dependency resolution:

    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>

2) Change the component versions to the CDH builds:

<hadoop.version>3.0.0-cdh6.3.1</hadoop.version>

<hbase.version>2.1.0-cdh6.3.1</hbase.version>

<hive.version>2.1.1-cdh6.3.1</hive.version>

<kafka.version>2.2.1-cdh6.3.1</kafka.version>

<solr.version>7.4.0-cdh6.3.1</solr.version>

<zookeeper.version>3.4.5-cdh6.3.1</zookeeper.version>

The main changes cover hadoop, kafka, hbase, and so on. Change only the properties for the components you actually need.
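The property edits above can also be scripted. A minimal sketch, assuming the property names shown earlier; the tiny pom.xml created below is fabricated for illustration and stands in for Ranger's real parent pom:

```shell
# Create a tiny stand-in pom.xml (illustration only; the real file is Ranger's parent pom)
cat > pom.xml <<'EOF'
<properties>
    <hadoop.version>3.1.1</hadoop.version>
    <hive.version>3.1.2</hive.version>
</properties>
EOF

# Rewrite each <component.version> property to its CDH 6.3.1 build
sed -i 's|<hadoop.version>.*</hadoop.version>|<hadoop.version>3.0.0-cdh6.3.1</hadoop.version>|' pom.xml
sed -i 's|<hive.version>.*</hive.version>|<hive.version>2.1.1-cdh6.3.1</hive.version>|' pom.xml

grep 'cdh6.3.1' pom.xml
```

The same pattern extends to the hbase, kafka, solr, and zookeeper properties.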

3) Update the corresponding Elasticsearch version:

<elasticsearch.version>7.13.0</elasticsearch.version>

Hive version compatibility

Apache Ranger 2.1.0 targets Hive 3.1.2, while CDH 6.3.1 ships Hive 2.1.1. The two are incompatible, and the Hive server fails to start.

1. Download Apache Ranger 1.2.0: git clone --branch release-ranger-1.2.0 https://github.com/apache/ranger.git

2. Delete the hive-agent plugin from the Apache Ranger 2.1.0 tree: rm -rf ./ranger2.1/hive-agent

3. Copy the Apache Ranger 1.2.0 hive-agent plugin into the Apache Ranger 2.1.0 directory: cp -r ./ranger1.2/hive-agent ./ranger2.1/

4. Replace the pom under hive-agent with the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0">
        <modelVersion>4.0.0</modelVersion>
        <artifactId>ranger-hive-plugin</artifactId>
        <name>Hive Security Plugin</name>
        <description>Hive Security Plugins</description>
        <packaging>jar</packaging>
        <!-- ... -->
    </project>
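Before touching the real trees, steps 1-3 above can be dry-run with placeholder directories. A minimal sketch; the VERSION marker file is fabricated purely to make the swap visible:

```shell
# Stand-in directories for the two source trees (illustration only)
mkdir -p ranger1.2/hive-agent ranger2.1/hive-agent
echo "1.2.0" > ranger1.2/hive-agent/VERSION

# Step 2: remove the Ranger 2.1.0 hive-agent
rm -rf ./ranger2.1/hive-agent

# Step 3: copy the Ranger 1.2.0 hive-agent into the 2.1.0 tree
cp -r ./ranger1.2/hive-agent ./ranger2.1/

cat ranger2.1/hive-agent/VERSION
```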

This replacement involves three main changes:

1. Replace the hive-related versions with 2.1.1-cdh6.3.1 (adjust as needed). Without this, hive-3.1.2 is used by default; hive.version is already set to 2.1.1-cdh6.3.1 in Ranger's parent pom, but that setting does not take effect here.

2. Add the repository address. The same configuration was already added to the Ranger parent pom, but it was not actually picked up, so repeat it here:

    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>

3. In the newly copied hive-agent directory's pom.xml, change the ranger version to 2.1.0:

vim ./ranger-2.1/hive-agent/pom.xml
<parent>
    <groupId>org.apache.ranger</groupId>
    <artifactId>ranger</artifactId>
    <version>2.1.0</version>
    <relativePath>..</relativePath>
</parent>

Add the ranger-plugins-common dependency

Add the following dependency below it; otherwise the code added above cannot be resolved.
    <dependency>
        <groupId>org.apache.ranger</groupId>
        <artifactId>ranger-plugins-common</artifactId>
        <version>2.1.1-SNAPSHOT</version>
        <scope>compile</scope>
    </dependency>

Kylin plugin POM changes

1) Modify ./ranger-2.1/ranger-kylin-plugin-shim/pom.xml:

    <dependency>
        <groupId>org.apache.kylin</groupId>
        <artifactId>kylin-server-base</artifactId>
        <version>${kylin.version}</version>
        <scope>provided</scope>
        <exclusions>
            <exclusion>
                <groupId>org.apache.kylin</groupId>
                <artifactId>kylin-external-htrace</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.apache.calcite</groupId>
                <artifactId>calcite-core</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.apache.calcite</groupId>
                <artifactId>calcite-linq4j</artifactId>
            </exclusion>
        </exclusions>
    </dependency>

2) Apply the same kylin-server-base dependency change to ./ranger-2.1/plugin-kylin/pom.xml.

Modify the distro pom file

vim ./ranger-2.1/distro/pom.xml

    <artifactId>maven-assembly-plugin</artifactId>
    <version>3.3.0</version>

The key change is bumping the plugin version; without it the build fails.

########################################Compatibility source changes##################################

Compatibility source changes

1. Modify the RangerDefaultAuditHandler.java class

vim /data/packages/ranger/ranger-2.1/agents-common/src/main/java/org/apache/ranger/plugin/audit/RangerDefaultAuditHandler.java

Add to the import section:

import org.apache.ranger.authorization.hadoop.config.RangerConfiguration;

Below the line `public class RangerDefaultAuditHandler implements ...`, add the following:

protected static final String RangerModuleName = RangerConfiguration.getInstance().get(RangerHadoopConstants.AUDITLOG_RANGER_MODULE_ACL_NAME_PROP , RangerHadoopConstants.DEFAULT_RANGER_MODULE_ACL_NAME);

2. Modify the RangerConfiguration.java class:

vim /data/packages/ranger/ranger-2.1/agents-common/src/main/java/org/apache/ranger/authorization/hadoop/config/RangerConfiguration.java

Below `public class RangerConfiguration extends Configuration`, add the following:

    private static volatile RangerConfiguration config;

    public static RangerConfiguration getInstance() {
        RangerConfiguration result = config;
        if (result == null) {
            synchronized (RangerConfiguration.class) {
                result = config;
                if (result == null) {
                    config = result = new RangerConfiguration();
                }
            }
        }
        return result;
    }

3. Modify RequestUtils.java

vim /root/ranger-cdh-hylink/ranger-elasticsearch-plugin-shim/src/main/java/org/apache/ranger/authorization/elasticsearch/plugin/utils/RequestUtils.java

SocketAddress socketAddress = request.getHttpChannel().getRemoteAddress();

Change it to:

SocketAddress socketAddress = request.getRemoteAddress();

4. Modify ElasticSearchAccessAuditsService.java

vim /root/ranger-cdh-hylink/security-admin/src/main/java/org/apache/ranger/elasticsearch/ElasticSearchAccessAuditsService.java

returnList.setTotalCount(response.getHits().getTotalHits().value);

Change it to:

returnList.setTotalCount(response.getHits().getTotalHits());

5. Modify the RangerElasticsearchPlugin.java class

vim /data/packages/ranger/ranger-2.1/ranger-elasticsearch-plugin-shim/src/main/java/org/apache/ranger/authorization/elasticsearch/plugin/RangerElasticsearchPlugin.java

Remove the @Override annotation above the createComponents method:

@Override

public Collection createComponents

Change it to:

public Collection createComponents

6. Modify the ServiceKafkaClient.java class

vim /data/packages/ranger/ranger-2.1/plugin-kafka/src/main/java/org/apache/ranger/services/kafka/client/ServiceKafkaClient.java

At line 38, delete:

import scala.Option;

At line 87:

    ZooKeeperClient zookeeperClient = new ZooKeeperClient(zookeeperConnect, sessionTimeout, connectionTimeout,
            1, Time.SYSTEM, "kafka.server", "SessionExpireListener", Option.empty());

Change it to:

    ZooKeeperClient zookeeperClient = new ZooKeeperClient(zookeeperConnect, sessionTimeout, connectionTimeout,
            1, Time.SYSTEM, "kafka.server", "SessionExpireListener");

7. Implement the getHivePolicyProvider method

vim /data/packages/ranger/ranger-2.1/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java

Below `public boolean needTransform`, add the following:
@Override
public HivePolicyProvider getHivePolicyProvider() throws HiveAuthzPluginException {
	if (hivePlugin == null) {
		throw new HiveAuthzPluginException();
	}
	RangerHivePolicyProvider policyProvider = new RangerHivePolicyProvider(hivePlugin, this);

	return policyProvider;
}

########################################CDH platform adaptation##################################

CDH platform adaptation - configuration files

Problem:

When CDH restarts a component service, it launches a dedicated process for it and dynamically generates a fresh runtime configuration directory and configuration files. Ranger plugin configuration files deployed into the CDH installation directory therefore never get read by the component service.

Solution

Add a copyConfigFile method to /data/packages/ranger/ranger-2.1/agents-common/src/main/java/org/apache/ranger/authorization/hadoop/config/RangerPluginConfig.java:

1) Replace all of the imports with the following:

    import org.apache.commons.collections.CollectionUtils;
    import org.apache.commons.io.FileUtils;
    import org.apache.commons.io.filefilter.IOFileFilter;
    import org.apache.commons.io.filefilter.RegexFileFilter;
    import org.apache.commons.io.filefilter.TrueFileFilter;
    import org.apache.commons.lang.StringUtils;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.log4j.Logger;
    import org.apache.ranger.authorization.utils.StringUtil;
    import org.apache.ranger.plugin.policyengine.RangerPolicyEngineOptions;

    import java.io.File;
    import java.io.IOException;
    import java.net.URL;
    import java.util.*;

2) Below the `private Set superGroups` field, add the following:

private void copyConfigFile(String serviceType) {

// This method adapts components packaged by CDH; bail out for non-CDH components
if (serviceType.equals("presto")) {
	return;
}

// Log the environment variables
Map map = System.getenv();
Iterator it = map.entrySet().iterator();
while (it.hasNext()) {
	Map.Entry entry = (Map.Entry) it.next();
	LOG.info("env key: " + entry.getKey() + ", value: " + entry.getValue());
}

// Log the system properties
Properties properties = System.getProperties();
Iterator itr = properties.entrySet().iterator();
while (itr.hasNext()) {
	Map.Entry entry = (Map.Entry) itr.next();
	LOG.info("system key: " + entry.getKey() + ", value: " + entry.getValue());
}

String serviceHome = "CDH_" + serviceType.toUpperCase() + "_HOME";
if ("CDH_HDFS_HOME".equals(serviceHome)) {
	serviceHome = "CDH_HADOOP_HOME";
}

serviceHome = System.getenv(serviceHome);
File serviceHomeDir = new File(serviceHome);
String userDir = System.getenv("CONF_DIR");
File destDir = new File(userDir);

LOG.info("-----Service Home: " + serviceHome);
LOG.info("-----User dir: " + userDir);
LOG.info("-----Dest dir: " + destDir);

IOFileFilter regexFileFilter = new RegexFileFilter("ranger-.+xml");
Collection<File> configFileList = FileUtils.listFiles(serviceHomeDir, regexFileFilter, TrueFileFilter.INSTANCE);
boolean flag = true;
for (File rangerConfigFile : configFileList) {
	try {
		if (serviceType.toUpperCase().equals("HIVE") && flag) {
			File file = new File(rangerConfigFile.getParentFile().getPath() + "/xasecure-audit.xml");
			FileUtils.copyFileToDirectory(file, destDir);
			flag = false;
			LOG.info("-----Source dir: " + file.getPath());
		}
		FileUtils.copyFileToDirectory(rangerConfigFile, destDir);
	} catch (IOException e) {
		LOG.error("Copy ranger config file failed.", e);
	}
}

}

3) Call copyConfigFile on the first line of the addResourcesForServiceType method:

private void addResourcesForServiceType(String serviceType) {

copyConfigFile(serviceType);

String auditCfg    = "ranger-" + serviceType + "-audit.xml";
String securityCfg = "ranger-" + serviceType + "-security.xml";
String sslCfg 	   = "ranger-policymgr-ssl.xml";

if (!addResourceIfReadable(auditCfg)) {
    addAuditResource(serviceType);
}

if (!addResourceIfReadable(securityCfg)) {
    addSecurityResource(serviceType);
}

if (!addResourceIfReadable(sslCfg)) {
    addSslConfigResource(serviceType);
}

}

CDH platform adaptation - enable-agent.sh

Problem

After the hdfs and yarn plugins are installed, the plugin jars are deployed to the share/hadoop/hdfs/lib subdirectory of the component install directory. hdfs and yarn cannot load them at startup and fail with: ClassNotFoundException: Class org.apache.ranger.authorization.yarn.authorizer.RangerYarnAuthorizer not found

After the kafka plugin is installed, at startup it loads the Ranger plugin configuration files from the directory containing the plugin jars; when they cannot be read it fails with: addResourceIfReadable(ranger-kafka-audit.xml): couldn't find resource file location

Solution

Modify the enable-agent.sh script in the agents-common module:

vim /data/packages/ranger/ranger-2.1/agents-common/scripts/enable-agent.sh

HCOMPONENT_LIB_DIR=${HCOMPONENT_INSTALL_DIR}/share/hadoop/hdfs/lib

Change it to:

HCOMPONENT_LIB_DIR=${HCOMPONENT_INSTALL_DIR}

And change:

elif [ "${HCOMPONENT_NAME}" = "kafka" ]; then

HCOMPONENT_CONF_DIR=${HCOMPONENT_INSTALL_DIR}/config

to:

elif [ "${HCOMPONENT_NAME}" = "kafka" ]; then

HCOMPONENT_CONF_DIR=${PROJ_LIB_DIR}/ranger-kafka-plugin-impl
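The two enable-agent.sh edits above can also be applied with sed. A minimal sketch against a fabricated three-line excerpt of the script (the real file lives under agents-common/scripts; variable names are as in the post):

```shell
# Fabricated excerpt of enable-agent.sh (illustration only)
cat > enable-agent.sh <<'EOF'
HCOMPONENT_LIB_DIR=${HCOMPONENT_INSTALL_DIR}/share/hadoop/hdfs/lib
elif [ "${HCOMPONENT_NAME}" = "kafka" ]; then
	HCOMPONENT_CONF_DIR=${HCOMPONENT_INSTALL_DIR}/config
EOF

# 1) hdfs/yarn: deploy plugin jars into the component install dir itself
sed -i 's|^HCOMPONENT_LIB_DIR=.*|HCOMPONENT_LIB_DIR=${HCOMPONENT_INSTALL_DIR}|' enable-agent.sh

# 2) kafka: read plugin config files from the plugin impl directory
sed -i 's|HCOMPONENT_CONF_DIR=${HCOMPONENT_INSTALL_DIR}/config|HCOMPONENT_CONF_DIR=${PROJ_LIB_DIR}/ranger-kafka-plugin-impl|' enable-agent.sh

cat enable-agent.sh
```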

########################################Build the source##################################

Build the source

cd ./ranger2.1

Build:

[root@hadoop1 ranger-2.1]# /data/maven/bin/mvn clean package install -Dmaven.test.skip=true -X

or

[root@hadoop1 ranger-2.1]# /data/maven/bin/mvn clean compile package assembly:assembly install -DskipTests -Drat.skip=true

or

[root@hadoop1 ranger-2.1]# /data/maven/bin/mvn clean package install -Dpmd.skip=true -Dcheckstyle.skip=true -Dmaven.test.skip=true

Notes:

The first command skips both compiling and running the tests. The second skips running the tests but still compiles them. The third additionally skips checks such as PMD and checkstyle; if your source edits left code or comments slightly non-compliant, the build can fail with "You have 1 PMD violation", and this command works around that.

The third command is the one that finally built successfully here. The build hit a lot of pitfalls and took roughly two days to sort out; it is noticeably harder than building the community edition of Ranger.

Error: Could not resolve dependencies for project org.apache.ranger:ranger-kylin-plugin:jar:2.1.0

Fix: add the kylin-server-base exclusions (kylin-external-htrace, calcite-core, calcite-linq4j) to both ranger-release-ranger-2.1.0/ranger-kylin-plugin-shim/pom.xml and ranger-release-ranger-2.1.0/plugin-kylin/pom.xml, as shown in the kylin plugin POM section above.

Once it goes through, the build takes about 10 minutes. Compile errors may still appear along the way; resolve them one by one based on the error messages. The packaged artifacts are generated under /root/ranger-cdh-hylink/target.

########################################Install RANGER-ADMIN##################################

Install RANGER-ADMIN

The machine running ranger-admin does not have to be inside the Hadoop cluster; any host works. Here it is installed on the current machine.

1. Unpack the admin tarball

tar -zxvf ranger-2.1.0-admin.tar.gz -C /opt/ranger-2.1.0-admin

2. Edit install.properties

Note: ranger-admin depends on a database to store its data; install one first if needed.

cd /opt/ranger-2.1.0-admin

The settings to change in install.properties are:

(1) Database:

SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar   # JDBC connector jar; download it from the vendor site and rename it to this (or a name of your choice)
db_root_user=root          # database admin user
db_root_password=*****     # database admin password
db_host=localhost          # database host; change it if the database is on another machine

The following three properties configure the ranger database:

db_name=ranger        # database name
db_user=ranger        # user that manages the database
db_password=****      # its password

(2) Audit log:

If solr is not installed, these can all be commented out.

audit_store=solr
audit_solr_urls=http://localhost:6083/solr/ranger_audits
audit_solr_user=solr

(3) Policy manager:

policymgr_external_url=http://localhost:6080   # hostname and port; change them if you do not want the defaults

Ranger is demanding about database password strength when connecting, or rather the database's own password policy is. If the configured password does not satisfy that policy, Ranger fails to connect. The required privileges also have to be granted in the database:

MySQL 5.7 password policy: upper and lower case + digits + special characters, length > 8

create database `ranger` CHARACTER SET utf8 COLLATE utf8_general_ci;
grant all on *.* to 'root'@'%' identified by 'your password';
flush privileges;
grant all on *.* to 'ranger'@'%' identified by 'your password';
flush privileges;

########################################Initialize RANGER-ADMIN##################################

cd /opt/ranger-2.1.0-admin
./setup.sh

If initialization succeeds, start the admin service.

If setup reports:

SQLException : SQL state: 42000 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user 'root'@'%' to database 'ranger' ErrorCode: 1044

connect to the database and run:

UPDATE mysql.user SET Grant_priv='Y', Super_priv='Y' WHERE User='root';

########################################Start RANGER-ADMIN##################################

Start ranger-admin:

ranger-admin start

or:

cd /opt/ranger-2.1.0-admin
./ews/ranger-admin-services.sh start

Ranger's default port is 6080; to change it, edit install.properties.

Visit http://s1:6080/ and log in with the default username/password admin.

After logging in, the page shows: No Content(204)

Check ranger-admin-*-root.log under /opt/ranger-2.1.0-admin/ews/logs, which shows:

2020-10-16 02:14:14,563 [localhost-startStop-1] ERROR org.apache.ranger.plugin.store.EmbeddedServiceDefsUtil (EmbeddedServiceDefsUtil.java:174) - EmbeddedServiceDefsUtil.init(): failed
java.lang.AbstractMethodError: javax.ws.rs.core.Response.getStatusInfo()Ljavax/ws/rs/core/Response$StatusType;
	at javax.ws.rs.WebApplicationException.computeExceptionMessage(WebApplicationException.java:212)
	at javax.ws.rs.WebApplicationException.<init>(WebApplicationException.java:186)
	at javax.ws.rs.WebApplicationException.<init>(WebApplicationException.java:91)
	at org.apache.ranger.common.RESTErrorUtil.createRESTException(RESTErrorUtil.java:54)
	at org.apache.ranger.common.RESTErrorUtil.createRESTException(RESTErrorUtil.java:301)
	at org.apache.ranger.service.RangerBaseModelService.read(RangerBaseModelService.java:240)
	at org.apache.ranger.biz.ServiceDBStore.getServiceDef(ServiceDBStore.java:1371)
	at org.apache.ranger.plugin.store.AbstractServiceStore.updateTagServiceDefForUpdatingAccessTypes(AbstractServiceStore.java:288)
	at org.apache.ranger.plugin.store.AbstractServiceStore.postCreate(AbstractServiceStore.java:128)
	at org.apache.ranger.biz.ServiceDBStore.createServiceDef(ServiceDBStore.java:667)
	at org.apache.ranger.plugin.store.EmbeddedServiceDefsUtil.getOrCreateServiceDef(EmbeddedServiceDefsUtil.java:295)
	at org.apache.ranger.plugin.store.EmbeddedServiceDefsUtil.init(EmbeddedServiceDefsUtil.java:147)
	at org.apache.ranger.biz.ServiceDBStore$1.doInTransaction(ServiceDBStore.java:391)
	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
	at org.apache.ranger.biz.ServiceDBStore.initStore(ServiceDBStore.java:388)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)

Solution

Go to /opt/ranger-2.1.0-admin/ews/webapp/WEB-INF/lib:

cd /opt/ranger-2.1.0-admin/ews/webapp/WEB-INF/lib

Delete the following jar files:

rm javax.ws.rs-api-2.1.jar
rm jersey-client-2.6.jar
rm jersey-server-2.27.jar

Visit http://s1:6080/ again.

At this point, the integration of Apache Ranger with CDH 6.3.1 is complete.

Source of Apache Ranger 2.1.0 integrated with CDH 6.3.1:

https://github.com/gm19900510/ranger/tree/release-ranger-2.1.0-cdh-6.3.1-hylink

Manually downloaded packages:

https://mvnrepository.com/artifact/org.apache.ranger/ranger-plugins-cred/2.1.0
https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common/3.0.0-cdh6.3.1
https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-project/3.0.0-cdh6.3.1

Problem:

npm ERR! network request to https://registry.npmjs.org/bluebird/-/bluebird-3.5.3.tgz failed

Solution:

brew install node

1. Clear the cache: npm cache clean --force
2. Check the current npm registry: npm config get registry
3. Switch to a mirror: npm config set registry https://registry.npmmirror.com
4. Verify the new registry took effect: npm config get registry
5. Install the needed tooling normally: npm install

To disable certificate verification:

npm config set ca ""
npm config set strict-ssl false

Maven settings file (Homebrew install): /opt/homebrew/Cellar/maven/3.9.9/libexec/conf/settings.xml; add the mirror https://repository.cloudera.com/artifactory/cloudera-repos/

mvn assembly:single
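The install.properties edits described above can be scripted as well. A minimal sketch against a fabricated skeleton of the file; the property names follow the post and the values are the examples given there:

```shell
# Fabricated skeleton of install.properties (illustration only; the real file ships in the admin tarball)
cat > install.properties <<'EOF'
SQL_CONNECTOR_JAR=
db_root_user=
db_root_password=
db_host=
db_name=
db_user=
db_password=
policymgr_external_url=
EOF

# Fill in the database and policy-manager settings described above
sed -i \
  -e 's|^SQL_CONNECTOR_JAR=.*|SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar|' \
  -e 's|^db_root_user=.*|db_root_user=root|' \
  -e 's|^db_host=.*|db_host=localhost|' \
  -e 's|^db_name=.*|db_name=ranger|' \
  -e 's|^db_user=.*|db_user=ranger|' \
  -e 's|^policymgr_external_url=.*|policymgr_external_url=http://localhost:6080|' \
  install.properties

grep -E '^(db_name|db_user|policymgr_external_url)=' install.properties
```

The two password properties are left for manual editing, since they must satisfy the database's password policy.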
