Ambari集成Apache Kyuubi实践

目前还有很多公司基于HDP来构建自己的大数据平台,随着Apache Kyuubi的持续热度,如何基于原有的HDP产品来集成Apache Kyuubi,很多人都迫切的需求。集成Apache Kyuubi到HDP中,主要涉及Ambari的二次开发。本文详细叙述了集成Apache Kyuubi的实践过程。

1、集成版本信息

|------|-----------------|
| 服务 | 版本 |
| 九尾 | 1.4.1-孵化 |
| 安巴里 | 2.7.3 |
| 高清图 | 3.1.0 |
| 操作系统 | CentOS 7.4.1708 |

背景:基于HDP 3.1.0版本,我司完成了Apache版本的所有组件替换,然后集成了Apache Kyuubi。

集成Apache的主要组件版本信息如下

|------------|-------|
| 服务 | 版本 |
| 高清文件系统 | 3.3.0 |
| 纱 | 3.3.0 |
| MapReduce2 | 3.3.0 |
| 蜂巢 | 3.1.2 |
| 火花 | 3.1.1 |

2、集成步骤

自定义组件添加分为两大部分,一部分是需要将组件的可文件打包成RPM,另一部分是在Ambari 中添加组件的配置信息,启动脚本等。

2.1 制作RPM包

使用Ambari 安装或集成大数据组件时,需要将组件格式制作成 rpm 格式。

2.1.1 下载并解压Apache Kyuubi

下载地址:https://kyuubi.apache.org/releases.html

在本次集成中我们选择的是1.4.1-incubating版本。

执行tar zxf apache-kyuubi-1.4.1-incubating-bin.tgz,Kyuubi安装包结构简介

apache-kyuubi-1.4.1-incubating-bin├── DISCLAIMER├── LICENSE├── NOTICE├── RELEASE├── beeline-jars├── bin├── conf|   ├── kyuubi-defaults.conf.template│   ├── kyuubi-env.sh.template│   └── log4j2.properties.template├── docker│   ├── Dockerfile│   ├── helm│   ├── kyuubi-configmap.yaml│   ├── kyuubi-deployment.yaml│   ├── kyuubi-pod.yaml│   └── kyuubi-service.yaml├── externals│  └── engines├── jars├── licenses├── logs├── pid└── work

2.1.2 创建RPM包制作环境

安装 rpm-build 包

yum install rpm-build

安装 rpmdevtools

yum install rpmdevtools

创建工作空间

rpmdev-setuptree

安装工作空间简介

2.1.3 制作RPM包

2.1.3.1 编辑Spec文件

制作rpm包需要用spec格式的文件。根据Kyuubi解压后的目录结构,需要自行编辑spec文件以包含所有需要的目录及文件。部分截图概要

%descriptionkyuubi%files%dir %attr(0755, root, root) "/usr/hdp/3.1.0.0-78/kyuubi"%attr(0644, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/DISCLAIMER"%attr(0644, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/LICENSE"%attr(0644, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/NOTICE"%attr(0644, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/RELEASE"%dir %attr(0777, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/beeline-jars"%dir %attr(0777, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/logs"%dir %attr(0777, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/pid"%dir %attr(0777, root, root) "/usr/hdp/3.1.0.0-78/kyuubi/work"
%changelog

2.1.3.2 更换文件

创建目录

cd /root/rpmbuild/BUILDROOTmkdir -p /root/rpmbuild/BUILDROOT/kyuubi_3_1_0_0_78-2.3.2.3.1.0.0-78.x86_64/usr/hdp/3.1.0.0-78/kyuubi

转到apache-kyuubi-1.4.1-incubating-bin.tgz解压后目录

cp -r * /root/rpmbuild/BUILDROOT/kyuubi_3_1_0_0_78-2.3.2.3.1.0.0-78.x86_64/usr/hdp/3.1.0.0-78/kyuubi

将kyuubi.spec文件放到/root/rpmbuild/SPECS文件夹中

cp kyuubi.spec /root/rpmbuild/SPECS/

2.1.3.3 执行打包。

cd /root/rpmbuild/SPECSrpmbuild -ba kyuubi.spec

查看 rpm 包

打包好的rpm包在/root/rpmbuild/RPMS虚拟机

2.1.4****更新YUM源

将上步骤中生成rpm的目标yum源中对应目录,保持完成后执行更新操作

createrepo --update ./

2.2 Ambari集成Apache Kyuubi

2.2.1 公民服务目录结构

在Ambari中添加自定义服务,需要配置文件以Spark详细说明

配置

该目录下存放的是spark的属性配置文件,对应Ambari页面的属性配置页面,可以设置默认值,类型,描述等信息。

包/脚本

该目录下存放服务操作相关的脚本,如服务启动,服务停止,服务检查等。

包/模板

该目录可存放,存放的是组件属性的配置信息,和配置目录下的配置对应,这个关系是如果我们在Ambari页面修改了属性信息,则修改信息会自动填充该目录下的文件的属性,所以,这个目录下的属性是最新的,并且是服务要调用

包裹/警报

该程序存放告急配置,例如程序断网、运行时告急

快速链接

该目录下存放的是快速链接配置,Ambari页面通过该配置可以跳转到我们想要跳转的页面。

指标.json

用来配置指标相关配置

kerberos.json

用来配置kerberos认证

元信息文件

这个文件很重要,主要是配置服务名称、服务类型、服务操作脚本、服务组件、指标以及快速链接等信息。

2.2.2 添加Kyuubi组件

在司的应用场景中,主要是用Kyuubi来替换Spark Thrift Server,依赖于Spark服务中集成Kyuubi组件。实现如下图所示的效果

2.2.2.1 编辑metainfo.xml

在ambari-server/src/main/resources/stacks/HDP/3.0/services/SPARK目录中首先编辑metainfo.xml,为spark服务添加Kyuubi组件

<component>  <name>KYUUBI_SEVER</name>  <displayName>KYUUBI SEVER</displayName>  <category>SLAVE</category>  <cardinality>0+</cardinality>  <versionAdvertised>true</versionAdvertised>  <dependencies>    <dependency>      <name>HDFS/HDFS_CLIENT</name>      <scope>host</scope>      <auto-deploy>        <enabled>true</enabled>      </auto-deploy>    </dependency>    <dependency>      <name>MAPREDUCE2/MAPREDUCE2_CLIENT</name>      <scope>host</scope>      <auto-deploy>        <enabled>true</enabled>      </auto-deploy>    </dependency>    <dependency>      <name>YARN/YARN_CLIENT</name>      <scope>host</scope>      <auto-deploy>        <enabled>true</enabled>      </auto-deploy>    </dependency>    <dependency>      <name>SPARK/SPARK_CLIENT</name>      <scope>host</scope>      <auto-deploy>        <enabled>true</enabled>      </auto-deploy>    </dependency>    <dependency>      <name>HIVE/HIVE_METASTORE</name>      <scope>cluster</scope>      <auto-deploy>        <enabled>true</enabled>      </auto-deploy>    </dependency>  </dependencies>  <commandScript>    <script>scripts/kyuubi_server.py</script>    <scriptType>PYTHON</scriptType>    <timeout>600</timeout>   </commandScript>  <logs>    <log>      <logId>kyuubi_server</logId>      <primary>true</primary>    </log>  </logs></component>

其中kyuubi_server.py定义了安装、配置、启动、停止、获取服务状态等功能。

<configuration-dependencies>  <config-type>core-site</config-type>  <config-type>spark-defaults</config-type>  <config-type>spark-env</config-type>  <config-type>spark-log4j-properties</config-type>  <config-type>spark-metrics-properties</config-type>  <config-type>spark-thrift-sparkconf</config-type>  <config-type>spark-hive-site-override</config-type>  <config-type>spark-thrift-fairscheduler</config-type>  <config-type>kyuubi-defaults</config-type>  <config-type>kyuubi-env</config-type>  <config-type>kyuubi-log4j-properties</config-type>  <config-type>ranger-spark-audit</config-type>  <config-type>ranger-spark-security</config-type></configuration-dependencies>

在< configuration-dependencies ></ configuration-dependencies >标签内添加kyuubi相关配置项,包括kyuubi-defaults、kyuubi-env、kyuubi-log4j-properties、ranger-spark-audit、ranger-spark-security。

<osSpecific>  <osFamily>redhat7,amazonlinux2,redhat6,suse11,suse12</osFamily>  <packages>    <package>      <name>spark2_${stack_version}</name>    </package>    <package>      <name>spark2_${stack_version}-python</name>    </package>    <package>      <name>kyuubi_${stack_version}</name>    </package>  </packages></osSpecific>

在< package > </ package >标签内添加kyuubi rpm包名称信息

2.2.2.2 编辑kyuubi_server.py文件

#!/usr/bin/env pythonimport osfrom resource_management import *
class KyuubiServer(Script):    def install(self, env):        self.install_packages(env)
    def configure(self, env, upgrade_type=None, config_dir=None):        import kyuubi_params        env.set_params(kyuubi_params)
        Directory([kyuubi_params.kyuubi_log_dir, kyuubi_params.kyuubi_pid_dir, kyuubi_params.kyuubi_metrics_dir, kyuubi_params.kyuubi_operation_log_dir],                  owner=kyuubi_params.kyuubi_user,                  group=kyuubi_params.kyuubi_group,                  mode=0775,                  create_parents = True                  )
        kyuubi_defaults = dict(kyuubi_params.config['configurations']['kyuubi-defaults'])
        PropertiesFile(format("{kyuubi_conf_dir}/kyuubi-defaults.conf"),               properties = kyuubi_defaults,               key_value_delimiter = " ",               owner=kyuubi_params.kyuubi_user,               group=kyuubi_params.kyuubi_group,               mode=0644               )
        # create kyuubi-env.sh in kyuubi install dir        File(os.path.join(kyuubi_params.kyuubi_conf_dir, 'kyuubi-env.sh'),             owner=kyuubi_params.kyuubi_user,             group=kyuubi_params.kyuubi_group,             content=InlineTemplate(kyuubi_params.kyuubi_env_sh),             mode=0644,        )
        #create log4j.properties kyuubi install dir        File(os.path.join(kyuubi_params.kyuubi_conf_dir, 'log4j.properties'),             owner=kyuubi_params.kyuubi_user,             group=kyuubi_params.kyuubi_group,             content=kyuubi_params.kyuubi_log4j_properties,             mode=0644,        )
    def start(self, env, upgrade_type=None):        import kyuubi_params        env.set_params(kyuubi_params)
        self.configure(env)        Execute(kyuubi_params.kyuubi_start_cmd,user=kyuubi_params.kyuubi_user,environment={'JAVA_HOME': kyuubi_params.java_home})
    def stop(self, env, upgrade_type=None):        import kyuubi_params        env.set_params(kyuubi_params)        self.configure(env)
        Execute(kyuubi_params.kyuubi_stop_cmd,user=kyuubi_params.kyuubi_user,environment={'JAVA_HOME': kyuubi_params.java_home})
    def status(self, env):        import kyuubi_params        env.set_params(kyuubi_params)        check_process_status(kyuubi_params.kyuubi_pid_file)
    def get_user(self):        import kyuubi_params        return kyuubi_params.kyuubi_user
    def get_pid_files(self):        import kyuubi_params        return [kyuubi_params.kyuubi_pid_file]
if __name__ == "__main__":    KyuubiServer().execute()

kyuubi_server.py定义了安装、配置、启动、停止Kyuubi服务的逻辑。

kyuubi_params配置变量以及需要的相关参数,由于篇幅原因这里不推荐使用。

2.2.2.3 编辑kyuubi_default.xml文件

<?xml version="1.0" encoding="UTF-8"?><configuration supports_final="true">    <property>        <name>kyuubi.ha.zookeeper.quorum</name>        <value>{{cluster_zookeeper_quorum}}</value>        <description>            The connection string for the zookeeper ensemble        </description>        <on-ambari-upgrade add="true"/>    </property>
    <property>        <name>kyuubi.frontend.thrift.binary.bind.port</name>        <value>10009</value>        <description>            Port of the machine on which to run the thrift frontend service via binary protocol.        </description>        <on-ambari-upgrade add="true"/>    </property>
    <property>        <name>kyuubi.ha.zookeeper.session.timeout</name>        <value>600000</value>        <description>            The timeout(ms) of a connected session to be idled        </description>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>kyuubi.session.engine.initialize.timeout</name>        <value>300000</value>        <description>            Timeout for starting the background engine, e.g. SparkSQLEngine.        </description>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>kyuubi.authentication</name>        <value>{{kyuubi_authentication}}</value>        <description>            Client authentication types        </description>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>spark.master</name>        <value>yarn</value>        <description>            The deploying mode of spark application.        </description>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>spark.submit.deployMode</name>        <value>cluster</value>        <description>spark submit deploy mode</description>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>spark.yarn.queue</name>        <value>default</value>        <description>            The name of the YARN queue to which the application is submitted.        </description>        <depends-on>            <property>                <type>capacity-scheduler</type>                <name>yarn.scheduler.capacity.root.queues</name>            </property>        </depends-on>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>spark.yarn.driver.memory</name>        <value>4g</value>        <description>spark yarn driver momory</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>spark.executor.memory</name>        <value>4g</value>        <description>spark.executor.memory</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>spark.sql.extensions</name>        <value>org.apache.submarine.spark.security.api.RangerSparkSQLExtension</value>        <description>spark sql ranger extension</description>        <on-ambari-upgrade add="false"/>    </property></configuration>

其中kyuubi.ha.zookeeper.quorum属性配置的值值为{{cluster_zookeeper_quorum}},会在kyuubi安装中自动替换为当前zk集群的信息。

kyuubi.authentication属性配置的值值为{{kyuubi_authentication}},在kyuubi安装中判断当前集群是否开启kerberos认证,来设置true或者false。

2.2.2.4 编辑kyuubi_env.xml文件

<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration supports_adding_forbidden="true">    <property>        <name>kyuubi_user</name>        <display-name>Kyuubi User</display-name>        <value>spark</value>        <property-type>USER</property-type>        <value-attributes>            <type>user</type>            <overridable>false</overridable>            <user-groups>                <property>                    <type>cluster-env</type>                    <name>user_group</name>                </property>                <property>                    <type>kyuubi-env</type>                    <name>kyuubi_group</name>                </property>            </user-groups>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>kyuubi_group</name>        <display-name>Kyuubi Group</display-name>        <value>spark</value>        <property-type>GROUP</property-type>        <description>kyuubi group</description>        <value-attributes>            <type>user</type>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>kyuubi_log_dir</name>        <display-name>Kyuubi Log directory</display-name>        <value>/var/log/kyuubi</value>        <description>Kyuubi Log Dir</description>        <value-attributes>            <type>directory</type>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property>    <property>        <name>kyuubi_pid_dir</name>        <display-name>Kyuubi PID directory</display-name>        <value>/var/run/kyuubi</value>        <value-attributes>            <type>directory</type>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property>
    <!-- kyuubi-env.sh -->    <property>        <name>content</name>        <description>This is the jinja template for kyuubi-env.sh file</description>        <value>#!/usr/bin/env bash
export JAVA_HOME={{java_home}}export HADOOP_CONF_DIR=/etc/hadoop/confexport SPARK_HOME=/usr/hdp/current/spark-clientexport SPARK_CONF_DIR=/etc/spark/confexport KYUUBI_LOG_DIR={{kyuubi_log_dir}}export KYUUBI_PID_DIR={{kyuubi_pid_dir}}        </value>        <value-attributes>            <type>content</type>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property></configuration>

本文件主要设置JAVA_HOME、HADOOP_CONF_DIR、KYUUBI_LOG_DIR、KYUUBI_PID_DIR等相关路径信息

2.2.2.5 编辑kyuubi-log4j-properties.xml文件

<?xml version="1.0" encoding="UTF-8"?><configuration supports_final="false" supports_adding_forbidden="true">    <property>        <name>content</name>        <description>Kyuubi-log4j-Properties</description>        <value># Set everything to be logged to the consolelog4j.rootCategory=INFO, consolelog4j.appender.console=org.apache.log4j.ConsoleAppenderlog4j.appender.console.target=System.errlog4j.appender.console.layout=org.apache.log4j.PatternLayoutlog4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSS} %p %c{2}: %m%n
# Set the default kyuubi-ctl log level to WARN. When running the kyuubi-ctl, the# log level for this class is used to overwrite the root logger's log level.log4j.logger.org.apache.kyuubi.ctl.ServiceControlCli=ERROR        </value>        <value-attributes>            <type>content</type>            <show-property-name>false</show-property-name>        </value-attributes>        <on-ambari-upgrade add="true"/>    </property></configuration>

2.2.2.6 编辑ranger-spark-security.xml文件

<?xml version="1.0"?><configuration>    <property>        <name>ranger.plugin.spark.service.name</name>        <value>{{repo_name}}</value>        <description>Name of the Ranger service containing policies for this SPARK instance</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>ranger.plugin.spark.policy.source.impl</name>        <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>        <description>Class to retrieve policies from the source</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>ranger.plugin.spark.policy.rest.url</name>        <value>{{policymgr_mgr_url}}</value>        <description>URL to Ranger Admin</description>        <on-ambari-upgrade add="false"/>        <depends-on>            <property>                <type>admin-properties</type>                <name>policymgr_external_url</name>            </property>        </depends-on>    </property>    <property>        <name>ranger.plugin.spark.policy.pollIntervalMs</name>        <value>30000</value>        <description>How often to poll for changes in policies?</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>ranger.plugin.spark.policy.cache.dir</name>        <value>/etc/ranger/{{repo_name}}/policycache</value>        <description>Directory where Ranger policies are cached after successful retrieval from the source</description>        <on-ambari-upgrade add="false"/>    </property></configuration>

本文件主要用来配置spark ranger相关参数。

2.2.2.7 编辑ranger-spark-audit.xml文件

<?xml version="1.0"?><configuration>    <property>        <name>xasecure.audit.is.enabled</name>        <value>true</value>        <description>Is Audit enabled?</description>        <value-attributes>            <type>boolean</type>        </value-attributes>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>xasecure.audit.destination.db</name>        <value>false</value>        <display-name>Audit to DB</display-name>        <description>Is Audit to DB enabled?</description>        <value-attributes>            <type>boolean</type>        </value-attributes>        <depends-on>            <property>                <type>ranger-env</type>                <name>xasecure.audit.destination.db</name>            </property>        </depends-on>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>xasecure.audit.destination.db.jdbc.driver</name>        <value>{{jdbc_driver}}</value>        <description>Audit DB JDBC Driver</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>xasecure.audit.destination.db.jdbc.url</name>        <value>{{audit_jdbc_url}}</value>        <description>Audit DB JDBC URL</description>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>xasecure.audit.destination.db.password</name>        <value>{{xa_audit_db_password}}</value>        <property-type>PASSWORD</property-type>        <description>Audit DB JDBC Password</description>        <value-attributes>            <type>password</type>        </value-attributes>        <on-ambari-upgrade add="false"/>    </property>    <property>        <name>xasecure.audit.destination.db.user</name>        <value>{{xa_audit_db_user}}</value>        <description>Audit DB JDBC User</description>        <on-ambari-upgrade add="false"/>    </property></configuration>

本文件主要用来配置spark ranger audit 相关参数。

2.2.2.8 编辑alerts.json文件

"KYUUBI_SEVER": [  {    "name": "kyuubi_server_status",    "label": "Kyuubi Server",    "description": "This host-level alert is triggered if the Kyuubi Server cannot be determined to be up.",    "interval": 1,    "scope": "ANY",    "source": {      "type": "SCRIPT",      "path": "DIF/3.0/services/SPARK/package/scripts/alerts/alert_kyuubi_server_port.py",      "parameters": [        {          "name": "check.command.timeout",          "display_name": "Command Timeout",          "value": 120.0,          "type": "NUMERIC",          "description": "The maximum time before check command will be killed by timeout",          "units": "seconds",          "threshold": "CRITICAL"        }      ]    }  }]

在alert.json文件中添加检测kyuubi服务器是否启动的告警检测配置,每隔120秒检测一次kyuubi服务器服务是否正常。检测逻辑由alert_kyuubi_server_port.py实现

2.2.2.9 编辑alert_kyuubi_server_port.py文件

alert_kyuubi_server_port.py的实现逻辑可以参考alert_spark_thrift_port.py,在此不具体实现逻辑,原理就是定时执行beeline连接操作判断网站上的连接成功。

2.2.2.10 编辑kerberos.json文件

{  "name": "kyuubi_service_keytab",  "principal": {    "value": "spark/_HOST@${realm}",    "type" : "service",    "configuration": "kyuubi-defaults/kyuubi.kinit.principal",    "local_username" : "${spark-env/spark_user}"  },  "keytab": {    "file": "${keytab_dir}/spark.service.keytab",    "owner": {      "name": "${spark-env/spark_user}",      "access": "r"    },    "group": {      "name": "${cluster-env/user_group}",      "access": ""    },    "configuration": "kyuubi-defaults/kyuubi.kinit.keytab"  }}

在kerberos.json文件中添加自动生成principal、keytab配置到kyuubi-defaults的逻辑。在集群启动kerberos认证的时候,会自动在kyuubi-defaults中添加kyuubi.kinit.keytab及kyuubi.kinit.principal配置项。

2.2.2.11 更新 ambari-server、ambari-agent RPM

将上述修改以及添加的文件内容更新到ambari-server、ambari-agent RPM包中的相应目录中。

对于已经安装的套件,可以通过如下操作进行:

1.卸载spark服务

2.将上述添加的文件放到如下目录的位置:

/var/lib/ambari-server/resources/stacks/HDP/3.0/services/SPARK/var/lib/ambari-agent/cache/stacks/DIF/3.0/services/SPARK
  1. 在ambari-serer执行sudo ambari-server restart

  2. 在ambari-agent所在节点执行sudo ambari-agent restart

  3. 重新安装spark服务

3、效果展示

3.1 安装效果展示

安装spark时,支持选择kyuubi服务器组件

安装过程中可以在界面上配置kyuubi参数

3.2 安装成功后效果展示

安装成功

Kyuubi相关配置页面展示成功

3.3 停掉Kyuubi Server效果展示

界面上停止掉kyuubi服务器

界面展示停止成功

界面告警提示kyuubi server停止

重新启动服务器

说警消失

使用ambari-qa用户在后台通过beeline连接,可以连接成功

在yarn界面上看到用户ambari-qa应用程序运行成功

引用文献:

Ambari集成Apache Kyuubi实践

相关推荐
油头少年_w4 小时前
大数据导论及分布式存储HadoopHDFS入门
大数据·hadoop·hdfs
工业互联网专业6 小时前
Python毕业设计选题:基于Hadoop的租房数据分析系统的设计与实现
vue.js·hadoop·python·flask·毕业设计·源码·课程设计
bigdata-余建新14 小时前
HDFS和HBase跨集群数据迁移 源码
hadoop·hdfs·hbase
Mephisto.java15 小时前
【大数据学习 | kafka高级部分】文件清除原理
大数据·hadoop·zookeeper·spark·kafka·hbase·flume
m0_3755997315 小时前
Hadoop:单节点配置YARN
hadoop·yarn
大数据魔法师17 小时前
Hadoop生态圈框架部署(五)- Zookeeper完全分布式部署
hadoop·分布式·zookeeper
FserSuN1 天前
Apache Calcite - 查询优化之自定义优化规则
apache·calcite
数据要素X1 天前
【数据仓库】Hive 拉链表实践
大数据·数据库·数据仓库·人工智能·hive·hadoop·安全
黑风风1 天前
Ubuntu 22 安装 Apache Doris 3.0.3 笔记
笔记·ubuntu·apache
Francek Chen1 天前
【大数据技术基础 | 实验八】HBase实验:新建HBase表
大数据·数据库·hadoop·分布式·zookeeper·hbase