Installing Hue 3.9.0-cdh5.14.0

Installing Hue requires the following prerequisites:

1. Java JDK 8+

2. Python 2.6+

3. A non-root user to install as

4. Maven and Ant installed separately, with their HOME variables configured

5. The Hadoop and Hive configuration files available on the Hue node

6. A MySQL database prepared to serve as Hue's metadata store

Step 1: Preparing the JDK and Python is straightforward, so it is not shown here. For Maven and Ant, unpacking the archive is the whole installation; configure the HOME variables and verify the versions check out:

```bash
export MAVEN_HOME=/opt/apache-maven-3.3.9
export ANT_HOME=/opt/apache-ant-1.8.1
export PATH=$PATH:$MAVEN_HOME/bin
export PATH=$PATH:$ANT_HOME/bin
```
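
Before running the version checks, it can help to confirm both bin directories actually made it onto PATH. A small sketch, reusing this post's install paths (substitute your own if they differ):

```shell
# this post's install paths; adjust to your environment
export MAVEN_HOME=/opt/apache-maven-3.3.9
export ANT_HOME=/opt/apache-ant-1.8.1
export PATH=$PATH:$MAVEN_HOME/bin:$ANT_HOME/bin

# both directories should now appear in PATH
case ":$PATH:" in
  *":$MAVEN_HOME/bin:"*) echo "maven bin on PATH" ;;
esac
case ":$PATH:" in
  *":$ANT_HOME/bin:"*) echo "ant bin on PATH" ;;
esac
# after that, `mvn -version` and `ant -version` should each print a version banner
```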

Step 2: Install the third-party packages Hue needs:

```bash
yum install -y asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libtidy libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel openssl-devel gmp-devel
```

Step 3: Build Hue. First, unpack the Hue archive:

```bash
tar -zxvf hue-3.9.0-cdh5.14.0.tar.gz
mv hue-3.9.0-cdh5.14.0 hue3.9
```

Then enter Hue's home directory and build it. The build generates a desktop directory, referred to below as the installation directory:

```bash
cd hue3.9
make apps
```

The build prints a great deal of output and stalls at times; no action is needed. It takes roughly three minutes, and as long as no errors appear along the way you are fine.

Step 4: Create a system user hue and give it ownership of the whole Hue directory; all operations below are performed as this user:

```bash
useradd hue
passwd hue    # enter the new password when prompted

chown -R hue:hue /opt/hue3.9/
su hue
```

Step 5: Edit desktop/conf/hue.ini. Find and modify the entries below (they sit around line 20 of the file); apart from secret_key, change the values to match your own environment:

```ini
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
http_host=hdp3
http_port=8888
time_zone=Asia/Shanghai
```
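
The secret_key just needs to be a long random string (Hue uses it for secure session hashing). If you would rather not reuse the one above, one way to generate your own, assuming a standard Linux /dev/urandom:

```shell
# generate a 60-character alphanumeric secret_key from /dev/urandom
secret_key=$(tr -dc 'a-zA-Z0-9' </dev/urandom | head -c 60)
echo "secret_key=$secret_key"
```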

Step 6: Run the start command to bring Hue up for the first time and make sure the service itself is fine:

```bash
build/env/bin/supervisor
```

If the service is healthy, it prints information like this:

```bash
[hue@hdp3 hue3.9]$ ./build/env/bin/supervisor
[INFO] Not running as root, skipping privilege drop
starting server with options:
{'daemonize': False,
 'host': '0.0.0.0',
 'pidfile': None,
 'port': 8888,
 'server_group': 'hue',
 'server_name': 'localhost',
 'server_user': 'hue',
 'ssl_certificate': None,
 'ssl_certificate_chain': None,
 'ssl_cipher_list': 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA',
 'ssl_private_key': None,
 'threads': 50,
 'workdir': None}
```

Now open port 8888 on the installation node; once the Hue page appears you are set. Don't rush to the next step yet: go back to the terminal and stop the foreground service with Ctrl+C.

Step 7: Hadoop needs some configuration to support Hue. Append the following to hdfs-site.xml:

```xml
<!-- Enable Hadoop's WebHDFS (HDFS over HTTP) support -->
<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>

<!-- Disable permission checking -->
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
```

Append the following to core-site.xml:

```xml
<!-- Authorize the hue proxy user: tell Hadoop that all hosts (*) may operate on HDFS through the hue user acting on behalf of other users -->
<property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
</property>
```

If your Hadoop cluster runs in high-availability (HA) mode, also append the following to httpfs-site.xml:

```xml
<!-- Allow every host to access the HttpFS service -->
<property>
    <name>httpfs.proxyuser.httpfs.hosts</name>
    <value>*</value>
</property>
<property>
    <name>httpfs.proxyuser.httpfs.groups</name>
    <value>*</value>
</property>
```

The difference: WebHDFS is a component built into HDFS, already running inside the NameNode and DataNodes; reads and writes are redirected to the DataNode holding the file and use the full HDFS bandwidth. HttpFS is a service independent of HDFS; reads and writes are relayed through it, it can throttle bandwidth usage, and Hue talks to an HA HDFS through HttpFS.
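
Both services speak the same WebHDFS REST dialect, so you can probe either one with curl; only the entry point differs. A sketch, assuming this post's hostname hdp1 and the stock Hadoop 2.x ports (50070 for the NameNode web UI, 14000 for HttpFS):

```shell
# same REST API, different entry points (hostname and ports are assumptions)
webhdfs_url="http://hdp1:50070/webhdfs/v1/tmp?op=LISTSTATUS&user.name=hue"
httpfs_url="http://hdp1:14000/webhdfs/v1/tmp?op=LISTSTATUS&user.name=hue"
echo "WebHDFS (straight to the NameNode): $webhdfs_url"
echo "HttpFS  (through the gateway):      $httpfs_url"
# on a live cluster:  curl -s "$webhdfs_url"   or   curl -s "$httpfs_url"
```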

Use scp to sync the configuration files across the Hadoop nodes.
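
For example, a small loop like the one below pushes the Hadoop conf directory to the other nodes. The host names and conf path are the ones used in this post; swap in your own:

```shell
# hosts and path assumed from this post's cluster layout
conf_dir=/opt/hadoop-2.7.2/etc/hadoop
for host in hdp2 hdp3; do
  echo "would sync $conf_dir to $host:$conf_dir"
  # scp -r "$conf_dir"/* "$host:$conf_dir"/   # uncomment to actually copy
done
```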

Step 8: Configure Hue's HDFS settings (there is a catch here, covered at the end). In hue.ini, around line 850, modify the hadoop/HDFS section.

The YARN settings below must be changed as well. Note that some lines have to be uncommented, and for a YARN HA setup several settings must be copied into the HA cluster section.

```ini
[hadoop]

  # Configuration for HDFS NameNode
  # ------------------------------------------------------------------------
  [[hdfs_clusters]]
    # HA support by using HttpFs

    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://hdp1

      # For HA, also declare the NameNode nameservice (logical name) here
      logical_name=hdp1

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      webhdfs_url=http://hdp1/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

      # Directory of the Hadoop configuration
      hadoop_conf_dir=/opt/hadoop-2.7.2/etc/hadoop

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]

    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=hdp2

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032

      # Whether to submit jobs to this cluster and monitor their execution
      submit_to=True

      # For HA, specify the ResourceManager logical name here
      logical_name=yrc

      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false

      # URL of the ResourceManager API
      resourcemanager_api_url=http://hdp2:8088

      # URL of the ProxyServer API
      proxy_api_url=http://hdp2:8088

      # MapReduce history server
      history_server_api_url=http://hdp3:19888

      # Spark history server URL
      spark_history_server_url=http://hdp3:18088

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

    # HA support by specifying multiple clusters.
    # Redefine different properties there.
    # e.g.

    [[[ha]]]
      # Resource Manager logical name (required for HA)
      logical_name=yrc

      # Un-comment to enable
      ## submit_to=True

      # URL of the ResourceManager API
      resourcemanager_api_url=http://hdp3:8088

      # URL of the ProxyServer API
      proxy_api_url=http://hdp3:8088

      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=hdp3

      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032
```

Step 9: Configure Hive by locating the [beeswax] section in the Hue configuration file.
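
The original post shows this section only as a screenshot, so here is a sketch of what a typical [beeswax] block looks like; the host name and the conf path are assumptions to adapt to your cluster:

```ini
[beeswax]
  # host where HiveServer2 runs (hdp3 is assumed here)
  hive_server_host=hdp3
  # HiveServer2 Thrift port (Hive's default is 10000)
  hive_server_port=10000
  # directory containing hive-site.xml
  hive_conf_dir=/opt/hive/conf
```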

Step 10: Switch Hue's metadata store to MySQL. First, on the MySQL server you prepared, create the database and grant privileges:

```sql
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'Hue123!';
FLUSH PRIVILEGES;
```

Then edit Hue's configuration file again: find the [database] section and modify its properties.
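
The original again shows this only as a screenshot; for MySQL the [[database]] block (under [desktop]) typically looks like the sketch below, matching the database and account created above. The host is an assumption:

```ini
[[database]]
  engine=mysql
  host=hdp3            # wherever your MySQL server runs (assumption)
  port=3306
  user=hue
  password=Hue123!
  name=hue
```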

Put the MySQL driver JAR into desktop/libs under the installation directory:

```bash
[hue@hdp3 libs]$ ll | grep mysql
-rw-r--r-- 1 hue hue 872303 Oct 29 22:13 mysql-connector-java-5.1.27-bin.jar
```

Then, from the installation directory, run the initialization command:

```bash
./build/env/bin/hue syncdb
```

The following output appears; just follow the prompts:

```bash
[hue@hdp3 hue3.9]$ ./build/env/bin/hue syncdb
Syncing...
Creating tables ...
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_groups
Creating table auth_user_user_permissions
Creating table auth_user
Creating table django_openid_auth_nonce
Creating table django_openid_auth_association
Creating table django_openid_auth_useropenid
Creating table django_content_type
Creating table django_session
Creating table django_site
Creating table django_admin_log
Creating table south_migrationhistory
Creating table axes_accessattempt
Creating table axes_accesslog

You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'hue'): hue           # this sets Hue's admin user, i.e. the administrator of the web UI on port 8888
Email address:       # no email needed, just press Enter
Password:            # password for the Hue admin user
Password (again):
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

Synced:
 > django.contrib.auth
 > django_openid_auth
 > django.contrib.contenttypes
 > django.contrib.sessions
 > django.contrib.sites
 > django.contrib.staticfiles
 > django.contrib.admin
 > south
 > axes
 > about
 > filebrowser
 > help
 > impala
 > jobbrowser
 > metastore
 > proxy
 > rdbms
 > zookeeper
 > indexer
 > dashboard

Not synced (use migrations):
 - django_extensions
 - desktop
 - beeswax
 - hbase
 - jobsub
 - oozie
 - pig
 - search
 - security
 - spark
 - sqoop
 - useradmin
 - notebook
(use ./manage.py migrate to migrate these)
```

When the output above finishes with the (use ./manage.py migrate to migrate these) hint, run:

```bash
./build/env/bin/hue migrate
```

This prints a long stretch of migration logs, which you can ignore:

```bash
[hue@hdp3 hue3.9]$ ./build/env/bin/hue migrate
Running migrations for django_extensions:
 - Migrating forwards to 0001_empty.
 > django_extensions:0001_empty
 - Loading initial data for django_extensions.
Installed 0 object(s) from 0 fixture(s)
Running migrations for desktop:
 - Migrating forwards to 0027_truncate_documents.
 > pig:0001_initial
 > oozie:0001_initial
 > oozie:0002_auto__add_hive
 > oozie:0003_auto__add_sqoop
 > oozie:0004_auto__add_ssh
 > oozie:0005_auto__add_shell
 > oozie:0006_auto__chg_field_java_files__chg_field_java_archives__chg_field_sqoop_f
 > oozie:0007_auto__chg_field_sqoop_script_path
 > oozie:0008_auto__add_distcp
 > oozie:0009_auto__add_decision
 > oozie:0010_auto__add_fs
 > oozie:0011_auto__add_email
 > oozie:0012_auto__add_subworkflow__chg_field_email_subject__chg_field_email_body
 > oozie:0013_auto__add_generic
 > oozie:0014_auto__add_decisionend
 > oozie:0015_auto__add_field_dataset_advanced_start_instance__add_field_dataset_ins
 > oozie:0016_auto__add_field_coordinator_job_properties
 > oozie:0017_auto__add_bundledcoordinator__add_bundle
 > oozie:0018_auto__add_field_workflow_managed
 > oozie:0019_auto__add_field_java_capture_output
 > oozie:0020_chg_large_varchars_to_textfields
 > oozie:0021_auto__chg_field_java_args__add_field_job_is_trashed
 > oozie:0022_auto__chg_field_mapreduce_node_ptr__chg_field_start_node_ptr
 > oozie:0022_change_examples_path_format
 - Migration 'oozie:0022_change_examples_path_format' is marked for no-dry-run.
 > oozie:0023_auto__add_field_node_data__add_field_job_data
 > oozie:0024_auto__chg_field_subworkflow_sub_workflow
 > oozie:0025_change_examples_path_format
 - Migration 'oozie:0025_change_examples_path_format' is marked for no-dry-run.
 > desktop:0001_initial
 > desktop:0002_add_groups_and_homedirs
 > desktop:0003_group_permissions
 > desktop:0004_grouprelations
 > desktop:0005_settings
 > desktop:0006_settings_add_tour
 > beeswax:0001_initial
 > beeswax:0002_auto__add_field_queryhistory_notify
 > beeswax:0003_auto__add_field_queryhistory_server_name__add_field_queryhistory_serve
 > beeswax:0004_auto__add_session__add_field_queryhistory_server_type__add_field_query
 > beeswax:0005_auto__add_field_queryhistory_statement_number
 > beeswax:0006_auto__add_field_session_application
 > beeswax:0007_auto__add_field_savedquery_is_trashed
 > beeswax:0008_auto__add_field_queryhistory_query_type
 > beeswax:0009_auto__add_field_savedquery_is_redacted__add_field_queryhistory_is_reda
 > desktop:0007_auto__add_documentpermission__add_documenttag__add_document
 > desktop:0008_documentpermission_m2m_tables
 > desktop:0009_auto__chg_field_document_name
 > desktop:0010_auto__add_document2__chg_field_userpreferences_key__chg_field_userpref
 > desktop:0011_auto__chg_field_document2_uuid
 > desktop:0012_auto__chg_field_documentpermission_perms
 > desktop:0013_auto__add_unique_documenttag_owner_tag
 > desktop:0014_auto__add_unique_document_content_type_object_id
 > desktop:0015_auto__add_unique_documentpermission_doc_perms
 > desktop:0016_auto__add_unique_document2_uuid_version_is_history
 > desktop:0017_auto__add_document2permission__add_unique_document2permission_doc_perm
 > desktop:0018_auto__add_field_document2_parent_directory
 > desktop:0019_auto
 > desktop:0020_auto__del_field_document2permission_all
 > desktop:0021_auto__add_defaultconfiguration__add_unique_defaultconfiguration_app_is
 > desktop:0022_auto__del_field_defaultconfiguration_group__del_unique_defaultconfigur
 > desktop:0023_auto__del_unique_defaultconfiguration_app_is_default_user__add_field_d
 > desktop:0024_auto__add_field_document2_is_managed
 > desktop:0025_auto__add_field_document2_is_trashed
 > desktop:0026_change_is_trashed_default_to_false
 - Migration 'desktop:0026_change_is_trashed_default_to_false' is marked for no-dry-run.
 > desktop:0027_truncate_documents
 - Loading initial data for desktop.
Installed 0 object(s) from 0 fixture(s)
Running migrations for beeswax:
 - Migrating forwards to 0014_auto__add_field_queryhistory_is_cleared.
 > beeswax:0009_auto__chg_field_queryhistory_server_port
 > beeswax:0010_merge_database_state
 > beeswax:0011_auto__chg_field_savedquery_name
 > beeswax:0012_auto__add_field_queryhistory_extra
 > beeswax:0013_auto__add_field_session_properties
 > beeswax:0014_auto__add_field_queryhistory_is_cleared
 - Loading initial data for beeswax.
Installed 0 object(s) from 0 fixture(s)
Running migrations for hbase:
 - Migrating forwards to 0001_initial.
 > hbase:0001_initial
 - Loading initial data for hbase.
Installed 0 object(s) from 0 fixture(s)
Running migrations for jobsub:
 - Migrating forwards to 0006_chg_varchars_to_textfields.
 > jobsub:0001_initial
 > jobsub:0002_auto__add_ooziestreamingaction__add_oozieaction__add_oozieworkflow__ad
 > jobsub:0003_convertCharFieldtoTextField
 > jobsub:0004_hue1_to_hue2
 - Migration 'jobsub:0004_hue1_to_hue2' is marked for no-dry-run.
 > jobsub:0005_unify_with_oozie
 - Migration 'jobsub:0005_unify_with_oozie' is marked for no-dry-run.
 > jobsub:0006_chg_varchars_to_textfields
 - Loading initial data for jobsub.
Installed 0 object(s) from 0 fixture(s)
Running migrations for oozie:
 - Migrating forwards to 0027_auto__chg_field_node_name__chg_field_job_name.
 > oozie:0026_set_default_data_values
 - Migration 'oozie:0026_set_default_data_values' is marked for no-dry-run.
 > oozie:0027_auto__chg_field_node_name__chg_field_job_name
 - Loading initial data for oozie.
Installed 0 object(s) from 0 fixture(s)
Running migrations for pig:
- Nothing to migrate.
 - Loading initial data for pig.
Installed 0 object(s) from 0 fixture(s)
Running migrations for search:
 - Migrating forwards to 0003_auto__add_field_collection_owner.
 > search:0001_initial
 > search:0002_auto__del_core__add_collection
 > search:0003_auto__add_field_collection_owner
 - Loading initial data for search.
Installed 0 object(s) from 0 fixture(s)
? You have no migrations for the 'security' app. You might want some.
Running migrations for spark:
 - Migrating forwards to 0001_initial.
 > spark:0001_initial
 - Loading initial data for spark.
Installed 0 object(s) from 0 fixture(s)
Running migrations for sqoop:
 - Migrating forwards to 0001_initial.
 > sqoop:0001_initial
 - Loading initial data for sqoop.
Installed 0 object(s) from 0 fixture(s)
Running migrations for useradmin:
 - Migrating forwards to 0008_convert_documents.
 > useradmin:0001_permissions_and_profiles
 - Migration 'useradmin:0001_permissions_and_profiles' is marked for no-dry-run.
 > useradmin:0002_add_ldap_support
 - Migration 'useradmin:0002_add_ldap_support' is marked for no-dry-run.
 > useradmin:0003_remove_metastore_readonly_huepermission
 - Migration 'useradmin:0003_remove_metastore_readonly_huepermission' is marked for no-dry-run.
 > useradmin:0004_add_field_UserProfile_first_login
 > useradmin:0005_auto__add_field_userprofile_last_activity
 > useradmin:0006_auto__add_index_userprofile_last_activity
 > useradmin:0007_remove_s3_access
 > useradmin:0008_convert_documents
 - Migration 'useradmin:0008_convert_documents' is marked for no-dry-run.
Starting document conversions...

Finished running document conversions.

 - Loading initial data for useradmin.
Installed 0 object(s) from 0 fixture(s)
Running migrations for notebook:
 - Migrating forwards to 0001_initial.
 > notebook:0001_initial
 - Loading initial data for notebook.
Installed 0 object(s) from 0 fixture(s)
```

Afterwards, check the MySQL database you prepared; if everything worked, it now contains many tables.

Now start Hadoop and HiveServer2. When starting Hadoop, run sbin/httpfs.sh start separately to bring up Hadoop's HttpFS service. Finally, restart Hue and visit port 8888.

Log in with the admin user you created while initializing the metadata database. If everything is in order, you can run Hive statements, and the top-right corner lets you view the logs of jobs on YARN.

One last note, about the problem I mentioned back when configuring HDFS. Clicking around the UI, you will notice that you can only see the list of YARN jobs but cannot click through to the log details, and previewing HDFS files from the left panel throws an Oozie error. This is because Hue's own HDFS file access depends on Oozie.

For personal testing or learning, people usually install the CDH build of Hue, and Oozie is a large source of unpredictability, both in its own runtime behavior and in how CDH affects it; for example, CDH's default HDFS user and the underlying permission controls are quite complex. So relying on this component directly is not realistic. My personal advice: use Hue just for writing Hive SQL, where you at least get auto-completion. For everything else, Hue is not that capable anyway; I recommend Zeppelin, which is what my own test cluster uses. In fact, the only reason I keep Hue around is that Zeppelin has no code completion when writing; otherwise I would have removed it long ago.

For how to install Zeppelin, see the earlier post: 大数据原生集群 (Hadoop2.X为核心) 本地测试环境搭建四
