Hadoop Basics and Installing Hadoop on Linux

Contents

1. Hadoop Basics
2. Installing Hadoop on Linux

1. Hadoop Basics

Definition: Hadoop is a distributed-system infrastructure developed by the Apache Foundation. It lets users write distributed programs without understanding the low-level details of distribution, harnessing the full power of a cluster for high-speed computation and storage.
Advantages:

- High reliability: Hadoop keeps multiple replicas of data at the storage layer, so the failure of a single compute or storage element does not lose data.
- High scalability: tasks and data are distributed across the cluster, which can be conveniently scaled out to thousands of nodes.
- High efficiency: under the MapReduce model Hadoop works in parallel, which speeds up processing.
- High fault tolerance: failed tasks are automatically reassigned.
Differences between Hadoop 1.x, 2.x, and 3.x: in 1.x, MapReduce handles both computation and resource scheduling; 2.x introduces YARN, which takes over resource scheduling and leaves MapReduce responsible for computation only; 3.x keeps the 2.x architecture and adds further optimizations such as erasure coding.
HDFS overview: the Hadoop Distributed File System (HDFS) is a distributed file system.

HDFS architecture overview:

- NameNode (nn): stores file metadata, such as file names, the directory structure, and file attributes.
- DataNode (dn): stores file block data, along with checksums for the blocks, on the local file system.
- Secondary NameNode (2nn): periodically backs up the NameNode metadata.

A short session after this list shows how these roles divide the work.
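To make the roles concrete: a client first asks the NameNode for metadata, then reads or writes the actual blocks on DataNodes. Once a cluster such as the one built in section 2 is running, the interaction can be exercised from the HDFS shell (the paths here are only illustrative):

```bash
hdfs dfs -mkdir -p /demo          # a pure metadata operation, handled by the NameNode
hdfs dfs -put /etc/hosts /demo/   # the file's blocks (and checksums) land on DataNodes
hdfs dfs -ls /demo                # listing again only consults NameNode metadata
```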
YARN overview: Yet Another Resource Negotiator (YARN) is Hadoop's resource manager.

YARN architecture overview:

- ResourceManager (RM): in charge of the resources (memory, CPU, and so on) of the whole cluster.
- NodeManager: in charge of the resources of a single node server.
- ApplicationMaster: in charge of a single running job.
- Container: comparable to an independent server; it encapsulates the resources a task needs to run, such as memory, CPU, disk, and network.

The commands after this list show these components from the command line.
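Once YARN is running, these roles can be observed directly; a minimal sketch, assuming the single-node setup from section 2:

```bash
yarn node -list           # NodeManagers currently registered with the ResourceManager
yarn application -list    # running applications, each driven by its own ApplicationMaster
```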
MapReduce architecture overview: MapReduce splits a computation into two phases, map and reduce. The map phase processes the input data in parallel; the reduce phase aggregates the map results. The short wordcount run below shows both phases.
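The examples jar that ships with Hadoop gives a quick way to watch the two phases in action; a sketch assuming the hadoop313 layout used in section 2 (the exact jar version may differ):

```bash
hdfs dfs -mkdir -p /wc/in
hdfs dfs -put /etc/hosts /wc/in/
# map: tokenize each input line in parallel; reduce: sum the counts per word
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount /wc/in /wc/out
hdfs dfs -cat /wc/out/part-r-00000   # one "word<TAB>count" line per word
```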
Supplement: the Hadoop ecosystem.

2. Installing Hadoop on Linux

1.1 Install the JDK (omitted).
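Since the JDK step is skipped here, a quick sanity check before moving on might look like this:

```bash
java -version     # should print the installed JDK version
echo $JAVA_HOME   # should point at the JDK install directory
```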
1.2 Download and install Hadoop

Extract the archive under /opt/soft, rename the directory to hadoop313, and change its owner to root. Configure the environment variables with `vim /etc/profile`; when done, run `source /etc/profile`:

```bash
# HADOOP_HOME
export HADOOP_HOME=/opt/soft/hadoop313
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
```

Create the data directory `data`, then switch to the Hadoop configuration directory and list its files to prepare for configuration (a quick verification of the variables follows below):

```bash
cd /opt/soft/hadoop313/etc/hadoop
```
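A quick way to confirm the variables took effect, assuming the layout above:

```bash
# Reload the profile in the current shell and check the results.
source /etc/profile
echo $HADOOP_HOME    # should print /opt/soft/hadoop313
hadoop version       # should print the Hadoop release if PATH is set correctly
```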
1.3 Configure single-node Hadoop

(1) Configure core-site.xml:

```xml
<configuration>
  <!-- The NameNode address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://kb129:9000</value>
  </property>
  <!-- Base directory for Hadoop data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/soft/hadoop313/data</value>
  </property>
  <!-- Static user for the HDFS web UI: root -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>root</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
  </property>
</configuration>
```

(2) Configure hdfs-site.xml:

```xml
<configuration>
  <!-- Single node, so keep one replica -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/soft/hadoop313/data/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/soft/hadoop313/data/dfs/data</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
```

(3) Edit hadoop-env.sh (set JAVA_HOME to the JDK install path).

(4) Configure yarn-site.xml:

```xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Retry the connection every 20 s -->
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>20000</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.nodemanager.localizer.address</name>
    <value>kb129:8040</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>kb129:8050</value>
  </property>
  <property>
    <name>yarn.nodemanager.webapp.address</name>
    <value>kb129:8042</value>
  </property>
  <!-- Use shuffle as the MapReduce auxiliary service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/opt/soft/hadoop313/yarndata/yarn</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/opt/soft/hadoop313/yarndata/log</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
```

Then change the contents of the `workers` file to kb129.

(5) Configure mapred-site.xml:

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>kb129:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>kb129:19888</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/opt/soft/hadoop313/etc/hadoop:/opt/soft/hadoop313/share/hadoop/common/lib/*:/opt/soft/hadoop313/share/hadoop/common/*:/opt/soft/hadoop313/share/hadoop/hdfs/*:/opt/soft/hadoop313/share/hadoop/hdfs/lib/*:/opt/soft/hadoop313/share/hadoop/mapreduce/*:/opt/soft/hadoop313/share/hadoop/mapreduce/lib/*:/opt/soft/hadoop313/share/hadoop/yarn/*:/opt/soft/hadoop313/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
```
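Before starting anything, it is worth checking that Hadoop actually picks these values up. `hdfs getconf` reads the same configuration files the daemons will use; a minimal check, assuming the values above:

```bash
hdfs getconf -confKey fs.defaultFS       # expect hdfs://kb129:9000
hdfs getconf -confKey hadoop.tmp.dir     # expect /opt/soft/hadoop313/data
hdfs getconf -confKey dfs.replication    # expect 1
```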
1.4 Start and test Hadoop

(1) Set up passwordless login. Back in root's home directory, generate a key pair for kb129:

```bash
ssh-keygen -t rsa -P ""
```

Append the local public key (~/.ssh/id_rsa.pub) to root's .ssh/authorized_keys file, so that SSH connections to this host can authenticate with the key:

```bash
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
```

Alternatively, add the public key to the host's authorized key list with ssh-copy-id, which enables SSH public-key authentication the same way:

```bash
ssh-copy-id -i ~/.ssh/id_rsa.pub -p22 root@kb129
```

Then test the login with `ssh kb129`.

(2) Initialize the cluster from the bin directory:

```bash
hadoop namenode -format
```

Start the daemons, check that they are all up, and stop them again; the smoke test below walks through this.

(3) Web test: open http://192.168.142.129:9870/ in a browser.
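A minimal smoke test of the whole setup, assuming the hostname and ports configured above (`jps` comes with the JDK, the start/stop scripts live in $HADOOP_HOME/sbin):

```bash
start-all.sh                        # launch the HDFS and YARN daemons
jps                                 # expect NameNode, DataNode, SecondaryNameNode,
                                    # ResourceManager, and NodeManager in the list
hdfs dfsadmin -report | head -n 20  # the DataNode should report its capacity
curl -s -o /dev/null -w "%{http_code}\n" http://kb129:9870/   # expect 200
stop-all.sh                         # shut everything down again
```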
