macos安装local模式spark

文章目录

配置说明

Scala - 3.18+

Spark - 3.5.0

Hadoop - 3.3.6

安装hadoop

  1. 这里下载相应版本的hadoop
  2. 下载后解压,配置系统环境变量
shell 复制代码
> sudo vim /etc/profile

添加以下两行

shell 复制代码
export HADOOP_HOME=/Users/collinsliu/hadoop-3.3.6/
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

请自行替换位置

然后执行并生效系统环境变量

shell 复制代码
> source /etc/profile

安装Spark

  1. 这里下载相应版本的Spark
  2. 下载后解压,同时类似于hadoop,配置系统环境变量
shell 复制代码
> sudo vim /etc/profile

添加以下两行

shell 复制代码
export SPARK_HOME=/Users/collinsliu/spark-3.5.0
export PATH=$PATH:$SPARK_HOME/bin

请自行替换位置

然后执行并生效系统环境变量

shell 复制代码
> source /etc/profile
  1. 然后配置spark连接hadoop,形成local模式:
    a. 首先进入conf文件夹
shell 复制代码
> cd /Users/collinsliu/spark-3.5.0/conf

b. 其次替换配置文件

shell 复制代码
> cp spark-env.sh.template spark-env.sh
> vim spark-env.sh

c. 添加以下三条连接,使得spark能够找到对应的hadoop和相应的包

shell 复制代码
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_311.jdk/Contents/Home
export HADOOP_CONF_DIR=/Users/collinsliu/hadoop-3.3.6/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/Users/collinsliu/hadoop-3.3.6/bin/hadoop classpath)

测试安装成功

  1. 使用内置命令测试
shell 复制代码
> cd /Users/collinsliu/spark-3.5.0/
> ./run-example SparkPi

可以看到很多输出,最后找到

shell 复制代码
...
24/02/07 00:31:33 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks resource profile 0
24/02/07 00:31:33 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (192.168.0.100, executor driver, partition 0, PROCESS_LOCAL, 8263 bytes) 
24/02/07 00:31:33 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1) (192.168.0.100, executor driver, partition 1, PROCESS_LOCAL, 8263 bytes) 
24/02/07 00:31:33 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
24/02/07 00:31:33 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
24/02/07 00:31:34 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1101 bytes result sent to driver
24/02/07 00:31:34 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1101 bytes result sent to driver
24/02/07 00:31:34 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 1120 ms on 192.168.0.100 (executor driver) (1/2)
24/02/07 00:31:34 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 923 ms on 192.168.0.100 (executor driver) (2/2)
24/02/07 00:31:34 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
24/02/07 00:31:34 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 1.737 s
24/02/07 00:31:34 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
24/02/07 00:31:34 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
24/02/07 00:31:34 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 1.807145 s
Pi is roughly 3.1405357026785135

说明安装成功

  1. 打开sparkshell
shell 复制代码
> spark-shell

出现以下内容

shell 复制代码
24/02/07 00:48:12 WARN Utils: Your hostname, Collinss-MacBook-Air.local resolves to a loopback address: 127.0.0.1; using 192.168.0.100 instead (on interface en0)
24/02/07 00:48:12 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.5.0
      /_/
         
Using Scala version 2.13.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_311)
Type in expressions to have them evaluated.
Type :help for more information.
24/02/07 00:48:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.0.100:4040
Spark context available as 'sc' (master = local[*], app id = local-1707238103536).
Spark session available as 'spark'.

scala> 

说明安装成功

相关推荐
城事漫游Molly12 分钟前
案例研究:如何明智地选择案例、精巧地界定边界、深刻地进行分析?
大数据·人工智能·ai写作·论文笔记
LaughingZhu39 分钟前
Product Hunt 每日热榜 | 2026-05-12
大数据·人工智能·经验分享·神经网络·产品运营
eastyuxiao1 小时前
数字孪生(Digital Twin)从入门到实战教程
大数据·人工智能·数字孪生
皮皮学姐分享-ppx1 小时前
上市公司数字技术风险暴露数据(2010-2024)|《经济研究》同款大模型测算
大数据·网络·数据库·人工智能·chatgpt·制造
数字会议深科技1 小时前
政务表决会议升级方案解析|多形态大型表决系统融合方案科普
大数据·人工智能·政务·无纸化·会议厂商·ai会议生态服务商·表决系统
互联网科技看点2 小时前
泛微・齐业成核心优势深度解析:数智化费控管理标杆
大数据·人工智能·云计算
财经资讯数据_灵砚智能3 小时前
基于全球经济类多源新闻的NLP情感分析与数据可视化(日间)2026年5月13日
大数据·人工智能·python·信息可视化·自然语言处理
霑潇雨3 小时前
Spark学习基础转换算子案例(单词计数(WordCount))
java·大数据·分布式·学习·spark·maven
Vwms3 小时前
2026年电商行业WMS系统选型指南
大数据·人工智能·产品运营
最后一支迷迭香4 小时前
苹果的MacOS系统适合做Java开发吗
java·开发语言·macos