[Bigtop] Building RPM packages for big data components with Bigtop 3.2.0

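Preface

Original reference: Bigtop 从0开始 ("Bigtop from scratch")

Following that post, I tried building the components myself and still ran into quite a few problems. I am recording each of them here for whoever comes next.

Bigtop project website: BigTop

Main text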

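Changing the local Maven directory

When starting the Bigtop image for the build, I changed the directory that the Maven repository is mounted on, so the matching path has to be set in /usr/local/maven/conf/settings.xml. Do not skip this, or every build will download all dependencies again:

xml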
<localRepository>/root/.m2/repository/</localRepository>

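Flink

This is the component that has cost me the most time so far. The Flink build downloads Node.js, but nodejs.org is blocked on my network, so every build timed out and failed at runtime-web:

text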
INFO: I/O exception (java.net.SocketException) caught when processing request to {s}-https://nodejs.org:443: Network is unreachable (connect failed)

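The fix is to download the release archive manually and place it in the Maven repository. There is a catch, though: if you use node v16.13.2 directly, this step succeeds but the next one hangs and reports a permission error:

text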
/ws/build/flink/rpm/BUILD/flink-1.15.3/flink-runtime-web/web-dashboard/node/npm: Permission denied

So the overall fix is to change the npm part of flink-runtime-web/pom.xml in the Flink source package to the following:

xml
           <plugin>
               <groupId>com.github.eirslett</groupId>
               <artifactId>frontend-maven-plugin</artifactId>
               <version>1.11.0</version>
               <executions>
                   <execution>
                       <id>install node and npm</id>
                       <goals>
                           <goal>install-node-and-npm</goal>
                       </goals>
                       <configuration>
                           <!-- the node version is changed here -->
                           <nodeVersion>v12.22.1</nodeVersion>
                           <npmVersion>6.14.12</npmVersion>
                       </configuration>
                   </execution>
                   <execution>
                       <id>npm install husky</id>
                       <goals>
                           <goal>npm</goal>
                       </goals>
                       <configuration>
                           <arguments>install husky --registry=https://registry.npmmirror.com</arguments>
                       </configuration>
                   </execution>
                   <execution>
                       <id>npm install</id>
                       <goals>
                           <goal>npm</goal>
                       </goals>
                       <configuration>
                           <arguments>install --cache-max=0 --no-save --registry=https://registry.npmmirror.com</arguments>
                           <environmentVariables>
                               <HUSKY_SKIP_INSTALL>true</HUSKY_SKIP_INSTALL>
                           </environmentVariables>
                       </configuration>
                   </execution>
                   <execution>
                       <id>npm install local</id>
                       <goals>
                           <goal>npm</goal>
                       </goals>
                       <configuration>
                           <arguments>install --registry=https://registry.npmmirror.com --force</arguments>
                       </configuration>
                   </execution>
                   <execution>
                       <id>npm run ci-check</id>
                       <goals>
                           <goal>npm</goal>
                       </goals>
                       <configuration>
                           <arguments>run ci-check</arguments>
                       </configuration>
                   </execution>
               </executions>
               <configuration>
                   <workingDirectory>web-dashboard</workingDirectory>
               </configuration>
           </plugin>

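The key change is downgrading Node.js to 12.22.1 and manually placing the Node.js archive in the local Maven directory.

Note that the file stored locally must not have a v in front of the version number, although the manually downloaded file does have one in its name. The same procedure works for any component whose build fails to download Node.js.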
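The placement step can be sketched as a small shell helper. This is a minimal sketch under two assumptions that are not stated in the original post: that frontend-maven-plugin looks for its cached archive under com/github/eirslett/node/<version>/ inside the local repository, and that the platform classifier is linux-x64. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Local Maven repository; defaults to the path configured in settings.xml above.
M2_REPO="${M2_REPO:-/root/.m2/repository}"

# Print the path where the plugin is ASSUMED to look for the cached archive.
# The leading "v" is stripped, because the cached file name has no "v".
node_cache_path() {
  local version="${1#v}" platform="${2:-linux-x64}"
  printf '%s/com/github/eirslett/node/%s/node-%s-%s.tar.gz\n' \
    "$M2_REPO" "$version" "$version" "$platform"
}

# Copy a manually downloaded archive (whose upstream name contains the "v")
# to the cache path and print where it ended up.
install_node_archive() {
  local src="$1" dest
  dest="$(node_cache_path "$2" "${3:-linux-x64}")"
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
  printf '%s\n' "$dest"
}
```

For example, `install_node_archive ./node-v12.22.1-linux-x64.tar.gz v12.22.1` would copy the download into the repository under the v-less name. If the build still tries to download, compare the printed path with the one in the plugin's log output.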

After that there is nothing major left; just wait for the build to pass. I did not run into any other significant problem.

Kafka

First run the Kafka build once so that the build process downloads the Kafka source package automatically. The bigtop 3.2.0 branch uses Kafka 2.8.1.

bash
./gradlew kafka-clean kafka-pkg -PparentDir=/usr/bigtop -PpkgSuffix -PbuildThreads=16C repo 

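Then make the following changes.

grgit version

First, change the grgit version. The default is 4.1.0, whose POM file now returns 404. The corresponding PR is: MINOR: Bump version of grgit to 4.1.1

Change the grgit version in ${SOURCE_CODE}/gradle/dependencies.gradle:

groovy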
versions += [
  activation: "1.1.1",
  apacheda: "1.0.2",
  apacheds: "2.0.0-M24",
  argparse4j: "0.7.0",
  bcpkix: "1.66",
  checkstyle: "8.36.2",
  commonsCli: "1.4",
  gradle: "6.8.1",
  gradleVersionsPlugin: "0.36.0",
  grgit: "4.1.1", // change this line
  httpclient: "4.5.13",
]
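The same edit can be scripted so that it survives a re-download of the source. The following is a minimal sketch, assuming the entry keeps the `grgit: "<version>"` form shown above; `bump_grgit` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Rewrite the grgit entry in a dependencies.gradle file.
# A backup of the original is kept next to it with a .bak suffix.
bump_grgit() {
  local file="$1" version="${2:-4.1.1}"
  # Only the quoted version after "grgit:" is replaced; other entries are untouched.
  sed -i.bak -E "s/(grgit:[[:space:]]*)\"[^\"]+\"/\\1\"${version}\"/" "$file"
}
```

It would be called as `bump_grgit "${SOURCE_CODE}/gradle/dependencies.gradle"`.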

Preparing the Gradle files manually

First download gradle-wrapper.jar manually to ${SOURCE_CODE}/gradle/wrapper/gradle-wrapper.jar. The version used here is 6.8.1:

bash
curl -s -S --retry 3 -L -o "gradle-wrapper.jar"  https://mirror.ghproxy.com/https://raw.githubusercontent.com/gradle/gradle/v6.8.1/gradle/wrapper/gradle-wrapper.jar

Then get gradle-6.8.1-all.zip ready and serve it over HTTP; I simply started a SimpleHTTPServer with python. Next, edit ${SOURCE_CODE}/gradle/wrapper/gradle-wrapper.properties:

properties
distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
# change this to your own HTTP server
distributionUrl=http://172.18.2.31:444/gradle-6.8.1-all.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists
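Both steps can be sketched in shell. This is only a sketch: it uses Python 3's `http.server` module, which plays the role of the Python 2 SimpleHTTPServer mentioned above, and the function names are hypothetical. Port 444 matches the URL in the properties file; binding a port below 1024 normally requires root.

```shell
#!/usr/bin/env bash
# Serve the directory that contains gradle-6.8.1-all.zip.
# This call blocks until interrupted, so run it in a separate terminal.
serve_gradle_zip() {
  local dir="$1" port="${2:-444}"
  (cd "$dir" && python3 -m http.server "$port")
}

# Point distributionUrl in gradle-wrapper.properties at the given URL.
# The URL must not contain "|" or "&", which are special in this sed expression.
set_distribution_url() {
  local props="$1" url="$2"
  sed -i.bak -E "s|^distributionUrl=.*|distributionUrl=${url}|" "$props"
}
```

For example, `set_distribution_url "${SOURCE_CODE}/gradle/wrapper/gradle-wrapper.properties" http://172.18.2.31:444/gradle-6.8.1-all.zip` reproduces the change shown above.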

Hadoop

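Bigtop 3.2.0 uses Hadoop 3.3.4. The YarnUI build uses Node.js, also 12.22.1 by default. If the archive was already stored locally while building Flink, nothing more is needed; otherwise follow the Flink section to store Node.js locally first.

Changing the Maven repositories

Change the Maven settings in ${SOURCE_CODE}/pom.xml:

xml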
    <distMgmtSnapshotsId>apache.snapshots.https</distMgmtSnapshotsId>
    <distMgmtSnapshotsName>Apache Development Snapshot Repository</distMgmtSnapshotsName>
    <!-- this line is changed to the Aliyun mirror -->
    <distMgmtSnapshotsUrl>https://maven.aliyun.com/repository/apache-snapshots</distMgmtSnapshotsUrl>
    <distMgmtStagingId>apache.staging.https</distMgmtStagingId>
    <distMgmtStagingName>Apache Release Distribution Repository</distMgmtStagingName>
    <!-- this line is changed to the Aliyun mirror -->
    <distMgmtStagingUrl>https://maven.aliyun.com/repository/central</distMgmtStagingUrl>

Applying a patch

A patch has to be applied manually here; otherwise the YarnUI build still fails with the following error:

text
[INFO] error triple-beam@1.4.1: The engine "node" is incompatible with this module. Expected version ">= 14.0.0". Got "12.22.1"

Related issue: hadoop-deb FAILED on project hadoop-yarn-applications-catalog-webapp

Save the following as bigtop-packages/src/common/hadoop/patch8-YARN-11528-triple-beam.diff:

diff
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp/package.json b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp/package.json
index f09442cfc4e87..59cc3da179fd0 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp/package.json
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp/package.json
@@ -19,6 +19,9 @@
         "shelljs": "^0.2.6",
         "apidoc": "0.17.7"
     },
+    "resolutions": {
+        "triple-beam": "1.3.0"
+    },
     "scripts": {
         "prestart": "npm install & mvn clean package",
         "pretest": "npm install"

Then run the build.
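The post does not show the Hadoop command itself. A plausible form, assuming Bigtop names its tasks `<component>-clean` and `<component>-pkg` as in the Kafka invocation earlier and that the same flags apply, can be generated with a small hypothetical helper:

```shell
#!/usr/bin/env bash
# Print the Bigtop build command for one component.
# The flags are copied from the Kafka command in this post; adjust to taste.
bigtop_pkg_cmd() {
  local c="$1"
  printf './gradlew %s-clean %s-pkg -PparentDir=/usr/bigtop -PpkgSuffix -PbuildThreads=16C repo\n' "$c" "$c"
}

bigtop_pkg_cmd hadoop
```

The helper only prints the command; run the printed line from the root of the Bigtop checkout.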

Tez

phantomjs download fails

The build pulls the phantomjs archive from GitHub, which will most likely time out and fail:

text
Saving to /tmp/phantomjs/phantomjs-2.1.1-linux-x86_64.tar.bz2\nReceiving...\n\nError making request.
Error: socket hang up
    at TLSSocket.onHangUp (_tls_wrap.js:1097:19)
    at TLSSocket.g (events.js:273:16)
    at emitNone (events.js:85:20)
    at TLSSocket.emit (events.js:179:7)
    at endReadableNT (_stream_readable.js:913:12)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)
    
Please report this full log at https://github.com/Medium/phantomjs"

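Many posts online tell you to install it manually with npm. In fact you only need to download the archive yourself and put it at /tmp/phantomjs/phantomjs-2.1.1-linux-x86_64.tar.bz2; the build picks up the archive from the tmp directory on its own.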
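A minimal sketch of pre-seeding that path from an archive you have already downloaded by other means. The helper name and the overridable `PHANTOM_DIR` variable are hypothetical; the magic-byte check is only a guard against saving an HTML error page in place of the archive.

```shell
#!/usr/bin/env bash
# Directory and file name the build looks for, as stated above.
PHANTOM_DIR="${PHANTOM_DIR:-/tmp/phantomjs}"
PHANTOM_FILE="phantomjs-2.1.1-linux-x86_64.tar.bz2"

# Copy a locally available archive into place after a basic sanity check.
seed_phantomjs() {
  local src="$1"
  # bzip2 files start with the three bytes "BZh".
  if [ "$(head -c 3 "$src")" != "BZh" ]; then
    echo "not a bzip2 archive: $src" >&2
    return 1
  fi
  mkdir -p "$PHANTOM_DIR"
  cp "$src" "$PHANTOM_DIR/$PHANTOM_FILE"
}
```

Usage would be `seed_phantomjs ~/Downloads/phantomjs-2.1.1-linux-x86_64.tar.bz2`, run inside the build container before starting the Tez build.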

allow-root

By default pom.xml does not allow bower to run as root, so the build fails with the following error:

text
[INFO] Running 'bower install --allow-root=false' in /ws/build/tez/rpm/BUILD/apache-tez-0.10.1-src/tez-ui/src/main/webapp
[ERROR] bower ESUDO         Cannot be run with sudo
[ERROR] 
[ERROR] Additional error details:
[ERROR] Since bower is a user command, there is no need to execute it with superuser permissions.
[ERROR] If you're having permission errors when using bower without sudo, please spend a few minutes learning more about how your system should work and make any necessary repairs.
[ERROR] 
[ERROR] http://www.joyent.com/blog/installing-node-and-npm
[ERROR] https://gist.github.com/isaacs/579814
[ERROR] 
[ERROR] You can however run a command with sudo using "--allow-root" option

Enable allow-root manually by changing it to true.
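One way to script the change, as a sketch: it assumes the flag appears literally as `--allow-root=false` in the Tez UI pom (the log above suggests this), and that the file is tez-ui/pom.xml in the Tez source tree. `enable_allow_root` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Flip every literal --allow-root=false to --allow-root=true in the given pom.
# The original file is kept with a .bak suffix.
enable_allow_root() {
  local pom="$1"
  sed -i.bak 's/--allow-root=false/--allow-root=true/g' "$pom"
}
```

It would be called as `enable_allow_root tez-ui/pom.xml` from the root of the Tez source.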

After that the build passes in one go.

Zeppelin

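Replacing the download URLs

During the build, Zeppelin downloads archives of some components directly from the Apache site, which is very slow.

It is better to switch to a mirror inside China. These versions are old, though, and the Aliyun and Tsinghua mirrors no longer carry them; for now the Huawei mirror works:

rlang/pom.xml and spark/pom.xml

xml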
        <spark.src.download.url>
            https://mirrors.huaweicloud.com/apache/spark/${spark.archive}/${spark.archive}.tgz
        </spark.src.download.url>
        <spark.bin.download.url>
            https://mirrors.huaweicloud.com/apache/spark/${spark.archive}/${spark.archive}-bin-without-hadoop.tgz
        </spark.bin.download.url>

flink/flink-scala-2.11/flink-scala-parent/pom.xml
flink/flink-scala-parent/pom.xml
flink/flink-scala-2.12/flink-scala-parent/pom.xml

xml
  <properties>
    <flink.bin.download.url>https://mirrors.huaweicloud.com/apache/flink/flink-${flink.version}/flink-${flink.version}-bin-scala_${flink.scala.binary.version}.tgz</flink.bin.download.url>
  </properties>

git configuration

During the build, one front-end dependency is fetched from git.

That fetch fails as-is, so git needs the following setting:

bash
git config --global url.https://gh-proxy.com/https://github.com/.insteadOf git://github.com/
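To check what the rewrite rule does without touching the network or the global configuration, the same rule can be passed with `-c` and combined with `git ls-remote --get-url`, which prints the URL after insteadOf expansion. This is a sketch; the repository name example/repo is a placeholder.

```shell
#!/usr/bin/env bash
# Print the URL git would actually contact for a git://github.com/ address
# when the insteadOf rule from this post is in effect.
rewritten_url() {
  git -c "url.https://gh-proxy.com/https://github.com/.insteadOf=git://github.com/" \
      ls-remote --get-url "$1"
}

rewritten_url git://github.com/example/repo.git
```

Once the `--global` setting is in place, plain `git ls-remote --get-url git://github.com/example/repo.git` should print the same proxied address.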

OK, now wait for the build to pass.
