Ubuntu datasophon 1.2.1 Secondary Development, Part 5: Fixing Hive Installation Issues

Background

With YARN installed, the next component was Hive. It turned out to have quite a few problems, described one by one below.

Problems and Solutions

Adding HA parameters for Hive

Solution: the stock service_ddl.json is missing the HA parameters, so I added several, shown below:

    {
      "name": "hive.server2.thrift.bind.host",
      "label": "HiveServer2 bind host",
      "description": "Hostname or IP that HiveServer2 binds to; 0.0.0.0 binds all network interfaces",
      "required": true,
      "type": "select",
      "value": "0.0.0.0",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "0.0.0.0",
      "selectValue": ["0.0.0.0", "${host}", "localhost", "127.0.0.1"]
    },
    {
      "name": "hive.metastore.thrift.bind.host",
      "label": "MetaStore bind host",
      "description": "Hostname or IP that the Hive MetaStore binds to; 0.0.0.0 binds all network interfaces",
      "required": true,
      "type": "select",
      "value": "0.0.0.0",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "0.0.0.0",
      "selectValue": ["0.0.0.0", "${host}", "localhost", "127.0.0.1"]
    },
    {
      "name": "hive.metastore.uris",
      "label": "Hive MetaStore URIs",
      "description": "Comma-separated list of MetaStore URIs, used for HA; on MetaStore nodes themselves this should be left empty or commented out",
      "required": false,
      "type": "input",
      "value": "thrift://ddp1:9083,thrift://ddp2:9083",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": ""
    },
    {
      "name": "hive.metastore.failure.retries",
      "label": "MetaStore connection retries",
      "description": "Number of retries when a MetaStore connection fails",
      "required": false,
      "type": "input",
      "value": "3",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "3"
    },
    {
      "name": "hive.metastore.client.connect.retry.delay",
      "label": "MetaStore retry delay",
      "description": "Delay between retries when a MetaStore connection fails",
      "required": false,
      "type": "input",
      "value": "5s",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "5s"
    },
    {
      "name": "role.type",
      "label": "Role type",
      "description": "Marks the node's role type, used for conditional configuration",
      "required": false,
      "type": "select",
      "value": "combined",
      "configurableInWizard": true,
      "hidden": true,
      "defaultValue": "combined",
      "selectValue": ["metastore", "hiveserver2", "combined", "client"]
    },
    {
      "name": "hive.metastore.server.max.threads",
      "label": "MetaStore max threads",
      "description": "Maximum number of threads the MetaStore uses to handle requests",
      "required": false,
      "type": "input",
      "value": "100",
      "configurableInWizard": true,
      "hidden": false,
      "defaultValue": "100"
    }
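Once rendered by the configuration wizard, these entries become ordinary hive-site.xml properties. For example, the HA metastore address ends up roughly as below (a sketch of the rendered result, not datasophon's exact template output):

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://ddp1:9083,thrift://ddp2:9083</value>
</property>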

java.lang.NoClassDefFoundError: org/apache/tez/dag/api/TezConfiguration

       at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession$AbstractTriggerValidator.startTriggerValidator(TezSessionPoolSession.java:74)
       at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.initTriggers(TezSessionPoolManager.java:207)
       at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:114)
       at org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:839)
       at org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:822)
       at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:745)
       at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1037)
       at org.apache.hive.service.server.HiveServer2.access$1600(HiveServer2.java:140)
       at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1305)
       at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1149)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)

Solution: this was caused by a version mismatch between Hive and Hadoop. The original Hive was 3.1.0 against Hadoop 3.3.3, which produced "class not found" and "no such method" errors at startup. Upgrading to Hive 3.1.3 resolved it.
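If you hit the same trace, it is worth confirming what is actually deployed before swapping versions. A minimal check, assuming the DDP install layout used elsewhere in this series:

cd /opt/datasophon/hive-3.1.3
bin/hive --version       # the Hive build actually on disk
hadoop version           # the Hadoop it will run against
ls lib | grep -i tez     # org.apache.tez.dag.api.TezConfiguration ships in tez-api-*.jar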

control_hive.sh missing after installation

Pack the control_hive.sh from above into hive.tar.gz: first extract the original hive.tar.gz into a temporary directory, copy control_hive.sh into the root of the extracted tree, then re-create the tarball so that hive-3.1.3 stays the top-level directory:

tar -zcvf hive-3.1.3.tar.gz -C xxxx hive-3.1.3/

Then copy hive-3.1.3.tar.gz to /opt/datasophon/DDP/package and generate its md5:

md5sum hive-3.1.3.tar.gz | awk '{print $1}' > hive-3.1.3.tar.gz.md5
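Datasophon checks the package against this .md5 file when distributing it to workers, so a quick local sanity check avoids a confusing install failure (sketch):

cd /opt/datasophon/DDP/package
[ "$(md5sum hive-3.1.3.tar.gz | awk '{print $1}')" = "$(cat hive-3.1.3.tar.gz.md5)" ] \
  && echo "md5 OK" || echo "md5 mismatch"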

hive-3.1.3 reports at startup (bin/hiveserver2):

Error opening zip file or JAR manifest missing : bin/.../jmx/jmx_prometheus_javaagent-0.16.1.jar

Unable to determine Hadoop version information.

'hadoop version' returned:

Error opening zip file or JAR manifest missing : bin/.../jmx/jmx_prometheus_javaagent-0.16.1.jar

Error occurred during initialization of VM agent library failed to init: instrument

Solution:

1. Create a jmx directory under hive-3.1.3 and download jmx_prometheus_javaagent-0.16.1.jar into it.

2. In that directory, create hiveserver2.yaml and metastore.yaml with the following content:

hiveserver2.yaml

startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
rules:
  - pattern: ".*"

metastore.yaml

# Hive Metastore JMX Exporter Configuration
startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
rules:
  - pattern: ".*"
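For context, the error comes from the -javaagent flag the start script passes: it points at the agent jar plus one of these YAML files. The flag looks roughly like this (the listen ports here are illustrative, not from the article):

# HiveServer2
-javaagent:/opt/datasophon/hive-3.1.3/jmx/jmx_prometheus_javaagent-0.16.1.jar=9611:/opt/datasophon/hive-3.1.3/jmx/hiveserver2.yaml
# Hive MetaStore
-javaagent:/opt/datasophon/hive-3.1.3/jmx/jmx_prometheus_javaagent-0.16.1.jar=9606:/opt/datasophon/hive-3.1.3/jmx/metastore.yaml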

Schema initialization fails in the HA setup

This happens only in an HA setup: multiple nodes run the schema initialization at the same time (or one after the other) and fail. For example, my install assigned the Master roles as:

HiveServer2:ddp1,ddp2

HiveMetaStore:ddp1,ddp2

Both ddp1 and ddp2 failed the initialization. Since this fork already pulls in Redis, locking across nodes so that only one node runs the initialization is straightforward. The code involved:

RedisLockUtil.java

// RedisLockUtil.java - version with verbose logging
package com.datasophon.common.redis.utils;

import com.datasophon.common.redis.manager.RedisManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.UUID;

public class RedisLockUtil {

    private static final Logger logger = LoggerFactory.getLogger(RedisLockUtil.class);

    // Lua script: acquire the lock atomically
    private static final String ACQUIRE_LOCK_SCRIPT =
            "local lockKey = KEYS[1]\n" +
                    "local lockValue = ARGV[1]\n" +
                    "local timeout = ARGV[2]\n" +
                    "local result = redis.call('setnx', lockKey, lockValue)\n" +
                    "if result == 1 then\n" +
                    "    redis.call('expire', lockKey, timeout)\n" +
                    "    return 1\n" +
                    "else\n" +
                    "    local currentValue = redis.call('get', lockKey)\n" +
                    "    if currentValue == lockValue then\n" +
                    "        redis.call('expire', lockKey, timeout)\n" +
                    "        return 1\n" +
                    "    else\n" +
                    "        local ttl = redis.call('ttl', lockKey)\n" +
                    "        if ttl == -1 then\n" +
                    "            redis.call('expire', lockKey, timeout)\n" +
                    "        end\n" +
                    "        return 0\n" +
                    "    end\n" +
                    "end";

    // Lua script: release the lock atomically
    private static final String RELEASE_LOCK_SCRIPT =
            "local lockKey = KEYS[1]\n" +
                    "local lockValue = ARGV[1]\n" +
                    "local currentValue = redis.call('get', lockKey)\n" +
                    "if currentValue == lockValue then\n" +
                    "    return redis.call('del', lockKey)\n" +
                    "else\n" +
                    "    return 0\n" +
                    "end";

    /**
     * Try to acquire the distributed lock (with verbose logging).
     */
    public static String tryLock(String lockKey, long timeout, long waitTime) {
        String lockValue = UUID.randomUUID().toString();
        long startTime = System.currentTimeMillis();

        logger.info("🔐 Acquiring Redis lock: {}", lockKey);
        logger.info("  lock value: {}", lockValue);
        logger.info("  timeout: {}s, wait time: {}ms", timeout, waitTime);

        int attempt = 0;
        while (System.currentTimeMillis() - startTime < waitTime) {
            attempt++;
            logger.info("  lock attempt #{}", attempt);

            try {
                // run the Lua script through RedisManager to acquire the lock
                Long result = RedisManager.execute(jedis ->
                        (Long) jedis.eval(ACQUIRE_LOCK_SCRIPT,
                                1, lockKey, lockValue, String.valueOf(timeout))
                );

                logger.info("  Redis eval result: {}", result);

                if (result != null && result == 1) {
                    logger.info("✅ Lock acquired: {}, value: {}", lockKey, lockValue);
                    return lockValue;
                } else {
                    logger.info("⚠️ Failed to acquire lock; it may be held by another process");

                    // inspect the current lock state
                    try {
                        String currentHolder = RedisManager.execute(jedis -> jedis.get(lockKey));
                        long ttl = RedisManager.execute(jedis -> jedis.ttl(lockKey));
                        logger.info("  current lock holder: {}, TTL: {}s", currentHolder, ttl);
                    } catch (Exception e) {
                        logger.warn("  could not read lock state: {}", e.getMessage());
                    }
                }

                // back off and retry
                long sleepTime = 100L;
                logger.info("  retrying in {}ms...", sleepTime);
                Thread.sleep(sleepTime);

            } catch (Exception e) {
                logger.error("❌ Exception while acquiring lock", e);
                logger.error("  exception details: ", e);

                // a connection error usually points at the Redis service itself
                if (e.getMessage() != null &&
                        (e.getMessage().contains("Connection") ||
                                e.getMessage().contains("timeout"))) {
                    logger.error("  ⚠️ Possible Redis connection problem; check the Redis service");
                }

                try {
                    Thread.sleep(100);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }

        logger.warn("⏰ Could not acquire lock within {}ms: {}", waitTime, lockKey);
        return null;
    }

    /**
     * Release the distributed lock (with verbose logging).
     */
    public static boolean unlock(String lockKey, String lockValue) {
        logger.info("🔓 Releasing lock: {}", lockKey);
        logger.info("  expected lock value: {}", lockValue);

        if (lockValue == null) {
            logger.warn("⚠️ Lock value is null; nothing to release");
            return false;
        }

        try {
            // read the current lock value first
            String currentValue = RedisManager.execute(jedis -> jedis.get(lockKey));
            logger.info("  current lock value: {}", currentValue);

            Long result = RedisManager.execute(jedis ->
                    (Long) jedis.eval(RELEASE_LOCK_SCRIPT,
                            1, lockKey, lockValue)
            );

            boolean success = result != null && result == 1;
            if (success) {
                logger.info("✅ Lock released: {}", lockKey);
            } else {
                logger.warn("⚠️ Release failed, or we are not the lock holder");
                logger.info("  current value: {}, expected value: {}", currentValue, lockValue);
                if (currentValue != null && !currentValue.equals(lockValue)) {
                    logger.warn("  ⚠️ Lock value mismatch; the lock may have expired or been taken by another process");
                }
            }
            return success;

        } catch (Exception e) {
            logger.error("❌ Exception while releasing lock: {}", lockKey, e);
            logger.error("  exception stack: ", e);
            return false;
        }
    }

    /**
     * Run an operation under the lock (with verbose logging).
     */
    public static <T> T executeWithLock(String lockKey, long lockTimeout, long waitTime,
                                        LockOperation<T> operation) throws Exception {
        logger.info("🔄 =========================================");
        logger.info("🔄 Starting locked operation");
        logger.info("🔄 lock key: {}", lockKey);
        logger.info("🔄 lock timeout: {}s", lockTimeout);
        logger.info("🔄 wait time: {}ms", waitTime);
        logger.info("🔄 =========================================");

        String lockValue = null;
        try {
            // 1. acquire the lock
            logger.info("1️⃣ Phase: acquire lock");
            lockValue = tryLock(lockKey, lockTimeout, waitTime);

            if (lockValue == null) {
                logger.error("❌ Could not acquire lock; aborting");
                throw new RuntimeException("Failed to acquire lock for: " + lockKey);
            }

            logger.info("✅ Lock acquired; running the operation");

            // 2. run the actual operation
            logger.info("2️⃣ Phase: run operation");
            T result = operation.execute();
            logger.info("✅ Operation finished, result: {}", result);

            return result;

        } catch (Exception e) {
            logger.error("❌ Exception during locked operation", e);
            logger.error("  operation type: {}", operation.getClass().getSimpleName());
            logger.error("  exception details: ", e);
            throw e;

        } finally {
            // 3. release the lock
            logger.info("3️⃣ Phase: release lock");
            if (lockValue != null) {
                boolean released = unlock(lockKey, lockValue);
                if (released) {
                    logger.info("✅ Lock released");
                } else {
                    logger.warn("⚠️ Lock release failed; manual cleanup may be needed");
                }
            } else {
                logger.info("ℹ️ No lock to release (acquisition never succeeded)");
            }

            logger.info("🔄 =========================================");
            logger.info("🔄 Locked operation complete");
            logger.info("🔄 =========================================");
        }
    }

    @FunctionalInterface
    public interface LockOperation<T> {
        T execute() throws Exception;
    }

    /**
     * Test the Redis connection.
     */
    public static boolean testRedisConnection() {
        logger.info("🛠️ Testing Redis connection...");

        try {
            String result = RedisManager.execute(jedis -> jedis.ping());
            boolean connected = "PONG".equals(result);

            if (connected) {
                logger.info("✅ Redis connection OK: {}", result);
            } else {
                logger.warn("⚠️ Unexpected Redis reply: {}", result);
            }

            return connected;

        } catch (Exception e) {
            logger.error("❌ Redis connection test failed", e);
            return false;
        }
    }

    /**
     * Inspect the lock's current state.
     */
    public static void checkLockStatus(String lockKey) {
        logger.info("🔍 Checking lock state: {}", lockKey);

        try {
            String value = RedisManager.execute(jedis -> jedis.get(lockKey));
            long ttl = RedisManager.execute(jedis -> jedis.ttl(lockKey));

            if (value == null) {
                logger.info("  lock does not exist or has expired");
            } else {
                logger.info("  lock value: {}", value);
                logger.info("  TTL: {}s", ttl);

                if (ttl == -1) {
                    logger.warn("  ⚠️ Lock has no expiry (never expires)");
                } else if (ttl == -2) {
                    logger.info("  lock has expired but is not yet deleted");
                } else if (ttl < 60) {
                    logger.info("  lock expires in {}s", ttl);
                }
            }

        } catch (Exception e) {
            logger.error("❌ Failed to check lock state", e);
        }
    }
}

This file is new.
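For reference, a minimal usage sketch of the utility (the LockDemo wrapper class is hypothetical; the key and timings mirror HiveSchemaService below):

import com.datasophon.common.redis.utils.RedisLockUtil;

public class LockDemo {
    public static void main(String[] args) throws Exception {
        Boolean done = RedisLockUtil.executeWithLock(
                "datasophon:hive:schema:init:lock", // lock key shared by all nodes
                300,                                // lock TTL, seconds
                30000,                              // max wait for the lock, milliseconds
                () -> {
                    // critical section: at most one node runs this at a time
                    return Boolean.TRUE;
                });
        System.out.println("executed under lock: " + done);
    }
}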

HiveSchemaService.java

// HiveSchemaService.java - version with compile fixes
package com.datasophon.worker.service;

import com.datasophon.common.Constants;
import com.datasophon.common.redis.utils.RedisLockUtil;
import com.datasophon.common.utils.ExecResult;
import com.datasophon.common.utils.ShellUtils;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

public class HiveSchemaService {

    private static final Logger logger = LoggerFactory.getLogger(HiveSchemaService.class);

    // Redis lock settings
    private static final String SCHEMA_LOCK_KEY = "datasophon:hive:schema:init:lock";
    private static final long LOCK_TIMEOUT = 300; // lock TTL (seconds)
    private static final long LOCK_WAIT_TIME = 30000; // max wait for the lock (milliseconds)

    /**
     * Initialize the Hive schema (with verbose logging).
     */
    public static boolean initSchemaWithLock(String installPath, String packageName) {
        logger.info("🚀 =========================================");
        logger.info("🚀 Hive schema initialization starting");
        logger.info("🚀 install path: {}", installPath);
        logger.info("🚀 package name: {}", packageName);
        logger.info("🚀 Redis lock key: {}", SCHEMA_LOCK_KEY);
        logger.info("🚀 =========================================");

        try {
            // 1. test the Redis connection first
            logger.info("1️⃣ Testing Redis connection...");
            boolean redisConnected = RedisLockUtil.testRedisConnection();

            if (!redisConnected) {
                logger.warn("⚠️ Redis connection test failed; falling back to direct initialization");
                return initSchemaDirect(installPath, packageName);
            }

            // 2. check whether the schema is already initialized
            logger.info("2️⃣ Checking whether the schema is already initialized...");
            if (isSchemaInitialized(installPath, packageName)) {
                logger.info("✅ Schema already initialized; skipping");
                return true;
            }

            // 3. initialize under the Redis lock
            logger.info("3️⃣ Initializing the schema under the Redis lock...");
            boolean result = RedisLockUtil.executeWithLock(
                    SCHEMA_LOCK_KEY,
                    LOCK_TIMEOUT,
                    LOCK_WAIT_TIME,
                    () -> {
                        logger.info("🔒 Inside the critical section; starting schema initialization");

                        // double-check inside the lock
                        if (isSchemaInitialized(installPath, packageName)) {
                            logger.info("✅ Another node initialized the schema while we were waiting for the lock");
                            return true;
                        }

                        return initSchemaDirect(installPath, packageName);
                    }
            );

            logger.info("🎉 Schema initialization result: {}", result);
            return result;

        } catch (Exception e) {
            logger.error("❌ Exception during schema initialization", e);
            logger.error("  exception stack: ", e);

            // degrade on failure: try initializing without the lock
            logger.warn("🔄 Trying direct initialization (bypassing the Redis lock)...");
            try {
                boolean directResult = initSchemaDirect(installPath, packageName);
                logger.info("direct initialization result: {}", directResult);
                return directResult;
            } catch (Exception ex) {
                logger.error("❌ Direct initialization failed as well", ex);
                return false;
            }

        } finally {
            logger.info("🚀 =========================================");
            logger.info("🚀 Hive schema initialization finished");
            logger.info("🚀 =========================================");
        }
    }

    /**
     * Initialize the schema directly (without the Redis lock).
     */
    public static boolean initSchemaDirect(String installPath, String packageName) {
        logger.info("🔧 Starting direct (lock-free) schema initialization");

        String workDir = installPath + Constants.SLASH + packageName;
        logger.info("  working directory: {}", workDir);

        try {
            // 1. make sure schematool exists
            logger.info("  checking schematool...");
            ArrayList<String> checkCmd = new ArrayList<>();
            checkCmd.add("bin/schematool");
            checkCmd.add("-help");

            // Fix 1: call execWithStatus the same way everywhere
            ExecResult checkResult = ShellUtils.execWithStatus(workDir, checkCmd, 10L);
            if (!checkResult.getExecResult()) {
                logger.error("❌ schematool is missing or not executable");
                return false;
            }

            // 2. run the initialization
            logger.info("  running schema initialization...");
            ArrayList<String> commands = new ArrayList<>();
            commands.add("bin/schematool");
            commands.add("-dbType");
            commands.add("mysql");
            commands.add("-initSchema");

            // environment variables (currently only logged; wire them through if ShellUtils supports it)
            Map<String, String> env = new HashMap<>();
            env.put("HIVE_HOME", workDir);
            String hadoopHome = System.getenv("HADOOP_HOME");
            if (hadoopHome != null) {
                env.put("HADOOP_HOME", hadoopHome);
                env.put("PATH", workDir + "/bin:" + hadoopHome + "/bin:" + System.getenv("PATH"));
            }

            logger.info("  command: {}", String.join(" ", commands));
            logger.info("  environment: {}", env);

            // Fix 2: argument order matches ShellUtils' actual signature,
            // assumed here to be execWithStatus(workDir, commands, timeoutSeconds, logger)
            ExecResult execResult = ShellUtils.execWithStatus(workDir, commands, 300L, logger);

            if (execResult.getExecResult()) {
                logger.info("✅ Schema initialized successfully!");
                if (execResult.getExecOut() != null) {
                    logger.debug("output: {}", execResult.getExecOut());
                }
                return true;
            } else {
                logger.error("❌ Schema initialization failed");
                if (execResult.getExecErrOut() != null) {
                    logger.error("error output: {}", execResult.getExecErrOut());
                }
                return false;
            }

        } catch (Exception e) {
            logger.error("🔥 Exception in schema initialization", e);
            return false;
        }
    }

    /**
     * Check whether the schema has already been initialized.
     */
    public static boolean isSchemaInitialized(String installPath, String packageName) {
        try {
            ArrayList<String> commands = new ArrayList<>();
            commands.add("bin/schematool");
            commands.add("-dbType");
            commands.add("mysql");
            commands.add("-info");

            String workDir = installPath + Constants.SLASH + packageName;

            // Fix 3: same call shape as above
            ExecResult execResult = ShellUtils.execWithStatus(workDir, commands, 30L);

            if (execResult.getExecResult()) {
                String output = execResult.getExecOut();
                boolean initialized = output != null && output.contains("Verification completed");
                logger.info("schema check: {}", initialized ? "initialized" : "not initialized");
                return initialized;
            }

            logger.warn("schema check failed");
            return false;

        } catch (Exception e) {
            logger.warn("exception while checking schema state", e);
            return false;
        }
    }
}

This file is also new.

HiveServer2HandlerStrategy.java

package com.datasophon.worker.strategy;

import cn.hutool.core.io.FileUtil;
import com.datasophon.common.Constants;
import com.datasophon.common.cache.CacheUtils;
import com.datasophon.common.command.ServiceRoleOperateCommand;
import com.datasophon.common.enums.CommandType;
import com.datasophon.common.utils.ExecResult;
import com.datasophon.common.utils.PropertyUtils;
import com.datasophon.common.utils.ShellUtils;
import com.datasophon.worker.handler.ServiceHandler;
import com.datasophon.worker.service.HiveSchemaService;
import com.datasophon.worker.utils.KerberosUtils;

import java.util.ArrayList;

public class HiveServer2HandlerStrategy extends AbstractHandlerStrategy implements ServiceRoleStrategy {

    public HiveServer2HandlerStrategy(String serviceName, String serviceRoleName) {
        super(serviceName, serviceRoleName);
    }

    @Override
    public ExecResult handler(ServiceRoleOperateCommand command) {
        ExecResult startResult = new ExecResult();
        final String workPath = Constants.INSTALL_PATH + Constants.SLASH + command.getDecompressPackageName();
        ServiceHandler serviceHandler = new ServiceHandler(command.getServiceName(), command.getServiceRoleName());

        if (command.getEnableRangerPlugin()) {
            logger.info("start to enable hive hdfs plugin");
            ArrayList<String> commands = new ArrayList<>();
            commands.add("sh");
            commands.add("./enable-hive-plugin.sh");
            if (!FileUtil.exist(Constants.INSTALL_PATH + Constants.SLASH + command.getDecompressPackageName()
                    + "/ranger-hive-plugin/success.id")) {
                ExecResult execResult = ShellUtils.execWithStatus(Constants.INSTALL_PATH + Constants.SLASH
                        + command.getDecompressPackageName() + "/ranger-hive-plugin", commands, 30L, logger);
                if (execResult.getExecResult()) {
                    logger.info("enable ranger hive plugin success");
                    FileUtil.writeUtf8String("success", Constants.INSTALL_PATH + Constants.SLASH
                            + command.getDecompressPackageName() + "/ranger-hive-plugin/success.id");
                } else {
                    logger.info("enable ranger hive plugin failed");
                    return execResult;
                }
            }
        }

        logger.info("command is slave : {}", command.isSlave());

        if (command.getCommandType().equals(CommandType.INSTALL_SERVICE) && !command.isSlave()) {
            // initialize the Hive database under a distributed lock
            logger.info("start to init hive schema with distributed lock");

            // check whether it is already initialized
            if (HiveSchemaService.isSchemaInitialized(
                    Constants.INSTALL_PATH,
                    command.getDecompressPackageName())) {
                logger.info("hive schema already initialized, skipping...");
            } else {
                // initialize under the Redis distributed lock
                boolean initSuccess = HiveSchemaService.initSchemaWithLock(
                        Constants.INSTALL_PATH,
                        command.getDecompressPackageName());

                if (!initSuccess) {
                    logger.info("init hive schema failed or was executed by another node");
                    // do not fail here: another node may be doing the initialization
                    // log it and keep going
                } else {
                    logger.info("init hive schema success");
                }
            }
        }

        if (command.getEnableKerberos()) {
            logger.info("start to get hive keytab file");
            String hostname = CacheUtils.getString(Constants.HOSTNAME);
            KerberosUtils.createKeytabDir();
            if (!FileUtil.exist("/etc/security/keytab/hive.service.keytab")) {
                KerberosUtils.downloadKeytabFromMaster("hive/" + hostname, "hive.service.keytab");
            }
        }

        if (command.getCommandType().equals(CommandType.INSTALL_SERVICE)) {
            String hadoopHome = PropertyUtils.getString("HADOOP_HOME");
            ShellUtils.exceShell("sudo -u hdfs " + hadoopHome + "/bin/hdfs dfs -mkdir -p /user/hive/warehouse");
            ShellUtils.exceShell("sudo -u hdfs " + hadoopHome + "/bin/hdfs dfs -mkdir -p /tmp/hive");
            ShellUtils
                    .exceShell("sudo -u hdfs " + hadoopHome + "/bin/hdfs dfs -chown hive:hadoop /user/hive/warehouse");
            ShellUtils.exceShell("sudo -u hdfs " + hadoopHome + "/bin/hdfs dfs -chown hive:hadoop /tmp/hive");
            ShellUtils.exceShell("sudo -u hdfs " + hadoopHome + "/bin/hdfs dfs -chmod 777 /tmp/hive");

            // if tez is installed, create a symlink to its tez-site.xml
            final String tezHomePath = Constants.INSTALL_PATH + Constants.SLASH + "tez";
            if (FileUtil.exist(tezHomePath)) {
                ShellUtils.exceShell("ln -s " + tezHomePath + "/conf/tez-site.xml " + workPath + "/conf/tez-site.xml");
            }
        }

        startResult = serviceHandler.start(command.getStartRunner(), command.getStatusRunner(),
                command.getDecompressPackageName(), command.getRunAs());
        return startResult;
    }
}

The main changes are these lines:

if (command.getCommandType().equals(CommandType.INSTALL_SERVICE) && !command.isSlave()) {
            // initialize the Hive database under a distributed lock
            logger.info("start to init hive schema with distributed lock");

            // check whether it is already initialized
            if (HiveSchemaService.isSchemaInitialized(
                    Constants.INSTALL_PATH,
                    command.getDecompressPackageName())) {
                logger.info("hive schema already initialized, skipping...");
            } else {
                // initialize under the Redis distributed lock
                boolean initSuccess = HiveSchemaService.initSchemaWithLock(
                        Constants.INSTALL_PATH,
                        command.getDecompressPackageName());

                if (!initSuccess) {
                    logger.info("init hive schema failed or was executed by another node");
                    // do not fail here: another node may be doing the initialization
                    // log it and keep going
                } else {
                    logger.info("init hive schema success");
                }
            }

Hive schema initialization leaves the database empty

Solution: even with the Redis lock in place, the Hive database stayed empty. Running the initialization by hand revealed the cause: the MySQL 8 JDBC driver was missing, even though the configured driver class had already been switched to the MySQL 8 one. Download the JDBC driver into hive-3.1.3/lib, re-pack the tarball, and reinstall.
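A sketch of the fix (the connector version is illustrative; use whatever matches your MySQL 8 server; the MySQL 8 driver class is com.mysql.cj.jdbc.Driver):

cd /tmp
wget https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.0.33/mysql-connector-j-8.0.33.jar
cp mysql-connector-j-8.0.33.jar xxxx/hive-3.1.3/lib/   # the same extracted tree that gets re-tarred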

Schema initialization fails on ddp2

Solution: after adding the driver, ddp1 initialized successfully but ddp2 still failed, which puzzled me. Running the init command manually on ddp2 showed that the hive user had never been created. The fix was to go back to the HDFS install step that creates the hadoop users, add a hive user on top of the existing ones, rebuild and package datasophon-worker, then remove YARN, HDFS, and HIVE and reinstall from scratch.
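Before rebuilding everything, the root cause is easy to confirm on each node (a quick check):

id hive || echo "hive user missing on $(hostname)"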

The code is as follows:

package com.datasophon.worker.service;

import java.io.*;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class CreateHadoopUsers {
    private static final DateTimeFormatter dtf = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public static void main(String[] args) {
        System.out.println("=== Creating Hadoop users (complete configuration) ===");

        try {
            // optional: clean up any existing users/groups first
            // cleanupExisting();

            // create all required groups
            createGroup("hadoop");
            createGroup("hdfs");
            createGroup("yarn");
            createGroup("mapred");

            // create the users, correctly configured in one pass
            createHadoopUser("hdfs", "hadoop", "hdfs");
            createHadoopUser("yarn", "hadoop", "yarn");
            createHadoopUser("mapred", "hadoop", "mapred");

            // new: create the hive user
            createHadoopUser("hive", "hadoop", "hdfs");  // primary group hadoop, supplementary group hdfs (for HDFS access)

            // detailed verification
            verifyCompleteSetup();

            // show the final summary
            showFinalSummary();

            log("=== Hadoop user creation finished ===");

        } catch (Exception e) {
            logError("Program error: " + e.getMessage());
            e.printStackTrace();
        }
    }

    private static void log(String message) {
        System.out.println("[" + LocalDateTime.now().format(dtf) + "] " + message);
    }

    private static void logError(String message) {
        System.err.println("[" + LocalDateTime.now().format(dtf) + "] ERROR: " + message);
    }

    private static boolean executeCommand(List<String> command) throws Exception {
        log("Running: " + String.join(" ", command));

        ProcessBuilder pb = new ProcessBuilder(command);
        Process process = pb.start();

        BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
        BufferedReader errorReader = new BufferedReader(new InputStreamReader(process.getErrorStream()));

        StringBuilder output = new StringBuilder();
        StringBuilder errorOutput = new StringBuilder();

        String line;
        while ((line = reader.readLine()) != null) {
            output.append(line).append("\n");
        }
        while ((line = errorReader.readLine()) != null) {
            errorOutput.append(line).append("\n");
        }

        int exitCode = process.waitFor();

        if (!output.toString().isEmpty()) {
            log("stdout: " + output.toString().trim());
        }
        if (!errorOutput.toString().isEmpty()) {
            log("stderr: " + errorOutput.toString().trim());
        }

        return exitCode == 0;
    }

    private static void cleanupExisting() throws Exception {
        log("=== Cleaning up existing configuration ===");

        // delete users (including hive)
        String[] users = {"hdfs", "yarn", "mapred", "hive"};
        for (String user : users) {
            log("Removing user: " + user);
            List<String> command = new ArrayList<>();
            command.add("userdel");
            command.add("-r");
            command.add("-f");
            command.add(user);

            executeCommand(command);
        }

        // delete groups (they are recreated below)
        String[] groups = {"hdfs", "yarn", "mapred", "hadoop"};
        for (String group : groups) {
            log("Removing group: " + group);
            List<String> command = new ArrayList<>();
            command.add("groupdel");
            command.add(group);

            executeCommand(command);
        }
    }

    private static void createGroup(String groupName) throws Exception {
        log("Creating group: " + groupName);
        List<String> command = new ArrayList<>();
        command.add("groupadd");
        command.add(groupName);

        if (executeCommand(command)) {
            log("✓ Group created: " + groupName);
        } else {
            log("note: group " + groupName + " may already exist");
        }
    }

    private static void createHadoopUser(String userName, String primaryGroup, String secondaryGroup) throws Exception {
        log("Creating user: " + userName + " (primary group: " + primaryGroup + ", supplementary group: " + secondaryGroup + ")");

        List<String> command = new ArrayList<>();
        command.add("useradd");
        command.add("-r");           // system account
        command.add("-m");           // create a home directory
        command.add("-s");           // login shell
        command.add("/bin/bash");
        command.add("-g");           // primary group
        command.add(primaryGroup);
        command.add("-G");           // supplementary groups (may be several)
        command.add(secondaryGroup);
        command.add(userName);

        if (executeCommand(command)) {
            log("✓ User created: " + userName);
            showUserDetails(userName);
        } else {
            // if the user already exists, delete it and retry
            log("User may already exist; deleting and recreating");

            // remove the existing user
            List<String> delCommand = new ArrayList<>();
            delCommand.add("userdel");
            delCommand.add("-r");
            delCommand.add("-f");
            delCommand.add(userName);
            executeCommand(delCommand);

            // create again
            if (executeCommand(command)) {
                log("✓ User recreated: " + userName);
                showUserDetails(userName);
            } else {
                logError("Failed to create user: " + userName);
            }
        }
    }

    private static void showUserDetails(String userName) throws Exception {
        List<String> command = new ArrayList<>();
        command.add("id");
        command.add(userName);

        ProcessBuilder pb = new ProcessBuilder(command);
        Process process = pb.start();
        BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));

        String info = reader.readLine();
        process.waitFor();

        if (info != null) {
            log("User info: " + info);
        }

        // show the home directory
        command.clear();
        command.add("ls");
        command.add("-ld");
        command.add("/home/" + userName);

        pb = new ProcessBuilder(command);
        process = pb.start();
        reader = new BufferedReader(new InputStreamReader(process.getInputStream()));

        String homeDir = reader.readLine();
        process.waitFor();

        if (homeDir != null) {
            log("Home directory: " + homeDir);
        }
    }

    private static void verifyCompleteSetup() throws Exception {
        log("=== Full configuration check ===");

        // updated: includes the hive user
        String[][] userConfigs = {
                {"hdfs", "hadoop", "hdfs"},
                {"yarn", "hadoop", "yarn"},
                {"mapred", "hadoop", "mapred"},
                {"hive", "hadoop", "hdfs"}  // verify the hive user
        };

        boolean allCorrect = true;

        for (String[] config : userConfigs) {
            String userName = config[0];
            String expectedPrimary = config[1];
            String expectedSecondary = config[2];

            log("Checking user: " + userName);

            // does the user exist?
            List<String> checkCmd = new ArrayList<>();
            checkCmd.add("id");
            checkCmd.add(userName);

            ProcessBuilder pb = new ProcessBuilder(checkCmd);
            Process process = pb.start();
            int exitCode = process.waitFor();

            if (exitCode != 0) {
                logError("✗ User does not exist: " + userName);
                allCorrect = false;
                continue;
            }

            // fetch group details
            List<String> groupCmd = new ArrayList<>();
            groupCmd.add("id");
            groupCmd.add("-gn");
            groupCmd.add(userName);

            pb = new ProcessBuilder(groupCmd);
            process = pb.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String primaryGroup = reader.readLine();
            process.waitFor();

            groupCmd.set(1, "-Gn");
            pb = new ProcessBuilder(groupCmd);
            process = pb.start();
            reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String allGroups = reader.readLine();
            process.waitFor();

            // verify
            boolean primaryOk = expectedPrimary.equals(primaryGroup);
            boolean secondaryOk = allGroups != null && allGroups.contains(expectedSecondary);

            if (primaryOk && secondaryOk) {
                log("✓ Correct: primary group=" + primaryGroup + ", contains group=" + expectedSecondary);
            } else {
                logError("✗ Misconfigured:");
                if (!primaryOk) {
                    logError("   expected primary group: " + expectedPrimary + ", actual: " + primaryGroup);
                }
                if (!secondaryOk) {
                    logError("   expected to contain group: " + expectedSecondary + ", actual groups: " + allGroups);
                }
                allCorrect = false;
            }
        }

        // verify all groups
        log("Checking all groups...");
        String[] groups = {"hadoop", "hdfs", "yarn", "mapred"};
        for (String group : groups) {
            List<String> cmd = new ArrayList<>();
            cmd.add("getent");
            cmd.add("group");
            cmd.add(group);

            ProcessBuilder pb = new ProcessBuilder(cmd);
            Process process = pb.start();
            int exitCode = process.waitFor();

            if (exitCode == 0) {
                log("✓ Group exists: " + group);
            } else {
                logError("✗ Group missing: " + group);
                allCorrect = false;
            }
        }

        if (allCorrect) {
            log("✓ All checks passed");
        } else {
            logError("Configuration problems found; see the errors above");
        }
    }

    private static void showFinalSummary() throws Exception {
        log("=== Final configuration summary ===");

        // users
        log("Users:");
        String[] users = {"hdfs", "yarn", "mapred", "hive"};  // includes hive
        for (String user : users) {
            List<String> cmd = new ArrayList<>();
            cmd.add("id");
            cmd.add(user);

            ProcessBuilder pb = new ProcessBuilder(cmd);
            Process process = pb.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String info = reader.readLine();
            process.waitFor();

            if (info != null) {
                log("  " + info);
            }
        }

        // groups
        log("Groups:");
        String[] groups = {"hadoop", "hdfs", "yarn", "mapred"};
        for (String group : groups) {
            List<String> cmd = new ArrayList<>();
            cmd.add("getent");
            cmd.add("group");
            cmd.add(group);

            ProcessBuilder pb = new ProcessBuilder(cmd);
            Process process = pb.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String info = reader.readLine();
            process.waitFor();

            if (info != null) {
                log("  " + info);
            }
        }

        // home directories
        log("Home directories:");
        for (String user : users) {
            List<String> cmd = new ArrayList<>();
            cmd.add("ls");
            cmd.add("-ld");
            cmd.add("/home/" + user);

            ProcessBuilder pb = new ProcessBuilder(cmd);
            Process process = pb.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String info = reader.readLine();
            process.waitFor();

            if (info != null) {
                log("  " + info);
            }
        }

        // extra: show permissions on key directories
        // note: /user/hive/warehouse and /tmp/hive are HDFS paths, so a local `ls`
        // only finds them if they happen to exist on the local filesystem as well
        log("Key directory permissions:");
        String[] importantDirs = {
                "/opt/datasophon/hive-3.1.3",
                "/opt/datasophon/hadoop-3.3.3",
                "/user/hive/warehouse",
                "/tmp/hive"
        };

        for (String dir : importantDirs) {
            List<String> cmd = new ArrayList<>();
            cmd.add("ls");
            cmd.add("-ld");
            cmd.add(dir);

            ProcessBuilder pb = new ProcessBuilder(cmd);
            Process process = pb.start();
            BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
            String info = reader.readLine();
            process.waitFor();

            if (info != null) {
                log("  " + info);
            } else {
                log("  directory does not exist: " + dir);
            }
        }
    }
}

hiveserver2 and hivemetastore start fine on ddp1, but fail to start on ddp2

I assumed the previous issue was the last one, but both services then failed to start on ddp2. Tracing it showed that the status() function in control_hive.sh checked for a live process incorrectly (it should match the script used by the earlier HDFS install). The fix:

status(){
  if [ -f $pid ]; then
    TARGET_PID=`cat $pid`
    kill -0 $TARGET_PID > /dev/null 2>&1
    if [ $? -eq 0 ]
    then
      echo "$command is running"
    else
      echo "$command is not running"
      exit 1
    fi
  else
    echo "$command pid file does not exist"
    exit 1
  fi
}
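The fix can be sanity-checked by hand on ddp2. The invocation below assumes the usual DDP control-script convention of "action role" arguments, so adjust it to how control_hive.sh actually parses its arguments:

cd /opt/datasophon/hive-3.1.3
sh control_hive.sh status hiveserver2; echo "exit=$?"
sh control_hive.sh status metastore; echo "exit=$?"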

Closing thoughts

Before installing Hive I did not expect much trouble; instead it turned everything upside down, one problem after another. Fortunately I stuck with it. For contact: lita2lz
