BE nodes keep going down: [IO_ERROR]failed to list /proc/27349/fd/: No such file or directory

Recently the BE nodes have been going down frequently. The following exception is reported:

Caused by: java.lang.RuntimeException: Failed to execute internal SQL. org.apache.doris.common.UserException: errCode = 2, detailMessage = There is no scanNode Backend available.[10031: not alive] OriginStatement{originStmt='SELECT * FROM __internal_schema.column_statistics WHERE tbl_id=27273 AND idx_id=-1 AND col_id='CREATE_AID'', idx=0}
        at org.apache.doris.qe.StmtExecutor.executeInternalQuery(StmtExecutor.java:2509)
        at org.apache.doris.statistics.util.StatisticsUtil.execStatisticQuery(StatisticsUtil.java:131)
        at org.apache.doris.statistics.StatisticsRepository.loadColStats(StatisticsRepository.java:439)
        at org.apache.doris.statistics.ColumnStatisticsCacheLoader.loadFromStatsTable(ColumnStatisticsCacheLoader.java:56)
        at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:38)
        at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:31)
        at org.apache.doris.statistics.StatisticsCacheLoader.lambda$asyncLoad$0(StatisticsCacheLoader.java:48)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
        ... 3 more
Caused by: org.apache.doris.common.UserException: errCode = 2, detailMessage = There is no scanNode Backend available.[10031: not alive]
        at org.apache.doris.qe.SimpleScheduler.getHost(SimpleScheduler.java:147)
        at org.apache.doris.qe.Coordinator.computeFragmentHosts(Coordinator.java:1806)
        at org.apache.doris.qe.Coordinator.computeFragmentExecParams(Coordinator.java:1267)
        at org.apache.doris.qe.Coordinator.exec(Coordinator.java:573)
        at org.apache.doris.qe.StmtExecutor.executeInternalQuery(StmtExecutor.java:2505)
        ... 10 more

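be.out does not show any useful logs. Checking be.WARNING turned up the following error. I do not know how to fix it yet, so I am recording the problem here for now.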

[IO_ERROR]failed to list /proc/27349/fd/: (2), No such file or directory
W1121 09:36:26.929662 27477 doris_metrics.cpp:379] failed to count fd: [IO_ERROR]failed to list /proc/27349/fd/: (2), No such file or directory
0. /root/src/doris-2.0/be/src/common/stack_trace.cpp:302: StackTrace::tryCapture() @ 0x000000000b9e64c7 in /xxsys/doris-2.0.2/be/lib/doris_be
1. /root/src/doris-2.0/be/src/common/stack_trace.h:0: doris::get_stack_trace[abi:cxx11]() @ 0x000000000b9e4ae5 in /xxsys/doris-2.0.2/be/lib/doris_be
2. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173: doris::Status doris::Status::Error, std::allocator > const&, std::__cxx11::basic_string, std::allocator > >(int, std::basic_string_view >, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator >&&) @ 0x000000000aecc168 in /xxsys/doris-2.0.2/be/lib/doris_be
3. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187: doris::io::LocalFileSystem::list_impl(std::filesystem::__cxx11::path const&, bool, std::vector >*, bool*) @ 0x000000000aec6eac in /xxsys/doris-2.0.2/be/lib/doris_be
4. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:360: doris::io::LocalFileSystem::iterate_directory_impl(std::__cxx11::basic_string, std::allocator > const&, std::function const&) @ 0x000000000aec7fcf in /xxsys/doris-2.0.2/be/lib/doris_be
5. /root/src/doris-2.0/be/src/common/status.h:348: doris::io::LocalFileSystem::iterate_directory(std::__cxx11::basic_string, std::allocator > const&, std::function const&) @ 0x000000000aec7e4d in /xxsys/doris-2.0.2/be/lib/doris_be
6. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:244: doris::DorisMetrics::_update_process_fd_num() @ 0x000000000b97a65a in /xxsys/doris-2.0.2/be/lib/doris_be
7. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_tree.h:368: doris::MetricRegistry::trigger_all_hooks(bool) const @ 0x000000000b9ba69f in /xxsys/doris-2.0.2/be/lib/doris_be
8. /root/src/doris-2.0/be/src/util/time.h:50: doris::Daemon::calculate_metrics_thread() @ 0x000000000ae9cc0c in /xxsys/doris-2.0.2/be/lib/doris_be
9. /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562: doris::Thread::supervise_thread(void*) @ 0x000000000ba1819a in /xxsys/doris-2.0.2/be/lib/doris_be
10. start_thread @ 0x00007f2f98172aa1 in ?
11. __clone @ 0x00007f2f988f8c4d in ?
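Judging by frames 3 to 6 and the "failed to count fd" message, the warning comes from the metrics thread listing /proc/<pid>/fd to count the process's open file descriptors. The same kind of check can be reproduced by hand with a small Python sketch. This is not Doris's code: the function name count_open_fds is hypothetical, and it assumes a Linux host with /proc mounted.

```python
import os
from typing import Optional


def count_open_fds(pid: int) -> Optional[int]:
    """Count the entries under /proc/<pid>/fd (hypothetical helper, Linux only).

    Returns None when the directory does not exist, which is the
    errno 2 (No such file or directory) condition reported in be.WARNING.
    """
    fd_dir = f"/proc/{pid}/fd"
    try:
        # Each entry in this directory is one open file descriptor.
        return len(os.listdir(fd_dir))
    except FileNotFoundError:
        # No process with this PID is visible under /proc.
        return None


if __name__ == "__main__":
    # Example: inspect the current process; replace with the BE's PID as needed.
    print(count_open_fds(os.getpid()))
```

Running it against the PID from the log (27349 here) while the BE is up shows the current fd count. A None result means /proc has no entry for that PID at that moment, which by itself does not explain why the BE went down.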
