Data Governance: Installing and Deploying DataHub

DataHub provides data catalog management, data governance, data lineage tracking, and dataset profiling.

GitHub: https://github.com/datahub-project/datahub

Official docs: https://datahubproject.io/docs/

DataHub module overview: https://www.yii666.com/blog/465017.html

Environment

This guide targets CentOS 7 on x86.

Installing Python 3

Python 3 requires OpenSSL 1.1.1 or newer.

Error encountered: ImportError: urllib3 v2.0 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'OpenSSL 1.0.2k-fips 26 Jan 2017'. See: https://github.com/urllib3/urllib3/issues/2168

Workaround: downgrade urllib3, then confirm which OpenSSL the ssl module was built against:

```bash
python3 -m pip install urllib3==1.26.6
python3 -c "import ssl; print(ssl.OPENSSL_VERSION)"
```

Upgrading OpenSSL

1. Download the OpenSSL source package from https://www.openssl.org/source/.
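For example, a sketch assuming the 1.1.1n release that the version check in step 7 reports (use whichever 1.1.1+ tarball you actually downloaded; the exact URL depends on the release you pick from the download page):

```bash
wget https://www.openssl.org/source/openssl-1.1.1n.tar.gz
tar -zxvf openssl-1.1.1n.tar.gz
cd openssl-1.1.1n
```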

2. In the extracted source directory, configure the build with the install prefix set to /usr/local/openssl:

```bash
./config --prefix=/usr/local/openssl enable-ssl3 shared
```

3. Build and install. This takes a while; install any missing build dependencies (such as gcc) first:

```bash
yum -y install gcc gcc-c++ autoconf pcre pcre-devel make automake
make && make install
```

4. Rename the stock openssl binary, header directory, and shared library (these are the default CentOS 7 paths; adjust for your machine):

```bash
mv /usr/bin/openssl /usr/bin/openssl.old
mv /usr/include/openssl/ /usr/include/openssl.old
mv /usr/lib64/libssl.so /usr/lib64/libssl.so.old
```

5. Point those locations at the newly installed OpenSSL by creating symlinks:

```bash
ln -s /usr/local/openssl/bin/openssl /usr/bin/openssl
ln -s /usr/local/openssl/include/openssl /usr/include/openssl
ln -s /usr/local/openssl/lib/libssl.so /usr/lib64/libssl.so
```

6. Add the OpenSSL library path to /etc/ld.so.conf and reload the dynamic linker cache so the change takes effect:

```bash
echo "/usr/local/openssl/lib" >> /etc/ld.so.conf
/usr/sbin/ldconfig -v
```

7. Check the version again; it should now report 1.1.1n, confirming the upgrade:

```bash
openssl version -a
```

Rebuilding Python 3

1. Install the system build dependencies:

```bash
yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel libffi-devel
```

2. Download and extract the source package:

```bash
wget https://www.python.org/ftp/python/3.8.3/Python-3.8.3.tgz
tar -zxvf Python-3.8.3.tgz
```

3. Configure the build, setting the install prefix and pointing it at the newly built OpenSSL:

```bash
./configure --prefix=/usr/local/python3 --with-openssl=/usr/local/openssl
```

4. Edit the Modules/Setup file.

Uncomment the SSL block and change the SSL variable to the custom OpenSSL install path.
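For reference, in Python 3.8 this block in Modules/Setup ships commented out; after editing it should look roughly like this (a sketch, with SSL pointing at the custom install created above):

```
SSL=/usr/local/openssl
_ssl _ssl.c \
        -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
        -L$(SSL)/lib -lssl -lcrypto
```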

5. Build and install:

```bash
make && make install
```

6. Symlink the new executables into a directory on the system PATH:

```bash
ln -s /usr/local/python3/bin/python3 /usr/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/bin/pip3
```

7. Verify the installation:

```bash
python3 -V
```

Installing DataHub

1. Upgrade pip, wheel, and setuptools:

```bash
python3 -m pip install --upgrade pip wheel setuptools
```

2. Remove any previously installed DataHub packages:

```bash
python3 -m pip uninstall datahub acryl-datahub || true
```

3. Install DataHub:

```bash
python3 -m pip install --upgrade acryl-datahub
```

4. Check the DataHub version:

```bash
python3 -m datahub version
```

5. Start DataHub:

```bash
python3 -m datahub docker quickstart
# or
datahub docker quickstart
```

If you see "-bash: datahub: command not found", use the 'python3 -m datahub ...' form instead.
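Alternatively, a sketch of making the bare `datahub` command available, assuming pip placed the entry point under the /usr/local/python3 prefix used above:

```bash
# symlink the CLI entry point, mirroring the python3/pip3 symlinks created earlier
ln -s /usr/local/python3/bin/datahub /usr/bin/datahub
# or simply put that bin directory on PATH
export PATH=/usr/local/python3/bin:$PATH
```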

6. Stop DataHub:

```bash
python3 -m datahub docker quickstart --stop
```

7. Reset DataHub (removes all DataHub containers and data):

```bash
python3 -m datahub docker nuke
```

8. Upgrade the local DataHub instance (re-run quickstart):

```bash
python3 -m datahub docker quickstart
```

9. Start DataHub with a custom compose file:

```bash
python3 -m datahub docker quickstart --quickstart-compose-file /root/.datahub/docker-compose.yml
```

Ingest the sample metadata:

```bash
python3 -m datahub docker ingest-sample-data
```

Start the Docker containers in this order (a start-loop sketch follows the list):

```
broker
zookeeper
schema-registry
mysql
elasticsearch-setup
kafka-setup
mysql-setup
elasticsearch
datahub-upgrade
datahub-gms
datahub-actions
datahub-frontend-react
```
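A minimal sketch of starting them in that order, using the container names from the compose file below (it assumes the containers already exist from a previous quickstart run; note that the MySQL container is named mysql_datahub in that file):

```bash
# start the quickstart containers in the order listed above
for c in broker zookeeper schema-registry mysql_datahub elasticsearch-setup kafka-setup \
         mysql-setup elasticsearch datahub-upgrade datahub-gms datahub-actions \
         datahub-frontend-react; do
  docker start "$c"
done
```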

The customized docker-compose.yml file:

```yaml
networks:
  default:
    name: datahub_network
services:
  broker:
    container_name: broker
    depends_on:
      zookeeper:
        condition: service_healthy
    environment:
    - KAFKA_BROKER_ID=1
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
    - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1
    - KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS=0
    - KAFKA_HEAP_OPTS=-Xms256m -Xmx256m
    - KAFKA_CONFLUENT_SUPPORT_METRICS_ENABLE=false
    healthcheck:
      interval: 1s
      retries: 5
      start_period: 30s
      test: nc -z broker $${DATAHUB_MAPPED_KAFKA_BROKER_PORT:-9092}
      timeout: 5s
    hostname: broker
    image: confluentinc/cp-kafka:7.4.0
    ports:
    - ${DATAHUB_MAPPED_KAFKA_BROKER_PORT:-49092}:9092
    volumes:
    - broker:/var/lib/kafka/data/
  datahub-actions:
    container_name: datahub-actions
    depends_on:
      datahub-gms:
        condition: service_healthy
    environment:
    - DATAHUB_GMS_HOST=datahub-gms
    - DATAHUB_GMS_PORT=8080
    - DATAHUB_GMS_PROTOCOL=http
    - DATAHUB_SYSTEM_CLIENT_ID=__datahub_system
    - DATAHUB_SYSTEM_CLIENT_SECRET=JohnSnowKnowsNothing
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_PROPERTIES_SECURITY_PROTOCOL=PLAINTEXT
    - METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent_v4
    - METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MetadataChangeLog_Versioned_v1
    - SCHEMA_REGISTRY_URL=http://schema-registry:8081
    hostname: actions
    image: acryldata/datahub-actions:${ACTIONS_VERSION:-head}
  datahub-frontend-react:
    container_name: datahub-frontend-react
    depends_on:
      datahub-gms:
        condition: service_healthy
    environment:
    - DATAHUB_GMS_HOST=datahub-gms
    - DATAHUB_GMS_PORT=8080
    - DATAHUB_SECRET=YouKnowNothing
    - DATAHUB_APP_VERSION=1.0
    - DATAHUB_PLAY_MEM_BUFFER_SIZE=10MB
    - JAVA_OPTS=-Xms512m -Xmx512m -Dhttp.port=9002 -Dconfig.file=datahub-frontend/conf/application.conf -Djava.security.auth.login.config=datahub-frontend/conf/jaas.conf -Dlogback.configurationFile=datahub-frontend/conf/logback.xml -Dlogback.debug=false -Dpidfile.path=/dev/null
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - DATAHUB_TRACKING_TOPIC=DataHubUsageEvent_v1
    - ELASTIC_CLIENT_HOST=elasticsearch
    - ELASTIC_CLIENT_PORT=9200
    hostname: datahub-frontend-react
    image: ${DATAHUB_FRONTEND_IMAGE:-linkedin/datahub-frontend-react}:${DATAHUB_VERSION:-head}
    ports:
    - ${DATAHUB_MAPPED_FRONTEND_PORT:-9002}:9002
    volumes:
    - ${HOME}/.datahub/plugins:/etc/datahub/plugins
  datahub-gms:
    container_name: datahub-gms
    depends_on:
      datahub-upgrade:
        condition: service_completed_successfully
    environment:
    - DATAHUB_SERVER_TYPE=${DATAHUB_SERVER_TYPE:-quickstart}
    - DATAHUB_TELEMETRY_ENABLED=${DATAHUB_TELEMETRY_ENABLED:-true}
    - DATAHUB_UPGRADE_HISTORY_KAFKA_CONSUMER_GROUP_ID=generic-duhe-consumer-job-client-gms
    - EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
    - EBEAN_DATASOURCE_HOST=mysql:3306
    - EBEAN_DATASOURCE_PASSWORD=datahub
    - EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
    - EBEAN_DATASOURCE_USERNAME=datahub
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX=true
    - ELASTICSEARCH_INDEX_BUILDER_SETTINGS_REINDEX=true
    - ELASTICSEARCH_PORT=9200
    - ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-gms/resources/entity-registry.yml
    - ENTITY_SERVICE_ENABLE_RETENTION=true
    - ES_BULK_REFRESH_POLICY=WAIT_UNTIL
    - GRAPH_SERVICE_DIFF_MODE_ENABLED=true
    - GRAPH_SERVICE_IMPL=elasticsearch
    - JAVA_OPTS=-Xms1g -Xmx1g
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
    - MAE_CONSUMER_ENABLED=true
    - MCE_CONSUMER_ENABLED=true
    - PE_CONSUMER_ENABLED=true
    - UI_INGESTION_ENABLED=true
    healthcheck:
      interval: 1s
      retries: 3
      start_period: 90s
      test: curl -sS --fail http://datahub-gms:${DATAHUB_MAPPED_GMS_PORT:-8080}/health
      timeout: 5s
    hostname: datahub-gms
    image: ${DATAHUB_GMS_IMAGE:-linkedin/datahub-gms}:${DATAHUB_VERSION:-head}
    ports:
    - ${DATAHUB_MAPPED_GMS_PORT:-48080}:8080
    volumes:
    - ${HOME}/.datahub/plugins:/etc/datahub/plugins
  datahub-upgrade:
    command:
    - -u
    - SystemUpdate
    container_name: datahub-upgrade
    depends_on:
      elasticsearch-setup:
        condition: service_completed_successfully
      kafka-setup:
        condition: service_completed_successfully
      mysql-setup:
        condition: service_completed_successfully
    environment:
    - EBEAN_DATASOURCE_USERNAME=datahub
    - EBEAN_DATASOURCE_PASSWORD=datahub
    - EBEAN_DATASOURCE_HOST=mysql:3306
    - EBEAN_DATASOURCE_URL=jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
    - EBEAN_DATASOURCE_DRIVER=com.mysql.jdbc.Driver
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_PORT=9200
    - ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX=true
    - ELASTICSEARCH_INDEX_BUILDER_SETTINGS_REINDEX=true
    - ELASTICSEARCH_BUILD_INDICES_CLONE_INDICES=false
    - GRAPH_SERVICE_IMPL=elasticsearch
    - DATAHUB_GMS_HOST=datahub-gms
    - DATAHUB_GMS_PORT=8080
    - ENTITY_REGISTRY_CONFIG_PATH=/datahub/datahub-gms/resources/entity-registry.yml
    hostname: datahub-upgrade
    image: ${DATAHUB_UPGRADE_IMAGE:-acryldata/datahub-upgrade}:${DATAHUB_VERSION:-head}
    labels:
      datahub_setup_job: true
  elasticsearch:
    container_name: elasticsearch
    deploy:
      resources:
        limits:
          memory: 1G
    environment:
    - discovery.type=single-node
    - xpack.security.enabled=false
    - ES_JAVA_OPTS=-Xms256m -Xmx512m -Dlog4j2.formatMsgNoLookups=true
    healthcheck:
      interval: 1s
      retries: 3
      start_period: 20s
      test: curl -sS --fail http://elasticsearch:$${DATAHUB_MAPPED_ELASTIC_PORT:-9200}/_cluster/health?wait_for_status=yellow&timeout=0s
      timeout: 5s
    hostname: elasticsearch
    image: elasticsearch:7.10.1
    ports:
    - ${DATAHUB_MAPPED_ELASTIC_PORT:-49200}:9200
    volumes:
    - esdata:/usr/share/elasticsearch/data
  elasticsearch-setup:
    container_name: elasticsearch-setup
    depends_on:
      elasticsearch:
        condition: service_healthy
    environment:
    - ELASTICSEARCH_HOST=elasticsearch
    - ELASTICSEARCH_PORT=9200
    - ELASTICSEARCH_PROTOCOL=http
    hostname: elasticsearch-setup
    image: ${DATAHUB_ELASTIC_SETUP_IMAGE:-linkedin/datahub-elasticsearch-setup}:${DATAHUB_VERSION:-head}
    labels:
      datahub_setup_job: true
  kafka-setup:
    container_name: kafka-setup
    depends_on:
      broker:
        condition: service_healthy
      schema-registry:
        condition: service_healthy
    environment:
    - DATAHUB_PRECREATE_TOPICS=${DATAHUB_PRECREATE_TOPICS:-false}
    - KAFKA_BOOTSTRAP_SERVER=broker:29092
    - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
    - USE_CONFLUENT_SCHEMA_REGISTRY=TRUE
    hostname: kafka-setup
    image: ${DATAHUB_KAFKA_SETUP_IMAGE:-linkedin/datahub-kafka-setup}:${DATAHUB_VERSION:-head}
    labels:
      datahub_setup_job: true
  mysql:
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin --default-authentication-plugin=mysql_native_password
    container_name: mysql_datahub
    environment:
    - MYSQL_DATABASE=datahub
    - MYSQL_USER=datahub
    - MYSQL_PASSWORD=datahub
    - MYSQL_ROOT_PASSWORD=datahub
    healthcheck:
      interval: 1s
      retries: 5
      start_period: 2s
      test: mysqladmin ping -h mysql -u $$MYSQL_USER --password=$$MYSQL_PASSWORD
      timeout: 5s
    hostname: mysql
    image: mysql:5.7
    ports:
    - ${DATAHUB_MAPPED_MYSQL_PORT:-43306}:3306
    restart: on-failure
    volumes:
    - ../mysql/init.sql:/docker-entrypoint-initdb.d/init.sql
    - mysqldata:/var/lib/mysql_datahub
  mysql-setup:
    container_name: mysql-setup
    depends_on:
      mysql:
        condition: service_healthy
    environment:
    - MYSQL_HOST=mysql
    - MYSQL_PORT=3306
    - MYSQL_USERNAME=datahub
    - MYSQL_PASSWORD=datahub
    - DATAHUB_DB_NAME=datahub
    hostname: mysql-setup
    image: ${DATAHUB_MYSQL_SETUP_IMAGE:-acryldata/datahub-mysql-setup}:${DATAHUB_VERSION:-head}
    labels:
      datahub_setup_job: true
  schema-registry:
    container_name: schema-registry
    depends_on:
      broker:
        condition: service_healthy
    environment:
    - SCHEMA_REGISTRY_HOST_NAME=schemaregistry
    - SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL=PLAINTEXT
    - SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=broker:29092
    healthcheck:
      interval: 1s
      retries: 3
      start_period: 30s
      test: nc -z schema-registry ${DATAHUB_MAPPED_SCHEMA_REGISTRY_PORT:-8081}
      timeout: 5s
    hostname: schema-registry
    image: confluentinc/cp-schema-registry:7.4.0
    ports:
    - ${DATAHUB_MAPPED_SCHEMA_REGISTRY_PORT:-48081}:8081
  zookeeper:
    container_name: zookeeper
    environment:
    - ZOOKEEPER_CLIENT_PORT=2181
    - ZOOKEEPER_TICK_TIME=2000
    healthcheck:
      interval: 5s
      retries: 3
      start_period: 10s
      test: echo srvr | nc zookeeper $${DATAHUB_MAPPED_ZK_PORT:-2181}
      timeout: 5s
    hostname: zookeeper
    image: confluentinc/cp-zookeeper:7.4.0
    ports:
    - ${DATAHUB_MAPPED_ZK_PORT:-42181}:2181
    volumes:
    - zkdata:/var/lib/zookeeper
version: '3.9'
volumes:
  broker: null
  esdata: null
  mysqldata: null
  zkdata: null
```

Once startup completes, open http://<ip>:9002 in a browser. Default credentials: datahub / datahub.
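A quick sanity check from the host, assuming the port mappings in the compose file above (frontend on 9002, GMS mapped to host port 48080):

```bash
# the frontend should answer with an HTTP status code
curl -sS -o /dev/null -w "%{http_code}\n" http://localhost:9002
# GMS health endpoint (the same one the compose healthcheck polls)
curl -sS http://localhost:48080/health
```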

Adding ingestion sources

Install the plugin for each source you want to ingest from:

```bash
pip install 'acryl-datahub[mongodb]'        # MongoDB
pip install 'acryl-datahub[mssql]'          # Microsoft SQL Server
pip install 'acryl-datahub[mysql]'          # MySQL
pip install 'acryl-datahub[oracle]'         # Oracle
pip install 'acryl-datahub[kafka]'          # Kafka
pip install 'acryl-datahub[kafka-connect]'  # Kafka Connect
pip install 'acryl-datahub[hive]'           # Hive
pip install 'acryl-datahub[elasticsearch]'  # Elasticsearch
pip install 'acryl-datahub[druid]'          # Druid
pip install 'acryl-datahub[clickhouse]'     # ClickHouse
```

Troubleshooting

1. Installing the Hive source fails

pip install of sasl3 fails with:

```
Failed to build sasl3
ERROR: Could not build wheels for sasl3, which is required to install pyproject.toml-based projects
```

Fix: install the native build dependencies, then retry:

```bash
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64
pip install 'acryl-datahub[hive]'
```

Sync Hive metadata with the DataHub CLI:

```bash
python3 -m datahub ingest -c hive.yml
```

The hive.yml configuration:

```yaml
source:
  type: hive
  config:
    host_port: '192.168.1.186:10000'
    database: null
    database_pattern:
      allow:
        - ods
        - dwd
        - dws
        - ads
    username: hive
    #password:hive
    stateful_ingestion:
      enabled: false
    include_table_location_lineage: true
    env: PROD
    profiling:
      enabled: true
      catch_exceptions: true
      field_sample_values_limit: 20
      include_field_distinct_count: true
      include_field_distinct_value_frequencies: true
      include_field_histogram: false
      include_field_max_value: true
      include_field_mean_value: true
      include_field_median_value: true
      include_field_min_value: true
      include_field_null_count: true
      include_field_quantiles: false
      include_field_sample_values: true
      include_field_stddev_value: true
      #limit:
      #max_number_of_fields_to_profile: 
      max_workers: 10
      #offset:
      #partition_datetime:
      partition_profiling_enabled: true
      profile_table_level_only: false 
      query_combiner_enabled: true
      report_dropped_profiles: false
      turn_off_expensive_profiling_metrics: false
    options:
      connect_args:
        auth: KERBEROS
        kerberos_service_name: hive
        scheme: hive+http
sink:
  type: "datahub-rest"
  config:
    server: 'http://127.0.0.1:48080'
```

2. Kafka container fails to start: insufficient directory permissions

```
Running in Zookeeper mode...
===> Running preflight checks ... 
===> Check if /var/lib/kafka/data is writable ...
Command [/usr/local/bin/dub path /var/lib/kafka/data writable] FAILED !
```

Fix: the container user (appuser) has no permission on a host directory bind-mounted over /var/lib/kafka/data, so a custom bind mount fails. Keep the default named-volume mount instead; its data lives under /var/lib/docker/volumes/datahub_broker on the host.

```yaml
    volumes:
    - broker:/var/lib/kafka/data/
```

3. Kafka container fails to start: cluster ID mismatch

```
ERROR Exiting Kafka due to fatal exception during startup. (kafka.Kafka$)
kafka.common.InconsistentClusterIdException: 
The Cluster ID Kk1OuOGLQh-2wiG2C2IuSw doesn't match stored clusterId Some(b9hEZxfhRlm1Mq_sD9TU-Q) in meta.properties. 
The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
```

Fix: recreating the container assigned a new cluster ID (Kk1OuOGLQh-2wiG2C2IuSw), while the previously mounted volume at /var/lib/docker/volumes/datahub_broker/_data still contains a meta.properties holding the old cluster ID. Edit that file and replace the old cluster ID with the new one.
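A sketch of the edit, using the IDs from the log above (substitute the values from your own error message):

```bash
# point the stored cluster.id at the ID the new broker instance was assigned
sed -i 's/^cluster.id=.*/cluster.id=Kk1OuOGLQh-2wiG2C2IuSw/' \
  /var/lib/docker/volumes/datahub_broker/_data/meta.properties
docker restart broker
```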

4. schema-registry container fails to start: errors on the _schemas topic

```
[2023-08-30 03:10:07,579] INFO Kafka version: 7.4.0-ce (org.apache.kafka.common.utils.AppInfoParser)
[2023-08-30 03:10:07,579] INFO Kafka commitId: aee97a585bd06866 (org.apache.kafka.common.utils.AppInfoParser)
[2023-08-30 03:10:07,579] INFO Kafka startTimeMs: 1693365007579 (org.apache.kafka.common.utils.AppInfoParser)
[2023-08-30 03:10:07,590] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schemaregistry-8081] Cluster ID: Kk1OuOGLQh-2wiG2C2IuSw (org.apache.kafka.clients.Metadata)
[2023-08-30 03:10:07,605] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schemaregistry-8081] Assigned to partition(s): _schemas-0 (org.apache.kafka.clients.consumer.KafkaConsumer)
[2023-08-30 03:10:07,610] INFO Seeking to beginning for all partitions (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2023-08-30 03:10:07,611] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schemaregistry-8081] Seeking to earliest offset of partition _schemas-0 (org.apache.kafka.clients.consumer.internals.SubscriptionState)
[2023-08-30 03:10:07,611] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2023-08-30 03:10:07,613] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2023-08-30 03:10:07,647] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schemaregistry-8081] Resetting the last seen epoch of partition _schemas-0 to 0 since the associated topicId changed from null to 4SIrojX5QIKoK76CRCnvdQ (org.apache.kafka.clients.Metadata)
[2023-08-30 03:10:07,711] INFO [Producer clientId=producer-1] Resetting the last seen epoch of partition _schemas-0 to 0 since the associated topicId changed from null to 4SIrojX5QIKoK76CRCnvdQ (org.apache.kafka.clients.Metadata)
[2023-08-30 03:10:07,783] WARN [Producer clientId=producer-1] Received invalid metadata error in produce request on partition _schemas-0 due to org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2023-08-30 03:10:07,784] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
	at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:352)
	at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:75)
	at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:99)
	at io.confluent.rest.Application.configureHandler(Application.java:296)
	at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:194)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
	at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:148)
	at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:350)
	... 6 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
	at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:490)
	at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilKafkaReaderReachesLastOffset(KafkaStore.java:293)
	at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:146)
	... 7 more
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.
	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:97)
	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:79)
	at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
	at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:485)
	... 9 more
Caused by: org.apache.kafka.common.errors.NotLeaderOrFollowerException: For requests intended only for the leader, this error indicates that the broker is not the current leader. For requests intended for any replica, this error indicates that the broker is not a replica of the topic partition.
```

Fix: delete the _schemas topic, restart Kafka, then restart schema-registry.
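One way to do that with the quickstart container names, as a sketch (the _schemas topic is recreated when schema-registry comes back up):

```bash
# drop the _schemas topic from inside the broker container
docker exec broker kafka-topics --bootstrap-server broker:29092 --delete --topic _schemas
docker restart broker
docker restart schema-registry
```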
