Learning Kafka: Installing Kafka on macOS

Table of Contents

  • Installing with Brew
    • 1. xcode-select --install
    • 2. brew install kafka
  • Installing with Docker
    • 1. Create docker-compose.yml
    • 2. Build the containers with Docker and expose the service
  • Starting Kafka locally on macOS (with default settings)
    • Checking whether Kafka started correctly
      • Method 1: using the lsof command
      • Method 2: using the Kafka command-line tools
    • A look at server.properties

Installing with Brew

  • Open a terminal.

1. xcode-select --install

  • Install the Xcode Command Line Tools first; without them Kafka cannot be installed:

    xcode-select --install
    
  • After installation you may need to accept the license agreement for Xcode and its related tools, which can be done with:

    sudo xcodebuild -license
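
  • To confirm that the Command Line Tools are actually in place, you can print the active developer directory (an optional check, not part of the original steps):

    xcode-select -p
    # typically prints /Library/Developer/CommandLineTools (or an Xcode.app path)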
    

2. brew install kafka

  • Install Kafka:

    brew install kafka
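
  • As a quick sanity check after the install (optional, not in the original steps), you can ask the Kafka CLI for its version and let Homebrew show what it installed:

    kafka-topics --version
    brew info kafka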

Installing with Docker

  • If you already have Docker installed, you can use it to run Kafka and ZooKeeper.

1. Create docker-compose.yml

  • In a terminal, create a docker-compose.yml file with the following contents:
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

Breaking down the docker-compose file

zookeeper
  • The services section defines all the services that will run in containers. Since Kafka needs ZooKeeper in order to start, the Docker setup has to provide both a kafka and a zookeeper service.
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  • zookeeper: the name of the service; in this configuration it is the ZooKeeper service.
  • image: the Docker image to use, here wurstmeister/zookeeper, a preconfigured ZooKeeper image designed to work with Kafka.
  • ports: "2181:2181" maps port 2181 inside the container to port 2181 on the host. ZooKeeper listens on port 2181 by default; this is the port Kafka clients use to connect to it.
kafka
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  • kafka: the second service, representing the Kafka broker itself.
  • image: uses the wurstmeister/kafka image, a preconfigured Kafka image.
  • ports: "9092:9092" maps port 9092 inside the container to port 9092 on the host. Kafka listens on port 9092 by default, and clients connect to it there.
  • environment: sets environment variables that configure Kafka.
  • KAFKA_ADVERTISED_HOST_NAME: tells Kafka how to advertise itself to clients. Setting it to 127.0.0.1 means the broker will only be reachable from the host machine itself (an alternative listener-style configuration is sketched after this list).
  • KAFKA_ZOOKEEPER_CONNECT: tells Kafka how to reach ZooKeeper. zookeeper:2181 refers to the zookeeper service defined in this Docker Compose file, on port 2181.
  • volumes: /var/run/docker.sock:/var/run/docker.sock mounts the host's Docker socket into the container. This lets the Kafka container manage other Docker containers, which is typically used to create and manage additional broker containers dynamically.
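  • If you prefer the more explicit listener-style configuration (the wurstmeister image turns KAFKA_* environment variables into the matching server.properties entries), the kafka service could look roughly like the sketch below; the listener values here are illustrative, not taken from the original setup:
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock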

2. Build the containers with Docker and expose the service

  • Choose whichever installation method suits your needs; once everything is installed you can start using Kafka for message-queue development and testing. To bring up the containers, run:
docker-compose up
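  • To run the stack in the background and confirm that both containers came up, the standard docker-compose commands can be used (generic checks, not specific to this setup):
docker-compose up -d           # start both services in detached mode
docker-compose ps              # zookeeper and kafka should both show as "Up"
docker-compose logs -f kafka   # follow the broker logs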

Starting Kafka locally on macOS (with default settings)

  • First, start ZooKeeper:
brew services start zookeeper
  • If Kafka is installed locally, it can be started directly with the command below. By default, Kafka listens on port 9092; this behavior comes from Kafka's configuration file, which is usually located at /usr/local/etc/kafka/server.properties when installed via Homebrew. On my M1 (Apple-silicon) MacBook the default location is /opt/homebrew/etc/kafka/server.properties.
brew services start kafka
  • If you change server.properties, restart Kafka for the new settings to take effect:
brew services restart kafka
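  • If you prefer to run the broker in the foreground (useful for seeing startup errors directly instead of digging through log files), Homebrew also installs the standard Kafka scripts; a minimal sketch, assuming the Apple-silicon config path mentioned above:
kafka-server-start /opt/homebrew/etc/kafka/server.properties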

Checking whether Kafka started correctly

Method 1: using the lsof command

lsof -i :9092
  • This command lists every process listening on port 9092. If Kafka has started and is listening on that port, you should see its process information; in other words, if the command returns a Kafka process, Kafka is up and running on port 9092.
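  • A quick alternative check (not from the original post) is to probe the port with netcat:
nc -vz localhost 9092   # prints "succeeded" if something is listening on 9092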

Method 2: using the Kafka command-line tools

  • Run the following command to create a new topic:
kafka-topics --create --topic test --partitions 1 --replication-factor 1 --bootstrap-server localhost:9092
  • If this command completes without errors, the Kafka service is listening on port 9092 and working normally. If it fails, the error message usually gives a clue as to why the client could not connect.
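  • As an additional sanity check (optional, not part of the original steps), you can push a message through the new topic with the console producer and read it back with the console consumer:
# type a line, press Enter, then Ctrl+C to exit
kafka-console-producer --bootstrap-server localhost:9092 --topic test
# read everything in the topic from the beginning (Ctrl+C to stop)
kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning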

When I ran the command above, topic creation failed because I had not started ZooKeeper first. The error output was:

[2024-03-29 21:54:03,707] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2024-03-29 21:54:03,811] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2024-03-29 21:54:03,913] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2024-03-29 21:54:04,117] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
[2024-03-29 21:54:04,522] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Node may not be available. (org.apache.kafka.clients.NetworkClient)
  • I then checked the log at /opt/homebrew/var/log/kafka/kafka_output.log, which contained:
java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:344)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)
[2024-03-29 21:26:32,648] INFO Opening socket connection to server localhost/[0:0:0:0:0:0:0:1]:2181. (org.apache.zookeeper.ClientCnxn)
[2024-03-29 21:26:32,650] WARN Session 0x0 for server localhost/[0:0:0:0:0:0:0:1]:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:682)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:344)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)
[2024-03-29 21:26:33,220] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
  • This shows that ZooKeeper was not running, so start it manually:
brew services start zookeeper
  • Then restart Kafka:
brew services restart kafka
  • After that, Kafka runs successfully, and creating the test topic works, which can be verified as shown below.
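  • To double-check, you can list the topics and describe the one that was just created (the exact partition and leader details will depend on your setup):
kafka-topics --list --bootstrap-server localhost:9092
kafka-topics --describe --topic test --bootstrap-server localhost:9092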

A look at server.properties

  • Here is the full Kafka properties file, so you can see the whole picture:
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# This configuration file is intended for use in ZK-based mode, where Apache ZooKeeper is required.
# See kafka.server.KafkaConfig for additional details and defaults
#

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0

############################# Socket Server Settings #############################

# The address the socket server listens on. If not configured, the host name will be equal to the value of
# java.net.InetAddress.getCanonicalHostName(), with PLAINTEXT listener name, and port 9092.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092

# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
#advertised.listeners=PLAINTEXT://your.host.name:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600


############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/opt/homebrew/var/lib/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings  #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
#log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=18000


############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
  • The most important settings (the ones you typically need to change) are the following:

    • broker.id: if you set up a cluster with multiple Kafka nodes, each node is a broker, and no two brokers may share the same id.
    • #listeners=PLAINTEXT://:9092: this line is normally commented out (with a single default node, the broker simply starts on port 9092). If you run multiple brokers, each one must listen on a different port: remove the comment and change the line to listeners=PLAINTEXT://:xxxx, where xxxx is the port you want to use (see the sketch after this list).
    • log.dirs=/opt/homebrew/var/lib/kafka-logs: the location where Kafka stores its log files by default.
    • zookeeper.connect=localhost:2181: the address and port used to connect to ZooKeeper; this also has to be configured correctly.
  • There is much more behind the remaining settings; if you want a deeper understanding of Kafka, I recommend going to this site to watch the videos.
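  • As an illustration of the multi-broker case described above, a second broker on the same machine could use a copy of server.properties along these lines; the id, port, and log directory below are hypothetical example values, not taken from the original post:
# hypothetical server-1.properties for a second broker on the same host
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/opt/homebrew/var/lib/kafka-logs-1
zookeeper.connect=localhost:2181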
