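Prerequisites
You need an AirNow API key: register a user account, then request the key. After logging in you can see the API KEY field; that value is all you need. docs.airnowapi.org/forecastsby...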
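As a rough illustration of how the key ends up being used, here is a minimal sketch that builds a request URI for current observations by ZIP code. The endpoint path, the query parameter names, the example ZIP code, and the `AIRNOW_API_KEY` environment variable are assumptions made for this sketch and are not taken from the article; check them against the AirNow documentation for your account.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds an AirNow "current observations by ZIP code" request URI (sketch). */
final class AirNowRequest {

    // Assumed endpoint; confirm it against docs.airnowapi.org.
    static final String BASE_URL =
            "https://www.airnowapi.org/aq/observation/zipCode/current/";

    static URI currentObservationUri(String zipCode, int distanceMiles, String apiKey) {
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalArgumentException("AirNow API key is missing");
        }
        // URL-encode every value so that "/" in the format parameter is escaped.
        String query = "format=" + encode("application/json")
                + "&zipCode=" + encode(zipCode)
                + "&distance=" + distanceMiles
                + "&API_KEY=" + encode(apiKey);
        return URI.create(BASE_URL + "?" + query);
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // AIRNOW_API_KEY is a hypothetical environment variable name.
        String apiKey = System.getenv().getOrDefault("AIRNOW_API_KEY", "YOUR-API-KEY");
        System.out.println(currentObservationUri("94063", 25, apiKey));
    }
}
```

Reading the key from the environment keeps it out of source control; the Spring Boot code in the next section could equally inject it through a `@Value` property.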
Basic Java Source Code
Below is the code that talks to Pulsar.
java
import org.apache.pulsar.client.api.MessageId;
import org.apache.pulsar.client.api.Schema;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.pulsar.core.PulsarProducerFactory;
import org.springframework.pulsar.core.PulsarTemplate;
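@Autowired
private PulsarProducerFactory<Observation> producerFactory;

// Build a template on top of the injected producer factory and publish as JSON.
PulsarTemplate<Observation> pulsarTemplate = new PulsarTemplate<>(producerFactory);
pulsarTemplate.setSchema(Schema.JSON(Observation.class));

// observation2, uuidKey and topicName are defined elsewhere in the application.
// Send one observation, keyed by a UUID, to the target topic.
MessageId msgid = pulsarTemplate.newMessage(observation2)
        .withMessageCustomizer((mb) -> mb.key(uuidKey.toString()))
        .withTopic(topicName)
        .send();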

This code is how the Spring Boot application sends messages to Pulsar.
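The `Observation` type itself is not shown in this article. As a hedged sketch, a plain POJO along the following lines would match the flat JSON fields that appear in the messages consumed with pulsar-client later on. The class layout is an assumption inferred from those payloads; the nested `category` and `additionalProperties` objects are left out for brevity.

```java
/** Hypothetical POJO mirroring the flat JSON fields of one AirNow observation. */
class Observation {
    private String dateObserved;   // e.g. "2025-03-25"
    private int hourObserved;      // e.g. 18
    private String localTimeZone;  // e.g. "PST"
    private String reportingArea;  // e.g. "Redwood City"
    private String stateCode;      // e.g. "CA"
    private double latitude;
    private double longitude;
    private String parameterName;  // "O3", "PM2.5" or "PM10"
    private int aqi;

    public String getDateObserved() { return dateObserved; }
    public void setDateObserved(String dateObserved) { this.dateObserved = dateObserved; }
    public int getHourObserved() { return hourObserved; }
    public void setHourObserved(int hourObserved) { this.hourObserved = hourObserved; }
    public String getLocalTimeZone() { return localTimeZone; }
    public void setLocalTimeZone(String localTimeZone) { this.localTimeZone = localTimeZone; }
    public String getReportingArea() { return reportingArea; }
    public void setReportingArea(String reportingArea) { this.reportingArea = reportingArea; }
    public String getStateCode() { return stateCode; }
    public void setStateCode(String stateCode) { this.stateCode = stateCode; }
    public double getLatitude() { return latitude; }
    public void setLatitude(double latitude) { this.latitude = latitude; }
    public double getLongitude() { return longitude; }
    public void setLongitude(double longitude) { this.longitude = longitude; }
    public String getParameterName() { return parameterName; }
    public void setParameterName(String parameterName) { this.parameterName = parameterName; }
    public int getAqi() { return aqi; }
    public void setAqi(int aqi) { this.aqi = aqi; }

    public static void main(String[] args) {
        // Values taken from the first message in the pulsar-client output.
        Observation o = new Observation();
        o.setReportingArea("Redwood City");
        o.setParameterName("O3");
        o.setAqi(31);
        System.out.println(o.getReportingArea() + " " + o.getParameterName()
                + " aqi=" + o.getAqi());
    }
}
```

The field names matter more than the class shape: they become the column names (`aqi`, `parameterName`, `reportingArea`, and so on) that the Flink SQL queries select from the `airquality` table.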
Apache Flink Continuous SQL over Apache Pulsar Topics
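Next, we consume the Pulsar data from Flink.
The first step is to get a Pulsar cluster running. Here we simply use the docker-compose file provided on the official Pulsar website: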
yaml
cat docker-compose.yaml
version: "3"
networks:
  pulsar:
    driver: bridge
services:
  # Start zookeeper
  zookeeper:
    image: apachepulsar/pulsar:4.0.3
    container_name: zookeeper
    restart: on-failure
    networks:
      - pulsar
    volumes:
      - ./data/zookeeper:/pulsar/data/zookeeper
    environment:
      - metadataStoreUrl=zk:zookeeper:2181
      - PULSAR_MEM=-Xms256m -Xmx256m -XX:MaxDirectMemorySize=256m
    command: |
      bash -c "bin/apply-config-from-env.py conf/zookeeper.conf && \
             bin/generate-zookeeper-config.sh conf/zookeeper.conf && \
             exec bin/pulsar zookeeper"
    healthcheck:
      test: ["CMD", "bin/pulsar-zookeeper-ruok.sh"]
      interval: 10s
      timeout: 5s
      retries: 30
  # Init cluster metadata
  pulsar-init:
    container_name: pulsar-init
    hostname: pulsar-init
    image: apachepulsar/pulsar:4.0.3
    networks:
      - pulsar
    command:
      [
        "bash",
        "-c",
        "bin/pulsar initialize-cluster-metadata --cluster cluster-a --zookeeper zookeeper:2181 --configuration-store zookeeper:2181 --web-service-url http://broker:8080 --broker-service-url pulsar://broker:6650",
      ]
    depends_on:
      zookeeper:
        condition: service_healthy
  # Start bookie
  bookie:
    image: apachepulsar/pulsar:4.0.3
    container_name: bookie
    restart: on-failure
    networks:
      - pulsar
    environment:
      - clusterName=cluster-a
      - zkServers=zookeeper:2181
      - metadataServiceUri=metadata-store:zk:zookeeper:2181
      # otherwise every time we run docker compose up or down we fail to start due to Cookie
      # See: https://github.com/apache/bookkeeper/blob/405e72acf42bb1104296447ea8840d805094c787/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Cookie.java#L57-68
      - advertisedAddress=bookie
      - BOOKIE_MEM=-Xms512m -Xmx512m -XX:MaxDirectMemorySize=256m
    depends_on:
      zookeeper:
        condition: service_healthy
      pulsar-init:
        condition: service_completed_successfully
    # Map the local directory to the container to avoid bookie startup failure due to insufficient container disks.
    volumes:
      - ./data/bookkeeper:/pulsar/data/bookkeeper
    command: bash -c "bin/apply-config-from-env.py conf/bookkeeper.conf && exec bin/pulsar bookie"
  # Start broker
  broker:
    image: apachepulsar/pulsar:4.0.3
    container_name: broker
    hostname: broker
    restart: on-failure
    networks:
      - pulsar
    environment:
      - metadataStoreUrl=zk:zookeeper:2181
      - zookeeperServers=zookeeper:2181
      - clusterName=cluster-a
      - managedLedgerDefaultEnsembleSize=1
      - managedLedgerDefaultWriteQuorum=1
      - managedLedgerDefaultAckQuorum=1
      - advertisedAddress=broker
      - advertisedListeners=external:pulsar://172.21.195.56:6650
      - PULSAR_MEM=-Xms512m -Xmx512m -XX:MaxDirectMemorySize=256m
    depends_on:
      zookeeper:
        condition: service_healthy
      bookie:
        condition: service_started
    ports:
      - "6650:6650"
      - "8080:8080"
    command: bash -c "bin/apply-config-from-env.py conf/broker.conf && exec bin/pulsar broker"
Save the file above as docker-compose.yaml, then create the data directories:
bash
sudo mkdir -p ./data/zookeeper ./data/bookkeeper
# this step might not be necessary on platforms other than Linux
sudo chown -R 10000 data
Once the directories exist, start the Pulsar cluster:
docker-compose up -d
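Before moving on, it can help to confirm that the broker is actually answering. The following is a minimal sketch that calls the broker's admin REST API on the `8080` port mapped in the compose file; `BrokerCheck` and its method names are hypothetical, and `http://localhost:8080` assumes you run it on the Docker host.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Checks whether the Pulsar broker answers on its admin HTTP port (sketch). */
final class BrokerCheck {

    /** Admin REST endpoint that lists the clusters known to the broker. */
    static URI clustersUri(String adminBaseUrl) {
        String base = adminBaseUrl.endsWith("/")
                ? adminBaseUrl.substring(0, adminBaseUrl.length() - 1)
                : adminBaseUrl;
        return URI.create(base + "/admin/v2/clusters");
    }

    /** Returns true when the broker answers the admin request with HTTP 200. */
    static boolean isBrokerUp(String adminBaseUrl) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3))
                .build();
        HttpRequest request = HttpRequest.newBuilder(clustersUri(adminBaseUrl))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false; // broker not reachable (yet)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // localhost:8080 matches the "8080:8080" port mapping of the broker service.
        System.out.println("broker up: " + isBrokerUp("http://localhost:8080"));
    }
}
```

With the compose file above, a healthy broker should list `cluster-a` at that endpoint, since that is the cluster name passed to `initialize-cluster-metadata`.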
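Install pulsar-manager to take a look at the cluster:
Once the environment has been created successfully, you will see the following view.
Run the Spring Boot application to write the AirNow data into Pulsar and test the setup.
When the output above appears, the code has started writing data to the Pulsar cluster. The Pulsar topics can be viewed directly in the UI.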
Next, we deploy a container for running Flink SQL queries. We need a docker-compose file for this as well; here we use this project: https://github.com/Aiven-Open/sql-cli-for-apache-flink-docker/tree/main?tab=readme-ov-file
Note that we use StreamNative's Flink connector for Pulsar, so we use Flink 1.16 (because the StreamNative connector is built for 1.16.0). Edit the docker-compose file so that every image uses the Flink 1.16.0 version.
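Once the file has been modified as shown in the screenshot above, Flink SQL is ready to use. Start the environment with
docker-compose up -d
Next, the flink-sql-connector-pulsar jar has to be copied into the Docker container. Create a myjars directory and put the jars needed for this walkthrough in it:
ls myjars
flink-sql-avro-1.16.0.jar flink-sql-connector-pulsar-1.16.0.0.jar
Then copy the jars into the container from the command line: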
bash
SQL_CLIENT_CONTAINER_ID=$(docker ps -qf "name=sql-client")
docker cp /myjars $SQL_CLIENT_CONTAINER_ID:/tmp/jars

Next, open a shell inside the sql-client container and run:
bash
cp /tmp/jars/*.jar /opt/flink/lib/.
cp /tmp/jars/*.jar /opt/sql-client/lib/.
./sql-client.sh
When the screen shown below appears, you are inside the SQL client and can start running queries:
sql
CREATE CATALOG pulsar WITH (
'type' = 'pulsar-catalog',
'catalog-service-url' = 'pulsar://172.21.195.56:6650',
'catalog-admin-url' = 'http://172.21.195.56:8080'
);
USE CATALOG pulsar;
SHOW CURRENT DATABASE;
SHOW DATABASES;
set table.dynamic-table-options.enabled = true;
use `public/default`;
SHOW TABLES;

Now for the Flink SQL queries. The first query is:
sql
select aqi, parameterName, dateObserved, hourObserved, latitude, longitude, localTimeZone, stateCode, reportingArea from airquality;

You can also use the Pulsar command-line tools to inspect the data arriving in Pulsar:
bash
docker exec -it broker /bin/bash
broker:/pulsar$ bin/pulsar-admin topics list public/default
persistent://public/default/airquality
persistent://public/default/aq-pm25
persistent://public/default/aq-ozone
persistent://public/default/aq-pm10
broker:/pulsar$ bin/pulsar-client consume persistent://public/default/airquality -s "my-subscription" -n 10
Mar 26, 2025 2:10:45 AM io.opentelemetry.api.GlobalOpenTelemetry maybeAutoConfigureAndSetGlobal
INFO: AutoConfiguredOpenTelemetrySdk found on classpath but automatic configuration is disabled. To enable, run your JVM with -Dotel.java.global-autoconfigure.enabled=true
2025-03-26T02:10:45,482+0000 [pulsar-client-io-1-3] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x28e4a2c9, L:/127.0.0.1:40496 - R:localhost/127.0.0.1:6650]] Connected to server
2025-03-26T02:10:45,689+0000 [pulsar-client-lookup-21-1] INFO org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Starting Pulsar consumer status recorder with config: {"topicNames":["persistent://public/default/airquality"],"topicsPattern":null,"subscriptionName":"my-subscription","subscriptionType":"Exclusive","subscriptionProperties":null,"subscriptionMode":"Durable","receiverQueueSize":1000,"acknowledgementsGroupTimeMicros":100000,"maxAcknowledgmentGroupSize":1000,"negativeAckRedeliveryDelayMicros":60000000,"negativeAckPrecisionBitCnt":8,"maxTotalReceiverQueueSizeAcrossPartitions":50000,"consumerName":null,"ackTimeoutMillis":0,"tickDurationMillis":1000,"priorityLevel":0,"maxPendingChunkedMessage":10,"autoAckOldestChunkedMessageOnQueueFull":false,"expireTimeOfIncompleteChunkedMessageMillis":60000,"cryptoFailureAction":"FAIL","properties":{},"readCompacted":false,"subscriptionInitialPosition":"Latest","patternAutoDiscoveryPeriod":60,"regexSubscriptionMode":"PersistentOnly","deadLetterPolicy":null,"retryEnable":false,"autoUpdatePartitions":true,"autoUpdatePartitionsIntervalSeconds":60,"replicateSubscriptionState":false,"resetIncludeHead":false,"batchIndexAckEnabled":false,"ackReceiptEnabled":false,"poolMessages":true,"startPaused":false,"autoScaledReceiverQueueSizeEnabled":false,"topicConfigurations":[],"maxPendingChuckedMessage":10}
2025-03-26T02:10:45,717+0000 [pulsar-client-lookup-21-1] INFO org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Pulsar client config: {"serviceUrl":"pulsar://localhost:6650/","authPluginClassName":null,"authParams":null,"authParamMap":null,"operationTimeoutMs":30000,"lookupTimeoutMs":30000,"statsIntervalSeconds":60,"numIoThreads":6,"numListenerThreads":6,"connectionsPerBroker":1,"connectionMaxIdleSeconds":60,"useTcpNoDelay":true,"useTls":false,"tlsKeyFilePath":"","tlsCertificateFilePath":"","tlsTrustCertsFilePath":"","tlsAllowInsecureConnection":false,"tlsHostnameVerificationEnable":false,"sslFactoryPlugin":"org.apache.pulsar.common.util.DefaultPulsarSslFactory","sslFactoryPluginParams":null,"concurrentLookupRequest":5000,"maxLookupRequest":50000,"maxLookupRedirects":20,"maxNumberOfRejectedRequestPerConnection":50,"keepAliveIntervalSeconds":30,"connectionTimeoutMs":10000,"requestTimeoutMs":60000,"readTimeoutMs":60000,"autoCertRefreshSeconds":300,"initialBackoffIntervalNanos":100000000,"maxBackoffIntervalNanos":60000000000,"enableBusyWait":false,"listenerName":null,"useKeyStoreTls":false,"sslProvider":null,"tlsKeyStoreType":"JKS","tlsKeyStorePath":"","tlsKeyStorePassword":"*****","tlsTrustStoreType":"JKS","tlsTrustStorePath":"","tlsTrustStorePassword":"*****","tlsCiphers":[],"tlsProtocols":[],"memoryLimitBytes":0,"proxyServiceUrl":null,"proxyProtocol":null,"enableTransaction":false,"dnsLookupBindAddress":null,"dnsLookupBindPort":0,"dnsServerAddresses":[],"socks5ProxyAddress":null,"socks5ProxyUsername":null,"socks5ProxyPassword":null,"description":null,"lookupProperties":{},"openTelemetry":null}
2025-03-26T02:10:45,750+0000 [pulsar-client-io-1-6] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xce2a1215, L:/172.19.0.4:58170 - R:172.21.195.56/172.21.195.56:6650]] Connected to server
2025-03-26T02:10:45,755+0000 [pulsar-client-io-1-6] INFO org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/airquality][my-subscription] Subscribing to topic on cnx [id: 0xce2a1215, L:/172.19.0.4:58170 - R:172.21.195.56/172.21.195.56:6650], consumerId 0
2025-03-26T02:10:45,815+0000 [pulsar-client-io-1-6] INFO org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/airquality][my-subscription] Subscribed to topic on 172.21.195.56/172.21.195.56:6650 -- consumer: 0
----- got message -----
publishTime:[1742955071690], eventTime:[0], key:[d75bdf37-93a4-4088-9a3c-94e778e41a2c], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":18,"localTimeZone":"PST","reportingArea":"Redwood City","stateCode":"CA","latitude":37.48,"longitude":-122.22,"parameterName":"O3","aqi":31,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955071794], eventTime:[0], key:[79d653fc-57c5-45d6-92af-7c4df28e0975], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":18,"localTimeZone":"PST","reportingArea":"Redwood City","stateCode":"CA","latitude":37.48,"longitude":-122.22,"parameterName":"PM2.5","aqi":36,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955101688], eventTime:[0], key:[105e644a-341c-4b7f-b6d3-3b2c2454ab37], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":21,"localTimeZone":"EST","reportingArea":"Northeast Urban","stateCode":"NJ","latitude":40.692,"longitude":-74.187,"parameterName":"O3","aqi":28,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955101858], eventTime:[0], key:[ea48f397-3aa6-4760-b414-bf71875bec06], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":21,"localTimeZone":"EST","reportingArea":"Northeast Urban","stateCode":"NJ","latitude":40.692,"longitude":-74.187,"parameterName":"PM2.5","aqi":35,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
2025-03-26T02:11:45,718+0000 [pulsar-timer-22-1] INFO org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - [persistent://public/default/airquality] [my-subscription] [XKsbk] Prefetched messages: 0 --- Consume throughput received: 0.07 msgs/s --- 0.00 Mbit/s --- Ack sent rate: 0.07 ack/s --- Failed messages: 0 --- batch messages: 0 ---Failed acks: 0
----- got message -----
publishTime:[1742955132424], eventTime:[0], key:[face36b9-bec4-4217-a1c4-9b09b3488c6a], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":20,"localTimeZone":"CST","reportingArea":"El Paso","stateCode":"TX","latitude":31.8493,"longitude":-106.4375,"parameterName":"O3","aqi":48,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955132547], eventTime:[0], key:[0bf3f47f-b282-490a-937a-8f41cc44affa], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":20,"localTimeZone":"CST","reportingArea":"El Paso","stateCode":"TX","latitude":31.8493,"longitude":-106.4375,"parameterName":"PM2.5","aqi":37,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955132683], eventTime:[0], key:[a0cb48ea-a545-4d47-a659-dbd2526abd09], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":20,"localTimeZone":"CST","reportingArea":"El Paso","stateCode":"TX","latitude":31.8493,"longitude":-106.4375,"parameterName":"PM10","aqi":32,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955162421], eventTime:[0], key:[e5b14f19-9654-486c-996a-ae274512ccf8], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":18,"localTimeZone":"PST","reportingArea":"San Rafael","stateCode":"CA","latitude":37.97,"longitude":-122.52,"parameterName":"O3","aqi":38,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
----- got message -----
publishTime:[1742955162497], eventTime:[0], key:[0479660c-3a4b-4a90-9589-958fa52f7ab0], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":18,"localTimeZone":"PST","reportingArea":"San Rafael","stateCode":"CA","latitude":37.97,"longitude":-122.52,"parameterName":"PM2.5","aqi":57,"category":{"number":2,"name":"Moderate","additionalProperties":{}},"additionalProperties":{}}
2025-03-26T02:12:45,451+0000 [pulsar-client-io-1-3] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x28e4a2c9, L:/127.0.0.1:40496 ! R:localhost/127.0.0.1:6650] Disconnected
2025-03-26T02:12:45,719+0000 [pulsar-timer-22-1] INFO org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - [persistent://public/default/airquality] [my-subscription] [XKsbk] Prefetched messages: 0 --- Consume throughput received: 0.08 msgs/s --- 0.00 Mbit/s --- Ack sent rate: 0.08 ack/s --- Failed messages: 0 --- batch messages: 0 ---Failed acks: 0
----- got message -----
publishTime:[1742955191825], eventTime:[0], key:[814ebfbb-d723-4dd0-9dea-ed5bdab36528], properties:[], content:{"dateObserved":"2025-03-25","hourObserved":18,"localTimeZone":"PST","reportingArea":"San Rafael","stateCode":"CA","latitude":37.97,"longitude":-122.52,"parameterName":"O3","aqi":38,"category":{"number":1,"name":"Good","additionalProperties":{}},"additionalProperties":{}}
2025-03-26T02:13:11,919+0000 [pulsar-client-io-1-6] INFO org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/airquality] [my-subscription] Closed consumer
2025-03-26T02:13:11,921+0000 [main] INFO org.apache.pulsar.client.impl.PulsarClientImpl - Client closing. URL: pulsar://localhost:6650/
2025-03-26T02:13:11,928+0000 [pulsar-client-io-1-6] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0xce2a1215, L:/172.19.0.4:58170 ! R:172.21.195.56/172.21.195.56:6650] Disconnected
2025-03-26T02:13:13,937+0000 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully consumed
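The `publishTime` values in the output above are epoch milliseconds. A small sketch for turning them into readable UTC timestamps follows; the class and method names are hypothetical.

```java
import java.time.Instant;

/** Converts a Pulsar publishTime (epoch milliseconds) to a UTC instant (sketch). */
final class PublishTime {

    static Instant toInstant(long publishTimeMillis) {
        return Instant.ofEpochMilli(publishTimeMillis);
    }

    public static void main(String[] args) {
        // publishTime of the first message in the output above.
        System.out.println(toInstant(1742955071690L)); // 2025-03-26T02:11:11.690Z
    }
}
```

The first message was therefore published shortly after the consumer subscribed at 02:10:45, which fits the `"subscriptionInitialPosition":"Latest"` setting in the logged consumer config: only messages published after the subscription was created are delivered.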
The registered consumer also shows up in the pulsar-manager UI.
The second SQL query:
sql
select max(aqi) as MaxAQI, parameterName, reportingArea from airquality group by parameterName, reportingArea;

The third SQL query:
sql
select max(aqi) as MaxAQI, min(aqi) as MinAQI, avg(aqi) as AvgAQI, count(aqi) as RowCount, parameterName, reportingArea from airquality group by parameterName, reportingArea;
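To make the grouping in the query above concrete, here is a hedged in-memory sketch that computes the same aggregates with Java streams over a few readings copied from the pulsar-client output earlier in the article. It only illustrates the GROUP BY semantics and is not how Flink executes the query; `AqiAggregation` and `Reading` are hypothetical names, and the nested record requires Java 16 or later.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** In-memory equivalent of the max/min/avg/count GROUP BY query (sketch). */
final class AqiAggregation {

    /** One reading; only the columns used by the GROUP BY query. */
    record Reading(String parameterName, String reportingArea, int aqi) {}

    /** Groups by "parameterName|reportingArea" and summarises aqi per group. */
    static Map<String, IntSummaryStatistics> summarize(List<Reading> readings) {
        return readings.stream().collect(Collectors.groupingBy(
                r -> r.parameterName() + "|" + r.reportingArea(),
                TreeMap::new,
                Collectors.summarizingInt(Reading::aqi)));
    }

    public static void main(String[] args) {
        // Values copied from the pulsar-client output shown earlier.
        List<Reading> sample = List.of(
                new Reading("O3", "Redwood City", 31),
                new Reading("PM2.5", "Redwood City", 36),
                new Reading("O3", "San Rafael", 38),
                new Reading("PM2.5", "San Rafael", 57),
                new Reading("O3", "San Rafael", 38));
        summarize(sample).forEach((group, s) -> System.out.printf(
                "%s max=%d min=%d avg=%.1f count=%d%n",
                group, s.getMax(), s.getMin(), s.getAverage(), s.getCount()));
    }
}
```

Two differences from the Flink query are worth keeping in mind. The sketch runs once over a fixed list, whereas the continuous query keeps updating each group's row as new messages arrive on the topic. Also, `getAverage()` returns a double, while the type of `avg(aqi)` in Flink depends on the column type and may be an integer.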

References: hackernoon.com/spring-boot... flink.apache.org/2020/07/28/...