Python Full-Stack Series 236: Building rabbit_agent

Overview

rabbit_agent exposes the standard queue operations as an HTTP interface, wrapping pika inside a microservice so the same code does not have to be rewritten in too many places. At the very least, when the server side publishes messages it no longer needs to worry about these details.

For distributed tasks, the client side runs its own service that keeps listening to a queue; this should be easy to set up with some simple configuration (see the sketch below).
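
A rough sketch of what such a config-driven listener could look like (not implemented in the article yet; listener_conf, its fields and run_listener are assumptions for illustration):

python
import json
import pika

# hypothetical config describing which broker and queue this worker listens to
listener_conf = {'host': 'HOST', 'port': 11111, 'user': 'xxx', 'pwd': 'xxx', 'queue': 'some_task_queue'}

def run_listener(conf, handle_message):
    """Hold a long-lived connection and hand each decoded message to handle_message."""
    credentials = pika.PlainCredentials(conf['user'], conf['pwd'])
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(conf['host'], conf['port'], '/', credentials))
    channel = connection.channel()
    channel.queue_declare(queue=conf['queue'], durable=True)

    def callback(ch, method, properties, body):
        handle_message(json.loads(body.decode()))
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=conf['queue'], on_message_callback=callback, auto_ack=False)
    channel.start_consuming()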

Going forward, I plan to use RabbitMQ as a message queue over the public network to support distributed computation. For example, suppose I deploy n large models and want them to handle API requests. Obviously one server cannot host n large models, but users can first send their requests to the message queue; large models started on different machines each consume messages from the queue, process them and return the results.

Then, as long as the server side has a front end that forwards requests, collects the results and displays them in the application UI, the whole scheme works.

Contents

Pattern 1: simple queue, connectivity test

Suited to simple buffering.

There are two ends: P (Producer) and C (Consumer).

After establishing a connection, the producer declares a queue and then publishes messages into it.

Connection -> Channel -> Queue -> Message

Producing

Producer: sends messages to the queue.

Pattern: connect to the queue only when there are new messages to publish (the connection can then be released).

python
import pika
credentials = pika.PlainCredentials('xxx', 'xxx')
import time
with pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials)) as connection:
    channel = connection.channel()
    channel.queue_declare(queue='hello')

    # Method 1: basic queue
    for i in range(100):
        time.sleep(0.1)
        channel.basic_publish(exchange='',
                            routing_key='hello',
                            body='Hello World!')
        print(" [x] Sent 'Hello World!'")
Consuming

Consumer: pulls messages off the queue and prints them.

Pattern: stays in a listening state, so the connection has to be kept open.

python
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')

def callback(ch, method, properties, body):
    print(f" [x] Received {body}")

connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials))
channel = connection.channel()
channel.queue_declare(queue='hello')
# Method 1: basic queue
channel.basic_consume(queue='hello',
                      auto_ack=True,
                      on_message_callback=callback)

print(' [*] Waiting for messages. To exit press CTRL+C')
channel.start_consuming()

Pattern 2: work queues, distinguishing consumers

Suited to distributed tasks.

This pattern is a bit more involved.

The simple-mode producer:

python
import sys
message = ' '.join(sys.argv[1:]) or "Hello World!"
channel.basic_publish(exchange='',
                      routing_key='hello',
                      body=message)
print(f" [x] Sent {message}")

If RabbitMQ goes down at this point, the data is lost, so the queue should be declared durable. Durability, however, has to be set when the queue is first declared: re-declaring an existing queue with a different durability setting raises an error (which is presumably why a new durable queue, hello1, is used below).

The message itself also has to be marked persistent when it is published:

python
channel.queue_declare(queue='hello1', durable=True)
channel.basic_publish(exchange='',
                      routing_key="hello1",
                      body=message,
                      properties=pika.BasicProperties(
                         delivery_mode = pika.DeliveryMode.Persistent
                      ))
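
As noted above, re-declaring hello with durable=True after it was created as non-durable is rejected by the broker. With pika this surfaces as a ChannelClosedByBroker error (406 PRECONDITION_FAILED); a small defensive sketch, reusing the connection and channel from above:

python
import pika

try:
    channel.queue_declare(queue='hello', durable=True)  # 'hello' was first declared non-durable
except pika.exceptions.ChannelClosedByBroker as e:
    # reply code 406, "inequivalent arg 'durable' for queue 'hello'"
    print('declaration mismatch:', e)
    channel = connection.channel()  # the old channel is closed at this point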

With that configured, restart the broker to test:

bash
docker restart rabbitmq_24091_24092

OK: the producer's messages were persisted, and nothing was lost even after the restart.
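
To double-check, one can passively re-declare the queue after the restart and read its message count (a small sketch using the same placeholder host and credentials as before):

python
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')
with pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials)) as connection:
    channel = connection.channel()
    # passive=True only asserts that the queue exists; it never creates or alters it
    result = channel.queue_declare(queue='hello1', passive=True)
    print('messages still queued after restart:', result.method.message_count)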

Next comes the client side, i.e. the worker.

A worker can also be unreliable: it may die halfway through consuming a message. So the main concern here is the message acknowledgement mechanism.

By default the worker uses automatic acknowledgement: a message is considered consumed as soon as it is delivered. This is fine when the system is very stable or the messages are unimportant (losing some is acceptable).

python
def callback(ch, method, properties, body):
    print(f" [x] Received {body.decode()}")
    time.sleep(body.count(b'.'))
    print(" [x] Done")
# Method 1: basic queue
channel.basic_consume(queue='hello',
                        on_message_callback=callback, auto_ack =True)    

For more reliable delivery, use manual acknowledgement: consume with auto_ack disabled and acknowledge inside the callback.

python
# manual acknowledgement
def callback(ch, method, properties, body):
    print(f" [x] Received {body.decode()}")
    time.sleep(body.count(b'.'))
    print(" [x] Done")

    ch.basic_ack(delivery_tag = method.delivery_tag)
channel.basic_consume(queue='hello',
                        on_message_callback=callback, auto_ack =False)    


print(' [*] Waiting for messages. To exit press CTRL+C')
channel.start_consuming()

Finally, load balancing: each consumer declares its own prefetch count, so the broker never hands it more than that many unacknowledged messages at a time.

python
channel.basic_qos(prefetch_count=3)
channel.basic_consume(queue='hello1',
                        on_message_callback=callback, auto_ack =False)    

3 Fanout (broadcast)

I think this can be useful in complex decision-making scenarios.

emit_log.py

python
import sys
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials))
channel = connection.channel()

channel.exchange_declare(exchange='logs', exchange_type='fanout')

message = ' '.join(sys.argv[1:]) or "info: Hello World!"
channel.basic_publish(exchange='logs', routing_key='', body=message)
print(f" [x] Sent {message}")
connection.close()

'''
As you can see, after establishing the connection we declare the exchange. This step is necessary because publishing to a non-existent exchange is forbidden.

If no queue is bound to the exchange yet, the message will simply be lost, but that is fine for us: if no consumer is listening yet, we can safely drop the message.

python3 emit_log.py First message.
'''

receive_logs1.py

python
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials))
channel = connection.channel()

channel.exchange_declare(exchange='logs', exchange_type='fanout')

result = channel.queue_declare(queue='', exclusive=True)
queue_name = result.method.queue

channel.queue_bind(exchange='logs', queue=queue_name)

print(' [*] Waiting for logs. To exit press CTRL+C')

def callback(ch, method, properties, body):
    print(f" [x] {body}")

channel.basic_consume(
    queue=queue_name, on_message_callback=callback, auto_ack=True)

channel.start_consuming()

Run these in two separate terminals:

bash
python3 receive_logs1.py >> logs_from_rabbit1.log
python3 receive_logs2.py >> logs_from_rabbit2.log

Once the receivers are stopped, the data has been written to the log files.

The queues are deleted automatically.

4 Routing

A fanout exchange forwards blindly; a direct exchange can use the routing key to decide which queues receive a message, or to drop it (e.g. low-severity messages). This filtering and forwarding is controlled by the routing_key.
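
The article has no code for this pattern; a minimal sketch in the style of the official tutorial might look like this (the exchange name direct_logs and the severity keys are assumptions):

python
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials))
channel = connection.channel()
channel.exchange_declare(exchange='direct_logs', exchange_type='direct')

# producer side: the routing_key decides which bound queues receive the message
channel.basic_publish(exchange='direct_logs', routing_key='error', body='an error message')

# consumer side: bind a temporary queue only to the severities we care about,
# so e.g. 'info' messages are simply dropped by the exchange
result = channel.queue_declare(queue='', exclusive=True)
queue_name = result.method.queue
for severity in ['warning', 'error']:
    channel.queue_bind(exchange='direct_logs', queue=queue_name, routing_key=severity)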

5 Topics

A bit like regular expressions: it allows more complex filtering on the routing key.
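
Again there is no code in the article; a minimal topic-exchange sketch (exchange name and binding patterns are assumptions; '*' matches exactly one word, '#' matches zero or more):

python
import pika

credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', 11111, '/', credentials))
channel = connection.channel()
channel.exchange_declare(exchange='topic_logs', exchange_type='topic')

# producer side: routing keys are dot-separated words such as 'kern.critical'
channel.basic_publish(exchange='topic_logs', routing_key='kern.critical', body='A critical kernel error')

# consumer side: this queue receives every 'kern.*' message plus anything matching '*.critical'
result = channel.queue_declare(queue='', exclusive=True)
queue_name = result.method.queue
for pattern in ['kern.*', '*.critical']:
    channel.queue_bind(exchange='topic_logs', queue=queue_name, routing_key=pattern)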

6 Microservice

The service is named rabbit_agent_24098. As a first step it implements pattern 2 (worker) and pattern 3 (subscribe).

First fetch and install the base package, which saves quite a bit of work:

bash
wget  Basefuncs-1.2-py3-none-any.whl 
pip install Basefuncs-1.2-py3-none-any.whl

Next, a simple server_funcs.py. It defines two base folders (in hindsight not strictly necessary):

python
# Create the folders Tornado needs
import os
# create the folder if it does not exist
def create_folder_if_notexist(somepath):
    if not os.path.exists(somepath):
        os.makedirs(somepath)
    return True

m_static = os.path.join(os.getcwd(),'m_static')
m_template = os.path.join(os.getcwd(),'m_template')

create_folder_if_notexist(m_static)
create_folder_if_notexist(m_template)

settings = {
'static_path':m_static,
'template_path':m_template
}

# JSON encoder for payloads containing numpy / datetime objects
import json
from json import JSONEncoder
import numpy as np                 # needed by MyEncoder below
from datetime import datetime      # needed by MyEncoder below
import datetime as dd              # assumed alias so that dd.timedelta below resolves
class MyEncoder(JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        if isinstance(obj, datetime):
            return obj.__str__()
        if isinstance(obj, dd.timedelta):
            return obj.__str__()
        else:
            return super(MyEncoder, self).default(obj)

# usage: json.dumps(foo, cls=MyEncoder)


from Basefuncs import * 
# read the configuration
conf_dict = get_conf_dict('configs.conf')
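
The on-disk format of configs.conf is not shown here (get_conf_dict comes from Basefuncs); the handlers below only rely on conf_dict having roughly the following shape, which is an assumption for illustration:

python
# assumed shape of conf_dict; the handlers read host, port, user and pwd per broker entry
conf_dict = {
    'rabbit01': {
        'host': 'WAN_IP',  # broker host
        'port': 24091,     # AMQP port (placeholder)
        'user': 'xxx',
        'pwd': 'xxx',
    }
}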

The server:

python
from server_funcs import *
import tornado.httpserver  # HTTP server
import tornado.ioloop  # event loop
import tornado.options  # port option and command-line parsing
import tornado.web  # web module
from tornado.options import define, options
import os.path  # build template file paths

import pika
import json
# global configuration
# rabbit01 = conf_dict['rabbit01']

# handler (application) list
app_list = []

IndexHandler_path = r'/'
class IndexHandler(tornado.web.RequestHandler):
    def get(self):
        self.write('【GET】This is Website for Internal API System')
        self.write('Please Refer to API document')
        print('Get got a request test')
        # print(buffer_dict)

    def post(self):

        request_body = self.request.body

        print('Trying Decode Json')
        some_dict = json.loads(request_body)
        print(some_dict)
        msg_dict = {}
        msg_dict['info'] = '【POST】This is Website for Internal API System'
        msg_dict['input_dict'] = some_dict
        self.write(json.dumps(msg_dict))
        print('Post got a request test')
IndexHandler_tuple = (IndexHandler_path,IndexHandler)
app_list.append(IndexHandler_tuple)


# Publish messages: the request names a broker, we look up its config, then connect and publish
## Pattern 2: Work Queues
WorkQMessageHandler_path = r'/send_workq_message/'
class WorkQMessageHandler(tornado.web.RequestHandler):
    def post(self):
        request_body = self.request.body
        some_dict = json.loads(request_body)
        # 1 top-level fields
        rabbit = some_dict['rabbit']
        exchange = some_dict.get('exchange') or ''
        exchange_type = some_dict.get('exchange_type') or ''
        # queue is required
        queue = some_dict['queue']
        durable = some_dict.get('durable') 
        if durable is None:
            durable = True 
        routing_key = some_dict.get('routing_key') or ''
        message_list = some_dict['message_list']
        print(some_dict)

        the_rabbit_conf_dict = conf_dict[rabbit]
        credentials = pika.PlainCredentials(the_rabbit_conf_dict['user'], the_rabbit_conf_dict['pwd'])

        msg_dict = {}
        with pika.BlockingConnection(pika.ConnectionParameters(the_rabbit_conf_dict['host'], the_rabbit_conf_dict['port'], '/', credentials)) as connection:
            channel = connection.channel()
            if len(exchange.strip()) > 0:
                channel.exchange_declare(exchange=exchange, exchange_type=exchange_type)
            # whether the queue is durable has to be decided when it is first declared
            if durable is True:
                channel.queue_declare(queue=queue, durable=True)
                for message in message_list:
                    print('a :',message )
                    channel.basic_publish(exchange= exchange, routing_key=routing_key, body=json.dumps(message),
                    properties=pika.BasicProperties(delivery_mode = pika.DeliveryMode.Persistent))
                msg_dict['durable'] = True
                msg_dict['status'] = True
            else:
                channel.queue_declare(queue=queue)
                for message in message_list:
                    print('b :',message )
                    channel.basic_publish(exchange= exchange, routing_key=routing_key, body=json.dumps(message) )
                msg_dict['durable'] = False
                msg_dict['status'] = True
        msg_dict['messages'] = len(message_list)
        self.write(json.dumps(msg_dict))
WorkQMessageHandler_tuple = (WorkQMessageHandler_path,WorkQMessageHandler)
app_list.append(WorkQMessageHandler_tuple)


## Pattern 3: Publish/Subscribe
SubscribeMessageHandler_path = r'/send_subscribe_message/'
class SubscribeMessageHandler(tornado.web.RequestHandler):
    def post(self):
        request_body = self.request.body
        some_dict = json.loads(request_body)
        # 1 top-level fields
        rabbit = some_dict['rabbit']
        exchange = some_dict.get('exchange') or ''
        exchange_type = some_dict.get('exchange_type') or ''
        routing_key = some_dict.get('routing_key') or ''
        message_list = some_dict['message_list']
        print(some_dict)

        the_rabbit_conf_dict = conf_dict[rabbit]
        credentials = pika.PlainCredentials(the_rabbit_conf_dict['user'], the_rabbit_conf_dict['pwd'])

        msg_dict = {}
        with pika.BlockingConnection(pika.ConnectionParameters(the_rabbit_conf_dict['host'], the_rabbit_conf_dict['port'], '/', credentials)) as connection:
            channel = connection.channel()
            if len(exchange.strip()) > 0:
                channel.exchange_declare(exchange=exchange, exchange_type=exchange_type)
            for message in message_list:
                channel.basic_publish(exchange=exchange, routing_key=routing_key, body=json.dumps(message))
            msg_dict['status'] = True
            msg_dict['messages'] = len(message_list)
        self.write(json.dumps(msg_dict))

SubscribeMessageHandler_tuple = (SubscribeMessageHandler_path,SubscribeMessageHandler)
app_list.append(SubscribeMessageHandler_tuple)

if __name__ == '__main__':
    # define the option before parsing the command line so --port is recognized
    define('port', default=8000, help='run on the given port', type=int)
    tornado.options.parse_command_line()
    apps = tornado.web.Application(app_list, **settings)
    http_server = tornado.httpserver.HTTPServer(apps)

    http_server.listen(options.port)
    # single-process mode

    # uncomment for multi-process mode; num_processes=0 uses all CPU cores
    # http_server.start(num_processes=10)  # tornado forks worker processes

    # --- start
    print('Server Started')
    tornado.ioloop.IOLoop.instance().start()
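
A quick local smoke test against the index handler (assuming the service is running on the default port 8000):

python
import requests as req

print(req.get('http://127.0.0.1:8000/').text)
print(req.post('http://127.0.0.1:8000/', json={'ping': 1}).json())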

After local development and testing, publish it as an image and start the service:

bash
docker run -d \
 --restart=always \
 --name=rabbit_agent_24098 \
 -v /etc/localtime:/etc/localtime \
 -v /etc/timezone:/etc/timezone \
 -v /etc/hostname:/etc/hostname \
 -e "LANG=C.UTF-8" \
 -w /workspace \
 -p 24098:8000 \
 myregistry.domain.com:24052/server.andy.rabbit_agent_24098:v100 \
 sh -c "python3 server.py"

Pattern 2 test: work queue

Send messages from the producer side: a non-durable queue is declared and a list of messages is posted to it. Note: if the producer declares the queue as non-durable, the consumer has to declare it the same way, otherwise the declaration fails with an error. Also, if a message is delivered but never acknowledged, RabbitMQ re-delivers it: once the consumer's channel or connection closes (or the acknowledgement times out), the message goes back onto the queue.

python
import requests as req 
message_list = [{'msg_id':1,'msg':'first msg'},{'msg_id':2,'msg':'second msg'}]

# 1 模式2 WorkQ:服务端发送消息
para_dict = {}
para_dict['rabbit'] = 'rabbit01'
para_dict['routing_key'] = 'hello2'
para_dict['durable'] = False
para_dict['message_list'] = message_list
para_dict['queue'] = 'hello2'

# res = req.post('http://127.0.0.1:8000/send_workq_message/', json = para_dict)
res = req.post('http://WAN_IP:24098/send_workq_message/', json = para_dict)

Consume on the worker side. By default, body holds a byte string. The code below includes both automatic and manual acknowledgement (the automatic variant is commented out).

python
import pika
import json
credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters('HOST', PORT, '/', credentials))

channel = connection.channel()

import time

# automatic acknowledgement
# def callback(ch, method, properties, body):
#     print(f" [x] Received {body.decode()}")
#     time.sleep(body.count(b'.'))
#     print(" [x] Done")
# # Method 1: basic queue
# channel.basic_consume(queue='hello',
#                         on_message_callback=callback, auto_ack =True)    
    
    
# manual acknowledgement
# def callback(ch, method, properties, body):
#     input_data = json.loads(body.decode())
#     print(f" [x] Received ",input_data)
#     # time.sleep(body.count(b'.'))
#     print(" [x] Done")
#     ch.basic_ack(delivery_tag = method.delivery_tag)
def callback(ch, method, properties, body):
    # input_data = json.loads(body.decode())
    print(f" [x] Received ",body.decode())
    # time.sleep(body.count(b'.'))
    print(" [x] Done")
    ch.basic_ack(delivery_tag = method.delivery_tag)


# channel.queue_declare(queue='hello1')
channel.queue_declare(queue='hello1',durable=True)
channel.basic_qos(prefetch_count=3)
channel.basic_consume(queue='hello1',
                        on_message_callback=callback, auto_ack =False)    


print(' [*] Waiting for messages. To exit press CTRL+C')
channel.start_consuming()

Pattern 3 test: publish/subscribe

python
para_dict = {}
para_dict['rabbit'] = 'rabbit01'
para_dict['routing_key'] = None
para_dict['durable'] = False
para_dict['message_list'] = message_list
para_dict['exchange'] = 'logs'
para_dict['exchange_type'] = 'fanout'

res = req.post('http://127.0.0.1:8000/send_subscribe_message/', json = para_dict)

The subscribing worker takes a different form: it uses a server-assigned temporary queue that is deleted automatically when the consumer disconnects.

python
#!/usr/bin/env python
import pika
import json

credentials = pika.PlainCredentials('xxx', 'xxx')
connection = pika.BlockingConnection(pika.ConnectionParameters(WAN_IP, PORT, '/', credentials))

channel = connection.channel()

channel.exchange_declare(exchange='logs', exchange_type='fanout')

result = channel.queue_declare(queue='', exclusive=True)
queue_name = result.method.queue

channel.queue_bind(exchange='logs', queue=queue_name)

print(' [*] Waiting for logs. To exit press CTRL+C')

def callback(ch, method, properties, body):
    input_data = json.loads(body.decode())
    print(f" [x] ",input_data)

channel.basic_consume(
    queue=queue_name, on_message_callback=callback, auto_ack=True)

channel.start_consuming()

7 Wrap-up

This completes the first version; I can now start building some applications on RabbitMQ.

What comes to mind first is distributed tasks, with the queue holding only each task's metadata. Workers can then fetch the actual data either directly or indirectly.

In the direct approach, the worker itself pulls the data from the database and then runs the task.

In the indirect approach, the worker reports back to a designated queue and another service dispatches the data files (useful when a rented compute node has no extra open ports). A rough sketch of the direct variant follows.
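
A rough illustration of the direct variant (everything here, including the fetch_from_db helper, is hypothetical):

python
import json

def fetch_from_db(task_meta):
    """Hypothetical helper: pull the task's input data straight from the database."""
    raise NotImplementedError

def callback(ch, method, properties, body):
    task_meta = json.loads(body.decode())  # the queue only carries task metadata
    data = fetch_from_db(task_meta)        # the worker pulls the real payload itself
    # ... run the task on `data`, write the result back ...
    ch.basic_ack(delivery_tag=method.delivery_tag)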

Possible applications:

  • 1 Build task-oriented microservices with a front end; when the data volume is small, the data itself can be passed through RabbitMQ directly
  • 2 Receive trade messages from a quant program (see the sketch below)
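
For the second item, a hypothetical example of a quant program pushing a trade message through the agent's WorkQ endpoint (queue name and message fields are made up for illustration):

python
import requests as req

trade_msg = {'symbol': '000001.SZ', 'side': 'buy', 'qty': 100, 'ts': '2024-01-01 09:30:00'}
para_dict = {
    'rabbit': 'rabbit01',
    'queue': 'trade_msgs',      # hypothetical queue name
    'routing_key': 'trade_msgs',
    'durable': True,            # trade messages should survive a broker restart
    'message_list': [trade_msg],
}
res = req.post('http://WAN_IP:24098/send_workq_message/', json=para_dict)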