Milvus Quick Start (Including Image Search)

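Contents

[1 Installation](#1 Installation)

[2 Use cases](#2 Use cases)

[2.1 Document search, e.g. RAG](#2.1 Document search, e.g. RAG)

[2.2 Image retrieval: search by image](#2.2 Image retrieval: search by image)

[3 Integrations](#3 Integrations)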


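Milvus provides powerful data modeling capabilities that let you organize unstructured or multimodal data into structured Collections. It supports a wide range of data types for different attribute models, including common numeric and character types, various vector types, arrays, sets, and JSON.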

1 Installation

https://milvus.io/docs/zh/install_standalone-docker-compose.md

sudo curl -SL https://github.com/docker/compose/releases/download/v2.30.3/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
# Make the standalone binary at the install path executable
sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
wget https://github.com/milvus-io/milvus/releases/download/v2.6.9/milvus-standalone-docker-compose.yml -O docker-compose.yml

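# The standalone binary installed above is invoked as docker-compose; use "docker compose" if you have the Compose plugin instead
sudo docker-compose up -d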

Creating milvus-etcd  ... done
Creating milvus-minio ... done
Creating milvus-standalone ... done
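
Before connecting a client, it can help to confirm that the standalone container is actually accepting connections. The following is a minimal sketch, assuming Milvus listens on port 19530 on localhost (the address used by the client examples in this article). The helper name `is_port_open` is made up for illustration, and an open port only shows that the socket is reachable, not that Milvus has finished starting.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 19530 is the port used by the client examples in this article.
    print("Milvus reachable:", is_port_open("localhost", 19530))
```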

How to enable password authentication

# Enable authentication: https://milvus.io/docs/zh/authenticate.md?tab=docker
...
common:
...
  security:
    authorizationEnabled: true
...
# Default credentials: root:Milvus

#pip install -U pymilvus
from pymilvus import MilvusClient

client = MilvusClient(
    uri='http://localhost:19530', # replace with your own Milvus server address
    token="root:Milvus"
) 
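
Hard-coding `root:Milvus` is fine for a local test, but the token is simply `username:password`, so it can be assembled from environment variables instead. A minimal sketch follows. The variable names `MILVUS_USER` and `MILVUS_PASSWORD` and the helper `build_token` are assumptions made for illustration, not something Milvus or pymilvus defines.

```python
import os


def build_token(env=None) -> str:
    """Build a 'username:password' token, falling back to the defaults above."""
    env = os.environ if env is None else env
    user = env.get("MILVUS_USER", "root")          # hypothetical variable name
    password = env.get("MILVUS_PASSWORD", "Milvus")  # hypothetical variable name
    return f"{user}:{password}"


if __name__ == "__main__":
    token = build_token()
    # Pass it to the client exactly as in the snippet above:
    # client = MilvusClient(uri="http://localhost:19530", token=token)
    print("token user:", token.split(":", 1)[0])
```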

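2 Use cases

2.1 Document search, e.g. RAG

Build RAG with Milvus | Milvus Documentation

This example uses a cloud-hosted embedding model; if you host your own embedding model, see my earlier article.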

import dashscope
from dashscope import TextEmbedding
from pymilvus import MilvusClient

dashscope.api_key=''
client = MilvusClient("http://127.0.0.1:19530")

def create_collection(collection_name):
    if client.has_collection(collection_name=collection_name):
        client.drop_collection(collection_name=collection_name)
    client.create_collection(
        collection_name=collection_name,
        dimension=1024,  # Must match the embedding size requested from the model (1024 here)
        metric_type="IP",  # Inner product distance
        consistency_level="Bounded",  # Supported values are (`"Strong"`, `"Session"`, `"Bounded"`, `"Eventually"`). See https://milvus.io/docs/consistency.md#Consistency-Level for more details.
    )

    print(client.list_collections())
    print(client.describe_collection(collection_name=collection_name))

def emb_text(text):
    return (
         TextEmbedding.call(
            model="text-embedding-v4",
            input=text,
            dimension=1024
     )
    ).output['embeddings']

def insert_data(collection_name, data):
    res = client.insert(collection_name=collection_name, data=data)
    return  res

def search_data(collection_name):
    query_vectors = emb_text(["深度学习"])[0]['embedding']

    res = client.search(
        collection_name=collection_name,  # target collection
        data=[query_vectors],  # query vectors
        limit=2,  # number of returned entities
        output_fields=["text", "subject"],  # specifies fields to be returned
    )
    print(res)

if __name__ == '__main__':
    # Create the collection
    create_collection("demo_collection")
    documents = [
        "人工智能是计算机科学的一个分支",
        "机器学习是实现人工智能的重要方法",
        "深度学习是机器学习的一个子领域"
    ]
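    # test_embedding = emb_text("This is a test")
    test_embedding = emb_text(documents)
    # Length of one vector (not the number of documents)
    embedding_dim = len(test_embedding[0]['embedding'])
    # print(embedding_dim)  # vector length, i.e. the configured dimension (1024)
    # print(test_embedding)  # the raw embedding results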
    data = [
        {"id": i, "vector": test_embedding[i]['embedding'], "text": documents[i], "subject": "demo"}
        for i in range(len(documents))
    ]
    print("Data has", len(data), "entities, each with fields: ", data[0].keys())
    print("Vector dim:", len(data[0]["vector"]))
    # Insert the data
    res = insert_data("demo_collection", data)
    print(res)
    # Run the search
    search_data("demo_collection")

The search returns the most relevant documents, ranked by their similarity score.
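
To make the similarity score concrete: with `metric_type="IP"`, the score is the inner product of the query vector and each stored vector, and a larger value means a closer match. The sketch below reproduces that ranking in plain Python on made-up 3-dimensional vectors (the real embeddings here have 1024 dimensions). It illustrates the metric only and is not how Milvus computes or indexes results internally.

```python
def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, vectors, k=2):
    """Return (index, score) pairs for the k highest inner products."""
    scored = [(i, inner_product(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy vectors, invented for illustration only.
    stored = [
        [1.0, 0.0, 0.0],
        [0.6, 0.8, 0.0],
        [0.0, 0.6, 0.8],
    ]
    query = [0.0, 1.0, 0.0]
    print(top_k(query, stored, k=2))  # [(1, 0.8), (2, 0.6)]
```

Here `k=2` plays the same role as `limit=2` in the `client.search` call above.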

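2.2 Image retrieval: search by image

Image Search with Milvus | Milvus Documentation

The test data here consists of two corgi pictures and one golden retriever picture.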

import base64
import os

import dashscope
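from pymilvus import MilvusClient

# Connect directly rather than importing `client` from the author's local milvus.demo module
client = MilvusClient("http://127.0.0.1:19530")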

dashscope.api_key=''
image = "https://dashscope.oss-cn-beijing.aliyuncs.com/images/256_1.png"

def create_collection(collection_name):
    if client.has_collection(collection_name=collection_name):
        client.drop_collection(collection_name=collection_name)
    client.create_collection(
        collection_name=collection_name,
        auto_id=True,
        vector_field_name="vector",
        dimension=1152,  # Must match the size of the vectors returned by the embedding model (1152 here)
        metric_type="IP",  # Inner product distance
        consistency_level="Bounded",  # Supported values are (`"Strong"`, `"Session"`, `"Bounded"`, `"Eventually"`). See https://milvus.io/docs/consistency.md#Consistency-Level for more details.
    )

    print(client.list_collections())
    print(client.describe_collection(collection_name=collection_name))

# def insert_data(collection_name, data):
#     res = client.insert(collection_name=collection_name, data=data)
#     return  res


def image_to_base64(image_path):
    with open(image_path, "rb") as image_file:
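        # Read the file and encode it as Base64
        base64_image = base64.b64encode(image_file.read()).decode('utf-8')
    # Image format; change it to match your files (jpg, bmp, etc.)
    image_format = "png"
    image_data = f"data:image/{image_format};base64,{base64_image}"
    # Wrap the data URI in the input structure passed to the embedding call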
    input = [{'image': image_data}]
    return input


#input = [{'image': image}]
def emb_text(input):
    # Call the model API
    resp = dashscope.MultiModalEmbedding.call(
        model="tongyi-embedding-vision-plus",
        input=input
    ).output['embeddings'][0]['embedding']
    # print(resp)
    # print(len(resp))
    return resp

def search_data(input,collection_name):
    query_vectors = emb_text(input)

    res = client.search(
        collection_name=collection_name,  # target collection
        data=[query_vectors],  # query vectors
        limit=2,  # number of returned entities
        output_fields=["filename"],  # specifies fields to be returned
    )
    print(res)


if __name__ == '__main__':
    create_collection("image_embeddings")
    for file in os.listdir("../data"):
        if file.endswith(".png"):
            input = image_to_base64("../data/" + file)
            image_embedding = emb_text(input)
            res = client.insert(
                "image_embeddings",
                {"vector": image_embedding, "filename": file},
            )
            print(res)

    search_data(image_to_base64("../data/柯基1.png"),"image_embeddings")
    #emb_text(input)
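
`image_to_base64` above hard-codes `png` as the image format. One possible variant, sketched here, guesses the MIME type from the file extension with the standard library's `mimetypes` module, so `.jpg` or `.bmp` files get a matching prefix. The function name `image_to_data_uri` and the fallback to `image/png` are choices made for this example.

```python
import base64
import mimetypes


def image_to_data_uri(image_path: str) -> str:
    """Encode an image file as a data URI whose MIME type follows the extension."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "image/png"  # fallback chosen for this sketch
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The result can be wrapped the same way as before, `[{'image': image_to_data_uri(path)}]`, before it is passed to the embedding call.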

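This demo returns the `filename` field: searching with a corgi picture returns corgi pictures. In a real application you could return the image URL to the front end to implement search by similar image.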
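
As a rough illustration of that last step, the sketch below turns search hits into image URLs that a front end could render. It assumes the result shape printed by `client.search` above (one list of hits per query vector, each hit carrying `distance` and an `entity` dict with the requested `output_fields`). The base URL `https://example.com/images/` and the helper `hits_to_urls` are hypothetical.

```python
from urllib.parse import quote


def hits_to_urls(search_result, base_url):
    """Flatten the hits of the first query into [{'url': ..., 'score': ...}]."""
    hits = search_result[0] if search_result else []
    return [
        {
            # quote() percent-encodes spaces and non-ASCII file names
            "url": base_url + quote(hit["entity"]["filename"]),
            "score": hit["distance"],
        }
        for hit in hits
    ]


if __name__ == "__main__":
    # Hand-written stand-in for a search result; ids and scores are invented.
    fake_result = [[
        {"id": 1, "distance": 0.99, "entity": {"filename": "corgi_1.png"}},
        {"id": 2, "distance": 0.87, "entity": {"filename": "corgi_2.png"}},
    ]]
    for item in hits_to_urls(fake_result, "https://example.com/images/"):
        print(item)
```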

3 Integrations

To be updated later.

https://docs.llamaindex.org.cn/en/stable/examples/vector_stores/MilvusIndexDemo/
