Vector Storage and Similarity Search in Elasticsearch 8 from .NET with Elastic.Clients.Elasticsearch

I. Test Environment

Elastic.Clients.Elasticsearch version: 8.13.0

Elasticsearch version: 8.13.0

II. Code

1. Create an index containing a DenseVector field
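public static bool InitIndex()
{
    // Field mappings. The names are camelCase on the assumption that the client's
    // default serializer writes ES_FaceVector properties in camelCase; they must
    // match the "embedding" and "fileName" fields that SearchKnn refers to.
    var faceVectorProperties = new Properties
    {
        { "id", new KeywordProperty() },
        { "fileID", new KeywordProperty() },
        { "fileGUID", new KeywordProperty() },
        { "resourceID", new KeywordProperty() },
        { "fileName", new TextProperty() },
        // Dims must equal the length of every vector stored in this field.
        { "embedding", new DenseVectorProperty { Dims = 3 } }
    };
    // Index settings and mappings.
    var indexConfig = new IndexState
    {
        Settings = new IndexSettings
        {
            NumberOfShards = 1,  // number of primary shards
            NumberOfReplicas = 1 // number of replicas
        },
        Mappings = new TypeMapping
        {
            Properties = faceVectorProperties
        }
    };
    // Elasticsearch index names must be lowercase, so the literal "FaceVector" would be rejected.
    // ElasticIndexEnum.FaceVector is assumed to hold a lowercase name such as "facevector".
    string indexName = ElasticIndexEnum.FaceVector;
    // Check whether the index already exists. Exists reports whether the index was found;
    // IsValidResponse only reports whether the request itself succeeded.
    var existFaceVectorIndexResponse = _client.Indices.ExistsAsync(indexName).Result;
    if (!existFaceVectorIndexResponse.Exists)
    {
        // Build the create-index request.
        var createIndexRequest = new CreateIndexRequest(indexName)
        {
            Settings = indexConfig.Settings,
            Mappings = indexConfig.Mappings
        };
        var createFaceVectorIndexResponse = _client.Indices.CreateAsync(createIndexRequest).Result;
        if (createFaceVectorIndexResponse.Acknowledged)
        {
            // Add one test document.
            ES_FaceVector temp = new ES_FaceVector
            {
                FileID = 0,
                FileGUID = Guid.NewGuid(),
                ResourceID = 0,
                FileName = "test",
                Embedding = new float[] { 1.2f, 1.1f, 1.3f }
            };
            var addDocResult = AddDoc<ES_FaceVector>(temp, indexName);
        }
        else
        {
            return false;
        }
    }
    return true;
}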
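
InitIndex refers to `_client`, `ES_FaceVector`, and `ElasticIndexEnum`, none of which the article shows. The following is only a guess at what they might look like, inferred from how the snippets use them. The `EsHelper` class name, the cluster URL (a local node with security disabled), the property types, and the lowercase index name "facevector" are all assumptions.

```csharp
#nullable enable
using System;
using Elastic.Clients.Elasticsearch;

// Hypothetical index-name constants. Elasticsearch requires lowercase index names.
public static class ElasticIndexEnum
{
    public const string FaceVector = "facevector"; // assumed value
}

// Hypothetical document type, inferred from the test document built in InitIndex.
public class ES_FaceVector
{
    public string? Id { get; set; }
    public int FileID { get; set; }
    public Guid FileGUID { get; set; }
    public int ResourceID { get; set; }
    public string? FileName { get; set; }
    public float[]? Embedding { get; set; } // length must equal Dims in the mapping
}

// Hypothetical holder for the shared client that the article calls _client.
public static partial class EsHelper
{
    private static readonly ElasticsearchClient _client = new ElasticsearchClient(
        new ElasticsearchClientSettings(new Uri("http://localhost:9200")) // assumed endpoint
            .DefaultIndex(ElasticIndexEnum.FaceVector));
}
```

With definitions along these lines, InitIndex, AddDoc, AddDocs, and SearchKnn would be static members of the same class that holds `_client`.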
2. Index documents
// Index a batch of documents with a single bulk request.
public static bool AddDocs<T>(List<T> data, string indexName) where T : class
{
    var bulkIndexResponse = _client.BulkAsync(b => b
        .Index(indexName)
        .IndexMany(data)
    ).Result;
    return bulkIndexResponse.IsValidResponse;
}
// Index a single document.
public static bool AddDoc<T>(T data, string indexName) where T : class
{
    var response = _client.IndexAsync(data, indexName).Result;
    return response.IsValidResponse;
}
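
Elasticsearch rejects a document whose vector length differs from the `Dims` declared in the mapping, so a client-side check before calling AddDoc or AddDocs can surface bad vectors earlier. The sketch below is one possible check, not part of the client library: `TryValidateEmbedding` is a hypothetical helper, and its zero-vector rule assumes the field uses cosine similarity, which does not accept zero-magnitude vectors.

```csharp
using System;
using System.Linq;

// Hypothetical helper: returns true when the vector looks safe to index into a
// dense_vector field with the given number of dimensions.
static bool TryValidateEmbedding(float[] vector, int dims, out string error)
{
    if (vector is null) { error = "embedding is null"; return false; }
    if (vector.Length != dims) { error = $"expected {dims} dimensions, got {vector.Length}"; return false; }
    if (vector.Any(x => float.IsNaN(x) || float.IsInfinity(x))) { error = "embedding contains NaN or Infinity"; return false; }
    // Assumption: the field uses cosine similarity, which rejects zero-magnitude vectors.
    if (vector.All(x => x == 0f)) { error = "embedding has zero magnitude"; return false; }
    error = string.Empty;
    return true;
}

const int Dims = 3; // must match DenseVectorProperty.Dims in the mapping

var candidates = new[]
{
    new[] { 1.2f, 1.1f, 1.3f }, // valid
    new[] { 1.2f, 1.1f },       // wrong length
    new[] { 0f, 0f, 0f }        // zero magnitude
};

foreach (var vector in candidates)
{
    var ok = TryValidateEmbedding(vector, Dims, out var error);
    Console.WriteLine(ok ? "ok" : $"rejected: {error}");
}
```

Filtering a list through such a check before AddDocs keeps a single malformed vector from failing an otherwise valid bulk request.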
3. Run an approximate kNN search on the vector field
public static void SearchKnn()
{
    // Build the kNN query. The query vector must have the same number of
    // dimensions as the mapped field (3 in this example).
    var queryVector = new[] { -0.04604065f, 0.054946236f, 0.057453074f };
    var knnQuery = new KnnQuery()
    {
        k = 2,                // number of nearest neighbors to return
        NumCandidates = 1000, // candidates examined on each shard
        Field = "embedding",
        QueryVector = queryVector
    };
    // Build the search request.
    var searchRequest = new SearchRequest<ES_FaceVector>(ElasticIndexEnum.FaceVector)
    {
        Knn = new KnnQuery[] { knnQuery },
        MinScore = 0.90, // drop hits whose _score is below 0.90
        SourceIncludes = new[] { "fileName", "embedding" }
    };

    var searchResponse = _client.Search<ES_FaceVector>(searchRequest);
    if (searchResponse.IsValidResponse)
    {
        foreach (var hit in searchResponse.Hits)
        {
            // Process each matching document.
            var fileNameTemp = hit.Source.FileName;
            var embeddingTemp = hit.Source.Embedding;
        }
    }
    else
    {
        Console.WriteLine($"Error: {searchResponse.DebugInformation}");
    }
}
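
The MinScore value is easier to choose once the score formula is known. The mapping above does not set a similarity, so this sketch assumes the field falls back to cosine similarity, for which the Elasticsearch documentation gives the kNN `_score` as (1 + cosine) / 2. Under that assumption, MinScore = 0.90 keeps only hits whose cosine similarity is at least 0.80. The `Cosine` and `KnnScore` helpers are hypothetical and only reproduce the arithmetic locally.

```csharp
using System;

// Cosine similarity between two vectors of equal length.
static double Cosine(float[] a, float[] b)
{
    if (a.Length != b.Length) throw new ArgumentException("Vectors must have the same length.");
    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Assumed scoring rule for a cosine-similarity dense_vector field: (1 + cosine) / 2.
static double KnnScore(float[] query, float[] doc) => (1 + Cosine(query, doc)) / 2;

var query = new[] { -0.04604065f, 0.054946236f, 0.057453074f }; // query vector from SearchKnn
var stored = new[] { 1.2f, 1.1f, 1.3f };                        // test document from InitIndex

var score = KnnScore(query, stored);
Console.WriteLine($"score = {score:F4}, passes MinScore 0.90: {score >= 0.90}");
```

For these two vectors the score comes out at roughly 0.71, so under the cosine assumption the test document added by InitIndex would be filtered out by MinScore = 0.90. An empty result from SearchKnn against a freshly initialized index is therefore not necessarily an error.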

III. References

Connecting to Elasticsearch 8 from .NET with Elastic.Clients.Elasticsearch

https://www.elastic.co/guide/en/elasticsearch/client/net-api/8.13/connecting.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html

