使用阿里云试用Elasticsearch学习:使用内置模型 lang_ident_model_1 创建管道并使用

文档:https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-deploy-model.html

部署刚刚下载好的内置模型


部署内存不够用

还得花钱,拉几把倒吧。就用自带的吧。

测试模型

bash 复制代码
POST _ml/trained_models/lang_ident_model_1/_infer
{
  "docs":[{"text": "The fool doth think he is wise, but the wise man knows himself to be a fool."}]
}

以下是高概率预测英语的结果。

bash 复制代码
{
  "inference_results": [
    {
      "predicted_value": "en",
      "prediction_probability": 0.9999658805366392,
      "prediction_score": 0.9999658805366392
    }
  ]
}

创建管道

添加处理器

reference 推理

bash 复制代码
# Field map
{
  "message": "text"
}
# Inference configuration
{
  "classification":{
    "num_top_classes":5
  }
}

set 设置

bash 复制代码
#  field 
event.ingested
# value 
{{{_ingest.timestamp}}}

失败处理器

测试

bash 复制代码
[
  {
    "_source": {
    "text_field":"Hello, my name is Josh and I live in Berlin."
    }
  }
]
bash 复制代码
[
 {
   "_source":{
     "message":"Sziasztok! Ez egy rövid magyar szöveg. Nézzük, vajon sikerül-e azonosítania a language identification funkciónak? Annak ellenére is sikerülni fog, hogy a szöveg két angol szót is tartalmaz."
     }
  }
]


测试没问题,创建管道

使用

安装插件

注意版本号与es版本一直,都是8.9.1。安装完会自行重启。
下载mapper-annotated-text安装包

映射索引

注意message字段别写错

bash 复制代码
PUT ner-test
{
  "mappings": {
    "properties": {
      "ml.inference.predicted_value": {"type": "annotated_text"},
      "ml.inference.model_id": {"type": "keyword"},
      "message": {"type": "text"},
      "event.ingested": {"type": "date"}
    }
  }
}

索引文档

通过管道 lang_ident_model_1 索引一批文档

bash 复制代码
POST /_bulk?pipeline=lang_ident_model_1
{"create":{"_index":"ner-test","_id":"1"}}
{"message":"Hello, my name is Josh and I live in Berlin."}
{"create":{"_index":"ner-test","_id":"2"}}
{"message":"I work for Elastic which was founded in Amsterdam."}
{"create":{"_index":"ner-test","_id":"3"}}
{"message":"Elastic has headquarters in Mountain View, California."}
{"create":{"_index":"ner-test","_id":"4"}}
{"message":"Elastic's founder, Shay Banon, created Elasticsearch to solve a simple need: finding recipes!"}
{"create":{"_index":"ner-test","_id":"5"}}
{"message":"Elasticsearch is built using Lucene, an open source search library."}

或者用query

bash 复制代码
POST lang-test/_doc?pipeline=ner-test
{
  "message": "Mon pays ce n'est pas un pays, c'est l'hiver"
}

查看数据

bash 复制代码
"hits": [
      {
        "_index": "ner-test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "message": "Hello, my name is Josh and I live in Berlin.",
          "event": {
            "ingested": "2024-04-13T20:31:48.855089336Z"
          },
          "ml": {
            "inference": {
              "predicted_value": "en",
              "top_classes": [
                {
                  "class_name": "en",
                  "class_probability": 0.9854748734614491,
                  "class_score": 0.9854748734614491
                },
                {
                  "class_name": "tg",
                  "class_probability": 0.003855695585908385,
                  "class_score": 0.003855695585908385
                },
                {
                  "class_name": "ig",
                  "class_probability": 0.0036940515396614113,
                  "class_score": 0.0036940515396614113
                },
                {
                  "class_name": "sw",
                  "class_probability": 0.0021393582129747924,
                  "class_score": 0.0021393582129747924
                },
                {
                  "class_name": "it",
                  "class_probability": 0.0011839650697029283,
                  "class_score": 0.0011839650697029283
                }
              ],
              "prediction_probability": 0.9854748734614491,
              "prediction_score": 0.9854748734614491,
              "model_id": "lang_ident_model_1"
            }
          }
        }
      },......

文档重新索引到新目标

bash 复制代码
POST _reindex
{
  "source": {
    "index": "ner-test-new",
    "size": 50
  },
  "dest": {
    "index": "ner-test",
    "pipeline": "lang_ident_model_1"
  }
}
相关推荐
hengzhepa23 分钟前
ElasticSearch备考 -- Search across cluster
学习·elasticsearch·搜索引擎·全文检索·es
Elastic 中国社区官方博客2 小时前
Elasticsearch:使用 LLM 实现传统搜索自动化
大数据·人工智能·elasticsearch·搜索引擎·ai·自动化·全文检索
慕雪华年3 小时前
【WSL】wsl中ubuntu无法通过useradd添加用户
linux·ubuntu·elasticsearch
Elastic 中国社区官方博客5 小时前
使用 Vertex AI Gemini 模型和 Elasticsearch Playground 快速创建 RAG 应用程序
大数据·人工智能·elasticsearch·搜索引擎·全文检索
alfiy6 小时前
Elasticsearch学习笔记(四) Elasticsearch集群安全配置一
笔记·学习·elasticsearch
alfiy7 小时前
Elasticsearch学习笔记(五)Elastic stack安全配置二
笔记·学习·elasticsearch
丶21361 天前
【大数据】Elasticsearch 实战应用总结
大数据·elasticsearch·搜索引擎
闲人编程1 天前
elasticsearch实战应用
大数据·python·elasticsearch·实战应用
世俗ˊ1 天前
Elasticsearch学习笔记(3)
笔记·学习·elasticsearch
weixin_466286681 天前
ElasticSearch入门
大数据·elasticsearch·搜索引擎