Author: Andre Luiz, Elastic
Learn how to use the Amazon Nova family of models with Elasticsearch.
In this article, we discuss Amazon Nova, Amazon's family of AI models, and learn how to use it together with Elasticsearch.
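About Amazon Nova
Amazon Nova is a family of AI models from Amazon, available on Amazon Bedrock and designed to deliver high performance and cost efficiency. The models accept text, image, and video input, generate text output, and are optimized for different accuracy, speed, and cost requirements.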
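Main Amazon Nova models
- Amazon Nova Micro: a fast, cost-effective model focused on text processing, suited to translation, reasoning, code completion, and math problem solving. It generates more than 200 tokens per second, which makes it a good fit for applications that need immediate responses.
- Amazon Nova Lite: a low-cost multimodal model that quickly processes images, video, and text. It stands out for its speed and accuracy and suits interactive, high-volume applications, especially cost-sensitive ones.
- Amazon Nova Pro: the most advanced option, combining high accuracy, speed, and cost efficiency. It suits complex tasks such as video summarization, question answering, software development, and AI agents. Expert evaluations indicate that it performs very well at text and visual understanding and can follow instructions to execute automated workflows.
Amazon Nova models fit many use cases, including content creation, data analysis, software development, and AI-based process automation.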
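We will show how to use the Amazon Nova models with Elasticsearch to automate product review analysis.
We will go through the following steps:
- Create an endpoint through the Inference API to integrate Amazon Bedrock with Elasticsearch.
- Create an ingest pipeline that uses the Inference Processor to call the Inference API endpoint.
- Index product reviews and use the pipeline to generate the review analysis automatically.
- Analyze the results of the integration.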
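Creating the endpoint in the Inference API
First, we configure the Inference API to integrate Amazon Bedrock with Elasticsearch. We use Amazon Nova Lite, whose model ID is amazon.nova-lite-v1:0, because it offers a good balance between speed, accuracy, and cost.
Note: you need valid credentials to use Amazon Bedrock. See the documentation for how to obtain access keys.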
```json
PUT _inference/completion/bedrock_completion_amazon_nova-lite
{
  "service": "amazonbedrock",
  "service_settings": {
    "access_key": "#access_key#",
    "secret_key": "#secret_key#",
    "region": "us-east-1",
    "provider": "amazontitan",
    "model": "amazon.nova-lite-v1:0"
  }
}
```
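To keep credentials out of the request text, the body can also be built programmatically. The following is a minimal Python sketch and not part of the original walkthrough: the helper name `build_bedrock_endpoint_body` and the environment variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are assumptions. It only builds the JSON body; sending it to `PUT _inference/completion/bedrock_completion_amazon_nova-lite` is left to whichever HTTP client you use.

```python
import json
import os


def build_bedrock_endpoint_body(access_key, secret_key,
                                region="us-east-1",
                                model="amazon.nova-lite-v1:0"):
    """Build the body of the inference endpoint request from this article.

    Hypothetical helper: it mirrors the fields of the request and does not
    call Elasticsearch or AWS.
    """
    if not access_key or not secret_key:
        raise ValueError("both access_key and secret_key are required")
    return {
        "service": "amazonbedrock",
        "service_settings": {
            "access_key": access_key,
            "secret_key": secret_key,
            "region": region,
            "provider": "amazontitan",
            "model": model,
        },
    }


if __name__ == "__main__":
    # Assumed variable names; fall back to the article's placeholders.
    body = build_bedrock_endpoint_body(
        os.environ.get("AWS_ACCESS_KEY_ID", "#access_key#"),
        os.environ.get("AWS_SECRET_ACCESS_KEY", "#secret_key#"),
    )
    # Mask the secret on a copy before printing so it stays out of logs.
    printable = json.loads(json.dumps(body))
    printable["service_settings"]["secret_key"] = "***"
    print(json.dumps(printable, indent=2))
```

Reading the keys from the environment is only one option; a secrets manager would work equally well with the same helper.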
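Creating the review analysis pipeline
Now we create an ingest pipeline that uses the Inference Processor to run the review analysis prompt. The prompt sends the review text to Amazon Nova Lite and asks it to perform the following:
- Sentiment classification (positive, negative, or neutral)
- Review summary generation
- Keyword extraction
- Authenticity assessment (authentic | suspicious | generic)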
```json
PUT /_ingest/pipeline/review_analyzer_ai
{
  "processors": [
    {
      "script": {
        "source": """ctx.prompt = "Analyze the following product review and return a structured JSON. Task: - Summarize the review concisely. - Detect and classify the sentiment as positive, neutral, or negative.- Generate relevant tags (keywords) based on the review content and detected sentiment. - Evaluate the authenticity of the review (authentic, suspicious, or generic). Review: " + ctx.review + " Respond in JSON format with the following fields: \"review_analyze\": {\"sentiment\": \"<positive | neutral | negative>\", \"authenticity\": \"<authentic | suspicious | generic>\",\"summary\": \"<short review summary>\", \"keywords\": [\"<keyword 1>\", \"<keyword 2>\", \"...\"]}}}"
        """
      }
    },
    {
      "inference": {
        "model_id": "bedrock_completion_amazon_nova-lite",
        "input_output": {
          "input_field": "prompt",
          "output_field": "result"
        }
      }
    },
    {
      "gsub": {
        "field": "result",
        "pattern": "```json",
        "replacement": ""
      }
    },
    {
      "json": {
        "field": "result",
        "strict_json_parsing": false,
        "add_to_root": true
      }
    },
    {
      "remove": {
        "field": "result"
      }
    },
    {
      "remove": {
        "field": "prompt"
      }
    }
  ]
}
```
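The processors after the inference step clean up the model's reply: `gsub` removes the opening Markdown fence marker that models often put before JSON, and the `json` processor with `strict_json_parsing` set to `false` parses leniently, so text that trails the JSON object is dropped rather than rejected. The Python sketch below imitates that behavior locally so you can check how a given reply would be handled. It is an approximation for illustration, not Elasticsearch's implementation, and `sample_reply` is made-up data.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable


def parse_model_reply(reply):
    """Approximate the pipeline's gsub + lenient json steps.

    1. Remove every opening fence marker (FENCE + "json"), like the gsub step.
    2. Decode the first JSON value and ignore any trailing text, which is
       roughly what strict_json_parsing=false permits.
    """
    cleaned = reply.replace(FENCE + "json", "").lstrip()
    value, _end = json.JSONDecoder().raw_decode(cleaned)
    return value


# Made-up reply in the shape the prompt asks for.
sample_reply = (
    FENCE + "json\n"
    '{"review_analyze": {"sentiment": "negative", '
    '"authenticity": "authentic", '
    '"summary": "Broke after three months.", '
    '"keywords": ["broke", "short lifespan"]}}\n'
    + FENCE
)

parsed = parse_model_reply(sample_reply)
print(parsed["review_analyze"]["sentiment"])  # negative
```

A reply that starts with prose instead of JSON fails in this sketch. In the pipeline itself, an `on_failure` handler is one way to deal with such documents.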
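Indexing the reviews
Now we index the product reviews with the Bulk API. The pipeline created earlier is applied automatically and adds the analysis generated by the Nova model to each indexed document.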
```json
POST _bulk
{ "index": { "_index" : "products", "_id": 1, "pipeline":"review_analyzer_ai" } }
{ "product": "Pampers Pants Premium Care Fralda", "review": "Best diaper ever! Great material, lots of cotton, without all that plastic. Doesn't leak! My baby is a boy and every diaper leaked around the waist, this model solved the problem. Even on a small baby it's worth the effort of putting on the short diaper. I put it on my baby at 9 pm and only take it off in the morning, without any leaks." }
{ "index": { "_index" : "products", "_id": 2, "pipeline":"review_analyzer_ai" } }
{ "product": "Portable Electric Body Massager", "review": "It broke in three months for no apparent reason, thank goodness I didn't review it before. I don't recommend buying it because it has a short lifespan." }
{ "index": { "_index" : "products", "_id": 3, "pipeline":"review_analyzer_ai" } }
{ "product": "Havit Fuxi-H3 Black Quad-Mode Wired and Wireless Gaming Headset", "review": "The sound is good for the price, but the connectivity is horrible. You always need to be playing audio, otherwise it loses connection (I work from home, and this is very annoying). Sometimes it loses connection and you have to turn it off and on again to get it back on. The microphone is very sensitive, so it loses connection frequently and you have to turn the headset off and on for the microphone to work again. The flexibility of the stem is useless, because if you move it, the microphone can turn off. Sometimes I need to use Linux and the headset simply doesn't work. It's light and comfortable, the sound is adequate, but the connectivity is terrible." }
{ "index": { "_index" : "products", "_id": 4, "pipeline":"review_analyzer_ai" } }
{ "product": "Air Fryer 4L Oil Free Fryer Mondial", "review": "For those looking for value for money, it's a good option, but the tray (which is underneath the perforated basket) is already peeling a lot. My mother has one just like it and said that hers is even rusting, in other words, the material is MUCH inferior. There's also something that bothers me, because it looks like a microwave, it doesn't fry evenly, it's weaker in the middle and stronger on the sides. Buy at your own risk." }
```
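A Bulk API body is newline-delimited JSON: each document takes one action line followed by one source line, and the body has to end with a newline. When the reviews come from another system, a small helper can produce that body. The sketch below is illustrative only; `build_bulk_body` is a hypothetical name and the two reviews are shortened versions of the ones above.

```python
import json


def build_bulk_body(reviews, index="products", pipeline="review_analyzer_ai"):
    """Return an NDJSON string for the Bulk API.

    Each review becomes two lines: an action line naming the index, the
    document ID, and the ingest pipeline, then the document itself.
    """
    lines = []
    for doc_id, review in enumerate(reviews, start=1):
        action = {"index": {"_index": index, "_id": doc_id, "pipeline": pipeline}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(review, ensure_ascii=False))
    # The Bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Shortened sample reviews, for illustration only.
reviews = [
    {"product": "Portable Electric Body Massager",
     "review": "It broke in three months for no apparent reason."},
    {"product": "Air Fryer 4L Oil Free Fryer Mondial",
     "review": "Good value for money, but the tray is already peeling."},
]

body = build_bulk_body(reviews)
print(body)
```

The resulting string is what you would send as the body of `POST _bulk`, with the `application/x-ndjson` content type.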
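Querying and analyzing the results
Finally, we run a query to see how the Amazon Nova Lite model analyzed and classified the reviews. Running GET products/_search returns the documents, now enriched with the analysis generated from each review.
The model identifies the predominant sentiment (positive, neutral, or negative), produces a brief summary, extracts relevant keywords, and assesses the authenticity of each review. These fields help us understand customer opinions without reading the full text.
When interpreting the results, we look at the following:
- Sentiment: indicates the consumer's overall feeling about the product.
- Summary: distills the main points raised in the review.
- Keywords: can be used to group similar reviews or to spot feedback patterns.
- Authenticity: indicates whether the review appears trustworthy, which helps with content moderation and filtering.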
```json
"hits": [
  {
    "_index": "products",
    "_id": "1",
    "_score": 1,
    "_ignored": [
      "review.keyword"
    ],
    "_source": {
      "product": "Pampers Pants Premium Care Fralda",
      "model_id": "bedrock_completion_amazon_nova-lite",
      "review_analyze": {
        "summary": "The reviewer praises the diaper for its great material, high cotton content, and leak-proof design, especially highlighting its effectiveness for their baby.",
        "sentiment": "positive",
        "keywords": [
          "best diaper",
          "great material",
          "cotton",
          "no plastic",
          "leak-proof",
          "baby",
          "effective"
        ],
        "authenticity": "authentic"
      },
      "review": "Best diaper ever! Great material, lots of cotton, without all that plastic. Doesn't leak! My baby is a boy and every diaper leaked around the waist, this model solved the problem. Even on a small baby it's worth the effort of putting on the short diaper. I put it on my baby at 9 pm and only take it off in the morning, without any leaks."
    }
  },
  {
    "_index": "products",
    "_id": "2",
    "_score": 1,
    "_source": {
      "product": "Portable Electric Body Massager",
      "model_id": "bedrock_completion_amazon_nova-lite",
      "review_analyze": {
        "summary": "The product broke in three months for no apparent reason and the reviewer does not recommend it due to its short lifespan.",
        "sentiment": "negative",
        "keywords": [
          "broke",
          "short lifespan",
          "not recommend"
        ],
        "authenticity": "authentic"
      },
      "review": "It broke in three months for no apparent reason, thank goodness I didn't review it before. I don't recommend buying it because it has a short lifespan."
    }
  },
  {
    "_index": "products",
    "_id": "3",
    "_score": 1,
    "_ignored": [
      "review.keyword"
    ],
    "_source": {
      "product": "Havit Fuxi-H3 Black Quad-Mode Wired and Wireless Gaming Headset",
      "model_id": "bedrock_completion_amazon_nova-lite",
      "review_analyze": {
        "summary": "The headset has good sound quality for the price but suffers from poor connectivity, especially when using the microphone or moving the headset. It also has compatibility issues with Linux.",
        "sentiment": "negative",
        "keywords": [
          "sound",
          "connectivity",
          "microphone",
          "compatibility",
          "annoying",
          "turn off and on",
          "Linux",
          "flexible stem",
          "work from home"
        ],
        "authenticity": "authentic"
      },
      "review": "The sound is good for the price, but the connectivity is horrible. You always need to be playing audio, otherwise it loses connection (I work from home, and this is very annoying). Sometimes it loses connection and you have to turn it off and on again to get it back on. The microphone is very sensitive, so it loses connection frequently and you have to turn the headset off and on for the microphone to work again. The flexibility of the stem is useless, because if you move it, the microphone can turn off. Sometimes I need to use Linux and the headset simply doesn't work. It's light and comfortable, the sound is adequate, but the connectivity is terrible."
    }
  },
  {
    "_index": "products",
    "_id": "4",
    "_score": 1,
    "_ignored": [
      "review.keyword"
    ],
    "_source": {
      "product": "Air Fryer 4L Oil Free Fryer Mondial",
      "model_id": "bedrock_completion_amazon_nova-lite",
      "review_analyze": {
        "summary": "The product offers value for money but has issues with peeling, rusting, and uneven frying.",
        "sentiment": "negative",
        "keywords": [
          "value for money",
          "peeling",
          "rusting",
          "uneven frying",
          "weaker in the middle"
        ],
        "authenticity": "authentic"
      },
      "review": "For those looking for value for money, it's a good option, but the tray (which is underneath the perforated basket) is already peeling a lot. My mother has one just like it and said that hers is even rusting, in other words, the material is MUCH inferior. There's also something that bothers me, because it looks like a microwave, it doesn't fry evenly, it's weaker in the middle and stronger on the sides. Buy at your own risk."
    }
  }
]
```
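Once the hits are available on the client side, the structured fields are easy to aggregate. As a rough illustration, the Python sketch below counts sentiments and keywords across a list of hits. The function name `summarize_hits` is hypothetical, and the `hits` list is a trimmed copy of the response above with summaries and review text left out.

```python
from collections import Counter


def summarize_hits(hits):
    """Count sentiments and keywords across search hits.

    Hits without a review_analyze object (for example, documents whose
    analysis failed) are skipped.
    """
    sentiments = Counter()
    keywords = Counter()
    for hit in hits:
        analysis = hit.get("_source", {}).get("review_analyze")
        if not analysis:
            continue
        sentiments[analysis.get("sentiment", "unknown")] += 1
        # Lowercase keywords so "Linux" and "linux" count as one term.
        keywords.update(k.lower() for k in analysis.get("keywords", []))
    return sentiments, keywords


# Trimmed copies of the hits shown above.
hits = [
    {"_id": "1", "_source": {"review_analyze": {
        "sentiment": "positive", "keywords": ["best diaper", "leak-proof"]}}},
    {"_id": "2", "_source": {"review_analyze": {
        "sentiment": "negative", "keywords": ["broke", "short lifespan"]}}},
    {"_id": "3", "_source": {"review_analyze": {
        "sentiment": "negative", "keywords": ["connectivity", "Linux"]}}},
    {"_id": "4", "_source": {"review_analyze": {
        "sentiment": "negative", "keywords": ["peeling", "rusting"]}}},
]

sentiments, keywords = summarize_hits(hits)
print(dict(sentiments))  # {'positive': 1, 'negative': 3}
```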
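Final thoughts
The integration of Amazon Nova Lite with Elasticsearch shows how a language model can turn raw reviews into structured, valuable information. By processing the reviews through a pipeline, we extract sentiment, authenticity, summary, and keywords automatically and consistently.
The results show that the model understands the context of each review, classifies the user's opinion, and highlights the most relevant points of each experience. This yields a richer dataset that can be used to improve search.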
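As one example of putting the enriched fields to work in search, the sketch below builds a query body that keeps only reviews with a given sentiment and counts their most frequent keywords. It assumes the index relies on dynamic mapping, so that string fields get a `.keyword` sub-field (the `review.keyword` entries under `_ignored` in the results suggest this is the case here). The function name and its defaults are hypothetical.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


def build_sentiment_query(sentiment, top_keywords=10):
    """Build a search body: filter by sentiment, aggregate keywords.

    Assumes dynamically mapped fields, i.e. that the sub-fields
    review_analyze.sentiment.keyword and review_analyze.keywords.keyword exist.
    """
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unknown sentiment: {sentiment!r}")
    return {
        "query": {
            "term": {"review_analyze.sentiment.keyword": sentiment}
        },
        "aggs": {
            "top_keywords": {
                "terms": {
                    "field": "review_analyze.keywords.keyword",
                    "size": top_keywords,
                }
            }
        },
    }


print(json.dumps(build_sentiment_query("negative"), indent=2))
```

The printed body can be sent with `GET products/_search`. With an explicit mapping that declares these fields as `keyword`, the `.keyword` suffix would not be needed.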
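Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running!
Elasticsearch is packed with new features to help you build the best search solutions. Explore our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now!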