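Oracle 26ai Vector Search from Scratch: From Model Loading to a First Semantic Search

Preface

In the AI era, getting a database to understand meaning and support intelligent search has become a hot topic. Oracle 26ai has vector search built into the database, so semantic text retrieval can be done in SQL without an external vector database. This article walks through the whole flow in Oracle 26ai from scratch: loading an ONNX model, generating vectors, and running a similarity search.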

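Choosing a model

To generate vectors from data we need an embedding model. Vectors can be generated externally and then loaded into the database, or the model can be loaded into the database and used there to generate vectors from the data. To keep things simple, I use the all-MiniLM-L12-v2 model that Oracle provides to convert plain text into vectors.

Download URL for the all-MiniLM-L12-v2 model: https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/TtH6hL2y25EypZ0-rrczRZ1aXp7v1ONbRBfCiT-BDBN8WLKQ3lgyW6RxCfIFLdA6/n/adwc4pm/b/OML-ai-models/o/all_MiniLM_L12_v2_augmented.zip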

Loading the model

Create a directory to hold the model, then download the archive and unzip it into that directory:

bash
[oracle@ora26ai ~]$ mkdir -p /u01/models
[oracle@ora26ai ~]$ cd /u01/models
[oracle@ora26ai models]$ unzip -oq all_MiniLM_L12_v2_augmented.zip
[oracle@ora26ai models]$ ls
all_MiniLM_L12_v2_augmented.zip  all_MiniLM_L12_v2.onnx  LICENSE_ATTRIBUTION.txt  README-ALL_MINILM_L12_V2-augmented.txt

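The following tests use an Oracle 26ai database. Connect to the database, create a test user, create a directory object pointing to the physical directory, and grant the test user access to it:

sql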
[oracle@ora26ai models]$ sqlplus sys/Oraoracle@//10.10.10.39:1521/pdb1 as sysdba

SQL*Plus: Release 23.26.1.0.0 - Production on Thu Apr 9 23:09:50 2026
Version 23.26.1.0.0

Copyright (c) 1982, 2025, Oracle.  All rights reserved.


Connected to:
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.0.0 - Production
Version 23.26.1.0.0

SQL> create user if not exists testuser1 identified by testuser1 quota unlimited on users;

User created.

SQL> grant create session, db_developer_role, create mining model to testuser1;

Grant succeeded.

SQL> create or replace directory model_dir as '/u01/models';

Directory created.

SQL> grant read, write on directory model_dir to testuser1;

Grant succeeded.

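Now load the model into the database with the DBMS_VECTOR package. The drop call with force => true lets the block be rerun without raising an error when the model does not exist yet:

sql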
begin
  dbms_vector.drop_onnx_model (
    model_name => 'ALL_MINILM_L12_V2',
    force => true);

  dbms_vector.load_onnx_model (
    directory  => 'model_dir',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'ALL_MINILM_L12_V2');
end;
/

Check the model details in the USER_MINING_MODELS view:

sql
column model_name format a30
column algorithm format a10
column mining_function format a15

select model_name, algorithm, mining_function
from   user_mining_models
where  model_name = 'ALL_MINILM_L12_V2';

Output:
MODEL_NAME                     ALGORITHM  MINING_FUNCTION
------------------------------ ---------- ---------------
ALL_MINILM_L12_V2              ONNX       EMBEDDING

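Generating vectors (the VECTOR data type)

We can now generate vectors with the VECTOR_EMBEDDING function. The following example generates a vector for the text "Quick test". As the output shows, the vector is very large relative to the length of the text.

sql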
set long 1000000
set pagesize 400
select vector_embedding(all_minilm_l12_v2 using 'Quick test' as data) AS my_vector;

MY_VECTOR
--------------------------------------------------------------------------------
[-3.86444256E-002,7.27762803E-002,-6.99377712E-003,-7.29618035E-003,8.81512091E-
003,-6.36086613E-002,4.39666817E-003,-4.20215651E-002,-1.32307202E-001,-5.837616
05E-003,-1.3236966E-002,-1.62914731E-002,6.54898351E-003,-4.983522E-002,-1.98450
536E-002,-4.69920225E-002,1.03937663E-001,-8.96753445E-002,-2.77861813E-003,4.13
947664E-002,-6.51626661E-002,-1.09901905E-001,-8.73053819E-003,2.533352E-002,-1.
42030632E-002,-2.42071245E-002,1.91591978E-002,4.93748812E-003,6.30869251E-003,-
1.24127813E-001,-7.17297941E-003,3.7317384E-002,4.97635901E-002,4.52162437E-002,
1.49683114E-002,-2.21795831E-002,-3.67936082E-002,-6.20233943E-004,7.16803819E-0
02,5.33913262E-003,1.92087106E-002,-9.91346017E-002,3.90679464E-002,2.22725421E-
002,5.04363105E-002,1.81943253E-002,5.34031466E-002,1.44161871E-002,-1.99907795E
-002,-1.20323608E-002,-2.63888389E-002,-4.14667316E-002,6.2473774E-002,-4.688386
62E-002,1.16748568E-002,-2.43180972E-002,-3.11982706E-002,-7.5750039E-003,2.2546
6359E-002,-4.17359956E-002,1.23237111E-002,4.31706831E-002,-7.83750787E-002,1.24
918511E-002,5.42060807E-002,4.33742851E-002,2.52278382E-003,-1.15482137E-002,-9.
98658361E-004,-2.12613102E-002,1.00960173E-002,3.17986645E-002,-1.13146752E-002,
-1.26893371E-002,2.66182758E-002,-7.50683714E-003,-3.70341949E-002,1.94851588E-0
02,-2.9213747E-002,-2.61210538E-002,2.86212545E-002,-9.15900841E-002,1.50552345E
-002,-4.98168021E-002,2.29324233E-002,7.82517716E-003,4.22972552E-002,3.37974802
E-002,-4.2345725E-002,-6.32970557E-002,3.84949856E-002,-1.93851739E-002,1.962338
11E-003,-3.91593436E-004,7.80334743E-003,5.63595518E-002,4.45814878E-002,-4.9701
6348E-002,1.36384079E-002,2.76547611E-001,6.3580215E-002,-1.69337653E-002,-3.259
48671E-002,2.74621621E-002,-1.84809547E-002,-3.58916223E-002,3.18280957E-003,-3.
92074026E-002,-5.03925188E-003,-3.95198427E-002,2.64224112E-002,5.44404946E-002,
-2.97634304E-003,1.06564369E-002,4.55005467E-002,-9.6166715E-002,4.53019142E-002
,3.02239861E-002,-1.11025631E-001,6.18582554E-002,8.55141804E-002,-1.51456865E-0
02,-5.64082488E-002,-5.93042118E-004,1.07500188E-001,-6.81523383E-002,1.85917299
E-002,3.75312977E-002,-3.27163041E-002,-4.72422279E-002,5.59753366E-002,2.204377
58E-002,2.74991542E-002,2.6306238E-002,-5.36229946E-002,6.86868653E-003,5.060332
36E-003,8.8686645E-002,3.97679023E-002,-1.49526203E-003,-1.11991554E-001,-1.4892
0557E-002,-1.42183565E-002,5.44682518E-002,-5.46902828E-002,-3.37714851E-002,-3.
93099487E-002,-8.88748933E-003,-2.50034276E-002,-3.86718176E-002,7.15422928E-002
,-1.72947664E-002,5.7217218E-002,1.23445597E-002,-6.2534079E-002,-1.97963249E-00
2,4.08163741E-002,9.22357664E-003,2.36456636E-002,-4.27602604E-003,-1.24366455E-
001,8.28649253E-002,-5.27119003E-002,-1.11121042E-002,4.35752422E-002,1.35777146
E-002,-2.25060452E-002,4.55260389E-002,3.8973894E-002,-8.93306658E-002,1.1729340
3E-001,5.51190414E-002,-2.56631672E-002,-5.30632474E-002,-8.3953537E-002,4.83216
904E-003,6.57674596E-002,8.87271464E-002,-1.52742835E-002,1.0525425E-002,-1.5814
418E-002,-3.10783181E-002,-2.90690996E-002,7.04224128E-003,-3.09849493E-002,-4.4
6302071E-003,-7.20088482E-002,-7.05658421E-002,4.65546139E-002,1.10276654E-001,3
.60872261E-002,1.86069943E-002,-6.10642694E-002,3.21829356E-002,-1.43657662E-002
,-6.75653145E-002,8.0748558E-002,1.68782603E-002,-1.0059043E-001,-7.55800977E-00
2,-1.69591643E-002,-4.4571083E-002,-8.6054299E-003,4.3378789E-002,4.29520719E-00
2,3.94066535E-002,8.67496245E-003,-8.5212335E-002,1.20206423E-001,-1.14268221E-0
01,-1.7028559E-002,8.87670461E-003,-4.69081141E-002,-3.02138384E-002,4.61057387E
-002,-4.92519997E-002,1.5618098E-002,-9.27053913E-002,-6.08293712E-002,1.4645159
2E-002,-1.84691269E-002,-1.40407518E-001,5.35490997E-002,5.85880894E-033,7.62652
457E-002,-3.07706036E-002,-6.74770074E-003,1.03074081E-001,7.20860362E-002,-9.75
818709E-002,1.51840553E-001,7.43321329E-002,-2.99238227E-002,9.39517915E-002,1.5
0299724E-002,4.35530096E-002,-7.5808214E-003,-7.49262646E-002,-5.07647395E-002,4
.01099063E-002,-7.43360221E-002,4.62087579E-002,9.61421523E-003,3.15185694E-004,
6.20259941E-002,1.57011077E-002,3.29307318E-002,5.69748059E-002,-7.89974183E-002
,9.78372246E-003,1.16776042E-002,-3.65987606E-002,-5.30387387E-002,-1.22491773E-
002,5.65313101E-002,3.41438241E-002,-4.26849052E-002,9.84478816E-002,1.52461289E
-003,-6.92429617E-002,9.64930356E-002,-1.85021386E-002,4.28027324E-002,-4.418307
54E-002,-2.54553296E-002,5.20384498E-002,-1.38082858E-002,-1.59469545E-002,2.100
03126E-002,-1.85695048E-002,2.29395889E-002,1.91418324E-002,4.09490801E-002,2.35
137548E-002,-3.91654707E-002,3.57466713E-002,4.80409227E-002,-1.02699148E-002,1.
45040629E-002,-4.27465364E-002,-5.09258844E-002,-7.12790638E-002,-9.19121876E-00
2,1.79740377E-002,-3.53490934E-002,-2.26370133E-002,1.6497435E-002,1.05952598E-0
01,-3.52565758E-002,-3.41151431E-002,-5.7282865E-002,-3.10265832E-002,6.97579682
E-002,-2.50360705E-002,-3.91423441E-002,1.37532474E-002,6.76135859E-003,-5.15896
529E-002,-3.55789214E-002,6.91288933E-002,-3.4734223E-002,-1.098355E-002,-2.4521
4198E-002,-3.24611887E-002,1.08489497E-002,8.21795501E-003,-2.97738295E-002,4.81
432006E-002,-5.78631125E-002,2.8562462E-002,4.02920581E-002,2.82907318E-002,-3.7
9493348E-002,8.59354064E-003,1.02058174E-002,2.48055868E-002,7.16195162E-003,-6.
24535196E-002,-3.25725861E-002,4.26037687E-033,-7.57265091E-003,-4.15650047E-002
,-4.9813509E-002,1.02479653E-002,3.28872614E-002,1.50397036E-003,-6.39198944E-00
2,-7.53579438E-002,-2.46183965E-002,-3.06450091E-002,4.16100211E-002,7.04020783E
-002,-8.15085769E-002,2.55300757E-002,1.89818796E-002,4.26408388E-002,-2.1998681
1E-002,7.13623175E-003,-3.42554823E-002,3.70062445E-003,-3.15255509E-003,1.41580
394E-002,5.00133969E-002,7.54985511E-002,6.42605647E-002,7.55612105E-002,1.52721
936E-002,1.15661152E-001,-2.45987345E-002,1.08358106E-002,5.02406769E-002,6.2881
0748E-002,-5.52952401E-002,-5.5196926E-002,-4.60026506E-003,-1.46539938E-002,6.4
0283972E-002,5.18338159E-002,2.51765456E-002,6.45218045E-002,-8.35603252E-002,3.
57579924E-002,6.28178008E-003,3.15947039E-003,2.320843E-002,4.76812162E-002,-5.4
7788292E-003,-1.06323622E-001,-1.45862857E-002,-5.92180677E-002,-1.59236323E-002
,-1.90922078E-002,4.61261906E-002,2.41158754E-002,-7.90221989E-003,1.11448206E-0
01,1.11205513E-002,-2.0573806E-002,-4.08658385E-002,5.5462148E-002,5.37177995E-0
02,4.76263314E-002,-3.29908058E-002,4.37314026E-002]
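To put a rough number on that size difference, here is a minimal Python sketch. It assumes the 384 dimensions documented for all-MiniLM-L12-v2 and 4-byte FLOAT32 elements; both are assumptions about how this model's output is stored, not values read from the database, and the sketch ignores any per-value storage overhead.

```python
import struct

# Assumptions (not read from the database): all-MiniLM-L12-v2 emits
# 384-dimensional embeddings, stored here as 4-byte FLOAT32 values.
DIMENSIONS = 384
BYTES_PER_FLOAT32 = struct.calcsize("<f")


def raw_vector_bytes(dimensions: int = DIMENSIONS) -> int:
    """Raw payload size of one dense FLOAT32 vector, ignoring header overhead."""
    return dimensions * BYTES_PER_FLOAT32


text = "Quick test"
text_bytes = len(text.encode("utf-8"))
vector_bytes = raw_vector_bytes()

print(f"text: {text_bytes} bytes, vector: {vector_bytes} bytes, "
      f"ratio: {vector_bytes / text_bytes:.1f}x")
```

Under these assumptions a ten-character string turns into roughly 1.5 KB of vector data, and that payload is the same for every row regardless of how short the text is.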

VECTOR_EMBEDDING can also be used in DML. To demonstrate this we need some test text, so I downloaded a set of movie quotes and saved them in the file movie_quotes.csv.

bash
[oracle@ora26ai models]$ cd /u01/models/
[oracle@ora26ai models]$ ls movie_quotes.csv
movie_quotes.csv

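Connect to the database and create a new table from the CSV file. The session below runs as SYS; the table and the model have to be visible to the same user, so if you loaded the model as testuser1, connect as that user instead.

sql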
[oracle@ora26ai models]$ sqlplus sys/Oraoracle@//10.10.10.39:1521/pdb1 as sysdba

SQL*Plus: Release 23.26.1.0.0 - Production on Thu Apr 9 23:21:28 2026
Version 23.26.1.0.0

Copyright (c) 1982, 2025, Oracle.  All rights reserved.


Connected to:
Oracle AI Database 26ai Enterprise Edition Release 23.26.1.0.0 - Production
Version 23.26.1.0.0


SQL> drop table if exists movie_quotes purge;

Table dropped.

SQL> 
create table movie_quotes as
select movie_quote, movie, movie_type, movie_year
from   external (
         (
           movie_quote  varchar2(400),
           movie        varchar2(200),
           movie_type   varchar2(50),
           movie_year   number(4)
         )
         type oracle_loader
         default directory model_dir
         access parameters (
           records delimited by newline
           skip 1
           badfile model_dir
           logfile model_dir:'movie_quotes_ext_tab_%a_%p.log'
           discardfile model_dir
           fields csv with embedded terminated by ',' optionally enclosed by '"'
           missing field values are null
           (
             movie_quote char(400),
             movie,
             movie_type,
             movie_year
           )
        )
        location ('movie_quotes.csv')
        reject limit unlimited
      );

SQL> desc movie_quotes
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE					    VARCHAR2(400)
 MOVIE						    VARCHAR2(200)
 MOVIE_TYPE					    VARCHAR2(50)
 MOVIE_YEAR					    NUMBER(4)

Add a new column to hold the vector for each movie quote, using the new VECTOR data type:

sql
SQL> alter table movie_quotes add (
  2    movie_quote_vector vector
  3  );

Table altered.

SQL> desc movie_quotes
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE					    VARCHAR2(400)
 MOVIE						    VARCHAR2(200)
 MOVIE_TYPE					    VARCHAR2(50)
 MOVIE_YEAR					    NUMBER(4)
 MOVIE_QUOTE_VECTOR				    VECTOR(*, *, DENSE)

Populate the new column by generating a vector from each movie quote:

sql
SQL> update movie_quotes set movie_quote_vector = vector_embedding(all_minilm_l12_v2 using movie_quote as data);

732 rows updated.

SQL> commit;

Commit complete.

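Vector search with VECTOR_DISTANCE

Searches use the VECTOR_DISTANCE function, which takes two vectors and returns the distance between them. Because the model maps text to vectors so that semantically similar text lands close together, a smaller distance between two vectors indicates more similar text.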
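What such a distance measures can be sketched in plain Python. The example assumes cosine distance, which the index definitions later in this article also name; the queries here pass no metric to VECTOR_DISTANCE, so treating cosine as the metric in effect is an assumption. The three-dimensional vectors are made up for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; 0 means the same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Toy vectors invented for this sketch; real embeddings are far longer.
query = [1.0, 0.0, 0.0]
close = [0.9, 0.1, 0.0]
far = [0.0, 1.0, 0.0]

print(cosine_distance(query, close))  # small: nearly the same direction
print(cosine_distance(query, far))    # larger: the vectors are orthogonal
```

Only the direction of the vectors matters for this metric, not their length, which is why it is a common choice for text embeddings.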

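In the following examples we create a vector from the search text and order the query output by the vector distance between the search text and each quote.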
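The "order by distance, keep the first few rows" pattern amounts to a scan over every row when no vector index is involved. A minimal Python sketch of that idea follows; the labels and two-dimensional vectors are hypothetical stand-ins for quotes and their embeddings, and cosine distance is assumed as the metric.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def top_k(query_vector, rows, k=5):
    """Return the k closest rows as (distance, label) pairs, nearest first.

    rows is an iterable of (label, vector) pairs. Every row is scored,
    which mirrors an exact search without a vector index.
    """
    scored = ((cosine_distance(query_vector, vec), label) for label, vec in rows)
    return heapq.nsmallest(k, scored)


# Hypothetical data: labels and numbers are invented for illustration.
rows = [
    ("quote A", [1.0, 0.0]),
    ("quote B", [0.7, 0.7]),
    ("quote C", [0.0, 1.0]),
    ("quote D", [-1.0, 0.0]),
]

for distance, label in top_k([1.0, 0.1], rows, k=2):
    print(f"{distance:.3f}  {label}")
```

The cost of this scan grows with the number of rows, which is the motivation for the approximate indexes covered later.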

First, search for "Films with motivational speaking in them":

sql
variable search_text varchar2(100);
exec :search_text := 'Films with motivational speaking in them';

set linesize 200
column movie format a50
column movie_quote format a100

SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
       movie,
       movie_quote
FROM   movie_quotes
order by 1
fetch approximate first 5 rows only;

  DISTANCE MOVIE                                              MOVIE_QUOTE
---------- -------------------------------------------------- --------------------------------------------------
6.786E-001 Once Upon a Time in Hollywood                      That was the best acting i've ever seen in my whole life.
6.979E-001 Dead Poets Society                                 You must strive to find your own voice because the longer you wait to begin, the less likely you are going to find it at all.
7.169E-001 The Pursuit of Happyness                           Walk that walk and go forward all the time. Don't just talk that talk, walk it and go forward. Also, the walk didn't have to be long strides; baby steps counted too. Go forward.
7.186E-001 Joker                                              My mother always tells me to smile and put on a happy face. She told me I had a purpose to bring laughter and joy to the world.
7.234E-001 Blazing Saddles                                    Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.

Next, search for "Films about war":

sql
variable search_text varchar2(100);
exec :search_text := 'Films about war';

set linesize 200
column movie format a50
column movie_quote format a100

SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
       movie,
       movie_quote
FROM   movie_quotes
order by 1
fetch approximate first 5 rows only;

  DISTANCE MOVIE                                              MOVIE_QUOTE
---------- -------------------------------------------------- --------------------------------------------------
5.682E-001 Dr. Strangelove                                    Gentlemen, you can't fight in here! This is the War Room!
6.346E-001 Blazing Saddles                                    Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you men will only be risking your lives, whilst I will be risking an almost certain Academy Award nomination for Best Supporting Actor.
6.587E-001 Fury                                               Ideals are peaceful; history is violent.
7.243E-001 The Kill Team                                      You give me your loyalty, and I'll guarantee that each and every one of you will have a chance to be a warrior, to actually be a part of history.
7.253E-001 Dr. No                                             Bond. James Bond

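Creating vector indexes

To create a vector index, the VECTOR_MEMORY_SIZE parameter must be set in the root container. How much memory is needed depends on the size and complexity of the data being indexed. Without it, index creation can fail with an error such as: ERROR at line 1: ORA-51962: The vector memory area is out of space for the current container.

sql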
SQL> show parameter VECTOR_MEMORY_SIZE;

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
vector_memory_size		     big integer 0
SQL> 
SQL> alter system set vector_memory_size = 1G scope=spfile;

System altered.

SQL> 
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area 3436422304 bytes
Fixed Size		    5014688 bytes
Variable Size		  570425344 bytes
Database Buffers	 1778384896 bytes
Redo Buffers		    8855552 bytes
Vector Memory Area	 1073741824 bytes
Database mounted.
Database opened.
SQL> show parameter VECTOR_MEMORY_SIZE;

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
vector_memory_size		     big integer 1G
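As a back-of-the-envelope check on how much of that memory this table might need, the following Python sketch multiplies out the raw vector size. The 732 rows come from the UPDATE output earlier; the 384 dimensions, FLOAT32 storage, and the 1.3 overhead factor are assumptions used for a rule-of-thumb estimate, not figures reported by the database.

```python
import math


def estimate_index_bytes(rows, dimensions, bytes_per_dimension=4, overhead_factor=1.3):
    """Rule-of-thumb estimate: raw vector bytes times an overhead factor.

    Returns (raw_bytes, estimated_bytes).
    """
    raw_bytes = rows * dimensions * bytes_per_dimension
    return raw_bytes, math.ceil(raw_bytes * overhead_factor)


# 732 rows is taken from the UPDATE output; the remaining inputs are
# assumptions made for this estimate.
raw, estimated = estimate_index_bytes(rows=732, dimensions=384)

print(f"raw vectors:   {raw / 1024**2:.2f} MiB")
print(f"with overhead: {estimated / 1024**2:.2f} MiB")
```

For a table this small the estimate is a tiny fraction of 1G, so the setting above leaves plenty of headroom; the same arithmetic becomes important once row counts reach the millions.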

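With enough memory allocated, we can create a vector index. There are two types of vector index:

  • Neighbor partition vector index (IVF): a disk-based index that clusters vectors into partitions and searches only the nearest ones; suited to datasets too large to hold in vector memory
  • In-memory neighbor graph vector index (HNSW): a graph index held in the vector memory area; typically the faster approximate search when the index fits in memory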
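The partitioning idea behind the first index type can be sketched in a few lines of Python. This is a toy illustration of the general technique, not Oracle's implementation: the data is invented, the centroids are fixed by hand where a real index would learn them, and Euclidean distance is used for simplicity instead of cosine.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to vector."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def build_partitions(vectors, centroids):
    """Group each vector under its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for vec in vectors:
        partitions[nearest_centroid(vec, centroids)].append(vec)
    return partitions


def search(query, centroids, partitions, k=1, probes=1):
    """Scan only the `probes` partitions whose centroids are nearest to query."""
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_distance(query, centroids[i]))
    candidates = [vec for i in order[:probes] for vec in partitions[i]]
    return sorted(candidates, key=lambda v: squared_distance(query, v))[:k]


# Invented 2-D data with two obvious clusters and hand-picked centroids.
vectors = [[0.1, 0.2], [0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
centroids = [[0.1, 0.1], [5.0, 5.0]]
partitions = build_partitions(vectors, centroids)

print(search([4.9, 5.2], centroids, partitions, k=2))
```

With probes=1 only half of the vectors are compared against the query. That saving is the point of the technique, and it is also why results are approximate: a true neighbor sitting in an unprobed partition is missed.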

Both kinds of index are created as follows:

sql
-- Neighbor partition vector index
SQL> drop index if exists movie_quotes_vector_idx;

Index dropped.

SQL> create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization neighbor partitions
distance cosine
with target accuracy 95; 

Index created.

-- In-memory neighbor graph vector index
SQL> drop index if exists movie_quotes_vector_idx;
SQL> create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization inmemory neighbor graph
distance cosine
with target accuracy 95;
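The "with target accuracy 95" clause can be read as asking approximate searches to return, on average, about 95% of the rows an exact search would have returned. That reading is an interpretation of the clause rather than something the transcript demonstrates. The measure itself, often called recall, is easy to sketch in Python; the row identifiers below are invented.

```python
def recall_at_k(exact_ids, approximate_ids):
    """Fraction of the exact top-k that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approximate_ids)) / len(exact)


# Invented row identifiers: the exact top 5 versus what an index might return.
exact_top5 = [101, 205, 317, 42, 588]
approx_top5 = [101, 205, 317, 42, 900]

print(f"{recall_at_k(exact_top5, approx_top5):.0%}")
```

Comparing an indexed query against the same query run as an exact search is a simple way to check whether the accuracy you asked for is what you are getting on your own data.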

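After working through this walkthrough you should have a feel for some of the ways to use vector search in Oracle 26ai. From model loading and vector generation through to similarity search, Oracle exposes the whole flow through SQL, which puts these AI capabilities within easy reach. Along the way we also saw that vectors are large, which has to be considered for both storage and memory, especially when creating vector indexes. The feature also depends on having a model suited to the task at hand: without a good model, the results will not be good.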