A Look at Spring AI's MilvusVectorStore

This article takes a look at Spring AI's MilvusVectorStore.

Example

pom.xml

		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-vector-store-milvus</artifactId>
		</dependency>

Configuration

spring:
  ai:
    vectorstore:
      milvus:
        initialize-schema: true
        databaseName: "default"
        collectionName: "test_collection1"
        embeddingDimension: 1024
        indexType: IVF_FLAT
        metricType: COSINE
        client:
          host: "localhost"
          port: 19530

Code

    @Test
    public void testAddAndSearch() {
        List<Document> documents = List.of(
                new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
                new Document("The World is Big and Salvation Lurks Around the Corner"),
                new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

        // Add the documents to Milvus Vector Store
        vectorStore.add(documents);

        // Retrieve documents similar to a query
        List<Document> results = this.vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
        log.info("results:{}", JSON.toJSONString(results));
    }

The output is as follows:

results:[{"contentFormatter":{"excludedEmbedMetadataKeys":[],"excludedInferenceMetadataKeys":[],"metadataSeparator":"\n","metadataTemplate":"{key}: {value}","textTemplate":"{metadata_string}\n\n{content}"},"formattedContent":"distance: 0.43509113788604736\nmeta1: meta1\n\nSpring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!","id":"d1c92394-77c8-4c67-9817-0980ad31479d","metadata":{"distance":0.43509113788604736,"meta1":"meta1"},"score":0.5649088621139526,"text":"Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.5709311962127686\n\nThe World is Big and Salvation Lurks Around the Corner","id":"65d7ddb3-a735-4dad-9da0-cbba5665b149","metadata":{"distance":0.5709311962127686},"score":0.42906883358955383,"text":"The World is Big and Salvation Lurks Around the Corner"},{"contentFormatter":{"$ref":"$[0].contentFormatter"},"formattedContent":"distance: 0.5936022996902466\nmeta2: meta2\n\nYou walk forward facing the past and you turn back toward the future.","id":"26050d78-3396-4b61-97ea-111249f6d037","metadata":{"distance":0.5936022996902466,"meta2":"meta2"},"score":0.40639767050743103,"text":"You walk forward facing the past and you turn back toward the future."}]
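
In the output above, every hit carries both a score and a distance entry in its metadata, and the two add up to 1 (for example 0.5649 + 0.4351). With metricType set to COSINE, this is consistent with the score being a cosine similarity and the distance being its complement. The following is a minimal, dependency-free sketch of that relationship. The class name and the sample vectors are illustrative assumptions, not Spring AI code.

```java
// Illustrative only: relates a cosine similarity to a "distance" of 1 - similarity.
class CosineScoreDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|)
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Distance as the complement of the similarity, so that score + distance = 1
    static double distance(double similarity) {
        return 1.0 - similarity;
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0}; // hypothetical query embedding
        double[] doc = {1.0, 1.0};   // hypothetical document embedding
        double score = cosineSimilarity(query, doc);
        System.out.println("score=" + score + " distance=" + distance(score));
    }
}
```

A higher score therefore means a smaller distance, which matches the ordering of the three results above.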

Source Code

MilvusVectorStoreAutoConfiguration

org/springframework/ai/vectorstore/milvus/autoconfigure/MilvusVectorStoreAutoConfiguration.java

@AutoConfiguration
@ConditionalOnClass({ MilvusVectorStore.class, EmbeddingModel.class })
@EnableConfigurationProperties({ MilvusServiceClientProperties.class, MilvusVectorStoreProperties.class })
@ConditionalOnProperty(name = SpringAIVectorStoreTypes.TYPE, havingValue = SpringAIVectorStoreTypes.MILVUS,
		matchIfMissing = true)
public class MilvusVectorStoreAutoConfiguration {

	@Bean
	@ConditionalOnMissingBean(MilvusServiceClientConnectionDetails.class)
	PropertiesMilvusServiceClientConnectionDetails milvusServiceClientConnectionDetails(
			MilvusServiceClientProperties properties) {
		return new PropertiesMilvusServiceClientConnectionDetails(properties);
	}

	@Bean
	@ConditionalOnMissingBean(BatchingStrategy.class)
	BatchingStrategy milvusBatchingStrategy() {
		return new TokenCountBatchingStrategy();
	}

	@Bean
	@ConditionalOnMissingBean
	public MilvusVectorStore vectorStore(MilvusServiceClient milvusClient, EmbeddingModel embeddingModel,
			MilvusVectorStoreProperties properties, BatchingStrategy batchingStrategy,
			ObjectProvider<ObservationRegistry> observationRegistry,
			ObjectProvider<VectorStoreObservationConvention> customObservationConvention) {

		return MilvusVectorStore.builder(milvusClient, embeddingModel)
			.initializeSchema(properties.isInitializeSchema())
			.databaseName(properties.getDatabaseName())
			.collectionName(properties.getCollectionName())
			.embeddingDimension(properties.getEmbeddingDimension())
			.indexType(IndexType.valueOf(properties.getIndexType().name()))
			.metricType(MetricType.valueOf(properties.getMetricType().name()))
			.indexParameters(properties.getIndexParameters())
			.iDFieldName(properties.getIdFieldName())
			.autoId(properties.isAutoId())
			.contentFieldName(properties.getContentFieldName())
			.metadataFieldName(properties.getMetadataFieldName())
			.embeddingFieldName(properties.getEmbeddingFieldName())
			.batchingStrategy(batchingStrategy)
			.observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
			.customObservationConvention(customObservationConvention.getIfAvailable(() -> null))
			.build();
	}

	@Bean
	@ConditionalOnMissingBean
	public MilvusServiceClient milvusClient(MilvusVectorStoreProperties serverProperties,
			MilvusServiceClientProperties clientProperties, MilvusServiceClientConnectionDetails connectionDetails) {

		var builder = ConnectParam.newBuilder()
			.withHost(connectionDetails.getHost())
			.withPort(connectionDetails.getPort())
			.withDatabaseName(serverProperties.getDatabaseName())
			.withConnectTimeout(clientProperties.getConnectTimeoutMs(), TimeUnit.MILLISECONDS)
			.withKeepAliveTime(clientProperties.getKeepAliveTimeMs(), TimeUnit.MILLISECONDS)
			.withKeepAliveTimeout(clientProperties.getKeepAliveTimeoutMs(), TimeUnit.MILLISECONDS)
			.withRpcDeadline(clientProperties.getRpcDeadlineMs(), TimeUnit.MILLISECONDS)
			.withSecure(clientProperties.isSecure())
			.withIdleTimeout(clientProperties.getIdleTimeoutMs(), TimeUnit.MILLISECONDS)
			.withAuthorization(clientProperties.getUsername(), clientProperties.getPassword());

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getUri())) {
			builder.withUri(clientProperties.getUri());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getToken())) {
			builder.withToken(clientProperties.getToken());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getClientKeyPath())) {
			builder.withClientKeyPath(clientProperties.getClientKeyPath());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getClientPemPath())) {
			builder.withClientPemPath(clientProperties.getClientPemPath());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getCaPemPath())) {
			builder.withCaPemPath(clientProperties.getCaPemPath());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getServerPemPath())) {
			builder.withServerPemPath(clientProperties.getServerPemPath());
		}

		if (clientProperties.isSecure() && StringUtils.hasText(clientProperties.getServerName())) {
			builder.withServerName(clientProperties.getServerName());
		}

		return new MilvusServiceClient(builder.build());
	}

	static class PropertiesMilvusServiceClientConnectionDetails implements MilvusServiceClientConnectionDetails {

		private final MilvusServiceClientProperties properties;

		PropertiesMilvusServiceClientConnectionDetails(MilvusServiceClientProperties properties) {
			this.properties = properties;
		}

		@Override
		public String getHost() {
			return this.properties.getHost();
		}

		@Override
		public int getPort() {
			return this.properties.getPort();
		}

	}

}

MilvusVectorStoreAutoConfiguration is enabled when spring.ai.vectorstore.type is set to milvus, or when that property is absent (matchIfMissing=true). It creates a PropertiesMilvusServiceClientConnectionDetails from MilvusServiceClientProperties, then a TokenCountBatchingStrategy and a MilvusServiceClient, and finally builds the MilvusVectorStore from MilvusVectorStoreProperties.
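
The effect of havingValue combined with matchIfMissing = true can be modeled in a few lines of plain Java. This is a simplified sketch of the matching rule, not Spring Boot's implementation. The class and method names are hypothetical, the case-insensitive comparison is an assumption about Spring Boot's behavior, and "pgvector" is used only as an example of another store type.

```java
// Simplified model of @ConditionalOnProperty(havingValue = "milvus", matchIfMissing = true).
class VectorStoreTypeCondition {

    static final String REQUIRED = "milvus";

    // propertyValue is the value of spring.ai.vectorstore.type, or null when it is not set
    static boolean matches(String propertyValue) {
        if (propertyValue == null) {
            return true; // matchIfMissing = true: an absent property still matches
        }
        // Assumption: the comparison ignores case
        return REQUIRED.equalsIgnoreCase(propertyValue);
    }

    public static void main(String[] args) {
        System.out.println("absent   -> " + matches(null));
        System.out.println("milvus   -> " + matches("milvus"));
        System.out.println("pgvector -> " + matches("pgvector"));
    }
}
```

In practice this means the Milvus auto-configuration is active by default once the starter is on the classpath, and setting the type property to a different store turns it off.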

MilvusServiceClientProperties

org/springframework/ai/vectorstore/milvus/autoconfigure/MilvusServiceClientProperties.java

@ConfigurationProperties(MilvusServiceClientProperties.CONFIG_PREFIX)
public class MilvusServiceClientProperties {

	public static final String CONFIG_PREFIX = "spring.ai.vectorstore.milvus.client";

	/**
	 * Secure the authorization for this connection, set to True to enable TLS.
	 */
	protected boolean secure = false;

	/**
	 * Milvus host name/address.
	 */
	private String host = "localhost";

	/**
	 * Milvus the connection port. Value must be greater than zero and less than 65536.
	 */
	private int port = 19530;

	/**
	 * The uri of Milvus instance
	 */
	private String uri;

	/**
	 * Token serving as the key for identification and authentication purposes.
	 */
	private String token;

	/**
	 * Connection timeout value of client channel. The timeout value must be greater than
	 * zero.
	 */
	private long connectTimeoutMs = 10000;

	/**
	 * Keep-alive time value of client channel. The keep-alive value must be greater than
	 * zero.
	 */
	private long keepAliveTimeMs = 55000;

	/**
	 * Enables the keep-alive function for client channel.
	 */
	// private boolean keepAliveWithoutCalls = false;

	/**
	 * The keep-alive timeout value of client channel. The timeout value must be greater
	 * than zero.
	 */
	private long keepAliveTimeoutMs = 20000;

	/**
	 * Deadline for how long you are willing to wait for a reply from the server. With a
	 * deadline setting, the client will wait when encounter fast RPC fail caused by
	 * network fluctuations. The deadline value must be larger than or equal to zero.
	 * Default value is 0, deadline is disabled.
	 */
	private long rpcDeadlineMs = 0; // Disabling deadline

	/**
	 * The client.key path for tls two-way authentication, only takes effect when "secure"
	 * is True.
	 */
	private String clientKeyPath;

	/**
	 * The client.pem path for tls two-way authentication, only takes effect when "secure"
	 * is True.
	 */
	private String clientPemPath;

	/**
	 * The ca.pem path for tls two-way authentication, only takes effect when "secure" is
	 * True.
	 */
	private String caPemPath;

	/**
	 * server.pem path for tls one-way authentication, only takes effect when "secure" is
	 * True.
	 */
	private String serverPemPath;

	/**
	 * Sets the target name override for SSL host name checking, only takes effect when
	 * "secure" is True. Note: this value is passed to grpc.ssl_target_name_override
	 */
	private String serverName;

	/**
	 * Idle timeout value of client channel. The timeout value must be larger than zero.
	 */
	private long idleTimeoutMs = TimeUnit.MILLISECONDS.convert(24, TimeUnit.HOURS);

	/**
	 * The username and password for this connection.
	 */
	private String username = "root";

	/**
	 * The password for this connection.
	 */
	private String password = "milvus";

	//......
}	

MilvusServiceClientProperties exposes the configuration under spring.ai.vectorstore.milvus.client, including host, port, connectTimeoutMs, username, and password.
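
All timeout defaults in this class are expressed in milliseconds, and idleTimeoutMs is derived from 24 hours via TimeUnit. The sketch below simply copies the default values from the listing above into constants and converts them to seconds to make the magnitudes easier to read. The class name is hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Default timeout values copied from MilvusServiceClientProperties, for illustration only.
class MilvusClientDefaults {

    static final long CONNECT_TIMEOUT_MS = 10000;
    static final long KEEP_ALIVE_TIME_MS = 55000;
    static final long KEEP_ALIVE_TIMEOUT_MS = 20000;
    static final long RPC_DEADLINE_MS = 0; // 0 disables the deadline
    static final long IDLE_TIMEOUT_MS = TimeUnit.MILLISECONDS.convert(24, TimeUnit.HOURS);

    // Converts a millisecond value to whole seconds
    static long toSeconds(long millis) {
        return TimeUnit.MILLISECONDS.toSeconds(millis);
    }

    public static void main(String[] args) {
        System.out.println("connect timeout (s):    " + toSeconds(CONNECT_TIMEOUT_MS));
        System.out.println("keep-alive time (s):    " + toSeconds(KEEP_ALIVE_TIME_MS));
        System.out.println("keep-alive timeout (s): " + toSeconds(KEEP_ALIVE_TIMEOUT_MS));
        System.out.println("idle timeout (ms):      " + IDLE_TIMEOUT_MS);
    }
}
```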

PropertiesMilvusServiceClientConnectionDetails

org/springframework/ai/vectorstore/milvus/autoconfigure/MilvusVectorStoreAutoConfiguration.java

	static class PropertiesMilvusServiceClientConnectionDetails implements MilvusServiceClientConnectionDetails {

		private final MilvusServiceClientProperties properties;

		PropertiesMilvusServiceClientConnectionDetails(MilvusServiceClientProperties properties) {
			this.properties = properties;
		}

		@Override
		public String getHost() {
			return this.properties.getHost();
		}

		@Override
		public int getPort() {
			return this.properties.getPort();
		}

	}

PropertiesMilvusServiceClientConnectionDetails implements the MilvusServiceClientConnectionDetails interface and adapts its getHost and getPort methods to the values in MilvusServiceClientProperties.
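
Because the properties-backed bean is registered with @ConditionalOnMissingBean(MilvusServiceClientConnectionDetails.class), a connection-details bean supplied elsewhere takes precedence over the configured host and port. The sketch below models that fallback in plain Java. HostPortDetails is a stand-in interface declared here only so the example compiles on its own, and all class names and the custom host and port are hypothetical.

```java
class ConnectionDetailsDemo {

    // Models @ConditionalOnMissingBean: a user-supplied instance wins,
    // otherwise the properties-backed default is used.
    static HostPortDetails resolve(HostPortDetails custom, HostPortDetails fromProperties) {
        return (custom != null) ? custom : fromProperties;
    }

    public static void main(String[] args) {
        HostPortDetails fromProperties = new SimpleDetails("localhost", 19530);
        HostPortDetails custom = new SimpleDetails("milvus.internal", 19531); // hypothetical values

        HostPortDetails a = resolve(null, fromProperties);
        HostPortDetails b = resolve(custom, fromProperties);
        System.out.println(a.getHost() + ":" + a.getPort());
        System.out.println(b.getHost() + ":" + b.getPort());
    }
}

// Stand-in for MilvusServiceClientConnectionDetails (same two accessors).
interface HostPortDetails {
    String getHost();
    int getPort();
}

// Simple immutable implementation used for both sources above.
class SimpleDetails implements HostPortDetails {

    private final String host;
    private final int port;

    SimpleDetails(String host, int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    public String getHost() {
        return this.host;
    }

    @Override
    public int getPort() {
        return this.port;
    }
}
```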

MilvusVectorStoreProperties

org/springframework/ai/vectorstore/milvus/autoconfigure/MilvusVectorStoreProperties.java

@ConfigurationProperties(MilvusVectorStoreProperties.CONFIG_PREFIX)
public class MilvusVectorStoreProperties extends CommonVectorStoreProperties {

	public static final String CONFIG_PREFIX = "spring.ai.vectorstore.milvus";

	/**
	 * The name of the Milvus database to connect to.
	 */
	private String databaseName = MilvusVectorStore.DEFAULT_DATABASE_NAME;

	/**
	 * Milvus collection name to store the vectors.
	 */
	private String collectionName = MilvusVectorStore.DEFAULT_COLLECTION_NAME;

	/**
	 * The dimension of the vectors to be stored in the Milvus collection.
	 */
	private int embeddingDimension = MilvusVectorStore.OPENAI_EMBEDDING_DIMENSION_SIZE;

	/**
	 * The type of the index to be created for the Milvus collection.
	 */
	private MilvusIndexType indexType = MilvusIndexType.IVF_FLAT;

	/**
	 * The metric type to be used for the Milvus collection.
	 */
	private MilvusMetricType metricType = MilvusMetricType.COSINE;

	/**
	 * The index parameters to be used for the Milvus collection.
	 */
	private String indexParameters = "{\"nlist\":1024}";

	/**
	 * The ID field name for the collection.
	 */
	private String idFieldName = MilvusVectorStore.DOC_ID_FIELD_NAME;

	/**
	 * Boolean flag to indicate if the auto-id is used.
	 */
	private boolean isAutoId = false;

	/**
	 * The content field name for the collection.
	 */
	private String contentFieldName = MilvusVectorStore.CONTENT_FIELD_NAME;

	/**
	 * The metadata field name for the collection.
	 */
	private String metadataFieldName = MilvusVectorStore.METADATA_FIELD_NAME;

	/**
	 * The embedding field name for the collection.
	 */
	private String embeddingFieldName = MilvusVectorStore.EMBEDDING_FIELD_NAME;

	//......

	public enum MilvusMetricType {

		/**
		 * Invalid metric type
		 */
		INVALID,
		/**
		 * Euclidean distance
		 */
		L2,
		/**
		 * Inner product
		 */
		IP,
		/**
		 * Cosine distance
		 */
		COSINE,
		/**
		 * Hamming distance
		 */
		HAMMING,
		/**
		 * Jaccard distance
		 */
		JACCARD

	}

	public enum MilvusIndexType {

		INVALID, FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, DISKANN, AUTOINDEX, SCANN, GPU_IVF_FLAT, GPU_IVF_PQ, BIN_FLAT,
		BIN_IVF_FLAT, TRIE, STL_SORT

	}

}	

MilvusVectorStoreProperties exposes the configuration under spring.ai.vectorstore.milvus, mainly databaseName, collectionName, embeddingDimension (default 1536), indexType (default IVF_FLAT), and metricType (default COSINE).
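
MilvusIndexType and MilvusMetricType are configuration-side enums. The auto-configuration shown earlier converts them to the SDK's IndexType and MetricType by constant name, using valueOf(properties.getIndexType().name()). The following sketch shows that name-based bridging with two small stand-in enums. Both enums and the class name are hypothetical subsets defined only for this example.

```java
class EnumBridgeDemo {

    // Stand-in for the configuration-side enum (a subset of MilvusIndexType)
    enum ConfigIndexType { FLAT, IVF_FLAT, HNSW }

    // Stand-in for the SDK-side enum with the same constant names
    enum SdkIndexType { FLAT, IVF_FLAT, HNSW }

    // Bridges by name; valueOf throws IllegalArgumentException
    // if the target enum has no constant with that name.
    static SdkIndexType toSdk(ConfigIndexType type) {
        return SdkIndexType.valueOf(type.name());
    }

    public static void main(String[] args) {
        for (ConfigIndexType type : ConfigIndexType.values()) {
            System.out.println(type + " -> " + toSdk(type));
        }
    }
}
```

The design keeps the properties class independent of the SDK enum while relying on the two enums sharing constant names.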

CommonVectorStoreProperties

org/springframework/ai/vectorstore/properties/CommonVectorStoreProperties.java

public class CommonVectorStoreProperties {

	/**
	 * Vector stores do not initialize schema by default on application startup. The
	 * applications explicitly need to opt-in for initializing the schema on startup. The
	 * recommended way to initialize the schema on startup is to set the initialize-schema
	 * property on the vector store. See {@link #setInitializeSchema(boolean)}.
	 */
	private boolean initializeSchema = false;

	public boolean isInitializeSchema() {
		return this.initializeSchema;
	}

	public void setInitializeSchema(boolean initializeSchema) {
		this.initializeSchema = initializeSchema;
	}

}

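CommonVectorStoreProperties defines the initializeSchema property, which controls whether the schema is initialized at application startup. It defaults to false, so applications must opt in explicitly.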

Summary

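Spring AI provides spring-ai-starter-vector-store-milvus to auto-configure MilvusVectorStore. Note that embeddingDimension defaults to 1536. If you see io.milvus.exception.ParamException: Incorrect dimension for field 'embedding': the no.0 vector's dimension: 1024 is not equal to field's dimension: 1536, the collection was created with a dimension that does not match the embedding model's output. In that case, rebuild the schema with embeddingDimension set to 1024.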
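
The mismatch behind that exception is a plain length comparison between the vector produced by the embedding model and the dimension declared on the collection field. The sketch below shows a hypothetical pre-flight check that would surface the problem before anything is sent to Milvus. The class, the method, and the message wording are assumptions for illustration and are not part of Spring AI or the Milvus SDK.

```java
class DimensionCheck {

    // Throws if the embedding length differs from the dimension configured for the collection
    static void requireDimension(float[] embedding, int configuredDimension) {
        if (embedding.length != configuredDimension) {
            throw new IllegalArgumentException("Embedding dimension " + embedding.length
                    + " does not match configured embeddingDimension " + configuredDimension);
        }
    }

    public static void main(String[] args) {
        float[] embedding = new float[1024]; // e.g. a model that outputs 1024-dimensional vectors

        requireDimension(embedding, 1024);   // passes: dimensions agree
        try {
            requireDimension(embedding, 1536); // fails: collection still uses the 1536 default
        }
        catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```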
