
## Introduction

We introduce Intern-S1, our most advanced open-source multimodal reasoning model to date. Intern-S1 combines strong general-task capabilities with state-of-the-art performance on scientific tasks, rivaling leading closed-source commercial models.

Built on a 235B-parameter Mixture-of-Experts (MoE) language model and a 6B-parameter vision encoder, Intern-S1 was further pretrained on 5 trillion tokens of multimodal data, including more than 2.5 trillion tokens from scientific domains. As a result, the model retains strong general capabilities while excelling at specialized scientific tasks such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making it a capable assistant for real-world scientific research.
## Features

- Strong performance across language and vision reasoning benchmarks, especially scientific tasks
- Continuously pretrained on a massive 5T-token dataset, with more than 50% specialized scientific data, embedding deep domain expertise
- A dynamic tokenizer natively parses specialized data formats such as molecular formulas, protein sequences, and seismic signals (see the tokenization sketch below)
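As a rough illustration of the last point, the snippet below tokenizes a SMILES string and a protein fragment and prints the resulting pieces. This is a minimal sketch, not official usage: the exact token boundaries depend on the released tokenizer.

```python
from transformers import AutoTokenizer

# Hypothetical illustration: inspect how the tokenizer segments
# domain-specific sequences (actual token boundaries may differ).
tokenizer = AutoTokenizer.from_pretrained("internlm/Intern-S1", trust_remote_code=True)

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"            # aspirin, as a SMILES string
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"   # a short protein fragment

for name, seq in [("SMILES", smiles), ("protein", protein)]:
    ids = tokenizer(seq)["input_ids"]
    print(name, len(ids), tokenizer.convert_ids_to_tokens(ids)[:10])
```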
## Performance

We evaluated Intern-S1 on a broad range of general and scientific benchmarks. The table below compares its performance with current mainstream vision-language models and large language models.
| Benchmarks | Intern-S1 | InternVL3-78B | Qwen2.5-VL-72B | DS-R1-0528 | Qwen3-235B-A22B | Kimi-K2-Instruct | Gemini-2.5 Pro | o3 | Grok-4 |
|----------------|---------|---------------|----------------|------------|-----------------|------------------|----------------|------|--------|
| MMLU-Pro | 83.5 ✅ | 73.0 | 72.1 | 83.4 | 82.2 | 82.7 | 86.0 | 85.0 | 85.9 |
| MMMU | 77.7 ✅ | 72.2 | 70.2 | - | - | - | 81.9 | 80.8 | 77.9 |
| GPQA | 77.3 | 49.9 | 49.0 | 80.6 | 71.1 | 77.8 | 83.8 | 83.3 | 87.5 |
| MMStar | 74.9 ✅ | 72.5 | 70.8 | - | - | - | 79.3 | 75.1 | 69.6 |
| MathVista | 81.5 👑 | 79.0 | 74.8 | - | - | - | 80.3 | 77.5 | 72.5 |
| AIME2025 | 86.0 | 10.7 | 10.9 | 87.5 | 81.5 | 51.4 | 83.0 | 88.9 | 91.7 |
| MathVision | 62.5 ✅ | 43.1 | 38.1 | - | - | - | 73.0 | 67.7 | 67.3 |
| IFEval | 86.7 | 75.6 | 83.9 | 79.7 | 85.0 | 90.2 | 91.5 | 92.2 | 92.8 |
| SFE | 44.3 👑 | 36.2 | 30.5 | - | - | - | 43.0 | 37.7 | 31.2 |
| Physics | 44.0 ✅ | 23.1 | 15.7 | - | - | - | 40.0 | 47.9 | 42.8 |
| SmolInstruct | 51.0 👑 | 19.4 | 21.0 | 30.7 | 28.7 | 48.1 | 40.4 | 43.9 | 47.3 |
| ChemBench | 83.4 👑 | 61.3 | 61.6 | 75.6 | 75.8 | 75.3 | 82.8 | 81.6 | 83.3 |
| MatBench | 75.0 👑 | 49.3 | 51.5 | 57.7 | 52.1 | 61.7 | 61.7 | 61.6 | 67.9 |
| MicroVQA | 63.9 👑 | 59.1 | 53.0 | - | - | - | 63.1 | 58.3 | 59.5 |
| ProteinLMBench | 63.1 | 61.6 | 61.0 | 61.4 | 59.8 | 66.7 | 62.9 | 67.7 | 66.2 |
| MSEarthMCQ | 65.7 👑 | 57.2 | 37.6 | - | - | - | 59.9 | 61.0 | 58.0 |
| XLRS-Bench | 55.0 👑 | 49.3 | 50.9 | - | - | - | 45.2 | 43.6 | 45.4 |
Note: ✅ marks the best performance among open-source models, and 👑 marks the best performance among all models.

We use OpenCompass and VLMEvalkit to evaluate all models.
## Quick Start

### Sampling Parameters

We recommend the following hyperparameters to ensure better results:

```python
top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.7
```
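These map directly onto the transformers generation API. A minimal sketch, assuming `model` and `inputs` are prepared as in the examples below:

```python
generate_ids = model.generate(
    **inputs,
    do_sample=True,     # enable sampling so the parameters below take effect
    top_p=1.0,
    top_k=50,
    min_p=0.0,
    temperature=0.7,
    max_new_tokens=32768,
)
```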
### Transformers

The following demo code shows how to generate from text and multimodal inputs.

Please use `transformers>=4.53.0` to ensure the model works correctly.

#### Text input
```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt").to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
# Decode only the newly generated tokens, skipping the prompt.
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)
```
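To watch tokens as they are produced instead of waiting for the full completion, you can attach transformers' `TextStreamer` to the same call. A minimal sketch, reusing `model`, `processor`, and `inputs` from above:

```python
from transformers import TextStreamer

# Print decoded text to stdout as tokens arrive; skip_prompt avoids
# echoing the input prompt.
streamer = TextStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**inputs, max_new_tokens=32768, streamer=streamer)
```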
#### Image input

```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt").to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
# Decode only the newly generated tokens, skipping the prompt.
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)
```
#### Video input

Please make sure the decord video decoding library is installed via `pip install decord`.

```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
# Decode only the newly generated tokens, skipping the prompt.
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)
```
## Serving

You can use one of the following LLM inference frameworks to create an OpenAI-compatible server:

#### lmdeploy (>=0.9.2)

```bash
lmdeploy serve api_server internlm/Intern-S1 --reasoning-parser intern-s1 --tool-call-parser intern-s1 --tp 8
```

#### vllm

Coming soon.

#### sglang

Support for Intern-S1 is still in progress; please refer to this PR.

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python3 -m sglang.launch_server \
    --model-path internlm/Intern-S1 \
    --trust-remote-code \
    --mem-fraction-static 0.85 \
    --tp 8 \
    --enable-multimodal \
    --grammar-backend none
```

#### ollama (local deployment)

```bash
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch the model
ollama pull internlm/interns1
# run the model
ollama run internlm/interns1
# then use an OpenAI client to call http://localhost:11434/v1
```
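All of these servers expose the OpenAI chat-completions protocol, so a standard client works against any of them. A minimal sketch against the ollama endpoint above (adjust `base_url` and the model id for lmdeploy or sglang):

```python
from openai import OpenAI

# Point the client at the local OpenAI-compatible endpoint.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:11434/v1")

response = client.chat.completions.create(
    model="internlm/interns1",
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```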
## Advanced Usage

### Tool Calling

Tool calling, now supported by many large language models (LLMs), is a powerful capability that lets a model extend its functionality by interacting with external tools and APIs. It enables the model to perform tasks such as fetching real-time information, running code, or calling functions in other applications.

A key advantage for developers is that a growing number of open-source LLMs are designed to be compatible with the OpenAI API. This means you can use the familiar syntax and structure of the OpenAI library to implement tool calling with these open-source models. The code in this tutorial is therefore portable: it works not only with OpenAI models but with any model that follows the same interface standard.

To illustrate how this works, let's walk through a practical code example (based on an lmdeploy API server) that uses tool calling to get the latest weather forecast.
```python
from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date


tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location']
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': "The location to get the temperature for, in the format 'City, State, Country'."
                },
                'date': {
                    'type': 'string',
                    'description': "The date to get the temperature for, in the format 'Year-Month-Day'."
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': "The unit to return the temperature in. Defaults to 'celsius'."
                }
            },
            'required': ['location', 'date']
        }
    }
}]

messages = [
    {'role': 'user', 'content': "Today is 2024-11-14, What's the temperature in San Francisco now? How about tomorrow?"}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

# First round: the model decides which tools to call.
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message)
messages.append(response.choices[0].message)

# Execute each requested tool and append the results to the conversation.
for tool_call in response.choices[0].message.tool_calls:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id
    })

# Second round: the model composes the final answer from the tool results.
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message.content)
```
### Switching Between Thinking and Non-Thinking Modes

Intern-S1 enables thinking mode by default to enhance the model's reasoning ability and generate higher-quality responses. To disable it, set `enable_thinking=False` in `tokenizer.apply_chat_template`:
```python
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # disable thinking mode
)
```
When serving Intern-S1 with LMDeploy, you can dynamically control thinking mode by adjusting the `enable_thinking` parameter in your requests.
```python
from openai import OpenAI
import json

messages = [
    {
        'role': 'user',
        'content': 'who are you'
    }, {
        'role': 'assistant',
        'content': 'I am an AI'
    }, {
        'role': 'user',
        'content': 'AGI is?'
    }]
openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.7,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    }
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
```
For vllm and sglang users, configure it as follows:

```python
extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}
```
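Embedded in a full request, this looks like the sketch below, assuming an OpenAI-compatible vllm or sglang server and the `client`, `model_name`, and `messages` set up as in the previous example:

```python
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.7,
    top_p=0.8,
    max_tokens=2048,
    # vllm and sglang forward chat-template arguments via chat_template_kwargs.
    extra_body={
        "chat_template_kwargs": {"enable_thinking": False}
    }
)
print(response.choices[0].message.content)
```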