向vllm部署的qwen3服务发送请求时禁用thinking模式curl http://localhost:8000/v1/chat/completions -H “Content-Type: application/json” -d ‘{ “model”: “Qwen/Qwen3-8B”, “messages”: [ {“role”: “user”, “content”: “Give me a short introduction to large language models.”} ], “temperature”: 0.7, “top_p”: 0.8, “to