
Integrating SSE and WebSocket in Solon to access a locally deployed Ollama large language model
1. Introduction
Solon (OpenSolon) | Official site (a Spring alternative)
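# What is Ollama?
Ollama official site: https://github.com/ollama/ollama
Ollama is an open-source tool platform for artificial intelligence (AI) and machine learning (ML), designed specifically to simplify deploying and using large language models (LLMs).
# Qwen2.5
The open-source flagship model Qwen2.5-72B reportedly outperforms Llama 3.1-405B, putting it back at the top of the open-source model rankings. The Tongyi flagship model Qwen-Max has been upgraded across the board, with performance reported to approach GPT-4o. Qwen2.5 is open-sourced in seven sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B), each reported to achieve the best results in its parameter class.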
2. Integration
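Solon WebSocket integration: Solon WebSocket development (v2)
Solon SSE integration: solon-web-sse
Installing Ollama: look up the installation steps yourself, starting from https://ollama.com/
Downloading Ollama models: https://ollama.com/library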
3. Examples
OllamaApi wraps the local Ollama API address and the related request parameters
public interface OllamaApi {
    // Address of the local Ollama API
    String url = "http://127.0.0.1:11434/api/generate";
    // stream = false: return the whole reply in one response
    boolean StreamFalse = false;
    // stream = true: return the reply as a stream
    boolean StreamTrue = true;
}
ModelEnum lists the models downloaded through Ollama
public enum ModelEnum {
    QWEN2_5_7B("qwen2.5:latest"),
    QWEN2_5_14B("qwen2.5:14b"),
    LLAMA3_2_VISION("llama3.2-vision:latest");

    private String model;

    ModelEnum(String model) {
        this.model = model;
    }

    public String model() {
        return model;
    }
}
OllamaResponse is the response class
@Data(staticConstructor = "of")
@Accessors(chain = true)
public class OllamaResponse {
    private OllamaResponseEnum type;
    private String content;
}
OllamaResponseEnum is the enum of response types
public enum OllamaResponseEnum {
    ASK("ask"),
    ANSWER("answer"),
    FINISH("finish");

    private String type;

    OllamaResponseEnum(String type) {
        this.type = type;
    }

    public String type() {
        return type;
    }
}
Direct API call example:
The example uses Hutool utility classes, so hutool-all has to be added as a dependency
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.8.33</version>
</dependency>
/**
 * Direct (non-streaming) call
 *
 * @param prompt the prompt sent to the model
 * @return the model's complete reply
 */
@Mapping("generate")
public R generate(String prompt) {
    String result = HttpRequest.post(OllamaApi.url)
            .body(
                    JSONUtil.createObj()
                            .set("prompt", prompt)
                            .set("model", ModelEnum.QWEN2_5_14B.model())
                            .set("stream", OllamaApi.StreamFalse)
                            .toString()
            )
            .execute()
            .body();
    // Parse the returned JSON and extract the generated text
    String response = JSONUtil.parseObj(result).getStr("response");
    return R.data(response);
}
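The same non-streaming call can also be made without Hutool. The following is a minimal sketch that uses only the JDK's java.net.http.HttpClient (Java 11+). The class OllamaPlainClient and its helper methods are hypothetical and not part of the original example, and the hand-written JSON escaping covers only what a plain prompt string needs. It returns the raw JSON body, so the response field still has to be extracted with a JSON library.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper, not part of the original example.
public class OllamaPlainClient {

    // Escapes a Java string so it can be embedded in a JSON string literal.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds a request body with the same three fields the Hutool version sets.
    public static String buildBody(String model, String prompt, boolean stream) {
        return "{\"model\":\"" + jsonEscape(model)
                + "\",\"prompt\":\"" + jsonEscape(prompt)
                + "\",\"stream\":" + stream + "}";
    }

    // Sends the request to a locally running Ollama and returns the raw JSON reply.
    public static String generate(String url, String model, String prompt) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofMinutes(10))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        buildBody(model, prompt, false), StandardCharsets.UTF_8))
                .build();
        HttpResponse<String> response = client.send(
                request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

With Ollama running locally, a call such as OllamaPlainClient.generate("http://127.0.0.1:11434/api/generate", "qwen2.5:14b", "hello") should return the JSON object whose response field holds the generated text.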
Streaming request example:
SSE-related code:
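public interface SseService {
    /**
     * Open an SSE connection
     *
     * @param uid id of the user the connection belongs to
     * @return the emitter bound to this connection
     */
    SseEmitter connect(Long uid);

    /**
     * Send a message to a single user
     *
     * @param uid id of the receiving user
     * @param msg message payload
     */
    void send(Long uid, String msg);

    /**
     * Close the SSE connection
     *
     * @param uid id of the user whose connection is closed
     */
    void complete(Long uid);
}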
@Component
public class SseServiceImpl implements SseService {
    private final Map<Long, SseEmitter> emitterMap = Maps.newConcurrentMap();

    @SneakyThrows
    @Override
    public SseEmitter connect(Long uid) {
        return new SseEmitter(0L)
                .onCompletion(() -> emitterMap.remove(uid))
                .onError(Throwable::printStackTrace)
                .onInited(s -> emitterMap.put(uid, s));
    }

    @SneakyThrows
    @Override
    public void send(Long uid, String msg) {
        SseEmitter sseEmitter = emitterMap.get(uid);
        if (sseEmitter == null) {
            System.err.println("sseEmitter is null");
            return;
        }
        sseEmitter.send(new SseEvent().data(msg));
    }

    @Override
    public void complete(Long uid) {
        SseEmitter sseEmitter = emitterMap.get(uid);
        if (sseEmitter != null) {
            sseEmitter.complete();
        }
    }
}
@Mapping("SseConnect")
public SseEmitter SseConnect() {
    // long userId = StpUtil.getLoginIdAsLong();
    return sseService.connect(1L);
}
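For readers new to SSE, it may help to see what one event looks like on the wire. The sketch below formats a frame following the text/event-stream format defined in the HTML standard: every line of the payload goes out as its own data: line, and an empty line ends the event. The class SseFrame is hypothetical and only illustrates the format; it is not how Solon's SseEmitter is implemented.

```java
// Hypothetical illustration of the text/event-stream wire format.
public class SseFrame {

    // Formats one event; eventName may be null for an unnamed (default "message") event.
    public static String format(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null) {
            sb.append("event: ").append(eventName).append('\n');
        }
        // A payload that contains line breaks must be split into several data: lines.
        for (String line : data.split("\r\n|\r|\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        // The empty line tells the client that the event is complete.
        sb.append('\n');
        return sb.toString();
    }
}
```

Because the messages in this article are single-line JSON strings, each one fits in a single data: line. A browser EventSource delivers that line as event.data in its onmessage handler.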
Request endpoint example (works with both SSE and WebSocket)
The following dependency is required
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp-sse</artifactId>
    <version>4.12.0</version>
</dependency>
@SneakyThrows
@Mapping("generateStream")
public void generateStream(String prompt) {
    OkHttpClient client = new OkHttpClient.Builder()
            .connectTimeout(10, TimeUnit.SECONDS)
            .writeTimeout(50, TimeUnit.SECONDS)
            .readTimeout(10, TimeUnit.MINUTES)
            .build();
    RequestBody body = RequestBody.create(
            MediaType.parse("application/json; charset=utf-8"),
            JSONUtil.createObj()
                    .set("prompt", prompt)
                    .set("model", ModelEnum.QWEN2_5_14B.model())
                    .set("stream", OllamaApi.StreamTrue)
                    .toString()
    );
    // Request object
    Request request = new Request.Builder()
            .url(OllamaApi.url)
            .post(body)
            .build();
    // Push over SSE: echo the question back to the client first
    sseService.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.ASK).setContent(prompt)));
    // Push over WebSocket instead
    // SolonWebSocket.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.ASK).setContent(prompt)));
    try (Response response = client.newCall(request).execute()) {
        if (!response.isSuccessful()) throw new IOException("Unexpected code " + response);
        // Ollama streams one UTF-8 JSON object per line, so decode explicitly instead of relying on the platform default charset
        BufferedReader reader = new BufferedReader(new InputStreamReader(response.body().byteStream(), StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            // Parse each line once and reuse the result (JSONObject is cn.hutool.json.JSONObject)
            JSONObject chunk = JSONUtil.parseObj(line);
            if (chunk.getBool("done", false)) {
                // Last chunk: tell the client the answer is finished and close the SSE connection
                sseService.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.FINISH).setContent("")));
                sseService.complete(1L);
                // SolonWebSocket.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.FINISH).setContent("")));
                return;
            }
            String responseData = chunk.getStr("response");
            sseService.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.ANSWER).setContent(responseData)));
            // SolonWebSocket.send(1L, JSONUtil.toJsonStr(OllamaResponse.of().setType(OllamaResponseEnum.ANSWER).setContent(responseData)));
        }
    }
}
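When stream is true, Ollama replies with one JSON object per line. Each object carries a fragment of text in its response field, and the last one has done set to true. The sketch below replays that read loop against hard-coded sample lines so it runs without a server. The class StreamLoopDemo, the sample data, and the regex-based field extraction are assumptions made for illustration only (the regex does not decode JSON escape sequences); in real code a JSON library such as Hutool's JSONUtil, as used in this article, is the better choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the line-by-line read loop, using sample data instead of a live server.
public class StreamLoopDemo {
    // Simplified stand-ins for a JSON parser: match the "response" string and a true "done" flag.
    private static final Pattern RESPONSE =
            Pattern.compile("\"response\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");
    private static final Pattern DONE = Pattern.compile("\"done\"\\s*:\\s*true");

    // Reads lines until the chunk marked done, concatenating the text fragments.
    public static String collect(BufferedReader reader) throws IOException {
        StringBuilder answer = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isBlank()) continue;            // skip empty keep-alive lines
            Matcher m = RESPONSE.matcher(line);
            if (m.find()) answer.append(m.group(1)); // one fragment of the answer
            if (DONE.matcher(line).find()) break;    // final chunk: stop reading
        }
        return answer.toString();
    }

    public static void main(String[] args) throws IOException {
        String sample =
                "{\"model\":\"qwen2.5:14b\",\"response\":\"Hello\",\"done\":false}\n"
              + "{\"model\":\"qwen2.5:14b\",\"response\":\", world\",\"done\":false}\n"
              + "{\"model\":\"qwen2.5:14b\",\"response\":\"\",\"done\":true}\n";
        System.out.println(collect(new BufferedReader(new StringReader(sample))));
    }
}
```

In generateStream, each fragment is forwarded to the client as an ANSWER message as soon as it arrives rather than being accumulated, which is what produces the typing effect on the page.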