How can I use Ray to deploy ChatGLM?

How can I use Ray to deploy ChatGLM2? How should the following code be modified?

>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='cuda')
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
你好👋!我是人工智能助手 ChatGLM2-6B,很高兴见到你,欢迎问我任何问题。
>>> response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
>>> print(response)

You could try Ray Serve: https://docs.ray.io/en/latest/serve/index.html
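As a rough starting point, here is a minimal sketch of how the snippet from the question might be wrapped in a Ray Serve deployment. It is not an official recipe. The names `ChatGLMService`, `build_app`, and `main`, and the JSON fields `prompt` and `history`, are made up for illustration. It assumes one GPU per replica and that `ray[serve]` and `transformers` are installed.

```python
import time
from typing import Any, Dict, List, Optional

MODEL_NAME = "THUDM/chatglm2-6b"


class ChatGLMService:
    """Hypothetical wrapper: loads ChatGLM2 once per replica, then answers chat requests."""

    def __init__(self, tokenizer: Any = None, model: Any = None) -> None:
        if tokenizer is None or model is None:
            # Heavy import kept local so it runs inside the replica process.
            from transformers import AutoModel, AutoTokenizer

            tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
            model = AutoModel.from_pretrained(
                MODEL_NAME, trust_remote_code=True, device="cuda"
            ).eval()
        self.tokenizer = tokenizer
        self.model = model

    def chat(self, prompt: str, history: Optional[List[Any]] = None) -> Dict[str, Any]:
        # Same call as in the question; the caller passes the history back in
        # on each turn, so the replica itself stays stateless.
        response, new_history = self.model.chat(
            self.tokenizer, prompt, history=history or []
        )
        return {"response": response, "history": new_history}

    async def __call__(self, request: Any) -> Dict[str, Any]:
        # Ray Serve hands HTTP requests to __call__ as Starlette Request objects.
        # Assumed request body: {"prompt": "...", "history": [...]}.
        body = await request.json()
        return self.chat(body["prompt"], body.get("history"))


def build_app(num_gpus: float = 1):
    """Turn the plain class into a Serve application (assumes ray[serve] is installed)."""
    from ray import serve

    deployment = serve.deployment(ray_actor_options={"num_gpus": num_gpus})(ChatGLMService)
    return deployment.bind()


def main() -> None:
    from ray import serve

    serve.run(build_app())
    # Keep the driver process alive so the deployment keeps serving.
    while True:
        time.sleep(60)
```

Calling `main()` should start the service on Serve's default HTTP address (`http://localhost:8000/` unless configured otherwise). A client can then POST a JSON body such as `{"prompt": "你好"}` and send the returned `history` back with the next prompt to continue the conversation.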