RPC with Web Search
#12971
Replies: 1 comment
-
Hi, I am currently learning about the RPC feature. Do you know which parallelism strategy the RPC servers use (model parallelism or tensor parallelism), and where I can find the relevant code? Please let me know if you do. Thank you.
-
Hello all,
I've just tried the RPC feature with my M4 Pro Mac and a Razer laptop (3070 Ti), running qwq 32b Q4_K_M.gguf. The allocation I used was 30 (Mac) and 25 (Razer), with the rest left to the Mac CPU. Responses came back quickly until web search was turned on. I checked the log, which shows that Bing did find relevant results and that the model did analyze and organize them. However, the model kept repeating the final summary of the answer it was trying to give me. I have no idea how to fix this, so please let me know if you do. Thank you.