Add response body handler #90
return nil, fmt.Errorf("unmarshaling response body: %v", err)
}
reqCtx.Response = res
klog.V(3).Infof("Response: %+v", res)
same here, the response is just as big as the body, right?
Actually not, the response here only contains the fields we are interested in (the number of tokens).
/approve
Thanks Cong
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ahg-g, liu-cong
The PR adds a response body handler, which parses the inference response with `usage` stats such as `completion_tokens`. This is the prerequisite for adding metrics such as per output token latency.
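To make the idea concrete, here is a minimal sketch of parsing only the usage stats out of a response body. It is not the code from this PR: the `Usage` and `Response` types, the `parseUsage` helper, and the sample payload are hypothetical, and the `prompt_tokens` and `total_tokens` fields assume an OpenAI-style response shape. Only `completion_tokens` and the error message are taken from the discussion above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage is a hypothetical struct mirroring an OpenAI-style "usage" object.
// Only completion_tokens is named in the PR; the other fields are assumed.
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// Response declares only the fields of interest. encoding/json silently
// skips every other key in the body, so the parsed value stays small even
// when the body itself is large.
type Response struct {
	Usage Usage `json:"usage"`
}

// parseUsage is a hypothetical helper that unmarshals a response body into
// the trimmed-down Response struct.
func parseUsage(body []byte) (Response, error) {
	var res Response
	if err := json.Unmarshal(body, &res); err != nil {
		return Response{}, fmt.Errorf("unmarshaling response body: %v", err)
	}
	return res, nil
}

func main() {
	// Made-up sample body for illustration only.
	body := []byte(`{"id":"cmpl-1","choices":[{"text":"hello"}],` +
		`"usage":{"prompt_tokens":5,"completion_tokens":7,"total_tokens":12}}`)

	res, err := parseUsage(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("Response: %+v\n", res)
}
```

Because the struct lists only the token counts, logging the parsed value at a verbose level stays cheap, which is the point made in the review thread about the response being smaller than the body.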