Commit 971fe9f

add tokens per second output (ggml-org#246)
* add tokens per second output
* Update gpttype_adapter.cpp simplify

---------

Co-authored-by: LostRuins <[email protected]>
1 parent b1b8dc3 commit 971fe9f

1 file changed: +2 -1 lines changed

gpttype_adapter.cpp

@@ -1280,7 +1280,8 @@ generation_outputs gpttype_generate(const generation_inputs inputs, generation_o
     float pt1 = (time1*1000.0/(embd_inp.size()==0?1:embd_inp.size()));
     int realnpredict = params.n_predict-stopper_unused_tokens;
     float pt2 = (time2*1000.0/(realnpredict==0?1:realnpredict));
-    printf("\nTime Taken - Processing:%.1fs (%.0fms/T), Generation:%.1fs (%.0fms/T), Total:%.1fs", time1, pt1, time2, pt2, (time1 + time2));
+    float tokens_per_second = (realnpredict == 0 ? 0 : realnpredict / (time1 + time2));
+    printf("\nTime Taken - Processing:%.1fs (%.0fms/T), Generation:%.1fs (%.0fms/T), Total:%.1fs (%.1fT/s)", time1, pt1, time2, pt2, (time1 + time2), tokens_per_second);
     fflush(stdout);
     output.status = 1;
     generation_finished = true;
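For readers who want to try the new figure in isolation, here is a minimal standalone sketch of the calculation. The helper names `tokens_per_second` and `format_timing` and their parameters are hypothetical and do not exist in gpttype_adapter.cpp. As in the commit, the rate divides the generated token count by the total time (processing plus generation), so it is not a generation-only rate. The guard against a zero total time is an added assumption; the commit only checks for a zero token count.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not in gpttype_adapter.cpp): generated tokens divided by
// total wall-clock seconds (prompt processing + generation), as in the commit.
float tokens_per_second(int generated_tokens, float process_secs, float generate_secs) {
    const float total_secs = process_secs + generate_secs;
    // The zero-token check mirrors the commit; the zero-time check is an
    // extra assumption added here to avoid dividing by zero.
    if (generated_tokens <= 0 || total_secs <= 0.0f) {
        return 0.0f;
    }
    return generated_tokens / total_secs;
}

// Hypothetical helper: builds the same summary line the commit prints,
// returned as a string instead of written to stdout.
std::string format_timing(int prompt_tokens, int generated_tokens,
                          float process_secs, float generate_secs) {
    // Per-token latencies in milliseconds; a zero count is treated as 1,
    // matching the existing pt1/pt2 expressions.
    const float pt1 = process_secs * 1000.0f / (prompt_tokens == 0 ? 1 : prompt_tokens);
    const float pt2 = generate_secs * 1000.0f / (generated_tokens == 0 ? 1 : generated_tokens);
    const float tps = tokens_per_second(generated_tokens, process_secs, generate_secs);

    char buf[256];
    const int written = std::snprintf(
        buf, sizeof(buf),
        "Time Taken - Processing:%.1fs (%.0fms/T), Generation:%.1fs (%.0fms/T), Total:%.1fs (%.1fT/s)",
        process_secs, pt1, generate_secs, pt2, process_secs + generate_secs, tps);
    // The line must have been formatted without error or truncation.
    assert(written > 0 && written < static_cast<int>(sizeof(buf)));
    return std::string(buf);
}
```

For example, `format_timing(50, 100, 2.0f, 8.0f)` describes a 50-token prompt processed in 2 s followed by 100 tokens generated in 8 s. The reported rate is 100 / (2 + 8) = 10.0 T/s, lower than the 12.5 T/s that generation time alone would give.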
