Commit 89ddbef

BruceMacD authored and akawrykow committed

server : add /detokenize endpoint (ggml-org#2802)

* Add a /detokenize endpoint to the example server
* remove trailing white-space

1 parent 499351d, commit 89ddbef

File tree

2 files changed: +27 −0 lines changed


examples/server/README.md (+6)

```diff
@@ -164,6 +164,12 @@ node index.js
 
 Note that the special `BOS` token is not added in front of the text and also a space character is not inserted automatically as it is for `/completion`.
 
+- **POST** `/detokenize`: Convert tokens to text.
+
+    *Options:*
+
+    `tokens`: Set the tokens to detokenize.
+
 - **POST** `/embedding`: Generate embedding of a given text just as [the embedding example](../embedding) does.
 
     *Options:*
```
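The documented endpoint takes a JSON body with a `tokens` array. A minimal sketch of a request, assuming the example server is running on its default host and port (the token IDs below are arbitrary placeholders, not real tokenizations):

```shell
# Placeholder request body for the new /detokenize endpoint.
payload='{"tokens": [1, 2, 3]}'
echo "$payload"

# With the example server running (http://localhost:8080 is the default
# for the llama.cpp example server), the request would be:
#   curl -X POST http://localhost:8080/detokenize -d "$payload"
```

The response mirrors `/tokenize`: a JSON object with a single `content` field holding the detokenized text.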

examples/server/server.cpp (+21)

```diff
@@ -1104,6 +1104,12 @@ static json format_tokenizer_response(const std::vector<llama_token> &tokens)
         {"tokens", tokens}};
 }
 
+static json format_detokenized_response(std::string content)
+{
+    return json{
+        {"content", content}};
+}
+
 template <typename T>
 static T json_value(const json &body, const std::string &key, const T &default_value)
 {
@@ -1501,6 +1507,21 @@ int main(int argc, char **argv)
             const json data = format_tokenizer_response(tokens);
             return res.set_content(data.dump(), "application/json"); });
 
+    svr.Post("/detokenize", [&llama](const Request &req, Response &res)
+            {
+                auto lock = llama.lock();
+
+                const json body = json::parse(req.body);
+                std::string content;
+                if (body.count("tokens") != 0)
+                {
+                    const std::vector<llama_token> tokens = body["tokens"];
+                    content = tokens_to_str(llama.ctx, tokens.cbegin(), tokens.cend());
+                }
+
+                const json data = format_detokenized_response(content);
+                return res.set_content(data.dump(), "application/json"); });
+
     svr.Post("/embedding", [&llama](const Request &req, Response &res)
             {
                 auto lock = llama.lock();
```
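The handler's control flow is small: parse the body, detokenize only if a `tokens` key is present, otherwise return empty content. A pure-Python sketch of that logic (the actual detokenization needs a llama.cpp context, so a caller-supplied placeholder function stands in for `tokens_to_str`):

```python
import json

def detokenize_handler(request_body: str, detokenize=lambda toks: ""):
    """Mirror of the C++ /detokenize handler's JSON handling.

    `detokenize` is a stand-in for tokens_to_str(llama.ctx, ...); it is
    an assumption of this sketch, not part of the real server API.
    """
    body = json.loads(request_body)
    content = ""
    if "tokens" in body:  # mirrors body.count("tokens") != 0
        content = detokenize(body["tokens"])
    return json.dumps({"content": content})

# Usage: a toy detokenizer that joins token IDs with spaces.
print(detokenize_handler('{"tokens": [7, 8]}', lambda t: " ".join(map(str, t))))
# A body without "tokens" yields empty content, as in the C++ handler.
print(detokenize_handler('{}'))
```

Note that, like the C++ version, a missing `tokens` key is not an error: the endpoint silently returns `{"content": ""}`.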

0 commit comments