{
"title": "Flexible and Efficient Grammar-Constrained Decoding",
"authors": [
"Kanghee Park",
"Timothy Zhou",
"Loris D'Antoni"
],
"abstract": "Large Language Models (LLMs) are often asked to generate structured outputs\nthat obey precise syntactic rules, such as code snippets or formatted data.\nGrammar-constrained decoding (GCD) can guarantee that LLM outputs matches such\nrules by masking out tokens that will provably lead to outputs that do not\nbelong to a specified context-free grammar (CFG). To guarantee soundness, GCD\nalgorithms have to compute how a given LLM subword tokenizer can align with the\ntokens used\n by a given context-free grammar and compute token masks based on this\ninformation. Doing so efficiently is challenging and existing GCD algorithms\nrequire tens of minutes to preprocess common grammars. We present a new GCD\nalgorithm together with an implementation that offers 17.71x faster offline\npreprocessing than existing approaches while preserving state-of-the-art\nefficiency in online mask computation.",
"pdf_url": "http://arxiv.org/pdf/2502.05111v1",
"entry_id": "http://arxiv.org/abs/2502.05111v1",
"categories": [
"cs.CL",
"cs.AI"
]
}