lijiaoqiao/scripts/release.py at 570f8bab8fd9db2f48bed2990cc820351bc10f21

Sanjays2402 570f8bab8f fix(compression): exclude completion tokens from compression trigger (#12026)

Cherry-picked from PR #12481 by @Sanjays2402.

Reasoning models (GLM-5.1, QwQ, DeepSeek R1) inflate completion_tokens
with internal thinking tokens. The compression trigger summed
prompt_tokens + completion_tokens, causing premature compression at ~42%
actual context usage instead of the configured 50% threshold.

Now uses only prompt_tokens — completion tokens don't consume context
window space for the next API call.
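
The change described above can be sketched as follows. This is only an illustration, not the project's actual code: the function names, the `context_length` parameter, and the token counts are assumptions chosen so that the old check fires at 42% prompt usage against a 50% threshold.

```python
# Hypothetical sketch of the trigger logic; names and numbers are
# illustrative assumptions, not taken from the repository.

def should_compress_old(prompt_tokens, completion_tokens, context_length, threshold=0.5):
    """Old behaviour: prompt and completion tokens are summed before the check."""
    return prompt_tokens + completion_tokens >= context_length * threshold


def should_compress_new(prompt_tokens, context_length, threshold=0.5):
    """New behaviour: only prompt tokens count toward the threshold."""
    return prompt_tokens >= context_length * threshold


# Assumed example: a 100k-token window, 42k prompt tokens, and 8k completion
# tokens (mostly internal thinking from a reasoning model).
context_length = 100_000
prompt_tokens = 42_000
completion_tokens = 8_000

old_result = should_compress_old(prompt_tokens, completion_tokens, context_length)
new_result = should_compress_new(prompt_tokens, context_length)
```

With these assumed numbers, `old_result` is `True` (42k + 8k reaches the 50k threshold) while `new_result` is `False`, since the prompt alone occupies only 42% of the window.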

- 3 new regression tests
- Added AUTHOR_MAP entry for @Sanjays2402

Closes #12026

2026-04-20 05:12:10 -07:00

33 KiB

Executable File
