Spelling-out is not Straightforward: LLMs’ Capability of Tokenization from Token to Characters

Published in The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP): Findings, 2025