Unexpected tokenizer behavior difference from v4 to v5

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('mlx-community/Llama-3.2-1B-Instruct-4bit')

text = tokenizer.decode([128000, 64, 1174, 65])
print(text)
```

- With 4.57.3 the output is `<|begin_of_text|>a,b`
- With 5.0.0rc1 the output is `<|begin_of_text|>a ,b` (note the extra space before the comma)

Is this change in behavior expected?
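One possible explanation, not confirmed here, is that the two versions differ in whether `decode` applies the post-decoding whitespace cleanup controlled by `clean_up_tokenization_spaces`. The sketch below is a standalone approximation of that cleanup step as it behaves in 4.x. The helper name `clean_up_spaces` is hypothetical and the replacement list is an assumption based on the 4.x behavior, not code taken from either release.

```python
# Hypothetical helper: approximates the string replacements that decode()
# applies in 4.x when clean_up_tokenization_spaces is enabled.
# This is an illustration only and does not call into transformers.
def clean_up_spaces(text: str) -> str:
    replacements = [
        (" .", "."), (" ?", "?"), (" !", "!"), (" ,", ","),
        (" ' ", "'"), (" n't", "n't"), (" 'm", "'m"),
        (" 's", "'s"), (" 've", "'ve"), (" 're", "'re"),
    ]
    for old, new in replacements:
        text = text.replace(old, new)
    return text


raw = "<|begin_of_text|>a ,b"    # the 5.0.0rc1 output from this report
cleaned = clean_up_spaces(raw)   # space before the comma is removed

print(repr(raw))
print(repr(cleaned))
```

If this is the cause, passing `clean_up_tokenization_spaces=True` or `clean_up_tokenization_spaces=False` explicitly to `tokenizer.decode(...)` should make both versions agree, assuming 5.0.0rc1 still accepts that keyword.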

### Reproduction

Run the snippet above with `transformers` 4.57.3 and again with 5.0.0rc1, then compare the printed output.

### Expected behavior

The two versions should produce the same decoded text, right?

Unexpected tokenizer behavior difference from v4 to v5 #42913
