Improve GPT OSS Conversion Script #42902

@KyleMylonakisProtopia

Description

Feature request

The GPT OSS Conversion Script exposes parameters that are not needed or used, has incorrect documentation, and crashes due to a tiktoken bug.

I improved the script and validated that it works with GPT-OSS-20B.

Motivation

The original motivation comes from huggingface/accelerate#3882 (comment), where I would like to use accelerate to load a GPT-OSS-20B model. This does not work out of the box; however, this conversion script helps. Unfortunately, the script was broken, had incorrect documentation, and exposed dangerous arguments.

Your contribution

#42901
