Reasoning without using the think function #16392

@imjking

Hi, I want to use the Qwen3_0.6B model on an 8255 device. I exported the pte model and ran it on the device successfully. Now I want to disable the "think" function to verify something. How can I achieve this?
I ran the following command and got outputs.txt:
./qnn_llama_runner_ndk27 --decoder_model_version qwen3 --tokenizer_path tokenizer.json --model_path hybrid_llama_qnn.pte --prompt "who are you" --seq_len 512 --eval_mode 1 --temperature 0.8 && cat outputs.txt

cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin
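
One thing that may be worth trying, as a sketch rather than a confirmed fix: the Qwen3 model card describes a "soft switch" where appending `/no_think` to the user turn asks the model to skip its reasoning. The snippet below reuses exactly the flags from the command above and only changes `--prompt`. It assumes the runner passes the prompt text through to the model's chat template unchanged, which has not been verified for `qnn_llama_runner_ndk27`. Even when the switch works, Qwen3 typically still emits an empty `<think></think>` block before the answer.

```shell
# Assumption: Qwen3's documented "/no_think" soft switch, appended to the user turn.
PROMPT="who are you /no_think"

# Same flags as the command in the report; only --prompt changes.
CMD="./qnn_llama_runner_ndk27 --decoder_model_version qwen3 --tokenizer_path tokenizer.json --model_path hybrid_llama_qnn.pte --prompt \"$PROMPT\" --seq_len 512 --eval_mode 1 --temperature 0.8"

# Print the command for inspection before running it on the device.
echo "$CMD"
```

On the device, the printed command can be run as-is, followed by `cat outputs.txt` to compare the result against the original run.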

Labels: module: qnn (Issues related to Qualcomm's QNN delegate and code under backends/qualcomm); partner: qualcomm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm)