
Conversation

@Pfannkuchensack
Collaborator

Summary

When using GGUF-quantized models on MPS (Apple Silicon), the dequantized tensors could end up on a different device than the other operands in math operations, causing "Expected all tensors to be on the same device" errors.
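
As a minimal sketch of the failure mode (illustrative only, not the actual Invoke code path):

```python
import torch

# Minimal sketch of the failure mode (illustrative; not the Invoke code path).
# A freshly dequantized GGUF weight may be left on CPU while the other
# operand lives on MPS, so the matmul raises the device-mismatch error.
if torch.backends.mps.is_available():
    weight = torch.randn(2, 2)                    # dequantized weight, still on CPU
    activation = torch.randn(2, 2, device="mps")  # activation on MPS
    try:
        _ = activation @ weight  # RuntimeError: Expected all tensors to be on the same device
    except RuntimeError as err:
        print(err)
```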

This fix ensures that after dequantization, tensors are moved to the same device as the other tensors in the operation.
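
A minimal sketch of the idea, assuming the dequantization routine returns a plain `torch.Tensor`; the helper name `match_operand_device` is hypothetical, not the code added by this PR:

```python
import torch

def match_operand_device(dequantized: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: after dequantization, move the result onto the
    # device of the operand it will be combined with, so MPS (or CUDA) math
    # ops no longer see mixed-device inputs.
    if dequantized.device != other.device:
        dequantized = dequantized.to(other.device)
    return dequantized

# Usage: weight dequantized on CPU, activation on MPS (falls back to CPU elsewhere).
device = "mps" if torch.backends.mps.is_available() else "cpu"
weight = torch.randn(4, 4)  # stands in for a freshly dequantized GGUF tensor
activation = torch.randn(4, 4, device=device)
result = activation @ match_operand_device(weight, activation)
```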

Related Issues / Discussions

(https://discord.com/channels/1020123559063990373/1149506274971631688/1454480237311168654)

QA Instructions

Test with z_image_turbo-Q4_K.gguf and Qwen_3_4b-Q6_K.gguf on a Mac.

Merge Plan

No big change.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels on Dec 27, 2025
@Vargol
Contributor

Vargol commented Dec 29, 2025

Note that this issue doesn't occur with keep_ram_copy_of_weights enabled, and it defaults to enabled, so you'll need

keep_ram_copy_of_weights: False

in invoke.yaml when testing.
I also believe partial loading can work around the issue, but having either of those settings enabled is sub-optimal on MPS.

It's basically the same issue as #7939.

Oh, and I believe it breaks on CUDA with the same settings, if you've got the VRAM to run it.

@gogurtenjoyer
Contributor

I've tested this PR on an M5 with 32 GB RAM. With keep_ram_copy_of_weights: False set, it allows generations with no errors reported in the console, and the speed is the same as before.

@lstein lstein marked this pull request as ready for review January 2, 2026 00:07
Collaborator

@lstein lstein left a comment


Looks good. I'm going ahead with an approval to merge.

@lstein lstein enabled auto-merge (squash) January 2, 2026 00:32
@lstein lstein merged commit 3b2d2ef into invoke-ai:main Jan 2, 2026
13 checks passed
@Pfannkuchensack Pfannkuchensack deleted the fix/gguf-mps-device-mismatch branch January 3, 2026 08:19