Prevent error from being fused with scalar in simd_op_check #8867
Fix the output mismatch in fmls with the float16 type, where error() was optimized in such a way that it was fused with the scalar computation. compute_root() makes sure the scalar result is computed independently.
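As a rough illustration of why fusing can change a floating-point result at all, here is a minimal standalone C++ sketch. It does not use Halide and is not taken from this PR: the helper names `unfused_mls` and `fused_mls` are hypothetical, it uses float32 rather than float16 for portability, and it only shows the general rounding effect of a fused multiply-subtract, which may or may not be the exact mechanism behind the mismatch described above.

```cpp
#include <cmath>

// Hypothetical helper: multiply-subtract with the product rounded to
// float before the subtraction. The volatile store keeps the compiler
// from contracting the two operations into a single fused instruction.
float unfused_mls(float a, float b, float c) {
    volatile float prod = a * b;  // rounded to float here
    return c - prod;
}

// Hypothetical helper: the same expression, c - a * b, evaluated with a
// single rounding at the end via std::fma.
float fused_mls(float a, float b, float c) {
    return std::fma(-a, b, c);
}
```

With `a = b = 1 + 2^-12` and `c = 1 + 2^-11`, the exact product is `1 + 2^-11 + 2^-24`, which rounds to `1 + 2^-11` in float32. The unfused version therefore returns 0, while the fused version returns `-2^-24`. A reference value and a value under test that are evaluated in different ways can disagree like this even though both come from the same expression.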
In simd_op_check_wasm, i8x16.splat generates , which was previously for data reuse .
// Include a scalar version
Halide::Func f_scalar("scalar_" + name);
f_scalar(x, y) = e;
f_scalar.compute_root();
Might be worth a comment as to why this is necessary for correctness.
Done
That's coming from these tests:
// Load vector with identical lanes generates *.splat.
check("i8x16.splat", 16 * w, in_u8(0));
check("i16x8.splat", 8 * w, in_u16(0));
check("i32x4.splat", 4 * w, in_u32(0));
check("i64x2.splat", 2 * w, in_u64(0));
I think it's actually an improvement to use
alexreinking
left a comment
I just fixed simd_op_check_wasm myself. Hope this works!