Pull requests: ggml-org/llama.cpp
Pull requests list
eval-callback : add support for saving logits
examples
#18281
opened Dec 22, 2025 by
danbev
Vulkan: Tune Flash Attention for MoE on AMD GPUs
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18280
opened Dec 22, 2025 by
0cc4m
Prevent crash if TTFT >300sec, boosted to 90 days
examples
server
#18279
opened Dec 22, 2025 by
wbtek
opencl: fix q4_0 unpacking for adreno in get_tensor
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
tools : use common_log_pause to fix fit-params output race
examples
#18276
opened Dec 22, 2025 by
Aadeshveer
KYLIN: fix compile error for cuda backend
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18275
opened Dec 22, 2025 by
lizhenneng
docs: Fix typos in SYCL documentation
documentation
Improvements or additions to documentation
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#18269
opened Dec 21, 2025 by
yoka
llama: fix magic number of 999 for GPU layers
#18266
opened Dec 21, 2025 by
JohannesGaessler
server: add real-time prompt preprocessing progress via synthetic SSE chunks
examples
server
#18265
opened Dec 21, 2025 by
ServeurpersoCom
Draft
vulkan: Extend rope fusions to allow mrope
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#18264
opened Dec 21, 2025 by
jeffbolznv
server: (docs) remove mention about extra_args
examples
server
#18262
opened Dec 21, 2025 by
ngxson
Add Gemma3n multimodal support with MobileNetV5 vision encoder
examples
model
Model specific
python
python script changes
#18256
opened Dec 21, 2025 by
simrnsingh
ggml rpc : Add missing check for rpc buffer type
ggml
changes relating to the ggml tensor library for machine learning
#18242
opened Dec 21, 2025 by
struct
ggml-cpu: parallelize tensor repacking with OpenMP
ggml
changes relating to the ggml tensor library for machine learning
#18239
opened Dec 21, 2025 by
pestopoppa
cli: buffering info log, only show if model load failed
examples
#18236
opened Dec 20, 2025 by
ngxson
webui: Fix the header backdrop blur
examples
server
#18230
opened Dec 20, 2025 by
ImadSaddik
server: /v1/responses (text generation only)
examples
python
python script changes
server
#18227
opened Dec 20, 2025 by
openingnow
webui: use server presets as parameter placeholders
examples
server
#18226
opened Dec 20, 2025 by
ServeurpersoCom
ggml-metal: guard buffer map slicing
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#18225
opened Dec 20, 2025 by
SzymonPrajs
webui: apply webui_settings on first load
examples
server
#18223
opened Dec 20, 2025 by
ServeurpersoCom
ggml-metal: fix memset range and temp buffer leaks
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#18221
opened Dec 20, 2025 by
SzymonPrajs
convert: rework ftype heuristics
python
python script changes
#18214
opened Dec 20, 2025 by
taronaeo