Pull requests: ggml-org/llama.cpp
Pull requests list
[SYCL] add to support pool_1d, move pool_1d/2d code to pool.cpp/hpp
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#24584
opened Jun 13, 2026 by
arthw
Contributor
ci : add sycl to check-release
devops
improvements to build systems and github actions
#24583
opened Jun 13, 2026 by
CISC
Member
vulkan: support all backend tests for SQR/SQRT/SIN/COS/CLAMP/LEAKY_RELU/NORM
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#24582
opened Jun 13, 2026 by
jeffbolznv
Contributor
vulkan: Support gated_delta_net with S_v=16
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#24581
opened Jun 13, 2026 by
jeffbolznv
Contributor
vulkan: support more CONCAT types
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#24579
opened Jun 13, 2026 by
jeffbolznv
Contributor
[SYCL]fix reorder function crash:GGML_ASSERT(block_num_y % num_subgroups ==0)
examples
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#24578
opened Jun 13, 2026 by
arthw
Contributor
ggml: optimize concat op by replacing per-element memcpy with row-level memcpy
ggml
changes relating to the ggml tensor library for machine learning
#24575
opened Jun 13, 2026 by
sirohikartik
Contributor
CI: Replace flake8-no-print with flake8-debug and pin repos to hashes
#24572
opened Jun 13, 2026 by
jpodivin
Contributor
CUDA: Add conv3d.
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24569
opened Jun 13, 2026 by
Sero1000
EXPERIMENT: meta: key external view cache by backend context
ggml
changes relating to the ggml tensor library for machine learning
#24566
opened Jun 13, 2026 by
nycdubliner
Draft
[fattn-tune] Add Blackwell MMA config
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24565
opened Jun 13, 2026 by
yaohengxu
Contributor
[SYCL] Enhance set_rows to support q1_0, mxfp4, nvfp4
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#24564
opened Jun 13, 2026 by
arthw
Contributor
CUDA: don't route RDNA3.5 flash attention to the rocWMMA kernel
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24562
opened Jun 13, 2026 by
liminfei-amd
1 task done
CUDA/HIP: chunked MFMA prefill kernel for GATED_DELTA_NET (CDNA)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#24561
opened Jun 13, 2026 by
jadenmach2
Contributor
ggml-alloc : check realloc result in alloc_tensor_range
ggml
changes relating to the ggml tensor library for machine learning
#24559
opened Jun 13, 2026 by
ricku777-bear
Fix 24486: TP: allows the usage of 8,9,10 gpus for stepfun
#24554
opened Jun 13, 2026 by
krampenschiesser
llama: copy tensor_split at model load instead of retaining caller pointer, resolving segfault
#24552
opened Jun 13, 2026 by
dragonfyre13
llama : disable graph reuse when contexts share memory under SPLIT_MODE_TENSOR
#24549
opened Jun 12, 2026 by
nycdubliner
2 tasks done
Reduce RSS during BF16 GGUF export
python
python script changes
#24548
opened Jun 12, 2026 by
i386
ggml-cuda: use universal launch bounds for MoE MMVQ kernel
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24547
opened Jun 12, 2026 by
batot1
CUDA: size routed MoE MMQ N-tiles from typical expert width on RDNA3
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24546
opened Jun 12, 2026 by
ravel7524
Contributor
docs: add eagle3 to speculative doc
documentation
Improvements or additions to documentation
#24540
opened Jun 12, 2026 by
LiaXLiang
spec: add spec metrics mean acceptance length and acceptance rate per position
examples
server
#24536
opened Jun 12, 2026 by
ruixiang63
Contributor