Summary
- Converting a model to iq3_s or iq3_xxs triggers a fatal error and the program aborts.
Error
sd -M convert -m realDream_sdxl6.safetensors --type iq3_s
[INFO ] model.cpp:908 - load realDream_sdxl6.safetensors using safetensors format
[INFO ] model.cpp:1985 - model tensors mem size: 2183.06MB
|=> | 55/2641 - 0.00it/sOops: found point 103 not on grid: 103 0 0 0
/usr/src/debug/stable-diffusion.cpp-vulkan-git/stable-diffusion.cpp/ggml/src/ggml-quants.c:3929: fatal error
ptrace: Operation not permitted.
No stack.
The program is not being run.
Aborted (core dumped)
Command to test quants:
sd -M convert -m realDream_sdxl6.safetensors --type q4_0
Test model: realDream_sdxl6 ( SDXL | F16 | 6.46GB )
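To repeat the test across every quant type in one pass, a small driver along these lines could be used. It is only a sketch: it reuses the exact `sd -M convert -m ... --type ...` invocation shown above, assumes `sd` is on the PATH, and the helper names (`build_convert_cmd`, `try_convert`) and the `QUANT_TYPES` list are hypothetical, copied from the table in this report rather than from the tool itself.

```python
import subprocess
import time

# Quant types taken from the table in this report (assumption: these are
# the exact spellings that the --type flag accepts).
QUANT_TYPES = [
    "tq1_0", "tq2_0", "q2_K", "iq3_xxs", "iq3_s", "q3_K", "iq4_xs", "iq4_nl",
    "q4_0", "q4_K", "q4_1", "q5_0", "q5_K", "q5_1", "q6_K", "q8_0",
]


def build_convert_cmd(model_path, quant_type):
    """Build the same command line used in this report for one quant type."""
    return ["sd", "-M", "convert", "-m", model_path, "--type", quant_type]


def try_convert(model_path, quant_type):
    """Run one conversion; return (exit code, elapsed wall-clock seconds).

    A non-zero exit code (negative when the process is killed by a signal,
    as with the abort above) marks that quant type as failing.
    """
    start = time.monotonic()
    proc = subprocess.run(build_convert_cmd(model_path, quant_type))
    return proc.returncode, time.monotonic() - start


# Example use (not executed here, since it needs the sd binary and the model):
#   for qt in QUANT_TYPES:
#       code, secs = try_convert("realDream_sdxl6.safetensors", qt)
#       print(qt, "ok" if code == 0 else f"failed ({code})", f"{secs:.0f}s")
```

Recording the exit code per type makes it easy to see whether other iq* types fail in the same way as iq3_xxs and iq3_s.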
Speed to convert quants (almost all of them)
quant | model tensor mem size | it/s |
---|---|---|
tq1_0 | 1565.20MB | 14.49 |
tq2_0 | 1697.60MB | 13.51 |
q2_K | 1896.20MB | 4.52 |
iq3_xxs | 2050.66MB | no |
iq3_s | 2183.06MB | no |
q3_K | 2183.06MB | 8.26 |
iq4_xs | 2469.92MB | 1.61 |
iq4_nl | 2479.52MB | 1.78 |
q4_0 | 2479.52MB | 10.00 |
q4_K | 2558.19MB | 5.46 |
q4_1 | 2659.47MB | 5.56 |
q5_0 | 2839.42MB | 9.80 |
q5_K | 2911.25MB | 5.26 |
q5_1 | 3019.38MB | 5.52 |
q6_K | 3286.38MB | 5.95 |
q8_0 | 3919.13MB | 13.70 |
Time to finish conversion:
- q8_0: 4m16s
- iq4_xs: 19m52s (very very slow)
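As a quick check of how far apart these two timings are, the ratio can be worked out directly. The snippet below is only arithmetic on the two durations listed above; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' duration into total seconds."""
    return minutes * 60 + seconds


q8_0_seconds = to_seconds(4, 16)      # 4m16s  -> 256 s
iq4_xs_seconds = to_seconds(19, 52)   # 19m52s -> 1192 s

# How many times longer the iq4_xs conversion took than q8_0.
ratio = iq4_xs_seconds / q8_0_seconds
print(f"iq4_xs took {ratio:.2f}x as long as q8_0")
```

The ratio comes out at about 4.66, which matches the 4.65x figure in this report if that figure is truncated rather than rounded.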
Conclusions
- Fatal error when converting to: iq3_xxs, iq3_s, maybe more
- Conversion uses only "one" CPU core, multithreaded optimization maybe?
- q8_0 is converted 4.65x faster than iq4_xs
- Faster: q8_0 > q4_0
System:
OS: Arch Linux x86_64
Kernel: Linux 6.12.24-1-lts
Shell: bash 5.2.37
WM: dwm (X11)
Terminal: tmux 3.5a
CPU: Intel(R) Core(TM) i7-4790 (8) @ 3.60 GHz
GPU: NVIDIA GeForce GTX 1660 SUPER [Discrete] (6GB)
Memory: 2.47 GiB / 15.56 GiB (16%)
Locale: en_US.UTF-8