
[Feature Request] ONNX support #772

Open
FNsi opened this issue Dec 8, 2023 · 61 comments
Labels
enhancement New feature or request

Comments

@FNsi

FNsi commented Dec 8, 2023

With a compact structure (model size 256k–4M), this could work as a runtime effect based on DirectML.

Am I being too greedy? 😂

@FNsi FNsi closed this as not planned Dec 8, 2023
@cqaqlxz

cqaqlxz commented Dec 9, 2023

REAL-ESRGAN is too large; it's too difficult to run in real time on current computers.

@FNsi
Author

FNsi commented Dec 9, 2023

REAL-ESRGAN is too large; it's too difficult to run in real time on current computers.

The main bottleneck is memory size: a 2K game with 2x enlargement costs about 16 GB of memory. The speed could be real-time on a 3060 (512k model) only if memory were unlimited. 😂

IMO it could work on an iGPU, though the 780M is still not good enough; maybe the Qualcomm Elite X, but that's another story...
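For scale, here is a rough estimate under assumptions the thread doesn't state (a compact SRVGGNet-style model with 64 feature channels and fp32 activations). A single intermediate feature map for a 1440p frame already takes

$$2560 \times 1440 \times 64 \times 4\ \text{bytes} \approx 0.94\ \text{GB},$$

and with several conv-layer activations alive at once, plus the 2x-sized tensors after upsampling, usage in the ~16 GB range quoted above is plausible.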

@Blinue
Owner

Blinue commented Dec 10, 2023

Some models can indeed run in real time, such as mpv-upscale-2x_animejanai. I plan to add ONNX support in the future, but there is still a lot of uncertainty.

@Blinue Blinue added the enhancement New feature or request label Dec 10, 2023
@FNsi
Author

FNsi commented Dec 15, 2023

2x-DigitalFlim

The best ESR model I've ever tried, not only for its size but also for its output (real + anime).

@Blinue Blinue reopened this Dec 15, 2023
@kato-megumi
Contributor

The SuperUltraCompact model isn't much larger than the Anime4K UL model (around 2x, I guess), so it could plausibly be ported to HLSL format.

@Blinue
Owner

Blinue commented Dec 15, 2023

While porting to HLSL does indeed offer higher efficiency, the cost is also substantial unless there's an automated approach. I'm inclined to adopt ONNX Runtime, enabling us to integrate any ONNX model with ease.
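For the curious, a minimal sketch of what an ONNX Runtime integration could look like, using the DirectML execution provider so it is not NVIDIA-only. This is my illustration, not Magpie's actual code; the model path and the tensor names/shapes are placeholders:

```cpp
// Minimal sketch: run an ONNX 2x-upscale model via ONNX Runtime + DirectML.
// Illustrative only; not Magpie's real integration.
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>
#include <array>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "ort-dml-sketch");
    Ort::SessionOptions opts;
    opts.DisableMemPattern();               // required by the DML EP
    opts.SetExecutionMode(ORT_SEQUENTIAL);  // ditto
    Ort::ThrowOnError(
        OrtSessionOptionsAppendExecutionProvider_DML(opts, /*device_id=*/0));

    Ort::Session session(env, L"2x_model.onnx", opts);  // placeholder path

    // NCHW fp32 frame; real code would blit the captured window here.
    std::array<int64_t, 4> shape{1, 3, 720, 1280};
    std::vector<float> pixels(3 * 720 * 1280, 0.0f);
    auto mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input = Ort::Value::CreateTensor<float>(
        mem, pixels.data(), pixels.size(), shape.data(), shape.size());

    // "input"/"output" are placeholder tensor names; query the model in practice.
    const char* in_names[] = {"input"};
    const char* out_names[] = {"output"};
    auto out = session.Run(Ort::RunOptions{nullptr},
                           in_names, &input, 1, out_names, 1);
    // out[0] now holds the 1x3x1440x2560 upscaled frame.
}
```

The appeal of this route is exactly what the comment above says: any model exported to ONNX works without a hand-written HLSL port.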

@YingDoge

YingDoge commented Jan 2, 2024

I personally think this is a great idea, as AnimeJaNai does offer much better graphics at times. I would personally donate 20 USD if this happened. Magpie is getting better every day. Love this thing so much.


@kato-megumi
Contributor

kato-megumi commented Feb 14, 2024

I ported AnimeJaNai V3 SuperUltraCompact and 2x-DigitalFlim to Magpie effects, if anyone wants to try.
https://gist.github.com/kato-megumi/d10c12463b97184c559734f2cba553be


@Blinue
Owner

Blinue commented Feb 14, 2024

Great job! It appears that AnimeJaNai is well-suited for scenes from old anime, as it doesn't produce sharp lines like Anime4K does. However, a significant issue is that it sacrifices many details. While DigitalFlim is sharper than AnimeJaNai, it also suffers from severe detail loss. In terms of performance, they are roughly 20-25 times slower than Lanczos.

@FNsi
Author

FNsi commented Feb 16, 2024

Nothing happened after I put both files in the effects folder (I even rebooted the system).

As an experiment I also added fakehdr.hlsl, and it works...

I don't know if I made any mistakes (version 10.05)

@kato-megumi
Contributor

kato-megumi commented Feb 16, 2024

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525


@FNsi
Author

FNsi commented Feb 17, 2024

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525

Thank you for your great work and help! Anyway, I still don't know how to download the build from GitHub Actions, so let me keep that surprise until the next release. 😁

However, a significant issue is that it sacrifices many details.

As for that, I think it's a common problem with ESR models, owing to the structure (even large models can't keep much detail) and the training datasets (animations?)


@Blinue
Owner

Blinue commented Feb 17, 2024

Download from here: https://github.com/Blinue/Magpie/actions/runs/7911000525/artifacts/1246839355

@FNsi
Author

FNsi commented Feb 17, 2024

Download from here: https://github.com/Blinue/Magpie/actions/runs/7911000525/artifacts/1246839355

Thank you. After signing in again I could download it.
It's weird that that kind of Actions page needs sign-in (otherwise it shows a 404) even though I was already signed in on the iOS client...

@spiwar

spiwar commented Feb 19, 2024

You have to use a newer version. https://github.com/Blinue/Magpie/actions/runs/7911000525


Can you port the SD model of AnimeJaNai, which is more aggressive in its detail reconstruction? A UC model for those of us with more computing power would also be great.


@kato-megumi
Contributor

Can you port the SD model of AnimeJaNai

@spiwar Do you have a link for it? I didn't find it on their GitHub

@FNsi
Author

FNsi commented Feb 19, 2024

For detail restoration... 2x-Futsuu-Anime, but it's 4M... I think it's a game for a 4090.

@kato-megumi
Contributor

animejanai.zip
Here are AnimeJaNai's Compact and UltraCompact models for anyone with enough power.
UltraCompact runs at about 3 fps for 720p on my machine.
I haven't tested Compact yet.

@carycary246

animejanai.zip Here are AnimeJaNai's Compact and UltraCompact models for anyone with enough power. UltraCompact runs at about 3 fps for 720p on my machine. I haven't tested Compact yet.

Same issue: 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv. Can you port the V3 sharp models? They are in the AnimeJaNai Discord beta releases.

@kato-megumi
Contributor

kato-megumi commented Feb 20, 2024

Same issue: 3 fps trying to run UltraCompact, even though it's fine when I use it in mpv

Perhaps it's a limitation of Magpie/HLSL. I'm hopeful that integrating ONNX will enhance performance. What GPU are you using?

Can you port the V3 sharp models?

Ok. https://gist.github.com/kato-megumi/d10c12463b97184c559734f2cba553be#file-animejanai_sharp_suc-hlsl


@spiwar

spiwar commented Feb 20, 2024

Can you port the SD model of AnimeJaNai

@spiwar Do you have a link for it? I didn't find it on their GitHub

You can find it in the full 1.1 GB release, but I've included it here for convenience.
2x_AnimeJaNai_SD_V1beta34_Compact.zip

@spiwar

spiwar commented Feb 20, 2024


RTX 3080 Ti, upscaling from a 1080p source to 4K:
C model runs at seconds per frame
UC model runs at 2-3 fps
SUC model runs at ~40 fps

If we can optimize this to run at decent speeds it would be very nice; the UC and C models look quite natural, with no oversharpening.

@Blinue
Owner

Blinue commented Feb 20, 2024

There is little room for performance optimization, because the bottleneck is floating-point operations.

@kato-megumi I found that 16-bit floating-point numbers (min16float) are more efficient, with about a 10% performance improvement on my side. But this is still not enough to make UC usable. Further performance improvements can only be achieved by using platform-specific APIs, such as TensorRT.


@FNsi
Author

FNsi commented Feb 20, 2024

Finding data to enhance the SUC model might be the better way forward...
Compared with TensorRT, DirectML is a universal solution, IMO...
(but obviously it can't benefit from NVIDIA hardware acceleration)
Or PyTorch compiled with 8-bit?

@kato-megumi
Contributor

@Blinue Sorry, can you elaborate? I thought using FORMAT R16G16B16A16_FLOAT already meant 16-bit floating-point numbers?
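My reading of the distinction (an inference on my part, not Blinue's answer): the texture format only fixes how pixels are stored, while the precision of the shader arithmetic is chosen separately in HLSL. A sketch:

```cpp
#include <d3d11.h>

// Storage precision and arithmetic precision are independent knobs.
// This texture *stores* 16-bit floats...
D3D11_TEXTURE2D_DESC desc = {};
desc.Width            = 1280;
desc.Height           = 720;
desc.MipLevels        = 1;
desc.ArraySize        = 1;
desc.Format           = DXGI_FORMAT_R16G16B16A16_FLOAT;  // fp16 storage
desc.SampleDesc.Count = 1;
desc.Usage            = D3D11_USAGE_DEFAULT;
desc.BindFlags        = D3D11_BIND_SHADER_RESOURCE;

// ...but sampling it in a shader promotes values to the declared ALU type.
// In HLSL, `float4 c = tex.Sample(sam, uv);` still does 32-bit math; only
// `min16float4 c = (min16float4)tex.Sample(sam, uv);` permits the driver to
// evaluate the arithmetic at 16 bits, which is presumably where the ~10%
// speedup mentioned above comes from.
```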

@Ptilopsis01

Ptilopsis01 commented Mar 10, 2024

The new version uses the new rendering system #643, which keeps the mouse smooth even at low frame rates.

Understood; the experience really is much better this way.

There may be two reasons: the source window itself is always-on-top, or you have enabled debug mode.

That's it; I had debug mode on.

There's another small bug: the new test build doesn't recognize the sharpen folder or any of the effects under it.

@Blinue
Owner

Blinue commented Mar 10, 2024

There's another small bug: the new test build doesn't recognize the sharpen folder or any of the effects under it.

Those haven't been adapted to the new rendering system yet; please wait patiently for #643 to be finished.

@Ptilopsis01

Those haven't been adapted to the new rendering system yet; please wait patiently for #643 to be finished.

OK, understood.

@kato-megumi
Contributor

Upscaling 1440p (or anything bigger than 1080p) with TensorRT results in a black screen.
magpie.2.log
magpie.1.log
magpie.log

@Blinue
Owner

Blinue commented Mar 10, 2024

The reason is that the TensorRT engine in Magpie is built to handle at most 1080p input. It can technically support bigger inputs, but consumer-grade graphics cards may have difficulty with real-time inference.

@spiwar

spiwar commented Mar 10, 2024

Just tested: very good performance upscaling from 1080p -> 4K with the included model (AnimeJaNai V3 UltraCompact).
RTX 3080 Ti
Previous build with HLSL: ~2 fps
DML: 22 fps
TensorRT: 34 fps
CUDA: doesn't work

Using the SuperUltraCompact model gets me to 60 fps in the same scenario; huge improvements all around.

This might already be in the works, but I think it's a good idea to show a pop-up saying that the engine is being built when using TensorRT; since it happens in the background, users might think nothing is happening when in fact the engine is being built.

@Blinue Blinue changed the title from [Feature Request] Add REAL-ESRGAN support to [Feature Request] ONNX support Mar 11, 2024
@Ptilopsis01

Ptilopsis01 commented Mar 11, 2024

DML: 22fps
TensorRT: 34fps

Could you tell me how you monitor your fps? The built-in monitor can't work at present, and this version is not compatible with RivaTuner, which I guess also cannot get correct fps data because of the new rendering system.

@kato-megumi
Contributor

Turn on developer mode by editing config.json.
Settings > Developer options > set Duplicate frame detection to Never.
Now you can get fps with RivaTuner. Remember: don't move your mouse while measuring fps.

@spiwar

spiwar commented Mar 11, 2024

DML: 22 fps
TensorRT: 34 fps

Could you tell me how you monitor your fps? The built-in monitor can't work at present, and this version is not compatible with RivaTuner, which I guess also cannot get correct fps data because of the new rendering system.

RTSS works for me. You might have to add Magpie as a separate application in the RTSS whitelist.

To monitor your fps with RTSS, have an animation playing or any moving scene, and don't move your mouse.

  • The fps reading will become wrong if you move your mouse (to keep the mouse movement smooth)
  • If the scene is static, the fps reading will also be wrong

@Ptilopsis01

Duplicate frame detection to never

Thanks!

@Kamikadashi

The reason is that the TensorRT engine in Magpie is built to handle at most 1080p input. It can technically support bigger inputs, but consumer-grade graphics cards may have difficulty with real-time inference.

At the very least, this should be available as an option.

@Blinue
Owner

Blinue commented Mar 14, 2024

At the very least, this should be available as an option.

I plan to enable the TensorRT backend to support inputs of any size in the future. This means that users will have to rebuild the engine multiple times to scale larger windows.

@Kamikadashi

I plan to enable the TensorRT backend to support inputs of any size in the future. This means that users will have to rebuild the engine multiple times to scale larger windows.

Does this mean the engine would need to be rebuilt every time the window size changes, or would it need to be built just once for each different window size?

@Blinue
Owner

Blinue commented Mar 15, 2024

Since building the engine is quite time-consuming, it's crucial to minimize the frequency of rebuilds. The implementation details have not been decided yet; please be patient.
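For what it's worth, if the TensorRT backend goes through ONNX Runtime's TensorRT execution provider (an assumption on my part), that EP already exposes the two relevant knobs: an on-disk engine cache, so a given shape is only built once, and explicit min/opt/max shape profiles, so one engine can cover a range of input sizes. A sketch with placeholder tensor names and shapes:

```cpp
#include <onnxruntime_cxx_api.h>

// Sketch: engine caching + one dynamic-shape profile for the TensorRT EP.
// Assumes ONNX Runtime hosts TensorRT; "input" and the shapes are placeholders.
void AddTensorRT(Ort::SessionOptions& opts) {
    const OrtApi& api = Ort::GetApi();
    OrtTensorRTProviderOptionsV2* trt = nullptr;
    Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&trt));

    const char* keys[] = {
        "trt_fp16_enable",          // use fp16 kernels where supported
        "trt_engine_cache_enable",  // persist built engines to disk
        "trt_engine_cache_path",
        "trt_profile_min_shapes",   // one profile spanning 480p..1080p input
        "trt_profile_opt_shapes",
        "trt_profile_max_shapes",
    };
    const char* values[] = {
        "1", "1", "trt_cache",
        "input:1x3x480x640",
        "input:1x3x720x1280",
        "input:1x3x1080x1920",
    };
    Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(trt, keys, values, 6));

    opts.AppendExecutionProvider_TensorRT_V2(*trt);
    api.ReleaseTensorRTProviderOptions(trt);
}
```

With a profile like this, any window whose input falls inside the min-max range reuses one engine; only sizes outside it would force a rebuild.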

@HIllya51

Does the ONNX version not support integrated graphics? The screen goes black on the AMD R6-6600H CPU...

@Blinue
Owner

Blinue commented Mar 18, 2024

Kindly note that only the DirectML backend is supported on non-NVIDIA graphics cards. Can you provide the logs to help diagnose the problem?

@HIllya51

Kindly note that only the DirectML backend is supported on non-NVIDIA graphics cards. Can you provide the logs to help diagnose the problem?

logs.zip

@Blinue
Owner

Blinue commented Mar 18, 2024

It’s likely due to OOM; I suspect the integrated graphics card doesn’t have sufficient resources to perform the inference. Can you share the ONNX file you’re using?

@HIllya51

It’s likely due to OOM; I suspect the integrated graphics card doesn’t have sufficient resources to perform the inference. Can you share the ONNX file you’re using?

I just use the 2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx

@Blinue
Owner

Blinue commented Mar 18, 2024

I just use the 2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx

For effective inference with the UC model, a 3060-class GPU is the practical minimum. While it might be feasible to run much smaller models on integrated graphics, it doesn't make much sense to do so: for tiny models, HLSL is significantly faster.

@hooke007
Collaborator

Does the ONNX version not support integrated graphics? The screen goes black on the AMD R6-6600H CPU...

If you only want to verify that ONNX works, I suggest using only a "tiny" model on an iGPU (i.e. #847 (comment))

@HIllya51

I just use the 2x_AnimeJaNai_HD_V3_UltraCompact_425k-fp16.onnx

For effective inference with the UC model, a 3060-class GPU is the practical minimum. While it might be feasible to run much smaller models on integrated graphics, it doesn't make much sense to do so: for tiny models, HLSL is significantly faster.

Does the ONNX version not support integrated graphics? The screen goes black on the AMD R6-6600H CPU...

If you only want to verify that ONNX works, I suggest using only a "tiny" model on an iGPU (i.e. #847 (comment))

I conducted some tests again and found that it can scale certain windows correctly (Notepad, Calculator, Windows Terminal), but the screen goes black for certain windows (Explorer, some games), so it may not be an error caused by ONNX, but rather a bug in window capture.

@Blinue
Owner

Blinue commented Mar 18, 2024

I conducted some tests again and found that it can scale certain windows correctly (Notepad, Calculator, Windows Terminal), but the screen goes black for certain windows (Explorer, some games), so it may not be an error caused by ONNX, but rather a bug in window capture.

I believe this is related to the window size. Scaling larger windows requires more VRAM, leading to OOM.

@HIllya51

I conducted some tests again and found that it can scale certain windows correctly (Notepad, Calculator, Windows Terminal), but the screen goes black for certain windows (Explorer, some games), so it may not be an error caused by ONNX, but rather a bug in window capture.

I believe this is related to the window size. Scaling larger windows requires more VRAM, leading to OOM.

I have tested it: for the software I mentioned earlier that can be scaled, it scales normally no matter how I resize the window; for the software that cannot be scaled, the screen goes black no matter how I resize it.

@Blinue
Owner

Blinue commented Mar 18, 2024

@HIllya51 Could you create an issue for this problem?

@keyrwhf

keyrwhf commented May 12, 2024

Thank you so much! I love this tool.
I apologize for my English; it's not very good.
However, I'm facing an issue: I'm using Magpie with Textractor to read a visual novel.
This ONNX preview version only supports the Graphics Capture method; the other methods don't work.
Textractor generates an extra always-on-top text box window to display text. When I use the Graphics Capture method, the text box disappears.
However, the Desktop Duplication method works with Textractor, and the text box doesn't disappear.
Can you please help?

@Blinue
Owner

Blinue commented May 13, 2024

In the onnx-preview1 version, only Graphics Capture is functional; other capture methods have not been adapted. It's merely a technical preview, so stay tuned for future updates.
