What's the performance difference after TensorRT is enabled? #26

Open
Godady opened this issue May 17, 2018 · 7 comments

Comments

@Godady

Godady commented May 17, 2018

NVIDIA says the TensorRT GIE greatly improves inference performance, but I have never seen a report on the improvement when TensorRT is used on GTX 10xx GPUs, especially the GTX 1080 Ti. Could anyone tell me how much performance PhoenixGo gains on a 1080 Ti with TensorRT enabled versus disabled?

@godmoves
Contributor

godmoves commented May 17, 2018

After TensorRT is enabled, the utilization of a 1080 Ti rises from ~75% to ~95%, so I think it works fine on the 1080 Ti.
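
Utilization figures like these can be sampled with `nvidia-smi`. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names `parse_utilization` and `sample_utilization` are made up for illustration and are not part of PhoenixGo.

```python
import subprocess


def parse_utilization(output):
    """Turn nvidia-smi CSV output (one number per line) into a list of ints."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def sample_utilization():
    """Query per-GPU utilization once; needs an NVIDIA driver and nvidia-smi."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_utilization(out)
```

Calling `sample_utilization()` a few times while the engine is thinking, once with TensorRT and once without, gives numbers comparable to the ~75% and ~95% quoted here.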

@wodesuck
Collaborator

About a 30% performance improvement on a P40. Since I don't have a 1080 Ti, I can't test it for you.

@baduk1

baduk1 commented Jun 17, 2018

How do I enable the TensorRT GIE to raise GPU usage?

@baduk1

baduk1 commented Jun 18, 2018

@godmoves does it win against the ELF weights at 95%?

@baduk1

baduk1 commented Jun 18, 2018

@godmoves
https://www.youtube.com/watch?v=xboKiwywEfM (2:00)
My current GPU usage is about 30%. What should I change to get to 75%?
These are my settings:
num_eval_threads: 2
num_search_threads: 12
max_children_per_node: 512
max_search_tree_size: 2000000000
timeout_ms_per_step: 20000
max_simulations_per_step: 0
eval_batch_size: 4
eval_wait_batch_timeout_us: 100
model_config {
    train_dir: "ckpt"
}
gpu_list: "0,1"
c_puct: 2.5
virtual_loss: 1.0
enable_resign: 0
v_resign: -0.9
enable_dirichlet_noise: 0
dirichlet_noise_alpha: 0.03
dirichlet_noise_ratio: 0.25
monitor_log_every_ms: 0
get_best_move_mode: 0
enable_background_search: 0
enable_policy_temperature: 0
policy_temperature: 0.67
inherit_default_act: 1
early_stop {
    enable: 1
    check_every_ms: 100
    sims_factor: 1.0
    sims_threshold: 2000
}
unstable_overtime {
    enable: 1
    time_factor: 0.3
}
behind_overtime {
    enable: 1
    act_threshold: 0.0
    time_factor: 0.3
}
time_control {
    enable: 1
    c_denom: 20
    c_maxply: 40
    reserved_time: 1.0
}
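
To compare settings like these between machines, it can help to load the config into a script. The sketch below is a hypothetical helper, not PhoenixGo tooling: `parse_conf` handles only `key: value` lines and `name {` ... `}` blocks with the braces on their own lines, as in the config posted here, and it keeps every value as a string.

```python
def parse_conf(text):
    """Parse a flat protobuf-text-style config into nested dicts of strings."""
    root = {}
    stack = [root]  # stack[-1] is the block currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):
            # Open a nested block such as `model_config {`.
            block = {}
            stack[-1][line[:-1].strip()] = block
            stack.append(block)
        elif line == "}":
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip().strip('"')
    return root


# A few fields copied from the config in this comment.
conf = parse_conf('''
num_eval_threads: 2
num_search_threads: 12
eval_batch_size: 4
model_config {
train_dir: "ckpt"
}
gpu_list: "0,1"
''')

gpus = conf["gpu_list"].split(",")
print(len(gpus), conf["num_eval_threads"], conf["eval_batch_size"])
```

The sketch only reads the values; it makes no claim about which of them limits GPU usage.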

@wonderingabout
Contributor

wonderingabout commented Dec 5, 2018

> 30% performance improved on P40. Since I don't have a 1080ti, I can't test for you.

Around the same on a Tesla P100.

25 sec/move at 5000 sims per move with TensorRT 3.0.4 (deb install) on Ubuntu 16.04, CUDA 9.0, cuDNN 7.0.5 (deb installs).
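
As a rough worked example of what these numbers imply: 5000 simulations in 25 seconds is 200 simulations per second. The sketch below also estimates the time per move without TensorRT, under the assumption (not stated in the thread) that "30% improved" means 30% more simulations per second. The function names are illustrative only.

```python
def sims_per_second(sims_per_move, seconds_per_move):
    """Search throughput implied by a fixed simulation budget per move."""
    return sims_per_move / seconds_per_move


def seconds_without_speedup(seconds_with_speedup, improvement):
    """Time per move before the speedup, assuming throughput rose by `improvement`."""
    return seconds_with_speedup * (1.0 + improvement)


rate = sims_per_second(5000, 25)               # 200 sims/sec with TensorRT
baseline = seconds_without_speedup(25, 0.30)   # about 32.5 sec/move under the assumption
print(rate, baseline)
```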

@wonderingabout
Contributor

15-20% on Ubuntu 16.04 LTS with a GTX 1060.
See the wiki speed benchmark.
