First, I'm not using cgroups (is this required?).
My issue is:
I'm using Torque version 5.1.3 (because newer versions failed to install on Linux Mint).
I configured it with the following commands:
#me@root:
cd torque-5.1.3-1462984387_205d70d
./configure --with-debug --enable-nvidia-gpus --with-sendmail
make
make install
cp contrib/init.d/debian.pbs_server /etc/init.d/pbs_server
cp contrib/init.d/debian.pbs_sched /etc/init.d/pbs_sched
cp contrib/init.d/debian.trqauthd /etc/init.d/trqauthd
sysv-rc-conf pbs_server on
sysv-rc-conf trqauthd on
sysv-rc-conf pbs_sched on
echo '/usr/local/lib'>/etc/ld.so.conf.d/torque.conf
ldconfig
service trqauthd restart
echo '/usr/local/lib'>/etc/ld.so.conf.d/torque.conf
echo "master.lbn.com">/var/spool/torque/server_name
./torque.setup root
echo "node01.lbn.com np=12 gpus=1" > /var/spool/torque/server_priv/nodes
service trqauthd restart
service pbs_server restart
service pbs_sched start
qmgr -c 'set server auto_node_np = True'
make packages
#Then I ssh to node01.lbn.com and run the following:
#root@node02
apt-get update
apt-get install g++ libssl-dev libxml2-dev sysv-rc-conf libboost-all-dev -y
cd torque-5.1.3-1462984387_205d70d
./configure --with-debug --enable-nvidia-gpus
make -j 2
make install -j 2
./torque-package-clients-linux-x86_64.sh --install
./torque-package-devel-linux-x86_64.sh --install
./torque-package-doc-linux-x86_64.sh --install
./torque-package-mom-linux-x86_64.sh --install
./torque-package-server-linux-x86_64.sh --install
echo '/usr/local/lib'>/etc/ld.so.conf.d/torque.conf
ldconfig
cp contrib/init.d/debian.pbs_mom /etc/init.d/pbs_mom
cp contrib/init.d/debian.trqauthd /etc/init.d/trqauthd
sysv-rc-conf trqauthd on
sysv-rc-conf pbs_mom on
service trqauthd restart
echo '$pbsserver master'>/var/spool/torque/mom_priv/config
echo '$logevent 225'>>/var/spool/torque/mom_priv/config
echo '$usercp *:/home /home'>>/var/spool/torque/mom_priv/config
service pbs_mom start
After running these steps, I run the "pbsnodes" command and node01 looks fine: I can see all the GPU info. However, when I submit a job, nvidia-smi shows the GPU's status (compute mode) changed to Exclusive-process and the GPU utilization stays at 0%, even though the task is still running (I can see it in "top").
My GPU is an NVIDIA Tesla K40c, but I have also tested with a GeForce GT 430, without success.
NOTE: I'm running CUDA 8.0 and the latest NVIDIA drivers.
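In case it helps with diagnosis, here is a minimal sketch of a job script that could be submitted to check what a job actually sees on the node. It was not part of the setup above: the job name gpu-check and the two function names are made up for illustration, and it assumes that pbs_mom exports PBS_GPUFILE for jobs that request GPUs and that nvidia-smi is in the PATH on the compute node.

```shell
#!/bin/sh
# Hypothetical diagnostic job script (illustration only, not from my setup).
#PBS -N gpu-check
#PBS -l nodes=1:ppn=1:gpus=1

# Print the GPUs Torque assigned to this job, if the GPU file is available.
show_assigned_gpus() {
    if [ -n "${PBS_GPUFILE:-}" ] && [ -r "$PBS_GPUFILE" ]; then
        echo "GPUs assigned by Torque:"
        cat "$PBS_GPUFILE"
    else
        echo "PBS_GPUFILE is not set or not readable"
    fi
}

# Print compute mode and utilization per GPU, if nvidia-smi is installed.
show_gpu_state() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=index,name,compute_mode,utilization.gpu --format=csv
    else
        echo "nvidia-smi not found in PATH"
    fi
}

show_assigned_gpus
# May legitimately be unset, depending on how the MOM is configured.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
show_gpu_state
```

Submitting this with qsub and reading the job's output file should show whether a GPU was assigned to the job and which compute mode the GPU is in while the job runs.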
Please help me!
Thank you in advance.