[CUDA] Installing cuda_10.* for Linux

  •  Tested environment

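CentOS 7 (1810)
CUDA 10.1.168
X and GCC are assumed to be installed already.
The CUDA installer has changed from the one used up to CUDA 9.
I ran it the same way I did for CUDA 9, so the steps here may differ from the manual. orz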
  •  Change the user mode (default target)

The init 3 command also works (note that a reboot brings the system back to runlevel 5).

# systemctl set-default multi-user.target ← Sets the default user mode (target) to multi-user. It takes effect at the next boot.
Removed symlink /etc/systemd/system/default.target.
Created symlink from /etc/systemd/system/default.target to /usr/lib/systemd/system/multi-user.target.
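
The init 3 note and the systemctl command above describe the same switch under two naming schemes: on a systemd system such as CentOS 7, runlevel 3 corresponds to multi-user.target and runlevel 5 to graphical.target. The sketch below spells out that mapping. The function name target_for_runlevel is made up for illustration, and the systemctl lines are left as comments because they need root.

```shell
#!/bin/sh
# Hypothetical helper: map a SysV runlevel number to the systemd target
# that replaces it on CentOS 7.
target_for_runlevel() {
    case "$1" in
        3) echo "multi-user.target" ;;   # text mode, no X
        5) echo "graphical.target" ;;    # graphical login
        *) echo "unsupported runlevel: $1" >&2; return 1 ;;
    esac
}

# Example (as root): switch for the current boot only,
# which is roughly what "init 3" does:
#   systemctl isolate "$(target_for_runlevel 3)"
# Show the target that will be used at the next boot:
#   systemctl get-default
```

set-default only changes what the next boot starts, while isolate changes the running system, so which one fits depends on whether a reboot is planned anyway.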

  • Disable the nouveau driver for NVIDIA GPUs that is installed by default

# lsmod | grep nouveau ← Checks whether the driver is loaded. (If anything is listed, it is.)
nouveau 1869689 6
video 24538 1 nouveau
mxm_wmi 13021 1 nouveau
i2c_algo_bit 13413 1 nouveau
drm_kms_helper 179394 1 nouveau
ttm 114635 1 nouveau
drm 429744 7 ttm,drm_kms_helper,nouveau
wmi 21636 4 hp_wmi,mxm_wmi,nouveau,intel_wmi_thunderbolt
# 
↑ If the nouveau driver is loaded, the CUDA installation runs into trouble, so disable it.

# vi /etc/modprobe.d/blacklist-nouveau.conf ← Create the blacklist file.
# cat /etc/modprobe.d/blacklist-nouveau.conf ← Contents of the blacklist file.
blacklist nouveau
options nouveau modeset=0
#
# dracut --force ← Rebuilds the initramfs so that nouveau is no longer loaded at boot.
# reboot ← Reboot.
# lsmod | grep nouveau ← Confirm that nothing is listed.
#
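
The blacklist step above can also be scripted instead of typed into vi. This is only a sketch: the function names write_nouveau_blacklist and module_listed are my own, and the target directory is a parameter so that the function can be tried on a scratch directory before pointing it at /etc/modprobe.d.

```shell
#!/bin/sh
# Write the two-line blacklist file shown above into directory $1.
# On a real system the directory is /etc/modprobe.d and root is required.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' \
        > "$1/blacklist-nouveau.conf"
}

# Succeed if module $1 appears in lsmod-style text on stdin.
# Only the first column (the module name) is compared.
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit found ? 0 : 1 }'
}

# Example (as root):
#   write_nouveau_blacklist /etc/modprobe.d
#   dracut --force && reboot
#   ... after the reboot:
#   lsmod | module_listed nouveau && echo "nouveau is still loaded"
```

Comparing the first column is a little stricter than grep nouveau, which also matches modules such as video that merely list nouveau as a user.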
  • Install CUDA

# ./cuda_10.1.168_418.67_linux.run --help ← Have a look at the help first.
Options:
--silent
Performs an installation with no further user-input and minimal
command-line output based on the options provided below. Silent
installations are useful for scripting the installation of CUDA.
Using this option implies acceptance of the EULA. The following flags
can be used to customize the actions taken during installation. At
least one of --driver, --uninstall, --toolkit, and --samples must
be passed if running with non-root permissions.

--driver
Install the CUDA Driver.

--toolkit
Install the CUDA Toolkit.

--toolkitpath=<path>
Install the CUDA Toolkit to the <path> directory. If this flag is not
provided, the default path of /usr/local/cuda-10.1 is used.

--samples
Install the CUDA Samples.

--samplespath=<path>
Install the CUDA Samples to the <path> directory. If this flag is not
provided, the default path of /root/NVIDIA_CUDA-10.1_Samples is used.

--librarypath=<path>
Install libraries to the <path> directory. If this flag is not provided,
the default path of your distribution is used. This flag only applies to
libraries installed outside of the CUDA Toolkit path.

--installpath=<path>
Install everything to the <path> directory. This flag sets the same values
as the toolkitpath, samplespath, and librarypath options.

--extract=<path>
Extracts driver runfile and the raw files of the toolkit and samples to
<path>.

This is especially useful when one wants to install the driver using one or
more of the command-line options provided by the driver installer which
are not exposed in this installer.

--override
Ignores compiler version checks which would prevent installation.

--no-opengl-libs
Prevents the driver installation from installing NVIDIA's GL libraries.
Useful for systems where the display is driven by a non-NVIDIA GPU.
In such systems, NVIDIA's GL libraries could prevent X from loading
properly.

--no-man-page
Do not install the man pages under /usr/share/man.

--kernel-source-path=<path>
Tells the driver installation to use <path> as the kernel source directory
when building the NVIDIA kernel module. Required for systems where the
kernel source is installed to a non-standard location.

--run-nvidia-xconfig
Tells the driver installation to run nvidia-xconfig to update the system
X configuration file so that the NVIDIA X driver is used. The pre-existing
X configuration file will be backed up.

This option should not be used on systems that require a custom
X configuration, or on systems where a non-NVIDIA GPU is rendering the
display.

--no-drm
Do not install the nvidia-drm kernel module. This kernel module provides
several features, including X11 autoconfiguration, support for PRIME, and
DRM-KMS. The latter is used to support modesetting on windowing systems
that run independently of X11. The '--no-drm' option should only be used
to work around failures to build or install the nvidia-drm kernel module
on systems that do not need these features.

--tmpdir=<path> ← Specifies the temporary area used by the installer.
Performs any temporary actions within <path> instead of /tmp. Useful in
cases where /tmp cannot be used (doesn't exist, is full, is mounted with
'noexec', etc.).

--help
Prints this help message.
#
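
The help text above also documents --silent, which skips the menus and implies acceptance of the EULA. I did not use it in this walkthrough, so treat the following as an untested sketch built only from the flags listed in that help output. The function silent_install_cmd is a made-up name, and it only prints the command line so that it can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not run) a non-interactive installer command line.
#   $1: path to the runfile
#   $2: directory for --tmpdir (optional)
silent_install_cmd() {
    cmd="$1 --silent --driver --toolkit --samples"
    if [ -n "${2:-}" ]; then
        cmd="$cmd --tmpdir=$2"
    fi
    echo "$cmd"
}

# Example: review the command, then run it as root.
#   silent_install_cmd ./cuda_10.1.168_418.67_linux.run /home
```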
Note: If storage is short, use the --tmpdir option.
 (The installation fails if the temporary area is too small.)
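
To pick a directory for --tmpdir, it helps to compare the free space of the candidates first. A minimal sketch using df follows. The helper names free_kb and has_free_kb are hypothetical, and I am not stating how much space the installer actually needs, so the threshold is left to the caller.

```shell
#!/bin/sh
# Print the free space, in KiB, of the filesystem holding directory $1.
free_kb() {
    # -P forces one output line per filesystem; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 KiB free.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Example: compare candidates before choosing a --tmpdir value.
#   free_kb /tmp
#   free_kb /home
```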

# ./cuda_10.1.168_418.67_linux.run --tmpdir=/home ← Run the installer.


┌──────────────────────────────────────────────────────────────────────────────┐
│ End User License Agreement                                                   │
│ --------------------------                                                   │
│                                                                              │
│ NVIDIA Software License Agreement and CUDA Supplement to                     │
│ Software License Agreement.                                                  │
│                                                                              │
│                                                                              │
│ Preface                                                                      │
│ -------                                                                      │
│                                                                              │
│ The Software License Agreement in Chapter 1 and the Supplement               │
│ in Chapter 2 contain license terms and conditions that govern                │
│ the use of NVIDIA software. By accepting this agreement, you                 │
│ agree to comply with all the terms and conditions applicable                 │
│ to the product(s) included herein.                                           │
│                                                                              │
│                                                                              │
│ NVIDIA Driver                                                                │
│                                                                              │
│                                                                              │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit):                         │
│

↑ A menu is displayed; answer the prompts as they appear.


Do you accept the above EULA? (accept/decline/quit):
accept ← Type accept at the same EULA prompt.


┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer                                                               │
│ - [X] Driver                                                                 │
│       [X] 418.67                                                             │
│ + [X] CUDA Toolkit 10.1                                                      │
│   [X] CUDA Samples 10.1                                                      │
│   [X] CUDA Demo Suite 10.1                                                   │
│   [X] CUDA Documentation 10.1                                                │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │

↑ Move through the menu with the arrow keys and select the options you want.

↑ Finally, select Install and press Enter to run the installation.
#

===========
= Summary =
===========

Driver: Installed
Toolkit: Installed in /usr/local/cuda-10.1/
Samples: Installed in /root/

Please make sure that
- PATH includes /usr/local/cuda-10.1/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.1/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.1/doc/pdf for detailed information on setting up CUDA.
Logfile is /var/log/cuda-installer.log
#

↑ The installation succeeded.
Note: If it fails, fix the errors reported in /var/log/cuda-installer.log and rerun the installer.
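
When the installer fails, the log can be long, so filtering it is a reasonable first step. The sketch below is an assumption on my part: I have not checked the exact wording the installer uses in its log, so the keywords error and fail are a guess, and show_install_errors is a made-up name.

```shell
#!/bin/sh
# Print lines of log file $1 that mention an error or a failure,
# prefixed with their line numbers. The keywords are a guess;
# the real log wording may differ.
show_install_errors() {
    grep -inE 'error|fail' "$1"
}

# Example:
#   show_install_errors /var/log/cuda-installer.log
```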
  • Post-installation steps

# ls /etc/ld.so.conf.d/ ← Look at the library configuration.
cuda-10-1.conf kernel-3.10.0-957.el7.x86_64.conf mariadb-x86_64.conf
dyninst-x86_64.conf libiscsi-x86_64.conf qt-x86_64.conf
# cat /etc/ld.so.conf.d/cuda-10-1.conf ← Check the contents of the configuration file.
/usr/local/cuda-10.1/targets/x86_64-linux/lib
#
# ldconfig ← Update the library cache.
#

Note: With earlier CUDA versions, I did the part above by hand.
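
The installer summary also asks that PATH include /usr/local/cuda-10.1/bin, which this walkthrough does not set anywhere. A sketch of one way to do it without adding the directory twice is below. The helper name prepend_path_once is hypothetical, and putting these lines into a shell startup file such as ~/.bashrc is an assumption about your setup, not something the installer does.

```shell
#!/bin/sh
# Prepend directory $1 to PATH unless it is already listed.
prepend_path_once() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, do nothing
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path_once /usr/local/cuda-10.1/bin

# Quick checks once the toolkit is installed:
#   nvcc --version
#   ldconfig -p | grep libcudart
```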

  • Verification

#
# nvidia-smi ← Check that the GPU is visible.
Tue Jun 11 16:26:30 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P400         Off  | 00000000:2D:00.0 Off |                  N/A |
| 27%   42C    P0    N/A /  N/A |      0MiB /  1993MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
+-----------------------------------------------------------------------------+
#

↑ The Quadro P400 is visible, so the installation succeeded.
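
Reading the table by eye is fine for one machine, but the same check can be scripted with the query mode of nvidia-smi, which prints one CSV line per GPU. In this sketch the function names list_gpus and csv_has_gpu are my own, and the GPU name is whatever you expect to find (Quadro P400 in my case).

```shell
#!/bin/sh
# List detected GPUs as "name, driver_version" CSV lines.
list_gpus() {
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Succeed if the CSV on stdin contains a GPU whose name equals $1.
csv_has_gpu() {
    awk -F', ' -v want="$1" '$1 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Example:
#   list_gpus | csv_has_gpu "Quadro P400" && echo "GPU is visible"
```

Keeping the parsing separate from the nvidia-smi call means the check can be tried on sample text on a machine without a GPU.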

#
# systemctl set-default graphical.target ← Restore the default user mode.
Removed symlink /etc/systemd/system/default.target.
Created symlink from /etc/systemd/system/default.target to /usr/lib/systemd/system/graphical.target.
#