The server runs CentOS 7.9.
Installation order:
1. Install the OS.
2. Install the InfiniBand driver from the Lustre website (the original MLNX driver has to be uninstalled first; see the sketch after this list).
If ib0 does not show up:
sudo modprobe -rv ib_isert rpcrdma ib_srpt
sudo service openibd start
3. Install the kernel downloaded from the Lustre website.
4. Install the Lustre server packages.
Along the way an error about the ZFS OSD is reported (ignore it, since ldiskfs is being used).
The libmpi.so.12 dependency can also be forced past (see the sketch below).
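A minimal sketch of steps 2 and 4, assuming the stock MLNX OFED was installed with its bundled installer and the Lustre server packages are local RPMs (the script path and package names are assumptions, adjust to your downloads):
/usr/sbin/ofed_uninstall.sh        # remove the stock MLNX OFED before installing the driver from the Lustre site
rpm -ivh --nodeps kmod-lustre-*.rpm lustre-*.rpm        # --nodeps forces past the zfs-osd and libmpi.so.12 complaints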
If the following appears while installing InfiniBand:
Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx4_ib belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx4_en belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx5_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx5_ib belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx5_fpga_tools does not exist, skipping… [FAILED]
Module ib_umad belong to kernel which is not a part of MLNX[FAILED] skipping…
Module ib_uverbs belong to kernel which is not a part of ML[FAILED] skipping…
Module ib_ipoib belong to kernel which is not a part of MLN[FAILED]skipping…
Loading HCA driver and Access Layer:                        [  OK  ]
Module rdma_cm belong to kernel which is not a part of MLNX[FAILED]skipping…
Module ib_ucm does not exist, skipping…                     [FAILED]
Module rdma_ucm belong to kernel which is not a part of MLN[FAILED]skipping…
There are two potential solutions.
As a workaround for the Bright packages, perform the following:
(1). In /etc/init.d/openibd, on line 132, change FORCE=0 to FORCE=1. This causes openibd to ignore the kernel difference but relies on weak-updates.
(2). Edit /etc/infiniband/openib.conf and set UCM_LOAD=no and MLX5_FPGA_LOAD=no. Since most users are not using legacy cards or FPGAs, this should not be an issue.
(3). Restart the openibd service.
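As a concrete sketch of those three steps (the file paths and settings are taken from the workaround above; the exact sed commands are my own):
sed -i 's/^FORCE=0/FORCE=1/' /etc/init.d/openibd
sed -i -e 's/^UCM_LOAD=.*/UCM_LOAD=no/' -e 's/^MLX5_FPGA_LOAD=.*/MLX5_FPGA_LOAD=no/' /etc/infiniband/openib.conf
service openibd restart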
Once complete, the Mellanox OFED modules should load as expected.
# service openibd start
Loading HCA driver and Access Layer:                        [  OK  ]
My servers (two I/O servers) each have two network cards: a 10GbE fiber card that connects the original 20 machines, and an HCA card that connects the 10 later-purchased servers over the InfiniBand network. The servers provide file service to the machines on both networks at the same time.
My /etc/modprobe.d/lustre.conf looks like this:
options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"
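To confirm that all three LNet networks come up with the expected NIDs, a quick check like this can be done (not part of the original notes):
modprobe lnet
lctl network up
lctl list_nids        # expect one NID each on tcp0, o2ib0 and tcp1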
Although no network actually uses tcp0(ens2f0), it still has to be listed. tcp1 is the interface of my fiber card. The solution I found online is described as follows:
Lustre:
In recent Lustre releases, certain filesystems cannot be mounted because of a communication error between clients and servers, depending on the LNet configuration.
Suppose a filesystem runs on a host with two interfaces, say tcp0 and tcp1, and the devices are set up to reply on both interfaces (formatted with --servicenode IP1@tcp0,IP2@tcp1).
If a client connected only to tcp0 tries to mount this filesystem, the mount fails with an I/O error because the client tries to connect over the tcp1 interface.
Mount failed:
# mount -t lustre x.y.z.a@tcp:/lustre /mnt/lustre
mount.lustre: mount x.y.z.a@tcp:/lustre at /mnt/client failed: Input/output error
Is the MGS running?
dmesg shows that communication fails using the wrong IP
[422880.743179] LNetError: 19787:0:(lib-move.c:1714:lnet_select_pathway()) no route to a.b.c.d@tcp1
# lnetctl peer show
peer:
- primary nid: a.b.c.d@tcp1
Multi-Rail: False
peer ni:
- nid: x.y.z.a@tcp
state: NA
- nid: 0@<0:0>
state:
Ping is OK though:
# lctl ping x.y.z.a@tcp
12345-0@lo
12345-a.b.c.d@tcp1
12345-x.y.z.a@tcp
server (lustre-2.10.4-1.el7.x86_64):
options lnet networks=tcp0(en0),o2ib0(in0)
client (lustre-2.12.5-RC1-0.el7.x86_64):
options lnet networks="o2ib(ib0)"
These two workarounds seem to work (only very limited testing so far):
Configuring LNET tcp on the client (although I actually only want to use IB):
options lnet networks="o2ib(ib0),tcp(enp3s0f0)"
Executing this before the actual Lustre mount:
lnetctl set discovery 0
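To make that discovery setting survive a reboot, one option (my assumption, not from the quoted post) is to set the corresponding lnet module parameter, or to export the live configuration for the lnet service to re-import:
echo 'options lnet lnet_peer_discovery_disabled=1' >> /etc/modprobe.d/lustre.conf
# or: lnetctl export > /etc/lnet.conf && systemctl enable lnet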
My MDT filesystem after configuration is as follows:
tunefs.lustre /dev/sdb1
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: lustre-MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x5
(MDT MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
Permanent disk data:
Target: lustre-MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x5
(MDT MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
The OST after configuration is as follows:
tunefs.lustre /dev/sdb2
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: lustre-OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.10.1.101@tcp
Permanent disk data:
Target: lustre-OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.10.1.101@tcp
(Here 10.10.1.101 is the IP of my fiber card.)
/etc/rc.local is as follows:
modprobe lnet
modprobe lustre
mount -t lustre /dev/sdb1 /mdt
mount -t lustre /dev/sdb2 /ost0
mount -t lustre /dev/sdb3 /ost1
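To check that the targets actually mounted after boot, a quick look on the server is enough (not in the original notes):
mount -t lustre        # lists the mounted Lustre targets
lctl dl                # lists the local MDT/OST devices and their state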
On the second I/O server:
cat /etc/modprobe.d/lustre.conf
options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"
tunefs.lustre
checking for existing Lustre data: found
Reading CONFIGS/mountdata
Read previous values:
Target: lustre-OST0002
Index: 2
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.10.1.101@tcp
Permanent disk data:
Target: lustre-OST0002
Index: 2
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.10.1.101@tcp
On the clients using the fiber card:
modprobe lnet
modprobe lustre
mount -t lustre fio01@tcp:/lustre /lustre
Here fio01 is 10.10.1.101.
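The name fio01 only needs to resolve to that address on the client, for example with an /etc/hosts entry (my assumption; any resolver works):
10.10.1.101   fio01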
On the clients using the HCA card:
modprobe lnet
modprobe lustre
lnetctl set discovery 0
mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre
umount -l /lustre
mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre
Here 11.11.1.101 is the IP of the HCA on the server-side MDS.
(Oddly, the filesystem has to be unmounted once and mounted again before access becomes stable.)
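When looking into that oddity, the LNet view before and after the remount can be compared with the same tools used earlier (not part of the original notes):
lctl ping 11.11.1.101@o2ib0
lnetctl peer show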