
My Lustre Installation Notes

2022-11-25 | Research notes

The servers run CentOS 7.9.

Installation order:

1. Install the OS.

2. Install the InfiniBand driver from the Lustre website (the stock MLNX driver needs to be uninstalled first).

If ib0 does not show up:

sudo modprobe -rv ib_isert rpcrdma ib_srpt

sudo service openibd start
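If the driver came back up correctly, the interface and port state can be checked with standard tools (ip from iproute2, ibstat from infiniband-diags):

ip link show ib0     # the IPoIB interface should now exist
ibstat               # port state should be Active, physical state LinkUp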

3. Install the kernel downloaded from the Lustre website.

4. Install the Lustre server packages.

An error about the zfs OSD will appear along the way (I installed the ldiskfs backend, so it can be ignored).

The libmpi.so.12 dependency error can likewise be forced through.
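For reference, steps 3-4 boil down to commands like the following (a sketch; the exact RPM names depend on the Lustre release you download, and the --nodeps force-install is one way past the libmpi.so.12 complaint):

rpm -ivh kernel-*_lustre*.rpm        # patched kernel from the Lustre download area
reboot                               # boot into the new kernel first
yum localinstall kmod-lustre-*.rpm kmod-lustre-osd-ldiskfs-*.rpm \
    lustre-osd-ldiskfs-mount-*.rpm lustre-*.rpm
# if the libmpi.so.12 dependency blocks a package, force past it:
rpm -ivh --nodeps lustre-*.rpm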


If the following appears while installing InfiniBand:

Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx4_ib belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx4_en belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx5_core belong to kernel which is not a part of ML[FAILED] skipping…

Module mlx5_ib belong to kernel which is not a part of MLNX[FAILED] skipping…

Module mlx5_fpga_tools does not exist, skipping… [FAILED]

Module ib_umad belong to kernel which is not a part of MLNX[FAILED] skipping…

Module ib_uverbs belong to kernel which is not a part of ML[FAILED] skipping…

Module ib_ipoib belong to kernel which is not a part of MLN[FAILED] skipping…

Loading HCA driver and Access Layer:                       [  OK  ]

Module rdma_cm belong to kernel which is not a part of MLNX[FAILED] skipping…

Module ib_ucm does not exist, skipping…                   [FAILED]

Module rdma_ucm belong to kernel which is not a part of MLN[FAILED] skipping…

As a workaround (documented for the Bright packages, but it applied to my setup as well), perform the following two edits and then restart the service:

(1). In /etc/init.d/openibd, on line 132, change FORCE=0 to FORCE=1. This makes openibd ignore the kernel difference, relying on weak-updates instead.

(2). Edit /etc/infiniband/openib.conf and set UCM_LOAD=no and MLX5_FPGA_LOAD=no. Since most setups use neither legacy cards nor FPGAs, this should not be an issue.

(3). Restart the openibd service.
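Applied as commands, this looks roughly like the following (a sketch; verify that the FORCE line in your copy of /etc/init.d/openibd matches before trusting the sed):

sed -i 's/^FORCE=0/FORCE=1/' /etc/init.d/openibd       # edit (1)
sed -i -e 's/^UCM_LOAD=.*/UCM_LOAD=no/' \
       -e 's/^MLX5_FPGA_LOAD=.*/MLX5_FPGA_LOAD=no/' /etc/infiniband/openib.conf   # edit (2)
service openibd restart                                # step (3)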

Once complete, the Mellanox OFED modules should load as expected.

# service openibd start

Loading HCA driver and Access Layer:                       [  OK  ]

My two I/O servers each have two network cards: a 10 GbE fibre card that connects the original 20 machines, and an HCA card that connects the 10 servers bought later over InfiniBand. Each server provides file service to machines on both networks at once.

My /etc/modprobe.d/lustre.conf looks like this:

options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"

Even though no network will actually use tcp0(ens2f0), it still has to be included. tcp1 is my fibre card interface. The solution I found online describes the problem as follows:

Lustre:  

In recent Lustre releases, certain filesystems cannot be mounted due to a communication error between clients and servers, depending on the LNet configuration.

Suppose a filesystem runs on a host with two interfaces, say tcp0 and tcp1, and the devices are set up to reply on both interfaces (formatted with --servicenode IP1@tcp0,IP2@tcp1).

If a client connected only to tcp0 tries to mount this filesystem, it fails with an I/O error because it tries to connect over the tcp1 interface.

Mount failed:

# mount -t lustre x.y.z.a@tcp:/lustre /mnt/lustre

mount.lustre: mount x.y.z.a@tcp:/lustre at /mnt/client failed: Input/output error

Is the MGS running?

dmesg shows that communication fails using the wrong IP

[422880.743179] LNetError: 19787:0:(lib-move.c:1714:lnet_select_pathway()) no route to a.b.c.d@tcp1

# lnetctl peer show
peer:
    - primary nid: a.b.c.d@tcp1
      Multi-Rail: False
      peer ni:
        - nid: x.y.z.a@tcp
          state: NA
        - nid: 0@<0:0>
          state:

Ping is OK though:

# lctl ping x.y.z.a@tcp
12345-0@lo
12345-a.b.c.d@tcp1
12345-x.y.z.a@tcp

server (lustre-2.10.4-1.el7.x86_64):

options lnet networks=tcp0(en0),o2ib0(in0)

client (lustre-2.12.5-RC1-0.el7.x86_64):

options lnet networks="o2ib(ib0)"

These two workarounds seem to work (only very limited testing so far):

Configuring LNET tcp on the client (although I actually only want to use IB):

options lnet networks="o2ib(ib0),tcp(enp3s0f0)" 

Executing this before the actual Lustre mount:

lnetctl set discovery 0
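Either way, it is worth confirming which NIDs LNet actually brings up after loading the module (standard Lustre utilities):

modprobe lnet
lctl network up          # configure the networks listed in lustre.conf
lctl list_nids           # on my servers: one NID each for tcp0, o2ib0 and tcp1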


My MDT filesystem after configuration:

tunefs.lustre  /dev/sdb1

checking for existing Lustre data: found

Reading CONFIGS/mountdata

   Read previous values:

Target:     lustre-MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x5

              (MDT MGS )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:

   Permanent disk data:

Target:     lustre-MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x5

              (MDT MGS )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:
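For reference, a combined MGS/MDT with these values would have been formatted with something along these lines (a sketch, not necessarily the exact command I ran):

mkfs.lustre --fsname=lustre --mgs --mdt --index=0 /dev/sdb1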

The OST after configuration:

tunefs.lustre  /dev/sdb2

checking for existing Lustre data: found

Reading CONFIGS/mountdata


   Read previous values:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

   Permanent disk data:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

(10.10.1.101 is the IP of my fibre card.)
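An OST with these values corresponds to a format command roughly like (again a sketch):

mkfs.lustre --fsname=lustre --ost --index=0 --mgsnode=10.10.1.101@tcp /dev/sdb2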

/etc/rc.local is as follows:

modprobe lnet

modprobe lustre

mount -t lustre /dev/sdb1 /mdt

mount -t lustre /dev/sdb2 /ost0

mount -t lustre /dev/sdb3 /ost1
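The MDT must be mounted before the OSTs. After boot, a quick sanity check (standard Lustre tooling):

lctl dl        # lists local Lustre devices; the MGS/MDT and both OSTs should appear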

On the second I/O server:

cat /etc/modprobe.d/lustre.conf 

options lnet networks="tcp0(ens2f0),o2ib0(ib0),tcp1(enp61s0f0)"


tunefs.lustre

checking for existing Lustre data: found

Reading CONFIGS/mountdata

   Read previous values:

Target:     lustre-OST0002

Index:      2

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp

   Permanent disk data:

Target:     lustre-OST0002

Index:      2

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x2

              (OST )

Persistent mount opts: ,errors=remount-ro

Parameters: mgsnode=10.10.1.101@tcp
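The OSTs on this server are then mounted the same way as on the first (the device path below is a placeholder; substitute the actual partition):

mount -t lustre /dev/sdb1 /ost2     # placeholder device path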

On the clients attached via the fibre card:

modprobe lnet

modprobe lustre

mount -t lustre fio01@tcp:/lustre /lustre

(Here fio01 resolves to 10.10.1.101.)
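To make the mount persistent, a standard Lustre client fstab entry works (_netdev delays the mount until the network is up):

fio01@tcp:/lustre  /lustre  lustre  defaults,_netdev  0 0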

On the clients attached via the HCA card:

modprobe lnet

modprobe lustre

lnetctl set discovery 0

mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre

umount -l /lustre

mount -t lustre 11.11.1.101@o2ib0:/lustre /lustre

Here 11.11.1.101 is the IP of the HCA on the MDS server.

(Strangely, the filesystem has to be unmounted once and remounted before it can be accessed reliably.)
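When debugging mounts like this, checking LNet reachability to the MDS first can save time (standard Lustre commands):

lctl ping 11.11.1.101@o2ib0     # LNet-level ping of the MDS over InfiniBand
lnetctl net show                # confirm the local o2ib0 interface is up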


