Trying to automate NVMe over TCP rediscovery with .service and .timer

Hello everyone,

I first tried this with cron, successfully, as follows:
crontab -e
@reboot sleep seconds in XX format && /usr/sbin/modprobe nvme_tcp && /usr/sbin/nvme discover --transport=tcp --traddr=storage IP --host-traddr=host IP --trsvcid=8009 && /usr/sbin/nvme connect-all

Now I am struggling to do it with .service and .timer.

.service file contents:
[Unit]
Description=Nvme over TCP connection restore after reboot

[Service]
Type=oneshot
ExecStart=/usr/bin/bash /nvmeovertcp_rediscovery.sh

[Install]
WantedBy=multi-user.target

.timer file contents:
[Unit]
Description=timer for nvmeovertcp service

[Timer]
OnBootSec=15sec
Unit=nvmeovertcp.service

[Install]
WantedBy=timers.target

Script file contents:

#!/bin/bash
modprobe nvme_tcp
nvme discover --transport=tcp --traddr=storage IP --host-traddr=host IP --trsvcid=8009
nvme connect-all

The .service fails and I have no idea why. The script runs fine when I start it manually.
Could you please help?
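
One way to narrow down which of the three commands fails under systemd is to log each command's exit status. The following is only a diagnostic sketch, not a fix: `run_step`, `rediscover`, the `run` argument and the `STORAGE_IP` / `HOST_IP` variables are hypothetical names standing in for the placeholders in the script above, and the `/usr/sbin` paths are taken from the cron line.

```shell
#!/bin/bash
# Diagnostic sketch: print the exit status of every step to stderr so that
# the failing command is visible in the journal.
# run_step is a hypothetical helper, not part of nvme-cli or systemd.
run_step() {
    "$@"
    local rc=$?
    echo "step '$*' exited with status $rc" >&2
    return "$rc"
}

# Same three commands as the original script, stopping at the first failure.
# STORAGE_IP and HOST_IP are assumed variable names for the "storage IP" and
# "host IP" placeholders; the script aborts if they are not set.
rediscover() {
    : "${STORAGE_IP:?set STORAGE_IP to the storage address}"
    : "${HOST_IP:?set HOST_IP to the host address}"
    run_step /usr/sbin/modprobe nvme_tcp || return
    run_step /usr/sbin/nvme discover --transport=tcp \
        --traddr="$STORAGE_IP" --host-traddr="$HOST_IP" --trsvcid=8009 || return
    run_step /usr/sbin/nvme connect-all
}

# Only act when called with the (assumed) argument "run", for example
# ExecStart=/usr/bin/bash /nvmeovertcp_rediscovery.sh run
if [[ "${1:-}" == "run" ]]; then
    rediscover
fi
```

Because the exit status of a bash script is that of the last command it ran, the unit inherits whatever `nvme connect-all` returns in the original script. With the wrapper, the journal shows one "exited with status" line per command.
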

The status output is as follows:
-- Unit nvmeovertcp.service has begun starting up.
Mar 30 03:53:02 rhel kernel: nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
Mar 30 03:53:02 rhel kernel: nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.6.101:8009
Mar 30 03:53:02 rhel kernel: nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
Mar 30 03:53:02 rhel bash[1900]: Discovery Log Number of Records 2, Generation counter 24
Mar 30 03:53:02 rhel bash[1900]: =====Discovery Log Entry 0======
Mar 30 03:53:02 rhel bash[1900]: trtype: tcp
Mar 30 03:53:02 rhel bash[1900]: adrfam: ipv4
Mar 30 03:53:02 rhel bash[1900]: subtype: unrecognized
Mar 30 03:53:02 rhel bash[1900]: treq: not specified
Mar 30 03:53:02 rhel bash[1900]: portid: 0
Mar 30 03:53:02 rhel bash[1900]: trsvcid: 8009
Mar 30 03:53:02 rhel bash[1900]: subnqn: nqn.1992-08.com.netapp:sn.8d916f8b90be11ed927fd039ea29162b:discovery
Mar 30 03:53:02 rhel bash[1900]: traddr: 192.168.6.101
Mar 30 03:53:02 rhel bash[1900]: sectype: none
Mar 30 03:53:02 rhel bash[1900]: =====Discovery Log Entry 1======
Mar 30 03:53:02 rhel bash[1900]: trtype: tcp
Mar 30 03:53:02 rhel bash[1900]: adrfam: ipv4
Mar 30 03:53:02 rhel bash[1900]: subtype: nvme subsystem
Mar 30 03:53:02 rhel bash[1900]: treq: not specified
Mar 30 03:53:02 rhel bash[1900]: portid: 0
Mar 30 03:53:02 rhel bash[1900]: trsvcid: 4420
Mar 30 03:53:02 rhel bash[1900]: subnqn: nqn.1992-08.com.netapp:sn.8d916f8b90be11ed927fd039ea29162b:subsystem.RHEL_87
Mar 30 03:53:02 rhel bash[1900]: traddr: 192.168.6.101
Mar 30 03:53:02 rhel bash[1900]: sectype: none
Mar 30 03:53:05 rhel bash[1906]: Failed to write to /dev/nvme-fabrics: Connection timed out
Mar 30 03:53:05 rhel kernel: nvme nvme0: failed to connect socket: -110
Mar 30 03:53:05 rhel kernel: nvme nvme0: queue_size 128 > ctrl sqsize 32, clamping down
Mar 30 03:53:05 rhel kernel: nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.6.101:8009
Mar 30 03:53:05 rhel kernel: nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
Mar 30 03:53:05 rhel bash[1906]: skipping unsupported subtype 3
Mar 30 03:53:05 rhel kernel: nvme nvme0: creating 2 I/O queues.
Mar 30 03:53:05 rhel kernel: nvme nvme0: mapped 2/0/0 default/read/poll queues.
Mar 30 03:53:05 rhel kernel: nvme nvme0: new ctrl: NQN "nqn.1992-08.com.netapp:sn.8d916f8b90be11ed927fd039ea29162b:subsystem.RHEL_87", addr 192.168.6.101:4420
Mar 30 03:53:05 rhel systemd[1]: nvmeovertcp.service: Main process exited, code=exited, status=146/n/a
Mar 30 03:53:05 rhel systemd[1]: nvmeovertcp.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd

-- Support: https://access.redhat.com/support

-- The unit nvmeovertcp.service has entered the 'failed' state with result 'exit-code'.
Mar 30 03:53:05 rhel systemd[1]: Failed to start Nvme over TCP connection restore after reboot.
-- Subject: Unit nvmeovertcp.service has failed
-- Defined-By: systemd

-- Support: https://access.redhat.com/support

-- Unit nvmeovertcp.service has failed.
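
A note on reading the output above: the unit's `status=146` and the kernel line `failed to connect socket: -110` may be the same error. A process exit status is only 8 bits wide, so a return value of -110 would show up as 256 - 110 = 146, and errno 110 on Linux is ETIMEDOUT ("Connection timed out"). That nvme-cli passed the negative errno through as its exit code is an assumption; the log does not confirm it. The arithmetic can be checked like this:

```shell
#!/bin/bash
# Map an 8-bit exit status back to the negative errno it may have come from.
status=146                  # from "status=146/n/a" in the journal above
errno=$(( 256 - status ))   # a return value of -N is truncated to 256 - N
echo "exit status $status corresponds to errno $errno"
```

If that reading is right, the service is marked failed because one connection attempt during `nvme connect-all` timed out, even though the log shows the RHEL_87 subsystem being connected afterwards.
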

Responses