Extending Striped LVM on XFS

Hello,
I was wondering what the best practice would be to grow a striped logical volume. The server is a VM and its striped logical volume consists of 4 RDMs.
My understanding would be to:
change the RDM size at the VMware level
rescan SCSI on the server (so a reboot is not required)
perform vgextend for the newly presented size
do lvextend
then run xfs_growfs on the striped logical volume

Or do I have to create 4 new separate RDMs of the same size as the originals:
add these into volume group
extend volume group
extend logical volume
grow xfs

I will be testing this but would like other people's experiences on what worked best.
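In command form, the first option would look roughly like the sketch below. It is only an outline to test: the device names (sdb-sde), the volume group and logical volume names, and the mount point are placeholders, and it assumes the RDMs are whole-disk physical volumes with no partition table on them. Note that it is pvresize, rather than vgextend, that makes the volume group see space added to existing disks.

# echo rescan each resized device so the kernel sees the new RDM size
# for d in sdb sdc sdd sde; do echo 1 > /sys/block/$d/device/rescan; done
# echo let LVM pick up the larger physical volumes
# for d in sdb sdc sdd sde; do pvresize /dev/$d; done
# echo extend the striped LV, all 4 legs need the same amount of free space, -i 4 keeps the 4-way stripe explicit
# lvextend -i 4 -l +100%FREE /dev/yourvg/yourlv
# echo grow the mounted XFS file system online
# xfs_growfs /your/mountpoint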

Responses

Hi Paul Sweeney,

Edited in an attempt at more clarity.

If you decide to attempt this, open a case with Red Hat first and inform them of this discussion before proceeding.

By the way (I mean this kindly), it would help us to have more details, namely which partition and subdirectory are filling up for you. What would really help is to know which partition specifically is filling up (perhaps "/var", or maybe even "/" - hopefully the issue is not "/"). The next thing that would really help us is which directory under that filling-up partition is the biggest consumer.

Yes, you can grow a file system, but it is not always necessary to do so, especially with VMware. We often add an additional drive for the specific directory we find is overwhelming a particular partition.

In short, one of the main ways I fix this is to (in the VMware appliance graphical interface) add a virtual drive from your existing storage pool to the system that is having storage issues. Namely, add a new VMware drive to ultimately take the place of the filling-up subdirectory in the partition that is threatening to reach 100%. I would then partition the new drive, put a file system on it, and mount it temporarily at a non-persistent mount point such as "/tempmount_for_rsync". I'd shut off the service in question corresponding to the filled-up directory (Postgres database, Docker, Tomcat, whatever it is), and rsync from that filling-up directory to the drive temporarily mounted at /tempmount_for_rsync.
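As a rough illustration of that disk-preparation step (the device name /dev/sdf, the GPT label, and the mkfs defaults are only assumptions for the example; your new disk may well show up under another name):

# the host number below (host0) may differ on your system
# echo "- - -" > /sys/class/scsi_host/host0/scan
# parted -s /dev/sdf mklabel gpt mkpart primary xfs 1MiB 100%
# mkfs.xfs /dev/sdf1
# mkdir -p /tempmount_for_rsync
# mount /dev/sdf1 /tempmount_for_rsync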

Without the details, this is harder, so I'm going to use a common situation I used to see in my own environment at work. I had another admin tell me they wanted to grow /var, but after examination the real issue was /var/postgres-9-3, so we added a new 150G drive and mounted it at that location. We did not have to grow /var. As you read this reply, please substitute "/var/postgres-9-3" with the actual directory that is consuming whichever partition is causing you trouble.

Let's say the issue is "/var/postgres-9-3" filling up the partition "/var" (and hopefully not "/"). Start with the command df -PhT and look for the filled-up partition. You probably already know it (we just do not). Use the "cd" command and then the du command cited below in order to determine what sub-directory in the filled partition is giving you an issue. I have an example below.

An overview of the process would be:

  • Use the df command mentioned below to find the filling partition
[root@yoursystem ~] # df -PhT 
Filesystem                           Type        Size     Used      Avail    Use%     Mounted on
/dev/mapper/slash-var                  xfs          35.2G   34.2G      1G    98%      /var
  • Use the du command example below to uncover the filling sub-directory under that partition from above
# cd /var
# pwd
/var
# du -sk /var/* | sort -nr | head -15
32505856   /var/postgres-9-3
3145728    /var/log
<output truncated>
# mountpoint /var/postgres-9-3
/var/postgres-9-3 is not a mountpoint
# mountpoint /var
/var is a mountpoint
  • The above example shows /var/postgres-9-3 filling up /var
  • The above example shows /var is a mountpoint
  • The above example shows /var/postgres-9-3 is not a mountpoint, but it is the directory filling up /var

CONTINUED

  • Shut off the applicable service, whatever it happens to be, so you are not writing to that directory when you try to rsync it to the new location.
  • Add a virtual disk within the VMware interface, partition it, and use mkfs.xfs to give it a file system.
  • Mount the new file system at a temporary location solely for the purpose of running an rsync from the old location to the new location.
  • Give the new directory the same permissions and ownership as the directory you are copying FROM (and SELinux contexts too, if required) - you can use for i in chcon chown chmod; do $i --reference /var/postgres-9-3 /tempmount_for_rsync; done
  • If the file system that is giving you trouble is at 100% consumption, examine it with du -sk /path/to/source/* | sort -nr | head -15 and then, step by step, dig into the top result until perhaps you find something like a huge log file that you can truncate to zero bytes by redirecting cat /dev/null into it (this is just an example, because we do not know what service you are talking about here, or the specifics). In any case, appropriately resolve the 100%-full issue first, if applicable, prior to proceeding to the next step.
  • Once you have the service stopped, the file system no longer at 100% consumption, and the new file system ready to accept the data with the proper permissions above, then and only then proceed with the rest.
  • Run the rsync at least twice to verify there were no changes and that the copy went correctly: rsync -au --progress /var/postgres-9-3/ /tempmount_for_rsync/ - the trailing slashes are very important!!
  • Validate permissions, SELinux contexts and so forth.
  • Again, with the service feeding the source directory OFF, move the old directory out of the way, as in the example below. We used mv /var/postgres-9-3 /var/WASpostgres-9-3; mkdir -p /var/postgres-9-3/notmounted, which moved the old one out of the way and made a fresh new directory.
  • That new directory made above is where the new mountpoint will be placed after you have done the rsync mentioned above.
  • After you successfully rsync the data, make a backup copy of /etc/fstab with cp -v /etc/fstab{,.2020_02_19}, which will copy your /etc/fstab with a date stamp.
    Then edit your /etc/fstab to add the newly created storage cited previously, with the mount point set to the location of the offending directory with excessive use. Do not mount it yet.
  • After you truly, no kidding, have verified your rsync ran clean and you are 100% positive of the results, and you have set the permissions and adjusted the mount point in /etc/fstab, run mount -a and validate that your shiny new mount point is at /var/postgres-9-3. Then start the service.
# echo this is before the new file system you created is actually mounted
[root@yoursystem /var ] # mountpoint /var/postgres-9-3
/var/postgres-9-3 is NOT a mountpoint

# echo make sure to move the original /var/postgres-9-3 BEFORE executing the mount command.  See instructions above!
# the instructions above made a subdirectory named "notmounted" - that is all you should see!!!
[root@yoursystem /var ] # ls /var/postgres-9-3
notmounted

# echo this is the command to mount the new entry you made in /etc/fstab, putting the partition at its new home
# do not run this command unless you did the mv command mentioned earlier!
[root@yoursystem /var ] # mount -a
<no output!>

[root@yoursystem /var ] # mountpoint /var/postgres-9-3
/var/postgres-9-3 is a mountpoint
[root@yoursystem /var ] # lsblk | egrep postgres
# echo the output of lsblk ought to show the new mounted storage. 

[ root@yoursystem /var ] # systemctl start postgres-9-3
<no output>
[root@yoursystem /var ] # systemctl is-active postgres-9-3
active
# echo you will have to perform troubleshooting if the service does not come up, you may have to revert to the previous directory.
  • Validate the service is up and running. If required, revert the WASpostgres-9-3 directory back to postgres-9-3 after stopping the service and doing a umount (a minimal revert sketch follows below). Do not delete the original directory until you are 100% certain that the new storage not only works but that the service works with the copied data.
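If it ever comes to that revert, a minimal sketch might look like the following. It reuses the example names from above, follows the systemctl example shown earlier, and the fstab backup name is whatever date stamp you actually used:

# systemctl stop postgres-9-3
# umount /var/postgres-9-3
# rmdir /var/postgres-9-3/notmounted /var/postgres-9-3
# mv /var/WASpostgres-9-3 /var/postgres-9-3
# cp -v /etc/fstab.2020_02_19 /etc/fstab
# systemctl start postgres-9-3

Instead of restoring the fstab backup, you can simply remove or comment out the line you added for the new storage.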

The intent of the above is to isolate the new storage you need to the actual directory that needs it. The idea behind this method is to avert having to grow a parent partition (especially if it happens to be /, which generally goes badly and is not supported by Red Hat; I can provide a link for that statement at a later date). I have done this kind of method numerous times when data exceeded the original planning and intent for a given server. It averts growing the parent partition only for something else to excessively dog-pile on that parent partition and leave you with the same issue, especially when you have striped storage!

I wish you well with this matter. I and others are willing to assist you. Please understand that plain text sometimes does not convey what we intend, and the intention of the above is just to attempt to provide assistance to you even with a deficit of very necessary details.

This is a lot of info I posted. We will help you with any questions you have, so please ask.

Kind Regards,
RJ

That's a nice and clean explanation, RJ. Good indeed.

Thanks, Sadashiva. I hope they come back if they need help, or that they have resolved it somehow.

Regards,
RJ

Hi RJ,

I have to say it's really a great pleasure to read your comprehensive explanations and instructions ... very, very useful, my friend! :)

Regards,
Christian

Thanks, Christian,

I hope Paul Sweeney either has this resolved or comes back to take advantage of these tips (and it actually helps)

Regards,
RJ

Hello, apologies, I have been away from the office the last few days. I appreciate the comprehensive guide, RJ. A bit more background for you all: I need to extend a production database mount point from 7TB to 12TB. Say its mountpoint is /test/prd01; it is made up of volume group testvg, and the logical volume is testlv01, which is striped across 4 disks, /dev/sdb-/dev/sde.

Each disk is currently 1.79T, presented as an RDM.

The database is in use at all times and can have no downtime.

My plan was to extend each RDM in vCenter, rescan the SCSI bus to see the new disk space, extend the volume group and logical volume respectively, and use xfs_growfs (the file system is XFS).

If I am correct, RJ, your method would have me create 4 new RDMs and then rsync all the data across,

and once in sync, mount the new volume over the affected filesystem /test/prd01?

Hello Paul,

Warning: RJ's concept is great and very nicely detailed. The issue to be aware of: it does not fulfill your latest requirement of no downtime allowed.

So you will have to go with your own plan.

Be aware that you need to keep your striping intact. So you had best follow the steps in "Extending a Striped Volume" (https://access.redhat.com/solutions/530843), plus the included links.

I personally would go for scenario two: adding 4 RDMs of 1.26TiB instead of extending the current ones. This is just to avoid breaking the current RDMs.
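In case a command-level view helps, a minimal sketch of that second scenario with your names from above might look like the lines below. The new device names /dev/sdf through /dev/sdi are only assumptions, +100%FREE assumes the original four PVs have no free extents left, and the Red Hat solution linked above remains the reference to follow:

# pvcreate /dev/sdf /dev/sdg /dev/sdh /dev/sdi
# vgextend testvg /dev/sdf /dev/sdg /dev/sdh /dev/sdi
# echo extend onto the 4 new PVs only, keeping the 4-way stripe
# lvextend -i 4 -l +100%FREE testvg/testlv01 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
# xfs_growfs /test/prd01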

Regards,

Jan Gerrit Kootstra

Paul,

Some say "no downtime", but after the methods are explained, downtime might turn out to be needed. However, if you cannot take any, try what Jan Gerrit Kootstra says. On that note, I've been able to do this with minimal downtime in at least four instances in the last month; however, it was not with the amount of storage you speak of.

In light of your current context and details, I'd recommend what Jan says above.

Regards,
RJ

Hi, yes, I was thinking along the same lines as Jan, but I have been asked to explore the option of extending the currently mounted disks, as datastore space may be an issue: in order to replicate our current environment we would have to double the space rather than just allocate 4 TB. Has anyone experience of doing this method? My concern is the same as Jan's, that it may break the currently mounted disks...

Hi Paul,

"No downtime allowed ?" Well, sometimes you may run into situations where you nevertheless have to shutdown the server.
And when it comes to working on disks and partitions, it is definitely recommended to do it when the file system is offline. :)

Regards,
Christian

Hi Christian, it is a hospital system, and in the past I would always advocate for rebooting the server and doing all the work while the file system is offline. I am just trying to gather information about each method and go back with evidence of either a) it can be done, b) it can be done but can cause issues, or c) do not do it and use a different method, e.g. mounting 4 new RDMs and adding them to the volume group and logical volume.

I appreciate all the guidance so far, but has anyone tried my method without any reboots or unmounts?

Paul, the method I've done works well, but it requires downtime. I used the existing storage pool that was already established within VMware. We are still short of full details. We now know from today's replies that you have a database and are likely using a Red Hat 7 server, since you are on XFS. A layout of your filesystems (df -PhT) and lspci output might help. I'd open a case with Red Hat beforehand with your proposal if needed. I get the idea Jan might have some insight on your scenario, namely with RDMs.

Regards,
RJ

Are you using VMware?

Apologies, RJ, only seeing this now. The systems are currently on RHEL 6 but migrating to RHEL 7. Yes, we are using VMware.

I'm thinking: consider Jan Gerrit Kootstra's method, perhaps. Let's discuss more.

Regards,
RJ

Hi RJ,

The far safer method would be to NOT do it from within a running system ... this is just "my two cents", of course. :)

Regards,
Christian

Thanks RJ, Jan's method was the method I had suggested and had seen work in the past, but I was asked to investigate the method I supplied above. I believe that Jan's method is cleaner and should meet the criteria of no downtime as well as extending the disk space.

Paul,

I'm good with whatever works best for your scenario - Jan's method seems to be the best fit here.

For your own reference, I'd highly recommend you identify which specific directory is filling up which specific partition. You may ask why.

Let's say the database you mention is located under /var/SOMEDATABASE and is filling up /var (we do not know what your partitioning layout is), where /var is a partition and /var/SOMEDATABASE is not (this would be determined by executing the command mountpoint /var; I mentioned a method using df, du and mountpoint earlier). Knowing this will be helpful just so you know specifically what is filling up your partition. I bring this up because, in the future when you replace that server, I'd recommend you make a partition specifically for where your database lives, so you can assign a hearty amount of storage to that partition when you initially build the server. You could then later, if needed, grow that partition instead of a parent partition where there would be competition for space among subdirectories. I speak as an outsider here, because I do not know your partitioning. You're back from being away, so it's not vital for you to post that detail here now, but I bring this up so you can keep it in mind when you go to build a replacement server.

Let us know how this goes, and know you can lean on Red Hat Support during this process as well. Feel free to ask us questions; we'll help as we can. Reminder: we're volunteers, not Red Hat employees.

Kind Regards,
RJ

That's great, and I appreciate all the advice. What you advise is what I have advised management, but I have been asked to investigate the scenario above. Your initial reply is how I did a similar task on a Solaris 10 system, so I have experience of it.

Cool, Solaris is my prior-to-Linux background.

Let us know if you need anything.

Kind Regards,
RJ

Paul,

Make sure to have a tested online backup.

RJ's advice to open a precautionary support case is a very good action to start with.

Regards,

Jan Gerrit

Yes, the plan is to use a DR server to do a test.

Make a well-considered decision:

Offline filesystem extension with downtime or online extension without downtime.

Like Christian is trying to tell you: convenience is not a good trigger for a decision.

Better safe than sorry should be the motto

I agree 100%, better safe than sorry.

Thank you, Jan! Better safe than sorry should be the motto ... Wise words. :)

Paul,

I hope this has been resolved for you. Let us know, in general, what steps you took, so that others who arrive at this discussion you started can get an idea of your resolution.

Kind Regards,
RJ