Managing Core Dumps (user space)

We have a development environment where, by convention, multiple service accounts each run several instances of the code...

We then set kernel parameters to dump the cores in a common location:
kernel.core_pattern = /u01/core/core.%e.%p.%t
kernel.core_uses_pid = 1

Now, the issue I am facing: the cores are written with mode 0600 and owned by the original process owner (which makes sense given the sensitive nature of the dump data). However, the folks doing the analysis log in to the hosts as their own users and have no permissions to manage the files.

I also found the kernel code that creates the core file with a hard-coded mode, which means any FACL applied to that directory is effectively ignored as well.

  • from do_coredump
         file = filp_open(corename, O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);

Has anyone else run into this issue and "dealt with it"? If so, how? ;-)



I am experiencing something similar here. I am using NFSv4 and RHEL 6 (are you using NFSv4/RHEL 6?). Setting an ACL does not register on my NFS client systems. I tried this solution as well.

# added "acl" to the default mount options on the server
somesystem:/the/source  /the/mountpoint  ext4  all,the,options,comma,separated,acl  1 2

Are you using a SAN or NAS? If it is an appliance, that could cause a different layer of issues.

I checked

grep ACL /boot/config-`uname -r` 

and did not see any NFS or NFSD v4 ACL option.

# abbreviated output, grep on ACL & NFS

I changed various settings in /etc/nfsmount.conf (including "Acl=True"), to no avail. I've also been dealing with a separate, annoying problem on RHEL 6 where chmod/chown silently fails as root on NFSv4 (but not on a SAN). I tried the Red Hat solution articles on that and am still digging.

I am using a SAN (LVM) in this case. I don't know specifically which portion of the FACL the core dump process will not respect, but I know that the kernel's core dump code will only create the file as 0600, which makes sense, I suppose. I'm likely going to have to write a cron job that does a find and changes the perms. Fortunately it's only one directory I have to worry about.
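For what it's worth, the cron idea could look roughly like the sketch below. It is only an illustration: the function name open_up_cores is made up, the core.* pattern follows the core_pattern quoted earlier, and it assumes the analysts share a group with the core files, so that adding group read is enough. It would have to run as root (or as each file's owner) for the chmod to succeed.

```python
import os
import stat
from pathlib import Path

def open_up_cores(core_dir, pattern="core.*", extra_bits=stat.S_IRGRP):
    """Add extra_bits to every regular file in core_dir matching pattern.

    Returns the list of paths whose mode was actually changed.
    """
    changed = []
    for path in Path(core_dir).glob(pattern):
        st = path.lstat()
        # Skip symlinks, directories and anything else that is not a plain file.
        if not stat.S_ISREG(st.st_mode):
            continue
        mode = stat.S_IMODE(st.st_mode)
        new_mode = mode | extra_bits
        if new_mode != mode:
            os.chmod(path, new_mode)
            changed.append(str(path))
    return changed
```

Calling open_up_cores("/u01/core") from a small script run every few minutes would cover the single directory mentioned above. Files that already have the bit set are left alone, so repeated runs are harmless.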

If you wanted to expand the permissions, and this is automated, perhaps a umask command before and a reset to the normal umask after?
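One caveat on the umask idea: a umask can only strip permission bits from the mode requested at creation time, it cannot add any. The user-space sketch below illustrates this, assuming the kernel's open with a fixed 0600 behaves like an ordinary open in this respect. The helper name mode_after_create is made up for the example.

```python
import os
import stat
import tempfile

def mode_after_create(requested_mode, umask):
    """Create a file with requested_mode under umask; return the resulting mode bits."""
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "core.test")
            fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, requested_mode)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        # Always restore the caller's umask.
        os.umask(old)

print(oct(mode_after_create(0o600, 0o000)))  # 0o600: a permissive umask adds nothing
print(oct(mode_after_create(0o666, 0o027)))  # 0o640: the umask only strips bits
```

So with 0600 requested at creation, even a umask of 000 would still leave the file at 0600.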

By the way, what did you run to get that system call you cited:

  file = filp_open(corename, O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);

Your SAN, does it have mount/share options? Is it an appliance?

In digging for this, I've found other things to dig through, and it seems somewhat related to the other issue I've faced. If either of us finds something, let's certainly post here...

I found that system call (or whatever) after hours (literally) of troubleshooting several issues I was having with our core dump config.

My directory accepts that ACL and applies it correctly - it just doesn't work with a core dump. Like I mentioned, I think this is by design as a security "feature". It would just be nice if there was a handler that could manage the outcome. Perhaps an ABRT trigger?
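On the handler idea: Linux does allow kernel.core_pattern to start with a pipe character, in which case the kernel runs the named program as root and feeds it the core image on standard input (see core(5)). A minimal sketch of such a handler follows. The script path in the example pattern, the function names and the 0640 mode are all assumptions for illustration, not something from this thread:

    kernel.core_pattern = |/usr/local/sbin/core_handler.py %e %p %t

```python
import os
import shutil
import sys

def save_core(stream, dest_dir, exe, pid, timestamp, mode=0o640):
    """Copy a core image from stream into dest_dir with the requested mode."""
    path = os.path.join(dest_dir, "core.%s.%s.%s" % (exe, pid, timestamp))
    # O_EXCL refuses to reuse an existing file at that path.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    with os.fdopen(fd, "wb") as out:
        # fchmod makes the final mode independent of the handler's umask.
        os.fchmod(fd, mode)
        shutil.copyfileobj(stream, out)
    return path

def main(argv):
    # Expected arguments: %e %p %t, in the order given on the core_pattern line.
    exe, pid, timestamp = argv[1:4]
    save_core(sys.stdin.buffer, "/u01/core", exe, pid, timestamp)
```

To use it as a real handler, the script would need to call main(sys.argv) when executed. Since the handler runs as root, the file ends up owned by root rather than by the service account, so a real version would probably also chown it to whatever group the analysts share.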

I'm now actually having some fun trying to find that reference again ;-)

int do_coredump(long signr, struct pt_regs * regs)
{
        struct linux_binfmt * binfmt;
        char corename[6+sizeof(current->comm)];
        struct file * file;
        struct inode * inode;

        lock_kernel();
        binfmt = current->binfmt;
        if (!binfmt || !binfmt->core_dump)
                goto fail;
        if (!current->dumpable || atomic_read(&current->mm->mm_users) != 1)
                goto fail;
        current->dumpable = 0;
        if (current->rlim[RLIMIT_CORE].rlim_cur < binfmt->min_coredump)
                goto fail;

        memcpy(corename,"core.", 5);
#if 0
        memcpy(corename+5,current->comm,sizeof(current->comm));
#else
        corename[4] = '\0';
#endif
        file = filp_open(corename, O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
        if (IS_ERR(file))
                goto fail;
        inode = file->f_dentry->d_inode;
        if (inode->i_nlink > 1)
                goto close_fail;        /* multiple links - don't dump */

        if (!S_ISREG(inode->i_mode))
                goto close_fail;
        if (!file->f_op)
                goto close_fail;
        if (!file->f_op->write)
                goto close_fail;
        if (!binfmt->core_dump(signr, regs, file))
                goto close_fail;
        unlock_kernel();
        filp_close(file, NULL);
        return 1;

close_fail:
        filp_close(file, NULL);
fail:
        unlock_kernel();
        return 0;
}



How about a wrapper that starts the executable(s), checks the exit status of the wrapped executable, looks for the corresponding core file and changes its permissions?
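A rough sketch of such a wrapper, assuming the core.%e.%p.%t naming from the core_pattern quoted at the top of the thread. The function names find_core and run_and_fix are made up for the example, and the sketch assumes adding group read is the permission change wanted.

```python
import glob
import os
import stat
import subprocess

def find_core(core_dir, exe, pid):
    """Return core files named core.<exe>.<pid>.* in core_dir."""
    # %e is the kernel's process name, which is truncated to 15 characters.
    name = os.path.basename(exe)[:15]
    pattern = os.path.join(core_dir, "core.%s.%d.*" % (glob.escape(name), pid))
    return sorted(glob.glob(pattern))

def run_and_fix(cmd, core_dir="/u01/core", extra_bits=stat.S_IRGRP):
    """Run cmd; if it was killed by a signal, add extra_bits to its core file(s)."""
    proc = subprocess.Popen(cmd)
    status = proc.wait()
    fixed = []
    if status < 0:  # a negative return code means "killed by signal -status"
        for path in find_core(core_dir, cmd[0], proc.pid):
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | extra_bits)
            fixed.append(path)
    return status, fixed
```

Because the wrapper runs as the same service account that owns the core file, the chmod needs no root privileges. One thing to watch: if the wrapped command is a script, %e will probably be the interpreter's or script's process name rather than cmd[0], so the match in find_core may need adjusting.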

Siem Korteweg