Unable to get valid subscription for VMs

We have two datacenters, each with its own VMware hypervisors, on which we need to subscribe RHEL 7 VMs. For the one datacenter, everything is working fine. However, in the other one, things are a bit strange. I was able to register and subscribe the VM on which virt-who is running, but any other VM I spin up and register/subscribe only gets a 24hr temporary subscription. I've tried:

subscription-manager remove --all
subscription-manager register
subscription-manager attach --auto

... but the VMs still come up with a temporary (24hr) subscription. If I run "subscription-manager list --consumed" I see this (other lines showing account details removed):

Service Level: Standard
Service Type: L1-L3
Status Details: Guest has not been reported on any host and is using a temporary unmapped guest subscription.
Subscription Type: Stackable (Temporary)
Starts: 08/27/2016
Ends: 12/20/2016
System Type: Virtual

The VMs are put on the same hypervisor as the one that is running virt-who and was successfully registered/subscribed, so I'm not really sure why additional VMs won't subscribe. We have a total of 24 subscriptions and only 4-5 are currently being used, so a shortage of subscriptions isn't the issue. Any ideas? I can post the /var/log/rhsm/rhsm.log contents if need be. Thanks.

Responses

I do see a lot of these types of entries in the /var/log/rhsm/rhsm.log file, but not sure what they mean exactly or if this is a clue to what may be happening:

2016-12-19 09:49:22,324 [DEBUG] @esx.py:142 - Waiting for ESX changes
2016-12-19 09:49:22,332 [DEBUG] @subscriptionmanager.py:112 - Authenticating with certificate: /etc/pki/consumer/cert.pem
2016-12-19 09:49:22,664 [DEBUG] @subscriptionmanager.py:146 - Checking if server has capability 'hypervisor_async'
2016-12-19 09:49:22,935 [DEBUG] @subscriptionmanager.py:158 - Server does not have 'hypervisors_async' capability
2016-12-19 09:49:22,936 [INFO] @subscriptionmanager.py:165 - Sending update in hosts-to-guests mapping: {
    "31363636-3735-584d-5132-323430343639": [],
    "31363636-3735-584d-5132-323430343634": [
        {
            "guestId": "420b1fba-43f7-d054-c389-68ffd18c025e",
            "state": 1,
            "attributes": {
                "active": 1,
                "virtWhoType": "esx",
                "hypervisorType": "vmware"
            }
        }
    ]
}

I'm (still somewhat) new to Satellite and virt-who, but I had to learn about it to diagnose something similar (but probably not the same, see https://access.redhat.com/discussions/2808161 for that).

I might be able to give some guesses. It is useful to know what is logged also to see what you do not have to investigate further.

2016-12-19 09:49:22,324 [DEBUG] @esx.py:142 - Waiting for ESX changes 

Virt-who connects to vCenter/vSphere on the HTTPS port and uses vSphere API calls to find out which hypervisor each VM runs on. I'm not sure exactly how it works: whether it polls at intervals, whether it is triggered by a change and then asks for a full list, or whether it is simply sent the updates (I believe the call is WaitForUpdatesEx). The result is a hosts-to-guests mapping (in JSON format) that it sends to Candlepin, the subscription management component of Satellite 6.

You can easily check if the username/password is correct using a browser and the address https://vsphereserver.example.com/vsphere-client/

2016-12-19 09:49:22,332 [DEBUG] @subscriptionmanager.py:112 - Authenticating with certificate: /etc/pki/consumer/cert.pem 

This is the certificate that identifies the consumer (subscriber), that is, the machine virt-who is running on. If the machine is correctly registered to the Satellite server, it should work. Recent versions added the config options rhsm_username and rhsm_password, which can also be used. (Tip: to see the info in the certificate, run "rct cat-cert /etc/pki/consumer/cert.pem".)

2016-12-19 09:49:22,664 [DEBUG] @subscriptionmanager.py:146 - Checking if server has capability 'hypervisor_async' 
2016-12-19 09:49:22,935 [DEBUG] @subscriptionmanager.py:158 - Server does not have 'hypervisors_async' capability 

I think this is normal when connecting to VMware/vSphere hypervisors.

2016-12-19 09:49:22,936 [INFO] @subscriptionmanager.py:165 - Sending update in hosts-to-guests mapping: {
    "31363636-3735-584d-5132-323430343639": [],
    "31363636-3735-584d-5132-323430343634": [{
            "guestId": "420b1fba-43f7-d054-c389-68ffd18c025e",
            "state": 1,
            "attributes": {
                "active": 1,
                "virtWhoType": "esx",
                "hypervisorType": "vmware"
            }
        }
    ]
}

This is the info sent to Candlepin. It is a mapping from each hypervisor (host) to the list of VMs (guests) running on it. The first hypervisor has no guests; the second has one.
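To make the mapping above easier to check, here is a minimal Python sketch that looks up a guest UUID in a hosts-to-guests report. The JSON is copied from the log in this thread; the function names are made up for illustration and are not part of virt-who. I believe the guest's own UUID is exposed on the VM as the virt.uuid fact (subscription-manager facts --list), but treat that as an assumption and verify it on your system.

```python
import json

# Mapping copied from the rhsm.log excerpt quoted in this thread.
REPORT = """
{
    "31363636-3735-584d-5132-323430343639": [],
    "31363636-3735-584d-5132-323430343634": [
        {
            "guestId": "420b1fba-43f7-d054-c389-68ffd18c025e",
            "state": 1,
            "attributes": {
                "active": 1,
                "virtWhoType": "esx",
                "hypervisorType": "vmware"
            }
        }
    ]
}
"""


def find_host_for_guest(report_text, guest_uuid):
    """Return the hypervisor ID that lists guest_uuid, or None if no host reports it."""
    mapping = json.loads(report_text)
    wanted = guest_uuid.lower()  # compare case-insensitively
    for host_id, guests in mapping.items():
        for guest in guests:
            if guest.get("guestId", "").lower() == wanted:
                return host_id
    return None


def guests_per_host(report_text):
    """Return {hypervisor ID: number of guests reported on it}."""
    mapping = json.loads(report_text)
    return {host_id: len(guests) for host_id, guests in mapping.items()}


if __name__ == "__main__":
    print(guests_per_host(REPORT))
    print(find_host_for_guest(REPORT, "420b1fba-43f7-d054-c389-68ffd18c025e"))
```

If a VM's UUID is not found in any host's list, that would be consistent with the "Guest has not been reported on any host" status shown earlier.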

It is using UUIDs and not hostnames, but there is now an option (hypervisor_id=hostname) to use proper hostnames for the hypervisors, which makes things a bit easier for us humans. I'm not sure if using UUIDs for hypervisors is wrong now, but the docs show the option set (see https://access.redhat.com/documentation/en/red-hat-satellite/6.2/single/virtual-instances-guide/ and go to section 5.1). And I'm absolutely not sure what happens if you change it on an existing setup.

Anyway: What is happening next?

If there is an error it should be in the next (few) lines.

This is ok:

2016-11-30 11:32:15,976 [virtwho.main DEBUG] MainProcess(76159):MainThread @executor.py:send_report:101 - Report for config "vcenter1" sent

This is an example of when something failed.

2016-12-09 11:51:48,976 [virtwho.main ERROR] MainProcess(24310):MainThread @executor.py:send:156 - Error in communication with subscription manager:

You can also look in the /var/log/candlepin/candlepin.log on the satellite server for entries at the same timestamp and later. I see things like:

INFO  org.candlepin.resource.HypervisorResource - Syncing virt host: esx113.example.com (42 guest IDs)
INFO  org.candlepin.resource.ConsumerResource - Updating 42 guest IDs.
INFO  org.candlepin.resource.ConsumerResource - removing IDs.

etc. etc.
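The "look at the next few lines after the mapping" check can be sketched in Python as below. This is only an illustration: the two message patterns are the ones quoted in this post, other virt-who versions may log different text, and the function name and the window of three lines are my own choices.

```python
import re

# A log line starts with a timestamp such as "2016-12-18 05:19:19,396 ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# Matches both "[ERROR]" and "[virtwho.main ERROR]".
ERROR_LEVEL = re.compile(r"\[[^\]]*ERROR\]")
MARKER = "Sending update in hosts-to-guests mapping"


def outcome_after_report(lines, window=3):
    """Classify what follows the last mapping report in a list of log lines.

    Returns "sent", "error", "unknown", or "no report".
    """
    start = None
    for index, line in enumerate(lines):
        if MARKER in line:
            start = index  # remember the most recent report
    if start is None:
        return "no report"

    seen = 0
    for line in lines[start + 1:]:
        if not TIMESTAMP.match(line):
            continue  # still inside the multi-line JSON body
        if ERROR_LEVEL.search(line):
            return "error"
        if "Report for config" in line and line.rstrip().endswith("sent"):
            return "sent"
        seen += 1
        if seen >= window:
            break
    return "unknown"


if __name__ == "__main__":
    # Assumes the default log location mentioned in this thread.
    with open("/var/log/rhsm/rhsm.log") as handle:
        print(outcome_after_report(handle.read().splitlines()))
```

A result of "unknown" means neither message was found shortly after the report, which is itself worth noting.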

Thanks. My output looks very similar to what you posted, so I think it should be working (but isn't). I went ahead and just opened a support ticket.

Still curious... what are the first lines after:

2016-12-19 09:49:22,936 [INFO] @subscriptionmanager.py:165 - Sending update in hosts-to-guests mapping: {
    "31363636-3735-584d-5132-323430343639": [],
    "31363636-3735-584d-5132-323430343634": [{
            "guestId": "420b1fba-43f7-d054-c389-68ffd18c025e",
            "state": 1,
            "attributes": {
                "active": 1,
                "virtWhoType": "esx",
                "hypervisorType": "vmware"
            }
        }
    ]
}

That should be either an error message one can hope leads one in the correct direction, or say something like "Report for config "myconfig" sent"

Actually, neither. Here is a snippet from the log ...

2016-12-18 05:19:18,556 [DEBUG]  @esx.py:142 - Waiting for ESX changes
2016-12-18 05:19:18,559 [DEBUG]  @subscriptionmanager.py:112 - Authenticating with certificate: /etc/pki/consumer/cert.pem
2016-12-18 05:19:19,146 [DEBUG]  @subscriptionmanager.py:146 - Checking if server has capability 'hypervisor_async'
2016-12-18 05:19:19,396 [DEBUG]  @subscriptionmanager.py:158 - Server does not have 'hypervisors_async' capability
2016-12-18 05:19:19,396 [INFO]  @subscriptionmanager.py:165 - Sending update in hosts-to-guests mapping: {
    "31363636-3735-584d-5132-323430343639": [],
    "31363636-3735-584d-5132-323430343634": [
        {
            "guestId": "420b1fba-43f7-d054-c389-68ffd18c025e",
            "state": 1,
            "attributes": {
                "active": 1,
                "virtWhoType": "esx",
                "hypervisorType": "vmware"
            }
        }
    ]
}
2016-12-18 05:34:18,583 [DEBUG]  @esx.py:142 - Waiting for ESX changes

Stephen,

Can you provide some more details here, as I don't fully understand your configuration. From what you've stated so far, you have two data centres, each with a VMware hypervisor. In each data centre you have virt-who installed on one of the virtual machines. I'll use the naming DC1 for the first data centre and DC2 for the second data centre.

Everything's working as expected in DC1: virt-who is installed, and virtual machines provisioned on the hypervisor are properly subscribed. In DC2 you also have virt-who running, but only the virtual machine on which virt-who is installed is correctly subscribed. All other virtual machines provisioned in DC2 are only granted temporary subscriptions.

Can you please provide the virt-who configuration files used in both data centres, with confidential details removed?

The fact that virtual machines are being granted temporary subscriptions is not alone an indication of a problem. Subscriptions for virtual machines are associated with the hypervisor. For Red Hat Satellite to grant a subscription, it must know on which hypervisor the virtual machine is hosted. When a virtual machine is first provisioned, it is granted a temporary subscription so that it can access content from the Satellite instance. Within the 24-hour period, the virt-who daemon generally obtains information about which hypervisors host which virtual machines and reports this to the Satellite server. Once this is known, the Satellite server can then grant a permanent subscription to the virtual machine.

The subscription process for virtual machines is described in more detail in the Virtual Instances Guide.
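As a rough illustration of how to spot the temporary state described above, this Python sketch parses the "subscription-manager list --consumed" fields quoted earlier in the thread and flags a temporary subscription. The sample text is copied from the original post; the helper names are hypothetical, and the sketch handles a single subscription block only.

```python
# Fields copied from the "subscription-manager list --consumed" output
# quoted in the original post (account details were already removed there).
SAMPLE = """\
Service Level: Standard
Service Type: L1-L3
Status Details: Guest has not been reported on any host and is using a temporary unmapped guest subscription.
Subscription Type: Stackable (Temporary)
Starts: 08/27/2016
Ends: 12/20/2016
System Type: Virtual
"""


def parse_fields(text):
    """Turn "Key: value" lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def is_temporary(fields):
    """True if the subscription type is marked as temporary."""
    return "(Temporary)" in fields.get("Subscription Type", "")


if __name__ == "__main__":
    fields = parse_fields(SAMPLE)
    if is_temporary(fields):
        print("Temporary subscription:", fields.get("Status Details", ""))
```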

Thanks Russell. What you describe is exactly my situation. I'm going into this a little blind as the person who initially set it up left our company. It was a project put on the shelf for several months and has now gotten revived. Our subscriptions are good from Aug/2016 to Aug/2017. Here are the two virt-who.conf files (*** replacing confidential info):

DC1:

VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
VIRTWHO_ESX=1
VIRTWHO_ESX_OWNER=***
VIRTWHO_ESX_ENV=Library
VIRTWHO_ESX_SERVER=mp2-vcep1.op-zone1.sfo1
VIRTWHO_ESX_USERNAME=***
VIRTWHO_ESX_PASSWORD=***

DC2:

VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
VIRTWHO_ESX=1
VIRTWHO_ESX_OWNER=***
VIRTWHO_ESX_ENV=Library
VIRTWHO_ESX_SERVER=mp1-vcep1.op-zone1.aus1
VIRTWHO_ESX_USERNAME=***
VIRTWHO_ESX_PASSWORD=***

I've tried unregistering and re-registering the VMs several times with no luck. I've waited >24hrs but they still only get a temporary subscription. For the VMs on DC1 that do get registered, they get a full subscription immediately (i.e., no waiting).

When you say "Here are the two virt-who.conf files", do you really mean /etc/sysconfig/virt-who (note: without an extension) or /etc/virt-who.conf (with extension)? Or is something mixed up here?

What you show here looks like the typical file of environment variables read by the startup scripts in /etc/init.d, which the Red Hat family of distributions keeps in /etc/sysconfig/.

What the /etc/virt-who.conf and /etc/virt-who.d/* files should look like is described in man virt-who-config. Here is an example from that page:

       [test-esx]
       type=esx
       server=1.2.3.4
       username=admin
       password=password
       owner=test
       env=staging
       rhsm_username=admin
       rhsm_password=password

Virt-who also reads in these environment variables (starting with VIRTWHO_) from its environment if they exist. So it can use the config file /etc/virt-who.conf and files in /etc/virt-who.d/*, command line parameters, and environment variables. "Choice is good", seems to be the theme...

To me it seems that the cleanest way is to use /etc/virt-who.conf for generic setup and connection to the Satellite, then put one separate file for each vCenter one wants to connect to in /etc/virt-who.d/. Then it is easy to add new ones, and to test them one by one by moving the other files out and restarting virt-who (or running a one-shot with debug messages to a file: virt-who -o -d > logfile.txt).
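Since the files in /etc/virt-who.d/ are INI-style, a quick sanity check can be sketched with Python's standard configparser. The example text is the one from the man page quoted above; the list of expected keys is only my assumption based on that example, not an official list of required options.

```python
import configparser

# Example section copied from the man page excerpt quoted above.
EXAMPLE = """\
[test-esx]
type=esx
server=1.2.3.4
username=admin
password=password
owner=test
env=staging
rhsm_username=admin
rhsm_password=password
"""

# Assumption: keys taken from the example above, not an authoritative list.
EXPECTED_KEYS = ("type", "server", "username", "password", "owner", "env")


def missing_keys(config_text):
    """Return {section name: [expected keys that are absent]} for each section."""
    # interpolation=None so a "%" in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {
        section: [key for key in EXPECTED_KEYS if key not in parser[section]]
        for section in parser.sections()
    }


if __name__ == "__main__":
    print(missing_keys(EXAMPLE))
```

A file in which everything is commented out yields no sections at all, so the result is an empty dict.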

I don't like to use the environment variables, especially not put passwords in there.

By the way: For anyone using systemd (as RHEL7 does), and not sysvinit or upstart (as RHEL6 does) the variables in /etc/sysconfig/virt-who are available because in the /usr/lib/systemd/system/virt-who.service we have:

EnvironmentFile=-/etc/sysconfig/virt-who

These are RHEL7 boxes and we use the variables in /etc/sysconfig/virt-who exclusively (see above). The exact same setup works in one DC but not in the other (or, at least, it does not work for any VM besides the one virt-who is running on).

Stephen,

Thanks for that extra information. The cause for failed subscriptions of hosts in DC2 is still a mystery. The virt-who configurations look to be the same. Examining the RHSM log file (/var/log/rhsm/rhsm.log) is the best source of diagnostic information for subscription management. However, that log may contain sensitive information, so I would not recommend posting its content here in the discussion.

I have some final suggestions, not in order of importance.

1. From the information you have included, the virt-who configuration files look to be the same where they should be the same. However, it would be worthwhile to do a "diff" of them in case there's something not obvious.
2. Check that the specified VMware user account and password are valid. You included an extract of the rhsm.log file, but I'm not sure from which DC's instance of virt-who that was obtained.
3. If there is a proxy between the host running virt-who and the VMware ESX instance, the virt-who configuration must be amended to account for that. For details, see the following section of the Virtual Instances Guide.
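The comparison of the two configuration files can also be done key by key. The Python sketch below uses the two /etc/sysconfig/virt-who listings posted earlier in this thread (with *** already masking confidential values); the helper names are illustrative only.

```python
# The two /etc/sysconfig/virt-who files as posted in this thread.
DC1 = """\
VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
VIRTWHO_ESX=1
VIRTWHO_ESX_OWNER=***
VIRTWHO_ESX_ENV=Library
VIRTWHO_ESX_SERVER=mp2-vcep1.op-zone1.sfo1
VIRTWHO_ESX_USERNAME=***
VIRTWHO_ESX_PASSWORD=***
"""

DC2 = """\
VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
VIRTWHO_ESX=1
VIRTWHO_ESX_OWNER=***
VIRTWHO_ESX_ENV=Library
VIRTWHO_ESX_SERVER=mp1-vcep1.op-zone1.aus1
VIRTWHO_ESX_USERNAME=***
VIRTWHO_ESX_PASSWORD=***
"""


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def differing_keys(first, second):
    """Return the sorted keys whose values differ or that exist in only one file."""
    return sorted(
        key for key in set(first) | set(second) if first.get(key) != second.get(key)
    )


if __name__ == "__main__":
    print(differing_keys(parse_env_file(DC1), parse_env_file(DC2)))
```

Because the masked values compare as equal here, run it against the real files to catch differences in owner, username, or password.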

Russell, Thanks. I opened a support ticket and the engineer said my /etc/virt-who.d/template.conf file had nothing in it, which was the problem. However, we are putting all the info in /etc/sysconfig/virt-who and not template.conf. I went ahead and did as he said, but was getting a login error even though the credentials are fine. I reverted, and now I'm getting an error saying it can't find /var/run/libvirt/libvirt-sock-ro. I found this error is because libvirtd service is not running, but when I try to start it, it says the service doesn't even exist. What's strange is that on the DC1's working VM, the service isn't running and doesn't exist either, but it has no error messages. Just very strange all around.

Escalate it.

My experience is that for the first-line support-technicians virt-who is a black box they don't really understand, and the documentation doesn't really tell enough to troubleshoot effectively (and it is a moving target, for good and bad (good mostly)).

The /etc/virt-who.d/template.conf can of course be used, but that isn't the purpose of it. As installed it has all(?) the possible config statements with comments, but everything is commented out. So it is for copying and editing into a real config. At least that is my theory for calling it "template". YMMV.

Where did you get the login error? Against vmware or satellite?

Note that if you configure virt-who by putting VIRTWHO_* environment variables in the /etc/sysconfig/virt-who file and have login names in AD format (like "DOMAIN\username"), you have to escape the backslash (so: "DOMAIN\\username"), because in shell-style files the backslash is an escape character.
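The effect of the backslash can be illustrated with Python's shlex module, which approximates POSIX shell quoting rules. This is only a sketch: whether /etc/sysconfig/virt-who is parsed exactly this way depends on what reads it (an init script or systemd's EnvironmentFile), so treat the behaviour as an assumption to verify. The variable name is the one from the config files posted in this thread.

```python
import shlex


def shell_value(assignment):
    """Return the value a POSIX-shell-style parser sees for a KEY=VALUE word."""
    (word,) = shlex.split(assignment)  # expects exactly one word
    _key, _separator, value = word.partition("=")
    return value


# Raw strings, so Python itself does not interpret the backslashes.
unescaped = shell_value(r"VIRTWHO_ESX_USERNAME=DOMAIN\username")
escaped = shell_value(r"VIRTWHO_ESX_USERNAME=DOMAIN\\username")

if __name__ == "__main__":
    print("unescaped:", unescaped)
    print("escaped:  ", escaped)
```

With a single backslash the parser consumes it and the domain separator is lost; with a doubled backslash one literal backslash survives.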

Yea, the username is escaped ... Example:

VIRTWHO_USERNAME=ad\\testuser

EDIT: The error looks like it was trying to log directly into the ESXi server I gave for the "hypervisor_id=" variable. Here is the full error:

2016-12-21 07:55:40,905 [DEBUG]  @virtwho.py:133 - Using config named 'vmware'
2016-12-21 07:55:40,906 [INFO]  @virtwho.py:697 - Using configuration "vmware" ("esx" mode)
2016-12-21 07:55:40,954 [DEBUG]  @virtwho.py:216 - Starting infinite loop with 3600 seconds interval
2016-12-21 07:55:41,042 [DEBUG]  @esx.py:55 - Log into ESX
2016-12-21 07:55:45,330 [ERROR]  @esx.py:238 - Unable to login to ESX
Traceback (most recent call last):
  File "/usr/share/virt-who/virt/esx/esx.py", line 235, in login
    self.client.service.Login(_this=self.sc.sessionManager, userName=self.username, password=self.password)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 542, in __call__
    return client.invoke(args, kwargs)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 602, in invoke
    result = self.send(soapenv)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 653, in send
    result = self.failed(binding, e)
  File "/usr/lib/python2.7/site-packages/suds/client.py", line 708, in failed
    r, p = binding.get_fault(reply)
  File "/usr/lib/python2.7/site-packages/suds/bindings/binding.py", line 265, in get_fault
    raise WebFault(p, faultroot)
WebFault: Server raised fault: 'Cannot complete login due to an incorrect user name or password.'
2016-12-21 07:55:45,333 [ERROR]  @virt.py:303 - Virt backend 'vmware' fails with error: Server raised fault: 'Cannot complete login due to an incorrect user name or password.'

It's the same username/password that was in /etc/sysconfig/virt-who.

Never mind, I resolved the issue on my own after a bit more investigating.

It turned out the folder and ESXi hosts on vCenter did not have permissions set for the user, while the other datacenter was all set up with them. After adding the user to the permissions as Read-Only and restarting virt-who, I was able to successfully attach a full subscription to the VMs. Definitely learned a lot about this whole thing in the last several days. Appreciate you guys helping me try to debug the issue - it gave me some things to look at further and that's when I stumbled on the permission issue in vCenter.

Stephen,

It's great to know you got this sorted out. One benefit of having this discussion is that it may benefit other customers.

Regarding permissions, I should have asked you to check that. I made mention of testing credentials in the Virtual Instances Guide's Troubleshooting chapter, Credentials.

A handy troubleshooting parameter for virt-who is "--one-shot", as described in Configure and Start virt-who Service. This runs virt-who once, and outputs the virtual servers found, in JSON data format.

Russell - Thanks. After reading your section on Credentials, that was exactly what I should have done. While I was told all this was already working at one time, one should never just assume; check the basics first. Lesson learned.
