Satellite master and capsule communication
Hello,
Does anyone have detailed technical information on how the Satellite master and Capsules communicate? How many Capsules can one master handle when the Capsules are in different geographic locations, and how does content sync to the Capsules work? When I promote a content view, how much latency should I expect, and what timeouts apply, when syncing content to the Capsules?
Thanks,
DJ
Responses
Hello,
The Architecture Guide has a section Capsule Server Overview which includes network diagrams.
The Installation Guide has an appendix, Capsule Server Scalability Considerations.
Hello
Re. "timeout value or heartbeat": I can try to find out, but it would help if you explained why you want to know. I think you should also raise a support case if there is a specific issue you need to resolve. You might find this Kbase solution interesting: "goferd stops connecting to Satellite/Capsule after few hours".
Re. "have any affect on downloading contents to capsules"
If a repository is set to "on demand" it should only download an RPM when required, regardless of the repository location. The repositories in Satellite Server are handled by the integrated Capsule. The download policies are described here Using Download Policies, but it does not make these points clear. I will raise a docs bug and get one of the developers to confirm what I have said is correct.
Speaking purely about communication related to katello-agent / goferd here. Recall the setup is: goferd on any client needs to be connected to qpidd on Satellite via a network of qdrouterd routers:
- goferd connects to the Satellite/Capsule (whichever the client is registered to), to qdrouterd's port 5647. This connection has a heartbeat of 10s.
- qdrouterd on the Capsule connects to qdrouterd on the Satellite, on port 5646. This connection has a heartbeat of 1 or 2 seconds, I think (can check if interested). Note that if this connection gets disrupted, reconnection is attempted frequently, but even after a successful reconnect it can take up to approx. 20 seconds to "heal" from the disruption and to route links from goferd clients again.
- qdrouterd on the Satellite is connected to qpidd via loopback on the Satellite. I think there is no heartbeat there, but none is needed (since the lo interface is used).
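As a quick way to confirm that the two listening ports mentioned above are reachable from a given host, here is a minimal Python sketch. It only tests that a TCP connection can be opened; it does not speak AMQP or check heartbeats. The function name and the host names in the usage note are placeholders, not anything Satellite ships.

```python
import socket

# Ports named in this thread.
GOFERD_TO_ROUTER_PORT = 5647  # goferd -> qdrouterd on Satellite/Capsule
INTER_ROUTER_PORT = 5646      # qdrouterd on Capsule -> qdrouterd on Satellite


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, from a client you might call port_is_reachable("capsule.example.com", GOFERD_TO_ROUTER_PORT), and from a Capsule port_is_reachable("satellite.example.com", INTER_ROUTER_PORT); both host names are hypothetical.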
The inter-router heartbeats have a default interval of 1 second. A ROUTER_HELLO AMQP frame is sent every second, and if no response is received within 3 seconds, the connection/session is assumed to be dead.
Both values are configurable via qdrouterd.conf:
router {
    mode: interior
    router-id: here.should.be.fqdn.of.the.host
    hello-interval: 1
    hello-max-age: 3
}
You need to restart the qdrouterd service to apply the change. Note that a Satellite/Capsule upgrade (in fact, any execution of satellite-installer) will purge this config, due to problems like https://bugzilla.redhat.com/show_bug.cgi?id=1305782.
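To make the effect of hello-max-age concrete, here is a small Python sketch of the timeout rule described above: given the times at which HELLO frames arrived from a peer, it reports when that peer would first be considered dead. This is only a model of the rule as stated in this thread, not the router's actual code, and treating a gap of exactly hello-max-age as still alive is an assumption.

```python
def first_timeout(hello_times, hello_max_age=3.0):
    """Return the time at which the peer would be declared dead, or None.

    hello_times: ascending arrival times (in seconds) of HELLO frames from a peer.
    hello_max_age: seconds without a HELLO before the peer is assumed dead.
    """
    for prev, cur in zip(hello_times, hello_times[1:]):
        # Assumption: only a gap strictly longer than hello_max_age counts.
        if cur - prev > hello_max_age:
            return prev + hello_max_age
    return None


print(first_timeout([0, 1, 2, 3]))  # regular 1s HELLOs, no timeout
print(first_timeout([0, 1, 5, 6]))  # 4s gap after t=1, exceeds the default 3s
print(first_timeout([0, 1, 5, 6], hello_max_age=5.0))  # tolerated with a larger max age
```

The last call shows why raising hello-max-age avoids dropping the connection when a peer replies late, at the cost of detecting a real outage later.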
Re connector to broker: that is the qdrouterd->qpidd connection / connector. Since the version of qpidd that Satellite uses does not implement this type of heartbeat, heartbeats have to be disabled from the qdrouterd side, which is achieved via the idle-timeout-seconds option. Anyway, since this connection goes via loopback, heartbeats are not that important.
Yes, by inter-router I mean qdrouterd(Satellite) <-> qdrouterd(Capsule).
1) It is safe to increase the values, with the obvious impact of detecting a lost connection later (but preventing false alarms and redundant connection drops in case the peer replies a bit late). Note that you should set the values uniformly across the whole qdrouterd network (i.e. on the Satellite and all Capsules), to prevent misconfigs like "my hello-interval is longer than peer's hello-max-age".
2) No specific requirements. You just might need to tune up the hello interval / max age, in case RTT can reach the limit under normal / heavy load. You might be interested to test and see the latency, by adding to the config:
log {
    module: ROUTER_HELLO
    enable: trace+
    timestamp: true
    output: /path/to/qdrouterd.log
}
and restarting qdrouterd (ensure the qdrouterd user can write to the file, and be aware that a router start purges the old file content instead of appending to it). Then you will see when each peer sent or received the HELLO keepalive.
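The consistency advice in point 1) can be checked mechanically before you roll out changed values. The sketch below is a hypothetical helper, not a Satellite tool: you supply the hello-interval and hello-max-age you intend to set on each router by hand, and it flags any router whose hello-interval is not shorter than some router's hello-max-age. Treating equal values as a problem is a conservative assumption, and the host names are made up.

```python
def check_hello_settings(routers):
    """Return a list of human-readable problems; an empty list means consistent.

    routers: dict mapping a router name to (hello_interval, hello_max_age),
             both in seconds, as you plan to set them in qdrouterd.conf.
    """
    problems = []
    for name, (interval, _) in routers.items():
        for peer, (_, peer_max_age) in routers.items():
            # Assumption: interval == max age leaves no margin, so flag it too.
            if interval >= peer_max_age:
                problems.append(
                    f"{name}: hello-interval {interval} is not shorter than "
                    f"{peer}'s hello-max-age {peer_max_age}"
                )
    return problems


uniform = {"satellite.example.com": (2, 6), "capsule1.example.com": (2, 6)}
mixed = {"satellite.example.com": (5, 15), "capsule1.example.com": (1, 3)}
print(check_hello_settings(uniform))
print(check_hello_settings(mixed))
```

With the mixed values, the Satellite sends a HELLO only every 5 seconds while the Capsule gives up after 3, which is exactly the misconfiguration described in point 1).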
I raised this bug: Bug 1446098 - The "Using Download Policies" section lacks detail. Feel free to add yourself to the c.c. list if you want to follow the progress.
Thank you.
