Node Execution failures on OSE 1.2. All messages to a node time out.
Issue
We are getting Node Execution Failures in our broker logs. Once a node gets into this state we never receive anything back from it, not even with a 480-second timeout.
This is the script we use to reproduce the issue:
[root@rhlappfac610 ~]# cat openshiftloadtest3.sh
#!/usr/bin/env ruby
$threads = 10
$apps_per_thread = 5
$app_carts = "jbossews-1.0"
$sleep_interval = 3
t_ar = Array.new
@count = 0
(1..$threads).each do
  # Pass the counter in as a thread argument so each thread keeps its own
  # value even if the main loop increments @count before the thread starts.
  t_ar << Thread.new(@count.to_s) do |t_count|
    begin
      (1..$apps_per_thread).each do
        puts `rhc app-create test#{t_count} #{$app_carts} --no-keys --no-git --no-dns`
        sleep $sleep_interval
        puts `rhc cartridge-add --app test#{t_count} cron-1.4`
        puts `rhc app-delete test#{t_count} --confirm`
        sleep $sleep_interval
      end
    rescue => e
      puts "Failed " + e.inspect
    end
  end
  sleep 1
  @count += 1
end
# Wait for every worker thread to finish.
Thread.list.each {|t| t.join unless t == Thread.current }
puts "Stress test completed."
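When reproducing the issue it can help to record how long each rhc call takes and whether it finished at all, so the point at which the node stops answering shows up in the output. The following is a minimal sketch and not part of the original script: the helper name timed_run is hypothetical, and the 480-second default simply mirrors the timeout mentioned above.

```ruby
require 'open3'
require 'timeout'

# Hypothetical helper (not part of rhc or OpenShift): run a command and
# return [output, elapsed_seconds, status], where status is :ok,
# :failed (non-zero exit) or :timeout (limit exceeded, process sent TERM).
def timed_run(cmd, limit = 480)
  started = Time.now
  output = ''
  status = :ok
  Open3.popen2e(cmd) do |stdin, out, wait_thr|
    stdin.close
    begin
      Timeout.timeout(limit) do
        output = out.read                 # blocks until the command closes its output
        status = :failed unless wait_thr.value.success?
      end
    rescue Timeout::Error
      begin
        Process.kill('TERM', wait_thr.pid)
      rescue Errno::ESRCH
        # the process already exited; nothing to kill
      end
      status = :timeout
    end
  end
  [output, Time.now - started, status]
end
```

In the load script, a call such as puts `rhc app-delete test#{t_count} --confirm` could then be replaced by out, secs, status = timed_run("rhc app-delete test#{t_count} --confirm") followed by a log line containing secs and status.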
We have also increased the minimum UID used for application gears so that it does not collide with LDAP authentication in our environment. Here are the changes we made:
Increase the start UID and the associated parameters on the broker and on the node:
[root@broker plugins.d]# diff openshift-origin-msg-broker-mcollective.conf openshift-origin-msg-broker-mcollective.conf.20140405
15c15
< DISTRICTS_FIRST_UID=50001
---
> DISTRICTS_FIRST_UID=1000
26,28d25
< GEAR_MIN_UID=50001 # Lower bound of UID used to create gears
< GEAR_MAX_UID=56000 # Upper bound of UID used to create gears
< UID_BEGIN=50001
[root@node1 openshift]# diff node.conf node.conf.org
26,31c26,27
< GEAR_MIN_UID=50001 # Lower bound of UID used to create gears
< GEAR_MAX_UID=56000 # Upper bound of UID used to create gears
< UID_BEGIN=50001
<
---
> GEAR_MIN_UID=1000 # Lower bound of UID used to create gears
> GEAR_MAX_UID=6999 # Upper bound of UID used to create gears
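To sanity-check that the broker and node values shown in the diffs agree with each other, something like the following sketch could be used. It assumes the files contain simple KEY=VALUE lines with optional trailing # comments, and it assumes that DISTRICTS_FIRST_UID is expected to fall inside the node's GEAR_MIN_UID..GEAR_MAX_UID range; that rule is an assumption made for illustration, not a documented OpenShift requirement. The helper names parse_conf and uid_problems are hypothetical, and the sample values are copied from the diffs above.

```ruby
# Parse simple KEY=VALUE lines, ignoring blank lines and "#" comments.
# (Sketch only: a value that itself contains "#" would be cut short.)
def parse_conf(text)
  text.each_line.each_with_object({}) do |line, conf|
    line = line.sub(/#.*/, '').strip
    next if line.empty?
    key, value = line.split('=', 2)
    next if value.nil?
    conf[key.strip] = value.strip.delete('"')
  end
end

# Return a list of human-readable problems; an empty list means the
# values agree under the assumption stated in the lead-in.
def uid_problems(broker, node)
  min   = Integer(node.fetch('GEAR_MIN_UID'))
  max   = Integer(node.fetch('GEAR_MAX_UID'))
  first = Integer(broker.fetch('DISTRICTS_FIRST_UID'))
  problems = []
  problems << "GEAR_MIN_UID #{min} is above GEAR_MAX_UID #{max}" if min > max
  unless (min..max).cover?(first)
    problems << "DISTRICTS_FIRST_UID #{first} is outside #{min}..#{max}"
  end
  problems
end

# Sample input taken from the modified files in the diffs above.
broker = parse_conf("DISTRICTS_FIRST_UID=50001\n")
node = parse_conf(
  "GEAR_MIN_UID=50001 # Lower bound of UID used to create gears\n" \
  "GEAR_MAX_UID=56000 # Upper bound of UID used to create gears\n" \
  "UID_BEGIN=50001\n"
)
problems = uid_problems(broker, node)
puts problems.empty? ? 'UID settings agree' : problems
```

On a real system the strings would be replaced with File.read of the broker plugin configuration and the node's node.conf.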
Environment
OpenShift Enterprise (OSE) 1.2