In some scenarios, the OpenStack neutron agent-list output shows the agent status as xxx (down) even though you can see that the neutron agent services are up and running on the network and compute nodes. You may also see the status fluctuate if you run the agent-list command repeatedly. Confusing, right?
The problem is not the actual agent status but two default settings in neutron.conf, namely agent_down_time and report_interval. Together they determine how Neutron decides whether an agent is alive. There is a bug reported for this issue:
https://bugs.launchpad.net/neutron/+bug/1293083
As per the details in the bug: "report_interval is how often an agent sends out a heartbeat to the service. The Neutron service responds to these 'report_state' RPC messages by updating the agent's heartbeat DB record. The last heartbeat is then compared to the configured agent_down_time to determine if the agent is up or down."
The neutron agent-list command uses the agent_down_time value to decide which status to display. The default values are set very low, so a heartbeat that arrives even slightly late is enough for the alive status to show as down or to fluctuate between runs.
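The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Neutron's actual code: the function name is_agent_alive is hypothetical, and the 10-second delay and the 5-second down time are made-up values chosen to show the effect.

```python
from datetime import datetime, timedelta


def is_agent_alive(last_heartbeat, now, agent_down_time):
    """Hypothetical helper: an agent counts as alive if its last
    heartbeat is no older than agent_down_time seconds."""
    return (now - last_heartbeat) <= timedelta(seconds=agent_down_time)


# Assumed scenario: the last heartbeat reached the server 10 seconds ago,
# for example because the RPC message was delayed.
now = datetime(2014, 1, 1, 12, 0, 0)
last_heartbeat = now - timedelta(seconds=10)

# With a very low (made-up) down time the agent is reported as down,
# with the suggested 75 seconds the same agent is reported as alive.
print(is_agent_alive(last_heartbeat, now, agent_down_time=5))
print(is_agent_alive(last_heartbeat, now, agent_down_time=75))
```

The agent process is equally healthy in both calls. Only the threshold changes, which is why the status can flip between runs of agent-list when the threshold is close to the heartbeat interval.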
Solution: As suggested in the fix for the bug, set agent_down_time to 75 seconds and report_interval to 30 seconds in neutron.conf. With these values the heartbeat (report_state RPC) problem seen with the Open vSwitch agent on the compute nodes no longer affects the reported status, and all the agents are shown as alive.
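As a rough sanity check for the new values, the sketch below parses a neutron.conf-style snippet and verifies that agent_down_time is at least twice report_interval, which is the relationship the Neutron configuration comments recommend as far as I know. The section placement (agent_down_time under [DEFAULT], report_interval under [agent]) is an assumption based on common neutron.conf layouts and may differ in your release, and check_heartbeat_settings is a hypothetical helper, not part of Neutron.

```python
import configparser

# Assumed layout of the relevant part of neutron.conf.
SAMPLE_CONF = """
[DEFAULT]
agent_down_time = 75

[agent]
report_interval = 30
"""


def check_heartbeat_settings(conf_text):
    """Hypothetical helper: return (agent_down_time, report_interval, ok),
    where ok is True if the down time is at least twice the interval."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    down_time = parser.getint("DEFAULT", "agent_down_time")
    interval = parser.getfloat("agent", "report_interval")
    return down_time, interval, down_time >= 2 * interval


down_time, interval, ok = check_heartbeat_settings(SAMPLE_CONF)
print(down_time, interval, ok)
```

With 75 and 30 an agent can miss one full heartbeat and still be reported as alive, which is what removes the fluctuation.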