How to validate an OpenDaylight cluster?


Environment

  • Red Hat OpenStack Director v13
  • OpenDaylight Oxygen

Issue

  • How to verify the OpenDaylight cluster in Red Hat OpenStack Platform 13?

Resolution

  • Red Hat OpenStack Platform 13 supports High Availability clustering for both neutron and the OpenDaylight controller.
  • The OpenDaylight role is composable, so it can be deployed on the same nodes as the neutron nodes, or on separate nodes.
  • All nodes stay synchronized with each other. In the OVSDB (Open vSwitch Database) southbound plugin, the available controller nodes share the Open vSwitch instances, so that each switch is handled by a specific node in the cluster.
  • Because Red Hat OpenStack Platform director deploys the OpenDaylight controller nodes, it has all the information required to configure clustering for OpenDaylight.
  • The OpenDaylight cluster runs in active/active mode, distributed across the control plane network.
  • There are no specific networking requirements to support the cluster, such as bonding, MTUs, and so on.
  • The nodes also need a module-shards.conf file that describes how data is replicated in the cluster. The Red Hat OpenStack Platform director makes the correct settings based on the selected deployment configuration.
  • To set up a cluster with multiple nodes, we recommend that you use a minimum of three machines.

  • Verify the installation status of the odl-mdsal-clustering feature in the Karaf shell:

    opendaylight-user@root>feature:list -i | grep odl-mdsal-clustering
    odl-mdsal-clustering-commons                    | 1.7.3.redhat-1   |          | Started | odl-controller-1.7.3.redhat-1                   | odl-mdsal-clustering-commons
    
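  • The three-node recommendation above follows from Raft-style majority voting: the cluster needs a strict majority of members to agree, so two nodes tolerate no failures while three tolerate one. A minimal sketch of the arithmetic (illustrative, not part of the OpenDaylight code):

```python
# Why three nodes: Raft-style clustering needs a majority of members to agree,
# so cluster size determines how many node failures can be tolerated.

def quorum(n):
    """Votes needed for a majority in a cluster of n members."""
    return n // 2 + 1

def tolerated_failures(n):
    """Number of nodes that can fail while the cluster still reaches quorum."""
    return n - quorum(n)

for n in (1, 2, 3, 5):
    print("%d nodes: quorum %d, tolerates %d failure(s)"
          % (n, quorum(n), tolerated_failures(n)))
```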

akka.conf

  • The akka.conf file depends only on the nodes, while the module-shards.conf file depends on both the nodes and the installed datastores (and hence the installed features, which the deployment largely controls).
  • Each OpenDaylight node needs an akka.conf configuration file that identifies the node's role (its name in the cluster) and lists at least some of the other nodes in the cluster, the seed nodes.
  • Run the following command to review akka.conf in the opendaylight_api container:

    # docker exec -it -u root opendaylight_api cat /opt/opendaylight/configuration/initial/akka.conf
    odl-cluster-data {
      akka {
        remote {
          artery {
            enabled = off
            canonical.hostname = "xxx.xx.x.22"
            canonical.port = 2550
          }
          netty.tcp {
            hostname = "xxx.xx.x.22"
            port = 2550
          }
        }
        cluster {
          # Remove ".tcp" when using artery.
          seed-nodes = [
          "akka.tcp://opendaylight-cluster-data@xxx.xx.x.22:2550",
          "akka.tcp://opendaylight-cluster-data@xxx.xx.x.12:2550",
          "akka.tcp://opendaylight-cluster-data@xxx.xx.x.13:2550",
          ]
          roles = ["member-0"]
        }
        persistence {
          # By default the snapshots/journal directories live in KARAF_HOME. You can choose to put it somewhere else by
          # modifying the following two properties. The directory location specified may be a relative or absolute path.
          # The relative path is always relative to KARAF_HOME.
          # snapshot-store.local.dir = "target/snapshots"
          # journal.leveldb.dir = "target/journal"
          journal {
            leveldb {
    # Set native = off to use a Java-only implementation of leveldb.
    # Note that the Java-only version is not currently considered by Akka to be production quality.
    # native = off
            }
          }
        }
      }
    }
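  • When troubleshooting membership problems, it can help to extract the seed-node addresses from akka.conf programmatically. A minimal sketch (the regex and the RFC 5737 documentation addresses in the sample are illustrative, standing in for the redacted IPs above):

```python
import re

def parse_seed_nodes(akka_conf_text):
    """Extract (host, port) pairs from the seed-nodes entries in akka.conf."""
    pattern = r'akka\.tcp://opendaylight-cluster-data@([^:"\s]+):(\d+)'
    return [(host, int(port)) for host, port in re.findall(pattern, akka_conf_text)]

# Sample text with documentation addresses standing in for real controller IPs:
sample = '''
seed-nodes = [
  "akka.tcp://opendaylight-cluster-data@192.0.2.22:2550",
  "akka.tcp://opendaylight-cluster-data@192.0.2.12:2550",
  "akka.tcp://opendaylight-cluster-data@192.0.2.13:2550",
]
'''
print(parse_seed_nodes(sample))
```

    Each node should list the same seed nodes, and each listed host should be reachable on port 2550 from every other node.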
    

module-shards.conf

  • The OpenDaylight nodes also need a module-shards.conf file that describes how data is replicated in the cluster.
  • The Red Hat OpenStack Platform director makes the correct settings based on the selected deployment configuration.
  • As noted above, the akka.conf file depends on the nodes, while the module-shards.conf file depends on the nodes and the installed datastores.
  • Run the following command to review module-shards.conf in the opendaylight_api container:

    # docker exec -it -u root opendaylight_api cat /opt/opendaylight/configuration/initial/module-shards.conf
    module-shards = [
      {
        name = "default"
        shards = [
          {
            name = "default"
            replicas = [
              "member-0",
              "member-1",
              "member-2",
            ]
          }
        ]
      },
    ]
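  • A quick sanity check is to confirm that each shard's replicas list names every cluster member. A regex-based sketch (the Python standard library has no HOCON parser, so this only handles the simple structure shown above):

```python
import re

def parse_replicas(module_shards_text):
    """Return the member names from the first replicas = [ ... ] block."""
    match = re.search(r'replicas\s*=\s*\[([^\]]*)\]', module_shards_text)
    if not match:
        return []
    return re.findall(r'"([^"]+)"', match.group(1))

# Sample mirroring the structure of the file shown above:
sample = '''
module-shards = [
  {
    name = "default"
    shards = [
      {
        name = "default"
        replicas = [ "member-0", "member-1", "member-2", ]
      }
    ]
  },
]
'''
members = parse_replicas(sample)
print(members)
assert len(members) == 3  # a three-node deployment should replicate to all members
```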
    

Note:

  • The OpenDaylight cluster is not defined dynamically, which means it does not adjust automatically. It is not possible to start a new node and connect it to an existing cluster by configuring the new node only; the cluster must be informed of node additions and removals through cluster administration.
  • The OpenDaylight cluster is based on a leader/followers model. One of the active nodes is elected as the leader, and the remaining active nodes become followers, as defined by the Raft consensus model.
  • In OpenDaylight, if a node loses its connection to the cluster, its local transactions no longer make forward progress.
  • Eventually, the OpenDaylight process times out (10 minutes by default) and the front-end actor stops.
  • Cluster communications do not tolerate high latencies, but latencies on the order of those within a data center are acceptable.

Cluster Monitoring:

  • OpenDaylight exposes shard information via MBeans, which are exposed over a REST API using Jolokia, provided by the odl-jolokia Karaf feature. This is convenient, given OpenDaylight's significant focus on REST.
  • To monitor the status of the cluster, you must enable Jolokia support in OpenDaylight:

    feature:list -i | grep odl-jolokia
    odl-jolokia                                     | 1.10.3.redhat-1  | x        | Started | odl-extras-1.10.3.redhat-1                      | Jolokia JMX/HTTP bridge
    
  • The following REST call provides a list of all available MBeans:

    # curl http://xxx.xx.x.10:8081/jolokia/list | python -m json.tool
    {
        "request": {
            "type": "list"
        },
        "status": 200,
        "timestamp": 1516536035,
        "value": {
            "JMImplementation": {
                "type=MBeanServerDelegate": {
                    "attr": {
                        "ImplementationName": {
                            "desc": "The JMX implementation name (the name of this product)",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "ImplementationVendor": {
                            "desc": "the JMX implementation vendor (the vendor of this product).",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "ImplementationVersion": {
                            "desc": "The JMX implementation version (the version of this product).",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "MBeanServerId": {
                            "desc": "The MBean server agent identification",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "SpecificationName": {
                            "desc": "The full name of the JMX specification implemented by this product.",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "SpecificationVendor": {
                            "desc": "The vendor of the JMX specification implemented by this product.",
                            "rw": false,
                            "type": "java.lang.String"
                        },
                        "SpecificationVersion": {
                            "desc": "The version of the JMX specification implemented by this product.",
                            "rw": false,
                            "type": "java.lang.String"
                        }
                    },
                    "desc": "Represents  the MBean server from the management point of view."
                }
            },
    
    <..trim..>
    }
    
  • To read information about the shards local to the queried OpenDaylight instance, use the following REST calls. For the config datastore:

    # curl http://xxx.xx.x.10:8081/jolokia/read/org.opendaylight.controller:type=DistributedConfigDatastore,Category=ShardManager,name=shard-manager-config | python -m json.tool
    {
        "request": {
            "mbean": "org.opendaylight.controller:Category=ShardManager,name=shard-manager-config,type=DistributedConfigDatastore",
            "type": "read"
        },
        "status": 200,
        "timestamp": 1516536067,
        "value": {
            "LocalShards": [
                "member-0-shard-default-config",
                "member-0-shard-prefix-configuration-shard-config"
            ],
            "MemberName": "member-0",
            "SyncStatus": true
        }
    }
    
  • The following REST call queries the operational datastore; the output contains information on the shards present on the node:

    # curl http://xxx.xx.x.10:8081/jolokia/read/org.opendaylight.controller:type=DistributedOperationalDatastore,Category=ShardManager,name=shard-manager-operational | python -m json.tool
    {
        "request": {
            "mbean": "org.opendaylight.controller:Category=ShardManager,name=shard-manager-operational,type=DistributedOperationalDatastore",
            "type": "read"
        },
        "status": 200,
        "timestamp": 1516536119,
        "value": {
            "LocalShards": [
                "member-0-shard-default-operational",
                "member-0-shard-prefix-configuration-shard-operational",
                "member-0-shard-entity-ownership-operational"
            ],
            "MemberName": "member-0",
            "SyncStatus": true
        }
    }
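  • The two shard-manager checks above can be automated across all controllers by asserting that each node reports "SyncStatus": true for both datastores. A hedged sketch (the controller addresses in the commented loop are placeholders for your own):

```python
import json
from urllib.request import urlopen

def shard_manager_url(host, store):
    """Build the Jolokia shard-manager read URI for 'config' or 'operational'."""
    datastore = {"config": "DistributedConfigDatastore",
                 "operational": "DistributedOperationalDatastore"}[store]
    return ("http://%s:8081/jolokia/read/org.opendaylight.controller:"
            "type=%s,Category=ShardManager,name=shard-manager-%s"
            % (host, datastore, store))

def is_in_sync(jolokia_reply):
    """A node is healthy when the read succeeded and SyncStatus is true."""
    return (jolokia_reply.get("status") == 200
            and jolokia_reply.get("value", {}).get("SyncStatus") is True)

# Against a live cluster (replace the placeholder addresses with your controllers):
# for host in ("192.0.2.10", "192.0.2.12", "192.0.2.13"):
#     for store in ("config", "operational"):
#         reply = json.load(urlopen(shard_manager_url(host, store)))
#         print(host, store, "in sync" if is_in_sync(reply) else "OUT OF SYNC")
```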
    
  • The exact names from the "LocalShards" lists are needed for further exploration, as they are used as part of the URI to look up detailed information on a particular shard.
  • The output helps identify the shard state (leader/follower, voting/non-voting), its peers, follower details if the shard is the leader, and other statistics and counters.
  • Example output for member-0-shard-default-operational looks like this:

    # curl http://xxx.xx.x.10:8081/jolokia/read/org.opendaylight.controller:Category=Shards,name=member-0-shard-default-operational,type=DistributedOperationalDatastore | python -m json.tool
    {
        "request": {
            "mbean": "org.opendaylight.controller:Category=Shards,name=member-0-shard-default-operational,type=DistributedOperationalDatastore",
            "type": "read"
        },
        "status": 200,
        "timestamp": 1516536153,
        "value": {
            "AbortTransactionsCount": 0,
            "CommitIndex": 39188277,
            "CommittedTransactionsCount": 0,
            "CurrentTerm": 1,
            "FailedReadTransactionsCount": 0,
            "FailedTransactionsCount": 0,
            "FollowerInfo": [],
            "FollowerInitialSyncStatus": true,
            "InMemoryJournalDataSize": 28300599,
            "InMemoryJournalLogSize": 1778,
            "LastApplied": 39188277,
            "LastCommittedTransactionTime": "1970-01-01 00:00:00.000",
            "LastIndex": 39188278,
            "LastLeadershipChangeTime": "2017-12-29 09:38:46.750",
            "LastLogIndex": 39188278,
            "LastLogTerm": 1,
            "LastTerm": 1,
            "Leader": "member-2-shard-default-operational",
            "LeadershipChangeCount": 1,
            "PeerAddresses": "member-1-shard-default-operational: , member-2-shard-default-operational: akka.tcp://opendaylight-cluster-data@xxx.xx.x.13:2550/user/shardmanager-operational/member-2-shard-default-operational",
            "PeerVotingStates": "member-1-shard-default-operational: true, member-2-shard-default-operational: true",
            "PendingTxCommitQueueSize": 0,
            "RaftState": "Follower",
            "ReadOnlyTransactionCount": 0,
            "ReadWriteTransactionCount": 0,
            "ReplicatedToAllIndex": 19330656,
            "ShardName": "member-0-shard-default-operational",
            "SnapshotCaptureInitiated": false,
            "SnapshotIndex": 39186500,
            "SnapshotTerm": 1,
            "StatRetrievalError": null,
            "StatRetrievalTime": "17.48 ms",
            "TxCohortCacheSize": 0,
            "VotedFor": "member-2-shard-default-operational",
            "Voting": true,
            "WriteOnlyTransactionCount": 0
        }
    }
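  • For a healthy shard, every member should report the same "Leader", exactly one member should report "RaftState": "Leader", and all members should be voting. A sketch that applies this check to the per-node "value" dicts collected from the call above (the sample data is illustrative, trimmed to the fields the check uses):

```python
def check_shard_health(node_values):
    """Given the Jolokia 'value' dict for the same shard as reported by every
    node, verify that all members agree on one leader, exactly one node is in
    the Leader Raft state, and every member is voting."""
    leaders = {v.get("Leader") for v in node_values}
    raft_leaders = [v for v in node_values if v.get("RaftState") == "Leader"]
    all_voting = all(v.get("Voting", False) for v in node_values)
    return len(leaders) == 1 and len(raft_leaders) == 1 and all_voting

# Illustrative per-node values, trimmed to the fields the check uses:
healthy = [
    {"Leader": "member-2-shard-default-operational", "RaftState": "Follower", "Voting": True},
    {"Leader": "member-2-shard-default-operational", "RaftState": "Follower", "Voting": True},
    {"Leader": "member-2-shard-default-operational", "RaftState": "Leader", "Voting": True},
]
print(check_shard_health(healthy))
```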
    

Note:

  • The OpenDaylight Integration team maintains a Python-based tool that takes advantage of the above MBeans exposed via Jolokia, and the system metrics project offers a DLUX-based UI to display the same information.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
