Cluster Control Protocol Reference


  • 7/24/2019 Cluster Control Protocol Reference


    Cluster Control Protocol Reference NG FP3

    For additional technical information about Check Point products, consult the Check Point SecureKnowledge database at http://support.checkpoint.com/kb/


    Preface

    Introduction

    This document explores various technical aspects of the Cluster Control Protocol (CCP) as used by ClusterXL. Although parts of the Cluster Control Protocol are also used by OPSEC High Availability products, this aspect will not be covered.

    This document is not meant as an installation guide, and assumes the reader has a working knowledge of the ClusterXL product.

    Overview

    Introducing ClusterXL into an existing network often requires changes to that network. Alterations are needed to accommodate both the clustered topology and the Cluster Control Protocol itself.

    Understanding these changes is important for planning, implementation, monitoring, and troubleshooting.

    Moreover, an adequate understanding of the ClusterXL decision-making process is needed. Otherwise, there will be no context in which to place observed cluster behavior.

    Therefore, the following sections explore these topics:

    Implementation Planning

    Cluster Control Protocol Overview

    Cluster Control Protocol Logic


    Implementation Planning

    NOTE: Due to the enhanced Hot/Standby configuration available in FP3, Legacy HA will not be covered in the following sections.

    High Availability (New Mode)

    FP3 introduces a new form of operation for High Availability. Simply referred to as "New Mode", it offers all the topology advantages of Load Sharing while maintaining a Hot/Standby orientation.

    Important factors to consider while planning for a New Mode implementation are:

    Switch support/configuration for layer two multicast forwarding

    VLAN configuration

    IP address migration

    SmartCenter/CMA location

    Switch Support

    The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all CCP packets sent on "non-secured" interfaces.

    A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is considered more efficient to forward only to those ports connecting cluster members.

    The steps needed to enable multicast support vary according to the switch vendor and model. Please check your switch documentation for details.

    If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To toggle between these two modes, use the following command, giving either broadcast or multicast as the argument (the mode survives a reboot):

    cphaconf set_ccp broadcast/multicast

    VLAN Configuration

    It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause the connecting switch ports to flap. If such a need exists, a separate VLAN and/or switch will be needed for each cluster.

    A HotFix is also available from Check Point support which allows this configuration to be supported in FP3.

    Connecting together the secured interfaces of multiple clusters is also not recommended, for the same reason. While the above-mentioned HotFix may be used, there are additional concerns with this configuration which make it currently

    t bl Th f it i b t t t th d i t f f i l t i li k h


    IP Address Migration

    It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an existing single gateway configuration.

    In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered landscape. Therefore, it is recommended to take the existing IP addresses from the current gateway and make these cluster-owned VIPs, or cluster addresses, when feasible. Doing so will avoid altering current IPSec endpoint identities, as well as keep Hide NAT configurations the same in many cases.

    SmartCenter/CMA Location

    A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the recipient. This is true regardless of the IP address(es) of the cluster object itself.

    This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the SmartCenter/CMA Server can now reside on any given IP segment. One only needs to ensure that the general tab IP address of the cluster member object is reachable. If not, simply choose one which will be accessible to the SmartCenter/CMA Server.


    Load Sharing

    Important factors to consider while planning for a Load Sharing implementation are:

    Switch support/configuration for layer two multicast forwarding

    Router support for multicast

    VLAN configuration

    IP address migration

    SmartCenter/CMA location

    Switch Support

    The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all CCP packets sent on "non-secured" interfaces.

    A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is considered more efficient to forward only to those ports connecting cluster members.

    The steps needed to enable multicast support vary according to the switch vendor and model. Please check your switch documentation for details.

    If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To toggle between these two modes, use the following command, giving either broadcast or multicast as the argument (the mode survives a reboot):

    cphaconf set_ccp broadcast/multicast

    Router Support

    In addition to the use of multicast by CCP, Load Sharing associates a multicast MAC with each configured cluster IP. This design ensures that traffic destined to the cluster is received by all members.

    Therefore, ARP replies sent by a cluster member will indicate that the unicast cluster IP is reachable via a multicast MAC. Some routing devices are incapable of receiving such ARP replies. For instance, no version of Cisco IOS includes such support. In such cases, adding a static ARP entry for the cluster IP on the routing device will solve the problem.

    Even still there are some routers such as Extreme routers, Avia routers, and some Nortel models (Passport 1200 or X

    which will not accept this type of static ARP entry. For those cases, ClusterXL FP4 will introduce a new mode of operation referred to as Pivot mode. Pivot mode operates as a Load Sharing cluster, but without the need for multicast on the cluster addresses.

    VLAN Configuration

    It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause the connecting switch ports to flap.


    IP Address Migration

    It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an existing single gateway configuration.

    In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered landscape. Therefore, it is recommended to take the existing IP addresses from the current gateway and make these cluster-owned VIPs, or cluster addresses, when feasible. Doing so will avoid altering current IPSec endpoint identities, as well as keep Hide NAT configurations the same in many cases.

    SmartCenter/CMA Location

    A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the recipient. This is true regardless of the IP address(es) of the cluster object itself.

    This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the SmartCenter/CMA Server can reside on any given IP segment. One only needs to ensure that the general tab IP address of the cluster member object is reachable. If not, simply choose one which will be accessible to the SmartCenter/CMA Server.


    Cluster Control Protocol

    CCP Overview

    The Cluster Control Protocol plays an integral role in the operation of ClusterXL. Specifically, CCP is responsible for the following:

    Health status reports

    Cluster member probing

    State change commands

    Querying for cluster membership

    State table synchronization

    Health Status Reports

    CCP will report the status of a cluster member roughly three times a second, per interface. These reports contain the state of the transmitting cluster member, as well as the presumed state of other cluster members.

    Cluster Member Probing

    If a cluster member fails to receive status for another member on a given segment, CCP will probe that segment in an attempt to elicit a response. The purpose of such probes is to detect the nature of possible interface failures, and to determine which module has the problem. The outcome of this probe will determine what action is taken next.

    State Change Commands

    If a cluster member wishes to change state, the command to do so takes place on the defined secured interface.

    Querying Cluster Membership

    When a cluster member comes online, such as after a reboot, it will send a series of CCP query/response messages to gain knowledge of its cluster membership.

    State Table Synchronization

    When state synchronization is enabled, connection information is updated between cluster members on the defined secured interface.


    CCP Message Format

    The Cluster Control Protocol payload is made up of a general heading and one of a series of message types, each having its own unique purpose, format, and content.

    General Heading

    This portion contains information necessary for the processing of the encapsulated message type, the most important of which is:

    Cluster ID - unique identifier shared amongst all members of a given cluster

    Protocol Version - version and Feature Pack revision

    Source Interface - transmitting interface number as recognized by the OS kernel

    Source Machine ID - member identification according to configured priority, calculated as ID = priority - 1

    Policy ID - Last two bytes of MD4 Policy ID

    Message Types

    Below is a complete listing of the possible CCP message types with description:

    FWHA_MY_STATE - Report source machine's state

    FWHA_Query_STATE - Query other machine's state

    FWHA_IF_PROBE_REQ - Interface active check request

    FWHA_IF_PROBE_RPLY - Interface active check reply

    FWHA_IFCONF_REQ - Interface configuration request

    FWHA_IFCONF_REPLY - Interface configuration reply

    FWHA_POLICY_CHANGE - Policy ID change request/notification

    FWHAP_SYNC - New Sync packet
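
    As a rough illustration of the general heading fields and message types listed above, the following Python sketch models them as plain data. The class names, field names, and the helper machine_id_from_priority are hypothetical, the enum values are placeholders rather than wire values, and nothing here reflects the actual byte layout of a CCP packet, which this document does not specify.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CcpMessageType(Enum):
    """Message type names as listed above; the values are placeholders only."""
    FWHA_MY_STATE = auto()       # report source machine's state
    FWHA_Query_STATE = auto()    # query other machine's state
    FWHA_IF_PROBE_REQ = auto()   # interface active check request
    FWHA_IF_PROBE_RPLY = auto()  # interface active check reply
    FWHA_IFCONF_REQ = auto()     # interface configuration request
    FWHA_IFCONF_REPLY = auto()   # interface configuration reply
    FWHA_POLICY_CHANGE = auto()  # policy ID change request/notification
    FWHAP_SYNC = auto()          # new sync packet


@dataclass(frozen=True)
class CcpHeader:
    """Hypothetical container for the general heading fields (no wire layout implied)."""
    cluster_id: int
    protocol_version: int
    source_interface: int
    source_machine_id: int
    policy_id: int


def machine_id_from_priority(priority: int) -> int:
    """Machine ID is described as the configured priority minus one."""
    if priority < 1:
        raise ValueError("priority is assumed to start at 1")
    return priority - 1


header = CcpHeader(
    cluster_id=1,            # illustrative value
    protocol_version=0,      # illustrative value
    source_interface=0,      # illustrative value
    source_machine_id=machine_id_from_priority(1),
    policy_id=0,             # illustrative value
)
print(header.source_machine_id)  # 0
```

    With a configured priority of 1, the sketch yields Machine ID 0, which matches the primary member in the transmission examples.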


    CCP Transmission

    Non-Secured Interfaces

    For interfaces not defined as secured (non-synchronization interfaces), CCP transmits its packets by default with layer two multicast. The addressable fields are as follows:

    Source MAC - 00:00:00:00:fe:<Machine ID>

    Source IP - 0.0.0.0

    Destination MAC - 01:00:5e:<last three octets of the cluster IP>

    Destination IP - network broadcast address

    As an example, let's assume a scenario in which New Mode is being used. On a given segment, the cluster IP is 10.3.220.103/27. CCP packets sent by the highest priority machine will look like this:

    0:0:0:0:fe:0 1:0:5e:3:dc:67 ip 78: 0.0.0.0 > 10.3.220.96

    Here, 1:0:5e:3:dc:67 corresponds to the destination MAC, and indicates that the OUI is multicast (1:0:5e:), with the remaining 3:dc:67 corresponding to the last three octets of the cluster address (3.220.103).

    The last octet of the source address, 0:0:0:0:fe:0, indicates the Machine ID of the transmitting member, in this case the primary. For a second or third member, this source address will reflect the member's priority, such as:

    0:0:0:0:fe:1

    0:0:0:0:fe:2

    Using this design, both the destination MAC and the destination IP address will change per IP segment, according to both the cluster and network address.
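
    The address derivation described above can be sketched in Python as follows. The function name ccp_addresses is hypothetical. The destination IP is computed as the segment's network address because that is what the sample packet shows (10.3.220.96 for 10.3.220.103/27); treat that choice as an assumption drawn from the example rather than a statement of the protocol.

```python
import ipaddress


def ccp_addresses(cluster_cidr: str, machine_id: int) -> dict:
    """Derive the default CCP addressing for a non-secured interface (sketch)."""
    if not 0 <= machine_id <= 0xFF:
        raise ValueError("machine_id must fit in one octet")
    iface = ipaddress.ip_interface(cluster_cidr)
    if iface.version != 4:
        raise ValueError("only IPv4 cluster addresses are handled here")
    # Last three octets of the cluster IP follow the 01:00:5e multicast prefix.
    last_three = iface.ip.packed[1:]
    dst_mac = ":".join(f"{b:02x}" for b in (0x01, 0x00, 0x5E, *last_three))
    # The last octet of the source MAC carries the Machine ID.
    src_mac = f"00:00:00:00:fe:{machine_id:02x}"
    return {
        "src_mac": src_mac,
        "src_ip": "0.0.0.0",
        "dst_mac": dst_mac,
        # Assumption: follows the sample packet, which targets the network address.
        "dst_ip": str(iface.network.network_address),
    }


example = ccp_addresses("10.3.220.103/27", machine_id=0)
print(example["src_mac"], example["dst_mac"], example["dst_ip"])
# 00:00:00:00:fe:00 01:00:5e:03:dc:67 10.3.220.96
```

    The output uses zero-padded octets, so 01:00:5e:03:dc:67 is the same address as the 1:0:5e:3:dc:67 shown in the sample packet.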

    Secured Interfaces

    For interfaces defined as secured (synchronization interfaces), CCP transmits by default as follows:

    Source MAC - 00:00:00:00:fe:<Machine ID>

    Source IP - 0.0.0.0

    Destination MAC - ff:ff:ff:ff:ff:ff (all hosts broadcast)

    Destination IP - network broadcast address

    Port usage

    CCP uses UDP as the transmission protocol, with both the source and destination port set to 8116. This is true irrespective of the interface type.
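
    When reading a packet capture, the port and destination MAC conventions above are enough for a first-pass classification. The following Python sketch is one way to express that; the function names are hypothetical, and the check is only a heuristic, since other traffic could in principle use the same port.

```python
from typing import Optional

CCP_UDP_PORT = 8116
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"
MULTICAST_PREFIX = "01:00:5e:"


def normalize_mac(mac: str) -> str:
    """Zero-pad each octet so 1:0:5e:3:dc:67 becomes 01:00:5e:03:dc:67."""
    return ":".join(f"{int(part, 16):02x}" for part in mac.split(":"))


def classify_ccp_frame(dst_mac: str, src_port: int, dst_port: int) -> Optional[str]:
    """Return the CCP transmission style of a UDP frame, or None if not CCP."""
    if src_port != CCP_UDP_PORT or dst_port != CCP_UDP_PORT:
        return None
    mac = normalize_mac(dst_mac)
    if mac == BROADCAST_MAC:
        # Default on secured interfaces, or a non-secured interface set to broadcast.
        return "broadcast"
    if mac.startswith(MULTICAST_PREFIX):
        # Default on non-secured interfaces.
        return "multicast"
    return "unknown"


print(classify_ccp_frame("1:0:5e:3:dc:67", 8116, 8116))  # multicast
```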


    ClusterXL Decision Logic

    Topology

    The following section will explore the logic utilized by ClusterXL. We will do so by looking at several common failures, and how ClusterXL responds to such scenarios. A separate section will be dedicated to each of New Mode High Availability and Load Sharing configurations.

    Note: The following examples are given in general terms, and do not represent a per-packet analysis.

    The topology represented by Figure 1.1 will be assumed.

    Figure 1.1

    Topology Legend

    Dallab_Cluster - ClusterXL cluster

    P1_Primary - Managing CMA

    Net_10.3.220.96 - External segment

    Net_10.2.220.96 - Admin network

    Net_10.1.220.96 - Corporate network


    High Availability (New Mode)

    Interface Failure

    This scenario assumes two cluster members, in which the external interface of the primary has failed.

    1. At the point of failure, the primary will recognize that no CCP messages have been heard on the failed interface. As such, it will announce via FWHA_MY_STATE on all other segments that there may be an issue in the inbound direction with one of its interfaces.

    2. The primary will also note that no CCP responses have been received on the failed interface. This causes the primary to then announce on all other segments, via FWHA_MY_STATE, that the outbound direction for one of its interfaces is in question as well.

    3. At the same time as the above events, the secondary will recognize that no CCP packets have been received, and begins sending FWHA_IF_PROBE_REQ messages on the affected segment. In addition, the secondary will attempt ARP requests to hosts belonging to the affected segment, and will begin pinging those hosts which respond. This is done in an attempt to diagnose which member has the problem.

    The pings will continue as long as we cannot identify by other means (i.e. CCP packets) that the interface is alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are present, such pings will only be issued if all other cluster members do not respond to CCP probing.

    4. Since no FWHA_IF_PROBE_RPLY message is received as a response, but the ping requests are being answered, the secondary concludes that its own interfaces are up and working, and that the interface of the primary has failed. Therefore, it announces via FWHA_MY_STATE that all of its own interfaces are operational.

    5. With this report from the secondary, the primary concludes the issue is with its own interface, and moves to the "Down/Dead" status.

    6. The secondary issues gratuitous ARPs for both the physical and cluster address per IP segment, and moves to the "Active/Active-Attention" state.
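
    The reasoning the secondary applies in steps 3 and 4 can be summarized as a small decision function. This is a simplified Python sketch, not the actual ClusterXL implementation: the function name and return strings are hypothetical, and the final branch (neither a probe reply nor a ping answer) is not covered by the scenario above, so it is reported as inconclusive.

```python
def diagnose_silent_segment(probe_reply_received: bool, ping_answered: bool) -> str:
    """Summarize what a member can infer about a segment with no CCP traffic."""
    if probe_reply_received:
        # A peer answered the CCP probe, so the peer's interface is alive.
        return "peer interface alive"
    if ping_answered:
        # Other hosts answer, so the local interface works; the peer's has failed.
        return "local interface up, peer interface failed"
    # Not described in the scenario above; treated as undecided in this sketch.
    return "inconclusive"


# The scenario in steps 3 and 4: no probe reply, but pings are answered.
print(diagnose_silent_segment(probe_reply_received=False, ping_answered=True))
# local interface up, peer interface failed
```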


    Primary Reboot

    1. As the primary goes down, it changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE.

    2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous ARPs for both its physical IP and cluster IP for each clustered segment. This will update all necessary hosts/routers on each segment with the relevant updated MAC address information.

    3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all connections.

    4. Though the primary is now considered "Down/Dead", it will still be able to send/receive CCP packets until its interfaces are brought down. Once this occurs, the secondary, which is now in the "Active/Active-Attention" state, will take notice of the fact that no CCP packets are being received.

    5. The secondary will do several things in an effort to ascertain why no CCP packets are being received. First, it sends FWHA_IF_PROBE_REQ packets on all segments in which no CCP packets have been heard. This is to elicit a response from any member capable of responding. The secondary will also ARP on each segment for IPs belonging to that segment, and ping those hosts which respond.

    The pings will continue as long as we cannot identify by other means (i.e. CCP packets) that the interface is alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are present, such pings will only be issued if all other cluster members do not respond to CCP probing.

    6. Once the primary has rebooted, but before the policy is loaded, it will begin sending FWHA_IFCONF_REPLY packets regularly. It does so without knowing which cluster it belongs to, so a random ID is used.

    7. After the primary learns the cluster ID, it begins announcing FWHA_MY_STATE. The primary at this stage announces itself as "Down/Dead".

    8. The primary fetches the policy from another cluster member if possible, otherwise from the management server.

    9. The primary now initiates full synchronization on the secured interface via the FW1 protocol.

    10. Once synchronization is complete, the primary moves to the "Ready" state.

    11. The secondary acknowledges this by moving to the "Standby" state.

    12. Once this state has been acknowledged, the primary issues gratuitous ARPs for both the physical and cluster IP for each segment, and now moves to the "Active/Active-Attention" state.


    Registered Device Failure

    This scenario assumes two cluster members, in which the fwd daemon has failed on the primary.

    1. Once the fwd daemon has died, this is detected by ClusterXL, as the device is no longer reporting state. The primary changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE.

    2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous ARPs for both the physical and cluster IP for each segment. This will update all necessary hosts/routers on each segment with the relevant updated MAC address information.

    3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all connections.

    In this case, the primary is still able to send CCP hello packets, and will continue to do so. Because of this, the secondary will not make any attempts to diagnose interface-related issues, such as the pinging of hosts. This differs from an interface failure, where CCP messages would not be received, which would trigger such a diagnosis using our topology.


    Dual Failure

    This scenario assumes a dual failure by both cluster members of the secured (synchronization) interface connected via a crossover link.

    1. Since it is assumed that the secured interfaces are connected via a crossover link, the failure of one interface will bring down the line protocol of the other, resulting in a dual failure. Once this occurs, both members will become aware of this fact via CCP. Both members will announce as part of FWHA_MY_STATE that N-1 interfaces are up.

    2. Since both members have suffered the loss of a single interface, a decision must be made as to what action to take next. Bringing both members down will result in a total failure, but some level of disturbance has already occurred. The solution is for the highest priority member to remain in the "Active/Active-Attention" state, and for the secondary to report itself as "Down/Dead".

    3. The necessary state changes are made, and announced as part of FWHA_MY_STATE.

    4. Upon recovery of the secured link, the secondary will resume the "Standby" status.
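
    The tie-break in step 2 can be illustrated with a short Python sketch. The function name and member names are hypothetical. It assumes that a lower configured priority number means a higher priority (priority 1 being the primary), which is consistent with the Machine ID convention of priority minus one but is not stated explicitly for this scenario.

```python
from typing import Dict


def resolve_dual_failure(priorities: Dict[str, int]) -> Dict[str, str]:
    """Keep the highest priority member active when every member lost one interface."""
    if not priorities:
        raise ValueError("at least one cluster member is required")
    # Assumption: the lowest priority number is the highest priority member.
    winner = min(priorities, key=priorities.get)
    return {
        name: "Active/Active-Attention" if name == winner else "Down/Dead"
        for name in priorities
    }


states = resolve_dual_failure({"primary": 1, "secondary": 2})
print(states)
# {'primary': 'Active/Active-Attention', 'secondary': 'Down/Dead'}
```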


    Load Sharing

    The events carried out by CCP during various failures in Load Sharing mode closely resemble those covered thus far in the previous New Mode section. For this reason, a complete analysis will not be given. However, there are some important differences which should be noted.

    1. As opposed to New Mode, all members of a Load Sharing cluster will remain in the "Active/Active-Attention" state during normal operation.

    2. Upon the failure of a member, that member's state will be changed to "Down/Dead", while all other members will remain in "Active/Active-Attention", and continue to process connections.

    3. In New Mode, for the purposes of packet forwarding, each cluster address is associated with the corresponding physical MAC address of the active member. For this reason, it is necessary to issue gratuitous ARPs during a failure. This is not to be confused with the multicast MAC used by CCP for message transmission.

    However, in Load Sharing mode, the multicast MAC used per segment by CCP is also used as the MAC address for the purposes of packet forwarding. This is necessary to ensure that each cluster member receives every packet. Therefore, there will be no issuance of gratuitous ARPs for any cluster address during a failure.

    4. Although CCP will advertise the configured priority of the sending cluster member, these priority labels do not dictate a level of seniority in Load Sharing during normal operation as they do in New Mode configurations.
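
    The difference described in item 3 can be sketched in Python as follows. The function names, the mode strings "new_mode" and "load_sharing", and the sample physical MAC are hypothetical; the multicast MAC is built from the 01:00:5e prefix and the last three octets of the cluster IP, as in the CCP transmission examples.

```python
import ipaddress


def cluster_forwarding_mac(mode: str, cluster_ip: str, active_member_mac: str) -> str:
    """Return the MAC that hosts on the segment associate with the cluster IP."""
    if mode == "new_mode":
        # New Mode: the physical MAC of the currently active member.
        return active_member_mac
    if mode == "load_sharing":
        # Load Sharing: the per-segment multicast MAC also used by CCP.
        last_three = ipaddress.IPv4Address(cluster_ip).packed[1:]
        return ":".join(f"{b:02x}" for b in (0x01, 0x00, 0x5E, *last_three))
    raise ValueError(f"unknown mode: {mode}")


def needs_gratuitous_arp_on_failover(mode: str) -> bool:
    """Only New Mode must re-announce the cluster IP when the active member changes."""
    if mode not in ("new_mode", "load_sharing"):
        raise ValueError(f"unknown mode: {mode}")
    return mode == "new_mode"


print(cluster_forwarding_mac("load_sharing", "10.3.220.103", "00:a0:c9:00:00:01"))
# 01:00:5e:03:dc:67
```

    Because the Load Sharing MAC depends only on the cluster IP, it does not change when a member fails, which is why no gratuitous ARP is needed in that mode.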