Feb 27, 2018
An Active/Passive NFS Server
in a High Availability Cluster
Our EAP system originally used GlusterFS as its shared filesystem infrastructure. Over time, GlusterFS's performance and stability degraded, and the number of issues we encountered kept growing for reasons we were never able to determine.
As a BSS (business support system), EAP is designed to create and work with a large volume of small files, a workload that does not play to GlusterFS's strengths. Under certain application load patterns, the mounted GlusterFS file system became quite fragile and suffered severe performance degradation. At times, even listing a directory inside a mounted volume could trigger a crash with an 'out of memory' error.
The biggest challenge was finding a solution that met our high performance needs, which was no easy task given the stunning array of shared, network-attached, and distributed storage technologies available these days. We treated performance and stability as the crucial indicators for candidate solutions, and it took weeks of research and analysis to reach a decision.
We decided to completely replace our GlusterFS replicated volumes and implement a solution as simple as possible: an active/passive NFS server in a high availability cluster.
This setup uses high availability LVM volumes (HA-LVM) in a failover configuration, which is distinct from the well-known active/active solution based on the Clustered Logical Volume Manager (clvmd) and the Distributed Lock Manager (DLM).
In this use case, clients access the NFS file system through a floating IP address. If the node running the NFS server becomes inoperative, the NFS server starts up again on the second node of the cluster with minimal service interruption (less than 10 seconds).
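As a sketch of how such a failover group can be wired up in Pacemaker (all resource names, device paths, network ranges, and addresses below are hypothetical placeholders for your environment), the storage, NFS server, export, and floating IP can be placed in one resource group so they always move between nodes together:

```shell
# Activate the shared volume group exclusively on the active node only.
pcs resource create ha_lvm LVM volgrpname=nfs_vg exclusive=true --group nfs_group

# Mount the logical volume that holds the exported data.
pcs resource create nfs_fs Filesystem device=/dev/nfs_vg/nfs_lv \
    directory=/srv/nfs fstype=xfs --group nfs_group

# Run the NFS server, keeping its state on the shared storage so that
# client state survives a failover.
pcs resource create nfs_daemon nfsserver \
    nfs_shared_infodir=/srv/nfs/nfsinfo --group nfs_group

# Export the data directory to the application subnet.
pcs resource create nfs_export exportfs clientspec="10.0.0.0/24" \
    options=rw,sync,no_root_squash directory=/srv/nfs/export fsid=0 \
    --group nfs_group

# The floating IP through which clients reach the NFS server.
pcs resource create nfs_ip IPaddr2 ip=10.0.0.100 cidr_netmask=24 \
    --group nfs_group
```

Because the resources are in one group, Pacemaker starts them in order on a single node and relocates the whole group if that node fails.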
HA-LVM imposes the restriction that a logical volume can only be activated exclusively, i.e. on one machine at a time. This means that only local (non-clustered) implementations of the storage drivers are used, and avoiding the cluster coordination overhead in this way improves performance.
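One common way to enforce exclusive activation (a sketch based on the lvm.conf volume_list approach; the volume group and hostname below are placeholders) is to prevent the shared volume group from being auto-activated at boot on either node, leaving its activation entirely to the cluster's LVM resource agent:

```shell
# In /etc/lvm/lvm.conf on each node, list only the local volume groups
# (and/or a tag matching this host), so the shared VG "nfs_vg" is never
# activated outside the cluster's control:
#
#   volume_list = [ "rootvg", "@node1" ]
#
# Rebuild the initramfs afterwards so the restriction also applies during
# early boot:
dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
```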
Our two-node cluster is managed by Pacemaker (an advanced HA cluster resource manager), which takes care of resource availability, failure detection and automatic recovery. In case of a node failure, Pacemaker invokes a fencing agent (STONITH) to cut off the failed node's I/O to the shared storage and protect data integrity. Our cluster configuration uses SCSI persistent reservations as the fencing method, via the fence_scsi agent, to revoke a failed node's access to the shared storage devices.
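A minimal fence_scsi configuration might look like the following sketch (the node names and shared device path are hypothetical, and the device must support SCSI-3 persistent reservations, e.g. an iSCSI or Fibre Channel LUN):

```shell
# Register both cluster nodes with the fencing device; "unfencing" lets a
# rebooted node re-register its reservation key before rejoining.
pcs stonith create scsi_fence fence_scsi \
    pcmk_host_list="node1 node2" \
    devices=/dev/mapper/shared_lun \
    meta provides=unfencing
```

When a node is fenced, its reservation key is removed from the device, so any in-flight writes from the failed node are rejected by the storage itself.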
Migration was quite simple.
The new solution was built in parallel with the current production GlusterFS cluster. Files were copied and synced. The new NFS solution was extensively tested for performance and scalability.
Initial migration took place on unused application nodes, where various manual operations were tested. Full migration required only a small production outage, purely to ensure data integrity, along with a simple operation to swap mount points on each application node and related machines.
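The mount point swap on an application node can be sketched as follows (the paths and floating IP are hypothetical; the key point is that clients mount via the floating IP, so the mount stays valid across failovers):

```shell
# Unmount the old GlusterFS volume.
umount /mnt/shared

# In /etc/fstab, replace the glusterfs entry with an NFS entry such as:
#
#   10.0.0.100:/srv/nfs/export  /mnt/shared  nfs  vers=4,hard  0 0
#
# Then remount using the new entry; the application path is unchanged.
mount /mnt/shared
```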
Since the migration, our shared file system behaves much better in terms of performance and stability compared to our experience with the GlusterFS setup. So far, there have been no crashes or performance degradation under heavy application loads.