by Joseph Brady, Director of Business Development at Treehouse Software, Inc., and Dan Vimont, Director of Innovation at Treehouse Software, Inc.
Treehouse Software customers are using Rocket Data Replicate and Sync (RDRS) to run mission-critical Mainframe-to-AWS data replication pipelines. Some of these production pipelines provide vital near-real-time synchronization between source and target, and so cannot afford significant downtime when a failure occurs. Naturally, a number of our customers have asked for advice on setting up a high availability (HA) configuration for the RDRS components that run on Amazon EC2 instances. In response, Treehouse Software offers an HA Framework Professional Services engagement, in which our expert Cloud engineers help customers with delivery, setup, rapid deployment, and customization of an RDRS HA framework. The HA Framework allows a Failover EC2 instance to automatically pick up RDRS processing should the Primary instance (running in another Availability Zone) go down.
Setting Up Automatic Failover with EC2 Instances in Different Availability Zones
The core components of the RDRS HA Framework are two EC2 instances running in different Availability Zones: 1) a Primary EC2 instance and 2) a Failover EC2 instance. The two instances are identically configured and attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to pick up RDRS processing quickly should the Primary instance suddenly become unavailable.
A Step Function Automates the Failover Process
In the event of failure of the Primary instance, the HA Framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following:
- Verify that the Primary instance is unavailable. (The RDRS service cannot be active on both instances simultaneously, so this verification is vital.)
- Redirect all network traffic from the Primary instance to the Failover instance (via Route 53).
- Start RDRS processing on the Failover instance.
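The three steps above map naturally onto an Amazon States Language (ASL) state machine. The following is a minimal sketch, not the framework's actual definition: the state names, the Lambda ARNs, and the `primaryDown` output field are all hypothetical placeholders chosen for illustration. It builds the definition as a Python dictionary so it can be inspected or serialized to JSON.

```python
import json

# Hypothetical placeholder ARNs; substitute the Lambda functions in your own account.
VERIFY_ARN = "arn:aws:lambda:us-east-1:111122223333:function:verify-primary-down"
REDIRECT_ARN = "arn:aws:lambda:us-east-1:111122223333:function:redirect-dns"
START_ARN = "arn:aws:lambda:us-east-1:111122223333:function:start-rdrs"


def build_failover_definition() -> dict:
    """Return an ASL definition mirroring the three failover steps."""
    return {
        "Comment": "Illustrative RDRS failover flow (assumed names throughout)",
        "StartAt": "VerifyPrimaryUnavailable",
        "States": {
            # Step 1: confirm the Primary is really down before doing anything else.
            "VerifyPrimaryUnavailable": {
                "Type": "Task",
                "Resource": VERIFY_ARN,
                "ResultPath": "$.verify",
                "Next": "IsPrimaryDown",
            },
            # Guard: RDRS must never be active on both instances at once.
            "IsPrimaryDown": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.verify.primaryDown",
                        "BooleanEquals": True,
                        "Next": "RedirectTraffic",
                    }
                ],
                "Default": "AbortFailover",
            },
            # Step 2: repoint the Route 53 record at the Failover instance.
            "RedirectTraffic": {
                "Type": "Task",
                "Resource": REDIRECT_ARN,
                "ResultPath": "$.dns",
                "Next": "StartRdrsOnFailover",
            },
            # Step 3: start RDRS processing on the Failover instance.
            "StartRdrsOnFailover": {
                "Type": "Task",
                "Resource": START_ARN,
                "End": True,
            },
            "AbortFailover": {
                "Type": "Fail",
                "Error": "PrimaryStillActive",
                "Cause": "Primary instance responded; failover aborted.",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_failover_definition(), indent=2))
```

Placing the verification behind a Choice state means the flow fails fast, without touching DNS or starting RDRS, whenever the Primary still appears to be active.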
Use a Step Function to Automate the Restoration Process
After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer RDRS processing back to the Primary instance.
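Both failover and restoration involve pointing a Route 53 record at whichever instance should currently receive traffic. As a rough illustration, the helper below builds a change batch for that purpose. The record name, IP addresses, and TTL are hypothetical values, and the function name `build_dns_change_batch` is our own invention for this sketch, not part of the framework.

```python
import ipaddress


def build_dns_change_batch(record_name: str, target_ip: str, ttl: int = 30) -> dict:
    """Build a Route 53 change batch that points an A record at target_ip."""
    # Raises ValueError if target_ip is not a valid IPv4 address.
    ipaddress.IPv4Address(target_ip)
    return {
        "Comment": f"Point {record_name} at {target_ip}",
        "Changes": [
            {
                # UPSERT creates the record if missing, or overwrites it if present.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": target_ip}],
                },
            }
        ],
    }


# Hypothetical addresses for the two instances.
PRIMARY_IP = "10.0.1.10"
FAILOVER_IP = "10.0.2.10"

failover_batch = build_dns_change_batch("rdrs.internal.example.com", FAILOVER_IP)
restore_batch = build_dns_change_batch("rdrs.internal.example.com", PRIMARY_IP)
```

A dictionary of this shape can be passed as the `ChangeBatch` argument to the boto3 Route 53 client's `change_resource_record_sets` call, together with the ID of the Private Hosted Zone. A short TTL is assumed here so that clients pick up the change quickly.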
AWS services utilized in the complete recommended framework include Step Functions, Lambda Functions, EventBridge rules, CloudWatch alarms, SNS topics, a Route 53 Private Hosted Zone, and more.
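To give a sense of how some of these services might connect, the sketch below builds an EventBridge event pattern that matches a CloudWatch alarm entering the ALARM state; a rule with such a pattern could then target the failover Step Function. The alarm name is a hypothetical placeholder, and this wiring is one plausible arrangement rather than a description of the framework's actual configuration.

```python
import json


def build_alarm_event_pattern(alarm_name: str) -> dict:
    """Match CloudWatch alarm state-change events for one alarm entering ALARM."""
    return {
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
        "detail": {
            "alarmName": [alarm_name],
            "state": {"value": ["ALARM"]},
        },
    }


# Hypothetical alarm name for the Primary instance health check.
pattern = build_alarm_event_pattern("rdrs-primary-unhealthy")
pattern_json = json.dumps(pattern)
```

EventBridge expects the pattern as a JSON string, which is why the example serializes it; `pattern_json` is the form that would be supplied as the `EventPattern` when creating a rule.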
For more information on Treehouse’s High Availability Framework Professional Service and our other offerings, visit Treehouse Software on the AWS Marketplace.
Interested in discussing your project? Contact us today…