In this episode of the AWS Mainframe Modernization Broadcasting Channel webinar series, you will discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake on AWS. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data, ready for modern analytics. Find out how the Treehouse Dataflow Toolkit (TDT) works with virtually any mainframe data replication tool to provide a fully automated approach for rapid and comprehensive data transfer from Kafka streaming pipelines to Snowflake and other Analytics/AI/ML-friendly targets on AWS–AI-ready, with all target resources automatically created.
View the webinar recording here…
Contact Treehouse Software today to request a product demonstration or discuss your data modernization needs…
Discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake, enabling unified analytics across legacy and modern systems. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data ready for modern analytics.
Meet your presenters…
Sunil Divvela is a Worldwide Specialist Solutions Architect for Mainframe Modernization at AWS. He partners with customers and partners to accelerate their mainframe modernization journeys, leading initiatives from portfolio assessment through post-migration support leveraging Generative AI and Agentic AI. Prior to AWS, Sunil served as a Senior Technology Architect at Infosys, where he led multiple mainframe transformation programs.
Ellie Savova is a Cloud Solutions Architect at Treehouse Software, where she develops and architects AWS-cloud-native SaaS applications focused on enterprise data migrations. She works closely with customers to design secure, scalable cloud-native solutions across hybrid environments, integrating legacy data systems with modern analytics platforms. Ellie also leads customer implementations and innovation initiatives focused on cloud architecture, security, automation, and next-generation data-platform capabilities.
by Joseph Brady, Director of Business Development at Treehouse Software
Treehouse Software’s Treehouse Dataflow Toolkit (TDT) is currently in production at a large auto manufacturer as their key component for replicating dealership and vehicle order management data from multiple disparate mainframe databases to Snowflake on AWS.
The TDT solution, together with Treehouse’s decades of mainframe expertise and our Cloud Engineers’ deep skills and multiple top-level AWS certifications, accelerated the customer’s critical data move to Snowflake. Thanks to the Treehouse data delivery architecture, the customer’s data scientists and analysts can now access analytics-ready data through Snowflake. This enables sub-second query performance for rapid, intensive analytical tasks, and data sharing in real time, eliminating the need to move or copy data, thus enabling immediate insights across divisions and subsidiaries. The analytics teams can also make plans to easily add the latest AWS-based Analytics/AI/ML-friendly offerings, such as Amazon Redshift, Amazon Athena/S3, Amazon SageMaker AI, and Amazon Bedrock, as well as any yet-to-be-developed Cloud services!
Since TDT is much more than a mere “connector,” the customer was able to eliminate months of development time and costs by using the tool to quickly and automatically prepare the full infrastructure needed for Snowflake data loading. As shown in the following architectural diagram, TDT takes the customer’s mainframe data that was pumped into Amazon MSK (Managed Streaming for Kafka) by Rocket Data Replicate and Sync (RDRS) and lands it in Snowflake. TDT not only delivers the customer’s data, but its advanced crawler functions automatically prepare landing tables, views, and staging infrastructure for Snowflake. Additionally, when needed, TDT now stands ready to generate an archiving infrastructure and create Apache Iceberg tables for enhanced data management.
The Treehouse solution is enabling the customer to quickly move away from slow, on-premises, batch-oriented processes to their new scalable, highly available, and secure Cloud-native system. They are also better positioned to innovate for future data needs and strategies on a flexible and easily customizable architecture.
Customer Benefits
The customer’s data scientists and analysts can now access analytics-ready data faster than ever before through Snowflake.
The new architecture easily allows testing and adding other Cloud-based Analytics/AI/ML-friendly targets and services.
From a data scientist’s perspective, effective dataflow tooling with TDT has delivered tangible benefits:
Fewer reconciliation issues with other divisions
Greater confidence in analytical results
Faster turnaround on urgent questions from leadership and multiple divisions
TDT’s auto scaling and parallelizing Lambda framework allows many parallel selects to all run at once, thus loading large tables with minimal latency.
Since TDT is built in alignment with AWS’s and Snowflake’s best practices, proper security and performance are ensured.
TDT is delivered via CloudFormation Templates, which automated and accelerated the process of installing and configuring the complete TDT application (including AWS Lambda functions and numerous other AWS resources, all wrapped in a well-architected security framework) in their AWS account. This allowed the customer to be up and running with a fully preconfigured implementation of the new data transfer pipeline in minutes.
The customer now has lasting compatibility with emerging Cloud technologies. As AWS and Snowflake introduce new features, TDT readily integrates them, staying ahead of the curve, keeping data pipelines modern and efficient.
In short: better data → better analytical judgment.
Visit Treehouse Software on the AWS Marketplace for all of our Cloud offerings…
by Joseph Brady, Director of Business Development at Treehouse Software, and Eleonora Savova, Cloud Solutions Architect at Treehouse Software
Every major insurance policy begins with the same foundation: understanding historical claims data to determine how often claims occur, what they cost, and which factors drive risk.
Data analytics has always been the insurance industry’s lifeblood, with actuarial efforts relying on sophisticated analysis of historical data – an approach that, in many ways, resembles how modern machine learning (ML) techniques are used to quantify and manage risk and uncertainty.
All insurance companies hold decades of valuable data across legacy mainframe and non-mainframe systems. Actuaries, data scientists, and other analysts need data delivered into modern analytics environments in a reliable way. Treehouse Software addresses this challenge in the most straightforward and efficient manner by connecting legacy data to today’s top analytics platforms. Our Treehouse Dataflow Toolkit (TDT) enables rapid bulk load and change data capture (CDC) from multiple data sources into Snowflake and AWS. As part of this process, TDT automatically provides the required target infrastructure, making the data immediately available for Analytics, BI, and AI/ML operations.
TDT serves as a key component within AWS-based actuarial analytics architectures, enabling insurance organizations to move away from slow, on-premises, batch-oriented processes and toward scalable, cloud-native systems. These architectures leverage high-performance computing, data lakes, and serverless technologies to automate data ingestion, modeling, and reporting, reducing processing times from days to minutes.
With TDT handling data delivery, actuaries and data scientists can access analytics-ready data through platforms such as Snowflake, Amazon Redshift, etc. TDT operates in the background, ensuring that the most recent data is consistently available without manual intervention.
TDT Value and Benefits
Beyond delivering your data, TDT is MUCH more than a mere “connector”. It is a fully configurable end-to-end solution designed to manage the complete data delivery lifecycle. TDT’s advanced crawler capabilities automatically prepare landing tables, views, and staging infrastructure required by the target.
TDT is built in alignment with AWS’s and Snowflake’s best practices, ensuring proper security and performance. This consistent adherence to best practices is a key differentiator that sets TDT apart from many other “connectors” on the market.
From an actuary’s perspective, effective dataflow tooling with TDT delivers tangible benefits:
Fewer reconciliation issues with Finance
Greater confidence in analytical results
Faster turnaround on urgent questions from leadership
In short: better data → better actuarial judgment.
Treehouse provides highly-detailed CloudFormation Templates which automate and accelerate the process of installing and configuring the complete TDT application (including AWS Lambda functions and a number of other AWS resources) in your AWS account(s). The TDT CloudFormation Templates create stacks consisting of all principal framework components, along with related IAM policies and roles which are carefully engineered to comply with “best practices” (such as a “least privileges” approach to permissions).
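To make the “least privileges” idea concrete, here is a minimal sketch of the kind of narrowly scoped IAM policy document such a template might attach to a Lambda role. It is an illustration only, not the policy TDT actually ships: the function name, bucket name, and prefix are hypothetical. Only the policy grammar itself (Version, Statement, Effect, Action, Resource, Condition) is standard IAM.

```python
import json


def least_privilege_s3_policy(bucket, prefix):
    """Build an IAM policy document limited to one prefix of one bucket.

    Hypothetical helper for illustration; not part of TDT.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access only under the given prefix.
                "Sid": "ReadWriteLandingObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, so it is narrowed
                # with a condition on the requested key prefix.
                "Sid": "ListLandingPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}/*"}},
            },
        ],
    }


if __name__ == "__main__":
    # "example-landing-bucket" and "landing" are placeholder names.
    print(json.dumps(least_privilege_s3_policy("example-landing-bucket", "landing"), indent=2))
```

The point of the sketch is the shape of the grant: no wildcard actions, and no access outside the single prefix the function needs.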
The TDT CloudFormation Templates also optionally provide for automatic creation of a VPC, its subnets, and all required standard VPC-oriented resources, as well as optional creation of a source database cluster (consisting of either a sample database provided by Treehouse for a quick trial/POC, or your own database and data).
Simply put, TDT is a Cloud-native, turnkey solution that can eliminate months (or even years) of research and development time and costs and allow customers to be up and feeding data to an actuarial analytics architecture in minutes.
by Joseph Brady, Director of Business Development at Treehouse Software, Inc.
Since 1982, Treehouse Software has been serving enterprises worldwide with industry-leading mainframe software products and outstanding technical support. Today, Treehouse Software brings you Treehouse Dataflow Toolkit (TDT), a serverless (Lambda-based) application that goes beyond basic data transfer. It is a fully automated solution that prepares the full staging infrastructure needed for massive data loading to targets such as Amazon Redshift, Snowflake, Amazon Athena/S3, Amazon S3 Express One Zone Buckets, and Amazon Aurora PostgreSQL. TDT supports data replication between mainframe and non-mainframe sources without disrupting existing critical work on customers’ legacy systems.
The Treehouse solution utilizes Rocket Data Replicate and Sync (RDRS) to pull data from the mainframe, where an agent (with a very small footprint) extracts data (either bulk-load or CDC processing). The raw data is then securely passed from the mainframe by RDRS, which transforms and publishes the data to a Kafka topic (in our example above, a topic in an Amazon MSK cluster). The TDT microservices consume the data from MSK/Kafka and land it in S3 buckets, where TDT’s proprietary crawler technology is used to automatically prepare landing tables, views, and additional infrastructure for various analytics-friendly targets. Then the mainframe data is loaded into Redshift, Snowflake, S3, or PostgreSQL (all the while adhering to AWS’s and Snowflake’s recommended “best practices” for massive data loading, thus assuring the shortest and surest loads). The inherent reliability and scalability of the entire pipeline infrastructure assure near-real-time synchronization between mainframe sources and the target tables, even with very large bulk-loads or transaction-heavy CDC processing.
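The consume-and-land step described above can be sketched roughly as follows. This is not TDT’s code: it only shows how a Lambda function triggered by Amazon MSK might decode a batch and prepare it for S3. The event shape (a "records" map of base64-encoded Kafka values) is the standard Lambda MSK event format; the assumption that messages are JSON, the key layout, and all function names are hypothetical.

```python
import base64
import json
from collections import defaultdict


def group_records_by_topic(event):
    """Decode the base64 Kafka values in a Lambda MSK event, grouped by topic.

    Assumes each Kafka message value is a UTF-8 JSON document.
    """
    grouped = defaultdict(list)
    # event["records"] maps "topic-partition" strings to lists of Kafka records.
    for records in event.get("records", {}).values():
        for record in records:
            payload = base64.b64decode(record["value"]).decode("utf-8")
            grouped[record["topic"]].append(json.loads(payload))
    return dict(grouped)


def to_json_lines(rows):
    """Serialize rows as newline-delimited JSON, a common staging format."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def landing_key(topic, partition, first_offset):
    """Build a hypothetical S3 object key for one batch of records."""
    return f"landing/{topic}/partition={partition}/offset={first_offset:012d}.jsonl"
```

In a real function, the output of to_json_lines would be written to the bucket under landing_key (for example with an S3 client’s put-object call), and the target platform would then load from that location.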
What about non-mainframe data?
For customers who have non-mainframe data sources, Treehouse offers TDT-DIRECT, which pulls data directly from PostgreSQL, SQL Server, Oracle, MySQL, and Db2 for bulk-load and CDC into a variety of targets on AWS.
Instantaneous auto scaling…
For massive amounts of data, TDT takes advantage of the auto scaling and parallelizing of the Lambda framework. This allows many parallel selects to all run at once, thus loading large tables with minimal latency. Additionally, all TDT Lambda microservices are fully customizable (they will be YOUR Lambdas) to add extra monitoring capabilities, and any other functionalities for future needs.
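As a rough illustration of the “many parallel selects” idea, the sketch below splits a table’s key range into contiguous chunks and runs one range-bounded fetch per chunk at the same time. It is an assumption-laden stand-in, not TDT’s implementation: local threads take the place of concurrent Lambda invocations, and fetch_range is a hypothetical callable that would run something like a SELECT bounded by a low and high key and return the number of rows loaded.

```python
from concurrent.futures import ThreadPoolExecutor


def split_key_range(min_id, max_id, chunks):
    """Split the inclusive range [min_id, max_id] into up to `chunks` contiguous ranges.

    Requires min_id <= max_id and chunks >= 1.
    """
    total = max_id - min_id + 1
    size = -(-total // chunks)  # ceiling division
    return [(lo, min(lo + size - 1, max_id)) for lo in range(min_id, max_id + 1, size)]


def parallel_load(fetch_range, min_id, max_id, chunks=4):
    """Run one range-bounded fetch per chunk concurrently; return total rows loaded."""
    ranges = split_key_range(min_id, max_id, chunks)
    # Threads stand in here for independently scaled Lambda invocations.
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        counts = pool.map(lambda r: fetch_range(*r), ranges)
        return sum(counts)
```

Because each chunk covers a disjoint key range, the workers never read the same rows, which is what lets a large table load in roughly the time of its slowest chunk rather than the sum of all chunks.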
TDT’s innovative Lambda-based microservices approach enables faster data flow than any conceivable ODBC-based solution, which is the standard tool used for most “roll your own” approaches, or “we have a connector for that” offerings. TDT offers several key differentiators from standard “connectors” on the market, including:
Automatic creation of target resources – TDT automatically prepares landing tables, views, and additional staging infrastructure for the target. Without TDT’s fully automated approach, a customer can spend months designing and creating target resources, such as delta tables, views, schemas, etc.
Ease of delivery/implementation – TDT is delivered via CloudFormation templates, which automate and accelerate the process of installing and configuring the complete TDT application (including AWS Lambda functions and numerous other AWS resources, all wrapped in a well-architected security framework) in your AWS account. This allows your site to be up and running with a fully preconfigured implementation of your new data transfer pipeline in minutes.
Adherence to best practices – TDT is built in alignment with AWS and Snowflake best practices, ensuring proper security and performance. The fault-tolerant design of the Cloud-native application provides for a robust, future-proof architecture.
Adaptability to evolving Cloud ecosystems – In today’s fast-evolving cloud world, TDT’s flexible design ensures lasting compatibility with emerging technologies. As AWS and Snowflake introduce new features, the application readily integrates them, staying ahead of the curve, keeping your data pipelines modern and efficient.
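To give a feel for what “automatic creation of target resources” in the list above can involve, here is a simplified sketch that derives a Snowflake landing table and a “latest row per key” view from one sample record. TDT’s crawler is proprietary and its actual logic is not described here, so everything in this block (function names, the type mapping, the view design) is an illustrative assumption. The generated statements use standard Snowflake SQL (CREATE TABLE IF NOT EXISTS, CREATE OR REPLACE VIEW, QUALIFY).

```python
def infer_column_type(value):
    """Map a Python value to a Snowflake column type (deliberately simplified)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "NUMBER"
    if isinstance(value, float):
        return "FLOAT"
    return "VARCHAR"


def landing_table_ddl(table, sample):
    """Generate a CREATE TABLE statement from one sample record (a dict)."""
    cols = ",\n".join(
        f"  {name.upper()} {infer_column_type(value)}" for name, value in sample.items()
    )
    return f"CREATE TABLE IF NOT EXISTS {table} (\n{cols}\n);"


def latest_view_ddl(view, table, key, ordering):
    """Generate a view exposing only the newest row per key in the landing table."""
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT * FROM {table}\n"
        f"QUALIFY ROW_NUMBER() OVER (PARTITION BY {key} ORDER BY {ordering} DESC) = 1;"
    )
```

Even this toy version hints at why hand-building such resources for hundreds of source tables takes time: every table needs its own type mapping, key choice, and view.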
DMS delivers features for monitoring migration tasks, reviewing Amazon CloudWatch metrics, inspecting logs, and validating data, making it a robust and cost-effective solution.
Additionally, to ensure the tightest security, DMS implements a comprehensive security framework that safeguards data throughout the migration process using IAM policies, SSL/TLS encryption, and AWS Secrets Manager credential management. Network controls and monitoring tools provide access restriction and real-time visibility.
High-level look at AWS DMS:
Treehouse Software introduces fully automated connectivity between DMS and Snowflake…
Today’s enthusiasm about AI and ML has become one of the prime motivators for customers wanting to move data to the Cloud, and Snowflake has become the platform of choice for many enterprises looking to mobilize data at near-unlimited scale and performance, while tapping into the most advanced AI/ML tools and services.
For DMS customers looking for the fastest and most straightforward way to connect to Snowflake, Treehouse Software brings you TDT-DIRECT, the ultimate DMS plugin for Snowflake. TDT-DIRECT leverages DMS to provide a turnkey approach that enables rapid bulk load and CDC data transfer directly from RDBMSs to Snowflake–AI-ready, with all target resources automatically created.
TDT-DIRECT is MUCH more than a mere “connector”—it is a serverless (Lambda-based), self-service, end-to-end solution that rapidly prepares the full infrastructure needed for loading data from DMS-supported RDBMSs into Snowflake. TDT-DIRECT’s advanced crawler functions automatically prepare all landing tables, views, and staging infrastructure for Snowflake as seen in this example:
TDT-DIRECT is built in alignment with AWS’s and Snowflake’s best practices, ensuring proper security and performance. The fault-tolerant design of the AWS-native application provides for a robust, future-proof architecture. This adherence to best practices is a key differentiator of TDT-DIRECT from other “connector” offerings on the market.
Bonus points for fast and easy implementation…
Treehouse provides highly-detailed CloudFormation Templates which automate and accelerate the process of installing and configuring the complete TDT-DIRECT application (including AWS Lambda functions and a number of other AWS resources) in your AWS account(s). The TDT-DIRECT CloudFormation Templates create stacks consisting of all principal framework components, along with related IAM policies and roles which are carefully engineered to comply with “best practices” (such as a “least privileges” approach to permissions).
The TDT-DIRECT CloudFormation Templates also optionally provide for automatic creation of a VPC, its subnets, and all required standard VPC-oriented resources, as well as optional creation of a source database cluster (consisting of either a sample database provided by Treehouse for a quick trial/POC, or your own database and data).
Simply put, TDT-DIRECT is a Cloud-native, turnkey solution that can eliminate months (or even years) of research and development time and costs, and allow customers to be up and running in minutes.
When Treehouse was approached by a large auto manufacturer to provide a solution to migrate their mainframe data from disparate source databases to Snowflake on AWS, the Treehouse Cloud engineering team was excited to take on the task. It wasn’t long before our experts drew upon their decades of mainframe expertise, along with deep skills and multiple AWS certifications, to come up with a prototype of the Treehouse Dataflow Toolkit (TDT). A quick proof of concept (POC) demonstrated that TDT worked exactly as expected and was the perfect tool for taking mainframe data that was pumped into Amazon MSK (Managed Streaming for Kafka) by Rocket Data Replicate and Sync (RDRS) and landing it into Snowflake on AWS.
TDT accelerated the customer’s move to Snowflake on AWS, because it is much more than a mere “connector” and goes beyond basic data transfer. It’s an automated, end-to-end solution that prepares the full infrastructure needed for Snowflake data loading. Its advanced crawler functions automatically prepare landing tables, views, and staging infrastructure for Snowflake. Additionally, TDT can generate optional archiving infrastructure and create Apache Iceberg tables for enhanced data management.
by Joseph Brady, Director of Business Development at Treehouse Software, Inc.
Treehouse Dataflow Toolkit Direct (TDT-DIRECT) is a turn-key, microservices-based offering that assures auto-scalable, highly available, event-driven bulk-load and Change Data Capture (CDC) transfers from legacy data sources to data analytics platforms like Snowflake, Amazon Redshift, etc.
This blog focuses on how TDT-DIRECT leverages the auto scaling capabilities of its Lambda microservices. These Lambdas are highly efficient compute services used to process TDT-DIRECT’s data transfer. There is no need to worry about throughput volume with TDT-DIRECT because the Lambdas scale automatically, with new instances spun up as needed to handle increasing data transfer loads.
Instantaneous auto scaling…
For massive amounts of data, TDT-DIRECT takes advantage of the auto scaling and parallelizing of the Lambda framework. This allows many parallel selects to all run at once, thus loading large tables with minimal latency.
And that’s not all! Here are TDT-DIRECT’s other key differentiators from standard “connectors” on the market:
Automatic creation of target resources – For example, TDT-DIRECT automatically prepares landing tables, views, and additional proprietary staging infrastructure for Snowflake. Without TDT-DIRECT’s fully automated approach, a customer can spend months designing and creating target resources, such as delta tables, views, schemas, etc.
Ease of delivery/implementation – TDT-DIRECT is delivered via CloudFormation templates, which automate and accelerate the process of installing and configuring the complete TDT-DIRECT application (including AWS Lambda functions and numerous other AWS resources, all wrapped in a well-architected security framework) in your AWS account. This allows your site to be up and running with a fully preconfigured implementation of your new data transfer pipeline in minutes.
Adherence to best practices – TDT-DIRECT is built in alignment with AWS and Snowflake best practices, ensuring proper security and performance. The fault-tolerant design of the Cloud-native application provides for a robust, future-proof architecture.
Adaptability to evolving Cloud ecosystems – In today’s fast-evolving cloud world, TDT-DIRECT’s flexible design ensures lasting compatibility with emerging technologies. As AWS and Snowflake introduce new features, the application readily integrates them, staying ahead of the curve, keeping your data pipelines modern and efficient.
Simply put, TDT-DIRECT is a Cloud-native, self-contained, turn-key solution that will eliminate months or years of development time and costs.
by Joseph Brady, Director of Business Development at Treehouse Software, Inc., and Dan Vimont, Director of Innovation at Treehouse Software, Inc.
Treehouse Software customers are using Rocket Data Replicate and Sync (RDRS) to enable mission-critical Mainframe-to-AWS data replication pipelines. Some of these production pipelines are providing vital near-real-time synchronization between source and target, and thus can’t afford any significant downtime in the event of failure. So it’s only natural that a number of our customers have been asking for advice in setting up a high availability (HA) configuration for their RDRS components that run on AWS EC2 instances. As a result, Treehouse Software provides an HA Framework Professional Services engagement, in which our expert Cloud engineers help customers with delivery, setup, rapid deployment, and customization of an RDRS HA framework. The HA Framework seamlessly and quickly provides for a Failover EC2 instance to automatically pick up RDRS processing should the Primary instance (running in another Availability Zone) go down.
Setting Up Automatic Failover with EC2 Instances in Different Availability Zones
The core components of the RDRS HA Framework consist of two EC2 instances running in different Availability Zones: 1) a Primary EC2 instance and 2) a Failover EC2 instance. Both identically-configured EC2 instances are attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to seamlessly and quickly pick up RDRS processing should the Primary instance suddenly become unavailable.
A Step Function Automates the Failover Process
In the event of failure of the Primary instance, the HA Framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following:
Verify that the Primary instance is unavailable. (The RDRS service cannot be active on both instances simultaneously, so this verification is vital.)
Redirect all network traffic from the Primary instance to the Failover instance (via Route 53).
Start RDRS processing on the Failover instance.
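The ordering of the three steps above matters, and a small sketch makes that explicit. The function below is not the HA Framework’s Step Function; it is a plain-Python illustration in which the three callables (primary_is_up, redirect_traffic, start_rdrs) are hypothetical stand-ins for the health check, the Route 53 update, and the RDRS start-up.

```python
def run_failover(primary_is_up, redirect_traffic, start_rdrs):
    """Run the three failover steps in order; return the list of steps performed.

    All three arguments are caller-supplied callables (hypothetical stand-ins).
    """
    steps = []

    # Step 1: confirm the Primary is really down. RDRS must never be active
    # on both instances at once, so abort without changes if it is still up.
    if primary_is_up():
        return steps
    steps.append("verified_primary_down")

    # Step 2: point network traffic at the Failover instance.
    redirect_traffic()
    steps.append("redirected_traffic")

    # Step 3: only now start RDRS processing on the Failover instance.
    start_rdrs()
    steps.append("started_rdrs")
    return steps
```

If the health check reports that the Primary is still up, the function returns an empty list and neither of the later steps runs, which mirrors the guard described in the first step.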
Use a Step Function to Automate the Restoration Process
After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer RDRS processing back to the Primary instance.
For more information on Treehouse’s High Availability Framework Professional Service and our other offerings, visit Treehouse Software on the AWS Marketplace.
Interested in discussing your project? Contact us today…
by Joseph Brady, Director of Business Development at Treehouse Software, Inc.
Many Treehouse Software customers have discovered that they can save weeks or months in their mainframe modernization initiatives by doing a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC can take as little as 10 business days after the product is installed and all connectivity is set up between the mainframe and Cloud environments. Treehouse Software provides documentation beforehand that outlines all of the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.
During this paid POC (a portion of the payment is credited towards product purchase), the customer provides a test environment, representative subset of z/OS mainframe data, use case, timeline, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions. The application is executed on customer facilities, in a non-production environment, and a limited-scope implementation of an RDRS application is conducted to prove that the product meets the customer’s desired use case.
By the end of the POC, customers will have used RDRS to replicate mainframe data on their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process, with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost, in terms of human resources and time, makes an RDRS POC a valuable investment in the customer’s mainframe modernization journey.
About RDRS…
Many Treehouse partners are recommending RDRS for Mainframe-to-Cloud modernization projects. RDRS focuses on change data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.
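For readers new to CDC, the general idea of applying captured changes to a target can be sketched in a few lines. The event format below (an "op" of insert, update, or delete plus a "row") is a hypothetical one chosen for illustration; it is not RDRS’s actual message format, and the in-memory dictionary simply stands in for a target table keyed by primary key.

```python
def apply_change_events(target, events, key="id"):
    """Apply insert/update/delete events, in order, to a dict keyed by primary key.

    `target` stands in for a target table; the event shape is hypothetical.
    """
    for event in events:
        op, row = event["op"], event["row"]
        if op in ("insert", "update"):
            target[row[key]] = row  # upsert: the latest image of the row wins
        elif op == "delete":
            target.pop(row[key], None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


if __name__ == "__main__":
    table = {}
    apply_change_events(
        table,
        [
            {"op": "insert", "row": {"id": 1, "status": "ordered"}},
            {"op": "update", "row": {"id": 1, "status": "shipped"}},
            {"op": "insert", "row": {"id": 2, "status": "ordered"}},
            {"op": "delete", "row": {"id": 2}},
        ],
    )
    print(table)
```

Applying events strictly in capture order is what keeps the target consistent with the source; replaying only the changes, rather than reloading whole tables, is what keeps it close to real time.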
Additionally, RDRS utilizes a Windows-based GUI Dashboard, which is ideal for non-mainframe programmers. While mainframe experts are required in the initial design/architecture phase of the POC and occasionally during implementation, the requirement for their involvement is minimal. The RDRS Dashboard acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.
Once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while data is replicated – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as analytics, machine learning and artificial intelligence (AI), etc., as well as move data to a variety of highly available and secure databases and data stores.
Want to see an RDRS demo first?
Simply fill out our RDRS Demonstration Request Form and a Treehouse representative will contact you to set up a time for your requested demonstration.