AWS / Treehouse Software Webinar: Moving Mainframe Data to Snowflake for Enterprise Wide Analytics


In this episode of the AWS Mainframe Modernization Broadcasting Channel webinar series, you will discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake on AWS. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data, ready for modern analytics. Find out how the Treehouse Dataflow Toolkit (TDT) works with virtually any mainframe data replication tool to provide a fully automated approach for rapid and comprehensive data transfer from Kafka streaming pipelines to Snowflake and other analytics-, AI-, and ML-friendly targets on AWS, with all target resources created automatically.

View the webinar recording here…


Contact Treehouse Software today to request a product demonstration or discuss your data modernization needs…


Join Treehouse Software for a new episode of the AWS Mainframe Modernization Broadcasting Channel!


Discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake, enabling unified analytics across legacy and modern systems. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data ready for modern analytics.

Meet your presenters…

Sunil Divvela is a Worldwide Specialist Solutions Architect for Mainframe Modernization at AWS. He partners with customers and partners to accelerate their mainframe modernization journeys, leading initiatives from portfolio assessment through post-migration support leveraging Generative AI and Agentic AI. Prior to AWS, Sunil served as a Senior Technology Architect at Infosys, where he led multiple mainframe transformation programs.

Ellie Savova is a Cloud Solutions Architect at Treehouse Software, where she develops and architects AWS-cloud-native SaaS applications focused on enterprise data migrations. She works closely with customers to design secure, scalable cloud-native solutions across hybrid environments, integrating legacy data systems with modern analytics platforms. Ellie also leads customer implementations and innovation initiatives focused on cloud architecture, security, automation, and next-generation data-platform capabilities.

Register now using the QR code above, or HERE!



Treehouse High Availability Framework Service Ensures Minimal Downtime for Customers using Rocket Data Replicate and Sync on AWS

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

Treehouse Software customers are using Rocket Data Replicate and Sync (RDRS) to enable mission-critical Mainframe-to-AWS data replication pipelines.  Some of these production pipelines are providing vital near-real-time synchronization between source and target, and thus can’t afford any significant downtime in the event of failure.  So it’s only natural that a number of our customers have been asking for advice in setting up a high availability (HA) configuration for their RDRS components that run on AWS EC2 instances.  As a result, Treehouse Software provides an HA Framework Professional Services engagement, in which our expert Cloud engineers help customers with delivery, setup, rapid deployment, and customization of an RDRS HA framework.  The HA Framework seamlessly and quickly provides for a Failover EC2 instance to automatically pick up RDRS processing should the Primary instance (running in another Availability Zone) go down.

Setting Up Automatic Failover with EC2 Instances in Different Availability Zones

The core components of the RDRS HA Framework consist of two EC2 instances running in different Availability Zones:  1) a Primary EC2 instance and 2) a Failover EC2 instance.  Both identically-configured EC2 instances are attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to seamlessly and quickly pick up RDRS processing should the Primary instance suddenly become unavailable.

A Step Function Automates the Failover Process

In the event of failure of the Primary instance, the HA Framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following:

  • Verify that the Primary instance is unavailable. (The RDRS service cannot be active on both instances simultaneously, so this verification is vital.)
  • Redirect all network traffic from the Primary instance to the Failover instance (via Route 53).
  • Start RDRS processing on the Failover instance.
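
The three failover steps above map naturally onto an Amazon States Language (ASL) definition. The following is only a minimal sketch of that shape, not the Treehouse implementation: the state names, the Lambda function names, the account ID in the ARN prefix, and the `primaryDown` result field are all hypothetical placeholders.

```python
import json

# Hypothetical ARN prefix: the account ID, Region, and function names below
# are placeholders, not the names used by the actual HA Framework.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"


def build_failover_definition(arn_prefix: str = ARN_PREFIX) -> dict:
    """Return an ASL state machine definition mirroring the failover steps."""
    return {
        "Comment": "Illustrative RDRS failover flow (assumed structure)",
        "StartAt": "VerifyPrimaryUnavailable",
        "States": {
            # Step 1: confirm the Primary is really down before doing anything.
            "VerifyPrimaryUnavailable": {
                "Type": "Task",
                "Resource": arn_prefix + "verify-primary-down",
                "ResultPath": "$.check",
                "Next": "IsPrimaryDown",
            },
            # Guard: RDRS must never be active on both instances at once.
            "IsPrimaryDown": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.check.primaryDown",
                        "BooleanEquals": True,
                        "Next": "RedirectTraffic",
                    }
                ],
                "Default": "AbortFailover",
            },
            "AbortFailover": {"Type": "Succeed"},
            # Step 2: repoint the Route 53 record at the Failover instance.
            "RedirectTraffic": {
                "Type": "Task",
                "Resource": arn_prefix + "update-route53-record",
                "Next": "StartRdrsOnFailover",
            },
            # Step 3: start RDRS processing on the Failover instance.
            "StartRdrsOnFailover": {
                "Type": "Task",
                "Resource": arn_prefix + "start-rdrs-service",
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_failover_definition(), indent=2))
```

Placing the verification behind a `Choice` state means the traffic redirect and the RDRS start can only be reached when the check reports the Primary as down, which reflects the "never active on both instances" constraint.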

Use a Step Function to Automate the Restoration Process

After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer RDRS processing back to the Primary instance.

AWS services utilized in the complete recommended framework include Step Functions, Lambda Functions, EventBridge rules, CloudWatch alarms, SNS topics, a Route 53 Private Hosted Zone, and more.  


For more information on Treehouse’s High Availability Framework Professional Service and our other offerings, visit Treehouse Software on the AWS Marketplace.


Interested in discussing your project? Contact us today…

Test out Mainframe-to-Cloud Data Replication with a Treehouse Software Proof of Concept

by Joseph Brady, Director of Business Development at Treehouse Software, Inc.

 

Many Treehouse Software customers have discovered that they can save weeks or months in their mainframe modernization initiatives by doing a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC can take as little as 10 business days once the product is installed and all connectivity is set up between the mainframe and Cloud environments. Treehouse Software provides documentation beforehand that outlines the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.

During this paid POC (a portion of the payment is credited towards product purchase), the customer provides a test environment, representative subset of z/OS mainframe data, use case, timeline, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions. The application is executed on customer facilities, in a non-production environment, and a limited-scope implementation of an RDRS application is conducted to prove that the product meets the customer’s desired use case.

By the end of the POC, customers will have used RDRS to replicate mainframe data to their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process, with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost, in terms of human resources and time, makes an RDRS POC a valuable investment in the customer’s mainframe modernization journey.

About RDRS…

____0_RDRS_Overall_Diagram

Many Treehouse partners are recommending RDRS for Mainframe-to-Cloud modernization projects. RDRS focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.

Additionally, RDRS utilizes a Windows-based GUI Dashboard, which is ideal for non-mainframe programmers. While mainframe experts are required in the initial design/architecture phase of the POC and occasionally during implementation, the requirement for their involvement is minimal. The RDRS Dashboard acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.

Once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while data is replicated – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as analytics, machine learning and artificial intelligence (AI), etc., as well as move data to a variety of highly available and secure databases and data stores.



Want to see an RDRS demo first?

Simply fill out our RDRS Demonstration Request Form and a Treehouse representative will be contacting you to set up a time for your requested demonstration.

A Treehouse Software Proof of Concept is the low-risk approach to testing mainframe data replication on Cloud and Hybrid Cloud environments

by Joseph Brady, Director of Business Development / Cloud Alliance Leader at Treehouse Software, Inc.

____0_Mainframe_To_Cloud

Many Treehouse Software customers have discovered the value of saving weeks or months in their mainframe modernization initiatives by engaging in a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC can take as little as 10 business days once the product is installed and all connectivity is set up between the mainframe and Cloud environments.

How does it work?

  1. Treehouse Software provides documentation beforehand that outlines all of the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.
  2. The customer provides a representative subset of z/OS or z/VSE mainframe data (e.g., Db2, Adabas, VSAM, IMS/DB, CA IDMS, CA DATACOM, etc.), use case, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions.
  3. The application is executed on customer facilities, in a non-production environment, and a limited-scope implementation of RDRS is conducted to prove that the product meets the customer’s desired use case.

By the end of the POC, customers will have replicated mainframe data to their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process, with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost and resource commitment make an RDRS POC a valuable investment in the customer’s mainframe modernization journey.

About RDRS…

Many Cloud and Systems Integration partners are recommending RDRS for mainframe data modernization projects. RDRS focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.

RDRS utilizes a Windows-based GUI Control Board, which is ideal for non-mainframe programmers. While mainframe experts are required in the design/architecture phase during the POC and occasionally during implementation, the requirement for their involvement is limited. The RDRS Control Board acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.

Additionally, once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while they replicate data – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as advanced analytics, ML/AI, etc., as well as move data to a variety of highly available and secure databases and data stores.



Contact Treehouse Software Today…

Contact us to discuss how a Treehouse Software POC can accelerate your mainframe Cloud and hybrid Cloud data modernization journey.

Does your data science team want to accelerate insights and bring advanced ML/AI capabilities to your mainframe data with Amazon Redshift? Sure they do—and Treehouse Software enables that…

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

We are beginning to see a welcome trend among Treehouse customers who are looking to modernize their valuable mainframe legacy data on the Cloud: they are including their data science teams in the important planning phase of architecting new Cloud environments and targets. This is especially vital for customers who want to incorporate advanced analytics and ML/AI in their strategic data usage plans on the Cloud. Who understands how the data will ultimately be used better than your resident data scientists?

____0_Amazon_Redshift

We have heard from many of these data scientists that a primary item on their “wish lists” is a fully managed, AI-powered, massively parallel processing (MPP) architecture to extract maximum value and insights. They specifically mention Amazon Redshift as the Cloud data warehouse (which is much more than a data warehouse) of choice for driving digitization across the enterprise, as well as for helping to personalize customer experiences. Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the highest performance at any scale. To this wish, we can answer with a resounding, “Yes, Treehouse Software has got you covered with Redshift connectivity!”

The Treehouse Software solution…

Enterprise customers have come to Treehouse Software, because we bring not only proven mainframe data replication tools, but deep subject matter expertise in mainframe technologies, as well as the know-how to target relevant AWS offerings, such as Redshift, S3 (including S3 Express One Zone – see our recent blog on S3 Express One Zone), etc.

The Rocket Data Replicate and Sync (RDRS) solution allows customers’ legacy mainframe environment to operate normally while replicating data on AWS. The technology focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud-based databases and applications. Through an innovative set of technologies, changes occurring in any mainframe datastore are tracked and captured, and ultimately published to Redshift.

____0_Mainframe_To_Redshift

How does it work?

  1. We start at the source – the mainframe – where an agent (with a very small footprint) extracts data (in the context of either bulk-load or CDC processing).
  2. The raw data is securely passed from the mainframe to RDRS, which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient, autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka and land it in S3 buckets, where Treehouse’s proprietary crawler technology is used to automatically prepare landing tables, views, and additional infrastructure in Redshift.  Then the mainframe data is loaded into Redshift (all the while adhering to AWS’ recommended “best practices” for massive data loading, thus assuring shortest and surest loads).  The inherent reliability and scalability of the entire pipeline infrastructure assure near-real-time synchronization between mainframe sources and Redshift target tables.
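
To make the Kafka-to-S3-to-Redshift hand-off in step 3 more concrete, here is a rough sketch of the two ideas involved: grouping JSON change messages by source table into newline-delimited batches suitable for S3 objects, and building a Redshift `COPY ... FORMAT AS JSON 'auto'` statement for a landed prefix. The message shape (`table` and `data` fields), the function names, and the bucket and role values are assumptions for illustration; the Treehouse Dataflow Toolkit's actual formats are not described in this post.

```python
import json
from collections import defaultdict


def batch_by_table(messages):
    """Group JSON message strings by source table into NDJSON payloads.

    Assumes each message looks like {"table": ..., "data": {...}}; that
    envelope is hypothetical, not the format RDRS actually publishes.
    """
    grouped = defaultdict(list)
    for msg in messages:
        record = json.loads(msg)
        grouped[record["table"]].append(
            json.dumps(record["data"], sort_keys=True)
        )
    # One newline-terminated payload per table, ready to write as an S3 object.
    return {table: "\n".join(lines) + "\n" for table, lines in grouped.items()}


def copy_statement(table, s3_prefix, iam_role):
    """Build a Redshift COPY command that bulk-loads JSON from an S3 prefix."""
    return (
        f"COPY {table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto';"
    )


if __name__ == "__main__":
    msgs = [
        '{"table": "customer", "data": {"id": 1, "name": "Ada"}}',
        '{"table": "orders", "data": {"id": 9}}',
        '{"table": "customer", "data": {"id": 2, "name": "Bob"}}',
    ]
    for table, payload in batch_by_table(msgs).items():
        print(table, repr(payload))
    print(copy_statement(
        "stage.customer",
        "s3://example-landing-bucket/customer/",      # placeholder bucket
        "arn:aws:iam::123456789012:role/example-load",  # placeholder role
    ))
```

Loading many records per `COPY` from S3, rather than issuing row-by-row inserts, is the general pattern AWS recommends for bulk loads into Redshift.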

Redshift tables and views: something for everybody

Within this framework, the Redshift staging tables (often referred to as “delta tables”) are constantly accruing historical data, ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work.  For business analysts and others who prefer structured data representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views, providing the look and feel of a SQL database.
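
The relationship between an append-only delta table and a current-state view can be illustrated with a small sketch. The column names used here (`seq`, `key`, `op`, `data`) and the operation codes `I`/`U`/`D` are hypothetical; in practice the equivalent logic would live in SQL views generated by the framework.

```python
def current_state(delta_rows):
    """Collapse an append-only change history into the latest image per key.

    Each row is assumed to carry a sequence number, a record key, an
    operation code (I = insert, U = update, D = delete), and the row data.
    """
    latest = {}
    for row in sorted(delta_rows, key=lambda r: r["seq"]):
        if row["op"] == "D":
            # A delete removes the key from the current view, but the
            # history rows remain available in the delta table.
            latest.pop(row["key"], None)
        else:
            latest[row["key"]] = row["data"]
    return latest


if __name__ == "__main__":
    history = [
        {"seq": 1, "key": "A1", "op": "I", "data": {"balance": 100}},
        {"seq": 2, "key": "B2", "op": "I", "data": {"balance": 50}},
        {"seq": 3, "key": "A1", "op": "U", "data": {"balance": 75}},
        {"seq": 4, "key": "B2", "op": "D", "data": None},
    ]
    print(current_state(history))
```

The full `history` list is what a data scientist would mine for trends, while the collapsed result is closer to what a business analyst expects from a conventional table.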

…as innovations move faster along the timeline, keep your options open!

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!).  In addition to Redshift, the Treehouse Dataflow Toolkit also currently targets Snowflake, Amazon DynamoDB, and Amazon Athena/S3.

Video – Introduction to Data Warehousing on AWS with Amazon Redshift…



Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

Treetip: Treehouse Software can help enterprise mainframe customers accelerate their data analytics, machine learning, and AI journeys by targeting the new Amazon S3 Express One Zone

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software specializes in helping enterprise customers with Mainframe-to-Cloud, Multi-Cloud, and Hybrid Cloud data modernization projects. Many times, our customers not only discuss strategies for replicating their mainframe data, but also their plans for what they want to do with that data on the Cloud side.  This makes it important for our team to stay current on the latest Cloud offerings that can benefit our customers’ enterprise modernization planning. Consequently, a very exciting announcement caught our attention during the 2023 AWS re:Invent conference: the general availability of a new S3 storage class, Amazon S3 Express One Zone.

For those unfamiliar, Amazon S3 (“Simple Storage Service”) is the basic object storage service of AWS, and as such it forms a foundational pillar of the entire AWS world. Amazon S3 Express One Zone uses a new type of S3 bucket called a “directory bucket”, which is purpose-built to deliver consistent, single-digit millisecond data access for an enterprise’s most frequently used data and latency-sensitive applications. The new S3 directory buckets allow customers to store data in a single Availability Zone (AZ) that they specifically select, as opposed to the at least three AZs used by default for standard S3. This eliminates the latency associated with spreading data across multiple AZs, providing applications with lower-latency storage. S3 directory buckets also follow a different request scaling model compared to traditional buckets, and their authentication is based on sessions rather than on a per-request basis. The bottom line: less compute time spent waiting on storage means lower cost.
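
One visible consequence of the single-AZ design is that a directory bucket's name carries the ID of the Availability Zone it lives in, using the form `base-name--zone-id--x-s3`. The helper functions below are a small illustrative sketch (the function names and the example bucket are made up, and the regular expression is a simplification of the full naming rules):

```python
import re

# Simplified pattern for "<base>--<zone-id>--x-s3"; the official naming rules
# include further length and character constraints not checked here.
_DIRECTORY_BUCKET = re.compile(
    r"^(?P<base>[a-z0-9][a-z0-9-]*)--(?P<zone>[a-z0-9]+-az\d+)--x-s3$"
)


def directory_bucket_name(base, zone_id):
    """Compose a directory bucket name from a base name and an AZ ID."""
    return f"{base}--{zone_id}--x-s3"


def parse_directory_bucket(name):
    """Return (base, zone_id) for a directory bucket name, else None."""
    match = _DIRECTORY_BUCKET.match(name)
    if match is None:
        return None
    return match.group("base"), match.group("zone")


if __name__ == "__main__":
    name = directory_bucket_name("tdt-landing", "use1-az4")
    print(name)
    print(parse_directory_bucket(name))
    print(parse_directory_bucket("ordinary-general-purpose-bucket"))
```

Knowing the AZ from the name makes it easier to place compute (for example, training or query jobs) in the same Availability Zone as the data.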

S3 Express One Zone is ideally suited for services such as Amazon SageMaker Model Training, Amazon Athena, Amazon EMR, and AWS Glue Data Catalog to accelerate Machine Learning (ML) and interactive analytics workloads. With S3 Express One Zone, storage automatically scales up or down based on consumption and need, and customers no longer need to manage multiple storage systems for low-latency workloads.

So, why is S3 Express One Zone important to Treehouse mainframe modernization customers?

____0_Mainframe_To_S3ExpressOneZone

Amazon S3 Express One Zone just made the Amazon S3 targeting in the Treehouse Dataflow Toolkit (TDT) potentially much more potent and valuable to our enterprise mainframe customers.  When an enterprise uses TDT to land their mission critical data in Express One Zone flavored Athena/S3 buckets, it becomes more directly accessible and manipulable by the various AWS ML and AI tools. In short, if customers choose, Express One Zone Athena/S3 becomes an intermediate data store for big data processing workloads and advanced analytics.

So, when we are asked, “What should Treehouse Software be doing to respond to the burgeoning interest in ML, Generative AI, etc.?”, the answer is — We are doing exactly what we need to be doing.  AI and ML frameworks are the newest incentive for people to use RDRS (Rocket Data Replicate and Sync — formerly called tcVISION) and TDT from Treehouse Software to replicate their mainframe data on advanced data analytics frameworks, or possibly into super-charged S3 Express One Zone buckets.  

Video – Deep Dive Introduction to Amazon S3 Express One Zone Storage Class:



Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

3-Minute Video: Data Management and Processing with Rocket Data Replicate and Sync (formerly tcVISION)

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software is a worldwide distributor of Rocket Data Replicate and Sync (formerly tcVISION), the leading tool for using change data capture (CDC) for synchronizing mainframe data with real-time and bi-directional data replication. This video focuses on the product’s data management and use of “staged processing” to minimize its footprint on the mainframe system…



Contact us today for a live, online demo…

Simply fill out our Demonstration Request Form and a Treehouse representative will contact you to set up a time for your requested demonstration.

What is meant by “Regional Data Sovereignty” when replicating enterprise data on AWS?

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

I have recently been taking some classes in preparation for an AWS certification. In some of these classes, an example scenario has been used that speaks to an issue I’ve often heard mentioned by Treehouse mainframe customers: that of “Regional Data Sovereignty”. For example, a customer might have government compliance requirements that financial information in Frankfurt cannot leave Germany, and many other countries have similar restrictions and regulatory controls in place.

Fortunately, Regional Data Sovereignty is a critical part of the design of AWS Global Infrastructure. Within this infrastructure, there are AWS Regions which address data that is subject to local laws and statutes of the country in which a Region is located. With the understanding that the customer’s data and applications live and run in various geographical Regions, there are four business factors a customer should consider when choosing a Region:

  1. Compliance. Before any other factors, customers must first look at their regional compliance requirements to determine if data must live within certain geographical boundaries.
  2. Proximity. How close the enterprise is to its customer base is another major factor because of possible latency issues between countries.  Locating a Region closest to the customer base is generally the best choice.
  3. Feature availability. Sometimes the closest Region may not have all the AWS features a business needs. Every year, AWS releases thousands of new features and products in response to customer requests and needs. But sometimes those new services require new physical hardware that AWS has to build, so a service might be rolled out one Region at a time.
  4. Pricing. Even when the hardware is equal from one Region to the next, some locations are more expensive in which to operate. For example, the same workload in Sao Paulo could be significantly more expensive than if it is run out of Oregon in the United States. 
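
The four factors above can be read as a simple decision procedure: compliance and feature availability act as filters, and proximity and pricing rank whatever remains. The sketch below is one possible reading, not AWS guidance; the class and function names are hypothetical, and the latency and cost figures are invented purely for illustration (they are not measurements or AWS prices).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionOption:
    name: str
    country: str
    latency_ms: float      # illustrative latency to the customer base
    services: frozenset    # services available in the Region
    relative_cost: float   # illustrative cost index, lower is cheaper


def choose_region(options, allowed_countries, required_services):
    """Pick a Region: filter on compliance and features, then rank."""
    eligible = [
        o for o in options
        if o.country in allowed_countries          # 1. compliance
        and required_services <= o.services        # 3. feature availability
    ]
    # 2. proximity first, then 4. pricing as the tie-breaker.
    return min(
        eligible,
        key=lambda o: (o.latency_ms, o.relative_cost),
        default=None,
    )


if __name__ == "__main__":
    candidates = [
        RegionOption("eu-central-1", "DE", 12.0,
                     frozenset({"s3", "redshift"}), 1.10),
        RegionOption("eu-west-1", "IE", 25.0,
                     frozenset({"s3", "redshift", "sagemaker"}), 1.00),
        RegionOption("us-west-2", "US", 150.0,
                     frozenset({"s3", "redshift", "sagemaker"}), 0.95),
    ]
    # Data that may not leave Germany leaves only one candidate.
    print(choose_region(candidates, {"DE"}, frozenset({"s3", "redshift"})))
```

Treating compliance as a hard filter rather than a weighted score reflects the point that it must be considered before any other factor.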

Additionally, events such as natural disasters can cause customers to lose their connection to a data center, so a High Availability (HA) cutover plan should also be considered. The customer can run a second data center, but real estate prices alone could rule that out once the duplicate expense of hardware, employees, electricity, heating and cooling, and security is added in. Most businesses simply end up storing backups somewhere and hoping the disaster never comes. And “hope” is not a good business plan. I recently covered how Treehouse Software can help provide an HA framework for mainframe customers in another blog.

Let’s take a look at the AWS Global Infrastructure and how its Regions are distributed worldwide…

____AWS_Global_Infrastructure

AWS Regions are built close to the areas of highest business traffic demand, such as Paris, Tokyo, Sao Paulo, Dublin, and Ohio. Inside each Region, there are multiple data centers (grouped into Availability Zones) that have all the compute, storage, and other services customers need to run their applications. By utilizing AWS Regions for high availability of their business services, customers can minimize operational downtime. Regions are connected to each other over AWS’s private, high-speed global network, which bypasses the public Internet, and the customer’s business decision maker chooses which Region they want to use. Each Region is isolated from every other Region in the sense that absolutely no data goes in or out of the customer’s environment in that Region without explicit permission for that data to be moved. These elements should be part of all critical strategic and security conversations when planning global distribution and availability of an enterprise’s data on AWS.

Video – AWS Global Infrastructure explained…



Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

So, you want to bring Snowflake’s advanced ML/AI capabilities to bear on your mainframe data? Treehouse Software enables that…

by Dan Vimont, Director of Innovation at Treehouse Software, Inc. and Joseph Brady, Director of Business Development at Treehouse Software, Inc.

The exploding popularity of advanced data analytics platforms such as Snowflake, where an ever-expanding array of machine learning and artificial intelligence tools are available to generate vital insights from your enterprise’s data, has quickly transformed the world of data processing.  Your data science teams are sitting there at their Snowflake consoles, eagerly awaiting the arrival of critical data from your mainframes to supercharge their predictive analytics and generative AI frameworks.

They’re waiting…

So, what’s the hold-up?

Oh yeah, getting legacy data out of ancient mainframe datastores and into Cloud analytics frameworks is HARD, right?

Um, no, actually — it’s not.

The Treehouse Software solution…

____0_Mainframe_To_Snowflake01

How does it work?

  1. We start at the source — the mainframe — where an agent (with a very small footprint) extracts data (in the context of either bulk-load or CDC processing).
  2. The raw data is securely passed from the mainframe to MDR (Treehouse Mainframe Data Replicator powered by Rocket® Software) which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient and autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka, automatically prepare landing tables, views, and additional infrastructure in Snowflake, and then land the data in Snowflake (all the while adhering to Snowflake’s recommended “best practices” for massive data loading, thus assuring shortest and surest loads).

Snowflake tables and views: something for everybody

Within this framework, the Snowflake staging tables are constantly accruing historical data, ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work.  For business analysts and others who prefer structured data representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views.
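
To show what a "structured representation" of hierarchical data can look like, here is a minimal sketch that splits one nested JSON document into a flat parent row plus child rows for each repeating group. The function name, the `idx` column, and the sample employee record are all hypothetical; the actual user-views generated by the Treehouse framework are not specified in this post.

```python
def flatten_record(record, key_field):
    """Split a nested JSON document into a parent row and child rows.

    List-valued fields (repeating groups) become child rows that carry the
    parent's key and a 1-based occurrence index; all other fields stay on
    the parent row unchanged.
    """
    parent_row = {}
    child_rows = {}
    for field, value in record.items():
        if isinstance(value, list):
            child_rows[field] = [
                {
                    key_field: record[key_field],
                    "idx": i,
                    # Scalars get a generic "value" column; objects are
                    # spread into their own columns.
                    **(item if isinstance(item, dict) else {"value": item}),
                }
                for i, item in enumerate(value, start=1)
            ]
        else:
            parent_row[field] = value
    return parent_row, child_rows


if __name__ == "__main__":
    employee = {
        "emp_id": 7,
        "name": "Ada",
        "phones": ["555-0100", "555-0101"],
        "raises": [{"year": 2020, "pct": 3}],
    }
    parent, children = flatten_record(employee, "emp_id")
    print(parent)
    print(children)
```

Each child list corresponds to what would be a separate relational table or view joined back to the parent on the key, which is the familiar shape for SQL-oriented analysts.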

… and the world keeps on changing, so keep your options open!

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!).  In addition to Snowflake, the Treehouse Dataflow Toolkit also currently targets Amazon Redshift, Amazon DynamoDB, and Amazon Athena/S3.



Contact Treehouse Software today to discuss your project, or to schedule a demo.