AWS / Treehouse Software Webinar: Moving Mainframe Data to Snowflake for Enterprise Wide Analytics


In this episode of the AWS Mainframe Modernization Broadcasting Channel webinar series, you will discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake on AWS. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data, ready for modern analytics. Find out how the Treehouse Dataflow Toolkit (TDT) works with virtually any mainframe data replication tool to provide a fully automated approach for rapid and comprehensive data transfer from Kafka streaming pipelines to Snowflake and other Analytics/AI/ML-friendly targets on AWS–AI-ready, with all target resources automatically created.

View the webinar recording here…


Contact Treehouse Software today to request a product demonstration or discuss your data modernization needs…


Join Treehouse Software for a new episode of the AWS Mainframe Modernization Broadcasting Channel!


Discover how enterprises are breaking down data silos by migrating mainframe data to Snowflake, enabling unified analytics across legacy and modern systems. In a real-world case study, see how Treehouse Software handles the complexities of VSAM, Db2, and Adabas data structures to deliver clean, queryable data ready for modern analytics.

Meet your presenters…

Sunil Divvela is a Worldwide Specialist Solutions Architect for Mainframe Modernization at AWS. He partners with customers and partners to accelerate their mainframe modernization journeys, leading initiatives from portfolio assessment through post-migration support leveraging Generative AI and Agentic AI. Prior to AWS, Sunil served as a Senior Technology Architect at Infosys, where he led multiple mainframe transformation programs.

Ellie Savova is a Cloud Solutions Architect at Treehouse Software, where she develops and architects AWS-cloud-native SaaS applications focused on enterprise data migrations. She works closely with customers to design secure, scalable cloud-native solutions across hybrid environments, integrating legacy data systems with modern analytics platforms. Ellie also leads customer implementations and innovation initiatives focused on cloud architecture, security, automation, and next-generation data-platform capabilities.

Register now using the QR code above, or HERE!



Transforming the automotive industry with AI-ready mainframe data delivery to Snowflake and AWS

by Joseph Brady, Director of Business Development at Treehouse Software

Treehouse Software data delivery for Auto industry

Treehouse Software’s Treehouse Dataflow Toolkit (TDT) is currently in production at a large auto manufacturer as their key component for replicating dealership and vehicle order management data from multiple disparate mainframe databases to Snowflake on AWS.

The TDT solution, together with Treehouse’s decades of mainframe expertise and our Cloud Engineers’ deep skills (backed by multiple top-level AWS certifications), accelerated the customer’s critical data move to Snowflake. Thanks to the Treehouse data delivery architecture, the customer’s data scientists and analysts can now access analytics-ready data through Snowflake. This enables sub-second query performance for rapid, intensive analytical tasks, and data sharing in real time, eliminating the need to move or copy data, thus enabling immediate insights across divisions and subsidiaries. The analytics teams can also make plans to easily add the latest AWS-based Analytics/AI/ML-friendly offerings, such as Amazon Redshift, Amazon Athena/S3, Amazon SageMaker AI, Amazon Bedrock, as well as any yet-to-be-developed Cloud services!

Since TDT is much more than a mere “connector,” the customer was able to eliminate months of development time and costs by using the tool to quickly and automatically prepare the full infrastructure needed for Snowflake data loading. As shown in the following architectural diagram, TDT takes the customer’s mainframe data that was pumped into Amazon MSK (Amazon Managed Streaming for Apache Kafka) by Rocket Data Replicate and Sync (RDRS) and lands it in Snowflake. TDT not only delivers the customer’s data, but its advanced crawler functions also automatically prepare landing tables, views, and staging infrastructure for Snowflake. Additionally, when needed, TDT now stands ready to generate an archiving infrastructure and create Apache Iceberg tables for enhanced data management.

The Treehouse solution is enabling the customer to quickly move away from slow, on-premises, batch-oriented processes to their new scalable, highly available, and secure Cloud-native system. They are also better positioned to innovate for future data needs and strategies on a flexible and easily customizable architecture.

Customer Benefits

  • The customer’s data scientists and analysts can now access analytics-ready data faster than ever before through Snowflake.
  • The new architecture easily allows testing and adding other Cloud-based Analytics/AI/ML-friendly targets and services.
  • From a data scientist’s perspective, effective dataflow tooling with TDT has delivered tangible benefits:
    • Fewer reconciliation issues with other divisions
    • Greater confidence in analytical results
    • Faster turnaround on urgent questions from leadership and multiple divisions
  • TDT’s auto scaling and parallelizing Lambda framework allows many parallel selects to all run at once, thus loading large tables with minimal latency.
  • Since TDT is built in alignment with AWS’s and Snowflake’s best practices, proper security and performance are ensured.
  • TDT is delivered via CloudFormation Templates, which automated and accelerated the process of installing and configuring the complete TDT application (including AWS Lambda functions and numerous other AWS resources, all wrapped in a well-architected security framework) in their AWS account. This allowed the customer to be up and running with a fully preconfigured implementation of the new data transfer pipeline in minutes.
  • The customer now has lasting compatibility with emerging Cloud technologies. As AWS and Snowflake introduce new features, TDT readily integrates them, staying ahead of the curve, keeping data pipelines modern and efficient.

In short: better data → better analytical judgment. 

Visit Treehouse Software on the AWS Marketplace for all of our Cloud offerings…

Treehouse Dataflow Toolkit (TDT) and TDT-DIRECT are Copyright © Treehouse Software, Inc. All rights reserved.


Contact us today to discuss your project! 

Treehouse Software accelerates insurance companies’ data delivery to the latest Analytics/BI/AI/ML-friendly platforms

by Joseph Brady, Director of Business Development at Treehouse Software and Eleonora Savova, Cloud Solutions Architect at Treehouse Software

Every major insurance policy begins with the same foundation: understanding historical claims data to determine how often claims occur, what they cost, and which factors drive risk.

Data analytics has always been the insurance industry’s lifeblood, with actuarial efforts relying on sophisticated analysis of historical data – an approach that, in many ways, resembles how modern machine learning (ML) techniques are used to quantify and manage risk and uncertainty.

All insurance companies hold decades of valuable data across legacy mainframe and non-mainframe systems. Actuaries, data scientists, and other analysts need data delivered into modern analytics environments in a reliable way. Treehouse Software addresses this challenge in the most straightforward and efficient manner by connecting legacy data to today’s top analytics platforms. Our Treehouse Dataflow Toolkit (TDT) enables rapid bulk load and change data capture (CDC) from multiple data sources into Snowflake and AWS. As part of this process, TDT automatically provides the required target infrastructure, making the data immediately available for Analytics, BI, and AI/ML operations.

TDT serves as a key component within AWS-based actuarial analytics architectures, enabling insurance organizations to move away from slow, on-premises, batch-oriented processes and toward scalable, cloud-native systems. These architectures leverage high-performance computing, data lakes, and serverless technologies to automate data ingestion, modeling, and reporting, reducing processing times from days to minutes.

With TDT handling data delivery, actuaries and data scientists can access analytics-ready data through platforms such as Snowflake, Amazon Redshift, etc. TDT operates in the background, ensuring that the most recent data is consistently available without manual intervention.

TDT Value and Benefits

Beyond delivering your data, TDT is MUCH more than a mere “connector”. It is a fully configurable end-to-end solution designed to manage the complete data delivery lifecycle. TDT’s advanced crawler capabilities automatically prepare the landing tables, views, and staging infrastructure required by the target.

TDT is built in alignment with AWS’s and Snowflake’s best practices, ensuring proper security and performance. This consistent adherence to best practices is a key differentiator that sets TDT apart from many other “connectors” on the market.

From an actuary’s perspective, effective dataflow tooling with TDT delivers tangible benefits:

  • Fewer reconciliation issues with Finance
  • Greater confidence in analytical results
  • Faster turnaround on urgent questions from leadership

In short: better data → better actuarial judgment. 

Treehouse provides highly detailed CloudFormation Templates which automate and accelerate the process of installing and configuring the complete TDT application (including AWS Lambda functions and a number of other AWS resources) in your AWS account(s). The TDT CloudFormation Templates create stacks consisting of all principal framework components, along with related IAM policies and roles which are carefully engineered to comply with “best practices” (such as a “least privilege” approach to permissions).

The TDT CloudFormation Templates also optionally provide for automatic creation of a VPC, its subnets, and all required standard VPC-oriented resources, as well as optional creation of a source database cluster (consisting of either a sample database provided by Treehouse for a quick trial/POC, or your own database and data).

Simply put, TDT is a Cloud-native, turnkey solution that can eliminate months (or even years) of research and development time and costs and allow customers to be up and feeding data to an actuarial analytics architecture in minutes.

Visit Treehouse Software on the AWS Marketplace for all of our Cloud offerings…

Treehouse Dataflow Toolkit (TDT) and TDT-DIRECT are Copyright © Treehouse Software, Inc. All rights reserved.


Contact us today to schedule a demo! 

Escape the complexity: Treehouse Software’s fully automated, Lambda-based solution accelerates highly scalable data delivery to AWS

by Joseph Brady, Director of Business Development at Treehouse Software, Inc.

Since 1982, Treehouse Software has been serving enterprises worldwide with industry-leading mainframe software products and outstanding technical support. Today, Treehouse Software brings you Treehouse Dataflow Toolkit (TDT), a serverless (Lambda-based) application that goes beyond basic data transfer. It is a fully automated solution that prepares the full staging infrastructure needed for massive data loading to targets such as Amazon Redshift, Snowflake, Amazon Athena/S3, Amazon S3 Express One Zone buckets, and Amazon Aurora PostgreSQL. TDT supports data replication between mainframe and non-mainframe sources, without disrupting existing critical work on customers’ legacy systems.

The Treehouse solution utilizes Rocket Data Replicate and Sync (RDRS) to pull data from the mainframe, where an agent (with a very small footprint) extracts data (via either bulk-load or CDC processing). The raw data is then securely passed from the mainframe by RDRS, which transforms and publishes the data to a Kafka topic (in our example above, a topic in an Amazon MSK cluster). The TDT microservices consume the data from MSK/Kafka and land it in S3 buckets, where TDT’s proprietary crawler technology is used to automatically prepare landing tables, views, and additional infrastructure for various analytics-friendly targets. Then the mainframe data is loaded into Redshift, Snowflake, S3, or PostgreSQL (all the while adhering to AWS’s and Snowflake’s recommended “best practices” for massive data loading, thus assuring shortest and surest loads). The inherent reliability and scalability of the entire pipeline infrastructure assures near-real-time synchronization between mainframe sources and the target tables, even with very large bulk-loads or transaction-heavy CDC processing.
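
To make the crawler step more concrete, here is a minimal Python sketch of the general idea: scan JSON records that have landed in storage and derive a landing-table definition from them. This is not TDT’s proprietary crawler; the function names (`sql_type`, `infer_columns`, `landing_table_ddl`), the sample records, and the type mapping are all hypothetical and chosen only for illustration.

```python
import json

NUMERIC_TYPES = {"BIGINT", "DOUBLE PRECISION"}

def sql_type(value):
    """Map a JSON value to a generic SQL column type (hypothetical mapping)."""
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE PRECISION"
    return "VARCHAR"

def infer_columns(json_lines):
    """Scan JSON records and return {column: type}, widening on conflicts."""
    columns = {}
    for line in json_lines:
        record = json.loads(line)
        for name, value in record.items():
            if value is None:
                columns.setdefault(name, None)  # type not known yet
                continue
            new_type = sql_type(value)
            old_type = columns.get(name)
            if old_type is None:
                columns[name] = new_type
            elif old_type != new_type:
                # int + float widens to a float column; anything else to text
                both_numeric = {old_type, new_type} <= NUMERIC_TYPES
                columns[name] = "DOUBLE PRECISION" if both_numeric else "VARCHAR"
    # Columns that were always null fall back to text
    return {name: (col_type or "VARCHAR") for name, col_type in columns.items()}

def landing_table_ddl(table_name, columns):
    """Render a CREATE TABLE statement for the inferred columns."""
    cols = ",\n".join(f"    {name} {col_type}" for name, col_type in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {table_name} (\n{cols}\n);"

# Hypothetical records, shaped like JSON published to a Kafka topic
sample = [
    '{"order_id": 1001, "dealer": "D-17", "amount": 25000}',
    '{"order_id": 1002, "dealer": "D-22", "amount": 31250.5}',
]
cols = infer_columns(sample)
print(landing_table_ddl("landing_orders", cols))
```

A production crawler would also have to handle nested structures, target-specific types, and schema changes over time, none of which this sketch attempts.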

What about non-mainframe data?

For customers who have non-mainframe data sources, Treehouse offers TDT-DIRECT, which pulls data directly from PostgreSQL, SQL Server, Oracle, MySQL, and Db2 for bulk-load and CDC into a variety of targets on AWS.

Instantaneous auto scaling…

For massive amounts of data, TDT takes advantage of the auto scaling and parallelizing of the Lambda framework. This allows many parallel selects to all run at once, thus loading large tables with minimal latency. Additionally, all TDT Lambda microservices are fully customizable (they will be YOUR Lambdas) to add extra monitoring capabilities, and any other functionalities for future needs.
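
The “many parallel selects” idea can be illustrated with a small Python sketch. It splits a table’s key range into contiguous slices and processes them concurrently; local threads stand in for parallel Lambda invocations, and `load_range` is a placeholder rather than a real database call. All names and the range-splitting strategy are assumptions for illustration, not a description of TDT’s internals.

```python
from concurrent.futures import ThreadPoolExecutor

def key_ranges(min_key, max_key, parts):
    """Split the inclusive range [min_key, max_key] into up to `parts` slices."""
    total = max_key - min_key + 1
    size = -(-total // parts)  # ceiling division
    ranges = []
    start = min_key
    while start <= max_key:
        end = min(start + size - 1, max_key)
        ranges.append((start, end))
        start = end + 1
    return ranges

def load_range(bounds):
    """Placeholder worker: a real one would run a bounded SELECT
    (WHERE key BETWEEN low AND high) and load the resulting rows."""
    low, high = bounds
    return high - low + 1  # pretend each key is exactly one row

def parallel_load(min_key, max_key, parts):
    """Run one worker per slice concurrently and return total rows handled."""
    ranges = key_ranges(min_key, max_key, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(load_range, ranges))

rows_loaded = parallel_load(1, 1_000_000, 8)
print(rows_loaded)
```

Because each slice is independent, adding workers shortens the wall-clock time of a large load until the source or target becomes the bottleneck.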

TDT’s innovative Lambda-based microservices approach enables faster data flow than any conceivable ODBC-based solution, which is the standard tool used for most “roll your own” approaches, or “we have a connector for that” offerings. TDT offers several key differentiators from standard “connectors” on the market, including:

  • Automatic creation of target resources – TDT automatically prepares landing tables, views, and additional staging infrastructure for the target. Without TDT’s fully automated approach, a customer can spend months designing and creating target resources, such as delta tables, views, schemas, etc.
  • Ease of delivery/implementation – TDT is delivered via CloudFormation templates, which automate and accelerate the process of installing and configuring the complete TDT application (including AWS Lambda functions and numerous other AWS resources, all wrapped in a well-architected security framework) in your AWS account. This allows your site to be up and running with a fully preconfigured implementation of your new data transfer pipeline in minutes.
  • Adherence to best practices – TDT is built in alignment with AWS and Snowflake best practices, ensuring proper security and performance. The fault-tolerant design of the Cloud-native application provides for a robust, future-proof architecture.
  • Adaptability to evolving Cloud ecosystems – In today’s fast-evolving cloud world, TDT’s flexible design ensures lasting compatibility with emerging technologies. As AWS and Snowflake introduce new features, the application readily integrates them, staying ahead of the curve, keeping your data pipelines modern and efficient.

TDT and TDT-DIRECT are designed to deliver: 

  • rapid mainframe and non-mainframe data bulk-loading and CDC to Snowflake and AWS targets
  • access to the latest Analytics, AI, and ML tools and services
  • swift ROI

Contact us today to discuss your needs, or to book a free demo.

Visit Treehouse Software on the AWS Marketplace for all of our Cloud offerings…

Treehouse Dataflow Toolkit (TDT) and TDT-DIRECT are Copyright © Treehouse Software, Inc. All rights reserved.


Contact us today to schedule a demo! 

Treehouse High Availability Framework Service Ensures Minimal Downtime for Customers using Rocket Data Replicate and Sync on AWS

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

Treehouse Software customers are using Rocket Data Replicate and Sync (RDRS) to enable mission-critical Mainframe-to-AWS data replication pipelines.  Some of these production pipelines are providing vital near-real-time synchronization between source and target, and thus can’t afford any significant downtime in the event of failure.  So it’s only natural that a number of our customers have been asking for advice in setting up a high availability (HA) configuration for their RDRS components that run on AWS EC2 instances.  As a result, Treehouse Software provides an HA Framework Professional Services engagement, in which our expert Cloud engineers help customers with delivery, setup, rapid deployment, and customization of an RDRS HA framework.  The HA Framework seamlessly and quickly provides for a Failover EC2 instance to automatically pick up RDRS processing should the Primary instance (running in another Availability Zone) go down.

Setting Up Automatic Failover with EC2 Instances in Different Availability Zones

The core components of the RDRS HA Framework consist of two EC2 instances running in different Availability Zones:  1) a Primary EC2 instance and 2) a Failover EC2 instance.  Both identically-configured EC2 instances are attached to a shared working-storage file system (either an EFS or FSx volume), which allows the Failover instance to seamlessly and quickly pick up RDRS processing should the Primary instance suddenly become unavailable.

A Step Function Automates the Failover Process

In the event of failure of the Primary instance, the HA Framework calls for automatic triggering of a Step Function for reliable failover processing, with steps that include the following:

  • Verify that the Primary instance is unavailable. (The RDRS service cannot be active on both instances simultaneously, so this verification is vital.)
  • Redirect all network traffic from the Primary instance to the Failover instance (via Route 53).
  • Start RDRS processing on the Failover instance.
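
The ordering of these steps matters, and a short Python sketch can show why. The three callables below are hypothetical stand-ins for the real actions (a health check, a Route 53 record update, and starting the RDRS service); an actual Step Function would be defined as a state machine rather than Python code. The key point is that nothing is redirected or started unless the Primary is confirmed down.

```python
def run_failover(primary_is_available, redirect_traffic, start_rdrs):
    """Run the failover steps in order; abort if the Primary is still up."""
    log = []
    # Step 1: never proceed while the Primary may still be running RDRS
    if primary_is_available():
        log.append("aborted: Primary is still available")
        return log
    log.append("verified: Primary is unavailable")
    # Step 2: point network traffic at the Failover instance
    redirect_traffic("failover")
    log.append("traffic redirected to Failover")
    # Step 3: only now start RDRS processing on the Failover instance
    start_rdrs("failover")
    log.append("RDRS started on Failover")
    return log

# In-memory stand-ins for the real infrastructure (illustration only)
state = {"primary_up": False, "dns_target": "primary", "rdrs_running_on": None}

def set_dns(target):
    state["dns_target"] = target

def start(target):
    state["rdrs_running_on"] = target

result = run_failover(lambda: state["primary_up"], set_dns, start)
print(result)
```

Passing the actions in as functions keeps the sequence itself easy to test without touching any AWS resources.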

Use a Step Function to Automate the Restoration Process

After operations personnel have completed recovery of the Primary EC2 instance, another Step Function may be manually triggered to reliably transfer RDRS processing back to the Primary instance.

AWS services utilized in the complete recommended framework include Step Functions, Lambda Functions, EventBridge rules, CloudWatch alarms, SNS topics, a Route 53 Private Hosted Zone, and more.  


For more information on Treehouse’s High Availability Framework Professional Service and our other offerings, visit Treehouse Software on the AWS Marketplace.


Interested in discussing your project? Contact us today…

Test out Mainframe-to-Cloud Data Replication with a Treehouse Software Proof of Concept

by Joseph Brady, Director of Business Development at Treehouse Software, Inc.

 

Many Treehouse Software customers have discovered that they can save weeks, or months in their mainframe modernization initiatives by doing a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC generally lasts as little as 10 business days after the product is installed and all connectivity is set up between the mainframe and Cloud environments. Treehouse Software provides documentation beforehand that outlines all of the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.

During this paid POC (a portion of the payment is credited towards product purchase), the customer provides a test environment, representative subset of z/OS mainframe data, use case, timeline, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions. The application is executed on customer facilities, in a non-production environment, and a limited-scope implementation of an RDRS application is conducted to prove that the product meets the customer’s desired use case.

By the end of the POC, customers will have used RDRS to replicate mainframe data on their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process, with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost, in terms of human resources and time, makes an RDRS POC a valuable ROI in the customer’s mainframe modernization journey.

About RDRS…


Many Treehouse partners are recommending RDRS for Mainframe-to-Cloud modernization projects. RDRS focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.

Additionally, RDRS utilizes a Windows-based GUI Dashboard, which is ideal for non-mainframe programmers. While mainframe experts are required in the initial design/architecture phase of the POC and occasionally during implementation, the requirement for their involvement is minimal. The RDRS Dashboard acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.

Once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while data is replicated – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as analytics, machine learning and artificial intelligence (AI), etc., as well as move data to a variety of highly available and secure databases and data stores.



Want to see an RDRS demo first?

Simply fill out our RDRS Demonstration Request Form and a Treehouse representative will be contacting you to set up a time for your requested demonstration.

A Treehouse Software Proof of Concept is the low-risk approach to testing mainframe data replication on Cloud and Hybrid Cloud environments

by Joseph Brady, Director of Business Development / Cloud Alliance Leader at Treehouse Software, Inc.


Many Treehouse Software customers have discovered the value of saving weeks, or months in their mainframe modernization initiatives by engaging in a Rocket Data Replicate and Sync (RDRS) Proof of Concept (POC) for Mainframe-to-Cloud data replication. Depending on the complexity of the customer’s project, an RDRS POC generally lasts as little as 10 business days after the product is installed and all connectivity is set up between the mainframe and Cloud environments.

How does it work?

  1. Treehouse Software provides documentation beforehand that outlines all of the requirements and agenda for the POC, and Treehouse technicians assist in downloading and installing RDRS.
  2. The customer provides a representative subset of z/OS or z/VSE mainframe data (e.g., Db2, Adabas, VSAM, IMS/DB, CA IDMS, CA DATACOM, etc.), use case, and goals for the POC, and the Treehouse team mentors the customer’s technical team via remote screen sharing sessions.
  3. The application is executed on customer facilities, in a non-production environment, and a limited-scope implementation of RDRS is conducted to prove that the product meets the customer’s desired use case.

By the end of the POC, customers will have replicated mainframe data on their Cloud target, tested out product capabilities, and demonstrated a successful, repeatable data replication process, with documented results. After the POC, the customer has all the connectivity and processes in place to begin setting up the production phase of their mainframe data modernization project. The minimal cost and resources make an RDRS POC a valuable ROI in the customer’s mainframe modernization journey.

About RDRS…

Many Cloud and Systems Integration partners are recommending RDRS for mainframe data modernization projects. RDRS focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud targets. Through an innovative technology, changes occurring in any mainframe application data are tracked and captured, and then published to a variety of RDBMS and other targets.

RDRS utilizes a Windows-based GUI Control Board, which is ideal for non-mainframe programmers. While mainframe experts are required in the design/architecture phase during the POC and occasionally during implementation, the requirement for their involvement is limited. The RDRS Control Board acts as a single point of administration, data modeling and mapping, script generation, and monitoring. Comprehensive monitoring and logging of all data movements ensure transparency across all data exchange processes.

Additionally, once RDRS is up and running, the customer’s legacy mainframe environment can continue as long as needed, while they replicate data – in real time and bi-directionally – on the new Cloud platform. Now the enterprise can quickly take advantage of the latest Cloud services, such as advanced analytics, ML/AI, etc., as well as move data to a variety of highly available and secure databases and data stores.



Contact Treehouse Software Today…

Contact us to discuss how a Treehouse Software POC can accelerate your mainframe Cloud and hybrid Cloud data modernization journey.

Does your data science team want to accelerate insights and bring advanced ML/AI capabilities to your mainframe data with Amazon Redshift? Sure they do—and Treehouse Software enables that…

by Joseph Brady, Director of Business Development at Treehouse Software, Inc. and Dan Vimont, Director of Innovation at Treehouse Software, Inc.

We are beginning to see a pleasant and welcomed trend with Treehouse customers who are looking to modernize their valuable mainframe legacy data on the Cloud—they are including their data science teams in the important planning phase of architecting new Cloud environments and targets. This is especially vital for customers who want to incorporate advanced analytics and ML/AI in their strategic data usage plans on the Cloud. Who can contribute better understandings of ultimate data usage than your resident data scientists?


We have heard from many of these data scientists that a primary item on their “wish lists” is a fully managed, AI-powered, massively parallel processing (MPP) architecture to extract maximum value and insights. They specifically mention Amazon Redshift as the Cloud data warehouse (which is much more than a data warehouse) of choice for driving digitization across the enterprise, as well as helping personalize customer experiences. Redshift uses SQL to analyze structured and semi-structured data across data warehouses, operational databases, and data lakes, using AWS-designed hardware and ML to deliver the highest performance at any scale. To this desire/question, we can answer with a resounding, “Yes, Treehouse Software has got you covered with Redshift connectivity!”

The Treehouse Software solution…

Enterprise customers have come to Treehouse Software, because we bring not only proven mainframe data replication tools, but deep subject matter expertise in mainframe technologies, as well as the know-how to target relevant AWS offerings, such as Redshift, S3 (including S3 Express One Zone – see our recent blog on S3 Express One Zone), etc.

The Rocket Data Replicate and Sync (RDRS) solution allows customers’ legacy mainframe environment to operate normally while replicating data on AWS. The technology focuses on changed data capture (CDC) when transferring information between mainframe data sources and Cloud-based databases and applications. Through an innovative set of technologies, changes occurring in any mainframe datastore are tracked and captured, and ultimately published to Redshift.

How does it work?

  1. We start at the source – the mainframe – where an agent (with a very small footprint) extracts data (in the context of either bulk-load or CDC processing).
  2. The raw data is securely passed from the mainframe to RDRS, which speedily transforms mainframe-formatted data into Unicode/JSON and publishes the results to a Kafka topic.
  3. Our efficient, autoscaling microservices take it from there. Treehouse Dataflow Toolkit functions consume the data from Kafka and land it in S3 buckets, where Treehouse’s proprietary crawler technology is used to automatically prepare landing tables, views, and additional infrastructure in Redshift. Then the mainframe data is loaded into Redshift (all the while adhering to AWS’ recommended “best practices” for massive data loading, thus assuring shortest and surest loads). The inherent reliability and scalability of the entire pipeline infrastructure assure near-real-time synchronization between mainframe sources and Redshift target tables.
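
Step 2 above mentions transforming mainframe-formatted data into Unicode/JSON. The Python sketch below shows what that kind of conversion involves for one made-up fixed-width record: an EBCDIC text field plus a packed-decimal (COMP-3) amount. It is not how RDRS is implemented; the record layout, the field names, and the choice of code page `cp037` are assumptions for illustration.

```python
import json

def unpack_comp3(raw, scale=0):
    """Decode an IBM packed-decimal (COMP-3) field.

    Each byte holds two decimal digits, except the last byte, whose low
    nibble is the sign (0xD means negative; 0xC or 0xF means positive).
    """
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    if sign_nibble == 0x0D:
        value = -value
    # A float keeps the sketch short; real code would likely prefer Decimal
    return value / (10 ** scale) if scale else value

def record_to_json(raw):
    """Convert a hypothetical record: 8-byte EBCDIC name, then a
    3-byte COMP-3 amount with two implied decimal places."""
    name = raw[0:8].decode("cp037").rstrip()
    amount = unpack_comp3(raw[8:11], scale=2)
    return json.dumps({"name": name, "amount": amount})

# Build a sample record: "SMITH" padded to 8 bytes, amount +123.45
record = "SMITH   ".encode("cp037") + bytes([0x12, 0x34, 0x5C])
print(record_to_json(record))
```

Real mainframe records add further complications, such as redefined fields, repeating groups, and site-specific code pages, which is why this step is normally left to a dedicated replication tool.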

Redshift tables and views: something for everybody

Within this framework, the Redshift staging tables (often referred to as “delta tables”) are constantly accruing historical data, ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work.  For business analysts and others who prefer structured data representations of potentially complex hierarchical data, the Treehouse framework also automatically provides structured user-views, providing the look and feel of a SQL database.
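
The relationship between an accumulating delta table and a current-state view can be sketched with Python’s built-in `sqlite3` module standing in for Redshift. The table, column names, and operation codes below are hypothetical, and the structures TDT actually generates may differ; the sketch only shows how history can be kept in full while a view exposes the latest state of each row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_delta (
    seq      INTEGER PRIMARY KEY,  -- arrival order of each change record
    order_id INTEGER,
    status   TEXT,
    op       TEXT                  -- 'I' insert, 'U' update, 'D' delete
);

INSERT INTO orders_delta VALUES
    (1, 1001, 'NEW',     'I'),
    (2, 1002, 'NEW',     'I'),
    (3, 1001, 'SHIPPED', 'U'),
    (4, 1002, 'NEW',     'D');

-- Current state: the latest change per order, hiding deleted orders
CREATE VIEW orders_current AS
SELECT d.order_id, d.status
FROM orders_delta AS d
WHERE d.seq = (SELECT MAX(seq) FROM orders_delta WHERE order_id = d.order_id)
  AND d.op <> 'D';
""")

current = conn.execute(
    "SELECT order_id, status FROM orders_current ORDER BY order_id"
).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM orders_delta").fetchone()[0]
print(current, history_count)
```

Data scientists can query `orders_delta` for the full change history, while analysts who only want the present picture query `orders_current`.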

…as innovations move faster along the timeline, keep your options open!

Publishing both bulk-load and CDC data to a reliable and scalable framework like Kafka allows you to maintain a broad array of options to ultimately feed your legacy data to any number of JSON-friendly ETL tools, target datastores, and data analytics packages (some of which may not even have been invented yet!).  In addition to Redshift, the Treehouse Dataflow Toolkit also currently targets Snowflake, Amazon DynamoDB, and Amazon Athena/S3.

Video – Introduction to Data Warehousing on AWS with Amazon Redshift…



Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution. 

Treetip: Treehouse Software can help enterprise mainframe customers accelerate their data analytics, machine learning, and AI journeys by targeting the new Amazon S3 Express One Zone

by Joseph Brady, Director of Business Development and Cloud Alliance Leader at Treehouse Software, Inc.

Treehouse Software specializes in helping enterprise customers with Mainframe-to-Cloud, Multi-Cloud, and Hybrid Cloud data modernization projects. Many times, our customers not only discuss strategies for replicating their mainframe data, but also their plans for what they want to do with that data on the Cloud side. This makes it important to our team to stay current on the latest Cloud offerings that can benefit our customers’ enterprise modernization planning. Consequently, a very exciting announcement caught our attention during the 2023 AWS re:Invent conference: the general availability of a new S3 storage class called Amazon S3 Express One Zone.

For those unfamiliar, Amazon S3 (“Simple Storage Service”) is the basic object storage service of AWS, and as such it forms a foundational pillar of the entire AWS world. Amazon S3 Express One Zone uses a new type of S3 bucket called a “directory bucket”, which is purpose-built to deliver consistent, single-digit millisecond data access for an enterprise’s most frequently used data and latency-sensitive applications. The new S3 directory buckets allow customers to store data in a single Availability Zone (AZ) that they specifically select, as opposed to the minimum of three AZs used by standard S3. This eliminates the latency associated with spreading data across multiple AZs, providing applications with lower-latency storage. S3 directory buckets also follow a different request scaling model compared to traditional buckets, and their authentication is based on sessions rather than on a per-request basis. Bottom line… reduction in compute time = greater cost reduction.

S3 Express One Zone is ideally suited for services such as Amazon SageMaker Model Training, Amazon Athena, Amazon EMR, and AWS Glue Data Catalog to accelerate Machine Learning (ML) and interactive analytics workloads. With S3 Express One Zone, storage automatically scales up or down based on consumption and need, and customers no longer need to manage multiple storage systems for low-latency workloads.

So, why is S3 Express One Zone important to Treehouse mainframe modernization customers?


Amazon S3 Express One Zone just made the Amazon S3 targeting in the Treehouse Dataflow Toolkit (TDT) potentially much more potent and valuable to our enterprise mainframe customers.  When an enterprise uses TDT to land their mission critical data in Express One Zone flavored Athena/S3 buckets, it becomes more directly accessible and manipulable by the various AWS ML and AI tools. In short, if customers choose, Express One Zone Athena/S3 becomes an intermediate data store for big data processing workloads and advanced analytics.

So, when we are asked, “What should Treehouse Software be doing to respond to the burgeoning interest in ML, Generative AI, etc.?”, the answer is — We are doing exactly what we need to be doing.  AI and ML frameworks are the newest incentive for people to use RDRS (Rocket Data Replicate and Sync — formerly called tcVISION) and TDT from Treehouse Software to replicate their mainframe data on advanced data analytics frameworks, or possibly into super-charged S3 Express One Zone buckets.  

Video – Deep Dive Introduction to Amazon S3 Express One Zone Storage Class:



Contact Treehouse Software today to discuss your project, or to schedule a demo of our Mainframe-to-AWS real-time and bi-directional data replication solution.