by Joseph Brady, Director of Business Development at Treehouse Software, Inc., Dan Vimont, Director of Innovation at Treehouse Software, Inc., and Ram Dhakne, Staff Solutions Engineer at Confluent
Enterprise customers planning to modernize their data in Cloud environments are stating their needs clearly: “We want a way to unify and manage data from our applications, databases, data warehouses, etc., which have long operated in silos.”
These customers also have a crucial need to tap into today’s advanced data analytics platforms, such as Snowflake, Amazon Redshift, and Amazon Athena/S3, where an ever-expanding array of machine learning and artificial intelligence (ML/AI) tools is available to generate vital insights from enterprise data. Data science teams are eagerly awaiting the arrival of critical data from their enterprise’s data sources to supercharge their predictive analytics and generative AI frameworks.
Data Transfer + Unlimited Scaling and Storage
To address the need for rapid, high-volume data transfer from source DBs to Analytics/ML/AI-friendly platforms, Treehouse Software has recently gone to market with two powerful new offerings: Treehouse Dataflow Toolkit (TDT) for Mainframe Data Sources and TDT-DIRECT for Non-Mainframe Data Sources. These Cloud-native, fully automated, turn-key solutions work hand-in-hand with the premier data streaming platform, Confluent, to empower enterprise customers to rapidly migrate data, both bulk-load and change data capture (CDC), to Snowflake, Amazon Redshift, Amazon Athena/S3, and Amazon S3 Express One Zone.
The TDT offerings are much more than mere “connectors”, providing an innovative and robust Lambda-based microservices infrastructure that automatically generates all target resources required for data transfer. Without TDT-DIRECT’s fully automated approach, a customer can spend months designing and creating target resources, such as delta tables, views, schemas, etc.
TDT-DIRECT extracts data directly from a source DB and loads it via Confluent into Snowflake’s “delta tables”, which inherently retain the entire history of source data ever since the source-to-target synchronization began (perfect for time-based trend/predictive/prescriptive analytics).
Figure 1: TDT-DIRECT automatically creates all Snowflake target structures (schemas, history tables, current views, user views, stages, and file formats), and Confluent delivers the data (e.g., insert, update, delete transactions) via bulk-load and CDC.
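To make the relationship between a history ("delta") table and a current view more concrete, here is a minimal Python sketch. It is not TDT-DIRECT code and does not reflect the actual Snowflake structures the product generates. The ChangeEvent class, its field names (seq, key, op, row), and the sample rows are all hypothetical, chosen only to illustrate the idea that an append-only record of insert, update, and delete transactions is enough to rebuild both the current state and the state at any earlier point in time.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One row of a hypothetical history table (illustrative schema only)."""
    seq: int               # assumed monotonically increasing sequence number
    key: int               # primary key of the source row
    op: str                # "insert", "update", or "delete"
    row: Optional[dict]    # row image after the change; None for deletes


def current_view(history: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Return the latest surviving row image for each key."""
    latest: Dict[int, ChangeEvent] = {}
    # Replay events in sequence order so later changes overwrite earlier ones.
    for event in sorted(history, key=lambda e: e.seq):
        latest[event.key] = event
    # Keys whose most recent event is a delete are absent from the view.
    return {k: e.row for k, e in latest.items() if e.op != "delete"}


def view_as_of(history: Iterable[ChangeEvent], seq: int) -> Dict[int, dict]:
    """Return the view as it stood after the event numbered `seq`."""
    return current_view([e for e in history if e.seq <= seq])


# Hypothetical sample data: nothing is ever removed from the history itself.
history = [
    ChangeEvent(1, 101, "insert", {"name": "Ada", "city": "Pittsburgh"}),
    ChangeEvent(2, 102, "insert", {"name": "Grace", "city": "Boston"}),
    ChangeEvent(3, 101, "update", {"name": "Ada", "city": "Denver"}),
    ChangeEvent(4, 102, "delete", None),
]

if __name__ == "__main__":
    print(current_view(history))
    print(view_as_of(history, 2))
```

Because the history is only ever appended to, the same data can serve both a "what is true now" query and a "what was true then" query, which is what makes this kind of structure a good fit for time-based trend analysis.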
Leveraging AWS CloudFormation for ease of implementation…
TDT is delivered via CloudFormation templates, allowing customer sites to be up and running with a fully preconfigured implementation of a new data transfer pipeline in minutes. The TDT CloudFormation Templates create stacks consisting of all principal framework components, along with related IAM policies and roles that are carefully engineered to comply with best practices (such as a “least privilege” approach to permissions).
The TDT CloudFormation Templates also optionally provide for automatic creation of a VPC, its subnets, and all required standard VPC-oriented resources, as well as optional creation of a source database cluster (consisting of either a sample database provided by Treehouse for a quick trial/POC, or your own database and data).
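As an illustration of what a least-privilege approach to IAM permissions can look like in general, the following Python sketch builds a narrowly scoped policy document and checks it for wildcard actions. It does not show the policies that the TDT templates actually create. The function names, the bucket name example-staging-bucket, and the cdc prefix are hypothetical; only the overall JSON policy layout (Version, Statement, Effect, Action, Resource, Condition) follows the standard IAM policy format.

```python
import json
from typing import List


def build_read_only_policy(bucket_name: str, prefix: str) -> dict:
    """Build an IAM policy document limited to reading one S3 prefix.

    Hypothetical helper for illustration; bucket and prefix are assumptions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadStagedObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Object-level access is limited to a single prefix.
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}/*",
            },
            {
                "Sid": "ListStagingPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
                # Listing is restricted to keys under the same prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


def wildcard_actions(policy: dict) -> List[str]:
    """Return every action in the policy that contains a '*' wildcard."""
    found: List[str] = []
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):  # IAM allows a single string here
            actions = [actions]
        found.extend(a for a in actions if "*" in a)
    return found


policy = build_read_only_policy("example-staging-bucket", "cdc")

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
    print("Wildcard actions:", wildcard_actions(policy))
```

A check like wildcard_actions is one simple way to review a policy: each statement names only the specific actions and resources a component needs, instead of granting broad access such as s3:* on every bucket.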
The Confluent Advantage…
Treehouse Software’s TDT solutions fully support data transfers from mainframe and non-mainframe data sources to Confluent Cloud, which offers enhanced productivity, improved scalability, minimized downtime, and much more, all while reducing total cost of ownership. Confluent Cloud brings customers a fully managed Kafka service and a complete pre-built ecosystem that includes:
- Elastic Scaling: Scale up and down quickly to meet fluctuating customer demand, without the ops burden that comes with scaling your data infrastructure.
- Infinite Storage: Enable powerful use cases by never having to worry about Kafka retention limits again, while paying only for the storage used.
- Built-in Resiliency: Ensure high availability and offload Kafka ops with a 99.99% uptime SLA, multi-AZ clusters, and no-touch Kafka patches.
- Serverless stream processing for Apache Flink®: Flink is the de facto industry standard for stream processing. Confluent Cloud for Apache Flink provides a cloud-native, serverless service for Flink that enables simple, scalable, and secure stream processing that integrates seamlessly with Apache Kafka®. Your Kafka topics appear automatically as queryable Flink tables, with schemas and metadata attached by Confluent Cloud.
A Powerful, Combined Solution…
Treehouse Software and Confluent provide a comprehensive framework that allows the target platform to constantly accrue the most current source data, which is ideally suited for data scientists looking to do trend analysis, predictive analytics, ML, and AI work.
Treehouse Dataflow Toolkit (TDT) and TDT-DIRECT are Copyright ©Treehouse Software, Inc. All rights reserved.
Contact Treehouse Software for a TDT Demo Today!
Treehouse Software offers SIs and consulting companies free “deep dive” learning sessions to educate your team on the value of bringing these turn-key data transfer solutions to your customers.
Contact us today to schedule your session!