
Apache Sqoop Adapter

Hadoop Sqoop

Apache Sqoop is a big data tool for transferring data between Hadoop and relational database servers. It acts as a bridge for data migration and integration between relational and distributed data environments.

Incorporate big data into any workflow

Enjoy a user-friendly approach to Sqoop.

Use parallel data transfer

Transfer bulk data efficiently between Hadoop and external data stores.

Simplify data integration

Import and export data without extensive coding.

Easily handle failures

Ensure data integrity with built-in fault tolerance.

Resilient data integration

The Tidal adapter makes it easy to import and export data from structured data stores such as relational databases and enterprise data warehouses, using Sqoop to move the data between Hadoop and those systems.

You can use Sqoop to import data from a relational database management system (RDBMS) into the Hadoop Distributed File System (HDFS™), transform the data in Hadoop MapReduce and then export the data back into an RDBMS. The adapter allows you to automate the tasks carried out by Sqoop.
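The import–transform–export round trip described above might look like the following Sqoop invocations. This is a sketch: the connection string, credentials, table names, and HDFS paths are all hypothetical.

```shell
# Import the (hypothetical) "orders" table from MySQL into HDFS
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user --password-file /user/etl/.db_password \
  --table orders \
  --target-dir /data/raw/orders

# ...run a MapReduce (or Hive/Spark) transformation over /data/raw/orders,
# writing results to /data/curated/order_summary...

# Export the transformed records back into an RDBMS table
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user --password-file /user/etl/.db_password \
  --table order_summary \
  --export-dir /data/curated/order_summary
```

With the Tidal adapter, each of these steps can be defined as a scheduled job rather than run by hand.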

What the adapter enables

The integration enables the following job definitions in Tidal:

  • Code Generation: Generate Java classes that interpret imported records.
  • Export: Export files from HDFS™ back to an RDBMS.
  • Import: Import structured data from an RDBMS to HDFS™.
  • Merge: Combine two datasets, where entries in the newer dataset overwrite entries in the older one.
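As a sketch, the four job definitions correspond to the following Sqoop tools. Connection details, table names, paths, and the generated class/jar names below are hypothetical.

```shell
# Code Generation: generate Java classes that interpret the table's records
sqoop codegen --connect jdbc:mysql://db.example.com/sales --table orders

# Import: pull structured data from the RDBMS into HDFS
sqoop import --connect jdbc:mysql://db.example.com/sales --table orders \
  --target-dir /data/raw/orders

# Export: push files from HDFS back to an RDBMS table
sqoop export --connect jdbc:mysql://db.example.com/sales --table order_summary \
  --export-dir /data/curated/order_summary

# Merge: combine two datasets, newer entries overwriting older ones by key
sqoop merge --new-data /data/raw/orders_delta --onto /data/raw/orders \
  --target-dir /data/raw/orders_merged \
  --jar-file orders.jar --class-name orders --merge-key order_id
```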

How it works

  • Data import:
    • Sqoop introspects the database to gather the metadata needed for the data being imported
    • Sqoop then submits a map-only Hadoop job to the cluster, which transfers the data using the metadata captured in the previous step
  • Data export:
    • Sqoop introspects the database for metadata, then transfers the data
    • Sqoop divides the input dataset into splits and uses individual map tasks to push the splits to the database
    • Each map task performs this transfer over many transactions to ensure optimal throughput and minimal resource utilization
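The parallel transfer described above is controlled by Sqoop's mapper settings. A minimal sketch, assuming a numeric `order_id` column and hypothetical connection details:

```shell
# Import with four parallel map tasks; Sqoop splits the key range of
# "order_id" (discovered during metadata introspection) across the mappers
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --split-by order_id \
  -m 4 \
  --target-dir /data/raw/orders

# Export: the files under --export-dir are divided into splits, and each
# map task pushes its split to the database in batched transactions
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --table order_summary \
  --export-dir /data/curated/order_summary \
  -m 4
```

Choosing `-m` (the number of mappers) trades transfer speed against load on the source database.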

Tidal and Hadoop Sqoop integration FAQs

  • What is the purpose of Sqoop in Hadoop?

    Sqoop efficiently transfers bulk data between Hadoop and external data stores like relational databases (e.g., MS SQL Server, MySQL), enabling Hadoop-based data processing.

  • Why integrate Tidal workload automation with Hadoop?

    Integrating Tidal workload automation with Hadoop provides centralized control, visibility and management of Hadoop jobs alongside other enterprise applications. This seamless connection improves efficiency, reduces complexity and drives faster responses to business needs. It enables organizations to schedule, monitor and manage complex workloads across diverse platforms, including Hadoop, enterprise applications and the cloud, from a single management interface.