Sqoop:

  • Sqoop is an open-source tool that lets users extract data from structured data stores, such as relational databases, into Hadoop for further processing.
  • After the data is loaded into HDFS, it can be processed with MapReduce or a higher-level tool such as Hive.
  • Data can also be exported from HDFS back into relational databases for users to consume.
  • By default, Sqoop generates comma-delimited text files for imported data.
  • By default, Sqoop uses the primary key of the relational table to split the import. Each split is processed by its own mapper.
  • Sqoop runs 4 mappers by default to load data into HDFS.
  • Users can change the number of mappers with the option -m <no. of mappers> (long form: --num-mappers).
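
As a rough illustration of the points above, the Python sketch below assembles a `sqoop import` command line and shows how a primary-key range might be divided among mappers. The helper names (`build_import_command`, `split_ranges`), the JDBC URL, the table name and the target directory are all made up for this example, and the range arithmetic is a simplified model of the idea, not Sqoop's own splitter code.

```python
import shlex


def build_import_command(jdbc_url, table, target_dir, num_mappers=4, split_by=None):
    """Assemble a `sqoop import` argument list (hypothetical helper)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),  # long form of -m; 4 is Sqoop's default
    ]
    if split_by is not None:
        # Override the split column instead of relying on the primary key.
        args += ["--split-by", split_by]
    return args


def split_ranges(min_id, max_id, num_mappers=4):
    """Divide the inclusive key range [min_id, max_id] into contiguous
    chunks, one per mapper. Simplified model, not Sqoop's actual code."""
    total = max_id - min_id + 1
    if total <= 0:
        return []
    num = min(num_mappers, total)       # never more chunks than keys
    base, extra = divmod(total, num)    # spread the remainder over the first chunks
    ranges = []
    lo = min_id
    for i in range(num):
        size = base + (1 if i < extra else 0)
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


if __name__ == "__main__":
    # Assumed connection details, for illustration only.
    cmd = build_import_command("jdbc:mysql://dbhost/sales", "orders", "/user/demo/orders")
    print(shlex.join(cmd))
    for lo, hi in split_ranges(1, 1000):
        print(f"one mapper handles id {lo}..{hi}")
```

With evenly distributed keys, each mapper gets a similar amount of work. If the key values are heavily skewed, some mappers will receive far more rows than others, which is one reason to pick a different split column with `--split-by`.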
