Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

What exactly does Sqoop validate with the --validate option?

1 ACCEPTED SOLUTION


The --validate option works only on tables in HDFS, not on those in Hive or HBase. It works with both the "import" and "export" Sqoop commands. The default validation class is org.apache.sqoop.validation.RowCountValidator, which compares the number of rows in the source and destination tables. You can customize this by providing your own validation class, which must implement the org.apache.sqoop.validation.Validator interface. See also the response to the same question asked before, and the Sqoop documentation, chapter 11.
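To make the row-count comparison concrete, here is a minimal, self-contained sketch of the idea in Java. It is not Sqoop's code and does not use Sqoop's classes: the class name RowCountCheckSketch, the Result type, and the compare method are all hypothetical names chosen for illustration. It only shows the kind of check a row-count validator performs, assuming the two counts have already been obtained.

```java
// Hypothetical illustration only: none of these names are Sqoop classes.
// It mimics the idea of a row-count check: the transfer "passes" when the
// number of rows read from the source equals the number written to the target.
public class RowCountCheckSketch {

    /** Outcome of comparing the two row counts. */
    public static final class Result {
        public final long sourceRows;
        public final long targetRows;
        public final boolean passed;

        Result(long sourceRows, long targetRows, boolean passed) {
            this.sourceRows = sourceRows;
            this.targetRows = targetRows;
            this.passed = passed;
        }

        @Override
        public String toString() {
            return "source=" + sourceRows + ", target=" + targetRows + ", passed=" + passed;
        }
    }

    /** Passes only when both counts are identical; rejects negative counts. */
    public static Result compare(long sourceRows, long targetRows) {
        if (sourceRows < 0 || targetRows < 0) {
            throw new IllegalArgumentException("row counts must be non-negative");
        }
        return new Result(sourceRows, targetRows, sourceRows == targetRows);
    }

    public static void main(String[] args) {
        // Made-up counts, used only to exercise the comparison.
        System.out.println(compare(1000, 1000));
        System.out.println(compare(1000, 998));
    }
}
```

Note that a check like this compares counts only, so matching counts say nothing about whether the column values themselves were copied correctly. A real custom validator would instead implement org.apache.sqoop.validation.Validator and be compiled against the Sqoop jar.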


REPLIES


@Rajendra Kalepu

The --validate option validates the data being imported into an HDFS/Hive table against the input table by comparing the table row count with the number of rows copied. Refer to the link for details.
