Issue when using Parquet: org.kitesdk.data.DatasetNotFoundException: Descriptor location does not exist
Labels: Apache Hadoop, Apache Sqoop
Created ‎06-08-2016 01:11 PM
I am getting this issue when using Sqoop with Parquet.
Created ‎06-08-2016 01:49 PM
Can you please share the complete error along with the Sqoop command being used?
The issue might be that HIVE_HOME/HCAT_HOME is not set, as Sqoop uses HIVE_HOME/HCAT_HOME to find the Hive libraries, which are needed for a Hive import as a Parquet file.
Thanks and Regards,
Sindhu
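As a sketch of the suggestion above — the paths below assume a typical HDP client layout and may differ on your cluster, and the connection details are placeholders:

```shell
# Assumed HDP client paths -- verify against your own install.
export HIVE_HOME=/usr/hdp/current/hive-client
export HCAT_HOME=/usr/hdp/current/hive-webhcat

# Re-run the failing import with the variables set (placeholders throughout);
# guarded so the snippet is safe to source on a machine without Sqoop.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect jdbc:oracle:thin:@HOST:PORT/SID \
    --username USER --password PASS \
    --table SCHEMA.TABLE \
    --hive-import --as-parquetfile -m 1
fi
```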
Created ‎06-09-2016 10:34 AM
```
sqoop import --connect jdbc:oracle:thin:@XXX:XXXX/YYYY --username YYYYY --password YYYYY \
  --table A.BBBB --hive-import --hive-database default --hive-table test15 \
  --as-parquetfile -m 1
```

```
Job job_1465371735536_0055 failed with state FAILED due to: Job commit failed:
java.lang.IllegalArgumentException: Wrong FS: file:/tmp/default/.temp/job_1465371735536_0055/mr/job_1465371735536_0055/b402d4ba-1a16-46bc-92c6-91fe141070d2.parquet, expected: hdfs://lxapp5524.dc.corp.telstra.com:8020
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
```
Created ‎06-09-2016 11:04 AM
We only get the above error when using Parquet; otherwise the table gets pulled into Hive without issue. Please keep in mind that we installed HDP without internet access, so it is quite possible that we missed something.
Created ‎06-10-2016 12:41 AM
We are facing the above error while using the query below:

```
sqoop import --connect jdbc:oracle:thin:@XX:1521/DATABASENAME --username USER --password PWD \
  --table SCHEMANAME.TABLENAME --hive-import --hive-table TABLENAME --hive-overwrite \
  --num-mappers 1 --as-parquetfile
```

It is an issue only when using Parquet to ingest data into Hive, because if we do the ingestion into HDFS with Parquet, it completes successfully.
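For reference, the HDFS-only variant that does complete looks roughly like this — a sketch reusing the same placeholders (`USER`, `PWD`, etc.), with `--target-dir` set to an assumed location:

```shell
# Same import, but landing the Parquet files directly in HDFS instead of
# going through the Hive import path. TARGET_DIR is an assumed location.
TARGET_DIR=/user/hdfs/TABLENAME_parquet

# Guarded so the snippet is safe to source on a machine without Sqoop.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect jdbc:oracle:thin:@XX:1521/DATABASENAME \
    --username USER --password PWD \
    --table SCHEMANAME.TABLENAME \
    --as-parquetfile --num-mappers 1 \
    --target-dir "$TARGET_DIR"
fi
```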
Created ‎06-10-2016 03:57 AM
I am also getting the same error on HDP 2.4 while doing a Sqoop hive-import with Parquet. Without Parquet it works fine.

```
16/06/09 21:12:11 INFO mapreduce.Job: Job job_1465467652802_0011 failed with state FAILED due to: Job commit failed:
java.lang.IllegalArgumentException: Wrong FS: file:/tmp/default/.temp/job_1465467652802_0011/mr/job_1465467652802_0011/dc944213-b925-4e5b-ac2c-736e5fa8610f.parquet, expected: hdfs://lxapp5524.dc.corp.hdp.com:8020
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
    at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:636)
    at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:327)
    at org.kitesdk.data.spi.filesystem.FileSystemDataset.merge(FileSystemDataset.java:56)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$MergeOutputCommitter.commitJob(DatasetKeyOutputFormat.java:370)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:285)
    at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
```
Created ‎06-11-2016 05:37 PM
Then I tried changing the dependency version for kite-sdk from 1.0.0 to 1.1.0, and the issue was gone. It worked! Issue resolved.
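On a pre-built installation, the equivalent of this fix is swapping the Kite jars that ship with Sqoop. A minimal sketch, assuming an HDP-style Sqoop lib directory and that the 1.1.0 jars have already been downloaded (e.g. from Maven Central) into the current directory — both the path and the jar set are assumptions:

```shell
KITE_VERSION=1.1.0
SQOOP_LIB=/usr/hdp/current/sqoop-client/lib   # assumed location; adjust for your install

# Replace the bundled Kite 1.0.0 jars with the 1.1.0 ones; guarded so the
# snippet is a no-op on machines without a Sqoop install.
if [ -d "$SQOOP_LIB" ]; then
  rm -f "$SQOOP_LIB"/kite-data-*-1.0.0.jar
  cp kite-data-core-"$KITE_VERSION".jar \
     kite-data-hive-"$KITE_VERSION".jar \
     kite-data-mapreduce-"$KITE_VERSION".jar \
     "$SQOOP_LIB"/
fi
```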
Created ‎04-12-2018 07:07 AM
We were able to resolve it here:
https://discuss.cloudxlab.com/t/sqoop-import-to-hive-as-parquet-file-is-failing/1089/6
