Member since: 09-24-2015
Posts: 48
Kudos Received: 31
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
  | 498 | 03-03-2017 06:37 AM
  | 16112 | 09-06-2016 03:57 AM
  | 1608 | 09-02-2016 01:43 PM
  | 1177 | 09-02-2016 06:33 AM
06-06-2017
06:26 PM
@JT Ng >> But the error now is "JA008: File does not exist hdfs://localhost/test-oozie/examples/apps/sqoop/tez.xml#tez.xml" Did you mean to specify tez-site.xml?
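If so, a minimal sketch of what the corrected reference in the workflow could look like, assuming the file actually uploaded to the application directory is named tez-site.xml (the path below just mirrors the one in the error message):
    <file>hdfs://localhost/test-oozie/examples/apps/sqoop/tez-site.xml#tez-site.xml</file>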
03-24-2017
07:44 PM
2 Kudos
@ksuresh Thanks for catching the issue in the doc. You can either remove the conflicting avro files (Sqoop in HDP 2.5 and 2.6 ships with avro 1.8.0 jar files) or add the following to the sqoop command line and run:
sqoop import -Dmapreduce.job.user.classpath.first=true <rest of the arguments>
@Beverley Yes, it would be good to mention this in the docs.
03-10-2017
11:15 PM
BTW, this has to come before the distcp arguments (like -update, etc.).
03-10-2017
10:47 PM
You can set up your args like this and remove the -D option from java-opts:
<arg>-D</arg>
<arg>ipc.client.fallback-to-simple-auth-allowed=true</arg>
<arg>hdfs://aervits-hdp70:8020/tmp/hellounsecure</arg>
<arg>hdfs://hacluster:8020/user/centos/</arg>
Thanks
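Putting the thread together, a rough sketch of what the full distcp action could look like (the action name, schema version, and the ${jobTracker}/${nameNode} properties are placeholders; the paths are the ones quoted above):
    <action name="copy-data">
        <distcp xmlns="uri:oozie:distcp-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- -D and its value as separate args, before any distcp options like -update -->
            <arg>-D</arg>
            <arg>ipc.client.fallback-to-simple-auth-allowed=true</arg>
            <arg>hdfs://aervits-hdp70:8020/tmp/hellounsecure</arg>
            <arg>hdfs://hacluster:8020/user/centos/</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>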
03-10-2017
08:01 PM
Let me check. The Java opts might be getting added after the class
03-10-2017
05:29 PM
You should have the java-opts element right after the configuration element and before the arg elements.
03-10-2017
04:59 PM
1 Kudo
Can you add a space between -D and ipc.client...? This is not a Java system property; it should be passed as the -D option to the ToolRunner.
03-03-2017
02:46 PM
1 Kudo
Thanks @Artem Ervits for the series of articles. Oozie has a bunch of features that go untapped because it is not easily approachable. When we started off with Workflow Designer, we wanted to make an easy-to-use editor that lets users with some knowledge of Oozie and Hadoop create workflows and explore further. Also, one of the common complaints is the dashboard: the Oozie UI uses an old interface, and we wanted to provide a new dashboard experience even for workflows generated outside of Workflow Manager.
03-03-2017
06:37 AM
@Artem Ervits Unfortunately, such functionality is not currently available.
02-08-2017
07:15 PM
1 Kudo
MS JDBC driver 4.0 and later allow users to use the Java Kerberos option with a username and password. But this is different from Integrated Authentication, which is not supported: you cannot use the Hadoop cluster's Kerberos credentials to authenticate with SQL Server, because there is no Kerberos context in the containers.
02-08-2017
06:16 PM
> IntegratedSecurity=true; This is not supported. Your eval etc. will work because they are done from the local Java process, but there is no Kerberos credential context in the container when the mappers run to do the import work.
01-29-2017
12:08 AM
1 Kudo
This is a problem in the Hive move task (since fixed in HIVE-15355), which is called by Sqoop after the import into HDFS. So disabling move-task parallelism by adding the configuration parameter hive.mv.files.thread=0 is the right solution. That said, I would suggest using the --hcatalog-table option with the import, which 1. gives better data fidelity and 2. removes the intermediate step of landing on HDFS and then invoking the Hive client to do the import.
01-16-2017
06:36 PM
You should set something like queueName=sqoop in your job properties and refer to it in the workflow action configuration for the two parameters. Sorry if that was not clear.
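A rough sketch, assuming the two parameters in question are the MapReduce queue settings from this thread (names below are placeholders):
In job.properties:
    queueName=sqoop
In the workflow action's configuration:
    <property>
        <name>mapreduce.job.queuename</name>
        <value>${queueName}</value>
    </property>
    <property>
        <name>oozie.launcher.mapreduce.job.queuename</name>
        <value>${queueName}</value>
    </property>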
01-15-2017
03:23 PM
2 Kudos
When submitting jobs via Oozie, there is a launcher job and the launched job (in this case, the MR job launched by Sqoop). You are probably seeing the launcher job getting submitted to the default queue. To have the launcher job also go to the sqoop queue, you need to add the following config property to the workflow.xml or the job properties: oozie.launcher.mapreduce.job.queuename=sqoop. In general, if you want to pass any config to the launcher job, you need to prefix the config name with oozie.launcher. For example, if you are running a hive action and need to configure larger map memory for the Hive client, you would set the corresponding MapReduce memory property with the oozie.launcher. prefix, as in the sketch below.
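A minimal sketch, assuming the property you want to raise for the launcher is the standard MapReduce map memory setting and 4096 MB is just an illustrative value, added to the hive action's configuration (or job-xml):
    <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>4096</value>
    </property>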
01-10-2017
06:08 PM
3 Kudos
You need to download and register the Berkeley DB (bdb) jar with Ambari Server before starting the Falcon server. This is required from HDP 2.5 onwards. The following steps would help:
wget -O je-5.0.73.jar http://search.maven.org/remotecontent?filepath=com/sleepycat/je/5.0.73/je-5.0.73.jar
cp je-5.0.73.jar /usr/share/
chmod 644 /usr/share/je-5.0.73.jar
ambari-server setup --jdbc-db=bdb --jdbc-driver=/usr/share/je-5.0.73.jar
ambari-server restart
Then restart the Falcon service.
01-06-2017
07:17 AM
@Ed Berezitsky >> Small correction: if you use hcatalog, but your table is still textfile format with "|" field delimiter, you'll still have the same issue. The output file field delimiters are only needed for HDFS imports. In the case of HCatalog imports, you specify the text file format properties as part of the storage stanza, and otherwise the Hive defaults are used. Essentially, the default storage format should be OK to handle this. BTW, HCatalog import works with most storage formats, not just ORC. @Krishna Srinivas You should be able to query a Hive table using Spark SQL also, but maybe you have other requirements. Glad to see that @Ed Berezitsky's solution worked for you.
01-05-2017
06:12 PM
If HDFS is just an intermediate destination before loading into Hive, you can skip that step and load directly into Hive using the --hcatalog-table option in Sqoop, which provides better data fidelity and removes one step (and supports all Hive data types as well). Please see https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_sqoop_hcatalog_integration
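For illustration, a rough sketch of such an import (the connection string, database, table, and password-file path are placeholders):
    # connection details, database, and table names below are placeholders
    sqoop import \
      --connect jdbc:mysql://dbhost/salesdb \
      --username sqoop_user --password-file /user/sqoop/db.password \
      --table orders \
      --hcatalog-database default \
      --hcatalog-table orders \
      --create-hcatalog-table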
09-08-2016
06:14 PM
If you are using multiple clusters, you need to make sure that the Hadoop configuration Oozie uses for the target cluster (see the oozie.service.HadoopAccessorService.hadoop.configurations property in oozie-site.xml) is correctly configured. In a single-cluster environment, Oozie points to the local core-site.xml for this by default.
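A rough sketch of that property in oozie-site.xml, mapping a cluster authority to a Hadoop config directory with * as the fallback (the remote authority and paths below are made up):
    <property>
        <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
        <value>*=/etc/hadoop/conf,remote-nn:8020=/etc/hadoop/conf-remote</value>
    </property>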
09-07-2016
03:03 AM
Falcon uses the Hadoop FileSystem abstraction to do the replication, be it S3 or WASB. It essentially uses distcp, so whatever requirements distcp has for accessing a filesystem apply to Falcon replication as well.
09-06-2016
08:49 PM
1 Kudo
If your cluster endpoint is pointing to HDFS, then the feed locations will be based on that unless they are absolute paths. Can you provide an example of what you are trying to do and the exceptions that you are getting?
Thanks
09-06-2016
05:14 AM
without compression: [numFiles=8, numRows=6547431, totalSize=66551787, rawDataSize=3154024078]
with zlib: [numFiles=8, numRows=6547431, totalSize=44046849, rawDataSize=3154024078]
As you can see, the totalSize is less with zlib.
09-06-2016
03:57 AM
-rw-r--r-- 1 root root 2123 Sep 2 14:31 ojdbc6.jar
The ojdbc jar is much more than 2123 bytes. Most likely your download of the ojdbc jar did not complete correctly.
09-05-2016
06:22 PM
Make sure the permissions on the ojdbc jar are set so that the user you are running as has access to it.
09-05-2016
04:26 PM
1 Kudo
Make sure you add the Oracle JDBC driver to the sqoop sharelib in HDFS (find the current sharelib location using 'oozie admin -shareliblist sqoop'). For older installations (pre-HDP 2.2) you can add the file to the /user/oozie/share/lib/sqoop directory. Make sure you either call 'oozie admin -sharelibupdate' or restart Oozie after that.
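Roughly, the steps could look like this (the sharelib directory below is just an example; use whatever 'oozie admin -shareliblist sqoop' reports for your installation, and assume OOZIE_URL is set):
    # find the current sqoop sharelib directory
    oozie admin -shareliblist sqoop
    # copy the driver into it; the lib_ timestamp directory is an example
    hdfs dfs -put ojdbc6.jar /user/oozie/share/lib/lib_20170101000000/sqoop/
    # make Oozie pick up the new jar
    oozie admin -sharelibupdate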
09-04-2016
04:29 PM
You have a mismatch between the yum cache and the rpm db (most likely caused by interspersing yum and rpm commands). Do a 'yum clean all' on the host and reinstall.
09-04-2016
04:28 PM
1 Kudo
Starting with HDP 2.2, hive-site.xml need not be added to the workflow as in earlier versions of Oozie, because Oozie automatically adds the Hive configuration if the action-conf/hive/ directory is properly set up. Ambari already handles this by copying hive-site.xml to the action-conf/hive/ directory whenever Hive config changes are made. If you still want to supply your own hive-site.xml, reference it via job-xml instead of as a file, since Oozie creates hive-site as part of launching the Hive client.
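For illustration, a minimal sketch of a hive action using job-xml rather than a <file> element (the schema version, script name, and HDFS path are placeholders):
    <hive xmlns="uri:oozie:hive-action:0.5">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- your own hive-site.xml referenced via job-xml, not <file> -->
        <job-xml>${nameNode}/user/me/app/hive-site.xml</job-xml>
        <script>script.q</script>
    </hive>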
09-02-2016
01:47 PM
Integrated authentication does not work with SQL Server even in a secure cluster with AD integration, because the containers will not have the Kerberos context (even in secure Windows clusters, the impersonation level does not support this). You can use a password alias (which makes use of the Hadoop credential provider support) to use SQL Server authentication without exposing the passwords.
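A minimal sketch of what that could look like (the alias name, provider path, connection string, and table are made up):
    # store the SQL Server password in a Hadoop credential provider
    hadoop credential create sqlserver.password \
        -provider jceks://hdfs/user/sqoop/sqlserver.jceks
    # reference the alias from sqoop instead of a plain-text password
    sqoop import \
        -Dhadoop.security.credential.provider.path=jceks://hdfs/user/sqoop/sqlserver.jceks \
        --connect "jdbc:sqlserver://dbhost:1433;databaseName=sales" \
        --username sqoop_user \
        --password-alias sqlserver.password \
        --table orders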
09-02-2016
01:43 PM
You can add additional options to the --hcatalog-storage-stanza option. The storage stanza is just what gets appended to the create table statement, so you can add any syntactically valid options (like tblproperties).
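For example (connection and table details are placeholders), an ORC table with zlib compression could be requested like this:
    # connection and table details below are placeholders
    sqoop import \
        --connect jdbc:mysql://dbhost/salesdb --username sqoop_user -P \
        --table orders \
        --hcatalog-table orders --create-hcatalog-table \
        --hcatalog-storage-stanza "stored as orc tblproperties ('orc.compress'='ZLIB')"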