Member since
02-26-2016
28
Posts
6
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1561 | 07-21-2016 12:21 PM |
03-27-2018
07:35 AM
When you know your application ID, try accessing your YARN logs.

Usage: yarn logs -applicationId <application ID> [options]

COMMAND_OPTIONS | Description |
---|---|
-applicationId <application ID> | Specifies an application id |
-appOwner <AppOwner> | AppOwner (assumed to be current user if not specified) |
-containerId <ContainerId> | ContainerId (must be specified if node address is specified) |
-help | Help |
-nodeAddress <NodeAddress> | NodeAddress in the format nodename:port (must be specified if container id is specified) |

https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#logs
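As a rough sketch of the usage above, the snippet below fetches the logs of one application and saves them to a file. The application ID, the owner name and the output path are made-up placeholders, and the call is only attempted when a yarn client is on the PATH.

```shell
#!/bin/sh
# Hypothetical values: replace them with your own application ID and owner.
app_id="application_1500000000000_0001"
app_owner="etl_user"
log_file="/tmp/${app_id}.log"

if command -v yarn >/dev/null 2>&1; then
  # yarn logs writes to stdout, so redirect it to keep the output searchable.
  yarn logs -applicationId "$app_id" -appOwner "$app_owner" > "$log_file"
else
  # No yarn client on this machine: only show what would be run.
  echo "would run: yarn logs -applicationId $app_id -appOwner $app_owner > $log_file"
fi
```

Leaving out -appOwner falls back to the current user, as the option table says.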
07-18-2017
09:57 AM
If you also want your sqoop command included, you can extend the command like this (with set -x and set +x):
{
  date
  # set -x echoes each command before it runs, so the command itself ends up in the log.
  set -x
  beeline -u "${hive2jdbcZooKeeperUrl}" -f "file.hql"
  set +x
  date
} 2>&1 | tee /tmp/sqoop.log
05-16-2017
07:56 AM
It's not only an issue with split but also with, e.g., concat. Escaping a semicolon in Hue works fine, but beeline fails with the same error as above. Replacing the semicolon with \073 works here as well.
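A minimal sketch of that workaround, assuming a throwaway query file under /tmp and a made-up concat call: \073 is the octal code for ';', so no bare semicolon appears inside the string literal. The beeline call only runs when beeline is installed and a JDBC URL is set in hive2jdbcZooKeeperUrl (a variable name chosen for illustration).

```shell
#!/bin/sh
# Hypothetical file name for the test query.
hql_file="/tmp/semicolon_test.hql"

# The quoted heredoc delimiter ('EOF') stops the shell from touching the backslash,
# so the file contains the literal characters \073 instead of a semicolon.
cat > "$hql_file" <<'EOF'
SELECT concat('a', '\073', 'b');
EOF

if command -v beeline >/dev/null 2>&1 && [ -n "${hive2jdbcZooKeeperUrl:-}" ]; then
  beeline -u "${hive2jdbcZooKeeperUrl}" -f "$hql_file"
else
  # No beeline or no URL available: just show the query that would be sent.
  cat "$hql_file"
fi
```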
01-13-2017
08:38 AM
If you want to use crontab, then you have already decided to use a time trigger/interval, right? You should really use a coordinator.
If you really want to stick with crontab, then the command is more or less correct. You have a typo (--oozie should be -oozie; the port is normally 11000, but I guess you already confirmed the port?), and normally you refer to a job.properties file (stored locally, not on HDFS) with -config.
So it should look like:
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config /path/to/job.properties -run
In the job.properties file you would have parameters listed like namenode, jobtracker, hcmetastoreuri and, of course, the one you provide via -D: oozie.wf.application.path. Normally the hdfs://namenode part can be omitted from the application path URL.
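To make that concrete, here is a small sketch that writes a local job.properties and prints a crontab line for the corrected command. Everything except the Oozie URL and oozie.wf.application.path is an assumption for illustration: the file location, the property names nameNode and jobTracker, the ports 8020 and 8050, the workflow path and the hourly schedule.

```shell
#!/bin/sh
# Hypothetical local path for the properties file (local disk, not HDFS).
props="/tmp/job.properties"

# Quoted heredoc: ${nameNode} must reach Oozie unexpanded, so the shell may not touch it.
cat > "$props" <<'EOF'
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
oozie.wf.application.path=${nameNode}/user/someuser/apps/my-workflow
EOF

# Crontab entry that submits the workflow at the start of every hour (assumed schedule).
cron_line="0 * * * * oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config $props -run"
echo "$cron_line"
```

The printed line can be pasted into crontab -e. A coordinator remains the better fit for time-based triggering.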
12-14-2016
11:58 AM
We are currently using a MySQL metastore in our test environment, so it is possible (we run HDP 2.5). The only catch is that Oozie requires the sqoop-site.xml file to be placed somewhere in HDFS to access the metastore. We don't really like the idea that passwords are simply stored like that.
11-03-2016
01:10 PM
The answer is outdated. It is possible to use a character attribute as the split-by attribute. You only need to add -Dorg.apache.sqoop.splitter.allow_text_splitter=true directly after 'sqoop job', like this:

sqoop job -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
--create ${JOB_NAME} \
-- \
import \
--connect "${JDBC}" \
--username ${SOURCE_USR} \
--password-file ${PWD_FILE_PATH} \
...

There is no guarantee, though, that sqoop splits your records evenly over your mappers.
10-27-2016
04:52 PM
worked! many thanks, you saved my day.
10-27-2016
04:02 PM
@Gayathri Reddy G or @Sindhu, can you tell me exactly how you added -Dorg.apache.sqoop.splitter.allow_text_splitter=true to your import statement? I'm trying to create a sqoop job with this import statement, but it keeps failing:

16/10/27 17:53:25 ERROR tool.BaseSqoopTool: Error parsing arguments for import:
16/10/27 17:53:25 ERROR tool.BaseSqoopTool: Unrecognized argument: -Dorg.apache.sqoop.splitter.allow_text_splitter=true

My shell script to create the sqoop job looks like this:

#!/bin/bash
sqoop job \
--create <tablename> \
-- \
import "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" \
--connect "jdbc:<URL>"

If I remove the -Dorg line, all goes fine, but of course the execution of the job fails; that's why I need to pass allow_text_splitter as a parameter. I tried with and without double quotes, but no luck unfortunately.
09-27-2016
02:59 PM
Hi njayakumar, we have sqoop working with the MySQL metastore, but Oozie gives errors that it can't find the driver to connect to the sqoop metastore:

Caused by: java.sql.SQLException: No suitable driver found for 'jdbc:mysql://<server>.com/sqoop'

The mysql-connector-java.jar is available in the java folder you mentioned. It is also available in the /usr/hdp/current/oozie folders libserver, libtools and lib (as a symbolic link), and for sqoop in the /usr/hdp/current/scoop/lib folder.

Any thoughts on what we are missing here? I already got the same error when trying to use the sqoop metastore 'service'. Oozie wasn't able to find that driver either...
07-21-2016
12:21 PM
1 Kudo
@Artem Ervits, we use a shell script to invoke an Oozie workflow. Our script polls certain folders, and if there are files, they are passed to the newly invoked workflow. The shell script looks something like this:

#!/bin/bash -e
# List the files under $pollfolder that match the naming pattern,
# skipping anything in the automated and quarantine folders.
for file in $(hdfs dfs -ls -R "$pollfolder" | grep "^-" | grep -Po "($pollfolder/[a-zA-Z]{2}_.*/[a-zA-Z]{2}_.*-[0-9]{1,}-.*\.csv\.gz)" | grep -vE '('$automatedfolder'|'$quarantinefolder')')
do
  # Start one workflow run per file and pass the file path as a property.
  oozie job -oozie "$ooziebaseurl" -config "$jobproperties" -run \
    -D file="$file"
done

This shell script can then be a shell action in a separate workflow that is triggered by a coordinator, or it can simply be scheduled with cron.
*I removed the creation of the variables that also happens in this script to save some space.