Member since: 07-07-2017
Posts: 12
Kudos Received: 0
Solutions: 0
09-15-2017
05:59 AM
Were you able to find a resolution? I get the same error when I try to delete.
09-14-2017
09:48 AM
Hi guys, can someone please explain how the join at the bottom of the query below works? It's from this tutorial. I don't get the bit after "concat". Thanks.
ON o.swid = concat('{', u.swid ,'}');
CREATE TABLE webloganalytics as
SELECT to_date(o.ts) logdate, o.url, o.ip, o.city, upper(o.state) state,
o.country, p.category, CAST(datediff(from_unixtime(unix_timestamp()), from_unixtime(unix_timestamp(u.birth_dt,'dd-MMM-yy')))/365 AS INT) age, u.gender_cd
FROM omniture o
INNER JOIN products p
ON o.url = p.url
LEFT OUTER JOIN users u
ON o.swid = concat('{', u.swid ,'}');
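For what it's worth, the bit after concat only builds a string: concat('{', u.swid ,'}') wraps each users.swid value in curly braces before it is compared with o.swid. That suggests omniture stores the ID with braces and users stores it without, which is an assumption inferred from the query and not confirmed in this thread. The sketch below mimics the idea in Python with an in-memory SQLite database, where '{' || u.swid || '}' stands in for Hive's concat. The two-column tables and the sample values are made up for illustration.

```python
import sqlite3

# In-memory stand-ins for the Hive tables; columns and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE omniture (swid TEXT, url TEXT);
    CREATE TABLE users (swid TEXT, gender_cd TEXT);
    INSERT INTO omniture VALUES ('{A1}', '/shoes'), ('{B2}', '/hats');
    INSERT INTO users VALUES ('A1', 'F');
""")

# Plain equality: '{A1}' never equals 'A1', so no user row is matched.
plain = conn.execute(
    "SELECT o.url, u.gender_cd FROM omniture o "
    "LEFT OUTER JOIN users u ON o.swid = u.swid ORDER BY o.url"
).fetchall()

# Wrapping u.swid in braces first (SQLite's || plays the role of concat here).
wrapped = conn.execute(
    "SELECT o.url, u.gender_cd FROM omniture o "
    "LEFT OUTER JOIN users u ON o.swid = '{' || u.swid || '}' ORDER BY o.url"
).fetchall()

print(plain)
print(wrapped)
```

Because the join is a LEFT OUTER JOIN, every omniture row is kept either way; the brace-wrapping only decides whether the user columns are filled in or left NULL.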
07-20-2017
01:57 AM
Hello guys, I am having trouble with the query below. I have searched a lot on the internet, but nothing helped. Please help me understand where the issue is.
sqoop import \
--connect jdbc:mysql://10.0.0.24/hive_data \
--username root -P \
--query 'select d.driverid, d.name, SUM(t.hours_logged) as hours from driver d JOIN timesheet t on (d.driverid = t.driverid) group by d.driverid,d.name Where $CONDITIONS' --split-by d.driverid \
--target-dir /home/hadoop/sqoop \
--driver com.mysql.jdbc.Driver
The error I get is below:
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/07/20 01:48:14 INFO manager.SqlManager: Executing SQL statement: select d.driverid, d.name, SUM(t.hours_logged) as hours from driver d JOIN timesheet t on (d.driverid = t.driverid) group by d.driverid,d.name Where (1 = 0)
17/07/20 01:48:14 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Where (1 = 0)' at line 1
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Where (1 = 0)' at line 1
Please help.
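The log shows Sqoop replacing $CONDITIONS with (1 = 0), which leaves the WHERE clause after GROUP BY, and SQL does not allow that order. A likely fix, offered as a sketch and not as a confirmed resolution, is to move WHERE $CONDITIONS ahead of group by. The Python snippet below reproduces the clause-order problem with an in-memory SQLite database standing in for MySQL. The sample rows are invented, and the range condition in the last call is a hypothetical example of what a per-split substitution might look like.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE driver (driverid INTEGER, name TEXT);
    CREATE TABLE timesheet (driverid INTEGER, hours_logged INTEGER);
    INSERT INTO driver VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO timesheet VALUES (1, 40), (1, 35), (2, 50);
""")

SELECT_PART = (
    "select d.driverid, d.name, SUM(t.hours_logged) as hours "
    "from driver d JOIN timesheet t on (d.driverid = t.driverid)"
)
GROUP_PART = "group by d.driverid, d.name"


def run(template, conditions):
    """Substitute the $CONDITIONS placeholder and run the query."""
    sql = template.replace("$CONDITIONS", conditions)
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"syntax error: {exc}"


# Clause order as posted: WHERE lands after GROUP BY.
as_posted = f"{SELECT_PART} {GROUP_PART} Where $CONDITIONS"
# Reordered: WHERE comes before GROUP BY.
reordered = f"{SELECT_PART} WHERE $CONDITIONS {GROUP_PART}"

posted_result = run(as_posted, "(1 = 0)")   # the substitution seen in the log
probe_result = run(reordered, "(1 = 0)")    # same substitution, valid order
full_result = run(reordered, "d.driverid >= 1 AND d.driverid < 2")  # hypothetical split range

print(posted_result)
print(probe_result)
print(full_result)
```

Applied to the command above, the --query string would become 'select d.driverid, d.name, SUM(t.hours_logged) as hours from driver d JOIN timesheet t on (d.driverid = t.driverid) WHERE $CONDITIONS group by d.driverid,d.name', with the rest of the options unchanged.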
Labels: Apache Sqoop
07-12-2017
09:02 AM
@Abhishek Kumar will do. Regarding the joins and the SQL, should I expect them to be super hard or just average difficulty? Cheers.
07-12-2017
07:10 AM
Thank you @Abhishek Kumar. Are you saying there will be no Ambari for Pig and Hive available at all in the exam? That would mean getting used to the terminal a lot.
07-12-2017
04:25 AM
Hello guys, I am preparing for the HDPCD certification exam and have a few questions. The exam will be on HDP 2.4, whereas the version available to download from the Hortonworks website is HDP 2.6. Is there much difference between the two versions? The same goes for the other tools (Hive, Pig, Sqoop): they are all older versions on the exam. Can I learn on the newer versions in HDP 2.6 and still be able to do the exam? Will I have access to Ambari Hive and Ambari Pig in the exam to run my scripts, or do I have to rely on the terminal? I have done all the tutorials on the Hortonworks website for Pig and Hive. Is there any additional work recommended? Thanks.
07-11-2017
04:22 AM
Thanks @Lester Martin. I removed -useHCatalog and re-added it, and now it seems to display the results of the script. It was very hard to learn without knowing the output of the script at each step.
07-10-2017
07:16 AM
@Lester Martin I am running the script from Ambari and I have the -useHCatalog argument added. When I run the script below:
a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = FILTER a BY event != 'normal';
dump b;
instead of getting the output of the script, I get what you can see in the attached txt file. I want to know whether this is normal and how I can see what the script is doing. Thanks.
results.txt
07-07-2017
10:27 PM
@slachterman - I saw that you answered a similar question before. Can you please help me out?
07-07-2017
06:28 AM
Hello Guys,
I am trying to follow tutorial-100 for Apache Pig. When I run the script, I do not see its output in the results tab, and it is very hard to understand what the script is doing.
In the results I get the following:
Apache Pig version 0.16.0.2.6.0.3-8 (rexported)
compiled Apr 01 2017, 21:50:35
USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
-4, -log4jconf - Log4j configuration file, overrides log conf
-b, -brief - Brief logging (no timestamps)
-c, -check - Syntax check
-d, -debug - Debug level, INFO is default
-e, -execute - Commands to execute (within quotes)
-f, -file - Path to the script to execute
-g, -embedded - ScriptEngine classname or keyword for the ScriptEngine
-h, -help - Display this message. You can specify topic to get help for that topic.
properties is the only topic currently supported: -h properties.
-i, -version - Display version information
-l, -logfile - Path to client side log file; default is current working directory.
-m, -param_file - Path to the parameter file
-p, -param - Key value pair of the form param=val
-r, -dryrun - Produces script with substituted parameters. Script is not executed.
-t, -optimizer_off - Turn optimizations off. The following values are supported:
ConstantCalculator - Calculate constants at compile time
SplitFilter - Split filter conditions
PushUpFilter - Filter as early as possible
MergeFilter - Merge filter conditions
PushDownForeachFlatten - Join or explode as late as possible
LimitOptimizer - Limit as early as possible
ColumnMapKeyPrune - Remove unused data
AddForEach - Add ForEach to remove unneeded columns
MergeForEach - Merge adjacent ForEach
GroupByConstParallelSetter - Force parallel 1 for "group all" statement
PartitionFilterOptimizer - Pushdown partition filter conditions to loader implementing LoadMetaData
PredicatePushdownOptimizer - Pushdown filter predicates to loader implementing LoadPredicatePushDown
All - Disable all optimizations
All optimizations listed here are enabled by default. Optimization values are case insensitive.
-v, -verbose - Print all error messages to screen
-w, -warning - Turn warning logging on; also turns warning aggregation off
-x, -exectype - Set execution mode: local|mapreduce|tez, default is mapreduce.
-F, -stop_on_failure - Aborts execution on the first failed job; default is off
-M, -no_multiquery - Turn multiquery optimization off; default is on
-N, -no_fetch - Turn fetch optimization off; default is on
-P, -propertyFile - Path to property file
-printCmdDebug - Overrides anything else and prints the actual command used to run Pig, including
any environment variables that are set by the pig command.
And under the log, I see this:
WARNING: Use "yarn jar" to launch YARN applications.
17/07/07 06:16:36 INFO pig.Main: Pig script completed in 196 milliseconds (196 ms)
The script I am running is below. Please advise whether the output in the results is normal. If it is, how can I see the output of the script at each step? Thanks.
a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = FILTER a BY event != 'normal';
c = FOREACH b GENERATE driverid, event, (int)1 as occurance;
d = GROUP c BY driverid;
e = FOREACH d GENERATE group as driverid, sum(c.occurance) as t_occ;
g = LOAD 'driver_mileage' USING org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid,g by driverid;
dump h;