Member since: 10-07-2015
Posts: 107
Kudos Received: 73
Solutions: 23

My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 2519 | 02-23-2017 04:57 PM |
| 1971 | 12-08-2016 09:55 AM |
| 8853 | 11-24-2016 07:24 PM |
| 3957 | 11-24-2016 02:17 PM |
| 9314 | 11-24-2016 09:50 AM |
05-10-2016
08:58 AM
Try: `select _c0 from xml_table`
05-09-2016
04:36 PM
Well, this is now bash programming. The part between the lines `cat << EOF` and `EOF` is a so-called "here document" that writes the actual Pig script.

Everything starting with `$` is a variable: `$0`, `$1`, ... are predefined in bash, with `$0` containing the script/program name, `$1` the first actual parameter, `$2` the second, and so on; `$@` expands to all provided parameters, joined with a space ' ' by default. Note: variables (e.g. FUN, TAB) are set without `$` and referenced with `$`.

So you can add any logic before `cat << EOF` to set variables based on your input parameters, and reference them inside the here document to produce the Pig script you want. For more, see e.g. http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html and http://www.tldp.org/LDP/abs/html/index.html (among many other resources).
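The mechanics above can be sketched in a few lines of bash (the parameter values `MAX` and `sw` are illustrative, mirroring the Pig example; `set --` just simulates how the script would receive them):

```shell
#!/bin/bash
# Demonstrate bash positional parameters and a "here document".
set -- MAX sw              # simulate calling the script as: ./test.sh MAX sw
FUN=$1                     # set a variable: no $ on the left-hand side
TAB=$2                     # reference it later with $FUN / $TAB
SCRIPT=$(cat << EOF
result = $FUN(iris.$TAB);
all parameters: $@
EOF
)
echo "$SCRIPT"
```

Because the `EOF` delimiter is unquoted, `$FUN`, `$TAB`, and `$@` are expanded inside the here document, so the printed text is `result = MAX(iris.sw);` followed by `all parameters: MAX sw`.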
05-09-2016
02:02 PM
1 Kudo
Have you considered wrapping it into a shell script? Here is a simple example (test.sh):

```shell
#!/bin/bash
TMPFILE="/tmp/script.pig"
FUN=$1   # pass the pig function as first parameter
TAB=$2   # pass the column as second parameter
cat <<EOF > "$TMPFILE"
iris = load '/tmp/iris.data' using PigStorage(',')
    as (sl:double, sw:double, pl:double, pw:double, species:chararray);
by_species = group iris by species;
result = foreach by_species generate group as species, $FUN(iris.$TAB);
dump result;
EOF
pig -x tez "$TMPFILE"
```

You can call it e.g. as `bash ./test.sh MAX sw` to get the maximum of column "sw", or `bash ./test.sh AVG sl` to get the average of column "sl".
04-22-2016
03:27 PM
1 Kudo
As I understand https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions, Hive ACID (which is necessary for DELETE commands) only works on bucketed ORC tables. I would expect that even INSERT wouldn't work when you use 'transactional'='true' without being compliant with the mentioned prerequisites. If you want to have SQL on HBase, I would go for Apache Phoenix (https://phoenix.apache.org/).
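A table definition that meets those prerequisites would look roughly like this (table and column names are illustrative; the bucket count is arbitrary):

```sql
-- Hive ACID requires a bucketed, ORC-backed, transactional table
CREATE TABLE acid_demo (id INT, name STRING)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
```

Note that ACID operations also depend on the transaction manager being enabled on the server side (e.g. `hive.txn.manager` set to `org.apache.hadoop.hive.ql.lockmgr.DbTxnManager` and `hive.support.concurrency=true`), as described on the wiki page linked above.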
04-22-2016
03:08 PM
Set the environment variable HIVE_SERVER to your Hive server host name, or use: `beeline -u jdbc:hive2://this.is.my.hiveserver.name:10000/default?hive.execution.engine=tez -n hive`
04-22-2016
03:02 PM
You need to compile it first; spark-submit expects a jar file. See http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
04-22-2016
02:32 PM
And if you just want to quickly check whether Hive works, you can use beeline on the command line: `beeline -u jdbc:hive2://$HIVE_SERVER:10000/default?hive.execution.engine=tez -n hive`
04-22-2016
02:28 PM
Have you tried restarting ambari-server (on the command line, as root: `ambari-server restart`)? That is the JVM in which, I think, the Hive view runs ...
04-22-2016
02:10 PM
See http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications ("Self-Contained Applications") for examples in all supported languages.
04-22-2016
01:22 PM
1 Kudo
Can you try to change that via the Ambari ZooKeeper config? Ambari overwrites any changes you make directly to the config files on the filesystem.