Member since: 04-07-2017
Posts: 80
Kudos Received: 33
Solutions: 0
04-20-2016
11:17 PM
Thank you for providing an alternative approach.
I am learning Pig and would like to try the STREAM command to see how to run Python in Pig. Is this the line that should be added as the first line of the script so that the execution engine understands it is Python?
#! /usr/bin/env python
I tried this but still get the same error. Could you please help? Thank you!
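For anyone hitting the same "command not found" messages: they are printed by the shell, which suggests the script was run as a shell script rather than by Python. Besides the shebang line, two other things commonly matter: the file must be executable, and Windows (CRLF) line endings can break the shebang. The sketch below is a hypothetical helper (check_stream_script is an illustrative name, not part of Pig) that checks those three conditions on a script file.

```python
import os
import stat
import tempfile


def check_stream_script(path):
    """Return a list of likely reasons a script cannot run as an executable."""
    problems = []
    with open(path, "rb") as handle:
        data = handle.read()
    # The shebang only works when "#!" are the first two bytes of the file.
    first_line = data.split(b"\n", 1)[0]
    if not first_line.startswith(b"#!"):
        problems.append("shebang is not the very first line")
    # A trailing carriage return becomes part of the interpreter name.
    if b"\r\n" in data:
        problems.append("file has Windows (CRLF) line endings")
    # The owner needs the execute bit for the file to be launched directly.
    if not os.stat(path).st_mode & stat.S_IXUSR:
        problems.append("file is not executable (try chmod +x)")
    return problems


if __name__ == "__main__":
    # Demo on a throwaway file that reproduces the symptoms.
    fd, path = tempfile.mkstemp(suffix=".py")
    os.write(fd, b"import sys\r\nTHRESHOLD = 20\r\n")
    os.close(fd)
    os.chmod(path, 0o644)
    for problem in check_stream_script(path):
        print(problem)
    os.remove(path)
```

If all three checks pass and the error persists, another option is to name the interpreter explicitly in the STREAM command (for example `python /path/to/script.py`) so that the shebang is not needed at all.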
04-20-2016
02:26 AM
2 Kudos
Hi, I have a small voters list (name, gender, place, age) from which I want to eliminate the voters whose age is <= 20.
I wanted to try streaming in Pig. When I run DUMP on the streamed relation, it fails and is unable to identify the Python commands. I have attached the Python script, input data file, Pig script and log file.
Could you guide me on where I should install Python in the Sandbox? Thank you.

Input:
AAA,Female,Blr,40
BBB,Female,London,35
YYY,Female,Pondy,12
JJJ,Male,London,4
SSS,Female,Pondy,30

Pig script in tez_local mode:
grunt> Voters = LOAD 'file:///user/revathy/pig/Voters.txt' USING PigStorage(',') AS (VoterName:chararray,Gender:chararray,Place:chararray,Age:int);
grunt> Eligible = STREAM Voters THROUGH `/root/revathy/pig/hello.py` AS (VoterName:chararray,Gender:chararray,Place:chararray,Age:int);

Python script (tested in a Python editor):
import sys
THRESHOLD = 20

def filterVal(line,val4):
    if int(val4) > THRESHOLD:
        sys.stdout.writelines(line)
    return

try:
    for line in sys.stdin.readlines():
        val1,val2,val3,val4 = str(line).split(",")
        filterVal(line,val4)
except:
    print "Error in try block"

Log:
/root/revathy/pig/hello.py: line 1: import: command not found
/root/revathy/pig/hello.py: line 2: THRESHOLD: command not found
/root/revathy/pig/hello.py: line 3: : command not found
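
A possible shape for the streaming filter is sketched below. It is not a confirmed fix: it assumes the script starts with a shebang line and is executable, and it treats the last field of each record as the age. As far as I know, Pig's default streaming serializer passes fields separated by tabs rather than commas, so the delimiter is a parameter here (the function name eligible and its parameters are illustrative). The demo at the bottom uses the sample rows from this post with a comma delimiter.

```python
#!/usr/bin/env python
import sys

THRESHOLD = 20


def eligible(lines, threshold=THRESHOLD, delimiter="\t"):
    """Yield the records whose last field (age) is greater than threshold."""
    for line in lines:
        fields = line.rstrip("\r\n").split(delimiter)
        try:
            age = int(fields[-1])
        except ValueError:
            continue  # skip blank or malformed records instead of aborting
        if age > threshold:
            yield delimiter.join(fields) + "\n"


if __name__ == "__main__":
    # Demo with the sample rows from the post (comma-separated).
    sample = [
        "AAA,Female,Blr,40\n",
        "BBB,Female,London,35\n",
        "YYY,Female,Pondy,12\n",
        "JJJ,Male,London,4\n",
        "SSS,Female,Pondy,30\n",
    ]
    sys.stdout.writelines(eligible(sample, delimiter=","))
```

When used as the actual STREAM script, the demo block would be replaced by a single call that reads standard input, such as `sys.stdout.writelines(eligible(sys.stdin))`.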
Labels: Apache Pig
04-15-2016
02:24 AM
Hi Predrag,
I have found the reason for the warnings.
I had written channels for the sink. It should be: source_agent.sinks.avro_sink.channel = memoryChannel
I have corrected this, the warning is gone, and the sink file is now created in HDFS.
But how do I know that Flume is running successfully?
Thank you.
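
For reference, Flume sources take a plural `channels` property while sinks take a singular `channel`, which is the mix-up described above. The sketch below is a hypothetical checker (find_channel_key_mistakes is an illustrative name, not a Flume tool) that scans the text of an agent configuration for keys using the wrong form. It assumes the usual four-part key layout, agent.sources.name.property or agent.sinks.name.property.

```python
def find_channel_key_mistakes(config_text):
    """Return messages for source/sink keys that use the wrong channel property."""
    mistakes = []
    for raw in config_text.splitlines():
        line = raw.strip()
        # Ignore blank lines, comments and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        parts = key.split(".")
        if len(parts) != 4:
            continue
        _agent, kind, _name, prop = parts
        if kind == "sinks" and prop == "channels":
            mistakes.append(key + " should end in .channel")
        elif kind == "sources" and prop == "channel":
            mistakes.append(key + " should end in .channels")
    return mistakes


if __name__ == "__main__":
    # Illustrative fragment; only the sink line mirrors the one in this post.
    example = """
    source_agent.channels = memoryChannel
    source_agent.sinks.avro_sink.channels = memoryChannel
    """
    for message in find_channel_key_mistakes(example):
        print(message)
```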
04-15-2016
12:35 AM
Hi Predrag, thank you for your response.
I am preparing for certification and trying to execute a Flume agent. This is the command I use:
flume-ng agent --conf conf --conf-file /Revathy/Flume/source_agent.conf --name source_agent
I am trying to understand the execution messages from running the Flume agent. The conf file (exec-source.txt) and the execution log (log1.png, log2.png, log3.png) are attached.
1. How do I confirm that the agent is running successfully?
2. There are a few warnings such as "Configuration property ignored" and "No channel configured". The configuration looks fine to me. Can the warnings be ignored, or should they be treated as errors and fixed?
3. The source file is in the local file system (LFS). The sink file is not created. Is the path hdfs://sandbox.hortonworks.com:8080/Revathy/Flume/test correct? I am not sure where to find the port.
Thank you.
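
On the port question: the port in an HDFS URI is normally the NameNode port given by the fs.defaultFS property in core-site.xml, not a web UI port. A minimal sketch for reading that value follows. The function name default_fs is illustrative, and the sample XML (including the 8020 port) is made up for the demo; on a real cluster the XML text would come from the cluster's own core-site.xml, commonly found under /etc/hadoop/conf (an assumption about the sandbox layout).

```python
import xml.etree.ElementTree as ET


def default_fs(core_site_xml):
    """Return the default file system URI from core-site.xml text, or None."""
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        # fs.default.name is the older, deprecated spelling of the same setting.
        if name in ("fs.defaultFS", "fs.default.name"):
            return (prop.findtext("value") or "").strip()
    return None


if __name__ == "__main__":
    # Illustrative sample only; not copied from a real sandbox.
    sample = """
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://sandbox.hortonworks.com:8020</value>
      </property>
    </configuration>
    """
    print(default_fs(sample))
```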
04-13-2016
04:53 PM
Hi, I have used the attached Flume configuration (exec-source.txt) to read a file from the local file system (LFS) into HDFS, for learning. But I see no folder created. Do you see any issue in this file? I wanted to see how an interceptor works. Thank you.
Labels: Apache Flume
04-10-2016
03:24 PM
Hi, I got this link from the previous post: http://hortonworks.com/blog/configure-elastic-search-hadoop-hdp-2-0/ ("flume example"). The diagram shows that the sink can be HDFS or Elasticsearch.
I would like to try it with HDFS in the sandbox.
What should the value of the sink type be for HDFS?
I tried sink.type=logger, but the configuration file throws an error.
Since I am learning Flume by trying out the examples mentioned in the sandbox, it would be good if you could help me. Thank you,
Revathy.
Labels: Apache Flume
04-10-2016
03:18 PM
Hi Daniel, were you able to run this Flume example?
I am trying to run it as well.
What would the values be if the sink is HDFS and not Elasticsearch? Any idea? Thank you.
04-03-2016
01:35 PM
3 Kudos
Finally, the steps below helped (taken from the previous questions):
hdfs dfs -chown -R root:hdfs /user/root
sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/test --username root --table customerInfo --driver com.mysql.jdbc.Driver --m 1
It is working fine. But where can I find the hostname "sandbox.hortonworks.com:3306" that should be used? Thank you.
04-03-2016
01:12 PM
The sqoop-env file looks like the one below. (Should this be changed as per https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_installing_manually_book/content/set_up_sqoop_configuration.html?)
export HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}
export HBASE_HOME=${HBASE_HOME:-{{hbase_home}}}
export HIVE_HOME=${HIVE_HOME:-{{hive_home}}}
export ZOOCFGDIR=${ZOOCFGDIR:-/etc/zookeeper/conf}
export SQOOP_USER_CLASSPATH="`ls ${HIVE_HOME}/lib/libthrift-*.jar 2> /dev/null`:${SQOOP_USER_CLASSPATH}"
export SQOOP_CONF_DIR="/usr/hdp/current/sqoop-server/conf" ----> I added this line
04-03-2016
06:12 AM
2 Kudos
Hi, I am using the sandbox to practice Sqoop.
From the SSH shell, I entered the mysql prompt and created a table customerInfo under the database test. Then, from the command prompt, I typed the following command:
sqoop import \
--connect jdbc:mysql://localhost/test \
--username root \
--password xxxx \
--table customerInfo \
--m 1
and got the error: access denied for user 'root'@'localhost'.
1) Which password should I use? I tried with hadoop. I am using the MySQL that comes with the sandbox.
2) Where can I see the port configuration as mentioned in:
https://alexeikh.wordpress.com/2012/05/03/using-sqoop-for-moving-data-between-hadoop-and-sql-server/
Thank you.
Labels: Apache Sqoop