Member since: 04-14-2016
Posts: 54
Kudos Received: 9
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 20249 | 06-27-2016 07:20 AM |
 | 1583 | 05-09-2016 10:10 AM |
07-11-2016
03:45 PM
1 Kudo
You need to look into the logs, most likely the YARN logs of the map task of your Oozie launcher. These contain the Sqoop command execution and any errors you would normally see on the command line. You can get them from the ResourceManager UI (click on your Oozie launcher job and go through to the map task) or with the yarn logs command. Any issues in the actual data transfer will show up in the kicked-off MapReduce job, which is a separate job.
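For reference, a hedged example of the command-line route (the application ID is a placeholder you would take from the ResourceManager UI):
yarn logs -applicationId <application_id>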
... View more
07-01-2016
02:43 PM
Having described all that, I still think the proper Spark way is to use
df.write.format("csv").save("/tmp/df.csv")
or df.repartition(1).write.format("csv").save("/tmp/df.csv")
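As a usage note: repartition(1) collapses the DataFrame to a single partition, so the save produces one part file instead of one per partition. On Spark 1.x the csv data source comes from the external spark-csv package, so a minimal sketch, assuming that package is on the classpath and a header row is wanted, would be:
df.repartition(1).write.format("com.databricks.spark.csv").option("header", "true").save("/tmp/df.csv")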
... View more
06-30-2016
09:57 AM
Thank you, it works well.
But that was running in local mode; when I execute it on the cluster with the command:
spark-submit --master yarn-cluster --py-files hdfs:///dev/datalake/app/des/dev/script/lastloader.py --queue DES hdfs:///dev/datalake/app/des/dev/script/return.py
it generates this error in the logs:
Log Type: stdout
Log Upload Time: Thu Jun 30 09:19:20 +0200 2016
Log Length: 3254
Traceback (most recent call last):
File "return.py", line 10, in <module>
df = Lastloader()
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/__pyfiles__/lastloader.py", line 13, in Lastloader
qvol1 = hive_context.table("lake_des_statarbmfvol.qvol_bbg_closes")
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/context.py", line 565, in table
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/utils.py", line 36, in deco
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o53.table.
: org.apache.spark.sql.catalyst.analysis.NoSuchTableException
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.hive.client.ClientInterface$class.getTable(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientWrapper.getTable(ClientWrapper.scala:61)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:406)
at org.apache.spark.sql.hive.HiveContext$$anon$1.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:203)
at org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
Can you help me please?
Best regards
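(A note on a common cause, offered as an assumption rather than a confirmed diagnosis for this thread: NoSuchTableException in yarn-cluster mode often means the driver is running on a cluster node without hive-site.xml, so the HiveContext cannot see the shared metastore. Shipping the Hive client configuration with the job frequently resolves it, e.g. adding --files /etc/hive/conf/hive-site.xml to the spark-submit command above.)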
... View more
06-28-2016
07:11 AM
What is the problem with using a local file? Indeed, that is what you have to do; there is no reason to specify the path of the file on HDFS.
... View more
10-06-2016
02:55 PM
Hi @vasanath rajendran, check out this article: https://community.hortonworks.com/questions/58096/is-there-a-working-python-hive-library-that-connec.html#answer-58343
... View more
06-23-2016
10:49 AM
I have hard-coded a change in the class org.apache.hadoop.hive.serde2.OpenCSVSerde, but it doesn't work when I replace the old jar "/usr/hdp/current/hive-client/lib/hive-serde-1.2.1.2.3.0.0-2557.jar". What should I do to make the new jar work?
@Override
public Object deserialize(final Writable blob) throws SerDeException {
    Text rowText = (Text) blob;
    // replace Hive's NULL marker with an empty quoted field before parsing
    String text = rowText.toString().replace("\\N", "\"\"");
    CSVReader csv = null;
    try {
        csv = newReader(new CharArrayReader(text.toCharArray()), separatorChar,
                quoteChar, escapeChar);
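A deployment note, based on assumptions about the setup rather than anything stated above: instead of overwriting the stock hive-serde jar, it is usually easier to give the patched class its own name and package (com.example.PatchedOpenCSVSerde below is hypothetical), register the jar separately, and point the table at it:
ADD JAR /tmp/patched-csv-serde.jar;
CREATE TABLE demo (c1 string, c2 string)
  ROW FORMAT SERDE 'com.example.PatchedOpenCSVSerde'
  STORED AS TEXTFILE;
That way the bundled jar stays untouched and the class name makes clear which implementation is in use.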
... View more
06-21-2016
08:30 AM
Hello,
Thank you for the pointer. But I'm new to DataFrames, and what I'm trying to do is retrieve the values at indices i and i+1, for example.
Best regards
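A minimal sketch of one way to pair each row with the next, assuming Spark 1.4+ and a DataFrame df with an ordering column ts and a value column value (both hypothetical names):
from pyspark.sql import functions as F
from pyspark.sql.window import Window

w = Window.orderBy("ts")  # defines which row counts as i + 1
df_pairs = df.withColumn("next_value", F.lead("value", 1).over(w))
Here lead("value", 1) pulls the value of row i+1 onto row i; lag would do the reverse.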
... View more
06-13-2016
11:28 AM
Thank you. By adding the option --map-column-hive Date=Timestamp to Sqoop, everything works.
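For context, a hedged sketch of where that option sits in a Sqoop Hive import (the connection string, credentials, and table name are placeholders):
sqoop import --connect jdbc:mysql://dbhost/mydb --username user -P \
  --table mytable --hive-import \
  --map-column-hive Date=Timestamp
--map-column-hive overrides Sqoop's default type mapping for the named column, here forcing the Date column to a Hive TIMESTAMP.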
... View more