Member since: 04-14-2016
Posts: 54
Kudos Received: 9
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 20249 | 06-27-2016 07:20 AM |
 | 1583 | 05-09-2016 10:10 AM |
07-11-2016
03:45 PM
1 Kudo
You need to look into the logs, most likely the YARN logs of the map task of your Oozie launcher. These contain the Sqoop command execution and any errors you would normally see on the command line. You can get them from the ResourceManager UI (click on your Oozie launcher job and go through to the map task) or with the yarn logs command. Any issues in the actual data transfer will show up in the kicked-off MapReduce job, which is a separate job.
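For reference, a hedged example of the command-line route (the application ID is a placeholder you would take from the ResourceManager UI):
yarn logs -applicationId <application_id>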
... View more
07-01-2016
02:43 PM
Having described all that, I still think the proper Spark way is to use
df.write.format("csv").save("/tmp/df.csv")
or df.repartition(1).write.format("csv").save("/tmp/df.csv")
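As a usage note: repartition(1) collapses the DataFrame to a single partition, so the save produces one part file instead of one per partition. On Spark 1.x the csv data source comes from the external spark-csv package, so a minimal sketch, assuming that package is on the classpath and a header row is wanted, would be:
df.repartition(1).write.format("com.databricks.spark.csv").option("header", "true").save("/tmp/df.csv")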
... View more
06-30-2016
09:57 AM
Thank you, it works well.
But that was running in local mode; when I execute it on the cluster with the command:
spark-submit --master yarn-cluster --py-files hdfs:///dev/datalake/app/des/dev/script/lastloader.py --queue DES hdfs:///dev/datalake/app/des/dev/script/return.py
it generates this error in the logs:
Log Type: stdout
Log Upload Time: Thu Jun 30 09:19:20 +0200 2016
Log Length: 3254
Traceback (most recent call last):
File "return.py", line 10, in <module>
df = Lastloader()
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/__pyfiles__/lastloader.py", line 13, in Lastloader
qvol1 = hive_context.table("lake_des_statarbmfvol.qvol_bbg_closes")
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/context.py", line 565, in table
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/utils.py", line 36, in deco
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o53.table.
: org.apache.spark.sql.catalyst.analysis.NoSuchTableException
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.hive.client.ClientInterface$class.getTable(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientWrapper.getTable(ClientWrapper.scala:61)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:406)
at org.apache.spark.sql.hive.HiveContext$$anon$1.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:203)
at org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
Can you help me please?
Best regards
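(A note on a common cause, offered as an assumption rather than a confirmed diagnosis for this thread: NoSuchTableException in yarn-cluster mode often means the driver is running on a cluster node without hive-site.xml, so the HiveContext cannot see the shared metastore. Shipping the Hive client configuration with the job frequently resolves it, e.g. adding --files /etc/hive/conf/hive-site.xml to the spark-submit command above.)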
... View more
06-28-2016
07:11 AM
What is the problem with using a local file? Indeed, that is what you have to do; there is no reason to specify the path of the file on HDFS.
... View more
10-06-2016
02:55 PM
Hi @vasanath rajendran, check out this article: https://community.hortonworks.com/questions/58096/is-there-a-working-python-hive-library-that-connec.html#answer-58343
... View more
06-23-2016
10:49 AM
I have hard-coded a change in the class org.apache.hadoop.hive.serde2.OpenCSVSerde, but it doesn't work when I replace the old jar "/usr/hdp/current/hive-client/lib/hive-serde-1.2.1.2.3.0.0-2557.jar". What should I do to make the new jar work?
@Override
public Object deserialize(final Writable blob) throws SerDeException {
    Text rowText = (Text) blob;
    // replace Hive's NULL marker with an empty quoted field before parsing
    String text = rowText.toString().replace("\\N", "\"\"");
    CSVReader csv = null;
    try {
        csv = newReader(new CharArrayReader(text.toCharArray()), separatorChar,
                quoteChar, escapeChar);
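A deployment note, based on assumptions about the setup rather than anything stated above: instead of overwriting the stock hive-serde jar, it is usually easier to give the patched class its own name and package (com.example.PatchedOpenCSVSerde below is hypothetical), register the jar separately, and point the table at it:
ADD JAR /tmp/patched-csv-serde.jar;
CREATE TABLE demo (c1 string, c2 string)
  ROW FORMAT SERDE 'com.example.PatchedOpenCSVSerde'
  STORED AS TEXTFILE;
That way the bundled jar stays untouched and the class name makes clear which implementation is in use.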
... View more
06-21-2016
08:30 AM
Hello,
Thank you for the pointer. But I'm new to DataFrames, and what I'm trying to do is retrieve the values at indices i and i+1, for example.
Best regards
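A minimal sketch of one way to pair each row with the next, assuming Spark 1.4+ and a DataFrame df with an ordering column ts and a value column value (both hypothetical names):
from pyspark.sql import functions as F
from pyspark.sql.window import Window

w = Window.orderBy("ts")  # defines which row counts as i + 1
df_pairs = df.withColumn("next_value", F.lead("value", 1).over(w))
Here lead("value", 1) pulls the value of row i+1 onto row i; lag would do the reverse.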
... View more
06-13-2016
11:28 AM
Thank you. By adding the option --map-column-hive Date=Timestamp to Sqoop, everything works.
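For context, a hedged sketch of where that option sits in a Sqoop Hive import (the connection string, credentials, and table name are placeholders):
sqoop import --connect jdbc:mysql://dbhost/mydb --username user -P \
  --table mytable --hive-import \
  --map-column-hive Date=Timestamp
--map-column-hive overrides Sqoop's default type mapping for the named column, here forcing the Date column to a Hive TIMESTAMP.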
... View more