Member since
04-14-2016
54
Posts
9
Kudos Received
2
Solutions
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 17552 | 06-27-2016 07:20 AM |
|  | 1185 | 05-09-2016 10:10 AM |
07-11-2016
12:58 PM
Hello,
I am trying to create an Oozie job with a Sqoop command and I get this error:
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
This is my XML file:
<workflow-app name="exemple_hive" xmlns="uri:oozie:workflow:0.5">
    <global>
        <configuration>
            <property>
                <name>mapreduce.job.queuename</name>
                <value>DES</value>
            </property>
        </configuration>
    </global>
    <start to="sqoop-9fb3"/>
    <kill name="Kill">
        <message>L'action a échoué, message d'erreur[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="sqoop-9fb3">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>sqoop import -Dmapred.job.queue.name=DES --connect "jdbc:jtds:sqlserver://xxxx.xxxx.xxxx.xxxx:xxxx;databaseName=xxxxxxxx;user=xxxxxxxx;password=xxxxxxxx;instance=MSPAREBTP02" --driver net.sourceforge.jtds.jdbc.Driver --username hdp-import --table qvol_ccy --hive-import --hive-table test.qvol_ccy -m 1</command>
            <file>/dev/datalake/app/des/dev/lib/jtds-1.3.1.jar#jtds-1.3.1.jar</file>
            <file>/dev/datalake/app/des/dev/script/hive-site.xml#hive-site.xml</file>
        </sqoop>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
Labels:
- Apache Hive
- Apache Sqoop
07-01-2016
01:55 PM
Thank you.
Is it possible to replace part-00000 with a file name of my choice, for example command.txt?
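A hedged sketch of one way to do this: Spark chooses the part-file names itself, but after the write finishes, the part file can be renamed through the Hadoop FileSystem API that PySpark exposes via the JVM gateway. The paths below are hypothetical placeholders.

```python
# Sketch only: rename the part file that a (coalesced) save produced.
# Paths are hypothetical; assumes an active SparkContext `sc` (PySpark 1.x),
# and uses the JVM Hadoop FileSystem API exposed through py4j.
hadoop = sc._jvm.org.apache.hadoop
fs = hadoop.fs.FileSystem.get(sc._jsc.hadoopConfiguration())
src = hadoop.fs.Path("/dev/datalake/app/des/dev/out/part-00000")   # hypothetical
dst = hadoop.fs.Path("/dev/datalake/app/des/dev/out/command.txt")  # hypothetical
fs.rename(src, dst)
```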
07-01-2016
11:12 AM
Thanks.
But it generates an error:
AttributeError: 'DataFrameWriter' object has no attribute 'text'
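For reference, DataFrameWriter.text() only appeared in Spark 1.6; on earlier versions a common workaround is to go through the underlying RDD. A minimal sketch, assuming `df` is the DataFrame to save and the output path is a placeholder:

```python
# Sketch: on Spark versions before 1.6 (no DataFrameWriter.text), go through
# the underlying RDD; `df` is the DataFrame to save, the path is a placeholder.
lines = df.rdd.map(lambda row: ",".join(str(v) for v in row))
lines.saveAsTextFile("hdfs:///tmp/df_as_text")
```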
07-01-2016
08:35 AM
1 Kudo
Hello, I work with Spark DataFrames and I would like to know how to store the data of a DataFrame in a text file in HDFS. I tried with saveAsTextFile() but it does not work. Thank you.
Labels:
- Apache Hadoop
- Apache Spark
06-30-2016
09:57 AM
Thank you, it works well.
But this runs in local mode, and when I execute it on the cluster with the command:
spark-submit --master yarn-cluster --py-files hdfs:///dev/datalake/app/des/dev/script/lastloader.py --queue DES hdfs:///dev/datalake/app/des/dev/script/return.py
it generates this error in the logs:
Log Type: stdout
Log Upload Time: Thu Jun 30 09:19:20 +0200 2016
Log Length: 3254
Traceback (most recent call last):
File "return.py", line 10, in <module>
df = Lastloader()
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/__pyfiles__/lastloader.py", line 13, in Lastloader
qvol1 = hive_context.table("lake_des_statarbmfvol.qvol_bbg_closes")
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/context.py", line 565, in table
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/pyspark.zip/pyspark/sql/utils.py", line 36, in deco
File "/DATA/fs6/hadoop/yarn/local/usercache/atsafack/appcache/application_1465374541433_9209/container_e52_1465374541433_9209_02_000001/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling o53.table.
: org.apache.spark.sql.catalyst.analysis.NoSuchTableException
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientInterface$$anonfun$getTable$1.apply(ClientInterface.scala:123)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.hive.client.ClientInterface$class.getTable(ClientInterface.scala:123)
at org.apache.spark.sql.hive.client.ClientWrapper.getTable(ClientWrapper.scala:61)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:406)
at org.apache.spark.sql.hive.HiveContext$$anon$1.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:203)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$class.lookupRelation(Catalog.scala:203)
at org.apache.spark.sql.hive.HiveContext$$anon$1.lookupRelation(HiveContext.scala:410)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:739)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:735)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
Can you help me please?
Best regards
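One common cause of NoSuchTableException in yarn-cluster mode is that the driver runs on a cluster node where the Hive metastore configuration (hive-site.xml) is not visible, so the HiveContext silently falls back to an empty local metastore. A small diagnostic sketch (the table name is taken from the traceback above; everything else is generic):

```python
# Sketch: list the databases the driver actually sees; if the real ones are
# missing, the yarn-cluster driver is using a local/empty metastore instead
# of the cluster's Hive metastore (hive-site.xml not visible to the driver).
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext()
hive_context = HiveContext(sc)

for row in hive_context.sql("show databases").collect():
    print(row)

try:
    hive_context.table("lake_des_statarbmfvol.qvol_bbg_closes").show(5)
except Exception as e:
    print("Table lookup failed: %s" % e)
```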
06-30-2016
07:17 AM
Hello! I use PySpark.
06-29-2016
07:43 AM
Hello,
I tried, but I get an error. Code with "r" as a parameter:
df = hive_context.sql(s"select c.`date`, c.blglast from qvol1_temp as c join qvol2_temp as uv on c.udl_id = uv.udl_id where uv.ric =$r and c.`date` >= '2016-06-13 00:00:00' and c.`date` <= '2016-06-17 00:00:00' and c.adj_split = False")
Error: SyntaxError: invalid syntax
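The s"..." prefix is Scala string interpolation and is invalid Python syntax, which explains the SyntaxError. A hedged sketch building the same query with Python's str.format() instead (the value of r is hypothetical):

```python
# Sketch: build the HiveQL string with str.format(), not Scala s"..." syntax.
r = "'EUR='"  # hypothetical RIC value, already quoted for SQL
query = ("select c.`date`, c.blglast "
         "from qvol1_temp as c join qvol2_temp as uv on c.udl_id = uv.udl_id "
         "where uv.ric = {0} "
         "and c.`date` >= '2016-06-13 00:00:00' "
         "and c.`date` <= '2016-06-17 00:00:00' "
         "and c.adj_split = False").format(r)
df = hive_context.sql(query)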
06-28-2016
11:31 AM
For example: df = HiveContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}")
Thanks
Labels:
- Apache Hive
06-28-2016
06:42 AM
Hello Paul Hargis,
Here is the command that I run with the --files parameter, but it generates an error:
bash-4.1$ spark-submit --master yarn-cluster --queue DES --files hdfs://dev/datalake/app/des/dev/script/return.py
Error: Must specify a primary resource (JAR or Python or R file)
Run with --help for usage help or --verbose for debug output
My cordial thanks
06-27-2016
02:04 PM
Thank you. I managed to run it. Except that my file is local, and when I specify the path of a file on the cluster, I receive an error:
bash-4.1$ spark-submit --master yarn-client --queue DES hdfs:///dev/datalake/app/des/dev/script/return.py
Error: Only local python files are supported:
Parsed arguments:
master yarn-client
deployMode client
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile /usr/hdp/current/spark-client/conf/spark-defaults.conf
driverMemory null
driverCores null
driverExtraClassPath /usr/hdp/current/share/lzo/0.6.0/lib/hadoop-lzo-0.6.0.jar:/usr/local/jdk-hadoop/ojdbc7.jar:/usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar:/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar:/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/hbase-hadoop-compat.jar:/usr/hdp/current/hbase-client/lib/metrics-core-2.2.0.jar
driverExtraLibraryPath /usr/hdp/current/share/lzo/0.6.0/lib/native/Linux-amd64-64/
driverExtraJavaOptions null
supervise false
queue DES
numExecutors null
files null
pyFiles null
archives null
mainClass null
primaryResource hdfs:///dev/datalake/app/des/dev/script/return.py
name return.py
childArgs []
jars null
packages null
packagesExclusions null
repositories null
verbose false
06-27-2016
12:11 PM
Thank you. But I had already done this step, and I needed to handle multiple files.
This is now solved, thank you.
06-27-2016
07:20 AM
Hello! Thank you very much for your suggestions.
These methods worked, and I also found another very suitable method, which is to use the DataFrame. Cordially
06-27-2016
07:16 AM
Hello,
I would like to know how I can run a Python script that contains Spark commands. Here is the Python script that I would run in a Python environment:
#!/usr/bin/python2.7
from pyspark.sql import HiveContext
from pyspark import SparkContext
from pandas.DataFrame.ix import DataFrame as df
hive_context = HiveContext(sc)
qvol1 = hive_context.table("table")
qvol2 = hive_context.table("table")
qvol1.registerTempTable("qvol1_temp")
qvol2.registerTempTable("qvol2_temp")
df=hive_context.sql("request")
df.show()
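As posted, the script has no SparkContext of its own and the pandas import is not needed. A minimal runnable sketch of the same idea, with placeholder table names and SQL, intended to be launched through spark-submit:

```python
#!/usr/bin/python2.7
# Sketch: a self-contained PySpark script to launch with spark-submit.
# Table names and the SQL text are placeholders; the script must create its
# own SparkContext, and the pandas import from the original is not needed.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive_query_example")
hive_context = HiveContext(sc)

qvol1 = hive_context.table("db.table1")
qvol2 = hive_context.table("db.table2")
qvol1.registerTempTable("qvol1_temp")
qvol2.registerTempTable("qvol2_temp")

df = hive_context.sql("select * from qvol1_temp limit 10")
df.show()
```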
Labels:
- Apache Spark
06-24-2016
07:50 AM
2 Kudos
Hello,
I would like to read a Hive table from a Python script.
Can you help me please?
My cordial thanks
Labels:
- Apache Hive
06-22-2016
01:54 PM
Thank you, but I would like to go directly from the CSV file to the Hive ORC table format without creating the intermediate textfile data. Thanks
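One way to skip the textfile staging table is sketched below, under the assumption that the spark-csv package (com.databricks:spark-csv) is available and that a HiveContext named `hive_context` exists: read the CSV into a DataFrame and write it straight into an ORC-backed Hive table. The path and table name are placeholders.

```python
# Sketch: CSV -> DataFrame -> ORC-backed Hive table, skipping the textfile
# staging table. Assumes the spark-csv package is on the classpath and
# `hive_context` is a HiveContext; the path and table name are placeholders.
df = hive_context.read.format("com.databricks.spark.csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load("hdfs:///tmp/input.csv")

df.write.format("orc").saveAsTable("test.my_table_orc")
```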
06-22-2016
01:39 PM
Hello, is it possible to import data from a CSV file into a Hive table in ORC format? Thanks
Labels:
- Apache Hive
06-21-2016
08:30 AM
Hello,
Thank you for the guidance. But I'm new to DataFrames, and what I am trying to do is retrieve the values at indices i and i + 1, for example.
Best regards
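If the goal is to see row i together with row i + 1, window functions (lead/lag) may fit better than explicit iteration. A hedged sketch, assuming the rows can be ordered by a `date` column and `df` is the DataFrame from the question below:

```python
# Sketch: attach the next row's blglast (row i+1) to each row (row i) with lead(),
# assuming a `date` column exists to define the ordering.
from pyspark.sql import Window
from pyspark.sql import functions as F

w = Window.orderBy("date")
df_pairs = df.withColumn("blglast_next", F.lead("blglast", 1).over(w))
df_pairs.show()
```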
06-14-2016
08:43 AM
Hello,
I would like to iterate and compute accumulated values over a column of my DataFrame, but I cannot. Can you help me?
Thank you. Here is the creation of my DataFrame. I would like to calculate an accumulated value of the blglast column and store it in a new column.
from pyspark.sql import HiveContext
from pyspark import SparkContext
from pandas import DataFrame as df
sc =SparkContext()
hive_context = HiveContext(sc)
tab = hive_context.table("table")
tab.registerTempTable("tab_temp")
df=hive_context.sql("SELECT blglast FROM tab_temp AS b limit 50")
df.show()
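For an accumulated (running) total of blglast, a window function avoids manual iteration. A minimal sketch, assuming a `date` column is available to define the order (window functions in Spark 1.x require a HiveContext, which is already used above):

```python
# Sketch: running (accumulated) sum of blglast in a new column; with orderBy
# and no explicit frame, the window runs from the first row up to the current
# row, which gives a cumulative sum. Assumes a `date` column for the ordering.
from pyspark.sql import Window
from pyspark.sql import functions as F

w = Window.orderBy("date")
df_cum = df.withColumn("blglast_cum", F.sum("blglast").over(w))
df_cum.show()
```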
Labels:
- Apache Spark
06-13-2016
11:28 AM
Thank you. By adding the option --map-column-hive Date=Timestamp to Sqoop, everything works.
06-10-2016
02:38 PM
Thanks. But it creates another error: "Hive does not support the SQL type for column date".
06-09-2016
03:38 PM
Hello, I have smalldatetime data in my SQL Server database, and when I import it with Sqoop, this data is stored as a Hive String because smalldatetime does not exist in Hive. This is becoming problematic for my work.
Does anyone know if there is a way to import, through Sqoop, a smalldatetime data type from SQL Server and store it in the timestamp format recognized by Hive? Thanks
Labels:
- Apache Hive
- Apache Spark
- Apache Sqoop
06-08-2016
08:42 AM
Hello, thank you. That works. And I also found the Parquet file format. Currently I am also looking to save as a CSV file and as text, if possible. Cordially
06-07-2016
07:49 AM
I tried with hive_context.write.format("orc").save("test_orc") but I receive this error:
>>> hive_context.write.format("orc").save("hdfs://dev/datalake/app/des/dev/transformer/test_orc")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'HiveContext' object has no attribute 'write'
Thanks
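The .write attribute lives on a DataFrame, not on the HiveContext itself, which explains the AttributeError. A minimal sketch (the query and output path are placeholders):

```python
# Sketch: .write is a DataFrame attribute, not a HiveContext attribute;
# save the result of the query instead. The output path is a placeholder.
df = hive_context.sql("select * from qvol_temp limit 10")
df.write.format("orc").save("hdfs:///dev/datalake/app/des/dev/transformer/test_orc")
```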
06-07-2016
07:36 AM
Thank you. But here are the errors generated by the two methods, saveAsHadoopFile and .write.format: this means that these two methods are not recognized by HiveContext. Thank you!
06-07-2016
07:05 AM
1 Kudo
Hello, I work with HiveContext to load and manipulate data in ORC format. I would now like to know how to save the results of a SQL query to a file in HDFS. Can you help me, please? Here is my HiveContext code; I would like to save the query result in a file on HDFS. Thank you in advance.
from pyspark.sql import HiveContext
from pyspark import SparkContext
sc = SparkContext()
hive_context = HiveContext(sc)
qvol = hive_context.table("<bdd_name>.<table_name>")
qvol.registerTempTable("qvol_temp")
hive_context.sql("select * from qvol_temp limit 10").show()
Labels:
- Apache Hive
06-03-2016
07:07 AM
I created a view with Hive in Hue and I would like to modify its definition without deleting it. Can you help me, please?
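Hive can redefine a view in place with ALTER VIEW ... AS. A hedged sketch, issued here through a HiveContext to stay consistent with the other snippets, though the same statement can be run from the Hue Hive editor; all names are hypothetical:

```python
# Sketch: redefine the view in place with ALTER VIEW ... AS; all names here
# are hypothetical. The same statement can be pasted into the Hue Hive editor.
hive_context.sql("""
    ALTER VIEW my_db.my_view AS
    SELECT col1, col2
    FROM my_db.my_table
    WHERE col1 IS NOT NULL
""")
```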
Labels:
- Apache Hive
- Cloudera Hue
05-23-2016
09:15 AM
Thank you for your suggestions
05-23-2016
09:14 AM
Hi! Thank you. But I found what to do. I just had to add --split-by "colonne_id" in my script.