Member since: 03-07-2019
Posts: 158
Kudos Received: 53
Solutions: 33
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6359 | 03-08-2019 08:46 AM |
| | 4341 | 10-17-2018 10:25 AM |
| | 2774 | 10-16-2018 07:46 AM |
| | 2121 | 10-16-2018 06:57 AM |
| | 1763 | 10-12-2018 09:55 AM |
07-25-2018
02:49 PM
I believe FetchParquet does what you need: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-parquet-nar/1.5.0/org.apache.nifi.processors.parquet.FetchParquet/index.html
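For context, a minimal sketch of how FetchParquet is commonly wired (the upstream processor and the property names/values below are assumptions based on the usual ListHDFS -> FetchParquet pattern, not from the original thread; check the linked docs for the exact properties in your NiFi version):

ListHDFS (Directory: /data/parquet)  ->  FetchParquet
  Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
  Filename                       : ${path}/${filename}
  Record Writer                  : a configured record writer service, e.g. CSVRecordSetWriter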
07-24-2018
04:09 PM
@Gayathri Devi I've created a database in MariaDB and exported a Hive table using Sqoop on my lab setup. This worked well for me:

[sqoop@jsneep-lab ~]$ sqoop export --connect jdbc:mysql://172.3.2.1/export --username mariadb --password mariadb --table exported --direct --export-dir /apps/hive/warehouse/drivers

Make sure you have /usr/share/java/mysql-connector-java.jar present on your system; this gave me trouble initially.
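If the connector jar is missing, something along these lines should put it in place (a minimal sketch; the package name and Sqoop lib path are assumptions for a typical RHEL/CentOS HDP node):

# Install the MySQL/MariaDB JDBC driver and make it visible to Sqoop (paths assumed)
yum install -y mysql-connector-java
ln -s /usr/share/java/mysql-connector-java.jar /usr/hdp/current/sqoop-client/lib/mysql-connector-java.jar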
07-24-2018
08:23 AM
Hi @Anurag Mishra Prior to HDP 3 you could only see that an application was killed by a user, not who killed it. HDP 3 and onwards is more informative about who killed an application. On an HDP 2.6 cluster, for example:

[jsneep@node4 ~]$ yarn jar /usr/hdp/2.6.4.0-91/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 9000
18/07/24 07:44:44 INFO security.TokenCache: Got dt for hdfs://hwc1251-node2.hogwarts-labs.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.25.33.145:8020, Ident: (HDFS_DELEGATION_TOKEN token 7 for jsneep)
18/07/24 07:44:45 INFO input.FileInputFormat: Total input paths to process : 10
18/07/24 07:44:45 INFO mapreduce.JobSubmitter: number of splits:10
18/07/24 07:44:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1532417644227_0005
18/07/24 07:44:46 INFO impl.YarnClientImpl: Submitted application application_1532417644227_0005
18/07/24 07:44:46 INFO mapreduce.Job: Running job: job_1532417644227_0005
[root@hwc1251-node4 ~]# yarn application -kill application_1532417644227_0005
18/07/24 07:44:53 INFO mapreduce.Job: Job job_1532417644227_0005 failed with state KILLED due to: Application killed by user.
18/07/24 07:44:53 INFO mapreduce.Job: Counters: 0
Job Finished in 8.516 seconds
Above, I've submitted a YARN job (application_1532417644227_0005) and killed it. The logs only state "Application killed by user." I can also browse the Resource Manager UI at http://<RM IP ADDRESS>:8088/cluster/apps/KILLED and see that it was killed by a user, but not by whom. The Apache JIRA for this is https://issues.apache.org/jira/browse/YARN-5053 ("More informative diagnostics when applications killed by a user"). In my HDP 3 cluster, when I submit an identical job and kill it:

[root@c2175-node4 ~]# yarn app -kill application_1532419910561_0001
18/07/24 08:12:45 INFO client.RMProxy: Connecting to ResourceManager at c2175-node2.hwx.com/172.25.39.144:8050
18/07/24 08:12:45 INFO client.AHSProxy: Connecting to Application History server at c2175-node2.hwx.com/172.25.39.144:10200
Killing application application_1532419910561_0001
Now, via the RM UI, I can browse to http://<RM IP ADDRESS>:8088/ui2/#/yarn-app/application_x_/info and under Diagnostics we see the user and source address of the kill operation. The same is visible through the CLI via "yarn app -status application_1532419910561_0001 | grep killed":

Application application_1532419910561_0001 was killed by user root at 172.25.33.15

Edit: PS, you could make use of YARN queues and ACLs to limit who has rights to kill YARN applications; I wanted to mention this in case you're currently unable to get your cluster upgraded to HDP 3. Further info: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_yarn-resource-management/content/controlling_access_to_queues_with_acls.html A queue ACL sketch follows below.

*please mark this answer as accepted if you found this helpful*
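For reference, a minimal sketch of such a queue ACL, assuming the Capacity Scheduler, a queue named default, and yarn.acl.enable=true in yarn-site.xml; the user and group names are placeholders:

<!-- capacity-scheduler.xml: only user yarnops and members of hadoop-admins may administer (e.g. kill) apps in the default queue -->
<property>
  <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
  <value>yarnops hadoop-admins</value>
</property>
<!-- anyone may still submit applications to the queue -->
<property>
  <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
  <value>*</value>
</property>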
07-23-2018
04:19 PM
@Sambasivam Subramanian I've just noticed that there is a script available to help with this, which I was unaware of; it would be easier this way:

[root@lab-1 ~]# /usr/hdp/2.6.5.0-292/zeppelin/bin/install-interpreter.sh -n python
Install python(org.apache.zeppelin:zeppelin-python:0.7.0) to /usr/hdp/2.6.5.0-292/zeppelin/interpreter/python ...
Interpreter python installed under /usr/hdp/2.6.5.0-292/zeppelin/interpreter/python.
1. Restart Zeppelin (a restart sketch follows these steps)
2. Create interpreter setting in 'Interpreter' menu on Zeppelin GUI
3. Then you can bind the interpreter on your note
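For step 1, a minimal sketch for restarting Zeppelin from the shell (path assumed from the HDP layout above; if Zeppelin is managed by Ambari, restart it from the Ambari UI instead):

# Restart the Zeppelin daemon as the zeppelin service user
su - zeppelin -c "/usr/hdp/2.6.5.0-292/zeppelin/bin/zeppelin-daemon.sh restart"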
07-23-2018
02:08 PM
Hi @Sambasivam Subramanian Yes, you definitely need to have the python interpreter data under /usr/hdp/current/zeppelin-server/interpreter. You can download Zeppelin from Apache and extract the python interpreter from that download. Also be sure to update interpreter.json afterwards.
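A minimal sketch of that procedure (the Zeppelin version, download URL, and ownership below are assumptions; pick the release matching your HDP stack):

# Download a matching Apache Zeppelin release and copy only its python interpreter (version/URL assumed)
cd /tmp
wget https://archive.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz
tar -xzf zeppelin-0.7.3-bin-all.tgz
cp -r zeppelin-0.7.3-bin-all/interpreter/python /usr/hdp/current/zeppelin-server/interpreter/
chown -R zeppelin:zeppelin /usr/hdp/current/zeppelin-server/interpreter/python
# Then add a matching "python" entry to /usr/hdp/current/zeppelin-server/conf/interpreter.json and restart Zeppelin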
07-23-2018
01:54 PM
@Gayathri Devi Perhaps this works; I haven't tried it myself yet as I don't currently have any lab setup using MariaDB. Edit: this may also be helpful: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_syntax_3 I've noticed now that --export-dir is also required:

sqoop export \
  --connect "jdbc:mariadb://localhost/example" \
  --username mariadb \
  --password mariadb \
  --table hivetable \
  --export-dir /apps/hive/warehouse/hivetable
07-23-2018
01:04 PM
1 Kudo
Hi @Gayathri Devi It seems you're trying to use --hcatalog-table.
The argument value for this option is the HCatalog table name. The presence of the --hcatalog-table option signifies that the import or export job is done using HCatalog tables, and it is a required option for HCatalog jobs.
But it looks like you have an accidental space between --hcatalog and table, resulting in a command syntax issue. See: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_data-movement-and-integration/content/sqoop_hcatalog_integration.html A corrected example follows below.
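A minimal sketch of the corrected syntax (the connection details, database, and table names here are placeholders, not taken from the original thread):

# --hcatalog-table must be a single token, with no space before "table"
sqoop export \
  --connect "jdbc:mariadb://localhost/example" \
  --username mariadb --password mariadb \
  --table exporttable \
  --hcatalog-database default \
  --hcatalog-table hivetable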
07-20-2018
03:33 PM
Hi @Sambasivam Subramanian Can you share all the steps you've taken? Note that Zeppelin on HDP supports the interpreters below, and that using PySpark you can run Python scripts (a small PySpark example is at the end of this post):

- Spark
- JDBC (supports Hive, Phoenix)
- OS Shell
- Markdown
- Livy (supports Spark, Spark SQL, PySpark, PySpark3, and SparkR)
- AngularJS

Source: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/using-interp.html

[root@c1175-node3 zeppelin-server]# cd interpreter/
[root@c1175-node3 interpreter]# ll
total 44
drwxr-xr-x 3 zeppelin zeppelin 4096 Jul 20 15:10 angular
drwxr-xr-x 3 zeppelin zeppelin 20480 Jul 20 15:10 jdbc
drwxr-xr-x 3 zeppelin zeppelin 4096 Jul 20 15:10 lib
drwxr-xr-x 3 zeppelin zeppelin 4096 Jul 20 15:10 livy
drwxr-xr-x 3 zeppelin zeppelin 4096 Jul 20 15:10 md
drwxr-xr-x 3 zeppelin zeppelin 4096 Jul 20 15:10 sh
drwxr-xr-x 7 zeppelin zeppelin 4096 Jul 20 15:11 spark
It is, however, also possible to add to this manually. Did you download Zeppelin manually and copy the interpreter/python directory to your HDP install of Zeppelin as part of the steps you've undertaken? Please also note that you need to update interpreter.json, which is normally found under /usr/hdp/current/zeppelin-server/conf.
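As a quick example of the PySpark route mentioned above, a minimal sketch of a Zeppelin note paragraph (the %livy.pyspark prefix assumes the Livy interpreter is bound to the note; %spark.pyspark is the non-Livy equivalent):

%livy.pyspark
# plain Python, executed through PySpark on the cluster
numbers = [1, 2, 3, 4, 5]
print(sum(numbers))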
07-20-2018
09:56 AM
There are indeed two addresses: the sandbox is CentOS, which has an IP, and inside CentOS there is a Docker image running HDP, which has another IP. From Windows itself, ports on the Docker image are forwarded and accessible via localhost. For example, when your sandbox is up, you can access 127.0.0.1:8888 via a browser. Similarly, port 2222 is also forwarded and accessible through localhost/127.0.0.1.
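For instance, a minimal sketch of reaching the HDP container over that forwarded port (the root user is the commonly documented sandbox default; adjust if yours differs):

# SSH into the HDP docker container via the port forwarded to the host
ssh root@127.0.0.1 -p 2222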
07-20-2018
07:39 AM
@Tulasi Uppu Are you using the IP address that you've seen via docker inspect? Try the IP visible via docker-machine ls, and also be sure to check that Windows Firewall isn't preventing your access 🙂 Let me know if that solves your issue or if you have more questions! If this has helped, please take a moment to log in and mark this answer as accepted!
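A minimal sketch of those checks (the machine name "default" is an assumption; substitute whatever docker-machine ls shows):

# List docker-machine VMs, then print the IP of the one running the sandbox
docker-machine ls
docker-machine ip default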