Member since: 10-06-2015
Posts: 273
Kudos Received: 202
Solutions: 81
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4036 | 10-11-2017 09:33 PM
 | 3562 | 10-11-2017 07:46 PM
 | 2569 | 08-04-2017 01:37 PM
 | 2207 | 08-03-2017 03:36 PM
 | 2235 | 08-03-2017 12:52 PM
08-23-2016
01:33 AM
2 Kudos
If you're looking for a sandbox/VM with HDP 2.5, you can find it here: http://hortonworks.com/downloads/#tech-preview
08-17-2016
07:48 PM
1 Kudo
@narender pasunooti Ubuntu 15 is not supported in HDP 2.4/2.4.2 (see https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_HDP_RelNotes/content/upgrade_procedure.html). Consider using Ubuntu 14, which is supported.
08-16-2016
07:25 PM
You would copy the file from "/.reserved/raw/test1/file1.txt" to "/.reserved/raw/test2/file1.txt" while preserving the extended attributes (where the EDEK is stored) by using the -px flag. https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser https://issues.apache.org/jira/browse/MAPREDUCE-6007
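For illustration, a minimal sketch of that copy, assuming it is done with DistCp as the HDFS superuser and that the destination encryption zone already exists (the test1/test2 paths are just the placeholders from the example above):

hadoop distcp -px /.reserved/raw/test1/file1.txt /.reserved/raw/test2/file1.txt

The -px option preserves permissions and extended attributes, which is what carries the raw encryption metadata across.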
08-16-2016
05:14 PM
*Removed my previous response and added the link to the article below: https://community.hortonworks.com/articles/51909/how-to-copy-encrypted-data-between-two-hdp-cluster.html
08-16-2016
02:24 PM
Thanks @Sagar Shimpi. I've seen this, but looking at the code it seems like it only copies the master keys (EK). My understanding is that to decrypt a file you would need both the master key (EK) stored in the DB and the file-level encryption key (EDEK), which is stored in the NameNode. Am I missing something or misunderstanding?
08-16-2016
02:06 PM
When using Ranger KMS and TDE, is it possible to share encryption keys across two clusters? The scenario is that we have a Prod and a DR cluster. When replicating the data we'd like to avoid decrypting it on Prod, moving it over the wire, and then re-encrypting it when we write to DR. Is this possible?
08-15-2016
11:55 PM
2 Kudos
Because the Spark action in Oozie is not supported in HDP 2.3.x and HDP 2.4.0 (and there is no workaround for this, especially in a Kerberos environment), we can use either a Java action or a shell action to launch a Spark job from an Oozie workflow. In this article, we will discuss how to use an Oozie shell action to run a Spark job in a Kerberos environment.

Prerequisites:
1. The Spark client is installed on every host where a NodeManager is running. This is because we have no control over which node the shell action (and hence spark-submit) will be launched on.
2. Optionally, if the Spark job needs to interact with an HBase cluster, the HBase client needs to be installed on every host as well.

Steps:
1. Create a shell script with the spark-submit command. For example, in script.sh:

/usr/hdp/current/spark-client/bin/spark-submit --keytab keytab --principal ambari-qa-falconJ@FALCONJSECURE.COM --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 500m --num-executors 1 --executor-memory 500m --executor-cores 1 spark-examples.jar 3

2. Prepare the Kerberos keytab that will be used by the Spark job. For example, we use the Ambari smoke test user; its keytab is already generated by Ambari in /etc/security/keytabs/smokeuser.headless.keytab.

3. Create the Oozie workflow with a shell action that will execute the script created above. For example, in workflow.xml:

<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.4">
<start to="shellAction"/>
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>script.sh</exec>
<file>/user/oozie/shell/script.sh#script.sh</file>
<file>/user/oozie/shell/smokeuser.headless.keytab#keytab</file>
<file>/user/oozie/shell/spark-examples.jar#spark-examples.jar</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="killAction"/>
</action>
<kill name="killAction">
<message>"Killed job due to error"</message>
</kill>
<end name="end"/>
</workflow-app>

4. Create the Oozie job properties file. For example, in job.properties:

nameNode=hdfs://falconJ1.sec.support.com:8020
jobTracker=falconJ2.sec.support.com:8050
queueName=default
oozie.wf.application.path=${nameNode}/user/oozie/shell
oozie.use.system.libpath=true

5. Upload the files created above to the Oozie workflow application path in HDFS (in this example: /user/oozie/shell):
- workflow.xml
- smokeuser.headless.keytab
- script.sh
- the Spark uber jar (in this example: /usr/hdp/current/spark-client/lib/spark-examples*.jar)
- any other configuration file mentioned in the workflow (optional)

6. Execute the oozie command to run this workflow. For example:

oozie job -oozie http://<oozie-server>:11000/oozie -config job.properties -run

*This article was created by Hortonworks Support on 2016-04-28
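For reference, a minimal sketch of steps 5 and 6 as shell commands, assuming the example paths above; <oozie-server> and <job-id> are placeholders, not values from a real cluster:

# Stage the workflow, script, keytab, and Spark examples jar in the application path
hdfs dfs -mkdir -p /user/oozie/shell
hdfs dfs -put workflow.xml script.sh /user/oozie/shell/
hdfs dfs -put /etc/security/keytabs/smokeuser.headless.keytab /user/oozie/shell/
hdfs dfs -put /usr/hdp/current/spark-client/lib/spark-examples*.jar /user/oozie/shell/spark-examples.jar

# Submit the workflow, then check its status
oozie job -oozie http://<oozie-server>:11000/oozie -config job.properties -run
oozie job -oozie http://<oozie-server>:11000/oozie -info <job-id>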
08-15-2016
10:14 PM
2 Kudos
Hi @Timothy Spann. It really all depends on your particular use case and requirements. First, I'm assuming you have a custom-built application that will be querying this data store. If so, how complex do the queries need to be? Do you need a relational (SQL) or a key-value store? Also, how much latency can you afford? I would first explore whether HBase (or HBase + Phoenix) would be sufficient; this will reduce the number of moving parts you have. If you're set on in-memory data grids/stores, then some options would be Redis, Hazelcast, Terracotta BigMemory, and GridGain (Apache Ignite). I believe the last two have connectors to Hadoop that allow writing results of MR jobs directly to the data grid (you'll need to confirm that functionality though). As I said, I recommend you exhaust the HBase option before moving out-of-stack.
08-15-2016
09:46 PM
@SBandaru Please take a look at the link below for a discussion around installing Kylin on HDP. It refers to HDP 2.3.2 but should be applicable to 2.4 as well. https://community.hortonworks.com/questions/1293/how-to-make-kylin-work-with-hdp-23.html As for support, Hortonworks does not provide support for Kylin, but I'm sure you can get plenty of help from both the Hortonworks and Kylin communities.
08-15-2016
05:08 PM
@Ankit A Are you able to run the Spark job from the shell/command line? If so, then you may want to use a Shell action instead. The Oozie Spark action in HDP 2.3.4 is still in tech preview and not supported yet. The tech note below was released with the recommendation to use Shell actions or Java actions instead. https://community.hortonworks.com/content/kbentry/51582/how-to-use-oozie-shell-action-to-run-a-spark-job-i-1.html

-------------------- Begin Tech Note --------------------

Because the Spark action in Oozie is not supported in HDP 2.3.x and HDP 2.4.0 (and there is no workaround for this, especially in a Kerberos environment), we can use either a Java action or a shell action to launch a Spark job from an Oozie workflow. In this article, we will discuss how to use an Oozie shell action to run a Spark job in a Kerberos environment.

Prerequisites:
1. The Spark client is installed on every host where a NodeManager is running. This is because we have no control over which node the shell action (and hence spark-submit) will be launched on.
2. Optionally, if the Spark job needs to interact with an HBase cluster, the HBase client needs to be installed on every host as well.

Steps:
1. Create a shell script with the spark-submit command. For example, in script.sh:

/usr/hdp/current/spark-client/bin/spark-submit --keytab keytab --principal ambari-qa-falconJ@FALCONJSECURE.COM --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 500m --num-executors 1 --executor-memory 500m --executor-cores 1 spark-examples.jar 3

2. Prepare the Kerberos keytab that will be used by the Spark job. For example, we use the Ambari smoke test user; its keytab is already generated by Ambari in /etc/security/keytabs/smokeuser.headless.keytab.

3. Create the Oozie workflow with a shell action that will execute the script created above. For example, in workflow.xml:

<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.4">
<start to="shellAction"/>
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>script.sh</exec>
<file>/user/oozie/shell/script.sh#script.sh</file>
<file>/user/oozie/shell/smokeuser.headless.keytab#keytab</file>
<file>/user/oozie/shell/spark-examples.jar#spark-examples.jar</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="killAction"/>
</action>
<kill name="killAction">
<message>"Killed job due to error"</message>
</kill>
<end name="end"/>
</workflow-app>

4. Create the Oozie job properties file. For example, in job.properties:

nameNode=hdfs://falconJ1.sec.support.com:8020
jobTracker=falconJ2.sec.support.com:8050
queueName=default
oozie.wf.application.path=${nameNode}/user/oozie/shell
oozie.use.system.libpath=true

5. Upload the files created above to the Oozie workflow application path in HDFS (in this example: /user/oozie/shell):
- workflow.xml
- smokeuser.headless.keytab
- script.sh
- the Spark uber jar (in this example: /usr/hdp/current/spark-client/lib/spark-examples*.jar)
- any other configuration file mentioned in the workflow (optional)

6. Execute the oozie command to run this workflow. For example:

oozie job -oozie http://<oozie-server>:11000/oozie -config job.properties -run

-------------------- End Tech Note --------------------

See a similar/related response here: https://community.hortonworks.com/questions/22772/oozie-spark-action-giving-key-not-found-spark-home.html#answer-45981