Member since: 04-26-2016
Posts: 78
Kudos Received: 32
Solutions: 0
02-12-2016
08:28 AM
2 Kudos
What is the best mechanism to ingest data from relational sources into HDP? Is it better to use a combination of the ExecuteSQL and PutHDFS processors, or to use Sqoop to deliver the data to HDP? Thanks
02-11-2016
08:45 PM
1 Kudo
@Neeraj Sabharwal @Artem Ervits Just wondering what the best mechanism is to ingest data from relational sources into HDP. Is it better to use a combination of the ExecuteSQL and PutHDFS processors, or to use Sqoop to deliver the data to HDP? Many Thanks
02-11-2016
08:36 PM
2 Kudos
@Artem Ervits Thanks for the info. With the first question, my intention was not to find out whether NiFi works as an Oozie replacement, but to see how to get Oozie-like functionality in the HDF world. On further reading, I found that scheduling can be set at the level of each processor (timer based, cron based, event based, etc.). This is sufficient for our requirements. On security, I need to look into it more deeply and will come back later with further questions. Many Thanks
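For the cron-based option, here is a small Python sketch of how a schedule expression breaks down. It assumes NiFi's cron-driven scheduling takes Quartz-style expressions (six fields starting with seconds, plus an optional year). The helper name `describe_cron` is hypothetical, and it only labels the fields without validating their values.

```python
# Field order of a Quartz-style cron expression (assumed to be what
# NiFi's cron-based scheduling expects); "year" is optional.
QUARTZ_FIELDS = (
    "seconds", "minutes", "hours",
    "day_of_month", "month", "day_of_week", "year",
)


def describe_cron(expression):
    """Label each field of a Quartz-style cron expression (hypothetical helper)."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(
            f"expected 6 or 7 space-separated fields, got {len(parts)}"
        )
    # zip stops at the shorter sequence, so "year" is omitted when absent.
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    # Intended as "every day at 13:00"; '?' means no specific day of week.
    for name, value in describe_cron("0 0 13 * * ?").items():
        print(f"{name}: {value}")
```

A five-field Unix-style expression such as `0 13 * * *` is rejected by this sketch, which is the most likely mistake when moving a schedule over from crontab.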
02-09-2016
05:40 PM
2 Kudos
Hi, I am new to HDF and have a few questions about HDF and its configuration. Can anyone please help with the following?
1. What steps are required to define a workflow so that a NiFi job can be invoked? I am looking for something similar to Oozie, which can schedule any Hadoop-related task, and would like to know how to achieve the same in HDF.
2. What are the ways to secure access to an HDF cluster? We want to run an HDF cluster on AWS, with a VPC established from our network to AWS. We also want the cluster secured and ring-fenced so that only designated people and machines can invoke NiFi processing. How can this be achieved?
3. Extending the security question, is something similar to Knox available for HDF? If not, how can similar ring-fencing be achieved?
Thanks
Labels: Apache NiFi, Cloudera DataFlow (CDF)
02-04-2016
05:51 PM
1 Kudo
@Neeraj Sabharwal Thanks. I agree with all the responses given so far. Here is my understanding, to summarise. Only spark-thrift-server and spark-history-server can be stopped, either through Ambari or through the CLI. spark-client can only be installed (the required libraries are placed in a directory on the specified machine) or uninstalled; there is no stop or start for it, and the same applies in Ambari. spark-shell runs interactively and spark-submit submits the job to the cluster; once the application completes, the Spark driver program ends.
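A minimal Python sketch of the CLI side of this summary. It assumes the standard Apache Spark layout, where the stop scripts live under `$SPARK_HOME/sbin` as `stop-thriftserver.sh` and `stop-history-server.sh`. The `/usr/hdp/current/spark-client` default and the helper names `stop_command` and `stop` are assumptions for illustration and may not match a given install. The client is rejected on purpose because it has no daemon to stop.

```python
import os
import subprocess
from pathlib import PurePosixPath

# Stop scripts shipped in Spark's sbin directory. Only the two daemons are
# listed; the client is a set of libraries with nothing to stop.
STOP_SCRIPTS = {
    "thrift-server": "stop-thriftserver.sh",
    "history-server": "stop-history-server.sh",
}

# Assumed default location; set SPARK_HOME for a manually installed Spark.
DEFAULT_SPARK_HOME = "/usr/hdp/current/spark-client"


def stop_command(component, spark_home=None):
    """Return the argv list that stops the given Spark daemon."""
    if component not in STOP_SCRIPTS:
        raise ValueError(f"{component!r} is not a stoppable Spark daemon")
    home = spark_home or os.environ.get("SPARK_HOME", DEFAULT_SPARK_HOME)
    return [str(PurePosixPath(home) / "sbin" / STOP_SCRIPTS[component])]


def stop(component, spark_home=None):
    """Run the stop script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(stop_command(component, spark_home), check=True)
```

For a manually installed Spark, something like `stop("history-server", "/path/to/manual/spark")` would be the equivalent of what Ambari does for its own install, run as the user that started the daemon.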
02-04-2016
05:33 PM
@Neeraj Sabharwal I believe these are applicable to Spark standalone mode. Correct me if I'm wrong. However, I wanted to understand how it works in YARN mode. Specifically, my question is this: @Artem mentioned that 'it' would stop immediately after app execution from a shell. What is 'it' here? Is it the Spark client, the Spark driver, or the app itself? Alternatively, what is it that Ambari shows as running and that can be stopped through the UI? I presume it's the spark-client. If so, what needs to be done to stop the spark-client manually, in the same way that Ambari does?
02-04-2016
04:40 PM
@Artem Ervits Here is my confusion. If it stops immediately after app execution from a shell, why does it appear as running in Ambari, where it can be stopped through the UI? Are spark-shell and spark-client two different things? If so, in a manual installation, what needs to be done to stop the spark-client, in the same way that Ambari does?
02-04-2016
04:20 PM
@Artem Ervits Thanks for the quick response. I have two Spark clients at the moment: one is managed by Ambari and the other is outside Ambari, set up manually. If I have to stop the manually managed one, how do I do it? Thanks
02-04-2016
04:14 PM
1 Kudo
Hi, just wondering what happens when the Spark service is selected in Ambari and stopped. Which service is stopped behind the scenes? To my understanding, when Spark is installed through Ambari, it installs spark-client, thrift-server and history-server. When I stop Spark through Ambari, what action is invoked? Is it only the spark-client that is stopped, or all three? Alternatively, if I have to stop spark-client through the CLI, how should that be done? Please correct me if my understanding is wrong on any of the above. Thanks
Labels: Apache Ambari, Apache Spark