Member since: 05-05-2016
Posts: 35
Kudos Received: 3
Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 607 | 05-06-2016 10:54 AM
06-13-2017 08:06 PM

Hi, I have to design a Spark Streaming application for the use case below and am looking for the best possible approach.

An application pushes data into 1000+ different topics, each with a different purpose. Spark Streaming will receive data from each topic and, after processing, write it back to the corresponding output topic. For example:

Input Type 1 Topic --> Spark Streaming --> Output Type 1 Topic
Input Type 2 Topic --> Spark Streaming --> Output Type 2 Topic
Input Type 3 Topic --> Spark Streaming --> Output Type 3 Topic
...
Input Type N Topic --> Spark Streaming --> Output Type N Topic

I need to answer the following questions:
1. Is it a good idea to launch 1000+ Spark Streaming applications, one per topic? Or should I have one streaming application for all topics, since the processing logic is the same?
2. If one streaming context, how will I determine which RDD belongs to which Kafka topic, so that after processing I can write it back to its corresponding output topic?
3. A client may add or delete topics in Kafka; how do I handle that dynamically in Spark Streaming?
4. How do I restart the job automatically on failure?
Any other issues you see here?
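On question 2, a sketch may help. With the direct Kafka integration, the input topic for each partition is available from the stream's offset-range metadata, so a single application can serve all topic pairs and route each processed record by deriving the output topic name from the input topic name. A minimal plain-Python sketch of that routing step (the `input-N`/`output-N` naming scheme and the record shape are assumptions for illustration, not part of any Spark API):

```python
# Sketch of the topic-routing logic, in plain Python for illustration.
# In a real Spark Streaming job, the input topic name for each partition
# comes from the Kafka integration's offset-range metadata; here it is
# simply carried as a field on each record.

def output_topic(input_topic):
    """Derive the output topic from the input topic by naming convention.

    Assumes a hypothetical 'input-N' -> 'output-N' scheme; adapt to your
    actual topic names.
    """
    prefix = "input-"
    if not input_topic.startswith(prefix):
        raise ValueError("unexpected topic name: %s" % input_topic)
    return "output-" + input_topic[len(prefix):]

def route(records):
    """Group processed records by the output topic they should be written to."""
    routed = {}
    for rec in records:
        routed.setdefault(output_topic(rec["topic"]), []).append(rec["value"])
    return routed

routed = route([
    {"topic": "input-1", "value": "a"},
    {"topic": "input-2", "value": "b"},
    {"topic": "input-1", "value": "c"},
])
# routed == {"output-1": ["a", "c"], "output-2": ["b"]}
```

In the streaming job itself, this grouping would run inside the output action (for example, per batch in foreachRDD), followed by one Kafka producer write per output topic.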
- Tags:
- Spark
- spark-streaming
06-12-2017 08:29 AM

Thanks Laurent, I agree on that. I am trying to get a detailed understanding of the communication between the HDF and HDP clusters. When HDF NiFi connects to the HDP cluster (via an HDFS processor, a Hive connection, or Spark), does it write anything to the local disks of the HDP data nodes?
06-11-2017 09:24 PM

Hi, I have an HDF cluster with 3 NiFi instances which launches Hive/Spark jobs on an HDP cluster. Usually, NiFi writes all its information to the various repositories on the local machine. My question is: does NiFi write any data or provenance information, or spill anything, to HDP nodes (e.g. the data nodes in the HDP cluster) while accessing the HDFS, Hive, or Spark services? Thanks
06-08-2017 09:55 AM

Hi, I have installed HDF 2.1.2 with a 3-node NiFi 1.1 cluster and am trying to configure a controller service. I created a simple ExecuteHQL processor which depends on a ThriftConnectionPool controller service. I dragged the processor onto the canvas, and as soon as I clicked the gear icon to configure the controller service, it threw a runtime exception in the log:

Attempting request for (anonymous) GET http://<hostname>:9090/nifi-api/flow/process-groups/83490f62-015c-1000-0000-00004da8f033/controller-services
javax.ws.rs.InternalServerErrorException: HTTP 500 Internal Server Error

Has anyone faced this?
01-24-2017 12:24 AM

Thanks... It worked...
01-23-2017 10:44 AM

I am trying to parse my JSON using the NiFi Expression Language jsonPath function (https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#jsonpath). It uses '.' for node traversal, but my JSON has node names containing '.'. Below is a sample:

{"feedName":"trigger_category.childfeed123",
 "feedId":"eff68e0b-a9e6-4c11-b74f-53f161a47faf",
 "dependentFeedNames":["trigger_category.test_shashi"],
 "feedJobExecutionContexts":{"trigger_category.test_shashi":[{"jobExecutionId":23946,
   "startTime":1485145059971,
   "endTime":1485145111733,
   "executionContext":{"feedts":"1485145061170"}}]},
 "latestFeedJobExecutionContext":{"trigger_category.test_shashi":{"jobExecutionId":23946,
   "startTime":1485145059971,
   "endTime":1485145111733,
   "executionContext":{"feedts":"1485145061170"}}}}

I am trying to read feedts, but its parent node 'trigger_category.test_shashi' has a dot ('.') in it. How do I escape the '.' character?
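In JsonPath generally, a key containing a dot can be addressed with bracket notation instead of dot traversal, e.g. $['latestFeedJobExecutionContext']['trigger_category.test_shashi']['executionContext']['feedts'] (whether NiFi's jsonPath function accepts this form should be verified against your NiFi version). The idea is that each quoted segment is an ordinary string key, so no escaping is needed; the same lookup in plain Python, as a sanity check on the sample document:

```python
import json

# Trimmed-down version of the sample JSON from the post above
doc = json.loads("""
{"latestFeedJobExecutionContext":
   {"trigger_category.test_shashi":
      {"jobExecutionId": 23946,
       "executionContext": {"feedts": "1485145061170"}}}}
""")

# Bracket-style lookup: the dotted key is just an ordinary string key,
# so quoting each path segment separately avoids any escaping problem.
feedts = doc["latestFeedJobExecutionContext"]["trigger_category.test_shashi"]["executionContext"]["feedts"]
# feedts == "1485145061170"
```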
11-25-2016 10:24 AM

Hi, I was reading about HDFS encryption at rest and found that when we create an encryption zone in HDFS, the directory must be empty or not yet exist. Is there any way to convert existing directories which already contain data into an encryption zone?

Note: I am using Ranger KMS as the key management server.
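An existing populated directory cannot be converted in place; the usual workaround is to create a new empty directory, make it an encryption zone, copy the data in, and then swap the paths. A command sketch (the key name `mykey` and all paths are illustrative; verify distcp's resulting directory layout before the final rename):

```shell
# Create an encryption key (backed by Ranger KMS in this setup)
hadoop key create mykey

# Create an empty directory and turn it into an encryption zone
hdfs dfs -mkdir /data_encrypted
hdfs crypto -createZone -keyName mykey -path /data_encrypted

# Copy the existing data into the zone, then swap the directories
hadoop distcp /data /data_encrypted
hdfs dfs -mv /data /data_old
hdfs dfs -mv /data_encrypted /data
```

The copy is what actually encrypts the data, since files in an encryption zone are encrypted as they are written.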
11-23-2016 07:05 AM

Nice article!
11-11-2016 06:47 PM

srai: if I do not use the proxy UGI object with the doAs method to log in to Beeline, it works fine with only the real UGI object. As soon as I impersonate a user, it fails.
11-11-2016 09:02 AM

Is it possible to follow the above approach in a Kerberos environment? I tried the above steps to run the job as a proxy user, but it failed with a GSS initialization exception. Any pointers?
11-10-2016 04:19 PM

Hi, I am trying to impersonate a user over Kerberos in order to connect to Hive, but it is giving a GSS init exception.

```java
UserGroupInformation ugi = kinit.generateKerberosTicket(configResources, keytab, principal);
UserGroupInformation ugiProxy = UserGroupInformation.createProxyUser("shashi", ugi.getCurrentUser());
ugiProxy.doAs(new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://quickstart.cloudera:10000/default;principal=hive/quickstart.cloudera@CLOUDERA",
                "shashi", "");
        Statement stmt = con.createStatement();
        ResultSet res = stmt.executeQuery("show databases");
        if (res.next()) {
            System.out.println("DB names ----> " + res.getString(1));
        }
        makeHiveJdbcConnection();
        return null;
    }
});
```

But I am getting the following exception:

```
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://quickstart.cloudera:10000/default;principal=hive/quickstart.cloudera@CLOUDERA: GSS initiate failed
```

Any pointers on this issue?
09-19-2016 04:58 PM

Did you find a solution for this? I am facing the exact same issue.
09-10-2016 03:13 PM

Hi, I am trying to understand the NiFi data flow mechanism. I read that NiFi has a flow file which holds content and metadata (flow file attributes). So I wanted to understand: if I have 1 TB of data on an edge node and pass it to NiFi processors, is NiFi going to load everything into memory for the processors?
09-07-2016 07:42 AM

@Michael Young It worked.
09-05-2016 06:50 PM

@Bryan, thanks for your reply. I will raise it on the developer group.
09-04-2016 01:00 PM

Hi, I created a sample controller service, 'MyControllerService', packaged it into a NAR, and copied it into the NiFi lib directory. I restarted the NiFi service to pick up the change and was able to see MyControllerService in the controller settings. After that, I made a small label change to the controller service and followed the same process, but the change is not taking effect. I even removed the NAR file from nifi/lib just to check whether the service would disappear from the list; that is not happening either, and I don't see any exception in the NiFi log. Any pointers on this issue?
09-03-2016 05:10 PM

@Michael Young thanks for your reply. I have 3 criteria for routing. Just one more thing I would like to understand: you mentioned I need to define an expression as the routing criterion ($twitter.text.contains("elasticsearch")). Can you please tell me where the $twitter.text value will come from? Will that be a property of my parent processor?
09-03-2016 03:00 PM

Hi, I am designing my own custom processor. I added a couple of simple property descriptors with simple non-empty validators. I am looking for a way to allow multiple values in one property descriptor, something like a multi-value selection option, as shown below. Does anyone know how I can achieve this? Thanks, Shashi
09-03-2016 02:52 PM
2 Kudos

Hi, I have a requirement where I need to redirect a flow based on a property value set in the parent processor. The image below shows an example: if the property is set to ABC, then the flow file should go to the ABC processor. I was looking at RouteOnAttribute but am not sure how to use it. Thanks, Shashi
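One common way to do this, sketched under assumptions: RouteOnAttribute evaluates flow file attributes, not processor properties, so the upstream processor (or an UpdateAttribute step in between) must first write the chosen value into an attribute. Assuming a hypothetical attribute named route.type, RouteOnAttribute would get one dynamic property per destination:

```properties
# RouteOnAttribute dynamic properties (attribute name route.type is illustrative)
ABC = ${route.type:equals('ABC')}
XYZ = ${route.type:equals('XYZ')}
```

Each dynamic property becomes a relationship on the processor, so the connection for the ABC relationship can be wired straight to the ABC processor.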
08-06-2016 07:37 AM

Does HDP 2.3.4 support Zeppelin with LDAP? If not, is there any way to do this?
07-30-2016 06:23 AM

Hi, is there any Java API available for creating Ranger HDFS/Hive policies?
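I am not aware of a dedicated standalone Java client library for this, but Ranger Admin exposes a public REST API that Java code can call over HTTP. A hedged sketch of creating an HDFS path policy with curl (the host, credentials, service name, and policy JSON are illustrative; verify the field names against your Ranger version's API documentation):

```shell
# Create an HDFS policy via the Ranger Admin public REST API (v2).
curl -u admin:admin -H "Content-Type: application/json" \
  -X POST http://ranger-host:6080/service/public/v2/api/policy \
  -d '{
        "service": "cluster_hadoop",
        "name": "example_policy",
        "resources": {"path": {"values": ["/data/example"], "isRecursive": true}},
        "policyItems": [{
          "users": ["xyz"],
          "accesses": [{"type": "read",  "isAllowed": true},
                       {"type": "write", "isAllowed": true}]
        }]
      }'
```

From Java, the same call can be made with any HTTP client against the same endpoint.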
07-20-2016 09:18 AM

Hi, is there any Ranger REST API to authorize a user against a policy? For example: say I want to check whether user 'xyz' has permission to access HDFS/Hive.
07-18-2016 02:33 PM

Hi, is there any REST API available to fetch user/group information from Ranger?
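For what it's worth, Ranger Admin has user and group endpoints under /service/xusers; a hedged curl sketch (host and credentials are illustrative, and the exact endpoints should be checked against your Ranger version):

```shell
# List users and groups known to Ranger Admin
curl -u admin:admin http://ranger-host:6080/service/xusers/users
curl -u admin:admin http://ranger-host:6080/service/xusers/groups
```

Both return JSON lists that can be filtered client-side or via query parameters.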
06-06-2016 06:57 AM

Yes. Below is a snapshot of the configuration.
06-03-2016 01:41 PM

Actually, I created a nifi user, which belongs to the nifi group by default. I have specified nifi in all the configuration mentioned in the link.
06-03-2016 01:30 PM

Hi Sunile, I tried the above command. It is giving me a Kerberos GSS exception, as it tries to use the root user for this operation.
06-03-2016 01:24 PM
1 Kudo

Hi, I am trying to create an HDFS admin superuser. I referred to the 'Create HDFS Admin User' article for creating another superuser and followed the exact steps, but after running hdfs dfsadmin -report I get:

report: Access denied for user abc. Superuser privilege is required.

Any pointers? How should I debug this?
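For context, HDFS superuser status normally comes either from being the user the NameNode runs as, or from membership in the group named by dfs.permissions.superusergroup. A sketch of the relevant hdfs-site.xml property, assuming a group named hdfsadmingroup (the group name is illustrative):

```xml
<!-- hdfs-site.xml: members of this group get HDFS superuser privileges -->
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>hdfsadmingroup</value>
</property>
```

A first debugging step would be to confirm that abc is in that group on the NameNode host (group resolution happens there, not on the client) and that the NameNode was restarted or refreshed after the change.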
06-03-2016 01:09 PM

I am using 0.5.1. I gave read and write permission on the lib and repository directories, but it is still showing the same error. Any clue?
06-03-2016 06:37 AM

Hi, I want to run my NiFi application as ec2-user rather than the default nifi user. I changed run.as=ec2-user in bootstrap.conf, but it did not work; NiFi fails to start with the following error:

./nifi.sh start
nifi.sh: JAVA_HOME not set; results may vary
Java home:
NiFi home: /opt/nifi/current
Bootstrap Config File: /opt/nifi/current/conf/bootstrap.conf
Error: Could not find or load main class org.apache.nifi.bootstrap.RunNiFi

Any pointers on this issue? Thanks, Shashi
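A hedged guess at the usual causes: "Could not find or load main class org.apache.nifi.bootstrap.RunNiFi" after changing run.as often means the new user cannot read the NiFi installation (lib/, conf/), and the warning above it means JAVA_HOME is not set for that user. A command sketch (the Java path is illustrative; adjust to your layout):

```shell
# Give ec2-user ownership of the NiFi install (path taken from the log above)
sudo chown -R ec2-user:ec2-user /opt/nifi/current

# Ensure JAVA_HOME is set for the service user, then retry the start
export JAVA_HOME=/usr/lib/jvm/java
/opt/nifi/current/bin/nifi.sh start
```

If NiFi was previously started as another user, the repository directories may also need the same ownership fix.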