Member since: 04-26-2016
Posts: 12
Kudos Received: 3
Solutions: 0
10-01-2017
04:09 PM
Cheers, I cannot understand the role of the hive.llap.execution.mode property with LLAP. I looked for documentation but couldn't find anything useful. Can someone shed some light on this?
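Not from the thread, but a hedged note for context: per the Hive configuration reference (worth verifying against your exact Hive/HDP version), hive.llap.execution.mode controls which parts of a query run inside the LLAP daemons versus ordinary containers. Documented values include none, map, all, and auto:

```sql
-- Hedged example; check the config reference for your Hive version.
SET hive.llap.execution.mode=none;  -- bypass LLAP, run in plain containers
SET hive.llap.execution.mode=map;   -- only map-side work runs in LLAP
SET hive.llap.execution.mode=all;   -- run everything possible in LLAP
SET hive.llap.execution.mode=auto;  -- let Hive decide per query fragment
```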
10-01-2017
04:06 PM
Cheers, when working with nested process groups and load balancing, I need to use a lot of input ports to get data from the higher level into the internal process group. Is there any trick to avoid this? Also, when I connect two NiFi clusters with Site-to-Site (S2S), they end up using a lot of resources even when the connection is not in use. Is this normal behaviour or a known bug? Is there something to tune?
12-14-2016
10:51 PM
Any idea how to update an RDD after the streaming has started?
12-14-2016
08:57 PM
Thanks for the link @Dan Zaratsian. This is what I want to do, but it's missing the most important part for me: how do I update the file used for enrichment? How can another Spark job update the RDD used by Spark Streaming?
12-14-2016
07:14 PM
I have a Spark Streaming job that reads events from Kafka and enriches them. The data used for enrichment is loaded at job startup as an RDD. The issue is that this data changes over time and should be updated. How can Spark Streaming reload this data and keep it up to date?
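Not an answer from the thread, but a minimal sketch of the "reload and swap" pattern this kind of question is usually answered with: rebuild the lookup table when the reference file changes, and consult it once per (micro-)batch. Shown in plain Python for brevity; in Spark Streaming the same idea typically lives inside transform/foreachRDD, re-broadcasting the table when the file changes. The file name and CSV layout here are made up.

```python
import os

class RefreshingLookup:
    """Reloads a key,value CSV file whenever its modification time changes."""
    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.table = {}

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:  # file changed: rebuild the lookup table
            with open(self.path) as f:
                self.table = dict(
                    line.strip().split(",", 1) for line in f if line.strip()
                )
            self.mtime = mtime
        return self.table

def enrich(events, lookup):
    table = lookup.get()  # refresh at most once per batch
    return [(e, table.get(e, "unknown")) for e in events]
```

The key design point is that the driver checks for staleness cheaply (mtime here; a version flag in a database would also work) instead of reloading on every batch.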
11-13-2016
10:41 AM
To be more accurate: when a user calls a service through Knox, he provides a username/password in the curl command. Are these credentials sent unencrypted over the wire, and can they therefore be intercepted? If yes, does SSL provide a solution for this? How can a client authenticate to Knox without providing this information (tokens, or another solution)? I have been reading about SPNEGO but I don't understand how all these protocols interact.
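A small generic illustration of the point behind the question (not Knox-specific; the username and password are placeholders): HTTP Basic credentials are Base64-encoded, not encrypted, so without SSL/TLS anyone capturing the traffic can recover them directly.

```python
import base64

def basic_auth_header(user, password):
    # What curl -u user:password actually sends: Base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Authorization: Basic {token}"

header = basic_auth_header("guest", "guest-password")
# Base64 is trivially reversible -- this is encoding, not encryption:
user_pass = base64.b64decode(header.split()[-1]).decode()
```

This is why Basic auth is only considered acceptable over an encrypted (HTTPS) channel.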
11-12-2016
05:51 PM
Hi,
I am writing a Java application that needs to access HDP services through Knox. I want authentication at Knox to protect my cluster, and I understand that I'll connect Knox to my LDAP server.
How will my application authenticate to Knox in order to access the HDP services? Can I avoid sending a username/password when calling the Knox API? I am on HDP 2.5.
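For reference, a hedged sketch of how a client typically calls a service through Knox (the hostname, topology name, and credentials below are placeholders; the WebHDFS path follows Knox's usual /gateway/{topology}/ URL pattern, which is worth verifying against your topology file):

```shell
# List an HDFS directory through the Knox gateway using Basic auth over HTTPS.
# -k skips certificate verification -- acceptable only for testing.
curl -k -u myuser:mypassword \
  "https://knox-host.example.com:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS"
```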
08-03-2016
09:17 PM
Hi, we are thinking of having a dedicated edge node per project for our data lake. Each project would have a VM on which we install the required clients. Is anyone doing this? Are there any problems or issues we should be aware of with this configuration?
07-27-2016
09:09 PM
2 Kudos
Hi, looking for experience and guidance on edge nodes in a data lake. Do you use the same edge node for all users, or is it recommended to have one edge node per organisation/team? Do you use physical nodes or VMs? Is an edge node required for production clusters? Thanks
07-27-2016
09:02 PM
1 Kudo
Hi, does anyone know a good GUI for HBase for creating and querying tables? Something like the Hue HBase browser, but for HDP? Is there any plan for an HBase Ambari view? Thanks
06-24-2016
05:44 AM
@sujitha sanku Thanks.
I am talking about libraries that don't come with Spark by default, like spark-csv. This code works in spark-shell but not in Zeppelin (same thing if I use PySpark):

import org.apache.spark.sql.SQLContext

val df = sqlContext.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/tmp/sales.csv")
df.printSchema()
val selectedData = df.select("customerId", "itemId")
selectedData.collect()

Should I add an import statement? Why does this work in Spark directly but not in Zeppelin?
06-23-2016
10:14 PM
I want to add a library and use it in Zeppelin (e.g. spark-csv). I succeeded in adding it to Spark and using it by putting my jar on all nodes and adding spark.jars='path-to-jar' in conf/spark-defaults.conf. However, when I call the library from Zeppelin it doesn't work (class not found). From my understanding, Zeppelin does a spark-submit, so if the package is already added to Spark it should work. I also tried adding export
SPARK_SUBMIT_OPTIONS="--jars /path/mylib1.jar,/path/mylib2.jar" to zeppelin-env.sh, but I get the same problem. Has anyone succeeded in adding libraries to Zeppelin? Have you seen this problem?
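One workaround from Zeppelin releases of that era, hedged (the %dep interpreter existed in Zeppelin 0.5/0.6 but was later deprecated, and the artifact coordinates below are an assumption to verify against your Spark/Scala version): load the dependency from a notebook paragraph before the Spark interpreter starts for the first time.

```
%dep
z.reset()
z.load("com.databricks:spark-csv_2.10:1.4.0")
```

This paragraph has to run before any %spark paragraph; if the Spark interpreter has already started, restart it first.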