Member since: 09-14-2015
Posts: 47
Kudos Received: 89
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2241 | 09-22-2017 12:32 PM |
| | 11348 | 03-21-2017 11:48 AM |
| | 1023 | 11-16-2016 12:08 PM |
| | 1441 | 09-15-2016 09:22 AM |
| | 3326 | 09-13-2016 07:37 AM |
09-06-2016
05:08 PM
8 Kudos
@justin kuspa In HDP 2.5, R is provided in Zeppelin via the Livy interpreter. Try using the following interpreter directive:

%livy.sparkr

Note, you will need to make sure R is installed on your machines first. If you haven't already, install it on all nodes with:

yum install R R-devel libcurl-devel openssl-devel

Validate it was installed correctly:

R -e "print(1+1)"

Once it is installed, test out SparkR in Zeppelin with Livy to confirm it is working:

%livy.sparkr
foo <- TRUE
print(foo)
09-05-2016
04:53 PM
2 Kudos
@Piyush Jhawar The Ranger Hive plugin protects Hive data when it is accessed via HiveServer2. When you access these tables using HCatalog in Pig, you are not going through HiveServer2; instead, Pig reads the files directly from HDFS (HCatalog is only used to map the table metadata to the HDFS files in this case). To protect this data, you should also define a Ranger HDFS policy on the underlying HDFS directory that stores the marketingDb.saletable data. To clarify:
- Ranger Hive Plugin - protects Hive data when accessed via HiveServer2 (e.g., a user connecting to Hive via JDBC)
- Ranger HDFS Plugin - protects HDFS files and directories (suitable if users need to access the data outside of HiveServer2 - Pig, Spark, etc.)
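If it helps, here is a minimal sketch of creating such an HDFS policy through the Ranger public v2 REST API. The Ranger host, HDFS service name, warehouse path, group, and credentials below are placeholders, and the payload fields should be double-checked against your Ranger version:

```python
# Hypothetical sketch: create a Ranger HDFS policy over the table's warehouse
# directory via the Ranger public v2 REST API. Host, service name, path, group
# and credentials are placeholders; verify the payload against your Ranger version.
import json
import requests

policy = {
    "service": "cluster_hadoop",  # assumed name of the HDFS service/repo in Ranger
    "name": "saletable-hdfs-protection",
    "resources": {
        "path": {
            "values": ["/apps/hive/warehouse/marketingdb.db/saletable"],  # placeholder warehouse path
            "isRecursive": True,
        }
    },
    "policyItems": [
        {
            "accesses": [{"type": "read", "isAllowed": True},
                         {"type": "execute", "isAllowed": True}],
            "groups": ["marketing_analysts"],  # hypothetical group allowed to read the data
        }
    ],
}

resp = requests.post(
    "http://ranger-host:6080/service/public/v2/api/policy",
    auth=("admin", "admin"),
    headers={"Content-Type": "application/json"},
    data=json.dumps(policy),
)
print(resp.status_code, resp.text)
```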
07-05-2016
05:47 AM
2 Kudos
@Manikandan Durairaj Within the PutHDFS processor, you can set the HDFS owner/group using the 'Remote Owner' and 'Remote Group' properties. Note that this will only work if NiFi is running as a user that has HDFS super-user privilege to change owner/group.
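As a quick sanity check that the Remote Owner/Remote Group values actually took effect, here is a minimal sketch that reads the file status back over WebHDFS (the NameNode host/port and the target path are placeholders):

```python
# Minimal sketch: check the owner/group of a file written by PutHDFS using the
# WebHDFS GETFILESTATUS operation. NameNode host/port and path are placeholders.
import requests

path = "/data/landing/example.csv"
url = f"http://namenode-host:50070/webhdfs/v1{path}?op=GETFILESTATUS"

status = requests.get(url).json()["FileStatus"]
print(status["owner"], status["group"])  # should reflect Remote Owner / Remote Group
```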
06-22-2016
11:37 AM
1 Kudo
@Predrag Minovic - Slight update on Storm: we can now run multiple Nimbus servers: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Users_Guide/content/ch05s05.html This deals with cases where Nimbus can't be automatically restarted (e.g. disk failure on the node). Details of Nimbus HA are outlined here: http://hortonworks.com/blog/fault-tolerant-nimbus-in-apache-storm/
06-17-2016
01:16 AM
@sankar rao To elaborate on the answer provided by @Artem Ervits: the edge node is typically used to install client tools, so it makes sense to install the AWS S3 CLI on the edge node. For adding new users to the cluster, you need to ensure that the new users exist on ALL nodes. The reason is that Hadoop, by default, takes its user/group mappings from the UNIX users on each node. So for Hadoop to 'know' about the new user you've created on the edge node, that same userid should exist on all nodes.
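To illustrate the point, here is a rough sketch of what Hadoop's default shell-based group mapping effectively does on each node (the username below is a placeholder); if the OS account is missing on a node, the lookup simply fails there:

```python
# Minimal sketch: Hadoop's default group mapping shells out to the OS to
# resolve a user's groups, roughly equivalent to running `id -Gn <user>`.
# If the user does not exist on a node, the lookup fails on that node.
import subprocess

def unix_groups(user):
    result = subprocess.run(["id", "-Gn", user], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"user '{user}' does not exist on this node")
    return result.stdout.split()

print(unix_groups("newuser"))  # hypothetical new user created on the edge node
```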
06-09-2016
10:12 AM
2 Kudos
@KC
Your 'InferAvroSchema' is likely capturing the schema as an attribute called 'inferred.avro.schema' (assuming you followed the tutorial here: https://community.hortonworks.com/articles/28341/converting-csv-to-avro-with-apache-nifi.html ).
If that's the case, you can view its output by looking at one of the flowfiles in the queue after 'InferAvroSchema' (List queue > select a flowfile > view attributes > view the inferred.avro.schema property).
If you want to manually define the schema without changing too much of your flow, you can directly replace your 'InferAvroSchema' processor with an 'UpdateAttribute' processor - within the 'UpdateAttribute', define a new property called inferred.avro.schema and paste in your Avro schema as the value (JSON format).
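For reference, here is a minimal sketch of what that attribute value might look like (the record and field names are made-up examples - adjust them to your CSV columns); running it through json.loads is just a quick way to catch syntax mistakes before pasting the schema into UpdateAttribute:

```python
# Hypothetical Avro schema for a simple CSV; record/field names and types are
# examples only. json.loads() is a quick syntax check before pasting the string
# into the 'inferred.avro.schema' property of UpdateAttribute.
import json

schema = """
{
  "type": "record",
  "name": "sale_record",
  "fields": [
    {"name": "id",     "type": "long"},
    {"name": "region", "type": "string"},
    {"name": "amount", "type": ["null", "double"], "default": null}
  ]
}
"""

json.loads(schema)  # raises JSONDecodeError if the schema isn't valid JSON
print("schema is well-formed JSON")
```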
06-09-2016
09:46 AM
4 Kudos
@KC How are you defining your Avro schema? Typically the 'failed to convert' errors occur when the CSV records don't fit the data types defined in your Avro schema. If you're using the 'InferAvroSchema' processor or the Kite SDK to define the schema, it is possible that the inferred schema isn't a true representation of your data (keep in mind that these methods infer the schema from a subset of the data, so if your data isn't very consistent they may misinterpret the field types and hit errors during conversion). If you know the data, you can get around this by manually defining the Avro schema based on the actual data types.
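As a rough illustration of the failure mode (the field names and sample row below are made up), this sketch tries to coerce CSV strings to the types an inferred schema might expect - a value like 'N/A' in a column inferred as a long is exactly the kind of record that fails conversion:

```python
# Minimal sketch: coerce CSV string values to the types an inferred schema
# expects. A stray non-numeric value in a column inferred as 'long' is the
# typical cause of "failed to convert" errors.
csv_row = {"id": "1042", "region": "EMEA", "amount": "N/A"}   # made-up sample record
schema_types = {"id": int, "region": str, "amount": float}    # types an inferred schema might assume

for field, caster in schema_types.items():
    try:
        caster(csv_row[field])
    except ValueError:
        print(f"field '{field}' with value '{csv_row[field]}' does not match the inferred type")
```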
04-15-2016
05:35 AM
1 Kudo
@Indrajit swain The reason you can't see the Actions option is because you are currently logged in as the 'maria_dev' user, which isn't given full Ambari access by default. You can log in as the 'Admin' user account to change this. Note that the default password for the 'Admin' user in the HDP 2.4 sandbox has been changed. Refer to the following thread for details on resetting the admin account password: https://community.hortonworks.com/questions/20960/no-admin-permission-for-the-latest-sandbox-of-24.html
03-13-2016
03:24 AM
1 Kudo
@zhang dianbo Which web browser are you using? I'm able to download the data using the link you've provided without any issues (using Chrome). If you still can't get the box download to work, here is the file you are looking for (uploaded to this post directly): geolocation.zip
12-08-2015
05:42 AM
5 Kudos
How do we manage authorization control over tables within SparkSQL? Will Ranger enforce existing Hive policies when these Hive tables are accessed via SparkSQL?
If not, what is the recommended approach?
Labels:
- Apache HCatalog
- Apache Ranger
- Apache Spark