Member since: 09-24-2015
Posts: 105
Kudos Received: 82
Solutions: 9
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2050 | 04-11-2016 08:30 PM |
| | 1695 | 03-11-2016 04:08 PM |
| | 1676 | 12-21-2015 09:51 PM |
| | 996 | 12-18-2015 10:43 PM |
| | 8506 | 12-08-2015 03:01 PM |
03-29-2016
08:48 PM
1 Kudo
@Wes Floyd @Scott Shaw I just talked with the HDP and Ambari PMs, and they recommended that you don't mix major OS releases (e.g. RHEL 6.X and RHEL 7.X). They did say some people mix OSes from the same family within the same major release (e.g. RHEL 7.X and CentOS 7.X), but while issues are less likely there, even that could cause problems because it isn't tested.
03-17-2016
03:07 AM
1 Kudo
@vpoornalingam Okay, and to confirm: is Python 2.7.8 the highest version allowed?
03-16-2016
02:59 PM
3 Kudos
Hi All, The documentation says Python 2.6 is required, but right below that it says: "Python v2.7.9 or later is not supported due to changes in how Python performs certificate validation." Does that mean you can use Python 2.7.X as long as it's lower than 2.7.9? Thanks,
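To make the question concrete, here is a minimal sketch of the version range as I am reading it. The bounds are an assumption based only on the two documentation statements quoted above (2.6 as the minimum, 2.7.9 as the first unsupported release), and the function name is hypothetical, not something shipped with Ambari.

```python
import sys

# Assumed bounds, taken from the documentation quote in this post:
# 2.6 is the stated requirement, 2.7.9 is the first unsupported release.
MIN_SUPPORTED = (2, 6, 0)
FIRST_UNSUPPORTED = (2, 7, 9)


def is_supported(version_info):
    """Return True if (major, minor, micro) falls inside the assumed range."""
    version = tuple(version_info[:3])
    return MIN_SUPPORTED <= version < FIRST_UNSUPPORTED


if __name__ == "__main__":
    # Report on the interpreter running this script.
    print("supported: %s" % is_supported(sys.version_info))
```

Under this reading, 2.7.8 would pass and 2.7.9 would not.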
03-11-2016
06:53 PM
1 Kudo
@DIALLO Sory What database are you configuring Ambari to use for its repository?
03-11-2016
04:24 PM
@Michael Rife Can you please try going to localhost:8080? Does it bring up Ambari?
03-11-2016
04:11 PM
Hi @DIALLO Sory, from what you have posted, I don't see any errors. It says Ambari Server has successfully started, so you should be able to access Ambari at Hostname:8080. If Ambari Server is indeed down, please send your ambari-server.log so we can better identify the issue. To check whether Ambari Server is running, try:
ps -ef | grep ambari
Cheers, Andrew
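If you would rather test the port than the process list, a small sketch like the one below checks whether anything accepts TCP connections on the Ambari port. The helper name and the 3-second timeout are my own choices for illustration. A successful connection only shows that something is listening, not that the Ambari web UI is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
    conn.close()
    return True


# Example call on the Ambari host (8080 is the port mentioned in this thread):
#   port_is_open("localhost", 8080)
```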
03-11-2016
04:08 PM
1 Kudo
Storing Ranger audit logs in HDFS is beneficial for a couple of reasons:
A) It provides a more scalable, distributed data store, so you can retain logs for much longer.
B) If you already use Hadoop to store your security/audit logs, you can store your Ranger audit logs alongside them in HDFS and better correlate access requests from different systems to help detect anomalies.
Storing in the RDBMS was the original default. It provides better response times on smaller data sets, but it is not as scalable, and you will need to maintain it (e.g. purge/roll logs) on a set frequency.
Cheers, Andrew
03-10-2016
03:08 PM
2 Kudos
@Abdus Sagir Mollah Primary keys can also be useful for bucketing (i.e. partitioning of data), especially if you are trying to leverage the ACID capabilities of Hive. Quote from the blog linked below:
Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys.
Entire blog: http://hortonworks.com/blog/adding-acid-to-apache-hive/
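As a rough illustration of why bucketing on the primary key helps, here is a toy sketch of rows being assigned to buckets by key and kept sorted within each bucket. The modulo hash, the function names, and the sample field name are all assumptions for illustration; this is not Hive's actual hash function or storage layout.

```python
def assign_bucket(primary_key, num_buckets):
    """Toy stand-in for a bucketing hash: integer key modulo bucket count."""
    return primary_key % num_buckets


def bucket_rows(rows, key_field, num_buckets):
    """Group rows into buckets by key, keeping each bucket sorted on the key."""
    buckets = {b: [] for b in range(num_buckets)}
    for row in rows:
        buckets[assign_bucket(row[key_field], num_buckets)].append(row)
    for bucket in buckets.values():
        bucket.sort(key=lambda r: r[key_field])
    return buckets
```

With a layout like this, an update for a given key only needs to look in the one bucket that key maps to.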
02-23-2016
06:30 PM
@jsequeiros See the updated processor configuration screenshot.
02-23-2016
05:18 PM
1 Kudo
Hi All, I used the CSV to JSON XML workflow example to build a flow that waits for a CSV from an HTTP call, parses and labels the CSV values, and then emails the fields and values to me. The flow works, except that the email message doesn't contain the ReplaceText values Field1, Field2, Field3, Field4. Instead, it contains the ExtractText values csv.1, csv.2, csv.3, csv.4. The odd thing is that when we look at data provenance at the PutEmail processor, the input claim has the fields correctly labeled as field1, field2, etc. Any idea what the issue is?
Email message:
Standard FlowFile Metadata:
id = '4af1cc19-c702-42d0-907e-adcc92b04dab'
entryDate = 'Tue Feb 23 16:58:03 UTC 2016'
fileSize = '130'
FlowFile Attributes:
csv.1 = 'one'
path = './'
flowfile.replay.timestamp = 'Tue Feb 23 16:58:03 UTC 2016'
csv.3 = 'three'
filename = '5773072822254662'
restlistener.remote.user.dn = 'none'
csv.2 = 'two'
csv.4 = 'four'
csv = 'one'
restlistener.remote.source.host = 'XXXX'
flowfile.replay = 'true'
uuid = '4af1cc19-c702-42d0-907e-adcc92b04dab'
Template: csvtojson.xml
PutEmail Processor
Labels: Apache NiFi