Member since: 11-09-2016
Posts: 68
Kudos Received: 16
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 810 | 12-07-2017 06:32 PM
 | 247 | 12-07-2017 06:29 PM
 | 532 | 12-01-2017 11:56 AM
 | 3597 | 02-10-2017 08:55 AM
 | 911 | 01-23-2017 09:44 PM
03-18-2018
11:42 AM
It's more likely that you don't have enough RAM; double-check the size of your queue.
03-16-2018
10:22 PM
Hi Dominique, yes, it does audit policy changes/updates and logins. Hope this answers your question.
03-16-2018
10:13 PM
Can you try to tune it by changing the following (try 2 GB, then 4 GB or 6 GB)?
set hive.tez.container.size=2048;
set hive.tez.java.opts=-Xmx2048m;
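If the error persists, scale both values together. As a hedged sketch (the 80% rule of thumb is an assumption here, not from the original post: keep the Tez -Xmx at roughly 80% of the container size), for a 4 GB container:
set hive.tez.container.size=4096;
set hive.tez.java.opts=-Xmx3276m;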
03-16-2018
09:59 PM
1 Kudo
1 - Yes, you can mention only the leaf queue. 2 - User1 can be mapped to the root queue, so he can access them all, while users 2 and 3 can only submit to their own queues. 3 - In this case you don't need to enable ACLs; with queue mappings you can manage this.
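A minimal sketch of such mappings in capacity-scheduler.xml, assuming hypothetical leaf queue names queue2 and queue3 (adjust to your queue layout):
yarn.scheduler.capacity.queue-mappings=u:user2:queue2,u:user3:queue3
yarn.scheduler.capacity.queue-mapping-override.enable=false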
03-16-2018
02:59 PM
Quick command to find the total number of partitions in a Kafka cluster; it can help, for example, with MirrorMaker sizing. Please replace the ZK_SERVER values with your cluster details.
cd /tmp
zookeeper="ZK_SERVER1:2181,ZK_SERVER2:2181,ZK_SERVER3:2181"
sum=0
for i in $(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper $zookeeper); do
  # One "Leader:" line is printed per partition of the topic
  count=$(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper $zookeeper --topic $i | grep Leader | wc -l)
  sum=$((sum + count))
done
echo "total partitions is $sum"
If you want to count only partitions whose topic name matches a filter, replace FILTER accordingly:
zookeeper="ZK_SERVER1:2181,ZK_SERVER2:2181,ZK_SERVER3:2181"
sum=0
for i in $(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper $zookeeper | grep 'FILTER'); do
  count=$(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper $zookeeper --topic $i | grep Leader | wc -l)
  sum=$((sum + count))
done
echo "total partitions is $sum"
- Find more articles tagged with:
- Data Ingestion & Streaming
- FAQ
- Kafka
- mirror-maker
- Script
03-09-2018
02:55 PM
How to get the number of documents indexed:
curl -o /tmp/result.txt --negotiate -u : -X GET "SOLR_SERVER:8886/solr/ranger_audits_shard1_replica1/select?q=*:*&distrib=false"
How to run a delete command via curl (delete data older than 24h):
curl --negotiate -u : "SOLR_SERVER:8886/solr/ranger_audits/update?commit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>evtTime:[* TO NOW-24HOURS]</query></delete>"
How to run an optimize command via curl:
curl --negotiate -u : "SOLR_SERVER:8886/solr/ranger_audits/update?optimize=true"
- Find more articles tagged with:
- Data Processing
- FAQ
- solrcloud
03-09-2018
02:36 PM
## Install the Luna client
Unzip the Luna client, for example under /opt/LUNAHSM, then under /opt/LUNAHSM/linux/64/ run:
sh install.sh all
Follow the instructions below; for the questions asked, answer as follows:
Accept conditions
(y/n) y
Products
Choose Luna Products to be installed
[1]: Luna SA
[2]: Luna PCI-E
[3]: Luna G5
[4]: Luna Remote Backup HSM
[N|n]: Next
[Q|q]: Quit
Enter selection: 1
Products
Choose Luna Products to be installed
*[1]: Luna SA
[2]: Luna PCI-E
[3]: Luna G5
[4]: Luna Remote Backup HSM
[N|n]: Next
[Q|q]: Quit
Enter selection: n
Advanced
Choose Luna Components to be installed
[1]: Luna Software Development Kit (SDK)
*[2]: Luna JSP (Java)
*[3]: Luna JCProv (Java)
[B|b]: Back to Products selection
[I|i]: Install
[Q|q]: Quit
Enter selection: i
List of Luna Products to be installed:
- Luna SA
List of Luna Components to be installed:
- Luna JSP (Java)
- Luna JCProv (Java)
... installation complete
# Now to swap the certificates: copy SERVER.pem from the Luna server to /tmp on your KMS server
cp /tmp/SERVER.pem /usr/safenet/lunaclient/cert/server
#under lunaClient
[root@XXXXX lunaclient]# pwd
/usr/safenet/lunaclient
# Get the local IP of the machine where the client is installed (YY.YY.YY.YY below stands for your local IP)
[root@XXXXX lunaclient]# bin/vtl createCert -n YY.YY.YY.YY
Private Key created and written to: /usr/safenet/lunaclient/cert/client/SERVERkey.pem
Certificate created and written to: /usr/safenet/lunaclient/cert/client/xx.xx.xx.xx.pem
#add a Luna SA Server to the trusted list of servers
[root@XXXXX lunaclient]# bin/vtl addServer -n xx.xx.xx.xx -c /usr/safenet/lunaclient/cert/server/SERVER.pem
New server xx.xx.xx.xx successfully added to server list.
Transfer the generated client .pem to the Luna server.
SWAP COMPLETED.
# Verify the connection:
[root@XXXXX lunaclient]# bin/vtl verify
03-09-2018
02:24 PM
Quick post to add an auto-fix for the Solr infra lock issue. On the Ranger server, under /usr/hdp/current/ranger-admin/contrib/solr_for_audit_setup/conf, edit the file solrconfig.xml: uncomment and change <unlockOnStartup>false</unlockOnStartup> to <unlockOnStartup>true</unlockOnStartup>. Then submit the new XML:
/usr/lib/ambari-infra-solr-client/solrCloudCli.sh --zookeeper-connect-string XXXX:2181/infra-solr --upload-config --config-dir /usr/hdp/current/ranger-admin/contrib/solr_for_audit_setup/conf --config-set ranger_audits --jaas-file
Increase the sleep time from 5 to 30 seconds in /opt/lucidworks-hdpsearch/solr/bin/solr:
sed -i 's/(sleep 5)/(sleep 30)/g' /opt/lucidworks-hdpsearch/solr/bin/solr
Or in the following:
sed -i 's/(sleep 5)/(sleep 30)/g' /usr/lib/ambari-infra-solr/bin/solr
You can also add the following command to the script to clear a stale lock:
hadoop fs -rm /user/infra-solr/ranger_audits/core_node1/data/index/write.lock
- Find more articles tagged with:
- FAQ
- infra
- ranger_audits
- Security
- solr
03-09-2018
02:07 PM
Quick tips to optimise your Infra Solr for Ranger audits using SolrCloud.
1) Change the SolrCloud retention period of the audits. On the Ranger server, under /usr/hdp/current/ranger-admin/contrib/solr_for_audit_setup/conf, edit the file or use sed to replace the 90 days in solrconfig.xml; choose the right retention period (here, 6 hours):
sed -i 's/+90DAYS/+6HOURS/g' solrconfig.xml
sed -i 's/86400/7200/g' solrconfig.xml
2) Change the ZK config by submitting the XML again:
/usr/lib/ambari-infra-solr-client/solrCloudCli.sh --zookeeper-connect-string XXXXXX:2181/infra-solr --upload-config --config-dir /usr/hdp/current/ranger-admin/contrib/solr_for_audit_setup/conf --config-set ranger_audits --jaas-file /usr/hdp/current/ranger-admin/conf/ranger_solr_jaas.conf
Check that it was loaded correctly, either in the Solr UI or with the following command:
# Download the solrconfig.xml from ZooKeeper
/usr/lib/ambari-infra-solr/server/scripts/cloud-scripts/zkcli.sh --zkhost XXXXXX:2181 -cmd getfile /infra-solr/configs/ranger_audits/solrconfig.xml /tmp/solrconfig.xml
3) Restart Infra Solr.
- Find more articles tagged with:
- ambari-infra
- FAQ
- infra
- Ranger
- ranger-audit
- solrcloud
01-03-2018
11:24 AM
1 - It could be that your queues are busy; if you are on FIFO ordering, this may explain the behaviour. 2 - Or your metastore is busy and not responding properly, possibly due to the backend DB (Postgres or MySQL).
01-03-2018
11:14 AM
You have reached the max number of files for one folder, so an ls on this folder may not work. Your process may be creating too many small files; it's worth checking why this is happening. For a quick workaround, you can try the following:
1# Get the total row count of the table.
2# Get the creation script; make sure the table is partitioned accordingly.
3# Take a copy of the table:
create table tablecopy as select * from table;
4# Check the count on the new table:
select count(*) from tablecopy;
5# Check the number of HDFS files:
hdfs dfs -ls /apps/hive/warehouse//table
6# Take a copy of the HDFS folder for further investigation:
export HADOOP_HEAPSIZE="8096"
hdfs dfs -cp /apps/hive/warehouse//table /tmp
=> you may hit OutOfMemoryError: GC overhead limit exceeded
7# Truncate the original table:
truncate table table;
8# Drop the table:
drop table table;
9# Make sure the HDFS folder is removed.
10# Create the table again.
11# Put the data back with an insert (see the sketch below).
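A minimal sketch of step 11, reusing the tablecopy created in step 3 and assuming the recreated original table is named mytable (both names are placeholders):
insert into table mytable select * from tablecopy;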
12-07-2017
06:39 PM
Hi David, yes, it's best to add the additional jars to HDFS under ../oozie/lib/share/sqoop/lib; Oozie will load them when launching Sqoop commands.
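A hedged sketch of the upload, where /tmp/extra-driver.jar, OOZIE_SHARELIB_SQOOP (the sharelib path above, adjusted to your cluster), and OOZIE_SERVER are placeholders; the sharelib then needs a refresh:
hdfs dfs -put /tmp/extra-driver.jar OOZIE_SHARELIB_SQOOP
oozie admin -oozie http://OOZIE_SERVER:11000/oozie -sharelibupdate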
12-07-2017
06:32 PM
Is passwordless SSH set up between the NameNode and the Secondary NameNode (or the standby NameNode, if you are running HA)? Can you check the checkpoints in the NameNode UI?
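Two quick hedged checks, assuming NN_HOST/NN2_HOST placeholders and the default NameNode UI port: test the SSH path, then look at the checkpoint counters exposed over JMX:
ssh hdfs@NN2_HOST hostname
curl -s "http://NN_HOST:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" | grep -i checkpoint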
12-07-2017
06:29 PM
Hi Michael, yes, you can use Blueprints: 1 - Export the blueprint via an API call (GET /api/v1/clusters/:clusterName?format=blueprint). 2 - Change the JSON document. 3 - Import the modified blueprint JSON again (POST ...). For more details about how to use Blueprints, please check here.
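A hedged example of step 1 with curl, where AMBARI_HOST, CLUSTER_NAME, and the admin:admin credentials are placeholders:
curl -u admin:admin -H "X-Requested-By: ambari" "http://AMBARI_HOST:8080/api/v1/clusters/CLUSTER_NAME?format=blueprint" -o /tmp/blueprint.json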
12-07-2017
06:23 PM
Did you try to connect to the znode in ZooKeeper and check its content?
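For example, with the ZooKeeper CLI shipped with HDP (ZK_SERVER and /ZNODE_PATH are placeholders):
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server ZK_SERVER:2181
ls /
get /ZNODE_PATH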
12-07-2017
06:21 PM
Can you paste a screenshot of the view config?
12-07-2017
06:07 PM
First, you can use ldapsearch to run the same query and check the results for debugging. Can you paste the logs from Ranger Usersync? Did you try to synchronise only users, without groups, as in the screenshot?
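A hedged ldapsearch sketch mirroring a typical AD usersync query; the server, bind account, and search base are placeholders:
ldapsearch -H ldap://AD_SERVER:389 -D "BIND_USER@EXAMPLE.COM" -W -b "DC=example,DC=com" "(objectClass=user)" sAMAccountName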
12-04-2017
04:27 PM
The Oozie package name looks wrong in the list of available packages; it's missing the version. Try to install without Oozie first.
12-04-2017
03:26 PM
Did you try to delete the znode and restart Nimbus?
12-01-2017
11:56 AM
Hi Mike, yes, I have seen it once. The EFS must be in the same Availability Zone (if not, you may be charged for data transfer and see some performance degradation), and I think the only downside is network performance: it's less performant than having a local disk. It's OK if your use case permits it; however, if you are planning to use HBase/Storm, where latency is critical, I recommend doing some benchmarking first.
12-01-2017
10:41 AM
kinit PRINCIPAL -kt /etc/security/keytabs/PRINCIPAL.keytab
hive --hiveconf hive.execution.engine=mr
SET hive.execution.engine=tez;
SET tez.queue.name=QUEUE_NAME;
use MON_SCHEMA;
select count(*) from TABLE where id =1;
Starting the Hive CLI with mr opens the prompt more quickly than with the default engine, because it does not request an AM container at startup.
PS: the Hive CLI is not recommended and should be considered deprecated in your production environment; check here for more info.
- Find more articles tagged with:
- Data Processing
- FAQ
- hivecli
12-01-2017
10:31 AM
Complementary article to Hive CLI security, to clarify the risk of using the Hive CLI. The Hive CLI (or Hive shell) is not recommended, and Apache has asked users to move to Beeline, even though the CLI is still supported by Hortonworks (HDP 2.6). The Ranger Hive plugin does not enforce permissions for Hive CLI users; however, the CLI does not systematically bypass all Ranger policies, it bypasses only the Hive policies. The risk therefore concerns all Hive managed tables (the ones under /apps/hive/warehouse/). All external DBs/tables will still be protected by HDFS policies.
12-01-2017
10:21 AM
2 Kudos
When hive.server2.enable.doAs=true, HiveServer2 performs the query processing as the user who submitted the query (usually the user you kinit with; it could be a service account or an account assigned to a team). But if the parameter is set to false, the query will run as the user that the HiveServer2 process runs as, usually hive. This will help to: 1 - Better control the users via Hive Ranger policies. 2 - Better control the ACL mappings for YARN, so you can assign every user to a specific queue.
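A minimal sketch of the corresponding hive-site.xml entry (normally set through Ambari rather than by hand):
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>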
- Find more articles tagged with:
- capacity-scheduler
- FAQ
- Hadoop Core
- Hive
- YARN
08-07-2017
12:26 PM
There is no Knox UI (for the current version, 0.9.x), if that is the question. If you are looking to secure access to the YARN UIs through Knox, it's not supported yet. For HDP 2.6, Knox supports only: the Ambari UI and the Ranger Admin Console.
08-07-2017
10:14 AM
Symptoms: NameNode HA states: active_namenodes = [], standby_namenodes = [], unknown_namenodes = [(u'nn1',
Solution, in order:
1) Ambari is hitting its timeout (5 seconds is the default) and killing the process if the NN takes long to start. You can change the value of the timeout in /var/lib/ambari-server/resources/common-services/HDFS/vXXXX/package/scripts/hdfs_namenode.py
From this:
@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)
To this:
@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
If that's not enough, to this:
@retry(times=50, sleep_time=25, backoff_factor=2, err_class=Fail)
2) It could be ZooKeeper not getting the status of the NN. For this, try to restart ZooKeeper; if it's still not working, then check the content of the znode (hadoop-ha), save the namespace of the NN, delete the content, and restart the NN.
08-07-2017
10:02 AM
If you would like NN1 to be the active NameNode, you can do it by stopping the ZKFC of NN2 via Ambari. After ~10 seconds the failover should happen and the other NN becomes the active one.
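Alternatively, a hedged sketch using the HDFS admin CLI, assuming the HA service IDs are nn1 and nn2 (check yours first with the state query):
hdfs haadmin -getServiceState nn1
hdfs haadmin -failover nn2 nn1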
- Find more articles tagged with:
- FAQ
- Hadoop Core
- HDFS
- namenode
07-28-2017
04:39 PM
Does anyone have an idea how a queue capped at 15% of the cluster can take 50% of it? (This is the default queue; it has a 3% guaranteed capacity and a 15% maximum.) Yet the UI is showing 51% of the cluster, and 1700% of the queue capacity. Any clue?
07-24-2017
02:33 PM
It's more likely that the AD users on this machine are not synced or not created.
Did you create or sync the Hadoop users/groups on this machine?
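A quick hedged check on that machine (USER_NAME is a placeholder): if the user resolves, its groups are listed too:
id USER_NAME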