Member since 07-24-2019
46 Posts
31 Kudos Received
5 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2044 | 01-30-2017 09:57 PM |
| | 14497 | 12-17-2016 12:11 AM |
| | 3403 | 07-06-2016 06:54 PM |
| | 3103 | 07-05-2016 05:41 PM |
| | 4017 | 06-16-2016 04:03 PM |

11-01-2018 09:02 PM (2 Kudos)

Below are some FAQs to help you quickly identify important information for a DPS-DLM deployment.

Prerequisites for DPS and DLM

DB version: Postgres 9.3 to 9.6
OS: RHEL 7.0 and above
Ambari: 2.6.2
HDP: 2.6.5
DistCp should work between the source and target clusters.
The beacon user should be created in AD; there is no option to use a custom user in this DLM 1.1 release.
The onboarded service user for your application should exist in AD and must resolve (id <username>) on both the source and target clusters.
Docker version
Required ports need to be open between the source and target clusters, and also for access to the DPS UI.
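
The port prerequisite can be checked before installation with a small script. The sketch below is only an illustration in Python, not part of DPS or DLM: the function name `port_is_open` is my own, and you would supply the host names and port numbers from your own deployment.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and name-resolution failures
        return False
```

Run it from the source cluster against the target cluster (and the reverse), for example `port_is_open("<target-host>", 8443)`, where the host is a placeholder for your own node.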

Where to install the DLM and DPS software components?

The DLM Engine needs to be installed as an mpack on both clusters using the Ambari server.
The DLM app is a dockerized container that needs to be installed on the DPS host.

Which URL needs to be given to register a cluster in the DPS UI?

The Ambari URL integrated with Knox: http://<>:8443

I'm unable to see the DLM icon in the DPS UI after enabling the DLM component in DPS

The user needs to be part of the Infra-admin role.

Verify DLM Engine install

Verify that Beacon was added as a user to the HDFS superuser group:
hdfs groups beacon
The output should display HDFS (or the value of the dfs.permissions.superusergroup config) as one of the groups.
The beacon user should also be part of the Ranger policies.
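
The group check above can be automated by parsing the command output. This is a minimal sketch, assuming `hdfs groups <user>` prints one line of the form `user : group1 group2 ...`; the function name and the default superuser group `hdfs` are assumptions, so pass your own dfs.permissions.superusergroup value if it differs.

```python
def in_superuser_group(groups_output: str, user: str = "beacon",
                       supergroup: str = "hdfs") -> bool:
    """Check whether `user` is listed in `supergroup` in `hdfs groups` output."""
    for line in groups_output.splitlines():
        # Assumed line format: "<user> : <group> <group> ..."
        name, sep, groups = line.partition(":")
        if sep and name.strip() == user:
            return supergroup in groups.split()
    return False
```

Feed it the text captured from `hdfs groups beacon`; a False result means the beacon user still needs to be added to the superuser group.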
Reference: https://docs.hortonworks.com/HDPDocuments/DLM1/DLM-1.2.0/installation/content/dlm_verify_the_dlm_engine_installation.html

Commonly used commands for troubleshooting

On the DPS host, use the commands below:
docker ps    # check ports, containers, and uptime
docker images
docker exec -it <docker-name> <command>
docker exec -it 029ec380bb3d /bin/ls -alrt /usr/dp-app/
docker logs --follow dp-app
docker exec -it d6390b6c0c50 /bin/ls -alrt /usr/dp-app/

Required machine configuration for DPS and DLM

DPS runs on a separate machine that hosts all the Docker containers. A master-node configuration is recommended for this host, with at least 64 GB of memory.
If you use an external database on the same host, plan for more memory and CPU.

For Hive replication, Beacon automatically creates a deny policy in Ranger on the target cluster. Is this expected behavior or a bug in DLM 1.1?

This is expected: the policy prevents any writes to the target database from happening outside of replication.
The deny policy applies only to the replication target database.

For Hive replication, can we schedule jobs on a per-table basis?

No. In the current DLM 1.1 release, only database-level replication is supported.

Please upvote if this is helpful.
						
					

08-27-2018 07:46 PM (6 Kudos)

Q1. Does Hive LLAP support stored procedures?

UDFs can be used. See:
https://community.hortonworks.com/articles/117833/creating-custom-udf-and-adding-udf-jar-to-hive-lla.html

Question on handling the small files problem

If ACID tables are not used, how should the small files problem in Hive be handled? Is there an archival process to follow, such as creating HAR files?

Alter Table/Partition Concatenate

Version information
In Hive release 0.8.0, RCFile added support for fast block-level merging of small RCFiles using the concatenate command. In Hive release 0.14.0, ORC files added support for fast stripe-level merging of small ORC files using the concatenate command.

ALTER TABLE table_name [PARTITION (partition_key = 'partition_value' [, ...])] CONCATENATE;

If the table or partition contains many small RCFiles or ORC files, the above command merges them into larger files. For RCFile the merge happens at block level, whereas for ORC files it happens at stripe level, which avoids the overhead of decompressing and decoding the data.

Question on mutations

So if we need to apply a thousand mutations, would this be a thousand operations rather than one bulk operation?

Please refer to the lock manager section: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/lock-manager.html

Question on cache eviction

Can LLAP be used to read more data than can fit into memory?

Yes. It has an eviction policy and stores the data in compressed format.

Question on data transfer

As a specific example, if a "select *" is performed on a very large table, can the application receive that data as a "stream", or does some component (LLAP, HiveServer2, etc.) need to hold the entire dataset in memory?

All the results are streamed to HDFS and then streamed to the client from there, so there is no memory constraint.

Question on query result

Relatedly, does LLAP send results back as they become available (like HBase scan results) or only once the query completes?

It returns the results once the SQL query completes.

Question on compaction

We may benefit from Hive's ACID feature to handle "deltas". The advantages seem to be:
• It would allow the updated data to be available in queries before a compaction has taken place. "You can update the data; compaction should be transparent."
• A compaction implementation already exists, so no bespoke implementation is needed. Hive has a built-in compaction mechanism (major and minor).

Question on Spark and Hive LLAP integration

• Can LLAP be leveraged to serve data to Spark jobs efficiently? I.e., can LLAP inform Spark about the partitioning of the data it will provide? Or is it a very coarse, plain JDBC interface?
The LLAP Spark context is in Tech Preview (TP).

Question on cache eviction algorithm

It seems the Hive Metastore does not cache much data, which means each query for metadata, including statistics, goes through the "datanucleus" ORM layer. Is this correct?

LLAP has a metadata cache.

Caching

The daemon caches metadata for input files, as well as the data. The metadata and index information can be cached even for data that is not currently cached. Metadata is stored in process in Java objects; cached data is stored in the format described in the I/O section, and kept off-heap (see Resource management).
    Eviction policy. The eviction policy is tuned for analytical workloads with frequent (partial) table-scans. Initially, a simple policy like LRFU is used. The policy is pluggable.
    Caching granularity. Column-chunks are the unit of data in the cache. This achieves a compromise between low-overhead processing and storage efficiency. The granularity of the chunks depends on the particular file format and execution engine (Vectorized Row Batch size, ORC stripe, etc.).
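
To make the LRFU idea mentioned above concrete, here is a toy sketch in Python. It is not LLAP's implementation: the class name, the logical clock, and the `lam` parameter are my own choices for illustration. Each entry keeps a combined recency and frequency (CRF) score; every access adds 1 to the score after decaying the old value by (1/2)^(lam * age), and the entry with the lowest decayed score is evicted.

```python
class LRFUCache:
    """Toy LRFU cache: entries are ranked by a combined recency/frequency score."""

    def __init__(self, capacity: int, lam: float = 0.1):
        self.capacity = capacity
        self.lam = lam      # near 0 behaves like LFU, larger values lean toward LRU
        self.clock = 0      # logical time, advanced on every get/put
        self.entries = {}   # key -> (value, crf_score, last_access_time)

    def _decay(self, age: int) -> float:
        # Weight of a reference made `age` ticks ago
        return 0.5 ** (self.lam * age)

    def get(self, key):
        self.clock += 1
        if key not in self.entries:
            return None
        value, crf, last = self.entries[key]
        # Decay the old score to "now", then count this reference
        crf = 1.0 + crf * self._decay(self.clock - last)
        self.entries[key] = (value, crf, self.clock)
        return value

    def put(self, key, value):
        self.clock += 1
        if key in self.entries:
            _, crf, last = self.entries[key]
            crf = 1.0 + crf * self._decay(self.clock - last)
        else:
            if len(self.entries) >= self.capacity:
                self._evict()
            crf = 1.0
        self.entries[key] = (value, crf, self.clock)

    def _evict(self):
        # Compare all scores decayed to the current time and drop the smallest
        victim = min(
            self.entries,
            key=lambda k: self.entries[k][1] * self._decay(self.clock - self.entries[k][2]),
        )
        del self.entries[victim]
```

With a small `lam`, an entry that was read many times survives even if another entry was touched more recently, which suits workloads that rescan the same column chunks.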
A bloom filter is automatically created to provide dynamic runtime filtering.

Question on running Hive LLAP on specific nodes

In Ambari, how do I specify which nodes the LLAP daemons run on?

Running Hive LLAP on specific nodes using YARN node labels:
https://community.hortonworks.com/content/kbentry/170868/running-llap-on-specific-nodes-using-yarn-node-lab.html

How quickly will I know when an LLAP query execution is going to fail?

If the execution mode is set to hive.llap.execution.mode=only, a query that cannot run in LLAP fails immediately, before it is submitted to LLAP.

How to cancel LLAP queries that are in the RUNNING state?

First check YARN to see whether LLAP has enough containers allocated:
 1. yarn top
 2. yarn application -kill <appid>

How to modify LLAP log options

There is a flag in the Hive config section of the Ambari UI.
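
Returning to the small-files question in this post: if many partitions need merging, the ALTER TABLE ... CONCATENATE statements can be generated rather than typed by hand. This is only a sketch; the function name is mine, the table and partition names in any call are hypothetical, and the values are not escaped, so it assumes trusted input.

```python
from typing import Dict, Optional


def concatenate_statement(table: str, partition: Optional[Dict[str, str]] = None) -> str:
    """Build an ALTER TABLE ... CONCATENATE statement for a table or one partition."""
    stmt = f"ALTER TABLE {table}"
    if partition:
        # Render the partition spec as key = 'value' pairs
        spec = ", ".join(f"{key} = '{value}'" for key, value in partition.items())
        stmt += f" PARTITION ({spec})"
    return stmt + " CONCATENATE;"
```

The generated statements can then be saved to a file and run through Beeline.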
						
					

08-27-2018 07:36 PM (4 Kudos)

Hive LLAP - a one-page architecture overview
https://community.hortonworks.com/articles/149894/llap-a-one-page-architecture-overview.html

Hive - Understanding concurrent sessions + queue allocation + preemption
https://community.hortonworks.com/articles/56636/hive-understanding-concurrent-sessions-queue-alloc.html

Hive LLAP dashboards
https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-operations/content/grafana_hive_llap_dashboards.html

Hive LLAP logs info
https://community.hortonworks.com/articles/149896/llap-debugging-overview-logs-uis-etc.html

Monitoring LLAP metrics
http://www.kartikramalingam.com/hive-llap/

Debugging a Hive LLAP query
https://community.hortonworks.com/articles/149896/llap-debugging-overview-logs-uis-etc.html

Question on Hive LLAP benchmarks
Are there any Hive LLAP benchmarks you can share?
https://hortonworks.com/blog/3x-faster-interactive-query-hive-llap/

LLAP tuning
Here is an excellent article on LLAP tuning:
https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html
						
					

10-12-2017 12:19 AM (1 Kudo)

GetTCP: connects over TCP to the provided endpoint(s). Received data is written as content to the FlowFile.
ListenTCP: listens for incoming TCP connections and reads data from each connection, using a line separator as the message demarcator.

Ref: https://nifi.apache.org/docs.html
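
To picture what "line separator as the message demarcator" means for ListenTCP, here is a small Python sketch. It is not NiFi code; the function name and the default separator are assumptions used only to show how a received byte stream splits into complete messages plus a leftover partial message.

```python
def split_messages(buffer: bytes, separator: bytes = b"\n"):
    """Split received bytes into complete messages and the unfinished remainder."""
    # Everything before the last separator is complete; the tail may be partial
    *messages, remainder = buffer.split(separator)
    return messages, remainder
```

The remainder would be kept and prepended to the next chunk read from the same connection.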
						
					

09-30-2017 10:18 AM (2 Kudos)

Thanks to @Matt Clarke for resolving this major issue.

In a typical customer environment, username case can be a challenge when deploying an HDF cluster and enabling LDAPS authentication.

In Active Directory the user ID exists in upper case (for example, employee ID X1122), but when I imported users into Ranger with lowercase=true, all imported users were displayed in lower case (x1122).

I created all the required policies for Kafka and NiFi. The smoke tests for Kafka passed, but the smoke tests for NiFi failed because NiFi respects only the AD value (X1122) and has no built-in logic to do a case conversion.
All the NiFi Ranger policies have the user ID as x1122, so they do not apply to the identity X1122, and NiFi Ranger plugin authorization does not work correctly.
As a result, NiFi Ranger authorization failed for access to view the NiFi UI under the /flow Ranger policy.

NiFi does not have an option to change the case of the returned results, but the ldap-provider has two configuration options for "Identity Strategy":

1. (default) USE_DN --> This strategy uses the user's complete DN, as returned by LDAP upon successful authentication, for authorization.
2. USE_USERNAME --> This strategy uses the username as typed on the login screen for authorization, upon successful authentication with LDAP.

No matter which method of authentication is used, the value chosen above is passed through any identity mapping patterns configured in NiFi, and the result is sent to the configured authorizer. The authorizer in this case is Ranger.

We resolved this issue by using USE_USERNAME. As long as the user logs in with an all-lowercase username, it works.

We also changed the user search filter to the following (inside the XML file, the & of the LDAP filter is written as &amp;):
<property name="User Search Filter">(&amp;(sAMAccountName={0})(memberOf=CN=hwx,OU=Groups,OU=Global,OU=XX,DC=XX,DC=XX))</property>
and the proper search base needed to be:
<property name="User Search Base">OU=Users,OU=XX,DC=XX,DC=XX</property>
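
The reasoning in this post can be sketched in a few lines of Python. This is a simplified model, not NiFi or Ranger code: the function names are mine, the DN in the usage note is hypothetical, and authorization is reduced to an exact, case-sensitive match against the users named in a policy.

```python
def resolve_identity(strategy: str, typed_username: str, ldap_dn: str) -> str:
    """Pick the identity handed to the authorizer for a given Identity Strategy."""
    if strategy == "USE_DN":
        return ldap_dn            # full DN returned by LDAP
    if strategy == "USE_USERNAME":
        return typed_username     # exactly what the user typed at login
    raise ValueError(f"unknown identity strategy: {strategy}")


def is_authorized(identity: str, policy_users: set) -> bool:
    # Case-sensitive comparison: "X1122" and "x1122" are different identities
    return identity in policy_users
```

With a policy that names x1122, logging in as x1122 under USE_USERNAME matches, while X1122 or a full DN such as CN=X1122,OU=Users,OU=XX,DC=XX,DC=XX does not.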
						
					

09-16-2017 11:52 PM (2 Kudos)

HIVE Beeline:
============

Binary mode
> !connect 'jdbc:hive2://prod07.app.hwx.com:10000/;transportMode=binary'

HTTP mode
beeline -u 'jdbc:hive2://prod07.app.hwx.com:10001/;transportMode=http;httpPath=cliservice'

In an HS2 HA environment with ZooKeeper auto-discovery mode
!connect jdbc:hive2://prod09.app.hwx.com:2181,prod10.app.hwx.com:2181,prod11.app.hwx.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

In a Kerberos environment - Hive Beeline command
!connect 'jdbc:hive2://prod07.app.hwx.com:10001/default;principal=hive/prod07.app.hwx.com@EXAMPLE.COM;transportMode=http;httpPath=cliservice'

Knox with Beeline
!connect jdbc:hive2://knox101.app.hwx.com:8443/default;transportMode=http;httpPath=gateway/default/hive;ssl=true

Knox with WebHDFS
curl -iku raj_ops -X GET 'https://knox101.app.hwx.com:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS'

If this article helps you, please upvote it.
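
All the connection strings in this post follow the same pattern: host list, database, then semicolon-separated settings. The helper below is only a sketch of that pattern in Python; the function name is my own and it does no validation of the settings you pass.

```python
def hive_jdbc_url(hosts, database: str = "", **settings) -> str:
    """Assemble a jdbc:hive2 URL from host:port entries, a database, and settings."""
    host_part = ",".join(hosts)
    # Settings are appended in the order given, each as ;key=value
    params = "".join(f";{key}={value}" for key, value in settings.items())
    return f"jdbc:hive2://{host_part}/{database}{params}"
```

For example, `hive_jdbc_url(["prod07.app.hwx.com:10001"], transportMode="http", httpPath="cliservice")` rebuilds the HTTP-mode URL shown above.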
						
					

04-27-2017 06:49 PM (1 Kudo)

Below are the implementation steps for integrating Knox with a load balancer, assuming your load balancer is already set up.

SSL termination and sticky sessions are handled on the load balancer side: sticky sessions should be enabled, and currently the SSL connection terminates at the load balancer, which re-encrypts the traffic internally and routes it to one of the Knox instances.

JKS file creation

Open the load balancer URL in the IE browser.
Example: https://hadoop-knox.dev.XXXXXX.com/
Click the lock symbol, click View certificates, and open the Certification Path tab.
Choose the root certificate, then click View Certificate > Details > Copy to File > Base-64 encoded X.509 format, and save it as a .pem file.
Choose the intermediate issuer CA, then click View Certificate > Details > Copy to File > Base-64 encoded X.509 format, and save it as a .pem file.
Choose the load balancer certificate, then click Details > Copy to File > Base-64 encoded X.509 format, and save it as a .pem file.

Copy these 3 files to the Knox edge node. I copied them to the certfiles folder:
/tmp/knoxhacerts/new/certfiles/lb-rootca.pem
/tmp/knoxhacerts/new/certfiles/lb-intermediate-issuer.pem
/tmp/knoxhacerts/new/certfiles/hadoop-knox-dev-lb.pem

Create a new JKS file as below:
cp /usr/hdp/current/knox-server/data/security/keystores/gateway.jks /tmp/knoxhacerts/dev-knox-test-1.jks
keytool -storepasswd -keystore /tmp/knoxhacerts/dev-knox-test-1.jks
Enter the current master secret password, then set the new password.

Import all the PEM-encoded files into this JKS file:
keytool -import -alias rootca-lb -keystore dev-knox-test-1.jks -file /tmp/knoxhacerts/new/certfiles/lb-rootca.pem
keytool -import -alias intca-lb -keystore dev-knox-test-1.jks -file /tmp/knoxhacerts/new/certfiles/lb-intermediate-issuer.pem
keytool -import -alias dev-lb -keystore dev-knox-test-1.jks -file /tmp/knoxhacerts/new/certfiles/hadoop-knox-dev-lb.pem

CA cert chain for ODBC

Open the files below in a text editor such as Notepad and copy their contents into one merged chain file (merge-cacertchain.crt):
/tmp/knoxhacerts/new/certfiles/lb-rootca.pem
/tmp/knoxhacerts/new/certfiles/lb-intermediate-issuer.pem
/tmp/knoxhacerts/new/certfiles/hadoop-knox-dev-lb.pem

Verification step

Use SSLPoke to verify connectivity. The Java class SSLPoke lets you check whether your truststore contains the right certificates: it connects to an SSL service, sends a byte of input, and watches the output.

Download SSLPoke.class (https://confluence.atlassian.com/kb/files/779355358/779355357/1/1441897666313/SSLPoke.class).
If you have the SSLPoke.java source instead, compile it first:
javac SSLPoke.java
Execute the class as shown below, changing the host and port appropriately:
<JAVA_HOME>/bin/java SSLPoke jira.example.com 443

Failed scenario

A failed connection produces the output below:
/usr/bin/java SSLPoke jira.example.com 443
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Happy path

devenap02.dev.abc.net# java -Djavax.net.ssl.trustStore=/tmp/knoxhacerts/new/dev-knox-test-1.jks SSLPoke hadoop-knox.dev.XXXXXX.com 443
Successfully connected
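
The manual copy-and-paste step for the ODBC CA chain file could also be scripted. The sketch below is one way to do it in Python, not the method used in this article: the function name is mine, and it simply concatenates the PEM files in the order you list them after checking that each one contains a certificate block.

```python
from pathlib import Path


def merge_pem_files(pem_paths, output_path):
    """Concatenate PEM certificate files, in the given order, into one chain file."""
    parts = []
    for pem_path in pem_paths:
        text = Path(pem_path).read_text().strip()
        if "-----BEGIN CERTIFICATE-----" not in text:
            raise ValueError(f"{pem_path} does not look like a PEM certificate")
        parts.append(text)
    # One certificate block after another, each ending with a newline
    Path(output_path).write_text("\n".join(parts) + "\n")
    return output_path
```

Pass the three .pem paths from the certfiles folder and merge-cacertchain.crt as the output path.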

Please upvote if this article helps.
						
					

04-27-2017 06:45 PM (1 Kudo)

To verify the certificate contents of the Knox server, execute the command below:
openssl s_client -showcerts -connect 127.0.0.1:8443

If developers want to connect to Knox with SSL enabled, copy the certificate contents from the output of the above command into a knox.crt file and import it into a keystore by executing the command below:
keytool -import -keystore myLocalTrustStore.jks -file knox.crt

Developers can then connect as below:
beeline> !connect "jdbc:hive2://hadoop-knox.dev.XXXX.com:8443/default;transportMode=http;httpPath=gateway/default/hive;ssl=true;sslTrustStore=/tmp/knoxhacerts/new/myLocalTrustStore.jks;trustStorePassword=knoxdev"

Hive JDBC
jdbc:hive2://{gateway-host}:{gateway-port}/;ssl=true;sslTrustStore={gateway-trust-store-path};trustStorePassword={gateway-trust-store-password};transportMode=http;httpPath={gateway-path}/{cluster-name}/hive

To list the imported certificates in a JKS file, execute the command below:
keytool -v -list -keystore gateway.jks

Command to create a new truststore, myNewTrustStore.jks:
keytool -import -alias knox -keystore ./myNewTrustStore.jks -file ./knox-cert.pem
Here knox-cert.pem is the knox.crt certificate you saved, in PEM format.

To change the SSL certificate for Knox, see:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Security_Guide/content/knox_ca_signed_certificates_production.html

Please upvote if this article helps.
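
Copying the certificate out of the openssl s_client output by hand is error-prone, so the step can be scripted. This is a minimal sketch in Python, assuming you have saved the command output as text; the function name is my own, and it only pulls out the BEGIN/END CERTIFICATE blocks without validating them.

```python
import re

# Matches one PEM certificate block, including its BEGIN/END marker lines
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def extract_certificates(s_client_output: str):
    """Return every PEM certificate block found in the given text, in order."""
    return PEM_BLOCK.findall(s_client_output)
```

The blocks come back in the order they appear in the output; write the one you need to knox.crt before running the keytool import.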
						
					

03-13-2017 09:59 PM

@suresh krish

Follow the steps mentioned here:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/_optional_install_a_new_mit_kdc.html

Please also refer to this article:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/_kerberos_overview.html
						
					

02-03-2017 07:49 PM

							 http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_Security_Guide/content/ranger_rest_api_create_policy.html 
						
					