Member since 10-06-2015

- 273 Posts
- 202 Kudos Received
- 81 Solutions

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 4119 | 10-11-2017 09:33 PM |
| | 3649 | 10-11-2017 07:46 PM |
| | 2615 | 08-04-2017 01:37 PM |
| | 2245 | 08-03-2017 03:36 PM |
| | 2284 | 08-03-2017 12:52 PM |

07-31-2017 04:20 PM · 2 Kudos

@Muhammad Imran Tariq No, Atlas requires the Titan graph database, which supports only BerkeleyDB, HBase, and Cassandra as storage backends.

http://atlas.apache.org/Architecture.html
http://titan.thinkaurelius.com/
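
For reference, the storage backend is selected in Atlas's `atlas-application.properties`. Below is a minimal sketch assuming a local BerkeleyDB setup; the values shown are illustrative placeholders, not a definitive configuration:

```
# atlas-application.properties (illustrative values)
# Titan graph storage backend: berkeleyje, hbase, or cassandra
atlas.graph.storage.backend=berkeleyje
atlas.graph.storage.directory=/var/lib/atlas/data/berkeley

# For an HBase backend instead, point at the ZooKeeper quorum:
#atlas.graph.storage.backend=hbase
#atlas.graph.storage.hostname=localhost
```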

07-24-2017 02:42 PM

For a comparison of compression formats, take a look at this link:

http://comphadoop.weebly.com/

07-19-2017 06:16 PM

@Varun R Take a look at the articles below; they cover Tez performance tuning as well as an overview of how it works.

https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html
https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

07-14-2017 06:43 PM

You can query the API for entities created/modified after a certain date. You would do this by running a DSL query against the "createTime" attribute via REST. For example, if you would like to query for Hive tables created/modified after 2017-04-18 6:49 PM, your REST call would look like:

http://localhost:21000/api/atlas/v2/search/dsl?query=createTime%3E'2017-04-18T18%3A49%3A44.000Z'&typeName=hive_table

The date format is as follows:

{yyyy}-{mm}-{dd}T{hh}:{mm}:{ss}.{zzz}Z
{year}-{month}-{day}T{hours}:{minutes}:{seconds}.{milliseconds}Z

e.g. 2017-04-18T18:49:44.000Z

You can also use a subset of the date rather than the entire string. For example, you can query by year only (2017), full date only (2017-04-18), or date and time only (2017-04-18T18:49:44).
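
To make this concrete, here is a minimal sketch of the same DSL date query issued from Python with the `requests` library; the host, credentials, and timestamp are placeholders, and `requests` handles the URL encoding (the %3E/%3A escapes in the URL above):

```python
import requests

# Sketch: find hive_table entities created after a given timestamp via the
# Atlas DSL search endpoint. Host and credentials are placeholders.
ATLAS_DSL_URL = "http://localhost:21000/api/atlas/v2/search/dsl"

params = {
    "query": "createTime > '2017-04-18T18:49:44.000Z'",
    "typeName": "hive_table",
}

resp = requests.get(ATLAS_DSL_URL, params=params, auth=("admin", "admin"))
resp.raise_for_status()

# The result is an AtlasSearchResult; entity headers live under "entities".
for entity in resp.json().get("entities", []):
    print(entity.get("attributes", {}).get("qualifiedName"))
```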

07-14-2017 06:04 PM

I'm using HDP 2.6 and I'm trying to use DSL to query Atlas for tables created after a certain date. So far, I've tried querying based on the attribute "createTime", but I am unable to figure out the date format used (milliseconds, seconds, etc.). It seems that it only takes a year (>2017, <2018) but nothing more. Does anyone have any idea how to query for tables created/modified after a certain date/time?

Labels:
- Apache Atlas

07-10-2017 02:39 PM · 1 Kudo

@subash sharma Which version of HDP are you using? Even though column-level lineage is advertised as available with Atlas 0.8 on the Apache page, it only became GA with HDP 2.6.1 rather than HDP 2.6.0. The delay was to ensure it works with the other appropriate HiveQL commands beyond CTAS.

06-29-2017 05:33 PM

Adding to Sonu's response:

Moving to Hadoop from a BI/EDW background is certainly a very common path. Those coming from that background usually find themselves more comfortable with Hive as an entry point. Hive provides an abstraction layer on top of MapReduce/Tez and is based on an SQL-like syntax that is ANSI compliant. It also has the advantage of providing a JDBC/ODBC connector, so most industry BI tools, such as Tableau, Qlik, MicroStrategy, etc., can integrate and interact with Hive. This means that business analysts may continue to use the tools they are already familiar with while leveraging the power of Hadoop in the background.

I would recommend you start by looking at Hive. Once comfortable with it, you can start to explore Hive data modelling and optimization, and then branch out to the other areas that Sonu recommended. I've also seen people in the field focus their entire career/job around just Hive.

Take a look at the link below for an introduction to Hive. There are plenty of internet resources and books that you can leverage to advance your knowledge. Hortonworks also provides developer training that covers an introduction to Hive as well as other engines/tools.

https://hortonworks.com/tutorial/how-to-process-data-with-apache-hive/
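
To illustrate the JDBC/ODBC point, here is a minimal sketch of querying Hive from Python using the third-party PyHive library; the host, port, username, and the `sales` table are placeholder assumptions:

```python
from pyhive import hive  # third-party: pip install pyhive

# Connect to HiveServer2; host/port/username are illustrative placeholders.
conn = hive.connect(host="localhost", port=10000, username="hive")
cursor = conn.cursor()

# BI tools speaking JDBC/ODBC issue SQL just like this under the hood.
# "sales" is a hypothetical table used only for illustration.
cursor.execute("SELECT category, COUNT(*) AS cnt FROM sales GROUP BY category")
for category, cnt in cursor.fetchall():
    print(category, cnt)

cursor.close()
conn.close()
```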

06-28-2017 01:10 PM · 1 Kudo

@Smart Data I am assuming that you are trying to create entities and lineage for HDFS files. If so, then yes, you would need to use the REST API to create the lineage. You can use the API to create the entities themselves rather than going through Kafka. If you're using HDP 2.6.1, you can also create your entities through the Atlas UI, as per the link below:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_data-governance/content/ch_atlas_searching_and_viewing_entities.html#atlas_manually_creating_entities

Finally, below is a step-by-step example of creating entities and lineage for an HDFS file that is picked up and processed by Spark, with the results written back to HDFS. It will give you a good idea of how the APIs may be leveraged:

https://community.hortonworks.com/content/kbentry/91237/creating-custom-types-and-entities-in-atlas.html
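
To give a flavor of the API calls involved, here is a minimal sketch that registers an HDFS file as an `hdfs_path` entity through the Atlas v2 REST API; the host, credentials, cluster name, and paths are placeholder assumptions:

```python
import requests

# Sketch: create an hdfs_path entity via POST /api/atlas/v2/entity.
# Host, credentials, and all names below are illustrative placeholders.
ATLAS_ENTITY_URL = "http://localhost:21000/api/atlas/v2/entity"

entity = {
    "entity": {
        "typeName": "hdfs_path",
        "attributes": {
            # Common qualifiedName convention: path@clusterName
            "qualifiedName": "hdfs://mycluster/data/raw/events@mycluster",
            "name": "events",
            "path": "hdfs://mycluster/data/raw/events",
        },
    }
}

resp = requests.post(ATLAS_ENTITY_URL, json=entity, auth=("admin", "admin"))
resp.raise_for_status()
print(resp.json())  # the mutation response includes the assigned GUID
```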

06-28-2017 01:06 PM

@subash sharma As you are already aware, there is currently no out-of-the-box integration between HDF/NiFi and Atlas. This is a roadmap item and has been documented in the JIRA below:

https://issues.apache.org/jira/browse/NIFI-3709

However, you can use the Atlas REST API to create the HDF entities and lineage. Below are a couple of examples that show how this may be done:

https://community.hortonworks.com/repos/66014/nifi-atlas-bridge.html
https://community.hortonworks.com/repos/39432/nifi-atlas-lineage-reporter.html

As always, if you find this post helpful, don't forget to "accept" the answer.
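
To complement the entity example in the earlier answer, here is a hedged sketch of the lineage half of the job: creating a Process entity that links an input HDFS path to an output one, which is essentially what a NiFi-to-Atlas bridge does per flow. The type choice, qualified names, and credentials are placeholder assumptions:

```python
import requests

# Sketch: record lineage with a Process entity whose inputs/outputs reference
# existing hdfs_path entities by qualifiedName. Names are placeholders.
ATLAS_ENTITY_URL = "http://localhost:21000/api/atlas/v2/entity"

def hdfs_ref(qualified_name):
    """Reference an existing hdfs_path entity by its unique attribute."""
    return {
        "typeName": "hdfs_path",
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }

process = {
    "entity": {
        "typeName": "Process",
        "attributes": {
            "qualifiedName": "nifi.move_events@mycluster",
            "name": "move_events",
            "inputs": [hdfs_ref("hdfs://mycluster/in/events@mycluster")],
            "outputs": [hdfs_ref("hdfs://mycluster/out/events@mycluster")],
        },
    }
}

resp = requests.post(ATLAS_ENTITY_URL, json=process, auth=("admin", "admin"))
resp.raise_for_status()
```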