Member since: 09-11-2015

269 Posts | 281 Kudos Received | 55 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4187 | 03-15-2017 07:12 AM |
| | 2503 | 03-14-2017 07:08 PM |
| | 3027 | 03-14-2017 03:36 PM |
| | 2482 | 02-28-2017 04:32 PM |
| | 1713 | 02-28-2017 10:02 AM |

Posted 10-18-2016 03:56 AM
1 Kudo
@Jasper Below is an example of associating a trait with an entity. It is essentially an HTTP POST request whose body contains the trait information.

curl -v 'http://172.22.73.216:21000/api/atlas/entities/f9432735-5972-4525-b3fd-17544babd5ee/traits' \
  -H 'Cookie: JSESSIONID=159cepww1tnzaczw2f4wbhhdu' \
  -H 'Origin: http://172.22.73.216:21000' \
  -H 'Accept-Encoding: gzip, deflate' \
  -H 'X-XSRF-HEADER: ""' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36' \
  -H 'Content-Type: application/json' \
  -H 'Accept-Language: en-US,en;q=0.8' \
  -H 'Accept: application/json, text/javascript, */*; q=0.01' \
  -H 'Referer: http://172.22.73.216:21000/index.html' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -H 'Connection: keep-alive' \
  --data-binary '{"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct","typeName":"addMultipleTraits-3-mho2upf7jh","values":{}}' \
  --compressed
*   Trying 172.22.73.216...
* Connected to 172.22.73.216 (172.22.73.216) port 21000 (#0)
> POST /api/atlas/entities/f9432735-5972-4525-b3fd-17544babd5ee/traits HTTP/1.1
> Host: 172.22.73.216:21000
> Cookie: JSESSIONID=159cepww1tnzaczw2f4wbhhdu
> Origin: http://172.22.73.216:21000
> Accept-Encoding: gzip, deflate
> X-XSRF-HEADER: ""
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36
> Content-Type: application/json
> Accept-Language: en-US,en;q=0.8
> Accept: application/json, text/javascript, */*; q=0.01
> Referer: http://172.22.73.216:21000/index.html
> X-Requested-With: XMLHttpRequest
> Connection: keep-alive
> Content-Length: 134
>
* upload completely sent off: 134 out of 134 bytes
< HTTP/1.1 201 Created
< Date: Tue, 18 Oct 2016 03:47:09 GMT
< Location: http://172.22.73.216:21000/api/atlas/entities/f9432735-5972-4525-b3fd-17544babd5ee/traits/f9432735-5972-4525-b3fd-17544babd5ee
< Content-Type: application/json; charset=UTF-8
< Transfer-Encoding: chunked
< Server: Jetty(9.2.12.v20150709)
<
* Connection #0 to host 172.22.73.216 left intact
{"requestId":"qtp297811323-15 - 302927ef-c011-4c22-8cee-352dd4b18c2d"}
The "201 Created" status in the POST response above confirms that the trait association was successful. To verify that the trait is attached to the entity, make a GET request for that entity. The request looks like this:

curl 'http://172.22.73.216:21000/api/atlas/entities/f9432735-5972-4525-b3fd-17544babd5ee' \
  -H 'Cookie: JSESSIONID=159cepww1tnzaczw2f4wbhhdu' \
  -H 'Accept-Encoding: gzip, deflate, sdch' \
  -H 'Accept-Language: en-US,en;q=0.8' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36' \
  -H 'Accept: application/json, text/javascript, */*; q=0.01' \
  -H 'Referer: http://172.22.73.216:21000/index.html' \
  -H 'X-Requested-With: XMLHttpRequest' \
  -H 'Connection: keep-alive' \
  --compressed | python -m json.tool

The response should contain the trait you associated:

{
    "definition": {
        "id": {
            "id": "f9432735-5972-4525-b3fd-17544babd5ee",
            "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
            "state": "ACTIVE",
            "typeName": "storm_topology",
            "version": 0
        },
        "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
        "traitNames": [
            "addMultipleTraits-1-mho2upf7jh",
            "addMultipleTraits-10-mho2upf7jh",
            "addMultipleTraits-3-mho2upf7jh"
        ],
        "traits": {
            "addMultipleTraits-1-mho2upf7jh": {
                "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
                "typeName": "addMultipleTraits-1-mho2upf7jh",
                "values": {}
            },
            "addMultipleTraits-10-mho2upf7jh": {
                "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
                "typeName": "addMultipleTraits-10-mho2upf7jh",
                "values": {}
            },
            "addMultipleTraits-3-mho2upf7jh": {
                "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Struct",
                "typeName": "addMultipleTraits-3-mho2upf7jh",
                "values": {}
            }
        },
        "typeName": "storm_topology",
        "values": {
            "clusterName": "cl1",
            "conf": null,
            "description": null,
            "endTime": null,
            "id": "kafka_mongo1-2-1476723908",
            "inputs": [
                {
                    "id": "dc50e010-af69-490a-b7b6-8f98bba776da",
                    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                    "state": "ACTIVE",
                    "typeName": "DataSet",
                    "version": 0
                }
            ],
            "name": "kafka_mongo1",
            "nodes": [
                {
                    "id": {
                        "id": "12086a26-89b9-4aba-bd66-e44067c0396b",
                        "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                        "state": "ACTIVE",
                        "typeName": "storm_bolt",
                        "version": 0
                    },
                    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                    "traitNames": [],
                    "traits": {},
                    "typeName": "storm_bolt",
                    "values": {
                        "conf": {
                            "HiveBolt.currentBatchSize": "0",
                            "HiveBolt.kerberosEnabled": "false",
                            "HiveBolt.options.autoCreatePartitions": "true",
                            "HiveBolt.options.batchSize": "1",
                            "HiveBolt.options.callTimeout": "10000",
                            "HiveBolt.options.databaseName": "default",
                            "HiveBolt.options.heartBeatInterval": "240",
                            "HiveBolt.options.idleTimeout": "3600",
                            "HiveBolt.options.mapper.columnFields._fields": "id,msg",
                            "HiveBolt.options.mapper.columnFields._index.id": "0",
                            "HiveBolt.options.mapper.columnFields._index.msg": "1",
                            "HiveBolt.options.mapper.fieldDelimiter": ",",
                            "HiveBolt.options.mapper.partitionFields._fields": "dt",
                            "HiveBolt.options.mapper.partitionFields._index.dt": "0",
                            "HiveBolt.options.maxOpenConnections": "100",
                            "HiveBolt.options.metaStoreURI": "thrift://atlas-testing-unsecure-4.openstacklocal:9083",
                            "HiveBolt.options.tableName": "test",
                            "HiveBolt.options.txnsPerBatch": "2",
                            "HiveBolt.timeToSendHeartBeat.value": "0"
                        },
                        "description": null,
                        "driverClass": "org.apache.storm.hive.bolt.HiveBolt",
                        "inputs": [
                            "kafkaspout_test"
                        ],
                        "name": "hivebolt_test",
                        "outputs": null
                    }
                },
                {
                    "id": {
                        "id": "42d963e7-64aa-451e-848f-9b8d87886a69",
                        "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                        "state": "ACTIVE",
                        "typeName": "storm_spout",
                        "version": 0
                    },
                    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
                    "traitNames": [],
                    "traits": {},
                    "typeName": "storm_spout",
                    "values": {
                        "conf": {
                            "KafkaSpout._currPartitionIndex": "0",
                            "KafkaSpout._lastUpdateMs": "0",
                            "KafkaSpout._spoutConfig.bufferSizeBytes": "1048576",
                            "KafkaSpout._spoutConfig.fetchMaxWait": "10000",
                            "KafkaSpout._spoutConfig.fetchSizeBytes": "1048576",
                            "KafkaSpout._spoutConfig.forceFromStart": "true",
                            "KafkaSpout._spoutConfig.hosts.brokerZkPath": "/brokers",
                            "KafkaSpout._spoutConfig.hosts.brokerZkStr": "atlas-testing-unsecure-2.openstacklocal:2181,atlas-testing-unsecure-4.openstacklocal:2181,atlas-testing-unsecure-1.openstacklocal:2181,atlas-testing-unsecure-5.openstacklocal:2181",
                            "KafkaSpout._spoutConfig.hosts.refreshFreqSecs": "60",
                            "KafkaSpout._spoutConfig.id": "a3fd0b73-b2d4-4fd9-a97f-909ce6a9b336",
                            "KafkaSpout._spoutConfig.maxOffsetBehind": "9223372036854775807",
                            "KafkaSpout._spoutConfig.metricsTimeBucketSizeInSecs": "60",
                            "KafkaSpout._spoutConfig.socketTimeoutMs": "10000",
                            "KafkaSpout._spoutConfig.startOffsetTime": "-2",
                            "KafkaSpout._spoutConfig.stateUpdateIntervalMs": "2000",
                            "KafkaSpout._spoutConfig.topic": "test_topic",
                            "KafkaSpout._spoutConfig.useStartOffsetTimeIfOffsetOutOfRange": "true",
                            "KafkaSpout._spoutConfig.zkRoot": "/test_topic",
                            "KafkaSpout._uuid": "1c08a736-c039-497c-8160-673fdda7e129"
                        },
                        "description": null,
                        "driverClass": "storm.kafka.KafkaSpout",
                        "name": "kafkaspout_test",
                        "outputs": [
                            "hivebolt_test"
                        ]
                    }
                }
            ],
            "outputs": [
                {
                    "id": "e5d69223-f2aa-44b2-a5a9-c3f3743e6a63",
                    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
                    "state": "ACTIVE",
                    "typeName": "DataSet",
                    "version": 0
                }
            ],
            "owner": "storm",
            "qualifiedName": "kafka_mongo1",
            "startTime": "2016-10-17T17:05:07.365Z"
        }
    },
    "requestId": "qtp297811323-15 - fd83329c-fd4f-4889-b050-5e02c405a886"
}
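If you only want the list of traits rather than the full entity, the traits sub-resource of the entity (documented in the EntityResource API link below) can be queried directly. A minimal sketch reusing the same session cookie:

curl 'http://172.22.73.216:21000/api/atlas/entities/f9432735-5972-4525-b3fd-17544babd5ee/traits' \
  -H 'Cookie: JSESSIONID=159cepww1tnzaczw2f4wbhhdu' | python -m json.tool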
For more information on these APIs, please refer to these links:

http://atlas.incubator.apache.org/AtlasTechnicalUserGuide.pdf
http://atlas.incubator.apache.org/api/resource_EntityResource.html#path__entities_-guid-_traits.html


Posted 10-13-2016 03:02 PM
@mayki wogno If you know the job ID / workflow ID, grep for it in the Oozie logs; that might help.
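A minimal sketch, assuming the default Oozie server log location on an HDP node (the workflow ID below is a placeholder; replace it with your own):

grep "0000123-161013000000-oozie-oozi-W" /var/log/oozie/oozie.log*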
						
					
Posted 10-13-2016 02:51 PM
In case we don't know the Oozie job ID: checking the status of multiple workflow jobs. Example:

$ oozie jobs -oozie http://localhost:8080/oozie -localtime -len 2 -filter status=RUNNING
.
Job Id                          Workflow Name         Status     Run  User      Group     Created                Started                 Ended
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
4-20090527151008-oozie-joe     hadoopel-wf           RUNNING    0    joe      other     2009-05-27 15:34 +0530 2009-05-27 15:34 +0530  -
0-20090527151008-oozie-joe     hadoopel-wf           RUNNING    0    joe      other     2009-05-27 15:15 +0530 2009-05-27 15:15 +0530  -
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
The jobs sub-command displays information about multiple jobs. The offset and len options specify the offset and the number of jobs to display; the defaults are 1 and 100 respectively. The localtime option displays times in local time; if it is not specified, times are displayed in GMT. The verbose option gives more detailed information for each job. A filter can be specified after all options. The filter option syntax is [NAME=VALUE][;NAME=VALUE]*. Valid filter names are:

- name: the workflow application name from the workflow definition.
- user: the user that submitted the job.
- group: the group for the job.
- status: the status of the job.

The query does an AND among all the filter names and an OR among all the filter values for the same name. Multiple values must be specified as different name=value pairs.
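For illustration, a sketch combining several of the filters described above (the status values and the workflow name hadoopel-wf come from the sample output above; note the filter string is quoted because of the semicolons):

$ oozie jobs -oozie http://localhost:8080/oozie -localtime -len 10 -filter "status=RUNNING;status=SUCCEEDED;name=hadoopel-wf"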
						
					
Posted 10-13-2016 02:25 PM
Checking the server logs of a Workflow, Coordinator or Bundle Job. Example:

$ oozie job -oozie http://localhost:8080/oozie -log 14-20090525161321-oozie-joe
						
					
Posted 10-13-2016 02:23 PM
1 Kudo
@mayki wogno Checking the Status of a Workflow, Coordinator or Bundle Job or a Coordinator Action. Example:

$ oozie job -oozie http://localhost:8080/oozie -info 14-20090525161321-oozie-joe
.
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
Workflow Name :  map-reduce-wf
App Path      :  hdfs://localhost:9000/user/joe/workflows/map-reduce
Status        :  SUCCEEDED
Run           :  0
User          :  joe
Group         :  users
Created       :  2009-05-26 05:01 +0000
Started       :  2009-05-26 05:01 +0000
Ended         :  2009-05-26 05:01 +0000
Actions
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
Action Name             Type        Status     Transition  External Id            External Status  Error Code    Start                   End
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
hadoop1                 map-reduce  OK         end         job_200904281535_0254  SUCCEEDED        -             2009-05-26 05:01 +0000  2009-05-26 05:01 +0000
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
The info option can display information about a workflow job, a coordinator job, or a coordinator action. The offset and len options specify the offset and the number of actions to display when checking a workflow job or coordinator job. The localtime option displays times in local time; if it is not specified, times are displayed in GMT. The verbose option gives more detailed information for all the actions when checking a workflow job or coordinator job.

Source: https://oozie.apache.org/docs/3.1.3-incubating/DG_CommandLineTool.html#Checking_the_Status_of_a_Workflow_Coordinator_or_Bundle_Job_or_a_Coordinator_Action
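A sketch of the same command using the paging and verbosity options described above (the job ID is the sample one from the example):

$ oozie job -oozie http://localhost:8080/oozie -info 14-20090525161321-oozie-joe -offset 1 -len 10 -verbose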
						
					
Posted 10-13-2016 06:04 AM
@rama Are the source and target clusters running the same HDFS version? If not, use the command below:

hadoop distcp webhdfs://namenode1:<port>/source/dir webhdfs://namenode2:<port>/destination/dir

The NameNode URI and the NameNode HTTP port should be provided for both the source and the destination when you use webhdfs. Also make sure to use absolute paths with distcp (https://hadoop.apache.org/docs/r1.2.1/distcp.html). In the original question, I also noticed that you are not using a port number in the target cluster URL:

hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar5_dir hdfs://xx.xx.xx.xx:<port>/apps/hive/warehouse/sankar5_dir
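For illustration, a fully qualified sketch of that command, assuming the default NameNode RPC port 8020 on both clusters (substitute your actual hosts and ports):

hadoop distcp hdfs://namenode1:8020/apps/hive/warehouse/sankar5_dir hdfs://namenode2:8020/apps/hive/warehouse/sankar5_dir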
						
					
Posted 10-13-2016 04:27 AM
After distcp, do you see the same directory structure in the target cluster? If yes, you should be able to import on the target cluster as well.
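A minimal sketch of the corresponding import on the target cluster, assuming the exported directory was copied to the path used elsewhere in this thread (adjust the path to wherever distcp actually placed it):

import table db_c720_dcm.network_matchtables_act_creative from '/apps/hive/warehouse/sankar5_dir';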
						
					
Posted 10-13-2016 04:03 AM
1 Kudo
@rama Can you check which directory the result of the export table command is stored in?

export table db_c720_dcm.network_matchtables_act_creative to 'apps/hive/warehouse/sankar5_dir';

When you execute the above command, the final data is actually written to the /user/<user_name>/apps/hive/warehouse/sankar5_dir directory in HDFS (which, of course, needs to be writable by the current user). So please make sure the path exists in the expected directory before executing the distcp command.
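Two quick checks along those lines, as a sketch (replace <user_name> with the user who ran the export). First, confirm where the relative-path export actually landed:

hdfs dfs -ls /user/<user_name>/apps/hive/warehouse/sankar5_dir

Or avoid the surprise by exporting to an absolute path in the first place:

export table db_c720_dcm.network_matchtables_act_creative to '/apps/hive/warehouse/sankar5_dir';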
						
					
Posted 10-13-2016 02:35 AM
9 Kudos
This article assumes that you have an HDP 2.5 cluster with Atlas and Hive enabled. Also make sure that Atlas is up and running on that cluster. Please refer to this documentation link for deploying a cluster with Atlas enabled.

Atlas provides a script/tool to import metadata from Hive for all Hive entities such as tables, databases, views, and columns. The script requires the Hadoop and Hive classpath jars; to make them available to the script (see the sketch after this list):

- Make sure that the environment variable HADOOP_CLASSPATH is set, or that HADOOP_HOME points to the root directory of your Hadoop installation.
- Set the HIVE_HOME environment variable to the root of the Hive installation.
- Set the environment variable HIVE_CONF_DIR to the Hive configuration directory.
- Copy <atlas-conf>/atlas-application.properties to the Hive conf directory.
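A sketch of those steps on a typical HDP 2.5 node; the exact client and configuration paths are assumptions and may differ on your cluster:

export HADOOP_CLASSPATH=$(hadoop classpath)    # or set HADOOP_HOME, e.g. /usr/hdp/current/hadoop-client
export HIVE_HOME=/usr/hdp/current/hive-client
export HIVE_CONF_DIR=/etc/hive/conf
cp /usr/hdp/current/atlas-server/conf/atlas-application.properties /etc/hive/conf/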
Once the above steps are completed, we are ready to run the script.

Usage: <atlas package>/hook-bin/import-hive.sh

When you run the command, you should see messages like the following on the console:

[root@atlas-blueprint-test-1 ~]# /usr/hdp/current/atlas-server/hook-bin/import-hive.sh
Using Hive configuration directory [/etc/hive/conf]
Log file for import is /usr/hdp/current/atlas-server/logs/import-hive.log
2016-10-13 01:57:18,676 INFO  - [main:] ~ Looking for atlas-application.properties in classpath (ApplicationProperties:73)
2016-10-13 01:57:18,701 INFO  - [main:] ~ Loading atlas-application.properties from file:/etc/hive/2.5.0.0-1245/0/atlas-application.properties (ApplicationProperties:86)
2016-10-13 01:57:18,922 DEBUG - [main:] ~ Configuration loaded: (ApplicationProperties:99)
2016-10-13 01:57:18,923 DEBUG - [main:] ~ atlas.authentication.method.kerberos = False (ApplicationProperties:102)
2016-10-13 01:57:18,972 DEBUG - [main:] ~ atlas.cluster.name = atlasBP (ApplicationProperties:102)
2016-10-13 01:57:18,972 DEBUG - [main:] ~ atlas.hook.hive.keepAliveTime = 10 (ApplicationProperties:102)
2016-10-13 01:57:18,972 DEBUG - [main:] ~ atlas.hook.hive.maxThreads = 5 (ApplicationProperties:102)
2016-10-13 01:57:18,973 DEBUG - [main:] ~ atlas.hook.hive.minThreads = 5 (ApplicationProperties:102)
2016-10-13 01:57:18,973 DEBUG - [main:] ~ atlas.hook.hive.numRetries = 3 (ApplicationProperties:102)
2016-10-13 01:57:18,973 DEBUG - [main:] ~ atlas.hook.hive.queueSize = 1000 (ApplicationProperties:102)
2016-10-13 01:57:18,973 DEBUG - [main:] ~ atlas.hook.hive.synchronous = false (ApplicationProperties:102)
2016-10-13 01:57:18,974 DEBUG - [main:] ~ atlas.kafka.bootstrap.servers = atlas-blueprint-test-1.openstacklocal:6667 (ApplicationProperties:102)
2016-10-13 01:57:18,974 DEBUG - [main:] ~ atlas.kafka.hook.group.id = atlas (ApplicationProperties:102)
2016-10-13 01:57:18,974 DEBUG - [main:] ~ atlas.kafka.zookeeper.connect = [atlas-blueprint-test-1.openstacklocal:2181, atlas-blueprint-test-2.openstacklocal:2181] (ApplicationProperties:102)
2016-10-13 01:57:18,974 DEBUG - [main:] ~ atlas.kafka.zookeeper.connection.timeout.ms = 200 (ApplicationProperties:102)
2016-10-13 01:57:18,974 DEBUG - [main:] ~ atlas.kafka.zookeeper.session.timeout.ms = 400 (ApplicationProperties:102)
2016-10-13 01:57:18,981 DEBUG - [main:] ~ atlas.kafka.zookeeper.sync.time.ms = 20 (ApplicationProperties:102)
2016-10-13 01:57:18,981 DEBUG - [main:] ~ atlas.notification.create.topics = True (ApplicationProperties:102)
2016-10-13 01:57:18,982 DEBUG - [main:] ~ atlas.notification.replicas = 1 (ApplicationProperties:102)
2016-10-13 01:57:18,982 DEBUG - [main:] ~ atlas.notification.topics = [ATLAS_HOOK, ATLAS_ENTITIES] (ApplicationProperties:102)
2016-10-13 01:57:18,982 DEBUG - [main:] ~ atlas.rest.address = http://atlas-blueprint-test-1.openstacklocal:21000 (ApplicationProperties:102)
2016-10-13 01:57:18,993 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:168)
2016-10-13 01:57:18,998 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:181)
2016-10-13 01:57:19,043 DEBUG - [main:] ~ ==> InMemoryJAASConfiguration.initialize() (InMemoryJAASConfiguration:220)
2016-10-13 01:57:19,045 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.initialize() (InMemoryJAASConfiguration:347)
2016-10-13 01:57:19,045 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:190)
2016-10-13 01:57:19,046 DEBUG - [main:] ~ <== InMemoryJAASConfiguration.init() (InMemoryJAASConfiguration:177)
.
.
.
.
.
2016-10-13 01:58:09,251 DEBUG - [main:] ~ Using resource http://atlas-blueprint-test-1.openstacklocal:21000/api/atlas/entities/1e78f7ed-c8d4-4c11-9bfa-da08be7c6b60 for 0 times (AtlasClient:784)
2016-10-13 01:58:10,700 DEBUG - [main:] ~ API http://atlas-blueprint-test-1.openstacklocal:21000/api/atlas/entities/1e78f7ed-c8d4-4c11-9bfa-da08be7c6b60 returned status 200 (AtlasClient:1191)
2016-10-13 01:58:10,703 DEBUG - [main:] ~ Getting reference for process default.timesheets_test@atlasBP:1474621469000 (HiveMetaStoreBridge:346)
2016-10-13 01:58:10,703 DEBUG - [main:] ~ Using resource http://atlas-blueprint-test-1.openstacklocal:21000/api/atlas/entities?type=hive_process&property=qualifiedName&value=default.timesheets_test@atlasBP:1474621469000 for 0 times (AtlasClient:784)
2016-10-13 01:58:10,893 DEBUG - [main:] ~ API http://atlas-blueprint-test-1.openstacklocal:21000/api/atlas/entities?type=hive_process&property=qualifiedName&value=default.timesheets_test@atlasBP:1474621469000 returned status 200 (AtlasClient:1191)
2016-10-13 01:58:10,898 INFO  - [main:] ~ Process {Id='(type: hive_process, id: 28f5a31a-4812-497e-925b-21bfe59ba68a)', traits=[], values={outputs=[(type: DataSet, id: 1e78f7ed-c8d4-4c11-9bfa-da08be7c6b60)], owner=null, queryGraph=null, recentQueries=[create external table timesheets_test (emp_id int, location string, ts_date string, hours int, revenue double, revenue_per_hr double) row format delimited fields terminated by ',' location  'hdfs://atlas-blueprint-test-1.openstacklocal:8020/user/hive/timesheets'], inputs=[(type: DataSet, id: c259d3a8-5684-4808-9f22-972a2e3e2dd0)], qualifiedName=default.timesheets_test@atlasBP:1474621469000, description=null, userName=hive, queryId=hive_20160923090429_25b5b333-bba5-427f-8ee1-6b743cbcf533, clusterName=atlasBP, name=create external table timesheets_test (emp_id int, location string, ts_date string, hours int, revenue double, revenue_per_hr double) row format delimited fields terminated by ',' location  'hdfs://atlas-blueprint-test-1.openstacklocal:8020/user/hive/timesheets', queryText=create external table timesheets_test (emp_id int, location string, ts_date string, hours int, revenue double, revenue_per_hr double) row format delimited fields terminated by ',' location  'hdfs://atlas-blueprint-test-1.openstacklocal:8020/user/hive/timesheets', startTime=2016-09-23T09:04:29.069Z, queryPlan={}, operationType=CREATETABLE, endTime=2016-09-23T09:04:30.319Z}} is already registered (HiveMetaStoreBridge:305)
2016-10-13 01:58:10,898 INFO  - [main:] ~ Successfully imported all 22 tables from default  (HiveMetaStoreBridge:261)
Hive Data Model imported successfully!!!
The message below from the console log shows how many tables were imported and whether the import was successful:

2016-10-13 01:58:10,898 INFO  - [main:] ~ Successfully imported all 22 tables from default  (HiveMetaStoreBridge:261)
Hive Data Model imported successfully!!!

Now we can verify the imported tables in the Atlas UI. It should reflect all 22 tables imported above.

The logs for the import script are in <atlas package>/logs/import-hive.log.

Running the import script on a kerberized cluster

The above works as-is for a simple cluster, but for a kerberized cluster you need to provide additional details when running the command:

<atlas package>/hook-bin/import-hive.sh -Dsun.security.jgss.debug=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=[krb5.conf location] -Djava.security.auth.login.config=[jaas.conf location]

- krb5.conf is typically found at /etc/krb5.conf
- For details about jaas.conf, see the Atlas security documentation.
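On a kerberized cluster you also need a valid Kerberos identity before running the script. A sketch, assuming a typical HDP keytab location and a placeholder realm (your principal, keytab, and jaas.conf contents will differ):

kinit -kt /etc/security/keytabs/atlas.service.keytab atlas/$(hostname -f)@EXAMPLE.COM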
						
					
Posted 10-13-2016 01:31 AM
1 Kudo
@Subhankar Dasgupta Atlas can track the lineage of entities (such as Hive tables) only after it has been plugged into the cluster. Lineage information from before Atlas was added to the system cannot be tracked.

There is a provision to import metadata (not lineage) of already-existing Hive tables by using an import tool, which comes as part of the Atlas installation. For more information on this, please refer here.
						
					