
Cloudera Navigator pre-registered metadata is not being applied

Following the documentation, I am trying to pre-register metadata for a specific file so that its tags are applied when the file is extracted from HDFS.

 

1. I extract the identity of the HDFS source

[root@cloudera ~]# curl http://cloudera.sorn:7187/api/v11/entities?query='(type:SOURCE)AND(sourceType:HDFS)' -u admin:admin
[ {
"originalName" : "hdfs",
"originalDescription" : null,
"sourceId" : null,
"firstClassParentId" : null,
"parentPath" : null,
"deleteTime" : null,
"extractorRunId" : null,
"customProperties" : null,
"name" : "HDFS",
"description" : null,
"tags" : null,
"properties" : null,
"technicalProperties" : null,
"clusterName" : "cluster",
"sourceUrl" : "hdfs://cloudera.sorn:8020",
"sourceType" : "HDFS",
"sourceExtractIteration" : 3,
"sourceTemplate" : null,
"hmsDbHost" : null,
"hmsDbName" : null,
"hmsDbPort" : null,
"hmsDbUser" : null,
"type" : "SOURCE",
"deleted" : null,
"userEntity" : false,
"metaClassName" : "source",
"packageName" : "nav",
"identity" : "8",
"internalType" : "source"
} ]
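
For anyone scripting this step, here is a minimal Python sketch of pulling the source identity out of that response. It assumes the response body has already been fetched as text; the function name `find_hdfs_source_identity` and the trimmed sample are illustrative and not part of Navigator.

```python
import json


def find_hdfs_source_identity(response_text):
    """Return the identity of the first HDFS source in an /entities response."""
    entities = json.loads(response_text)
    for entity in entities:
        # Same filter as the curl query: type SOURCE and sourceType HDFS.
        if entity.get("type") == "SOURCE" and entity.get("sourceType") == "HDFS":
            return entity["identity"]
    return None


# Trimmed copy of the response shown above (only the fields used here).
sample = '[{"name": "HDFS", "sourceType": "HDFS", "type": "SOURCE", "identity": "8"}]'
print(find_hdfs_source_identity(sample))
```

The returned value is what goes into `sourceId` in the next step.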

2. I use the identity shown above to create a pre-registration for the file /user/mark/newfile with the tag fav and the property priority:medium

 

cat hdfspreregistration 
{
          "sourceId":"8",
          "parentPath":"/user/mark",
          "originalName":"newfile",
          "name":"newfile",
          "description":"This is going to be an awesome file.",
          "tags":["fav"],
          "properties":{"priority":"medium"}
}
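
The same payload can be generated instead of hand-written, which avoids a mismatch between `parentPath` and `originalName`. This is only a sketch: `build_preregistration` is a hypothetical helper, and the fallback to "/" for top-level files is an assumption that has not been checked against Navigator.

```python
import json


def build_preregistration(source_id, hdfs_path, tags=None, properties=None,
                          description=None):
    """Build a pre-registration payload shaped like the file above."""
    # Split "/user/mark/newfile" into parent "/user/mark" and name "newfile".
    parent_path, _, name = hdfs_path.rstrip("/").rpartition("/")
    payload = {
        "sourceId": source_id,
        # Assumption: a file directly under the root uses "/" as its parent.
        "parentPath": parent_path or "/",
        "originalName": name,
        "name": name,
    }
    if description is not None:
        payload["description"] = description
    if tags:
        payload["tags"] = list(tags)
    if properties:
        payload["properties"] = dict(properties)
    return payload


payload = build_preregistration(
    "8",
    "/user/mark/newfile",
    tags=["fav"],
    properties={"priority": "medium"},
    description="This is going to be an awesome file.",
)
print(json.dumps(payload, indent=2))
```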

[mark@cloudera ~]$ curl http://cloudera.sorn:7187/api/v11/entities -u admin:admin -X POST -H "Content-Type: application/json" -d "$(cat hdfspreregistration)"
{
  "originalName" : "newfile",
  "originalDescription" : null,
  "sourceId" : "8",
  "firstClassParentId" : null,
  "parentPath" : "/user/mark",
  "deleteTime" : null,
  "extractorRunId" : null,
  "customProperties" : null,
  "name" : "newfile",
  "description" : "This is going to be an awesome file.",
  "tags" : [ "fav" ],
  "properties" : {
    "priority" : "medium"
  },
  "technicalProperties" : null,
  "sourceType" : null,
  "type" : null,
  "deleted" : null,
  "userEntity" : false,
  "metaClassName" : "UNDEFINED",
  "packageName" : "nav",
  "identity" : "3247",
  "internalType" : "UNDEFINED"
}
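
The POST above could also be prepared from Python with the standard library. The sketch below only builds the request object and does not send it; the host, port, API version, and credentials are copied from the curl command, while `make_preregistration_request` is a hypothetical name.

```python
import base64
import json
import urllib.request


def make_preregistration_request(base_url, payload, user, password):
    """Build (but do not send) the POST that registers the entity."""
    body = json.dumps(payload).encode("utf-8")
    # HTTP basic auth, the equivalent of curl's "-u user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/entities",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = make_preregistration_request(
    "http://cloudera.sorn:7187/api/v11",
    {"sourceId": "8", "parentPath": "/user/mark", "originalName": "newfile",
     "name": "newfile", "tags": ["fav"], "properties": {"priority": "medium"}},
    "admin",
    "admin",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```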

3. I verify that the entity is pre-registered

 

[mark@cloudera ~]$ curl http://cloudera.sorn:7187/api/v11/entities/?query=-internalType:*  -X GET  -u admin:admin
[ {
  "originalName" : "newfile",
  "originalDescription" : null,
  "sourceId" : "8",
  "firstClassParentId" : null,
  "parentPath" : "/user/mark",
  "deleteTime" : null,
  "extractorRunId" : null,
  "customProperties" : null,
  "name" : "newfile",
  "description" : "This is going to be an awesome file.",
  "tags" : [ "fav" ],
  "properties" : {
    "priority" : "medium"
  },
  "technicalProperties" : null,
  "sourceType" : null,
  "type" : null,
  "deleted" : null,
  "userEntity" : false,
  "metaClassName" : "UNDEFINED",
  "packageName" : "nav",
  "identity" : "3247",
  "internalType" : "UNDEFINED"
} ]
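
A scripted version of this check might filter the returned entities by location. In the output above the pre-registered entity has `internalType` set to `UNDEFINED`; treating that value as the marker for "not yet linked by extraction" is an assumption of this sketch, and `find_preregistered` is a hypothetical helper.

```python
import json


def find_preregistered(entities, parent_path, original_name):
    """Return pre-registered entities matching the given HDFS location."""
    return [
        e for e in entities
        if e.get("parentPath") == parent_path
        and e.get("originalName") == original_name
        # Assumption: "UNDEFINED" marks an entity extraction has not linked yet.
        and e.get("internalType") == "UNDEFINED"
    ]


# Trimmed copy of the response shown above.
response_text = """
[ {
  "originalName" : "newfile",
  "sourceId" : "8",
  "parentPath" : "/user/mark",
  "tags" : [ "fav" ],
  "properties" : { "priority" : "medium" },
  "identity" : "3247",
  "internalType" : "UNDEFINED"
} ]
"""
matches = find_preregistered(json.loads(response_text), "/user/mark", "newfile")
print([m["identity"] for m in matches])
```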

4. I copy the new file to HDFS

 

hdfs dfs -copyFromLocal newfile /user/mark/newfile

5. After the checkpoint and the extraction poll interval have passed, the logs show that there was a problem processing this file.

 

2017-08-16 11:05:56,005 INFO com.cloudera.nav.hdfs.client.InotifyClient [CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Processing inotify event, starting with tx id 4640
2017-08-16 11:05:59,911 ERROR com.cloudera.nav.hdfs.client.InotifyClient [CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Error handling event (txid: 5050): Renamed /user/mark/newfile._COPYING_ to /user/mark/newfile at time 1502877858201
2017-08-16 11:05:59,913 ERROR com.cloudera.nav.hdfs.client.InotifyClient [CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Error handling RENAME event
java.lang.ClassCastException: com.cloudera.nav.core.model.GenericEntity cannot be cast to com.cloudera.nav.hdfs.model.FSEntity
        at com.cloudera.nav.hdfs.extractor.HdfsOperationHandler.getEntry(HdfsOperationHandler.java:642)
        at com.cloudera.nav.hdfs.extractor.HdfsOperationHandler.rename(HdfsOperationHandler.java:276)
        at com.cloudera.nav.hdfs.client.InotifyClient.handleRenameEvent(InotifyClient.java:283)
        at com.cloudera.nav.hdfs.client.InotifyClient.handleEvent(InotifyClient.java:129)
        at com.cloudera.nav.hdfs.client.InotifyClient.doImport(InotifyClient.java:75)
        at com.cloudera.nav.hdfs.client.InotifyExtractor.doImport(InotifyExtractor.java:34)
        at com.cloudera.nav.hdfs.extractor.HdfsExtractorShim$1.run(HdfsExtractorShim.java:276)
        at com.cloudera.nav.hdfs.extractor.HdfsExtractorShim$1.run(HdfsExtractorShim.java:273)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at com.cloudera.cmf.cdh5client.security.UserGroupInformationImpl.doAs(UserGroupInformationImpl.java:44)
        at com.cloudera.nav.hdfs.extractor.HdfsExtractorShim.doImport(HdfsExtractorShim.java:273)
        at com.cloudera.nav.hdfs.extractor.HdfsExtractorShim.doExtraction(HdfsExtractorShim.java:235)
        at com.cloudera.nav.hdfs.extractor.HdfsExtractorShim.run(HdfsExtractorShim.java:141)
        at com.cloudera.cmf.cdhclient.CdhExecutor$RunnableWrapper.call(CdhExecutor.java:221)
        at com.cloudera.cmf.cdhclient.CdhExecutor$RunnableWrapper.call(CdhExecutor.java:211)
        at com.cloudera.cmf.cdhclient.CdhExecutor$CallableWrapper.doWork(CdhExecutor.java:236)
        at com.cloudera.cmf.cdhclient.CdhExecutor$1.call(CdhExecutor.java:125)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2017-08-16 11:06:00,365 INFO com.cloudera.nav.hdfs.client.InotifyClient [CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Processing done, next start id = 5061.
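
When the log is long, the failed inotify events can be picked out with a small filter. The sketch below assumes the message format seen in the excerpt above ("Error handling event (txid: N): ..."); other Navigator versions may word it differently, and `failed_events` is a hypothetical helper.

```python
import re

# Matches ERROR lines of the form "... Error handling event (txid: 5050): <detail>".
EVENT_ERROR = re.compile(r"ERROR .*Error handling event \(txid: (\d+)\): (.*)$")


def failed_events(log_lines):
    """Return (txid, detail) pairs for inotify events that failed."""
    failures = []
    for line in log_lines:
        match = EVENT_ERROR.search(line)
        if match:
            failures.append((int(match.group(1)), match.group(2)))
    return failures


log_lines = [
    "2017-08-16 11:05:56,005 INFO com.cloudera.nav.hdfs.client.InotifyClient "
    "[CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Processing inotify event, "
    "starting with tx id 4640",
    "2017-08-16 11:05:59,911 ERROR com.cloudera.nav.hdfs.client.InotifyClient "
    "[CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Error handling event (txid: 5050): "
    "Renamed /user/mark/newfile._COPYING_ to /user/mark/newfile at time 1502877858201",
    "2017-08-16 11:05:59,913 ERROR com.cloudera.nav.hdfs.client.InotifyClient "
    "[CDHExecutor-0-CDHUrlClassLoader@574e34fe]: Error handling RENAME event",
]
for txid, detail in failed_events(log_lines):
    print(txid, detail)
```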

After that, the file doesn't show up in Navigator search.

 

 

Does anybody have an idea what I am doing wrong and how to fix it?

 

Regards

Re: Cloudera Navigator pre-registered metadata is not being applied

I've been advised that we have seen this problem as of Navigator 5.12.1. A fix was applied in 5.13.0 (and will be applied in 5.12.2). Upgrading to one of these releases will resolve the problem.
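
To check an installed release against the versions named above, a simple comparison is enough. This sketch only answers whether a version is at or past 5.12.2, the earliest release the reply says will carry the fix; it does not claim anything about which earlier releases are affected, and `includes_fix` is a hypothetical helper.

```python
def parse_version(text):
    """Turn a dotted version such as "5.12.1" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Earliest release named in the reply as receiving the fix (5.13.0 also has it).
FIRST_FIXED = (5, 12, 2)


def includes_fix(version_text):
    """Return True if the version is at or past the first fixed release."""
    return parse_version(version_text) >= FIRST_FIXED


for version in ("5.12.1", "5.12.2", "5.13.0"):
    print(version, includes_fix(version))
```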


Cy Jervis, Community Manager
