About Manus

Manus · ‎06-13-2016

Hi guys, I was running Waterline jobs (profile job, tag job, lineage job), but the MapReduce code threw "Permission denied" exceptions on some Waterline data directories. I resolved them by running sudo -u waterlinedata hadoop fs -chmod 777 <directory name>, and everything works fine now.

Manus · ‎06-09-2016

Hi, I have downloaded the Sandbox machine for waterlinedata, and the Waterline UI comes up with pre-loaded/pre-profiled data; everything works fine by default. Now I want to profile some files under the /user/waterlinedata/newStaggingData directory. After copying them from local to HDFS, I ran the command ./waterline profileOnly /user/waterlinedata/newStaggingData. As far as I know, profiling means identifying the file format, calculating data quality metrics, and storing all the details in the inventory, but I cannot see any of these details next to my files in the Waterline UI. Please see the attached image, capture1.png. I know that after executing the above command Waterline runs a MapReduce job, and I am sure it completes successfully, but I still do not get any file format or data quality metrics in the UI. Alternatively, please send me the steps for profiling the data inside a particular directory. Thanks in advance.

Manus · ‎06-08-2016

Hello all, I am exploring the REST API for Apache Atlas and want to fetch Hive lineage data from the Apache Atlas repository through it. I referred to the following links: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP2.3.0/bk_data_governance/content/section_atlas_restapi_hivelineageresource.html and http://atlas.incubator.apache.org/api/resource_HiveLineageResource.html. Please provide an example of getting the lineage data using the REST API. Thanks in advance.
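For readers with the same question, here is a minimal sketch of calling the lineage endpoints described on the HiveLineageResource page linked above. The host, port 21000, the /api/atlas prefix, the table name, and the optional basic-auth credentials are all assumptions to check against your own Atlas version. Newer Atlas releases may expect a fully qualified table name or a different lineage API altogether.

```python
import base64
import json
import urllib.parse
import urllib.request


def lineage_url(base_url, table_name, kind="inputs"):
    """Build the URL for a Hive table's lineage graph or schema.

    Assumption: the legacy HiveLineageResource is mounted under
    /api/atlas/lineage/hive, as on the linked documentation page.
    """
    if kind not in ("inputs", "outputs", "schema"):
        raise ValueError("kind must be 'inputs', 'outputs' or 'schema'")
    suffix = "schema" if kind == "schema" else kind + "/graph"
    table = urllib.parse.quote(table_name, safe="")
    return "{}/api/atlas/lineage/hive/table/{}/{}".format(
        base_url.rstrip("/"), table, suffix
    )


def fetch_lineage(base_url, table_name, kind="inputs", user=None, password=None):
    """GET the lineage JSON; credentials are optional and hypothetical."""
    request = urllib.request.Request(
        lineage_url(base_url, table_name, kind),
        headers={"Accept": "application/json"},
    )
    if user is not None:
        raw = "{}:{}".format(user, password).encode("utf-8")
        request.add_header(
            "Authorization", "Basic " + base64.b64encode(raw).decode("ascii")
        )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as fetch_lineage("http://localhost:21000", "sales_fact", "outputs") would then return the parsed JSON response; sales_fact is only a placeholder table name.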

Manus · ‎06-07-2016

Thanks, good to see you again Ryan. Did you mean that I can log in to the beeline client console with the hr_user/hr_admin credentials and see the same error (hr_user does not have permission to access the ssn and location columns) that I was getting in the Hive view? Questions: 1) How do I connect to beeline as hr_user/hr_admin using command-line options? 2) Could you please post the commands for connecting to beeline?
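As a rough illustration of what such a connection command looks like, the sketch below assembles a beeline invocation from its standard -u (JDBC URL), -n (user name), and -p (password) options. The host name, port 10000, the default database, and the password placeholder are assumptions, and a Kerberized cluster would need a different JDBC URL.

```python
import shlex


def beeline_command(host, user, password, port=10000, database="default"):
    """Return the argv list for a beeline connection to HiveServer2.

    Assumption: HiveServer2 runs in plain (non-Kerberos) mode on the given port.
    """
    jdbc_url = "jdbc:hive2://{}:{}/{}".format(host, port, database)
    return ["beeline", "-u", jdbc_url, "-n", user, "-p", password]


# Hypothetical sandbox host and password, shown only to print a copyable command.
cmd = beeline_command("sandbox.hortonworks.com", "hr_user", "hr_user_password")
print(" ".join(shlex.quote(part) for part in cmd))
```

Running the same function with hr_admin gives the second command, so a query on the ssn and location columns can be compared between the two users.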

Manus · ‎06-06-2016

Hello guys, I have used the Hive view to look at data lineage in Atlas. As you know, there is also an Atlas and Ranger integration, which is mainly about "tag-based policy". Here is a link to a demo of tag-based policies: http://hortonworks.com/hadoop-tutorial/tag-based-policies-atlas-ranger/ In that demo they use the Hive view to cross-check whether the security policy is really working. Instead of the Hive view, can we cross-check the same flow/policy with beeline? I mean, can we log in to beeline as hr_user/hr_admin and check the Atlas-Ranger tag-based policy? Please tell me whether this is possible on the Atlas-Ranger tech preview machine.

Manus · ‎06-06-2016

Thank you ran, I have used the Hive view to look at data lineage in Atlas. As you know, there is also an Atlas and Ranger integration, which is mainly about "tag-based policy". Here is a link to a demo of tag-based policies: http://hortonworks.com/hadoop-tutorial/tag-based-policies-atlas-ranger/ In that demo they use the Hive view to cross-check whether the security policy is really working. Instead of the Hive view, can we cross-check the same flow with beeline? I mean, can we log in to beeline as hr_user/hr_admin to check the Atlas-Ranger tag-based policy?

Manus · ‎06-04-2016

Hello guys, I am using the Atlas-Ranger public preview machine to explore Apache Atlas, and I have some questions. Here are a few of them: 1) When we create a new tag in the Atlas UI, Atlas also asks us to enter an "attribute". What is that attribute, and what is the purpose of using it? 2) As we all know, Atlas captures metadata for all entities (Hive tables, columns, etc.) and also lets us assign a tag to any entity (i.e. a table or column). Where does Atlas store those tags and that metadata? 3) In the newer version, Atlas and Ranger work together, and I know we can create a tag-based policy. How do Atlas and Ranger work together internally, that is, what is the internal mechanism? How does it stop a user from selecting a particular table column?

Manus · ‎06-01-2016

Thank you Sunile for such a quick response. According to your answer, I can write something like ............................................ .....falconexample.DiagnosticReport1 d on (p.id = substr(d.subject.reference,9) and p.ds='2016-06-01-10') inner join falconexample.Observation1 o on (p.id = substr(o.subject.reference,9) and o.ds='2016-06-01-10' and d.ds='2016-06-01-10'). Is that right?

Manus · ‎06-01-2016

Hello everyone, I have posted a query below that uses three partitioned tables. All of them are external tables pointing to partition directories on HDFS, and every table is partitioned on the "ds" column (e.g. ds=2016-06-01-08), where ds=year-month-day-hour. Data lands on HDFS hourly, a directory is created with the timestamp as above, and the tables point to all of those partition directories. My question: given that all the tables are partitioned, where should I put the where clause in the query below so that only the data for one particular partition is processed? Please let me know.

Query: INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive}) select p.id, p.gender, p.Age, p.birthdate, o.component[1].valuequantity.value, o.component[1].valuequantity.unit from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age FROM falconexample.patient1) p inner join falconexample.DiagnosticReport1 d on p.id = substr(d.subject.reference,9) inner join falconexample.Observation1 o on p.id = substr(o.subject.reference,9) where p.Age>17 and p.Age<86 and o.component[1].valuequantity.value <140;
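As a small worked example of the ds=year-month-day-hour format described in the post, the sketch below formats a timestamp into a partition value and builds one ds predicate per table alias. The helper names are hypothetical. The assumption is that equality predicates on the partition column, placed in the WHERE or ON clause, let Hive restrict the scan to the matching partition.

```python
from datetime import datetime


def partition_value(ts):
    """Format a timestamp as the hourly partition key, e.g. 2016-06-01-08."""
    return ts.strftime("%Y-%m-%d-%H")


def partition_filter(aliases, ts):
    """Build one ds predicate per table alias, joined with 'and'."""
    value = partition_value(ts)
    return " and ".join("{}.ds='{}'".format(alias, value) for alias in aliases)


# Aliases p, d and o match the query in the post; the hour is an example value.
print(partition_filter(["p", "d", "o"], datetime(2016, 6, 1, 10)))
# prints: p.ds='2016-06-01-10' and d.ds='2016-06-01-10' and o.ds='2016-06-01-10'
```

The resulting fragment could be added to the existing where clause alongside the age and value conditions.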

Manus · ‎06-01-2016

Hello guys, the error has been solved. I fixed it by adding the following statement to the Hive script, along with the above query: add jar hdfs://<hostname>:8020//user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;

Last Visited	‎07-16-2020 05:12 AM

Member Since	‎07-25-2018 10:48 AM
Posts	174
Kudos received	29

Cloudera Community

Re: Facing transport exception in hive

Re: How to implement the given problem statement i...

Re: Why output port not redirecting the flowfiles ...

Re: Why Falcon pipeline is failing?

Re: why Falcon job is Failing?

Re: Wy data is not being profiled using waterline?

Wy data is not being profiled using waterline?

How can we get hive lineage data using REST API in...

Re: Can we check Atlas-Ranger based feature using ...

Can we check Atlas-Ranger based feature using beel...

Re: In Atlas-Ranger sandbox machine,atlas not work...

The Few Question about Apache Atlas?

Re: How to ask hive query to fetch data for specif...

How to ask hive query to fetch data for specific p...

Re: Why Falcon pipeline is failing?