Member since: 07-25-2018
Posts: 174
Kudos Received: 29
Solutions: 5
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5414 | 03-19-2020 03:18 AM |
 | 3457 | 01-31-2020 01:08 AM |
 | 1337 | 01-30-2020 05:45 AM |
 | 2589 | 06-01-2016 12:56 PM |
 | 3073 | 05-23-2016 08:46 AM |
06-13-2016
06:47 AM
Hi guys, I was running Waterline jobs (profile job, tag job, lineage job), but the MapReduce code threw a "Permission denied" exception on some of the Waterline data directories. I resolved it by running sudo -u waterlinedata hadoop fs -chmod 777 <directory name> on each affected directory, and after that everything worked fine.
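Since mode 777 opens the directory to every user, a narrower mode may be worth trying first. The following is only a sketch of that idea: the directory path, the 770 default, and the DRY_RUN switch are assumptions for illustration and are not part of the original fix. In dry-run mode (the default here) the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: apply a narrower mode than 777 to an HDFS directory.
# Assumptions (hypothetical, not from the post): the example path,
# the 770 default mode, and the DRY_RUN switch.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode; execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

fix_perms() {
  dir="$1"          # HDFS directory that raised "Permission denied"
  mode="${2:-770}"  # owner and group only, unless another mode is given
  run sudo -u waterlinedata hadoop fs -chmod "$mode" "$dir"
}

# Hypothetical directory name, shown in dry-run mode.
fix_perms /user/waterlinedata/example_dir
```

Setting DRY_RUN=0 runs the command for real. If the job still fails with the narrower mode, the user running the MapReduce job is probably outside the directory's group, and that is worth checking before falling back to 777.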
06-09-2016
01:27 PM
Hi, I have downloaded the Sandbox machine for Waterline Data, and the Waterline UI comes up with the pre-loaded/pre-profiled data; everything works fine by default. Now I want to profile some files that are under the /user/waterlinedata/newStaggingData directory. After copying them from local to HDFS, I run the command ./waterline profileOnly /user/waterlinedata/newStaggingData. As far as I know, profiling means identifying the file format, calculating data quality metrics, and storing all the details in the inventory, but I am not able to see any such details next to my files in the Waterline UI. Please see the attached image, capture1.png. I know that after the above command Waterline runs a MapReduce job, and I am sure it runs successfully, but I still do not get any file format or data quality metrics in the UI. Otherwise, please send me the steps for profiling data that sits inside a particular directory. Thanks in advance.
Labels: HDFS
06-08-2016
05:23 AM
1 Kudo
Hello all, I am exploring the REST API for Apache Atlas and want to fetch Hive lineage data from the Apache Atlas repository using it. I referred to the following links: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP2.3.0/bk_data_governance/content/section_atlas_restapi_hivelineageresource.html http://atlas.incubator.apache.org/api/resource_HiveLineageResource.html Please provide me with an example of getting the lineage data using the REST API. Thanks in advance.
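As a rough illustration of what such a request could look like, the sketch below builds the lineage URL for a Hive table, following the path layout described on the HiveLineageResource pages linked above. The host, the port 21000, and the table name sales_fact are assumptions for illustration, and the exact form of the table name Atlas expects may differ between versions, so treat this as a starting point rather than a confirmed recipe.

```shell
#!/bin/sh
# Sketch: build the Atlas Hive lineage URL for a table.
# Assumptions (hypothetical): host/port in ATLAS_URL and the table name.
ATLAS_URL="${ATLAS_URL:-http://localhost:21000}"

lineage_url() {
  table="$1"                # Hive table name as registered in Atlas
  direction="${2:-inputs}"  # "inputs" or "outputs"
  case "$direction" in
    inputs|outputs) ;;
    *) echo "direction must be inputs or outputs" >&2; return 1 ;;
  esac
  echo "${ATLAS_URL}/api/atlas/lineage/hive/table/${table}/${direction}/graph"
}

# Print the URL for the outputs graph of a hypothetical table.
lineage_url sales_fact outputs

# To send the request against a running Atlas server:
#   curl -s "$(lineage_url sales_fact inputs)"
```

The response should be a JSON document describing the lineage graph. Only the URL is printed here, so the snippet can be tried without a running Atlas server.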
Labels: Apache Atlas
06-07-2016
02:37 AM
Good to see you again, Ryan. Did you mean that I can log in to the beeline client console with the hr_user/hr_admin credentials and should see the same error I was getting in the Hive view (hr_user does not have permission to access the ssn and location columns)? Questions: 1) How do I connect to beeline from the command line as hr_user/hr_admin? 2) Could you please post the commands for connecting to beeline?
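For reference, a connection of this kind is usually made with beeline's -u (JDBC URL), -n (user name), -p (password), and -e (query) options. The sketch below only prints such a command line so it can be reviewed before running. The HiveServer2 URL, the password placeholder, and the employee table name are assumptions for illustration, not values confirmed in this thread.

```shell
#!/bin/sh
# Sketch: print a beeline command that runs one query as a given user.
# Assumptions (hypothetical): the JDBC URL, the <password> placeholder,
# and the "employee" table used in the sample query.
HS2_URL="${HS2_URL:-jdbc:hive2://localhost:10000/default}"

beeline_cmd() {
  user="$1"   # for example hr_user or hr_admin
  query="$2"  # HiveQL to run through -e
  printf 'beeline -u "%s" -n %s -p "<password>" -e "%s"\n' \
    "$HS2_URL" "$user" "$query"
}

# hr_user selecting the restricted columns; if the tag-based policy
# applies, HiveServer2 is expected to reject this query.
beeline_cmd hr_user 'select ssn, location from employee limit 5'
```

Running the printed command once as hr_user and once as hr_admin, with the real password in place of the placeholder, gives the same cross-check as the Hive view: the policy is enforced by HiveServer2, so the client used should not matter.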
06-06-2016
03:54 PM
Hello guys, I have used the Hive view to look at data lineage in Atlas. As you know, there is also the Atlas and Ranger integration, which is mainly about "tag based policy". Here is a link to a demo of tag-based policies: http://hortonworks.com/hadoop-tutorial/tag-based-policies-atlas-ranger/ In that demo the Hive view is used to cross-check whether the security policy is really working. Instead of the Hive view, can we cross-check the same flow/policy with beeline? I mean, can we log in to beeline as hr_user/hr_admin and check the Atlas-Ranger tag-based policy? Please tell me whether this is possible on the Atlas-Ranger tech preview machine.
Labels: Apache Atlas, Apache Ranger
06-06-2016
12:17 PM
Thank you, Ryan. I have used the Hive view to look at data lineage in Atlas. As you know, there is also the Atlas and Ranger integration, which is mainly about "tag based policy". Here is a link to a demo of tag-based policies: http://hortonworks.com/hadoop-tutorial/tag-based-policies-atlas-ranger/ In that demo the Hive view is used to cross-check whether the security policy is really working. Instead of the Hive view, can we cross-check the same flow with beeline? I mean, can we log in to beeline as hr_user/hr_admin to check the Atlas-Ranger tag-based policy?
06-04-2016
05:58 AM
1 Kudo
Hello guys, I am using the Atlas-Ranger public preview machine to explore Apache Atlas and have some questions about it. Here are a few of them:
1) When we create a new tag in the Atlas UI, Atlas also asks us to enter an "attribute". What is that attribute, and what is the purpose of using it?
2) Atlas captures metadata for all entities (Hive tables, columns, etc.) and also allows us to assign a tag to any entity (i.e. a table or column). Where does Atlas store those tags and that metadata?
3) In the newer version Atlas and Ranger work together, and I know that we can create a "tag based policy". How do Atlas and Ranger work together internally? What is the mechanism that stops a user from selecting a particular table column?
Labels: Apache Atlas
06-01-2016
05:46 PM
Thank you, Sunile, for such a quick response. Based on your answer, I can write something like this: ............................................ .....falconexample.DiagnosticReport1 d on (p.id = substr(d.subject.reference,9) and p.ds='2016-06-01-10') inner join falconexample.Observation1 o on (p.id = substr(o.subject.reference,9) and o.ds='2016-06-01-10' and d.ds='2016-06-01-10') Is that right?
06-01-2016
05:23 PM
1 Kudo
Hello everyone, I have posted a query below that uses 3 partitioned tables. All of them are external tables pointing to partition directories on HDFS, and every table is partitioned on the "ds" column (e.g. ds=2016-06-01-08, where ds=year-month-day-hour). Data lands on HDFS on an hourly basis, a directory is created with the timestamp above, and the tables point to all of those partition directories. My question is: since all the tables are partitioned, where should I write the where clause in the query below so that data is processed for one particular partition only? Please let me know.
Query:
INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive})
select p.id, p.gender, p.Age, p.birthdate, o.component[1].valuequantity.value, o.component[1].valuequantity.unit
from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age FROM falconexample.patient1) p
inner join falconexample.DiagnosticReport1 d on p.id = substr(d.subject.reference,9)
inner join falconexample.Observation1 o on p.id = substr(o.subject.reference,9)
where p.Age>17 and p.Age<86 and o.component[1].valuequantity.value <140;
Labels: Apache Hive
06-01-2016
12:56 PM
Hello guys, the error has been solved. I fixed it by adding the following statement to the Hive script, along with the query above:
add jar hdfs://<hostname>:8020//user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;