Member since: 11-24-2015
Posts: 56
Kudos Received: 58
Solutions: 4
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 3112 | 05-21-2016 02:32 PM
 | 5254 | 04-26-2016 05:22 AM
 | 12860 | 01-15-2016 06:23 PM
 | 13684 | 12-24-2015 04:52 PM
02-12-2016
08:23 PM
1 Kudo
+1 for shameless plugs. Nice article and thank you for it!
02-12-2016
08:22 PM
1 Kudo
@Benjamin Leonhardi, Nice writeup. Thank you for taking the time to be so thorough!!! All the links were very helpful, too. You read my mind on the Yahoo performance link - that was the next topic I was going to research. 🙂 A couple of follow-up clarification questions/comments...

Q1 - Chris' blog (thanks @Chris Nauroth) answered the remaining point. Only the "edits in progress" changes need to be applied to the fsimage by the failover NN when it takes over; all the completed edits on the JNs should have already been applied.

Q2 - So, my focus was on the JNs writing to each of their own disks, and I completely missed the point that the NN needs some place to build the fsimage file. So, would you just point that to the same disk used by the JNs (assuming that I am going to colocate the JNs on the same hosts as the NNs)?

Q3 - I was thinking more about a disk failure, where a failed disk means a failed JN. So separate disks for each JN means more reliability. Do you recommend some other arrangement?

Q5 - "So he writes a checkpoint regularly and distributes it to the active namenode." You mean "distributes it" through the JNs in the normal HA manner of publishing edits, right, or am I missing something here?
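To make Q2/Q3 concrete, here's a minimal hdfs-site.xml sketch of the layout I'm picturing on a host that colocates a NN and a JN; the mount points /data/nn and /data/jn are placeholders I made up for two separate physical disks:

<!-- Sketch only: /data/nn and /data/jn are assumed mount points on separate disks -->
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- local directory where the NN keeps its fsimage checkpoints -->
  <value>/data/nn</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <!-- local directory where this host's JN persists the shared edit log -->
  <value>/data/jn</value>
</property>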
02-11-2016
09:08 PM
6 Kudos
I am trying to put together a hardware specification for NameNodes running in HA mode. That made me have to think about disk allocation for NameNodes. I pretty much get it with non-HA: use one RAID drive and another NFS mount for redundancy, the SNN incrementally applies changes in the edit log to the fsimage, etc. But I want to run HA, and I want to use JournalNodes (JNs) and the Quorum Journal Manager (QJM) approach. That made me think about this scenario, and I was not sure I was getting it right and wanted to ask some gurus for input. Here's what I think... Can you please confirm or correct? I think a scenario-type question will help me more easily ask the questions, so here goes.

Assume a clean install. Primary and failover NNs both have empty fsimage files. The primary starts running and writing changes to all three JNs. As I understand it, the failover NN will be reading all those changes, via the JNs, and applying them to its empty fsimage to prepare it to be 100% complete should it be called to take over (faster startup time). Now the primary fails. The failover NN starts up, reads in the fsimage file, and starts accepting client requests as normal. It now starts to write edits to the JNs. But the formerly primary NN is still down, so it is NOT reading updates from the JNs. So, its fsimage remains empty, essentially. Next, I fix the formerly primary NN and start it up. It now becomes the failover NN. At this point, I guess it starts reading changes from the JNs and building up its empty fsimage with all changes to date, in hopes that it will once again rule the world and become active should the other NN fail some day.

Q1 - Is it true that the failover NN will NEVER have to apply any edit log changes at startup, but simply loads its fsimage and starts running, because it assumes the fsimage is already 100% up to date via recent JN reads?

Q2 - In a setup with 3 JNs as a quorum, what should the disk layout look like on the three servers hosting those JNs? Because the edits are now distributed x3, should I just have a single disk per JN host dedicated to the JNs? No need for the one-RAID-plus-NFS arrangement used in non-HA mode? Specifically, the disk resources typically used by a non-HA NN, where the NN writes edit log changes, now become disk resources used exclusively by the JNs, right? Meaning, the NNs never read/write anything directly to disk (except for configuration, I assume) but rather ALL goes through the JNs.

Q3 - I believe I still should have one dedicated disk for each JN on each host to isolate that unique workload from other processes. So, for example, there might be one disk for the OS, one for the JNs, and another for the ZK instances that are sharing the same server to support the ZKFC. Correct?

Q4 - Because JNs are distributed, it makes me think I should treat these disks like I do disks on the DNs, meaning no RAID, just plain old JBOD. Does that sound right?

Q5 - Is it the NN on the failover server that actually does the JN reads and fsimage updates now in HA mode, given that there is no SNN in such a configuration?

Thanks in advance for confirmation or any insight on this...
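For reference, here's the rough hdfs-site.xml wiring I have in mind for the QJM piece of this scenario; the JN host names (jn1/jn2/jn3) and the nameservice id (mycluster) are made-up placeholders, and 8485 is just the default JournalNode port:

<!-- Sketch only: jn1/jn2/jn3 and mycluster are placeholder names -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <!-- the active NN writes edits to this quorum; the standby NN tails them from it -->
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <!-- failover driven by the ZKFCs mentioned in Q3 -->
  <value>true</value>
</property>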
Labels:
- Apache Hadoop
02-09-2016
04:42 PM
https://community.hortonworks.com/questions/15422/hive-and-avro-schema-defined-in-tblproperties-vs-s.html
02-09-2016
04:41 PM
4 Kudos
I built a DWH-style application on Hive that is centered on Avro to handle schema evolution. My source data is CSV, and the files change when new releases of the applications are deployed (like adding more columns, removing columns, etc.). So, I decided to convert everything to Avro on ingest and then strictly deal with Avro inside Hadoop. For most of my data, I convert to Avro, drop it into HDFS into date partitions, and allow users to access that data using external Hive tables. For some of these CSV files, there are up to 6 versions in flight at once (each defined using a separate Avro schema), all landing in the same Hive table. So, like you, I wanted to be able to query over all the data no matter what the version. The approach seems to be working well.

I did, however, want to chime in on TBLPROPERTIES. I just posted a question related to this. It seems like we should be using SERDEPROPERTIES, not TBLPROPERTIES, when defining the URL to the schema file. All my Avro schema files are in an HDFS directory - I did not want to use the literal approach of defining the schema inline in the TBLPROPERTIES. I was creating my Hive tables using TBLPROPERTIES pointing to the URL of the newest schema, which is defined with defaults and properly defined to be a superset of all earlier schemas, allowing them all to coexist in the same Hive table. However, I recently tried to build a Spark SQL application using HiveContext to read these tables and was surprised to find that Spark threw Avro exceptions. Creating the table using SERDEPROPERTIES to define the .avsc URL was the solution that made the data accessible from both Hive and Spark. Just thought I would mention it to save you some hassles down the road if you ever need Spark SQL access to that data.
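In case it helps, this is roughly the DDL shape that ended up working for me from both Hive and Spark; the table name, LOCATION, and .avsc path below are placeholders, not my real ones:

-- Sketch only: table name, LOCATION, and the schema path are placeholders
CREATE EXTERNAL TABLE events_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES ('avro.schema.url'='hdfs:///schemas/events_v6.avsc')
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/data/events_avro';

The column list comes from the .avsc file itself, which is what lets the newest superset schema cover the older file versions.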
01-19-2016
08:41 PM
1 Kudo
Interestingly, I just upgraded Ambari from 2.1 to 2.2 as part of my upgrade plans and the Hive service check now passes. The stack trace does show Ambari running various command scripts that implement this check.
01-19-2016
02:20 AM
1 Kudo
Well, I solved it another way - I just upgraded to Ambari 2.2. I will watch to see if this comes back as I run with this version and open a ticket if that happens. I was in the midst of upgrading anyway when this started to happen; I need to move to HDP 2.3.4 and Spark 1.5. But, 8 GB, really? Wow! That seems ridiculously expensive for a monitoring framework.
01-18-2016
08:44 PM
My installation differs from that link's instructions, but I did not change these from what the base install of HDP created. Why the discrepancy? For "/apps/hive", your link says:
hdfs dfs -chown -R $HIVE_USER:$HDFS_USER /apps/hive
hdfs dfs -chmod -R 775 /apps/hive
My setup is:
drwxr-xr-x 3 hdfs hdfs 96 Sep 30 15:09 hive
For "/tmp/hive", your link says:
hdfs dfs -chmod -R 777 /tmp/hive
My setup is:
drwx-wx-wx 11 ambari-qa hdfs 352 Jan 15 22:13 hive
01-18-2016
07:49 PM
I'm working through this procedure to upgrade Ambari from 2.1.1 to 2.2.0 before I start an HDP upgrade from 2.3.0 to 2.3.4: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.0.0/bk_upgrading_Ambari/content/_preparing_to_upgrade_ambari_and_hdp.html

It says to run the service checks on installed components first. They all passed except the Hive check, which fails with this access error:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=ambari-qa, access=WRITE, inode="/apps/hive/warehouse":hive:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)

The /apps/hive/warehouse user:group is set to hive:hdfs. The ambari-qa user that runs the service checks, on the node where Ambari says the checks are run, is ambari-qa:hadoop:

[jupstats@vmwhadnfs01 hive]$ id ambari-qa
uid=1001(ambari-qa) gid=501(hadoop) groups=501(hadoop),100(users)

So, ambari-qa is a member of the hadoop group, but that group has no write permission to Hive managed tables, which are owned by hive and allow read access only to hdfs users. I'm not sure what the Ambari service check is trying to do, but it is clearly trying to write something in that managed table space. As I understand it, the Hadoop superuser is the user that starts the NameNode, user "hdfs" in my case. So, my questions are:

1. Should "hdfs" really be the group for /apps/hive/warehouse, or would it be better to have that be the "hadoop" group?
2. What are the best practice recommendations for the user:group permissions on /apps/hive/warehouse? For example, I have some Java and Python apps that run every 30 minutes to ingest data into Hive managed and external tables. Those processes run as a service user "jupstats" and group "ingest". My /apps/hive/warehouse/jupstats.db directory is where the managed tables live, and that directory is set to jupstats:ingest to restrict access appropriately (see the sketch at the end of this post). This seems right to me. Do you experts agree? Same for the directories where I also write some HDFS data that is accessed by external Hive tables; those files are owned as jupstats:ingest.
3. I think I am generally lacking knowledge in how best to set up access to the various Hive tables that will eventually need to be accessed by various users. My thought was that all my jupstats.db tables, which are readable only by group ingest, would be made readable to those users by adding them to the "ingest" group. Does that approach seem reasonable?
4. This still leaves me with the question of how do I set up Hive so that this Ambari service check can pass? Should I add ambari-qa to the "hdfs" group? That feels wrong and dangerous, in that it is like adding ambari-qa to a root-like account, since user "hdfs" is the Hadoop superuser and can whack a lot of stuff.

Thanks for any help/tips on this...
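To make questions 2 and 3 concrete, here is roughly what I have in place today; jupstats/ingest are my real service user and group, but the mode bits and the "someanalyst" user are just illustrative choices on my part, not from any doc:

# Sketch only: the mode bits and the someanalyst user are illustrative
hdfs dfs -chown -R jupstats:ingest /apps/hive/warehouse/jupstats.db
hdfs dfs -chmod -R 750 /apps/hive/warehouse/jupstats.db   # owner rwx, group r-x, no world access
# run as root: grant a reader access by adding them to the ingest group (question 3)
usermod -a -G ingest someanalyst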
Labels:
- Apache Ambari
- Apache Hive
01-18-2016
06:15 PM
@Neeraj Sabharwal Hey Neeraj! I'm having this same issue now, too. Do you have any input on how support troubleshot it for you, or do I need to hit them up with a ticket as well to get an answer?