Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2189 | 12-25-2018 10:42 PM |
| | 9980 | 10-09-2018 03:52 AM |
| | 3688 | 02-23-2018 11:46 PM |
| | 1408 | 09-02-2017 01:49 AM |
| | 1632 | 06-21-2017 12:06 AM |
06-27-2016
02:02 AM
Knox 0.9 is coming as part of the next version of HDP, due for release this summer. Multiple topologies are not supported in Ambari, but you can create additional ones manually.
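As a sketch of the manual route (the path below assumes an HDP-style install; adjust to your layout): Knox deploys one topology per XML file in its topologies directory, so copying an existing descriptor under a new name creates a new topology.

```
# Hypothetical sketch, assuming an HDP-style Knox install.
cd /usr/hdp/current/knox-server/conf/topologies
cp default.xml mytopology.xml      # then edit providers/services as needed

# Knox hot-deploys the new file; the topology name becomes part of the URL:
#   https://<knox-host>:8443/gateway/mytopology/webhdfs/v1/...
```

Keep in mind that Ambari will not manage or track a topology created this way.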
06-27-2016
01:49 AM
1 Kudo
Hi @João Souza, it's not a good idea to base your design on file names in HDFS. Use file names only in phase 1 of your processing flow (which you are already doing with "-tagFile"); after that, just treat your input as a data set. Using directories, as Emily suggested, is a much better idea, and is often used to partition data for MR jobs and Hive tables.
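To make the directory idea concrete, here is a minimal sketch (the base path and the `dt=` naming convention are illustrative assumptions): metadata you would otherwise encode in file names, such as an ingest date, goes into Hive-style partition directories instead.

```shell
# Hypothetical layout: one directory per day; MR jobs and Hive partitions
# can then select whole directories instead of parsing file names.
base=/user/it1/events
dt=2016-06-27
echo "${base}/dt=${dt}"      # -> /user/it1/events/dt=2016-06-27
# On a real cluster you would then create and load it, e.g.:
#   hdfs dfs -mkdir -p "${base}/dt=${dt}"
```

A Hive table can map such directories directly as partitions, so queries prune by date without ever looking at file names.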
06-26-2016
04:31 AM
3 Kudos
Hi @elan chelian. It works with the following changes (details here):
- TITLE must be STRING; it seems XmlSerDe doesn't support VARCHAR yet.
- PRICE must be declared as FLOAT or DOUBLE, not INT (e.g., 24.90).
- Your unit record of data is BOOK, not CATALOG.
- You are missing text() to capture specific values.

Declarations:
DROP TABLE IF EXISTS BOOKDATA;
CREATE EXTERNAL TABLE BOOKDATA (TITLE STRING, PRICE FLOAT)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  "column.xpath.TITLE"="/BOOK/TITLE/text()",
  "column.xpath.PRICE"="/BOOK/PRICE/text()")
STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/it1/hive/xml'
TBLPROPERTIES ("xmlinput.start"="<BOOK", "xmlinput.end"="</BOOK>");
Test:
hive> select * from BOOKDATA;
OK
Hadoop Defnitive Guide 24.9
Programming Pig 30.9
Time taken: 0.081 seconds, Fetched: 2 row(s)
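For reference, the two rows above would come from input shaped roughly like this (a hypothetical sample; your actual file may differ). Because `xmlinput.start`/`xmlinput.end` delimit on `<BOOK`/`</BOOK>`, each BOOK element becomes one row, and the xpath expressions pull TITLE and PRICE out of it:

```xml
<CATALOG>
  <BOOK>
    <TITLE>Hadoop Defnitive Guide</TITLE>
    <PRICE>24.90</PRICE>
  </BOOK>
  <BOOK>
    <TITLE>Programming Pig</TITLE>
    <PRICE>30.90</PRICE>
  </BOOK>
</CATALOG>
```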
06-25-2016
01:53 AM
1 Kudo
As mentioned in https://community.hortonworks.com/questions/37192/error-no-package-python27-available-while-setting.html, the tutorial has been corrected.
06-24-2016
08:02 AM
How about trying this: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-PythonClientDriver
06-23-2016
07:56 AM
Let me answer your follow-up questions:
- Metadata is not a simple copy; there are some per-cluster settings that have to be taken into account.
- Yes, Hive mirroring has to be "bootstrapped" using the Hive export/import feature, externally to Falcon.
- ACID usage is not widespread yet, and Falcon mirroring is "set it and forget it". Also, new features are coming in new versions of Falcon, and ACID will be supported before long.
- Falcon and Oozie jobs can be run on either cluster. I prefer to run them on the DR cluster, which isn't doing much anyway, instead of using the busy production (source) cluster.
06-23-2016
05:53 AM
If it helped you, can you please accept my answer, to help us manage the Q&A and help others find solutions to their problems. Tnx!
06-23-2016
03:20 AM
2 Kudos
I recently moved a journal node from one cluster machine to another by following the procedure given here: https://community.hortonworks.com/questions/4272/process-for-moving-hdp-services-manually.html, informing Ambari about adding a 4th JN, and then deleting the one I wanted to move. It's similar to the article you found, but a little bit simpler. The Ambari API calls are given in the article you found.
06-23-2016
02:10 AM
Yes, I can confirm your findings: page 9.1 appears to be hidden. By the way, the PDF is fine; that extra page is not there.
06-23-2016
01:00 AM
4 Kudos
"rollingUpgrade rollback" on NN1 means that both the HDFS software and the data in HDFS will be reverted to the state before the rolling upgrade started. Any changes to HDFS (files added or deleted) will be lost. NN1 will become the active NN. "bootstrapStandby" on NN2 means that metadata from NN1 will be copied to NN2; after startup, NN2 will become the standby NN.

What happens behind the scenes: when you start the rolling upgrade with "rollingUpgrade prepare", a copy of the NN metadata (FSImage) called "previous" is created. It consists of hard links to the "current" FSImage. When you do "rollingUpgrade rollback", the "current" FSImage is replaced by "previous"; that's why all HDFS changes are lost. If you want to keep the changes, use "rollingUpgrade downgrade" instead: it downgrades only the software, keeping the HDFS image intact. You can find more details here.
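The steps above correspond roughly to these commands (a sketch for an HA cluster; exact startup options can vary by Hadoop version, so check the docs for yours):

```
# Prepare: creates the "previous" FSImage (hard links to "current")
hdfs dfsadmin -rollingUpgrade prepare

# Roll back software AND metadata on NN1 (discards post-upgrade HDFS changes)
hdfs namenode -rollingUpgrade rollback

# Re-sync NN2's metadata from NN1 so it can come up as the standby
hdfs namenode -bootstrapStandby

# Alternative to rollback: downgrade software only, keep the HDFS image
hdfs namenode -rollingUpgrade downgrade
```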