Member since: 09-24-2015
Posts: 816
Kudos Received: 488
Solutions: 189
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2189 | 12-25-2018 10:42 PM |
| | 9980 | 10-09-2018 03:52 AM |
| | 3688 | 02-23-2018 11:46 PM |
| | 1408 | 09-02-2017 01:49 AM |
| | 1632 | 06-21-2017 12:06 AM |
06-27-2016
02:02 AM
Knox 0.9 is coming as part of the next version of HDP, due for release this summer. Multiple topologies are not supported in Ambari, but you can create additional ones manually.
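As a sketch of the manual route (the path below assumes an HDP-style install; adjust to your layout): Knox deploys one topology per XML file in its topologies directory, so copying an existing descriptor under a new name creates a new topology.

```
# Hypothetical sketch, assuming an HDP-style Knox install.
cd /usr/hdp/current/knox-server/conf/topologies
cp default.xml mytopology.xml      # then edit providers/services as needed

# Knox hot-deploys the new file; the topology name becomes part of the URL:
#   https://<knox-host>:8443/gateway/mytopology/webhdfs/v1/...
```

Keep in mind that Ambari will not manage or track a topology created this way.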
06-27-2016
01:49 AM
1 Kudo
Hi @João Souza, it's not a good idea to base your design on file names in HDFS. Use file names only in phase 1 of your processing flow (which you are already doing with "-tagFile"); after that, just treat your input as a data set. Using directories, as Emily suggested, is a much better idea, and is often used to partition data for MR jobs and Hive tables.
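To make the directory idea concrete, here is a minimal sketch (the base path and the `dt=` naming convention are illustrative assumptions): metadata you would otherwise encode in file names, such as an ingest date, goes into Hive-style partition directories instead.

```shell
# Hypothetical layout: one directory per day; MR jobs and Hive partitions
# can then select whole directories instead of parsing file names.
base=/user/it1/events
dt=2016-06-27
echo "${base}/dt=${dt}"      # -> /user/it1/events/dt=2016-06-27
# On a real cluster you would then create and load it, e.g.:
#   hdfs dfs -mkdir -p "${base}/dt=${dt}"
```

A Hive table can map such directories directly as partitions, so queries prune by date without ever looking at file names.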
06-26-2016
04:31 AM
3 Kudos
Hi @elan chelian. It works with the following changes (details here):
- TITLE must be STRING; it seems XmlSerDe doesn't support VARCHAR yet.
- PRICE must be declared as FLOAT or DOUBLE, not INT (e.g., 24.90).
- Your unit record of data is BOOK, not CATALOG.
- You are missing text() to capture specific values.

Declarations:
DROP TABLE IF EXISTS BOOKDATA;
CREATE EXTERNAL TABLE BOOKDATA (TITLE STRING, PRICE FLOAT)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  "column.xpath.TITLE"="/BOOK/TITLE/text()",
  "column.xpath.PRICE"="/BOOK/PRICE/text()")
STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/it1/hive/xml'
TBLPROPERTIES ("xmlinput.start"="<BOOK", "xmlinput.end"="</BOOK>");
Test:
hive> select * from BOOKDATA;
OK
Hadoop Defnitive Guide 24.9
Programming Pig 30.9
Time taken: 0.081 seconds, Fetched: 2 row(s)
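For reference, the two rows above would come from input shaped roughly like this (a hypothetical sample; your actual file may differ). Because `xmlinput.start`/`xmlinput.end` delimit on `<BOOK`/`</BOOK>`, each BOOK element becomes one row, and the xpath expressions pull TITLE and PRICE out of it:

```xml
<CATALOG>
  <BOOK>
    <TITLE>Hadoop Defnitive Guide</TITLE>
    <PRICE>24.90</PRICE>
  </BOOK>
  <BOOK>
    <TITLE>Programming Pig</TITLE>
    <PRICE>30.90</PRICE>
  </BOOK>
</CATALOG>
```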
06-25-2016
01:53 AM
1 Kudo
As mentioned in https://community.hortonworks.com/questions/37192/error-no-package-python27-available-while-setting.html, the tutorial has been corrected.
06-24-2016
08:02 AM
How about trying this: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-PythonClientDriver
06-23-2016
07:56 AM
Let me answer your follow-up questions:
- Metadata is not a simple copy; there are some per-cluster settings that have to be taken into account.
- Yes, Hive mirroring has to be "bootstrapped" using the Hive export/import feature, externally to Falcon.
- ACID usage is not widespread yet, and Falcon mirroring is "set it and forget it". Also, new features are coming in new versions of Falcon, and ACID will be supported before long.
- Falcon and Oozie jobs can be run on either cluster. I prefer to run them on the DR cluster, which isn't doing much anyway, instead of using the busy production (source) cluster.
06-23-2016
05:53 AM
If it helped you, can you please accept my answer, to help us manage the Q&A and help others find solutions to their problems. Tnx!
06-23-2016
03:20 AM
2 Kudos
I recently moved a journal node from one cluster machine to another by following the procedure given here: https://community.hortonworks.com/questions/4272/process-for-moving-hdp-services-manually.html, informing Ambari about adding a 4th JN, and then deleting the one I wanted to move. It's similar to the article you found, but a little bit simpler. The Ambari API calls are given in the article you found.
06-23-2016
02:10 AM
Yes, I can confirm your findings: page 9.1 appears to be hidden. By the way, the PDF is fine; that extra page is not there.
06-23-2016
01:00 AM
4 Kudos
"rollingUpgrade rollback" on NN1 means that both the HDFS software and the data in HDFS will be reverted to the state before the rolling upgrade started. Any changes to HDFS (files added or deleted) will be lost. NN1 will become the active NN. "bootstrapStandby" on NN2 means that metadata from NN1 will be copied to NN2; after startup, NN2 will become the standby NN.

What happens behind the scenes: when you start the rolling upgrade with "rollingUpgrade prepare", a copy of the NN metadata (FSImage) called "previous" is created. It consists of hard links to the "current" FSImage. When you do "rollingUpgrade rollback", the "current" FSImage is replaced by "previous"; that's why all HDFS changes are lost. If you want to keep the changes, use "rollingUpgrade downgrade" instead: it downgrades only the software, keeping the HDFS image intact. You can find more details here.
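The steps above correspond roughly to these commands (a sketch for an HA cluster; exact startup options can vary by Hadoop version, so check the docs for yours):

```
# Prepare: creates the "previous" FSImage (hard links to "current")
hdfs dfsadmin -rollingUpgrade prepare

# Roll back software AND metadata on NN1 (discards post-upgrade HDFS changes)
hdfs namenode -rollingUpgrade rollback

# Re-sync NN2's metadata from NN1 so it can come up as the standby
hdfs namenode -bootstrapStandby

# Alternative to rollback: downgrade software only, keep the HDFS image
hdfs namenode -rollingUpgrade downgrade
```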