Member since: 05-07-2018
Posts: 331
Kudos Received: 45
Solutions: 35
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 7342 | 09-12-2018 10:09 PM |
| | 2909 | 09-10-2018 02:07 PM |
| | 9709 | 09-08-2018 05:47 AM |
| | 3225 | 09-08-2018 12:05 AM |
| | 4229 | 08-15-2018 10:44 PM |
06-18-2018
06:25 PM
Hey @Sami Ahmad! So, answering your questions: when you use the kinit -kt command you're passing a keytab file, and when you run kinit without -kt you authenticate with a password. In every kerberized environment I've worked in, the sysadmin usually provided a keytab file for my user or service. That's common practice, since most Hadoop components work much better with keytabs than with passphrases for Kerberos principals. In your case, if you don't have a keytab, the best approach would be to ask your sysadmin to generate one for you.
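To make the difference concrete, here is a minimal sketch; the principal, realm, and keytab path below are placeholders, not values from this thread:

```shell
# Password-based: kinit prompts for the principal's password
kinit myuser@EXAMPLE.COM

# Keytab-based: no prompt, credentials come from the keytab file
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM

# Verify that a ticket was obtained
klist
```

Either way, klist should then show a valid TGT for the principal.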
06-18-2018
04:26 PM
Hey @Sami Ahmad! Did you try adding the principal + keytab path to your JDBC connection string? jdbc:phoenix:<ZK-QUORUM>:<ZK-PORT>:<ZK-HBASE-NODE>:principal_name@REALM:/path/to/keytab Hope this helps!
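As a rough sketch of what that URL shape looks like in practice with the Phoenix sqlline client (the ZooKeeper hosts, znode, realm, and keytab path here are all made-up placeholders):

```shell
# URL shape: <zk-quorum>:<zk-port>:<zk-hbase-znode>:<principal>:<keytab>
sqlline.py "zk1,zk2,zk3:2181:/hbase-secure:myuser@EXAMPLE.COM:/etc/security/keytabs/myuser.keytab"
```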
06-18-2018
04:01 PM
1 Kudo
Hi @Dan Alan! Did you check the NiFi REST API? https://nifi.apache.org/docs/nifi-docs/rest-api/index.html Hope this helps
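For a quick first look at the API, something like the following works against an unsecured NiFi instance (the host and port are placeholders; adjust for your environment):

```shell
# Version and build info for the NiFi instance
curl -s http://nifi-host:8080/nifi-api/flow/about

# Status of the root process group
curl -s http://nifi-host:8080/nifi-api/flow/process-groups/root/status
```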
06-18-2018
03:58 PM
Hi @rama! How much is hive.exec.max.dynamic.partitions set to? BTW, could you check whether your Hive log is showing anything related to locks? If it is, try to unlock your tables (syntax e.g. UNLOCK TABLE <TABLE_NAME>;). And one last thing, just asking: are you using external tables? And are you running MSCK REPAIR TABLE <TABLE_NAME> after each batch? Hope this helps!
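The checks above can be run from the shell with the hive client; the table name here is a placeholder:

```shell
hive -e "set hive.exec.max.dynamic.partitions;"   # show the current limit
hive -e "SHOW LOCKS my_table;"                    # any locks held on the table?
hive -e "UNLOCK TABLE my_table;"                  # release a stuck lock
hive -e "MSCK REPAIR TABLE my_table;"             # re-sync partitions after a batch load
```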
06-18-2018
03:32 PM
Hey @Tsuyoshi Sanda! It looks like this 1001 user is setting this hiveconf manually (including hive.metastore.client.socket.timeout). If your concern is that a low value of hive.metastore.client.socket.timeout will impact your Hive sessions, don't worry about it: the command hive -e "set;" | grep -i hive.metastore.client.socket.timeout shows that Ambari is applying the correct value to your Hive sessions. To dig further into what's happening, I'd log in as 1001 and run pstree on those strange PIDs. What does seem weird is that the root/1001 users are running the same command at the end (hive + show databases) and logging in as ambari-qa. Try to grep for PID 2767 and, if you're able to, share the output with us, please. Hope this helps!
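Concretely, on the node where that PID shows up (2767 is the PID from this thread):

```shell
ps -fp 2767       # full command line and parent PID of the suspicious process
pstree -p 2767    # process tree around it
ps -f -u 1001     # everything currently running as UID 1001
```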
06-15-2018
11:04 PM
Hey @Anji Raju! I made a test here, and it seems that you need to change your Messages struct<Message: array<struct< to Messages array<struct<Message:struct< Here's my test:
#My sample.xml file
<Status>
<StatusCode>0</StatusCode>
<StatusDesc>Success</StatusDesc>
<ConfidenceIndex>D</ConfidenceIndex>
<Messages>
<Message>
<severity></severity>
<statuscode>0</statuscode>
<pagenumber>1</pagenumber>
<filename>Filxxxxe.pdf</filename>
<layoutfileid>xxxx</layoutfileid>
<layoutpageid>xxxxx</layoutpageid>
<layoutidentifertext></layoutidentifertext>
<text>xx xxxxxx xx</text>
</Message>
</Messages>
</Status>
#Downloading and adding the jar from ibm.spss
wget http://search.maven.org/remotecontent?filepath=com/ibm/spss/hive/serde2/xml/hivexmlserde/1.0.5.3/hivexmlserde-1.0.5.3.jar
mv remotecontent?filepath=com%2Fibm%2Fspss%2Fhive%2Fserde2%2Fxml%2Fhivexmlserde%2F1.0.5.3%2Fhivexmlserde-1.0.5.3.jar hivexmlserde-1.0.5.3.jar
#Creating the HDFS location for the table and putting the sample.xml above
hdfs dfs -mkdir /user/hive/warehouse/hive-xml-test
hdfs dfs -put sample.xml /user/hive/warehouse/hive-xml-test
#Starting the test in Hive
hive> add jar /tmp/hivexmlserde-1.0.5.3.jar;
create external table xmltest(
StatusCode string,
StatusDesc string,
ConfidenceIndex string,
Messages array<struct<Message:struct<
severity:string,
statuscode:string,
pagenumber:string,
filename:string,
layoutfileid:string,
layoutpageid:string,
layoutidentifertext:string,
text:string>>>
)
row format serde 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
with serdeproperties (
"column.xpath.StatusCode" = "/Status/StatusCode/text()"
,"column.xpath.StatusDesc" = "/Status/StatusDesc/text()"
,"column.xpath.ConfidenceIndex" = "/Status/ConfidenceIndex/text()"
, "column.xpath.Messages" = "/Status/Messages/Message" )
stored as inputformat 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
location '/user/hive/warehouse/hive-xml-test'
tblproperties ( "xmlinput.start" = "<Status>" ,"xmlinput.end" = "</Status>" );
hive> select * from xmltest;
OK
0 Success D [{"message":{"severity":null,"statuscode":"0","pagenumber":"1","filename":"Filxxxxe.pdf","layoutfileid":"xxxx","layoutpageid":"xxxxx","layoutidentifertext":null,"text":"xx xxxxxx xx"}}]
Time taken: 0.104 seconds, Fetched: 1 row(s)
Hope this helps!
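Once the table is defined this way, the nested fields can be pulled out individually with array-index and dot notation; a sketch (I haven't re-run this particular query, so treat it as illustrative):

```sql
-- Reach into the first Message struct of the Messages array
SELECT statuscode,
       messages[0].message.filename,
       messages[0].message.text
FROM xmltest;
```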
06-15-2018
09:43 PM
Hi @Tsuyoshi Sanda! Could you run the following command and check its output?
[hive@node1 ~]$ hive -e "set;" | grep -i hive.metastore.client.socket.timeout
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.4.0-91/0/hive-log4j.properties
hive.metastore.client.socket.timeout=1800s
And one thing that's puzzling me: who is this 1001 user? Do you have a hive user owning this process? Hope this helps!
06-15-2018
06:04 PM
Hey @Karthik Chandrashekhar! Sorry for my delay. Basically, I couldn't spot anything wrong in your configs. But one thing, about your disk-util-hadoop.png: could you check whether there is any other subdirectory under the /hadoop mount (besides /hadoop/hadoop/hdfs/data)?
du --max-depth=1 -h /hadoop/hadoop/
or
du --max-depth=1 -h /hadoop/
#And just to check the mountpoints
lsblk
I think your non-DFS usage is high because HDFS is counting other directories under /hadoop on the same disk. And one last thing: how much is dfs.datanode.du.reserved set to? Hope this helps!
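As a sanity check, non-DFS used is commonly derived from the other capacity numbers HDFS reports. A toy calculation with made-up numbers (all values in GB are illustrative, not from this thread):

```shell
total_disk=1000
reserved=100        # dfs.datanode.du.reserved
dfs_used=300
dfs_remaining=450

configured=$((total_disk - reserved))               # Configured Capacity
non_dfs=$((configured - dfs_used - dfs_remaining))  # everything on the disk outside HDFS blocks
echo "$non_dfs"
```

So anything else living on the same mount, such as extra subdirectories under /hadoop, shows up in that non-DFS number.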
06-13-2018
06:34 PM
1 Kudo
Hey @Vinay K! Does this user asif exist on all NodeManager/ResourceManager machines? And does it belong to the yarn group? Hope this helps!
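A quick way to verify, run on each NodeManager/ResourceManager host (asif is the user from this thread):

```shell
id asif       # errors if the user does not exist on this host
groups asif   # should list yarn among the groups
```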
06-13-2018
06:06 PM
Hey @Bhanu Pamu! I'm not sure I fully get it, but if you have /hadoop (non-DFS + DFS files) and you want to move them to /data, I guess the best choice would be to add /data to dfs.datanode.data.dir as well, then stop the DataNodes and move the files from /hadoop to /data. I'm not sure this is the best practice or whether there's another approach, but I'd certainly investigate this thought more before doing anything under HDFS. Hope this helps! 🙂
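A rough outline of what that could look like; the paths are guesses based on this thread, so verify against your distro's documentation before running anything:

```shell
# 1. Add /data/hadoop/hdfs/data to dfs.datanode.data.dir (comma-separated) via Ambari / hdfs-site.xml
# 2. Stop the DataNode, then move the block data preserving ownership:
mv /hadoop/hadoop/hdfs/data /data/hadoop/hdfs/data
chown -R hdfs:hadoop /data/hadoop/hdfs/data
# 3. Point dfs.datanode.data.dir at the new path only, then start the DataNode again
```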