Member since: 06-24-2016
Posts: 111
Kudos Received: 8
Solutions: 0
04-18-2017
01:26 AM
Then, if I turn on the ACID Transactions option, do I also need to install the Hive Standalone Metastore?
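For context, these are the Hive settings I assume are involved when ACID is turned on (this is only my understanding, so please correct me if any of them are wrong or unnecessary):

hive.support.concurrency=true
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on=true
hive.compactor.worker.threads=1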
03-25-2017
03:19 PM
Here's my Hadoop cluster information, on CentOS 6.7.

/etc/hosts:
10.10.1.10 cm.hdp.com cm
10.10.1.11 nn01.hdp.com nn01
10.10.1.12 nn02.hdp.com nn02
10.10.1.13 yarn.hdp.com yarn
10.10.1.14 dn01.hdp.com dn01
10.10.1.15 dn02.hdp.com dn02
10.10.1.16 dn03.hdp.com dn03

Taking the node nn01.hdp.com as an example, I'm wondering about the HOSTNAME value in /etc/sysconfig/network. The following link recommends setting HOSTNAME to the FQDN: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-installation/content/edit_the_network_configuration_file.html Why should HOSTNAME be changed to the FQDN? I think the HOSTNAME value should be the short hostname rather than the FQDN, like this: HOSTNAME=nn01
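For reference, this is what I have now on nn01 versus what the doc seems to recommend (the second block is only my reading of the doc, not something I have applied yet):

/etc/sysconfig/network (what I have now):
HOSTNAME=nn01

/etc/sysconfig/network (what the doc recommends, as I read it):
HOSTNAME=nn01.hdp.com

# either way, I would expect the FQDN to resolve via /etc/hosts:
hostname -f
# expected output: nn01.hdp.com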
Labels:
- Apache Ambari
- Apache Hadoop
11-21-2016
01:29 AM
The --split-by option can be used on a text column after adding the relevant property to sqoop-site.xml in Ambari, or by adding that option on the command line, like this. I think the Oracle record count is not what determines the split file sizes, because the actual file size depends on the column count, the column types, and the size of the column values in each record. And here are my Sqoop import results, which I found interesting.

Total file size: 2.2 GB

sqoop import ... --direct --fetch-size 1000 --num-mappers 10 --split-by EMP_NO (TEXT)
Result: 0 bytes each for 3 mappers, and 1.1 GB for 1 mapper.

Re-tested with the same values except for the option below:
--split-by REEDER_ID (NUMBER)

In my opinion, Sqoop mappers only parallelize the processing without regard to the output file size of the selected Oracle records, so the file sizes are not split evenly. Also, --split-by with a NUMBER-type column is more useful than a TEXT-type column, which is not accurate for the split file sizes.
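My rough understanding (not verified against the Sqoop source) is that Sqoop first queries the minimum and maximum of the --split-by column and then cuts that range into --num-mappers pieces, which only gives even ranges for numeric columns; a text column like EMP_NO can easily produce empty or heavily skewed ranges. The quickest way I found to see the skew per mapper is to check the part files in the target directory (the path below is just an example):

hdfs dfs -du -h /user/hdfs/emp_import
# one part-m-* file per mapper; with --split-by EMP_NO (TEXT) I got 0-byte files
# next to a single ~1.1 GB file, while --split-by REEDER_ID (NUMBER) was much more even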
11-16-2016
01:12 AM
That parameter, --direct-split-size, is only for PostgreSQL, if I'm right, because I already tested it and it's not working. See my result. The total table size is 2.3 GB.

sqoop import ... --num-mappers 4 --direct-split-size 600000000
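In case it helps anyone reproduce this, I checked it roughly like this (the target directory is just an example path):

# list the import arguments; --direct-split-size is documented for direct-mode imports
sqoop help import
# compare the sizes of the generated part files against the 600000000-byte boundary
hdfs dfs -du -h /user/hdfs/my_import_dir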
11-15-2016
12:33 AM
Sqoop version 1.4.6 on HDP 2.5.0.0, Oracle 11g. The select query result size is about 2.3 GB.

sqoop import .... --num-mappers 4 --split-by STR ...

Result: I think the mappers option is not related to producing files of the same size. I want to split the output into files of about 570 MB each, but Sqoop's parameters don't seem to support that feature. Are there other options or tips for controlling the output file size?
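For what it's worth, here is my rough arithmetic (just a back-of-the-envelope sketch): if the splits were perfectly even, the mapper count alone would already give roughly the file size I want, so the real problem seems to be the uneven split ranges on the STR column rather than the number of mappers.

total select size  ≈ 2.3 GB ≈ 2355 MB
target file size   ≈ 570 MB
mappers needed     ≈ 2355 / 570 ≈ 4.1  -> --num-mappers 4 (or 5)
# but with --split-by on a text column the 4 ranges are not equal, so the part files are not ~570 MB each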
Labels:
- Apache Sqoop
11-09-2016
02:00 AM
I think the SmartSense feature is not free for general users, at least if you want to use the full set of SmartSense menu services in Ambari. Am I right?
11-09-2016
01:13 AM
Where is the Tools or SmartSense tab in the support portal? I can't see it.
10-04-2016
01:28 AM
Thanks Ashnee. I didn't notice that.
09-30-2016
01:09 PM
Here's my ambari-server web UI. It shows two stack versions (2.4.2.0 | 2.5.0.0).
I tried to upgrade HDP-2.4.2.0 to HDP-2.5.0.0, but it failed for some reasons. Then, after restarting ambari-server, I got this weird issue. Of course, I registered the correct stack version info before the upgrade. I feel the ambari-server web UI lacks a delete feature for registered stack versions. How can I delete a registered stack version? For example, with the curl REST API, or by deleting it from the Ambari DB? I hope the newest ambari-server version supports deleting stack versions in the ambari-server web UI.
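Something like the following is what I had in mind for the REST route (I have not verified this exact endpoint on my cluster; the host is my Ambari server and <id> is just a placeholder taken from the first call):

# list the repository versions registered for the HDP 2.5 stack
curl -u admin:admin -H 'X-Requested-By: ambari' -X GET http://cm.hdp.com:8080/api/v1/stacks/HDP/versions/2.5/repository_versions
# then try deleting the one that was registered for the failed upgrade
curl -u admin:admin -H 'X-Requested-By: ambari' -X DELETE http://cm.hdp.com:8080/api/v1/stacks/HDP/versions/2.5/repository_versions/<id>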
Labels:
07-15-2016
02:32 AM
1 Kudo
System: HDP 2.4.2.0, Ambari 2.2.2.0
Machines: 5 DataNode servers out of 10 servers
DataNode server disk volume info: 3 TB x 10

DataNode directories in the Ambari web UI (only Disk1 (/data1) has two datanode directories per DataNode server):
Disk Volume1: /data1/hadoop/hdfs/data, /data1/crash/hadoop/hdfs/data -> I don't know why the crash directory was added only on the Disk1 volume (/data1)
Disk Volume2: /data1/hadoop/hdfs/data
....
Disk Volume5: /data1/hadoop/hdfs/data /data1/crash/hadoop/hdfs/data

Here are my questions.

Q1. When I put large data into a certain HDFS directory (/dataset), I'm wondering about the HDFS replication policy across each server's disks. For example, with dfs replication count = 3 and a certain 10 GB file in HDFS's /dataset, how are that file's block replicas distributed across the disk volumes of the DataNodes?
Case1. Block pool (blk_..., blk_...meta) -> Datanode1 - disk1 | Datanode8 - disk2 | Datanode3 - disk6 ......
Case2. Block pool (blk_..., blk_...meta) -> Datanode1 - disk1 | Datanode2 - disk1 | Datanode7 - disk1 ...... | Datanode$ - disk1
Which one is the correct HDFS replication and distribution policy for stored data? Is Case2 possible?

Q2. How can I safely move one datanode directory (/data1/crash/hadoop/hdfs/data) to the same datanode directory on another disk volume (2~5), or to another datanode directory on disk (2~5)? Disk1 (/data1) is hitting the disk-full issue faster than the other disks, because up to double the amount of block data & meta files is stored in Disk1's /data1 directory. So I need to know how to handle the data in the datanode directory /data1/crash/hadoop/hdfs/data before removing that path in Ambari - HDFS - Configs - Settings - DataNode directories.
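For Q2, my tentative plan (just a sketch of what I'm considering, not a verified procedure) is to avoid moving block files by hand and instead let HDFS re-replicate: check block health first, remove /data1/crash/hadoop/hdfs/data from the DataNode directories config for one DataNode at a time, restart that DataNode, and wait for under-replicated blocks to recover before touching the next node.

# check block health and replication state before changing anything
hdfs fsck / | grep -E 'Under-replicated|Missing'
hdfs dfsadmin -report
# after removing the directory for one DataNode in Ambari and restarting that DataNode,
# watch the under-replicated count return to 0 before moving on to the next node
hdfs fsck / | grep 'Under-replicated'

Please tell me if that approach is wrong, or if manually moving the block and meta files (with the DataNode stopped and the directory structure preserved) is safer.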
Labels:
- Apache Hadoop