Member since 06-17-2016 · 4 Posts · 0 Kudos Received · 0 Solutions
03-22-2018 08:24 PM
The size of the single file is 2.4 GB, and the table has more than 22 million records. My cluster is on HDP 2.6 and consists of 16 nodes (96 GB of memory each). I can't increase the number of mappers. I set the following:

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set tez.grouping.min-size=16777216;   (16 MB)
set tez.grouping.max-size=107374182;  (~107 MB)
set hive.optimize.index.filter=true;
set use.hive.interactive.mode=true;

I'm executing the following query:

select PropertyType, count(*) as count from houses group by PropertyType;

but each time Hive on Tez creates just ONE mapper, which is why the query takes so long (135 s, with 95% of the time spent in the map task). I'm using the Beeline interface, but even the Hive CLI gives the same result.

----------------------------------------------------------------------------
INFO : Dag name: select PropertyType, count(*)...PropertyType(Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1521676736844_0019)
--------------------------------------------------------------------------------
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED
--------------------------------------------------------------------------------
Map 1 .......... llap SUCCEEDED 1 1 0 0 0
Reducer 2 ...... llap SUCCEEDED 59 59 0 0 0
--------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 131.76 s
--------------------------------------------------------------------------------
INFO : Status: DAG finished successfully in 131.65 seconds
INFO :
INFO : Query Execution Summary
INFO : ----------------------------------------------------------------------------------------------
INFO : OPERATION DURATION
INFO : ----------------------------------------------------------------------------------------------
INFO : Compile Query 0.95s
INFO : Prepare Plan 0.38s
INFO : Submit Plan 0.36s
INFO : Start DAG 0.47s
INFO : Run DAG 131.65s
INFO : ----------------------------------------------------------------------------------------------
INFO : ...
INFO : OK
So, how can I increase the number of mapper tasks? I'd highly appreciate any help.
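For intuition, here is a rough back-of-the-envelope sketch (plain Python, not the actual Tez grouping code, which also weighs wave count and data locality) of how the `tez.grouping.min-size` / `tez.grouping.max-size` bounds set above would translate into a mapper count for a 2.4 GB input. This only holds if the input is actually splittable; a single gzip-compressed text file, for example, always produces one split and therefore one mapper regardless of these settings.

```python
import math

# Rough estimate of how many grouped splits the tez.grouping size
# bounds allow for a given input. This mirrors the idea behind
# tez.grouping.min-size / tez.grouping.max-size, not Tez internals.

def split_count_bounds(total_bytes, min_group, max_group):
    """Return (fewest, most) grouped splits the size bounds permit."""
    fewest = math.ceil(total_bytes / max_group)  # largest allowed groups
    most = math.ceil(total_bytes / min_group)    # smallest allowed groups
    return fewest, most

total = 2_576_980_378  # ~2.4 GiB, rounded to whole bytes
lo, hi = split_count_bounds(total, 16_777_216, 107_374_182)
print(lo, hi)  # prints: 25 154
```

So with a splittable file these settings should yield on the order of 25 to 154 mappers; getting exactly one strongly suggests the input format (or compression codec) is producing a single unsplittable split.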
Labels:
- Apache Hive
- Apache Tez
12-22-2017 09:31 PM
Hello, I have the same problem with the Metrics Collector process:

Connection failed: [Errno 111] Connection refused to dnode01.cedar.cluster.ada:6188

NTP is enabled on all 16 nodes, but there is no positive effect. I'm not sure how the status of NTP on the browsing host (which is not part of the cluster) could affect the operation of the cluster. Can you help with my case? Please let me know if you need more details.
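Before digging into NTP, it can help to confirm whether the Metrics Collector timeline port (6188 in the error above) is reachable at all. A minimal probe, assuming you run it from the Ambari server host and then from a data node to see whether the refusal is network-wide or local:

```python
import socket

# Minimal TCP reachability probe for the Metrics Collector timeline
# port from the error message. "Connection refused" usually means the
# collector process is not listening, not a network or NTP issue.

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hostname taken from the error message above):
# port_open("dnode01.cedar.cluster.ada", 6188)
```

If this returns False from every host, the collector itself is likely down or failed to start, which matches the install errors reported below.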
12-12-2017 11:59 AM
Thank you for the response. 1) The NodeManager is running. 2) That folder does not exist, neither on the master nor on the slaves. Do you think it will help if I create it? Additionally, the Ambari UI shows Metrics Collector Process: Connection failed: [Errno 111] Connection refused to....
12-12-2017 10:59 AM
I'm trying to install HDP 2.6 on 16 nodes (2 master nodes and 14 slaves) using the Ambari wizard. After many challenges the install completed, but not entirely successfully: all nodes end up orange with "Warnings encountered". All slave nodes have only the warning related to NodeManager Start. I'd highly appreciate any help fixing this issue. Here are some outputs and error messages from different nodes:

==== MasterNode1

--- Check HDFS
stdout: /var/lib/ambari-agent/data/output-189.txt (last notice)
2017-12-12 04:10:41,598 - HdfsResource[None] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/2.6.3.0-235/hadoop/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://nnode.cedar.cluster.ada:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/2.6.3.0-235/hadoop/conf', 'immutable_paths': [u'/mr-history/done', u'/app-logs', u'/tmp']}
Command completed successfully!

--- Grafana Start
Errors and Output files empty

==== MasterNode2

--- Metrics Collector Start
stderr: /var/lib/ambari-agent/data/errors-174.txt (last notice)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 120, in action_create
raise Fail("Applying %s failed, parent directory %s doesn't exist" % (self.resource, dirname))
resource_management.core.exceptions.Fail: Applying File['/usr/lib/ams-hbase/bin/hadoop'] failed, parent directory /usr/lib/ams-hbase/bin doesn't exist
--- Activity Analyzer Start
Errors and Output files empty

--- Activity Explorer Start
Errors and Output files empty

--- Check MapReduce2
Errors and Output files empty

==== All DataNodes (same warning on all 14 of them)

--- NodeManager Start
stderr: /var/lib/ambari-agent/data/errors-181.txt
Command aborted. Reason: 'Server considered task failed and automatically aborted it'
stdout: /var/lib/ambari-agent/data/output-181.txt (last notice)
2017-12-12 04:10:41,326 - Execute['ulimit -c unlimited; export HADOOP_LIBEXEC_DIR=/usr/hdp/2.6.3.0-235/hadoop/libexec && /usr/hdp/2.6.3.0-235/hadoop-yarn/sbin/yarn-daemon.sh --config /usr/hdp/2.6.3.0-235/hadoop/conf start nodemanager'] {'not_if': 'ambari-sudo.sh -H -E test -f /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid && ambari-sudo.sh -H -E pgrep -F /var/run/hadoop-yarn/yarn/yarn-yarn-nodemanager.pid', 'user': 'yarn'}
Command aborted. Reason: 'Server considered task failed and automatically aborted it'
Command failed after 1 tries
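The Metrics Collector failure on MasterNode2 is Ambari's File resource refusing to write /usr/lib/ams-hbase/bin/hadoop because its parent directory is missing; that directory is normally laid down by the ambari-metrics packages, so its absence usually points to an incomplete package install rather than a folder to create by hand. A simplified sketch of that check (not the actual resource_management code, which does more):

```python
import os

# Simplified version of the check behind the "parent directory ...
# doesn't exist" failure in errors-174.txt: a file resource cannot be
# applied if its parent directory is missing. Creating the parent by
# hand masks the real cause (here, likely a failed ambari-metrics
# package install that should have created /usr/lib/ams-hbase/bin).

class Fail(Exception):
    pass

def check_parent_exists(path):
    """Raise Fail if the parent directory of path is missing."""
    parent = os.path.dirname(path)
    if not os.path.isdir(parent):
        raise Fail("Applying File[%r] failed, parent directory %s "
                   "doesn't exist" % (path, parent))
    return parent
```

So before creating /usr/lib/ams-hbase/bin manually, it is worth verifying the ambari-metrics-collector package actually installed on MasterNode2.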
Labels:
- Hortonworks Data Platform (HDP)