Member since: 05-05-2016
Posts: 65
Kudos Received: 117
Solutions: 7

My Accepted Solutions
Title | Views | Posted
---|---|---
| 3422 | 06-20-2016 09:08 AM
| 658 | 05-26-2016 01:25 PM
| 6435 | 05-26-2016 01:14 PM
| 688 | 05-25-2016 09:14 AM
| 1148 | 05-25-2016 09:03 AM
09-29-2017
02:15 PM
Hi, I am trying to configure HiveServer2 (running in HDP) to use an external Hive Metastore (outside HDP). To achieve this, I updated the hive.metastore.uris property to point to the external Hive Metastore Thrift endpoint. But when I try to list tables/databases via Beeline, I am unable to see the tables/databases that live in the external Metastore.
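For reference, a minimal sketch of the configuration described above; the metastore and HiveServer2 hostnames and ports are placeholders, not values from this post:

    # In hive-site.xml on the HiveServer2 host (placeholder endpoint):
    #   <property>
    #     <name>hive.metastore.uris</name>
    #     <value>thrift://external-metastore.example.com:9083</value>
    #   </property>
    # Restart HiveServer2, then verify from Beeline:
    $ beeline -u "jdbc:hive2://hs2-host.example.com:10000/default"
    0: jdbc:hive2://...> show databases;
    0: jdbc:hive2://...> show tables;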
... View more
Labels:
- Apache Hive
09-29-2017
01:24 PM
I added the following property and am now able to list the S3 bucket with the hdfs command: fs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider. But I am not sure whether this property alone is sufficient for everything that interacts with S3.
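For context, a minimal sketch of how that property is usually set and verified with the S3A connector; the bucket name is a placeholder:

    # In core-site.xml:
    #   <property>
    #     <name>fs.s3a.aws.credentials.provider</name>
    #     <value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
    #   </property>
    # Verify that S3A can use the instance profile credentials:
    $ hdfs dfs -ls s3a://my-bucket/
    $ hdfs dfs -put /tmp/test.txt s3a://my-bucket/tmp/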
... View more
09-29-2017
01:15 PM
@rdoktorics I have attached an AWS Instance Profile to all EC2 machines and I can successfully use the AWS CLI to list/get the S3 bucket, but I cannot run the command hdfs dfs -ls <S3_BUCKET_PATH>. The question is how to make Hadoop use the AWS instance profile. Which properties need to be added/set in core-site.xml or hdfs-site.xml?
... View more
09-29-2017
12:34 PM
I see a way to do it in Hadoop 2.8 (https://hadoop.apache.org/docs/r2.8.0/hadoop-aws/tools/hadoop-aws/index.html#S3A), but HDP ships Hadoop 2.7.x.
... View more
09-29-2017
12:30 PM
@rdoktorics Thanks for the link, but I am looking to install HDP manually (not via Cloudbreak or HDC).
... View more
09-29-2017
10:39 AM
1 Kudo
Hi, I am looking to use S3 instead of HDFS to store data and run computation on my cluster (built on EC2 instances). Questions:
1. How can I configure HDP to use an AWS IAM role to interact with S3? I don't want to use AWS keys.
2. Is S3Guard available in HDP 2.6?
3. Are there any best practices to follow when using S3 instead of HDFS?
Thanks, Pradeep
... View more
Labels:
- Hortonworks Data Platform (HDP)
07-20-2016
08:52 AM
Yes, adding these properties solved the issue for the HDFS plugin. Now I will check the rest of the plugins.
... View more
07-19-2016
02:21 PM
Somehow Ranger is not able to pick up rangerrepouser@REALM in my case; it says the user does not exist. I am able to do an ldapsearch for this user and kinit with it. My HDFS is HA, so I am using the NameNode URL hdfs://mycluster.
... View more
07-19-2016
12:49 PM
@Dale Bradman, were you able to solve the issue? I am facing a similar issue.
... View more
07-15-2016
10:20 AM
Option 2 didn't work for me. It throws the message "Certificate already exists", and I am still getting an SSL handshake exception for Ranger Usersync. Which option worked for you?
... View more
06-20-2016
09:08 AM
2 Kudos
Root cause: MariaDB libraries are installed on CentOS 7 by default. Solution: remove the MariaDB packages and re-install the Hive Metastore database (MySQL).
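A minimal sketch of the fix described above; the MariaDB package names follow the conflict reported in the yum output below, and the exact set installed on a given host may differ:

    # Check which MariaDB packages are present:
    $ rpm -qa | grep -i mariadb
    # Remove them (package names may vary by host):
    $ sudo yum remove -y MariaDB-common mariadb-libs
    # Then re-run the Hive Metastore (MySQL) install task from Ambari, or manually:
    $ sudo yum install -y mysql-community-server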
... View more
06-20-2016
09:03 AM
1 Kudo
Issue: When installing the Hive Metastore (MySQL) via Ambari, it throws a conflict error because of MariaDB packages installed on the system, which causes an Ambari task to fail. Ambari task output:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 64, in <module>
MysqlServer().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_server.py", line 33, in install
self.install_packages(env, exclude_packages=params.hive_exclude_packages)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 410, in install_packages
retry_count=agent_stack_retry_count)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 54, in action_install
self.install_package(package_name, self.resource.use_repos, self.resource.skip_repos)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/yumrpm.py", line 49, in install_package
self.checked_call_with_retries(cmd, sudo=True, logoutput=self.get_logoutput())
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 83, in checked_call_with_retries
return self._call_with_retries(cmd, is_checked=True, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/package/__init__.py", line 91, in _call_with_retries
code, out = func(cmd, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/bin/yum -d 0 -e 0 -y install mysql-community-server' returned 1. warning: /var/cache/yum/x86_64/7/mysql56-community/packages/mysql-community-common-5.6.31-2.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Public key for mysql-community-common-5.6.31-2.el7.x86_64.rpm is not installed
Importing GPG key 0x5072E1F5:
Userid : "MySQL Release Engineering <mysql-build@oss.oracle.com>"
Fingerprint: a4a9 4068 76fc bd3c 4567 70c8 8c71 8d3b 5072 e1f5
Package : mysql-community-release-el7-5.noarch (@HDP-UTILS-1.1.0.20)
From : file:/etc/pki/rpm-gpg/RPM-GPG-KEY-mysql
Transaction check error:
file /usr/share/mysql/charsets/Index.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/armscii8.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/ascii.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp1250.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp1256.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp1257.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp850.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp852.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/cp866.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/dec8.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/geostd8.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/greek.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/hebrew.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/hp8.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/keybcs2.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/koi8r.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/koi8u.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/latin1.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/latin2.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/latin5.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/latin7.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/macce.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/macroman.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /usr/share/mysql/charsets/swe7.xml from install of mysql-community-common-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
file /etc/my.cnf from install of mysql-community-server-5.6.31-2.el7.x86_64 conflicts with file from package MariaDB-common-10.1.14-1.el7.centos.x86_64
Error Summary
-------------
stdout:
2016-06-06 15:57:47,697 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2016-06-06 15:57:47,698 - Group['spark'] {}
2016-06-06 15:57:47,699 - Group['hadoop'] {}
2016-06-06 15:57:47,699 - Group['users'] {}
2016-06-06 15:57:47,700 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,702 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,704 - User['oozie'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
2016-06-06 15:57:47,705 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,707 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
2016-06-06 15:57:47,708 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,708 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
2016-06-06 15:57:47,709 - User['flume'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,710 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,711 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,712 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,713 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,714 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
2016-06-06 15:57:47,714 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2016-06-06 15:57:47,716 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2016-06-06 15:57:47,728 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2016-06-06 15:57:47,729 - Group['hdfs'] {}
2016-06-06 15:57:47,729 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': [u'hadoop', u'hdfs']}
2016-06-06 15:57:47,730 - FS Type:
2016-06-06 15:57:47,731 - Directory['/etc/hadoop'] {'mode': 0755}
2016-06-06 15:57:47,758 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2016-06-06 15:57:47,758 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 0777}
2016-06-06 15:57:47,772 - Repository['HDP-2.4'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.4.2.0', 'action': ['create'], 'components': [u'HDP', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP', 'mirror_list': None}
2016-06-06 15:57:47,782 - File['/etc/yum.repos.d/HDP.repo'] {'content': '[HDP-2.4]\nname=HDP-2.4\nbaseurl=http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.4.2.0\n\npath=/\nenabled=1\ngpgcheck=0'}
2016-06-06 15:57:47,783 - Repository['HDP-UTILS-1.1.0.20'] {'base_url': 'http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos7', 'action': ['create'], 'components': [u'HDP-UTILS', 'main'], 'repo_template': '[{{repo_id}}]\nname={{repo_id}}\n{% if mirror_list %}mirrorlist={{mirror_list}}{% else %}baseurl={{base_url}}{% endif %}\n\npath=/\nenabled=1\ngpgcheck=0', 'repo_file_name': 'HDP-UTILS', 'mirror_list': None}
2016-06-06 15:57:47,786 - File['/etc/yum.repos.d/HDP-UTILS.repo'] {'content': '[HDP-UTILS-1.1.0.20]\nname=HDP-UTILS-1.1.0.20\nbaseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.20/repos/centos7\n\npath=/\nenabled=1\ngpgcheck=0'}
2016-06-06 15:57:47,787 - Package['unzip'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2016-06-06 15:57:47,946 - Skipping installation of existing package unzip
2016-06-06 15:57:47,946 - Package['curl'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2016-06-06 15:57:47,961 - Skipping installation of existing package curl
2016-06-06 15:57:47,961 - Package['hdp-select'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2016-06-06 15:57:47,974 - Skipping installation of existing package hdp-select
2016-06-06 15:57:48,191 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2016-06-06 15:57:48,248 - Package['mysql-community-release'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2016-06-06 15:57:48,401 - Installing package mysql-community-release ('/usr/bin/yum -d 0 -e 0 -y install mysql-community-release')
2016-06-06 15:57:50,957 - Package['mysql-community-server'] {'retry_on_repo_unavailability': False, 'retry_count': 5}
2016-06-06 15:57:50,972 - Installing package mysql-community-server ('/usr/bin/yum -d 0 -e 0 -y install mysql-community-server')
Version details: Operating System: CentOS 7, Ambari: 2.2
... View more
Labels:
- Apache Ambari
- Apache Hive
05-26-2016
01:25 PM
2 Kudos
@Aniruddh Kendurkar For the first SSH login you can use the credentials root/hadoop, but you will then be prompted to change the password. Also, the Ambari web UI credentials are not admin/admin; you need to reset the Ambari Server admin password. Detailed guides: http://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/#setup-ambari-admin-password http://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/#ways-execute-terminal-command
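A minimal sketch of the sandbox login and password reset described above; the forwarded SSH port and the reset helper script are assumptions based on typical HDP sandbox images:

    # SSH into the sandbox VM (port 2222 is typical for the sandbox; adjust if different):
    $ ssh root@127.0.0.1 -p 2222    # initial password: hadoop; you will be prompted to change it
    # Reset the Ambari admin password (helper script assumed present on the sandbox image):
    $ ambari-admin-password-reset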
... View more
05-26-2016
01:14 PM
2 Kudos
@Tajinderpal Singh It looks like Hive is not able to get any containers from the ResourceManager. Check your ResourceManager to see whether there are any available containers or whether all containers are occupied by other jobs.
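A quick way to check this from the command line; a sketch, with the ResourceManager host as a placeholder:

    # List running applications and their resource usage:
    $ yarn application -list -appStates RUNNING
    # Show per-node capacity and containers in use:
    $ yarn node -list -all
    # Or check the ResourceManager UI, typically at http://<rm-host>:8088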
... View more
05-25-2016
03:14 PM
4 Kudos
@elan chelian This line says that the job is being sent to a cluster whose maximum container size is only 1 GB: "MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceRequest: <memory:2048, vCores:1> maxContainerCapability: <memory:1024, vCores:3>". You can reduce the container size to 1 GB and run the Sqoop import. Try the modified command:
sqoop import -D mapreduce.map.memory.mb=1024 -D mapreduce.map.java.opts=-Xmx768m --connect jdbc:oracle:thin:@oracledbhost:1521:VAEDEV --table WC_LOY_MEM_TXN --username OLAP -P -m 1
... View more
05-25-2016
03:10 PM
Great to hear that 🙂
... View more
05-25-2016
09:19 AM
3 Kudos
@sanka sandeep As HDP does not support Apache Drill at the moment, you will not be able to start/stop the Drill service using Ambari. Once you download the Drill packages from the Apache Drill website, you can use the bin/drillbit.sh start script to start Drill. Drill install guide (distributed mode): https://drill.apache.org/docs/installing-drill-on-the-cluster/ Drill install guide (embedded mode): https://drill.apache.org/docs/starting-drill-on-linux-and-mac-os-x/
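A minimal sketch of a manual distributed-mode start; the Drill version and install path are placeholders:

    # Unpack the Drill tarball (version/path are placeholders):
    $ tar -xzf apache-drill-1.6.0.tar.gz -C /opt/
    $ cd /opt/apache-drill-1.6.0
    # Point Drill at your ZooKeeper quorum in conf/drill-override.conf, then:
    $ bin/drillbit.sh start
    $ bin/drillbit.sh status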
... View more
05-25-2016
09:14 AM
2 Kudos
@c pat HDF 1.2 is compatible with HDP 2.3+. I have not seen such a compatibility matrix for HDF 1.1. http://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.2/bk_HDF_InstallSetup/content/hdf_supported_hdp.html
... View more
05-25-2016
09:03 AM
5 Kudos
@rahul jain From the URL it looks like you are searching for the Hive ODBC driver for SUSE. You can download the Hive ODBC driver from http://hortonworks.com/downloads/ ; go to the "Hortonworks ODBC Driver for Apache Hive (v2.1.2)" section to choose your OS-specific download. URL specific to SUSE: http://public-repo-1.hortonworks.com/HDP/hive-odbc/2.1.2.1002/suse11/hive-odbc-native-2.1.2.1002-1.x86_64.rpm If you are looking for HDP packages in tarball format: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/_hdp_24_repositories.html Hope this helps.
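For example, the SUSE driver above could be fetched and installed roughly like this; a sketch, and whether you use zypper or rpm depends on your SUSE setup:

    $ wget http://public-repo-1.hortonworks.com/HDP/hive-odbc/2.1.2.1002/suse11/hive-odbc-native-2.1.2.1002-1.x86_64.rpm
    $ sudo zypper install ./hive-odbc-native-2.1.2.1002-1.x86_64.rpm
    # or, using rpm directly:
    # sudo rpm -ivh hive-odbc-native-2.1.2.1002-1.x86_64.rpm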
... View more
05-23-2016
10:50 AM
2 Kudos
@Greenhorn Techie As mentioned in the comment above, an edge node is beneficial when you access HDFS via the Hadoop API. With Knox, putting data on the edge node first would be an extra hop, which increases the overall time taken to ingest data into HDFS.
... View more
05-23-2016
10:30 AM
4 Kudos
@Greenhorn Techie If your source system has access to the Knox-exposed WebHDFS endpoint, then that is the better way, as you avoid the data hop on the edge node. This method should take less time to put data on HDFS than going via the edge node. Accessing the Knox-exposed WebHDFS directly also lets you avoid SSH access to the Knox edge node, so the first option is both more secure and faster. Hope this helps.
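A minimal sketch of writing a file through the Knox WebHDFS gateway using the standard two-step WebHDFS create; the gateway host, topology name, paths, and credentials are placeholders:

    # Step 1: ask for a file handle (Knox/WebHDFS returns a redirect Location header):
    $ curl -ik -u myuser:mypassword -X PUT \
        "https://knox-host.example.com:8443/gateway/default/webhdfs/v1/tmp/data.csv?op=CREATE"
    # Step 2: upload the data to the Location URL returned in step 1:
    $ curl -ik -u myuser:mypassword -X PUT -T data.csv "<Location-URL-from-step-1>"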
... View more
05-20-2016
04:10 PM
@petri koski Can you share the Pig job logs?
... View more
05-20-2016
04:09 PM
6 Kudos
@elan chelian Can you try your Sqoop command with the map memory increased from 256 MB to a higher value, like this:
sqoop import -D mapreduce.map.memory.mb=2048 -D mapreduce.map.java.opts=-Xmx1024m --connect jdbc:oracle:thin:@oracledbhost:1521:VAEDEV --table WC_LOY_MEM_TXN --username OLAP -P -m 1
... View more
05-20-2016
09:28 AM
1 Kudo
@Roopa Raphael Supply "hadoop" as your current password; it will then prompt for a new password.
... View more
05-20-2016
09:06 AM
1 Kudo
@Roopa Raphael The first login user/pass for the Sandbox is:
username: root
password: hadoop
After the first login, the sandbox will ask you to change your password. Supply a new password if prompted, or use the passwd command to change the root password:
$> passwd
Additionally, the default Ambari credentials (admin/admin) might not work; you need to reset the ambari-server password. You can refer here: http://hortonworks.com/wp-content/uploads/2016/03/ReleaseNotes_3_1_2016.pdf Hope this helps!
... View more
05-19-2016
12:31 PM
2 Kudos
@rahul jain This might help you https://bytealmanac.wordpress.com/2012/07/02/assigning-a-static-ip-to-a-vmware-workstation-vm/
... View more
05-19-2016
12:24 PM
1 Kudo
@Thomas Larsson From the ls output on the mount point, it seems the whole volume is not inaccessible; only some of its subdirectories are. What is the output of this command: sudo ls -la /mnt/data21/
... View more
05-19-2016
10:01 AM
3 Kudos
Adding to the above: in this scenario the DataNode service may fail with an "Incompatible ID" error. To resolve it, you need to re-register the DataNode, since it still holds the old namespace ID.
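A hedged sketch of how this is typically diagnosed; the data directory paths are placeholders and depend on dfs.datanode.data.dir and dfs.namenode.name.dir:

    # Compare the IDs stored on the DataNode with the NameNode's current ones:
    $ cat /hadoop/hdfs/data/current/VERSION      # DataNode side (placeholder path)
    $ cat /hadoop/hdfs/namenode/current/VERSION  # NameNode side (placeholder path)
    # If they differ because the NameNode was reformatted, the DataNode's storage
    # must be re-initialized (which discards its local blocks) before it can re-register.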
... View more
05-19-2016
09:55 AM
4 Kudos
@kavitha velaga
1. The number of mappers depends on the InputSplits of the file, and Hadoop launches as many mappers as required. The user does not have direct control over the number of mappers via a property.
2. To control the number of mappers, the user has to control the number of InputSplits, which is not necessary unless custom logic requires it.
3. The user can control the number of reducers for an MR job by setting job.setNumReduceTasks(numOfReducer); numOfReducer can be 0 or any positive integer. If you choose 0, the MR job will be map-only (no reducer means no aggregation). There are use cases where a reducer is not necessary, so setting numOfReducer=0 makes the MR job finish more quickly, since it skips the shuffle and sort (see the sketch below).
4. The container size depends on how much memory your program requires in general.
5. DistCp: https://issues.apache.org/jira/browse/HDFS-7535 has improved DistCp performance. To make DistCp run faster you could disable post-copy checks such as the checksum comparison, but then you trade off reliability.
Hope this helps
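As an aside, the reducer count can also be set at submit time without touching the job code; a sketch, assuming the driver class uses ToolRunner and with the jar, class, and paths as placeholders:

    # mapreduce.job.reduces is the command-line equivalent of job.setNumReduceTasks():
    $ hadoop jar my-job.jar com.example.MyJob \
        -D mapreduce.job.reduces=0 \
        /input/path /output/path
    # DistCp can likewise skip the post-copy checksum comparison (trading reliability for speed):
    $ hadoop distcp -update -skipcrccheck hdfs://src/path hdfs://dst/path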
... View more
05-18-2016
12:35 PM
1 Kudo
@sivasaravanakumar k If you are using an IDE such as Eclipse to develop the program, you can add all required JAR files to the build path and then build the JAR using Eclipse (http://tutoringcenter.cs.usfca.edu/resources/adding-user-libraries-in-eclipse.html). If you want to run your program on a remote host where these JARs might not be on the classpath, build a fat/executable JAR using Eclipse; a fat JAR ensures that all required dependencies are packaged with the program (http://stackoverflow.com/questions/502960/eclipse-how-to-build-an-executable-jar-with-external-jar). You can also use a build tool like Maven to handle the dependencies for you.
... View more