Member since: 01-04-2019
Posts: 77
Kudos Received: 27
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2184 | 02-23-2018 04:32 AM |
| | 550 | 02-23-2018 04:15 AM |
| | 453 | 01-20-2017 02:59 PM |
| | 770 | 01-18-2017 05:01 PM |
| | 3295 | 06-01-2016 01:26 PM |
10-11-2018
09:15 PM
Can you check for any hung processes?

    ps -ef | grep registry
    ps -ef | grep nifi

You can also create an empty pid file with the right permissions for the service account.
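If the pid file is the issue, here is a minimal sketch of recreating it, assuming the service account is 'nifi' and the pid directory is /var/run/nifi (both are assumptions; match your install):

    # recreate the pid file with ownership the service account can write to
    sudo mkdir -p /var/run/nifi
    sudo touch /var/run/nifi/nifi.pid
    sudo chown nifi:nifi /var/run/nifi/nifi.pid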
04-10-2018
03:15 PM
Not really. You also need AM containers to spawn new ApplicationMasters for your jobs. You can configure the percentage of memory reserved for AM containers in the YARN Capacity Scheduler view.
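For reference, the underlying Capacity Scheduler property looks like this (the 0.2 value and the queue name are illustrative):

    # capacity-scheduler.xml: fraction of cluster resources usable by ApplicationMasters
    yarn.scheduler.capacity.maximum-am-resource-percent=0.2
    # can also be overridden per queue ('myqueue' is a hypothetical queue name)
    yarn.scheduler.capacity.root.myqueue.maximum-am-resource-percent=0.2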
02-23-2018
06:40 AM
@Alex PQ It is possible that the job failed before the log could be created. Navigate to the ResourceManager UI and look for the failed job. Once you click on your application ID, it will show the status of log aggregation and the possible failure reason. You can also open the AM log from there.
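You can also pull the aggregated logs from the command line; a quick sketch (the application ID is illustrative):

    # fetch aggregated container logs for a finished/failed application
    yarn logs -applicationId application_1518000000000_0001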
02-23-2018
06:33 AM
Yes, you can. Please look at the YARN Capacity Scheduler user-limit settings. You can read more here - https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_yarn-resource-management/content/setting_user_limits.html
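For reference, the user-limit knobs live in capacity-scheduler.xml; a sketch using the 'default' queue and illustrative values:

    # each user is guaranteed at least 25% of the queue when there is contention
    yarn.scheduler.capacity.root.default.minimum-user-limit-percent=25
    # a single user may take up to 1x the queue's configured capacity
    yarn.scheduler.capacity.root.default.user-limit-factor=1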
02-23-2018
06:23 AM
By default, CPU is shared unless you enable CPU-based scheduling, so the number of containers you can run on your cluster is roughly (total YARN memory / container size). For example, 100 GB of YARN memory with 4 GB containers yields about 25 concurrent containers.
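If you want CPU to be enforced as well, CPU-based scheduling is enabled by switching the resource calculator; a minimal sketch:

    # capacity-scheduler.xml: schedule on both memory and vcores instead of memory alone
    yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator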
02-23-2018
04:32 AM
1 Kudo
@Raj B If you have both clusters up and running, you can export tables from one cluster to another using the Hive IMPORT/EXPORT commands, provided the databases are not very large. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport
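A minimal sketch of the flow (database/table names, paths, and NameNode addresses are illustrative): export on the source cluster, copy the export directory across, then import on the target:

    # on the source cluster
    hive -e "USE mydb; EXPORT TABLE mytable TO '/tmp/exports/mytable';"
    # copy the export directory to the target cluster
    hadoop distcp hdfs://source-nn:8020/tmp/exports/mytable hdfs://target-nn:8020/tmp/exports/mytable
    # on the target cluster
    hive -e "USE mydb; IMPORT TABLE mytable FROM '/tmp/exports/mytable';"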
02-23-2018
04:26 AM
@kotesh banoth For HDFS disk space, alter the HDFS configuration and make sure all the new data disks are listed under the DataNode directories. For computation, you will have to alter the YARN configuration: yarn.scheduler.maximum-allocation-mb, yarn.scheduler.maximum-allocation-vcores, and yarn.nodemanager.resource.memory-mb.
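A sketch of the properties involved (all paths and sizes are illustrative; set them to match your hardware):

    # hdfs-site.xml: comma-separated list of DataNode data directories, one per disk
    dfs.datanode.data.dir=/data01/hadoop/hdfs/data,/data02/hadoop/hdfs/data
    # yarn-site.xml: memory available to containers on each NodeManager
    yarn.nodemanager.resource.memory-mb=57344
    # largest single container YARN will grant
    yarn.scheduler.maximum-allocation-mb=8192
    yarn.scheduler.maximum-allocation-vcores=4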
02-23-2018
04:23 AM
@Daniel Müller HDFS will re-replicate a block once it scans it and finds it under-replicated. Alternatively, running the HDFS balancer may also trigger the same.
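To check and nudge this along from the command line (the threshold value is illustrative):

    # report block replication status, including under-replicated blocks
    hdfs fsck / | grep -i "replicated"
    # rebalance data across DataNodes (threshold is % deviation from average utilization)
    hdfs balancer -threshold 10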
02-23-2018
04:19 AM
@kishore sanchina You have to set up your browser for Kerberos (SPNEGO). You can find the steps here - https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_security/content/enabling_browser_access_spnego_web_ui.html
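Once you have a ticket, you can verify SPNEGO access from a terminal before debugging the browser; a quick sketch (the principal and URL are illustrative):

    kinit your_principal@YOUR.REALM
    # --negotiate makes curl use the Kerberos ticket; '-u :' leaves user/password empty
    curl --negotiate -u : http://namenode-host:50070/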
02-23-2018
04:15 AM
1 Kudo
@Giuseppe D'Agostino All HDP packages are primarily installed under /usr/hdp. You can mount a dedicated disk at /usr/hdp.
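A minimal sketch of mounting a dedicated disk there, assuming a fresh device /dev/sdb1 (device name and filesystem are illustrative); do this before installing the packages:

    sudo mkfs.ext4 /dev/sdb1
    sudo mkdir -p /usr/hdp
    sudo mount /dev/sdb1 /usr/hdp
    # persist the mount across reboots
    echo '/dev/sdb1 /usr/hdp ext4 defaults 0 0' | sudo tee -a /etc/fstab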
01-16-2018
06:55 PM
@Chen Ran Open the NameNode UI and go to the "Datanode Volume Failures" tab. Let us know if you see any volume failures.
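If you prefer the command line, the same information is exposed through the NameNode's JMX servlet; a sketch (host/port are illustrative, and the exact metric names vary by Hadoop version):

    curl -s http://namenode-host:50070/jmx | grep -i "volumefailure"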
01-16-2018
06:45 PM
@Nicolas Tobias Please look at the hive.fetch.task.conversion property. It lets Hive answer simple queries by fetching data directly, without spinning up a job. Setting it to 'none' should force queries to run as jobs.
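A quick way to test the difference (database/table names are illustrative):

    # with 'none', even a simple SELECT runs as a job instead of a direct fetch
    hive -e "SET hive.fetch.task.conversion=none; SELECT * FROM mydb.mytable LIMIT 10;"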
01-16-2018
06:40 PM
The answer depends on the kind of job. For an MR job, for example, you can check the running task's log in the YARN ResourceManager. Depending on what is being processed, you can see when the last activity happened, what activity is currently being carried out, and for how long. For example, during the shuffle the reducer's fetcher threads pull data from all map tasks; this can be time-consuming if there is a large amount of data or if intermediate compression is not enabled.
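If the shuffle is the slow part, enabling intermediate compression is one concrete lever; a sketch of the MRv2 properties (the codec choice is illustrative):

    # compress map output to shrink what the reducers' fetchers must pull
    mapreduce.map.output.compress=true
    mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec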
01-16-2018
06:33 PM
Are you able to run queries via the Hive CLI? If yes, then log in to Ambari and open HDFS. Compare the configurations from before and after the new host was added; do the same for the Hive and YARN services. It is possible that Ambari reverted to default configurations when the new node was added. Looking at the error, it appears that Ambari is not able to write the file as user 'rene', which could be due to an impersonation failure (proxyuser settings).
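For reference, the proxyuser settings live in core-site.xml; a sketch assuming the Ambari server runs as root (substitute the actual service user, and narrow the wildcards if your security posture requires it):

    hadoop.proxyuser.root.hosts=*
    hadoop.proxyuser.root.groups=*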
10-02-2017
10:38 PM
Can you share more details? Where are you trying to do this? And could you possibly provide a screenshot?
09-29-2017
07:33 PM
@ryan xia
You can click "Show advanced options" in the "create cluster" wizard; this will list options for providing the HDP and Ambari repository URLs.
09-29-2017
07:28 PM
@Yair Ogen
As per the log ("message from server: "Host 'il-dev-05' is not allowed to connect to this MySQL server""), the user "hive"@"metastorehost" does not have a valid user ID on the MySQL server at il-dev-05. Create a user "hive"@"metastorehost" on MySQL and grant it full access to the hive database. You can refer here - https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-administration/content/using_hive_with_mysql.html
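A minimal sketch of the MySQL side (the password is a placeholder, and the host part must match the host the Metastore connects from):

    mysql -u root -p -e "CREATE USER 'hive'@'metastorehost' IDENTIFIED BY 'hivepassword';
    GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'metastorehost';
    FLUSH PRIVILEGES;"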
09-29-2017
07:14 PM
1 Kudo
Check this: https://oozie.apache.org/docs/4.0.0/WebServicesAPI.html#Job_Log You can use curl to call the REST API.
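For example (host, port, and job ID are illustrative):

    # fetch the log of a single workflow job via the Oozie REST API
    curl "http://oozie-host:11000/oozie/v1/job/0000001-180101000000000-oozie-oozi-W?show=log"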
09-29-2017
05:35 PM
Since you are talking about a Secondary NameNode: a Secondary NameNode will never act as a name/metadata service provider, even if you shut down the primary NameNode. You would have to use a Standby NameNode with HA instead. You can read here - https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Secondary_NameNode. One way to check whether your Secondary NN has the latest fsimage is to compare the size of the 'current' directory on the NN and the SNN.
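A quick sketch of that check (the directory locations are illustrative; use your dfs.namenode.name.dir and fs.checkpoint.dir values):

    du -sh /hadoop/hdfs/namenode/current
    du -sh /hadoop/hdfs/namesecondary/current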
09-29-2017
05:28 PM
1) With the Oozie UI you can look at the status of all past workflows. 2) If you have the SmartSense Activity Explorer and Activity Analyzer set up, you can query the activity.job table for all jobs that ran within a specified period whose name/type contains "oozie".
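For option 2, an illustrative query; the column name here is an assumption, so check your actual activity.job schema first:

    # 'job_name' is a hypothetical column; adjust to the real schema
    hive -e "SELECT * FROM activity.job WHERE lower(job_name) LIKE '%oozie%';"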
09-27-2017
01:16 AM
@Br Hmedna You are trying to export ORC data into MySQL without converting it to text. You should use Sqoop's Hive export to do this. Look at this link: https://community.hortonworks.com/questions/22425/sqoop-export-from-hive-table-specifying-delimiters.html
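A sketch of the HCatalog-based export, which reads the ORC table directly (connection details and all names are illustrative):

    sqoop export --connect jdbc:mysql://mysql-host/mydb \
      --username myuser -P --table mytable \
      --hcatalog-database default --hcatalog-table my_orc_table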
09-27-2017
01:12 AM
@Kumar Veerappan You will have to create a watcher/alert script that identifies which NN is active and alerts/emails if the NN flips. The NameNode service exposes JMX data that includes which NameNode is active; your watcher script can query it to detect a failover. The relevant bean looks like this:

    {
      "name" : "Hadoop:service=NameNode,name=NameNodeStatus",
      "modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
      "State" : "active",
      "NNRole" : "NameNode",
      "HostAndPort" : "host1:8020"
    }
09-27-2017
01:06 AM
@Mohammad Shazreen Bin Haini If you are using Ranger to manage permissions, there should be two default policies: 1) an HDFS policy that gives the "hive" user full permission to read/write the /apps/hive/warehouse directory, and 2) a Hive policy that gives the "hive" user full permission to create and drop databases and tables.
09-27-2017
01:01 AM
@Zack Riesland With SmartSense, you can install the Activity Analyzer and Activity Explorer, which can parse all the job runs within the cluster. This does not provide metrics at the table level; rather, it provides information at the job level. You would have to make a few modifications to extract table information from the actual queries and join it with the SmartSense metrics to capture what you need.
09-27-2017
12:56 AM
1 Kudo
If you are looking to extract text from PDFs, I have done this via Apache Tika. It's simple to use.
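For example, the standalone tika-app jar has a CLI (the jar version and file names are illustrative):

    java -jar tika-app-1.16.jar --text document.pdf > document.txt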
08-17-2017
04:31 PM
If the SLA is not a constraint, you can save the RDD as a temporary file and read it again via the Databricks spark-xml reader. If you are running it via a Zeppelin dashboard, you can invoke the shell interpreter and use sed to do an in-file replace of the xmlns: prefix prior to reading it into your DataFrame.
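A sketch of the sed step (the file path is illustrative); note that this rewrites the file in place:

    sed -i 's/xmlns://g' /tmp/input.xml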
06-13-2017
10:56 PM
Check the application log for application_1497349747602_0006 in the RM. Paste the stdout/stderr from the job history.
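You can pull those logs from the command line as well:

    yarn logs -applicationId application_1497349747602_0006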
02-08-2017
08:04 PM
I have seen this error with one of my customers. The issue was the memory footprint on the node hosting Zeppelin/Livy: free memory was down to 1 GB because Livy had many dead sessions that were not releasing memory. Deleting the Livy sessions helped free up memory. You can use the Livy REST API to view sessions and delete dead ones.
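A sketch of the relevant calls, assuming Livy's default port 8998 (the host and session ID are illustrative):

    # list all sessions and their states
    curl http://livy-host:8998/sessions
    # delete a dead session by ID
    curl -X DELETE http://livy-host:8998/sessions/42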
01-25-2017
07:07 PM
1 Kudo
I think preemption only happens between leaf queues under the same parent queue. That is why this behavior is observed.
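For reference, preemption itself is controlled by the scheduler monitor settings in yarn-site.xml; a sketch:

    yarn.resourcemanager.scheduler.monitor.enable=true
    yarn.resourcemanager.scheduler.monitor.policies=org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy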
01-25-2017
06:48 PM
Is your cluster Kerberized? The log says "Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, status: 403, message: Forbidden". Can you check whether you are running the job as a user with a valid Kerberos ticket?
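A quick check from the node submitting the job (the principal is illustrative):

    # show the current ticket cache; if it is empty or expired, obtain a fresh ticket
    klist
    kinit your_principal@YOUR.REALM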