Member since: 12-11-2015
Posts: 79
Kudos Received: 26
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 2216 | 04-05-2016 12:14 PM |
| 913 | 12-14-2015 04:44 PM |
07-27-2016
02:37 PM
I made sure that no jobs were running when the deletion happened :). Also, I am not sure who set the retention period to 8 years; I will have to check that. One thing I want to know: if, before the deletion, I had set the property to 7 days and restarted only the timeline server, would it have removed all the older entries from leveldb and kept only those generated in the last 7 days? I also guess that after restarting YARN, the Job History Server resets and gets cleared at regular intervals, but I am not sure.
07-27-2016
12:43 PM
Thanks, it worked. I just want to know whether there are any consequences of deleting the leveldb-timeline-store.ldb directory. After deleting the contents of this directory and restarting the timeline server, the contents were regenerated and the space was freed up. We set "yarn.timeline-service.ttl-ms" from 267840000000 (~443 weeks) to 604800000 (7 days) to limit the size of the leveldb storage, per the Hortonworks documentation.
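For reference, the relevant section of yarn-site.xml after the change; the ttl-enable and ttl-interval-ms entries are related properties that I believe apply to this HDP version, so please verify them against the documentation before relying on them:

<!-- Timeline store retention (sketch; verify property names for your HDP version) -->
<property>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>true</value> <!-- assumed: enables TTL-based eviction -->
</property>
<property>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>604800000</value> <!-- 7 days, the value we set -->
</property>
<property>
  <name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
  <value>300000</value> <!-- how often eviction runs; illustrative value -->
</property>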
07-27-2016
07:53 AM
Ambari version is 2.1.2
07-27-2016
07:52 AM
Ambari version is 2.1.2
07-26-2016
01:56 PM
When I do klist against the cache, it shows the output below:

klist /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52
Ticket cache: FILE:/var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52
Default principal: HTTP/abcd_fqdn@REALM.COM
Valid starting Expires Service principal
07/26/16 08:42:28 07/26/16 08:47:28 krbtgt/REALM.COM@REALM.COM

When I run the command below, it just executes and does not give any error in the output:

/usr/share/centrifydc/kerberos/bin/kinit -l 5m -c /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52 -kt /etc/security/keytabs/spnego.service.keytab HTTP/abcd_fqdn.com@REALM.COM
07-26-2016
12:53 PM
How do I confirm who manages the krb5.conf file? The ownership is with root:

-rw-r--r-- 1 root root 727 May 11 17:00 krb5.conf

ccache_type = 3, and default_ccache_name is not defined. Do I need to set this parameter for a particular UID, or do I simply set default_ccache_name = /tmp/krb5cc_%{uid} for all users and save the conf file?
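For reference, this is what I understand the setting would look like in the [libdefaults] section if I set it for all users, assuming MIT Kerberos 1.11+ where the %{uid} expansion is supported; please correct me if this is wrong:

# [libdefaults] excerpt (sketch): per-UID default credential cache
[libdefaults]
    default_realm = REALM.COM
    ccache_type = 3
    default_ccache_name = FILE:/tmp/krb5cc_%{uid}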
07-26-2016
10:38 AM
I am getting the alerts below:

Services Reporting Alerts
CRITICAL [MAPREDUCE2]
MAPREDUCE2
CRITICAL History Server Web UI
Connection failed to http://abcd_fqdn.com:19888 (Execution of '/usr/share/centrifydc/kerberos/bin/kinit -l 5m -c /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52 -kt /etc/security/keytabs/spnego.service.keytab HTTP/ abcd_fqdn.com@Realm.COM > /dev/null' returned 1. kinit(v5): Credentials cache I/O operation failed XXX when initializing cache /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52)

I am not sure which credentials cache it is referring to here. I can see credential cache files on the node:

-rw------- 1 yarn hadoop 1547 Jul 13 11:54 /tmp/krb5cc_513
-rw------- 1 hcat hadoop 1417 Jul 22 12:23 /tmp/krb5cc_516
-rw------- 1 hdfs hadoop 2775 Jul 22 12:24 /tmp/krb5cc_511
-rw------- 1 oozie hadoop 3046 Jul 22 12:24 /tmp/krb5cc_504
-rw------- 1 ambari-qa hadoop 1456 Jul 26 02:48 /tmp/krb5cc_1002

Space in /tmp is also available.
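For reference, how I have been re-running the alert's kinit by hand; the final curl step is my assumption about what the alert checks next, not something taken from the alert definition:

# Re-run the same kinit the alert uses, without discarding output,
# so any cache I/O error is visible:
/usr/share/centrifydc/kerberos/bin/kinit -l 5m \
  -c /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52 \
  -kt /etc/security/keytabs/spnego.service.keytab \
  HTTP/abcd_fqdn.com@Realm.COM
echo "kinit exit code: $?"

# Confirm the cache file is readable/writable by the user running the agent:
ls -l /var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52

# (Assumption) SPNEGO check against the History Server UI with that ticket:
KRB5CCNAME=/var/lib/ambari-agent/tmp/web_alert_cc_4610ec9b283dcfc90bc6df1e519e1c52 \
  curl --negotiate -u : -s -o /dev/null -w "%{http_code}\n" http://abcd_fqdn.com:19888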
Labels:
- Apache Ambari
07-21-2016
10:16 AM
I have a scenario where the YARN timeline store DB is growing day by day. In April it was 346 GB and it has now increased to 466 GB, occupying a lot of space in /var/opt/hadoop/yarn/timeline.

# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 30G 11G 18G 37% /
udev 126G 260K 126G 1% /dev
tmpfs 126G 0 126G 0% /dev/shm
/dev/sda5 5.0G 2.5G 2.2G 54% /var
/dev/sda7 756G 698G 21G 98% /var/opt/
/dev/sdb 2.5T 6.6G 2.3T 1% /data

When I checked disk usage on /var/opt/hadoop/yarn/timeline:

du -sh *
466G leveldb-timeline-store.ldb
40K timeline-state-store.ldb

I don't know why it is occupying this much space. What steps do I need to take to make it consume less space?
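For reference, the commands I used to check the current retention setting and the on-disk sizes; the grep is just a quick way to spot the ttl properties in yarn-site.xml:

# Current timeline-service TTL-related settings on this node:
grep -B1 -A1 'yarn.timeline-service.ttl' /etc/hadoop/conf/yarn-site.xml

# Which store is actually consuming the space:
du -sh /var/opt/hadoop/yarn/timeline/*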
Labels:
- Apache YARN
07-05-2016
07:19 AM
Is it a bug that the Ambari UI shows port 8050 but it connects to port 8032? I still don't understand why it doesn't connect to 8050 if we have set yarn.resourcemanager.address to <rmaddress:8050>, rather than connecting to 8032.
06-17-2016
08:22 AM
@Kuldeep Kulkarni, I tried this suggestion and it worked for me, but Oozie and Spark jobs failed as they were not able to connect to the current active RM. I had to manually switch over the RM and then restart Oozie and Spark, after which the jobs started running. Weird though 🙂 Anyway, do you have any document or KB article related to your suggestion where this is shown or described? Again, thanks for your time 🙂
06-15-2016
06:20 AM
Yes, I have RM HA
06-15-2016
06:18 AM
Then why is it connecting to port 8032 and not 8050 in our cluster? I have HDP 2.3.4. Do we need to explicitly set yarn.resourcemanager.address.X to <IP.master1>:8050 and yarn.resourcemanager.address.Y to <IP.master2>:8050 in yarn-site.xml?
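For clarity, this is what I am proposing to add, assuming our RM IDs are rm1 and rm2 (I would substitute whatever yarn.resourcemanager.ha.rm-ids is actually set to); the hostnames below are placeholders:

<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>master1.example.com:8050</value> <!-- placeholder hostname -->
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>master2.example.com:8050</value> <!-- placeholder hostname -->
</property>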
06-14-2016
02:48 PM
2 Kudos
The Hortonworks documentation says 8050, but yarn-default.xml says 8032. Also, when I run netstat on the ResourceManager node, it shows the ResourceManager listening on port 8032 and not 8050. I found a reference in one of the questions asked in the community; it says this is a bug, but it might be a documentation error, and the documentation team from Hortonworks has confirmed the port is 8032. See the link below and the update provided by Artem Ervits at the end: https://community.hortonworks.com/questions/11599/how-to-change-resourcemanager-port-on-oozie.html#answer-21336 Could anyone confirm and let me know if it has been resolved? @Artem Ervits @Neeraj Sabharwal @pardeep
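For reference, the check I ran on the ResourceManager node; the grep pattern is just illustrative:

# Which client RPC port the ResourceManager is actually listening on:
netstat -tlnp | grep -E '8032|8050'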
Labels:
- Cloudera Manager
04-05-2016
12:14 PM
Thanks for the input, guys. I increased the RegionServer heap to 16 GB and increased the handler count to 200. We also performed a manual major compaction on a few HBase tables heavily used by the customer. After that, HBase read latency dropped back to acceptable levels.
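For anyone hitting the same issue, roughly what those changes look like as a sketch; in our Ambari-managed cluster they were made through the HBase config screens rather than by hand, and the table name below is a placeholder:

# hbase-site.xml (sketch): more RPC handler threads per RegionServer
#   <property>
#     <name>hbase.regionserver.handler.count</name>
#     <value>200</value>
#   </property>

# hbase-env (sketch): 16 GB RegionServer heap; the exact variable depends on the distro
#   export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms16g -Xmx16g"

# Manual major compaction of a heavily read table (placeholder name):
echo "major_compact 'customer_table'" | hbase shell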
04-01-2016
06:52 PM
gclog-data02.txt gclog-data06.txt
04-01-2016
04:47 PM
I don't see any RIT (regions in transition) on the HBase web UI, but I do see read requests of 7M and 3M on a few datanodes. Attaching logs. Is it advisable to restart ambari-server in this case?
04-01-2016
03:09 PM
1 Kudo
One of my clusters shows HBase read latency of 200000000 ms, which is impacting cluster performance and hence certain jobs. I need to resolve this as early as possible.
Labels:
- Apache HBase
02-18-2016
07:18 AM
1 Kudo
Thanks for the link. This issue is being faced by one of our teams, and they only have access to Hue, not the command line. They are uploading archived ZIP files to HDFS, and the sizes range from 2 GB to 900 GB. I have already told them that Hue is not meant for uploading such big files to HDFS, and I am not able to find the background functionality of the file uploader documented anywhere online.
02-18-2016
06:41 AM
2 Kudos
I am trying to upload files from Hue to HDFS using File Browser. Every time I upload a large file (in the GBs), a temporary file of the same size is created in /tmp (e.g. tmpxxx.upload), which fills up the whole / filesystem to 100%. My question is how to move these temporary files from /tmp to some other location that has sufficient space.
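For reference, how I confirmed the behaviour, watching /tmp while an upload was in progress; the tmp*.upload glob just matches the file name pattern I observed:

# Watch the temporary upload file grow and / fill up during a Hue File Browser upload:
watch -n 5 'ls -lh /tmp/tmp*.upload 2>/dev/null; df -h /'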
Tags:
- Hadoop Core
- hue
Labels:
- Cloudera Hue
01-18-2016
12:18 PM
@Artem Ervits no networking or firewall issue as such
01-18-2016
12:18 PM
a) Datanodes are up and running. b) Will check the tcpdump and send the output
01-14-2016
05:31 AM
1 Kudo
I am frequently getting the error messages below on a datanode:

ERROR datanode.DataNode (DataXceiver.java:run(250)) - X.X.X.X6:50010:DataXceiver error processing READ_BLOCK operation src: /x.x.x.7:49636 dst: /x.x.x.6:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/x.x.x.6:50010 remote=/x.x.x.7:49636]
at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:172)
at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:220)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:547)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:716)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:110)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
at java.lang.Thread.run(Thread.java:745)
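The 480000 millis in the trace matches the default datanode socket write timeout, so one option I am considering (a sketch only; the values are illustrative and not yet a confirmed fix for this cluster) is to raise the socket timeouts in hdfs-site.xml:

<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value> <!-- raised from the 480000 ms default seen in the error -->
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>120000</value> <!-- client-side read timeout; illustrative -->
</property>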
Labels:
- Apache Hadoop
12-17-2015
08:00 AM
fsck output:
......Status: HEALTHY
Total size: 15311550475135 B (Total open files size: 36 B)
Total dirs: 543341
Total files: 1572526
Total symlinks: 0 (Files currently being written: 5)
Total blocks (validated): 1635306 (avg. block size 9363110 B) (Total open file blocks (not validated): 4)
Minimally replicated blocks: 1635306 (99.99999 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 43091 (2.635042 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.9589655
Corrupt blocks: 0
Missing replicas: 67060 (1.3669327 %)
Number of data-nodes: 3
Number of racks: 1
12-17-2015
07:52 AM
DFSadmin report:

Configured Capacity: 107909857935360 (98.14 TB)
Present Capacity: 107907646763463 (98.14 TB)
DFS Remaining: 63959036886116 (58.17 TB)
DFS Used: 43948609877347 (39.97 TB)
DFS Used%: 40.73%
Under replicated blocks: 35016
Blocks with corrupt replicas: 113
Missing blocks: 0

Datanodes available: 3 (3 total, 0 dead)

Live datanodes:
Name: x.x.x.x4
Hostname: x.x.x.x4
Decommission Status : Normal
Configured Capacity: 35969952645120 (32.71 TB)
DFS Used: 15310914557283 (13.93 TB)
Non DFS Used: 821825081 (783.75 MB)
DFS Remaining: 20658216262756 (18.79 TB)
DFS Used%: 42.57%
DFS Remaining%: 57.43%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 14
Last contact: Wed Dec 16 20:17:25 AEDT 2015

Name: x.x.x.x6
Hostname: x.x.x.x6
Decommission Status : Normal
Configured Capacity: 35969952645120 (32.71 TB)
DFS Used: 14348334051328 (13.05 TB)
Non DFS Used: 497512448 (474.46 MB)
DFS Remaining: 21621121081344 (19.66 TB)
DFS Used%: 39.89%
DFS Remaining%: 60.11%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1025
Last contact: Wed Dec 16 20:17:23 AEDT 2015

Name: x.x.x.x5
Hostname: x.x.x.x5
Decommission Status : Normal
Configured Capacity: 35969952645120 (32.71 TB)
DFS Used: 14289361268736 (13.00 TB)
Non DFS Used: 891834368 (850.52 MB)
DFS Remaining: 21679699542016 (19.72 TB)
DFS Used%: 39.73%
DFS Remaining%: 60.27%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1025
12-16-2015
09:42 AM
1 Kudo
The Ambari UI shows corrupt blocks, but the fsck output shows no corrupt blocks and reports the filesystem under / as healthy. Also, when I run the dfsadmin report, 'Blocks with corrupt replicas' shows the same number that Ambari is showing.
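For reference, the commands I used to compare the two numbers; my understanding (still to be confirmed) is that fsck's 'Corrupt blocks' only counts blocks with no healthy replica left, while 'Blocks with corrupt replicas' counts blocks that still have at least one good copy:

# Count reported by dfsadmin (and, I believe, what Ambari surfaces):
hdfs dfsadmin -report | grep -i corrupt

# Blocks fsck considers corrupt, i.e. no healthy replica remaining:
hdfs fsck / -list-corruptfileblocks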
Labels:
- Apache Ambari
- Apache Hadoop
12-14-2015
02:47 PM
total used free shared buffers cached
Mem: 251 2 249 0 0 0
-/+ buffers/cache: 1 250
Swap: 7 0 7
12-14-2015
02:31 PM
The Ambari agent hung and gives the error: '/usr/sbin/ambari-agent : fork : cannot allocate memory'.
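For reference, the checks I plan to run next, since the free output above shows plenty of memory; which limit is actually being hit is still an open question:

# Per-user process limit for the user running the agent (usually root):
ulimit -u

# Kernel-wide limits that can also cause fork failures:
cat /proc/sys/kernel/pid_max
cat /proc/sys/kernel/threads-max
sysctl vm.overcommit_memory vm.overcommit_ratio

# How many processes/threads already exist on the host:
ps -eLf | wc -l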
Tags:
- Ambari
- Hadoop Core
Labels:
- Apache Ambari