Member since: 02-06-2018
Posts: 47
Kudos Received: 6
Solutions: 0
01-18-2019
10:51 PM
Not exactly. As I mentioned, the state can only be STARTED or INSTALLED. I want to see whether the service is facing any issue.
01-18-2019
07:34 PM
Hi guys, I am trying to check the service status with the Ambari REST API; however, I am not able to find any documentation that explains things in detail. For example, if I hit the following REST URL: http://localhost:8080/api/v1/clusters/Sandbox/services/HDFS I get the following output: <code>"maintenance_state": "OFF",
"repository_state": "CURRENT",
"service_name": "HDFS",
"state": "STARTED"
},
"alerts_summary": {
"CRITICAL": 0,
"MAINTENANCE": 8,
"OK": 293,
"UNKNOWN": 4,
"WARNING": 0
} However, I am not sure how to interpret this. Should I care about MAINTENANCE, UNKNOWN, and WARNING, or is checking that nothing is CRITICAL good enough? This is mainly for developers to understand and track how long any service is down.
Labels:
- Apache Ambari
10-01-2018
07:30 PM
I am trying to understand the Hive query plan for a simple DISTINCT query, and I have a small confusion regarding the output of one of the stages. I have a simple table with just two columns, id and value, and just 4 rows, as shown below. Data: <code>hive> select * from temp.test_distinct;
OK
1 100
2 100
3 100
4 150
Plan: <code>hive> explain select distinct value from temp.test_distinct;
OK
Plan not optimized by CBO.
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 2
File Output Operator [FS_6]
compressed:false
Statistics:Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
Group By Operator [GBY_4]
| keys:KEY._col0 (type: string)
| outputColumnNames:["_col0"]
| Statistics:Num rows: 2 Data size: 10 Basic stats: COMPLETE Column stats: NONE
|<-Map 1 [SIMPLE_EDGE]
Reduce Output Operator [RS_3]
key expressions:_col0 (type: string)
Map-reduce partition columns:_col0 (type: string)
sort order:+
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Group By Operator [GBY_2]
keys:value (type: string)
outputColumnNames:["_col0"]
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Select Operator [SEL_1]
outputColumnNames:["value"]
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
TableScan [TS_0]
alias:test_distinct
Statistics:Num rows: 4 Data size: 20 Basic stats: COMPLETE Column stats: NONE
Time taken: 0.181 seconds, Fetched: 35 row(s)
Confusion: TableScan, Select Operator, and Group By Operator show that they processed 4 rows, which makes sense to me. But shouldn't the stage after the Group By Operator receive only 2 rows to process, since the group by removes the other rows? In my DAG I can see that the output of the mapper is just two rows, not four; however, that doesn't seem to match the plan. Am I looking at it wrong?
Labels:
- Apache Hadoop
- Apache Hive
09-12-2018
09:27 PM
@Venkatesh Kancharla Please open up a new question with all the details and logs.
09-07-2018
03:08 PM
Compaction works only on transactional tables, and to make a table transactional it must meet the following properties:
- It should be an ORC table
- It should be bucketed
- It should be a managed table
As the last point implies, you can't run compaction on a non-transactional table. If you do it from Hive you will definitely get an error; I am not sure about Spark.
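The checklist above can be summarized as a toy predicate. The dictionary keys here only loosely mirror Hive table properties and are illustrative, not the metastore's actual schema:

```python
def can_compact(table):
    """Rough sketch of the compaction preconditions listed above.

    In classic Hive ACID terms a table is a compaction candidate only
    if it is transactional: ORC format, bucketed, and managed.
    The keys below are illustrative, not real metastore fields.
    """
    return (table.get("format") == "ORC"
            and table.get("num_buckets", 0) > 0
            and table.get("type") == "MANAGED"
            and table.get("transactional") is True)

print(can_compact({"format": "ORC", "num_buckets": 4,
                   "type": "MANAGED", "transactional": True}))   # True
print(can_compact({"format": "ORC", "num_buckets": 4,
                   "type": "EXTERNAL", "transactional": False}))  # False
```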
09-05-2018
05:54 PM
You are not getting the desired result because your compaction has failed. Please check the YARN logs to understand what might have gone wrong.
08-13-2018
07:10 PM
I found the solution, so I am posting it here. The problem was that after changing the command I used to start the HDP image, I was just stopping the Docker container; I didn't realize I needed to remove the container as well. The following steps helped.

Save the Docker work: <code>docker commit <hdp_container_id> <hdp_container_id>

Stop and remove the container: <code>docker stop <hdp_container_id>
docker rm <hdp_container_id>

Open port 9083 (Hive metastore) by modifying start-sandbox-hdp-standalone_2-6-4.sh: <code>#!/bin/bash
echo "Waiting for docker daemon to start up:"
until docker ps 2>&1| grep STATUS>/dev/null; do sleep 1; done; >/dev/null
docker ps -a | grep sandbox-hdp
if [ $? -eq 0 ]; then
docker start sandbox-hdp
else
docker pull hortonworks/sandbox-hdp-standalone:2.6.4
docker run --name sandbox-hdp --hostname "sandbox-hdp.hortonworks.com" --privileged -d \
-p 9083:9083 \

Start Docker: <code>./start-sandbox-hdp-standalone_2-6-4.sh
08-10-2018
02:15 PM
@Sandeep Nemuri How do I check whether the metastore is up and running from my local machine? I logged into the Docker container and I am able to telnet to port 9083 there. However, if I try to do that from my local machine, it doesn't work. One thing I realized is that the port is not exposed in the Docker image or mentioned anywhere on this page: https://hortonworks.com/tutorial/hortonworks-sandbox-guide/section/3/ I exposed the port and restarted the Docker container; however, I am still not able to connect to that port using telnet from my local machine or from the Presto server (which is also on my local machine).
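As an alternative to telnet for checking reachability from the local machine, a small TCP probe could look like this (the host and port are just the values from this thread, not a prescribed check):

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds
    within `timeout` seconds, False on refusal, timeout, or
    DNS failure (all raise OSError subclasses)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the metastore port discussed above; no expected output,
# since the result depends on whether the sandbox is running.
print(port_open("localhost", 9083))
```

If this returns False from the host but True from inside the container, the port mapping (`-p 9083:9083`) is the usual suspect.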
08-10-2018
12:59 AM
I am trying to connect to the Hive metastore on my HDP sandbox. However, it's throwing the following error. Hive catalog: <code>connector.name=hive-hadoop2
hive.metastore.uri=thrift://sandbox-hdp.hortonworks.com:9083
hive.metastore.authentication.type=NONE

Error: <code>Query 20180810_005352_00000_umgac failed: Failed connecting to Hive metastore: [sandbox-hdp.hortonworks.com:9083]

I tried the following values for hive.metastore.uri; however, I am getting the same error:
- thrift://localhost:9083
- thrift://127.0.0.1:9083
- thrift://<IP of docker container>:9083
- thrift://<IP of local machine>:9083
Labels:
- Apache Hive
07-11-2018
02:26 PM
Hi guys,
I am trying to load data files into my Hive table and facing an issue. If the files are located on the local filesystem it doesn't work; if I move the file to HDFS then it works without any issue. The following command is not working in beeline; however, it works perfectly in the hive CLI: <code>load data local inpath '/home/gaurang.shah/test.json' into table temp.test;

The data is located on the node where one of the HiveServer2 instances is running. I have given it all the permissions as well: <code>[gaurang.shah@aa ~]$ pwd
/home/gaurang.shah
[gaurang.shah@aa ~]$ ll test.json
-rwxrwxrwx 1 gaurang.shah domain users 56 Jul 11 13:54 test.json
Labels:
- Apache Hive