Member since: 04-11-2016
Posts: 535
Kudos Received: 147
Solutions: 77
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3963 | 09-17-2018 06:33 AM
 | 858 | 08-29-2018 07:48 AM
 | 1458 | 08-28-2018 12:38 PM
 | 929 | 08-03-2018 05:42 AM
 | 964 | 07-27-2018 04:00 PM
01-21-2019
07:40 AM
@abbas mohammadnejad Please refer to the post Knox+HDFS UI. It explains the configuration of the HDFS UI through Knox.
01-21-2019
07:29 AM
@Abhishek Gupta Hive MERGE is available from Hive 2.2 [Details]. Backporting the feature to Hive 1.2.1 would be a challenge, since it includes many changes related to ACID, transactions, etc. You could also use Hive MERGE on Hive 2.1 without LLAP by setting hive.llap.execution.mode=none [Property details]. Hope this is helpful.
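For illustration, a minimal MERGE sketch (table and column names are hypothetical; the target must be an ACID table):
merge into target_table t
using source_table s
on t.id = s.id
when matched then update set val = s.val
when not matched then insert values (s.id, s.val);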
01-21-2019
06:50 AM
@Joshva Peter The above message "Operation category READ is not supported in state standby" appears when the Balancer connects to the Standby NameNode; the connection then fails over to the Active NameNode. Since there is no error reported, capture the debug logs as below to analyse the issue:
export HADOOP_BALANCER_OPTS="-Droot.logger=DEBUG,console"
Run the balancer as:
hdfs balancer <options> 2>&1 | tee /tmp/balancer.log
11-12-2018
08:26 AM
@Zholaman Kubaliyev The issue could be related to an incorrect configuration / memory value for the property "tez.runtime.io.sort.mb". Refer to the HCC link for tuning Tez configurations.
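As a sketch, the property can be adjusted per Hive session; the value below is only an example and should be sized to fit within the Tez task's container memory:
set tez.runtime.io.sort.mb=512;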
10-05-2018
11:42 AM
@yogesh turkane It seems like your coordinator process is not registered, or there are some configuration issues. From the logs I observed the error below:
2018-10-04T06:43:50,101 ERROR [main] io.druid.curator.discovery.ServerDiscoverySelector - No server instance found for [druid/coordinator]
2018-10-04T06:43:50,101 WARN [main] io.druid.java.util.common.RetryUtils - Failed on try 1, retrying in 886ms.
io.druid.java.util.common.IOE: No known server
Check the configurations and restart Druid services.
10-05-2018
09:40 AM
@Anil Varghese I suspect that the table is partitioned, which is why "describe formatted" does not show any stats-related information. Try running "describe extended" for a particular partition spec.
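A sketch, assuming a hypothetical table T partitioned by a dt column:
hive> describe extended T partition (dt='2018-10-05');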
10-05-2018
09:37 AM
@Bal P You could verify whether a table is skewed with 'desc formatted <table_name>':
hive> desc formatted T;
OK
# col_name data_type comment
c1 string
c2 string
# Detailed Table Information
Database: default
Owner: hive
CreateTime: Fri Oct 05 09:16:47 UTC 2018
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://xxx:8020/apps/hive/warehouse/t
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
numFiles 0
numRows 0
rawDataSize 0
totalSize 0
transient_lastDdlTime 1538731007
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Skewed Columns: [c1]
Skewed Values: [[x1]]
Storage Desc Params:
serialization.format 1
Time taken: 0.899 seconds, Fetched: 34 row(s)
Or, query the backend Hive Metastore DB to get the list of tables and their skewed columns:
mysql> select S1.SKEWED_COL_NAME,T1.TBL_NAME from SKEWED_COL_NAMES S1, TBLS T1 where S1.SD_ID=T1.SD_ID;
+-----------------+----------+
| SKEWED_COL_NAME | TBL_NAME |
+-----------------+----------+
| c1 | t |
+-----------------+----------+
1 row in set (0.00 sec)
09-26-2018
11:26 AM
@Saurabh Do you have the datasource "Sterlingtest" available under Druid coordinator UI? If yes, then check the hive-druid configurations under Ambari -> Hive configs and see if broker and coordinator addresses are configured properly.
09-18-2018
07:05 AM
@Teddy Brewski
Below are the properties that control the logs and other files written to the /tmp/<username> folder:
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/var/log/hadoop/hive/tmp/${user.name}</value>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/var/log/hadoop/hive/tmp/hive/${hive.session.id}_resources</value>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/var/log/hadoop/hive/tmp/operations_logs</value>
</property>
You can add/modify these under Ambari -> Hive configs.
09-17-2018
10:48 AM
@Vikash Kumar The 'mapreduce.job.*' properties are only applicable to MR jobs. In Tez, the number of mappers is controlled by the parameters below:
tez.grouping.max-size (default 1073741824, i.e. 1 GB)
tez.grouping.min-size (default 52428800, i.e. 50 MB)
tez.grouping.split-count (not set by default)
And reducers are controlled in Hive with these properties:
hive.exec.reducers.bytes.per.reducer (default 256000000)
hive.exec.reducers.max (default 1009)
hive.tez.auto.reducer.parallelism (default false)
For more details, refer to the link. A sketch of setting these per session follows below.
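For example, the values can be overridden per Hive session (all values here are illustrative only):
set tez.grouping.max-size=536870912;
set tez.grouping.min-size=16777216;
set hive.exec.reducers.bytes.per.reducer=128000000;
set hive.tez.auto.reducer.parallelism=true;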
09-17-2018
10:37 AM
@Sudharsan Ganeshkumar Yes, you can increase the number of mappers to improve parallelism, depending on your cluster resources.
09-17-2018
06:33 AM
@Sudharsan Ganeshkumar -m represents the number of mappers run to extract the data from the source database. Here, '-m 1' means running one mapper.
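For example, a sketch of an import with four parallel mappers (connection details are placeholders); note that -m greater than 1 requires a --split-by column or a primary key on the source table:
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username myuser -P \
  --table my_table \
  --split-by id \
  -m 4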
09-12-2018
03:20 PM
@Saurabh Data Analytics Studio is a replacement for both the Hive and Tez views under Ambari. Refer to the links below for more details: https://hortonworks.com/products/dataplane/data-analytics-studio/ https://docs.hortonworks.com/HDPDocuments/DAS/DAS-1.0.0/index.html
09-06-2018
12:50 PM
@Kant T Hive on Spark is not supported, hence the error. You may try SparkSQL if you want to leverage Spark.
08-29-2018
08:43 AM
@Serg Serg You could monitor the Hive Tez applications from the RM UI, where you can check query status and resources used. However, you cannot check the query plan there.
08-29-2018
07:48 AM
@Vinuraj M Below is the workaround for the issue:
1. In /usr/hdp/current/superset/lib/python3.4/site-packages/superset/models.py, replace:
password = Column(EncryptedType(String(1024), config.get('SECRET_KEY')))
with:
password = Column(String(1024))
2. Then drop and re-create the database.
08-29-2018
07:41 AM
@Benhail Muthyala Sqoop eval with a select query returns the output of the query on the terminal, and storing that directly into a variable is not possible. You can do the following instead, redirecting the output to files:
sqoop eval \
  -libjars $LIB_JARS -Dteradata.db.input.job.type=hive \
  --connect "jdbc:teradata://XXXXXXx" \
  --username XXXXXX \
  --password XXXXX \
  --query "select count(*) from database_name.table_name" 1> sqoop.out 2> sqoop.err
hive -S -e "select count(*) from database_name.table_name;" 1> hive.out 2> hive.err
The files sqoop.out and hive.out would include some log messages as well, which could be grepped out.
08-29-2018
07:28 AM
1 Kudo
@James Creating Hive bucketed tables is supported from Spark 2.3 (Jira SPARK-17729). By default, Spark disallows users from writing output to Hive bucketed tables. Setting hive.enforce.bucketing=false and hive.enforce.sorting=false will allow you to save to Hive bucketed tables. If you want, you can set those two properties in Custom spark2-hive-site-override in Ambari; all Spark2 applications will then pick up the configuration. For more details, refer to Slideshare.
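For reference, a minimal sketch of the two entries as they would go into Custom spark2-hive-site-override:
<property>
  <name>hive.enforce.bucketing</name>
  <value>false</value>
</property>
<property>
  <name>hive.enforce.sorting</name>
  <value>false</value>
</property>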
08-29-2018
06:50 AM
@Arun Cherla You can refer to the link for the list of all HTTP endpoints. Please accept the answer if this helped.
08-29-2018
06:39 AM
@naveen r Running a Hive query launches a YARN application, and when the queue's resources are fully utilized, query performance is affected with respect to container allocation. With respect to HDFS, a Hive query writes intermediate results and join data spills to the user's temporary directory on HDFS; little or no free storage there can lead to query failures.
08-28-2018
12:38 PM
@Samant Thakur This is a limitation on the Sqoop side: operations on non-transactional tables are not supported. Refer to the link below to enable transaction logging on Informix, and then try Sqoop again. https://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.sqlt.doc/ids_sqt_279.htm
08-28-2018
10:51 AM
1 Kudo
@Eugene Mogilevsky Can you check the HS2 logs and ZooKeeper logs for issues? It seems like ZooKeeper is unable to connect to HS2.
08-27-2018
05:07 PM
@Jai C It seems like Sqoop exported '0' records and the mapper failed. Check the application log for errors and share the complete error stack.
08-27-2018
11:26 AM
@Jai C As mentioned in the Sqoop Jira, export into a bucketed Hive table is not supported. To export into the Hive table, recreate it without 'clustered by', as sketched below.
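A hypothetical sketch of the workaround: a CTAS copy drops the bucketing spec, so the copy can be used as the export source (table names are placeholders):
hive> create table t1_flat stored as orc as select * from t1;
Then point the Sqoop export at t1_flat instead of t1.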
08-27-2018
10:26 AM
@Jai C The error is not related to the Windows authentication; it is because the Hive/HCatalog table is a bucketed table, which is not supported by Sqoop export. You can verify this by running the following command from Hive CLI / Beeline:
show create table <database>.<table>;
The table definition would look like the below, the keyword being 'clustered by':
hive> show create table default.test_bucket;
OK
CREATE TABLE `default.test_bucket`(
`col1` int,
`col2` string)
CLUSTERED BY (
col1)
INTO 3 BUCKETS
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://xxx.com:8020/apps/hive/warehouse/test_bucket'
TBLPROPERTIES (
'numFiles'='13',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='8883',
'transactional'='true',
'transient_lastDdlTime'='1505206092')
Time taken: 0.786 seconds, Fetched: 21 row(s)
08-27-2018
10:09 AM
@Jai C Below is a sample export command that works. In your case, could you share the error seen?
sqoop export \
  --connect "jdbc:jtds:sqlserver://IE11WIN7:1433;useNTLMv2=true;domain=IE11WIN7;databaseName=default_db" \
  --table "test_table_view" \
  --hcatalog-database default \
  --hcatalog-table t1 \
  --columns col2,col3 \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver net.sourceforge.jtds.jdbc.Driver \
  --username IEUser \
  --password 'Passw0rd!' \
  --update-mode allowinsert \
  --verbose
08-27-2018
09:49 AM
@Jai C Based on the following Jira, unfortunately, it seems that the ability to import directly into Hive bucketed tables is not supported yet: https://issues.apache.org/jira/browse/SQOOP-1889 So, you would have to import the data into an intermediate table and then insert it into the bucketed table, as sketched below. Please accept the answer if this helped.
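A sketch of that two-step approach (all names and connection details are placeholders):
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username myuser -P \
  --table src_table \
  --hive-import --hive-table staging_table \
  -m 1
Then, in Hive:
set hive.enforce.bucketing=true; -- ensure rows are bucketed on insert (Hive 1.x)
insert into table bucketed_table select * from staging_table;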
08-16-2018
07:39 AM
It would be due to the amount of data being processed; with one mapper (-m 1), performance would be low. Try increasing the number of mappers, e.g. "-m 20".
08-16-2018
07:36 AM
@Saravanan Muthiah When using hive-jdbc-standalone*.jar, apart from hadoop-common*.jar, below are the other dependent jars required:
libthrift-0.9.0.jar
httpclient-4.2.5.jar
httpcore-4.2.5.jar
commons-logging-1.1.3.jar
hive-common.jar
slf4j-api-1.7.5.jar
hive-metastore.jar
hive-service.jar
hadoop-common.jar
hive-jdbc.jar
guava-11.0.2.jar
Please add the jars to the classpath on the client and try again.
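For example, a client application could be launched with all of those jars on the classpath like this (the directory and class name are hypothetical):
java -cp "/path/to/hive-jdbc-libs/*:." MyHiveJdbcClient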