Member since: 04-11-2016
Posts: 535
Kudos Received: 147
Solutions: 77
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4097 | 09-17-2018 06:33 AM
 | 892 | 08-29-2018 07:48 AM
 | 1488 | 08-28-2018 12:38 PM
 | 944 | 08-03-2018 05:42 AM
 | 983 | 07-27-2018 04:00 PM
02-15-2022
08:00 AM
Hi @CN As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
01-24-2022
02:38 AM
Hi, when I run the Hive query it shows the error below: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. This error does not occur every time; the query succeeds for some users and fails for others. Could you please suggest the reason and how to overcome it? This is urgent; could you please help us?
01-19-2021
03:38 AM
Thank you so much, Subha. It worked like magic.
12-31-2020
10:11 AM
@amol_08 Can you let me know the fix for this? Why are the logs not visible?
10-22-2020
04:59 AM
I did as @ssubhas said, setting the attributes to false:
spark.sql("SET hive.enforce.bucketing=false")
spark.sql("SET hive.enforce.sorting=false")
spark.sql("SET spark.hadoop.hive.exec.dynamic.partition = true")
spark.sql("SET spark.hadoop.hive.exec.dynamic.partition.mode = nonstrict")
newPartitionsDF.write.mode(SaveMode.Append).format("hive").insertInto(this.destinationDBdotTableName)
Spark can create the bucketed table in Hive with no issues, and it inserted the data into the table, but it totally ignored the fact that the table is bucketed, so when I open a partition I see only one file. When inserting, we should set hive.enforce.bucketing = true, not false, but then you will face the following error in the Spark logs:
org.apache.spark.sql.AnalysisException: Output Hive table `hive_test_db`.`test_bucketing` is bucketed but Spark currently does NOT populate bucketed output which is compatible with Hive.;
This means that Spark doesn't support insertion into bucketed Hive tables. The first answer in this Stack Overflow question explains that what @ssubhas suggested is a workaround that doesn't guarantee bucketing.
07-30-2020
08:55 AM
1 Kudo
Brutal, I know, but a one-liner: cd $(cat /etc/ambari-server/conf/ambari.properties | grep -i mpack|awk -F'=' '{print$2}') ; ls -l|grep -v cache |grep -v mpacks_replay.log |grep -v total |awk '{print$9}' |xargs  The last bit is handy if you want to create a Ruby fact out of the data.
06-07-2020
11:36 PM
@oudaysaada As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to include details specific to your environment that could help others give a more accurate answer to your question.
05-02-2020
09:35 AM
@ssubhas This did not work either. Can you help me out? I am unable to connect to the Hive service from PuTTY.
04-20-2020
09:47 AM
hdfs dfs -ls -R <directory> |grep part-r* |awk '{print $8}' |xargs hdfs dfs -cat | wc -l
04-19-2020
04:44 AM
Hi Manoj, We are using text files with '|' as the separator character, but the problem is that we have embedded newlines within column values, which results in empty data in Hive because it treats them as new records. The rest of the data migrates perfectly fine. Could you please suggest how to avoid newline characters within column data? Thanks & Regards, Sreeja
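For reference, one possible cleanup is to strip the embedded line breaks in a staging step before the data reaches the final Hive table, using Hive's regexp_replace. The sketch below is only an illustration under assumed names: staging_table, final_table, col1 and col2 are hypothetical placeholders, and the pattern simply replaces carriage returns/newlines with a space.
-- Hypothetical example: copy from a raw/staging table into the final table,
-- replacing embedded line breaks inside string columns with a space.
INSERT OVERWRITE TABLE final_table
SELECT
  col1,
  regexp_replace(col2, '[\\r\\n]', ' ') AS col2
FROM staging_table;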
03-06-2020
06:06 PM
@sri_man
Since this thread was marked 'Solved' back in 2016, you would have a better chance of receiving a relevant response by posting a new question. This will also give you the opportunity to include details specific to your environment that could help other members provide a more tailored answer to your issue.
02-19-2020
09:20 AM
Writing this so that it can help someone in the future: I was installing Hive and getting an error that the Hive metastore wasn't able to connect, and I resolved it by recreating the Hive metastore database. Somehow the user that was created in the MySQL Hive metastore wasn't working properly and could not authenticate. So I dropped the metastore DB, dropped the user, recreated the metastore DB, recreated the user, granted all privileges, and then it worked without issues.
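For anyone following the same steps, here is a minimal sketch of the MySQL commands involved. The database name (metastore), user (hive), host pattern ('%') and password are assumptions and must match what is configured in your hive-site.xml / Ambari settings.
-- Assumed names and password; adjust to your environment.
DROP DATABASE IF EXISTS metastore;
DROP USER IF EXISTS 'hive'@'%';   -- DROP USER IF EXISTS needs MySQL 5.7+
CREATE DATABASE metastore;
CREATE USER 'hive'@'%' IDENTIFIED BY 'StrongPasswordHere';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;
After recreating the database, the metastore schema usually has to be re-initialized (for example with Hive's schematool -initSchema -dbType mysql) before restarting the metastore service.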
01-05-2020
07:07 AM
This solution is not working for me; please tell me where I am going wrong:
sqoop-import -Dmapreduce.job.user.classpath.first=true -Dhadoop.security.credential.provider.path=jceks://x.jceks \
--connect="jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username retail_dba \
--password cloudera \
--table=departments \
--hive-import \
--target-dir=/departments \
--as-avrodatafile
10-27-2019
12:11 AM
Why has Kylo changed, and what is the future roadmap for Kylo? Will it no longer be a good fit for enterprise data flow management like NiFi? Why Kylo when we have NiFi?
09-17-2019
11:58 AM
@arun_cherlareac Were you able to fix this?
10-09-2018
05:57 AM
Hi Sindhu, I have not made any configuration changes while installing; I followed all the Hortonworks docs to install Druid. Can you tell me which config needs to be changed? Thanks.
09-26-2018
12:07 PM
Yes @Sindhu, I can see the datasource "Sterlingtest" in Superset, but surprisingly, when I logged into the MySQL backend and queried it, I do not see any data source there.
mysql> select * from druid_dataSource;
Empty set (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| druid              |
+--------------------+
2 rows in set (0.00 sec)

mysql> show tables;
+-----------------------+
| Tables_in_druid       |
+-----------------------+
| druid_audit           |
| druid_config          |
| druid_dataSource      |
| druid_pendingSegments |
| druid_rules           |
| druid_segments        |
| druid_supervisors     |
| druid_tasklocks       |
| druid_tasklogs        |
| druid_tasks           |
+-----------------------+
10 rows in set (0.00 sec)

mysql> select * from druid_audit;
Empty set (0.00 sec)

mysql> select * from druid_pendingSegments;
Empty set (0.01 sec)

mysql> select * from druid_rules;
+-----------------------------------+------------+--------------------------+-----------------------------------------------------------------+
| id                                | dataSource | version                  | payload                                                         |
+-----------------------------------+------------+--------------------------+-----------------------------------------------------------------+
| _default_2018-08-29T09:47:21.779Z | _default   | 2018-08-29T09:47:21.779Z | [{"tieredReplicants":{"_default_tier":2},"type":"loadForever"}] |
+-----------------------------------+------------+--------------------------+-----------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
09-17-2018
10:48 AM
@Vikash Kumar The 'mapreduce.job.*' properties are only applicable to MR jobs. In Tez, the number of mappers is controlled by the parameters below:
tez.grouping.max-size (default 1073741824, which is 1 GB)
tez.grouping.min-size (default 52428800, which is 50 MB)
tez.grouping.split-count (not set by default)
And reducers are controlled in Hive with these properties:
hive.exec.reducers.bytes.per.reducer (default 256000000)
hive.exec.reducers.max (default 1009)
hive.tez.auto.reducer.parallelism (default false)
For more details, refer to the link.
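As a hedged illustration only (the property names come from the list above; the numeric values are example assumptions, not recommendations), these settings are typically overridden per session before running the query:
-- Illustrative session-level overrides for a Hive-on-Tez query.
SET tez.grouping.min-size=134217728;                  -- 128 MB lower bound per split group
SET tez.grouping.max-size=536870912;                  -- 512 MB upper bound per split group
SET hive.exec.reducers.bytes.per.reducer=134217728;   -- roughly one reducer per 128 MB of input
SET hive.tez.auto.reducer.parallelism=true;           -- let Tez adjust the reducer count at runtime
Smaller grouping sizes generally mean more mappers, and fewer bytes per reducer generally mean more reducers.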
03-04-2019
07:51 AM
You can modify the hive.distro script and have the login authentication entered within the script itself.
08-16-2018
11:02 AM
@Sudharsan Ganeshkumar You are not seeing anything because you are running the command as the root user! You will have to switch to the hive user and use hive or beeline:
# su - hive
$ hive
Then at the prompt run the create statement:
hive> CREATE TABLE IF NOT EXISTS emp ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
And then run:
hive> show tables;
HTH
08-13-2018
12:59 PM
@rinu shrivastav The split size is calculated by the formula:
max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size))
Say the HDFS block size is 64 MB and the minimum split size is set to 128 MB; then the split size will be 128 MB:
split size = max(128, min(256, 64)) = 128 MB
To read 256 MB of data, there will be two mappers. To increase the number of mappers, you could decrease the minimum split size down to the HDFS block size.
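As a minimal sketch (using the classic property names from the formula above; the newer equivalents are mapreduce.input.fileinputformat.split.minsize and .maxsize, and the 64 MB values are example assumptions), forcing smaller splits from a Hive session could look like this:
-- Example only: cap splits at the 64 MB block size so the 256 MB input above yields four mappers instead of two.
SET mapred.min.split.size=67108864;   -- 64 MB
SET mapred.max.split.size=67108864;   -- 64 MB
With both bounds at the block size, split size = max(64, min(64, 64)) = 64 MB.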
08-06-2018
12:14 PM
@abcwt112
Can you check whether the Hive metastore process is running with the command 'ps -ef | grep -i metastore'? If it is not running, check for errors under /var/log/hive/hivemetastore.log.
08-03-2018
07:25 AM
2 Kudos
Thank you @Sindhu and @Rakesh S. I did a root cause analysis and found that our server is hosted in AWS, which is a public cloud, and we had not set up Kerberos or firewalls. On the nodes I can find the process w.conf running: yarn 21775 353 0.0 470060 12772 ? Ssl Aug02 5591:25 /var/tmp/java -c /var/tmp/w.conf Within /var/temp I can see a config.json which contains: {
"algo": "cryptonight", // cryptonight (default) or cryptonight-lite
"av": 0, // algorithm variation, 0 auto select
"background": true, // true to run the miner in the background
"colors": true, // false to disable colored output
"cpu-affinity": null, // set process affinity to CPU core(s), mask "0x3" for cores 0 and 1
"cpu-priority": null, // set process priority (0 idle, 2 normal to 5 highest)
"donate-level": 1, // donate level, mininum 1%
"log-file": null, // log all output to a file, example: "c:/some/path/xmrig.log"
"max-cpu-usage": 95, // maximum CPU usage for automatic mode, usually limiting factor is CPU cache not this option.
"print-time": 60, // print hashrate report every N seconds
"retries": 5, // number of times to retry before switch to backup server
"retry-pause": 5, // time to pause between retries
"safe": false, // true to safe adjust threads and av settings for current CPU
"threads": null, // number of miner threads
"pools": [
{
"url": "158.69.133.20:3333", // URL of mining server
"user": "4AB31XZu3bKeUWtwGQ43ZadTKCfCzq3wra6yNbKdsucpRfgofJP3YwqDiTutrufk8D17D7xw1zPGyMspv8Lqwwg36V5chYg", // username for mining server
"pass": "x", // password for mining server
"keepalive": true, // send keepalived for prevent timeout (need pool support)
"nicehash": false // enable nicehash/xmrig-proxy support
},
{
"url": "192.99.142.249:3333", // URL of mining server
"user": "4AB31XZu3bKeUWtwGQ43ZadTKCfCzq3wra6yNbKdsucpRfgofJP3YwqDiTutrufk8D17D7xw1zPGyMspv8Lqwwg36V5chYg", // username for mining server
"pass": "x", // password for mining server
"keepalive": true, // send keepalived for prevent timeout (need pool support)
"nicehash": false // enable nicehash/xmrig-proxy support
},
{
"url": "202.144.193.110:3333", // URL of mining server
"user": "4AB31XZu3bKeUWtwGQ43ZadTKCfCzq3wra6yNbKdsucpRfgofJP3YwqDiTutrufk8D17D7xw1zPGyMspv8Lqwwg36V5chYg", // username for mining server
"pass": "x", // password for mining server
"keepalive": true, // send keepalived for prevent timeout (need pool support)
"nicehash": false // enable nicehash/xmrig-proxy support
}
],
"api": {
"port": 0, // port for the miner API https://github.com/xmrig/xmrig/wiki/API
"access-token": null, // access token for API
"worker-id": null // custom worker-id for API
}
}
which clearly shows that a mining attack has affected our system. Worse, all the files were created and the processes were running with root permissions. Even though I could not confirm the root cause, I guess some attacker got access to our unprotected/unrestricted port 8088, identified that the cluster is not Kerberized, tried some brute force, and cracked our root password. He thus logged in to our AWS cluster and gained full access to it. Conclusion:
1. Enable Kerberos, add Knox, and secure your servers.
2. Try to enable a VPC.
3. Refine the security groups to whitelist only the needed IPs and ports for HTTP and SSH.
4. Use high-security passwords on public clouds.
5. Change the default static user in Hadoop: Ambari > HDFS > Configurations > Custom core-site > Add Property hadoop.http.staticuser.user=yarn
08-03-2018
05:56 AM
@Moises Silva There is a failed query with respect to the ORDER BY. Can you check for the error in the application log? Also, the running queries are not assigned to any DAG. Did you check whether you have enough resources in the RM UI?
10-08-2018
03:38 PM
Same problem but more complex. Can you help me?
The oozie DB is created, the user/pass and privileges are set OK, and the connection test is OK. I can connect through the command line from the same server, emulating the JDBC connector with sqlline: # java -Djava.ext.dirs=/home/user/jline_sqlline__mysql_connector/ sqlline.SqlLine
sqlline version 1.0.2 by Marc Prud'hommeaux
sqlline> !connect jdbc:mysql://pro-hadoop-ambari/oozie oozie XXXXXXX
Connecting to jdbc:mysql://pro-hadoop-ambari/oozie
Connected to: MySQL (version 5.7.23)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Autocommit status: true
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:mysql://pro-hadoop-ambari/oozie>
But ... the service doesn't start due to a JDBC error 😞
Validate DB Connection
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.0-292/oozie/libserver/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.0-292/oozie/lib/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
DONE
DB schema does not exist
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
Error: A connection could not be obtained for driver class "com.mysql.jdbc.Driver" and URL "jdbc:mysql://pro-hadoop-ambari/oozie". You may have specified an invalid URL.
Stack trace for the error was (for debug purposes):
--------------------------------------
<openjpa-2.4.1-r422266:1730418 fatal user error> org.apache.openjpa.util.UserException: A connection could not be obtained for driver class "com.mysql.jdbc.Driver" and URL "jdbc:mysql://pro-hadoop-ambari/oozie". You may have specified an invalid URL.
at org.apache.openjpa.jdbc.schema.DataSourceFactory.newConnectException(DataSourceFactory.java:272)
at org.apache.openjpa.jdbc.schema.DataSourceFactory.installDBDictionary(DataSourceFactory.java:258)
at org.apache.openjpa.jdbc.conf.JDBCConfigurationImpl.getConnectionFactory(JDBCConfigurationImpl.java:733)
at org.apache.openjpa.jdbc.conf.JDBCConfigurationImpl.getDataSource(JDBCConfigurationImpl.java:878)
at org.apache.openjpa.jdbc.conf.JDBCConfigurationImpl.getDataSource2(JDBCConfigurationImpl.java:920)
at org.apache.openjpa.jdbc.schema.SchemaTool.<init>(SchemaTool.java:132)
at org.apache.openjpa.jdbc.meta.MappingTool.newSchemaTool(MappingTool.java:314)
at org.apache.openjpa.jdbc.meta.MappingTool.record(MappingTool.java:495)
at org.apache.openjpa.jdbc.meta.MappingTool.run(MappingTool.java:1095)
at org.apache.openjpa.jdbc.meta.MappingTool.run(MappingTool.java:1006)
at org.apache.openjpa.jdbc.meta.MappingTool$1.run(MappingTool.java:939)
at org.apache.openjpa.lib.conf.Configurations.launchRunnable(Configurations.java:762)
at org.apache.openjpa.lib.conf.Configurations.runAgainstAllAnchors(Configurations.java:752)
at org.apache.openjpa.jdbc.meta.MappingTool.main(MappingTool.java:934)
at org.apache.oozie.tools.OozieDBCLI.createUpgradeDB(OozieDBCLI.java:1191)
at org.apache.oozie.tools.OozieDBCLI.createDB(OozieDBCLI.java:198)
at org.apache.oozie.tools.OozieDBCLI.run(OozieDBCLI.java:131)
at org.apache.oozie.tools.OozieDBCLI.main(OozieDBCLI.java:79)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Access denied for user 'oozie'@'pro-hadoop-ambari' (using password: YES))
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1549)
at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1388)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at org.apache.openjpa.jdbc.schema.DBCPDriverDataSource.getDBCPConnection(DBCPDriverDataSource.java:74)
at org.apache.openjpa.jdbc.schema.AutoDriverDataSource.getConnection(AutoDriverDataSource.java:42)
at org.apache.openjpa.jdbc.schema.SimpleDriverDataSource.getConnection(SimpleDriverDataSource.java:76)
at org.apache.openjpa.lib.jdbc.DelegatingDataSource.getConnection(DelegatingDataSource.java:118)
at org.apache.openjpa.lib.jdbc.DecoratingDataSource.getConnection(DecoratingDataSource.java:92)
at org.apache.openjpa.jdbc.schema.DataSourceFactory.installDBDictionary(DataSourceFactory.java:250)
... 16 more
Caused by: java.sql.SQLException: Access denied for user 'oozie'@'pro-hadoop-ambari' (using password: YES)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:927)
at com.mysql.jdbc.MysqlIO.proceedHandshakeWithPluggableAuthentication(MysqlIO.java:1709)
at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1252)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2488)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2521)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2306)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:839)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:49)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:421)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:350)
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at org.apache.commons.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:1556)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1545)
... 24 more
--------------------------------------
07-11-2018
12:13 PM
@Anjali Shevadkar You are right; that's why I asked you to check the Hive CLI. So it seems to be some configuration issue in your Ranger. Did you try to connect using the ZK hosts in your connection string? I suggest you check the following document and verify the permissions on HDFS. Let me know if this works for you. Make sure the user that you configure is the same as the Unix user (or LDAP, whatever). Try configuring another user to test. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/configure_ranger_authentication.html Another important thing: check the permissions on your HDFS, because when you are using Ranger you need to change the owner/group and permissions. https://br.hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/
01-16-2019
08:28 AM
@Sindhu Hi Sindhu, what is the connection string for the HTTP mode of Hive with a Kerberized cluster? I am unable to connect via the SQLAlchemy URI; for binary mode it is working fine. Please help me out. I am getting the error below when connecting to the HTTP mode of Hive (Knox): ERROR: {"error": "Connection failed!\n\nThe error message returned was:\nTSocket read 0 bytes"} Thanks in advance.
07-02-2018
07:35 AM
@Asom Alimdjanov Can you verify whether the httpclient*.jar is the same under the <installation>/hive/lib location?
07-27-2018
06:17 PM
@Gayathri Devi Sample R script:
library(DBI)
library(rJava)
library(RJDBC)
hadoop.class.path = list.files(path=c("/usr/hdp/2.4.0.0-169/hadoop"), pattern="jar", full.names=T);
hive.class.path = list.files(path=c("/usr/hdp/current/hive-client/lib"), pattern="jar", full.names=T);
hadoop.lib.path = list.files(path=c("/usr/hdp/current/hive-client/lib"), pattern="jar", full.names=T);
mapred.class.path = list.files(path=c("/usr/hdp/current/hadoop-mapreduce-client/lib"), pattern="jar", full.names=T);
cp = c(hive.class.path, hadoop.lib.path, mapred.class.path, hadoop.class.path)
.jinit(classpath=cp)   # initialise the JVM with the Hadoop/Hive jars on the classpath
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "hive-jdbc.jar", identifier.quote="`")
conn <- dbConnect(drv, "jdbc:hive2://ixxx:10000/default", "hive", "hive")
show_databases <- dbGetQuery(conn, "show databases")
(OR)
library("DBI")
library("rJava")
library("RJDBC")
hive.class.path = list.files(path=c("/usr/hdp/current/hive-client/lib"), pattern="jar", full.names=T);
hadoop.lib.path = list.files(path=c("/usr/hdp/current/hive-client/lib"), pattern="jar", full.names=T);
hadoop.class.path = list.files(path=c("/usr/hdp/2.4.0.0-169/hadoop"), pattern="jar", full.names=T);
cp = c(hive.class.path, hadoop.lib.path, hadoop.class.path, "/usr/hdp/2.4.0.0-169/hadoop-mapreduce/hadoop-mapreduce-client-core.jar")
.jinit(classpath=cp)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "hive-jdbc.jar", identifier.quote="`")
url.dbc <- paste0("jdbc:hive2://xxx:10000/default");
conn <- dbConnect(drv, url.dbc, "hive", "hive")
dbListTables(conn)
04-06-2018
07:20 AM
@Mohd Azhar What is the version of Ambari in use?