Member since: 03-26-2017
Posts: 61
Kudos Received: 1
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3009 | 08-27-2018 03:19 PM
 | 24938 | 08-27-2018 03:18 PM
 | 9541 | 04-02-2018 01:54 PM
03-27-2018
05:13 AM
Hi All, I'm getting the following error when trying to connect to Hive from my local Windows 10 machine. I have configured Hadoop and Spark locally, and reading files from HDFS works fine; the issue is only with Hive. Please help me out with this.
Exception in thread "main" java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfigurations(HiveUtils.scala:200)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:265)
at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
at org.apache.spark.sql.hive.HiveExternalCatalog$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:195)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$instantiateSessionState(SparkSession.scala:1059)
at org.apache.spark.sql.SparkSession$anonfun$sessionState$2.apply(SparkSession.scala:137)
at org.apache.spark.sql.SparkSession$anonfun$sessionState$2.apply(SparkSession.scala:136)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:136)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:133)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:632)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691)
at MyProgram$.main(MyProgram.scala:54)
at MyProgram.main(MyProgram.scala)
Process finished with exit code 1
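For context, this NoSuchFieldError is typically seen when the Hive client jars on the classpath are a different major version than the one Spark was built against (HIVE_STATS_JDBC_TIMEOUT exists in Hive 1.2.x but was removed in Hive 2.x, and Spark 2.x's HiveUtils still references it). Below is a minimal sketch of the kind of driver program that reaches this code path; the object name only mirrors MyProgram.scala from the stack trace, and the table default.sample_table is a placeholder:

```scala
import org.apache.spark.sql.SparkSession

object MyProgram {
  def main(args: Array[String]): Unit = {
    // Local session with Hive support enabled; the first sql() call
    // initialises the Hive metastore client, which is where the
    // NoSuchFieldError above is thrown when Hive jar versions clash.
    val spark = SparkSession.builder()
      .appName("HiveConnectTest")
      .master("local[*]")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder query against a hypothetical Hive table.
    spark.sql("SELECT * FROM default.sample_table").show()

    spark.stop()
  }
}
```

If a version mismatch is indeed the cause here, aligning the hive-exec / spark-hive dependency versions on the classpath (or relying on the Hive jars bundled with the Spark distribution instead of standalone Hive 2.x jars) is usually what resolves it.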
Labels:
- Apache Hive
- Apache Spark
03-06-2018
09:47 AM
Hi All, I need to limit the maximum number of threads used by each process group in my NiFi cluster. For example, I have 5 process groups for five countries, and I need to cap each one's thread usage at a maximum limit. Is there any way to do this?
Labels:
- Apache NiFi
03-06-2018
06:19 AM
Hi Shu, that works great. Thanks for your solution. 🙂
03-05-2018
03:21 PM
Hi Shu, thanks for replying. My actual problem is that I have more than 200 files, ranging in size from 100 MB to 4 GB, and I need to split them across 4 concurrent processors. Could you give me some suggestions for this? The real problem is that I can't route based on file size: the first flow file should go to the first processor, the second to the second processor, the third to the third, the fourth to the fourth, then the fifth back to the first processor again, and this loop should continue until my last file.
03-05-2018
01:23 PM
1 Kudo
Hi All, please help me out with this issue. I have 10 flow files of the following sizes: 3 files of 100-500 MB, 2 files of 1 GB, 2 files of 1.8 GB, and 3 files of 50 MB. I need to write some logic that reroutes these 10 files to 4 ConvertRecord processors running in parallel. Is there any way we could do this?
04-11-2017
12:39 PM
It worked fine after altering the create-job statement to use --password-file, as follows: bin/sqoop job --meta-connect jdbc:hsqldb:hsql://localhost:16000/sqoop --create inc_imp_patient_hive -- import --connect jdbc:mysql://localhost/test --username root --password-file /mysql_pwd.pwd --table patient --check-column pid --incremental append --last-value 0 --target-dir /Sqoop/Output/hdfs/patient_Hive -m 1
04-11-2017
12:39 PM
I found that the db.password property for the job did not work when the job was created with the password inline, as follows: bin/sqoop job --meta-connect jdbc:hsqldb:hsql://localhost:16000/sqoop --create inc_imp_patient_hive -- import --connect jdbc:mysql://localhost/test --username root --password cloudera --table patient --check-column pid --incremental append --last-value 0 --target-dir /Sqoop/Output/hdfs/patient_Hive -m 1. Running job --show job_name shows db.password=cloudera, but after executing it through the Oozie workflow, the db.password value shows nothing. How do I resolve this issue?
04-11-2017
12:39 PM
Hi, I'm trying to create an Oozie workflow for a Sqoop incremental job that loads data from MySQL into a Hive external table. When I run the job via the command line, it never asks for the password, no matter how many times I run it. But when I configure and execute the same job via an Oozie workflow (through the Hue interface), it runs successfully the first time; from the second run onwards it asks for the MySQL password, which I already passed when creating the job with the parameter --password 'MYPassword'. Is there any workaround for this issue? I have altered sqoop-site.xml as follows:
<property>
  <name>sqoop.metastore.client.autoconnect.url</name>
  <value>--meta-connect jdbc:hsqldb:hsql://localhost:16000/sqoop</value>
  <description>The connect string to use when connecting to a
    job-management metastore. If unspecified, uses ~/.sqoop/.
    You can specify a different path here.
  </description>
</property>
<property>
  <name>sqoop.metastore.client.autoconnect.username</name>
  <value>SA</value>
  <description>The username to bind to the metastore.
  </description>
</property>
<property>
  <name>sqoop.metastore.client.autoconnect.password</name>
  <value></value>
  <description>The password to bind to the metastore.
  </description>
</property>
<property>
  <name>sqoop.metastore.client.record.password</name>
  <value>true</value>
  <description>If true, allow saved passwords in the metastore.
  </description>
</property>
<property>
  <name>sqoop.metastore.server.location</name>
  <value>/tmp/sqoop-metastore/shared.db</value>
  <description>Path to the shared metastore database files.
    If this is not set, it will be placed in ~/.sqoop/.
  </description>
</property>
<property>
  <name>sqoop.metastore.server.port</name>
  <value>16000</value>
  <description>Port that this metastore should listen on.
  </description>
</property>
When I try to run it the second time, it throws the following error:
14614 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
14789 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.5-cdh5.4.2
15587 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - It seems that you have launched a Sqoop metastore job via
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - Oozie with sqoop.metastore.client.record.password disabled.
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - But this configuration is not supported because Sqoop can't
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - prompt the user to enter the password while being executed
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - as Oozie tasks. Please enable sqoop.metastore.client.record
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - .password in sqoop-site.xml, or provide the password
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - explicitly using --password in the command tag of the Oozie
15588 [uber-SubtaskRunner] ERROR org.apache.sqoop.SqoopOptions - workflow file.
15650 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
15831 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
15831 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
16113 [uber-SubtaskRunner] ERROR org.apache.sqoop.manager.SqlManager - Error executing statement: java.sql.SQLException: Access denied for user 'root'@'localhost' (using password: NO)
java.sql.SQLException: Access denied for user 'root'@'localhost' (using password: NO)
Labels:
- Apache Sqoop
04-05-2017
02:05 PM
@Jay SenSharma Thanks for the URL.
04-05-2017
02:01 PM
Because, as you mentioned, it should look in a directory like 4.0.0.2.0.10.0-1 or 4.0.0.2.0.xx.xx-x, but instead it is searching for exactly oozie-4.0.0. I think it is referring to that path because of a mistake I made in the POM.xml files.