Member since: 11-12-2018
Posts: 189
Kudos Received: 177
Solutions: 32

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 511 | 04-26-2024 02:20 AM |
| | 665 | 04-18-2024 12:35 PM |
| | 3227 | 08-05-2022 10:44 PM |
| | 2938 | 07-30-2022 04:37 PM |
| | 6420 | 07-29-2022 07:50 PM |
05-28-2020
09:07 PM
2 Kudos
Hi @Karan1211, The user 'admin' does not have access to create a directory under /user because /user is owned by "hdfs" with 755 permissions, so only the hdfs user can write to it. If you want to create a home directory for admin so you can store files there, run:

sudo -u hdfs hdfs dfs -mkdir /user/admin
sudo -u hdfs hdfs dfs -chown admin /user/admin

Then, as admin, you can do:

hdfs dfs -put file /user/admin/

NOTE: If you get the authentication error below, your user account does not have enough permission to run the above commands. Try them with sudo, or first switch to the hdfs user and then execute the chown command as hdfs.

su: authentication failure

I hope this helps.
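As a quick sanity check before and after (a sketch; run it from any node with an HDFS client):

# Confirm the ownership and permissions of /user and the new home directory
hdfs dfs -ls /
hdfs dfs -ls /user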
05-28-2020
08:46 PM
1 Kudo
Hi @Heri, I just want to add a few points here. You can use the PURGE option to delete the data files along with the partition metadata, but it only works on internal/managed tables:

ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec PURGE;

External tables need a two-step process: drop the partition, then remove the files:

ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec;
hdfs dfs -rm -r <partition file path>

I hope this gives some insight here. cc @aakulov
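As a concrete sketch (the table names, partition column, JDBC URL, and paths below are hypothetical), dropping one day's partition might look like this from the shell:

# Managed table: PURGE removes the partition metadata and its data files in one step
beeline -u jdbc:hive2://localhost:10000 -e "ALTER TABLE sales DROP IF EXISTS PARTITION (dt='2020-05-01') PURGE;"

# External table: drop the partition metadata, then delete the files yourself
beeline -u jdbc:hive2://localhost:10000 -e "ALTER TABLE sales_ext DROP IF EXISTS PARTITION (dt='2020-05-01');"
hdfs dfs -rm -r /data/sales_ext/dt=2020-05-01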
04-23-2020
08:03 PM
1 Kudo
Can you please check with your internal Linux/network team for further support? It seems you have an internal connection issue when connecting to the node from the IntelliJ IDEA machine. Once you resolve the connection issue, we can check further.
04-23-2020
07:57 PM
1 Kudo
Can you add the below property to <spark_home>/conf/hive-site.xml and <hive-home>/conf/hive-site.xml?

<property>
  <name>hive.exec.max.dynamic.partitions</name>
  <value>2000</value>
  <description></description>
</property>

Hope this helps. Please accept the answer and vote up if it did. Note: Restart HiveServer2 and the Spark History Server if it doesn't take effect. -JD
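P.S. If editing hive-site.xml is not convenient, the same limit can also be raised per session (a sketch; the JDBC URL and table names below are placeholders):

# Hypothetical script combining the session override with the dynamic-partition insert
cat > /tmp/load_partitions.sql <<'EOF'
SET hive.exec.max.dynamic.partitions=2000;
SET hive.exec.max.dynamic.partitions.pernode=500;
INSERT OVERWRITE TABLE target PARTITION (dt) SELECT * FROM source;
EOF
beeline -u jdbc:hive2://localhost:10000 -f /tmp/load_partitions.sql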
04-21-2020
12:28 PM
1 Kudo
Can you try the article below? https://saagie.zendesk.com/hc/en-us/articles/360021384151-Read-Write-files-from-HDFS
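In case a quick command-line test is useful first, basic reads and writes against HDFS look like this (the paths here are hypothetical):

# Copy a local file into HDFS, read it back, and copy it out again
hdfs dfs -put /tmp/input.csv /data/input.csv
hdfs dfs -cat /data/input.csv
hdfs dfs -get /data/input.csv /tmp/output.csv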
04-21-2020
10:31 AM
1 Kudo
Hi @w12q12, As per the below error in the log trace:

20/04/21 18:20:50 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1067413441-127.0.0.1-1508775264580:blk_1073743149_2345 file=/data/ratings.csv
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:930)
....
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1067413441-127.0.0.1-1508775264580:blk_1073743149_2345 file=/data/ratings.csv
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:930)

it seems the client was not able to reach a DataNode holding that block when you ran the command. Can you please try to ping and telnet between the DataNode and NameNode in both directions, and also check whether you have any corrupt blocks or files in the cluster? ~JD
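P.S. To check for missing or corrupt blocks, something like this should work (run as the hdfs superuser; the file path is the one from the error above):

# List any corrupt files across the cluster
sudo -u hdfs hdfs fsck / -list-corruptfileblocks

# Inspect the specific file from the error, showing its blocks and where their replicas live
sudo -u hdfs hdfs fsck /data/ratings.csv -files -blocks -locations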
04-09-2020
01:56 AM
1 Kudo
Hi @drgenious, Are you getting an error similar to the one reported in KUDU-2633? It seems this is an open JIRA reported in the community:

ERROR core.JobRunShell: Job DEFAULT.EventKpisConsumer threw an unhandled Exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Aborting TaskSet 109.0 because task 3 (partition 3) cannot run anywhere due to node and executor blacklist. Blacklisting behavior can be configured via spark.blacklist.*.

If you have the data in HDFS in CSV/Avro/Parquet format, you can use the below command to import the files into a Kudu table. Prerequisite: a Kudu Spark tools jar of a compatible version (1.6 or higher).

spark2-submit --master yarn/local --class org.apache.kudu.spark.tools.ImportExportFiles <path of kudu jar>/kudu-spark2-tools_2.11-1.6.0.jar --operation=import --format=<parquet/avro/csv> --master-addrs=<kudu master host>:<port number> --path=<hdfs path for data> --table-name=impala::<table name>

Hope this helps. Please accept the answer and vote up if it did.
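As a concrete sketch with the placeholders filled in (the host name, jar path, HDFS path, and table name below are hypothetical; 7051 is the default Kudu master RPC port):

spark2-submit --master yarn \
  --class org.apache.kudu.spark.tools.ImportExportFiles \
  /opt/kudu/kudu-spark2-tools_2.11-1.6.0.jar \
  --operation=import \
  --format=csv \
  --master-addrs=kudu-master01.example.com:7051 \
  --path=/data/events/events.csv \
  --table-name=impala::default.events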
04-03-2020
12:11 AM
2 Kudos
Hi @Mondi The important differences between parcels and packages are:
- Parcels are self-contained and installed in a versioned directory, which means that multiple versions of a given parcel can be installed side by side. You can then designate one of the installed versions as the active one. With packages, only one version can be installed at a time, so there is no distinction between what is installed and what is active.
- You can install parcels at any location in the filesystem; by default they are installed in /opt/cloudera/parcels. In contrast, packages are installed in /usr/lib.
- When you install from the Parcels page, Cloudera Manager automatically downloads, distributes, and activates the correct parcel for the operating system running on each host in the cluster.
Note: You cannot install software using both parcels and packages in the same cluster. Because of these properties, parcels offer several advantages over packages; for more details, please refer to the Cloudera documentation. Hope this helps. Please accept the answer and vote up if it did. Regards,
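P.S. You can see the side-by-side parcel versions and the currently active one on any host by listing the default parcel directory; Cloudera Manager maintains a symlink for the active parcel (the version strings below are just illustrative):

ls -l /opt/cloudera/parcels
# CDH -> CDH-5.16.2-1.cdh5.16.2.p0.8   <- symlink pointing at the active parcel
# CDH-5.15.1-1.cdh5.15.1.p0.4          <- older version, installed but inactive
# CDH-5.16.2-1.cdh5.16.2.p0.8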
03-31-2020
01:35 AM
Hi @sppandita85BLR Currently there is no documented procedure to migrate from HDP. In these cases, it's best to engage your local Cloudera account representative and professional services; they can help you with a runbook for the migration and assess feasibility. Hope this helps. Please accept the answer and vote up if it did. Regards,
03-27-2020
11:11 AM
2 Kudos
Hi @rajisridhar You can use a command like this to get the start and end time, then store it wherever you wish or configure mail notifications according to your requirements. Example: $ oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe
.
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path : hdfs://localhost:8020/user/joe/workflows/map-reduce
Status : SUCCEEDED
Run : 0
User : joe
Group : users
Created : 2009-05-26 05:01 +0000
Started : 2009-05-26 05:01 +0000
Ended : 2009-05-26 05:01 +0000
Actions
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
Action Name Type Status Transition External Id External Status Error Code Start End
.----------------------------------------------------------------------------------------------------------------------------------------------------------------
hadoop1 map-reduce OK end job_200904281535_0254 SUCCEEDED - 2009-05-26 05:01 +0000 2009-05-26 05:01 +0000
.----------------------------------------------------------------------------------------------------------------------------------------------------------------

For detailed information, see: https://oozie.apache.org/docs/3.3.2/DG_CommandLineTool.html#Jobs_Operations Hope this helps. Please accept the answer and vote up if it did.
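P.S. If you only need the start and end timestamps (for example, to feed a mail script), a simple filter over the same command works (the job ID is the one from the example above):

oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe | grep -E '^(Started|Ended)'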