Member since: 12-21-2017
Posts: 67
Kudos Received: 3
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1067 | 10-15-2018 10:01 AM
 | 4040 | 03-26-2018 08:23 AM
11-01-2022
12:01 AM
Hi @Siddu198, add this config to your job: set("mapreduce.fileoutputcommitter.algorithm.version", "2")
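A minimal sketch of one way to pass this, assuming the job is launched with spark-submit (the class and jar names here are placeholders); the spark.hadoop. prefix forwards the property to the Hadoop output committer:

spark-submit \
  --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 \
  --class com.example.MyJob \
  myjob.jar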
09-03-2019
11:09 AM
Hi,
To view the Spark logs of a completed application, run the command below:
yarn logs -applicationId application_xxxxxxxxxxxxx_yyyyyy -appOwner <userowner> > application_xxxxxxxxxxxxx_yyyyyy.log
Thanks,
AKR
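If you only need a single log type, newer Hadoop releases also accept a -log_files filter; a sketch, assuming your version supports it:

yarn logs -applicationId application_xxxxxxxxxxxxx_yyyyyy -appOwner <userowner> -log_files stderr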
08-28-2019
04:47 AM
I am facing the same issue. Did you find any solution?
02-01-2019
04:25 PM
We ran into the same issue because we rely on a poor man's DNS via the local hosts file, as we don't have control over the infrastructure.
To solve this issue of advertising non-existent hostnames, there are two solutions:
1. create a separate configuration group for each Kafka broker and override the `listeners` property with the explicit IP of the relevant node (see the sketch after this list)
2. set up ambari-agent to report a custom public hostname and use a template variable in the Kafka config to pick up that property
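A minimal sketch of the override for solution 1, assuming one broker whose node IP is 10.0.0.11 (a placeholder); in that broker's configuration group, the property becomes:

listeners=PLAINTEXT://10.0.0.11:6667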
Depending on your setup, solution 2 is either a fix or another problem: once ambari-agent reports a custom public hostname, links from Ambari to services like the HDFS UI, YARN UI, Spark UI, Zeppelin, etc. will use this value.
To set up solution 2:
Create the public hostname script
Place a file at /var/lib/ambari-agent/public_hostname.sh with the following content, and make it executable with chmod a+x /var/lib/ambari-agent/public_hostname.sh:
#!/bin/sh
hostname -I | awk '{print $1}'
Change the ambari-agent config
In /etc/ambari-agent/conf/ambari-agent.ini, add this property in the [agent] section:
public_hostname_script=/var/lib/ambari-agent/public_hostname.sh
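For orientation, the [agent] section of ambari-agent.ini would then contain roughly this (any existing keys in the section stay as they are):

[agent]
public_hostname_script=/var/lib/ambari-agent/public_hostname.sh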
Restart ambari agent
ambari-agent restart
Configure the Kafka broker listener
Set the `listeners` property to PLAINTEXT://{{config['agentLevelParams']['public_hostname']}}:6667 and restart Kafka.
11-13-2018
08:49 AM
Thanks @KB. Another question: when my Spark application writes a massive amount of data to HDFS, it always throws an error like the following: No lease on /user/xx/sample_2016/_temporary/0/_temporary/attempt_201604141035_0058_m_019029_0/part-r-19029-1b93e1fa-9284-4f2c-821a-c83795ad27c1.gz.parquet: File does not exist. Holder DFSClient_NONMAPREDUCE_1239207978_115 does not have any open files.
How can I solve this problem? I searched online, and others said it is related to dfs.datanode.max.xcievers.
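For reference, a sketch of raising that setting in hdfs-site.xml on the DataNodes; in Hadoop 2.x the property was renamed dfs.datanode.max.transfer.threads (the old xcievers name is kept as a deprecated alias), and 8192 here is only an illustrative value:

<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>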
10-15-2018
10:01 AM
1 Kudo
Solved by using HttpFS. It sets up a gateway, so clients do not need direct access to the DataNodes.
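A minimal sketch of reading a file through HttpFS with curl, assuming the default HttpFS port 14000 (the host, file path, and user name are placeholders):

curl "http://httpfs-host:14000/webhdfs/v1/user/xx/somefile?op=OPEN&user.name=xx"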
10-10-2018
06:05 AM
1000 is the limit for the Spark interpreter. You can set common.max_count at a global level. Increasing the limit should not have any negative effects, but if your data size is very large you may need to tune the above-mentioned params accordingly.
08-16-2018
07:15 AM
Hi @Jonathan Sneep Fine, thanks. I have added the user and group info to my NameNode. So the typical way to add a new user or group is to create the user and group on the NameNode and wait for usersync to sync the user info to Ranger? And if I don't care about group policies, creating an internal user in Ranger and specifying it in the allow conditions also works? At least it seems to work in practice.
07-27-2018
10:21 AM
@Junfeng Chen I am facing a similar problem; can you please share the steps you performed to resolve it?
06-12-2018
11:33 PM
Hi, I am facing the same problem. Can you show the steps to solve it?