Member since
04-24-2017
61
Posts
6
Kudos Received
2
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5706 | 12-23-2018 01:06 PM
 | 3862 | 12-14-2018 10:59 AM
03-25-2019
08:09 AM
Hi, This looks like the application is trying to find an HDFS file under the YARN local directory: hdfs:/user/user1/pqrstu/config/input_abc1234.xml. The file being created here should be just input_abc1234.xml. Not sure what might be causing this. Can you please give us the exact command you are using to submit the Spark job? Thanks, Bimal
03-13-2019
06:52 PM
Please create a new thread for distinct questions, instead of bumping an older, resolved thread.

As to your question, the error is clear, as is the documentation, quoted below:

"""
Spooling Directory Source

This source lets you ingest data by placing files to be ingested into a "spooling" directory on disk. This source will watch the specified directory for new files, and will parse events out of new files as they appear. The event parsing logic is pluggable. After a given file has been fully read into the channel, it is renamed to indicate completion (or optionally deleted). Unlike the Exec source, this source is reliable and will not miss data, even if Flume is restarted or killed. In exchange for this reliability, only immutable, uniquely-named files must be dropped into the spooling directory. Flume tries to detect these problem conditions and will fail loudly if they are violated: If a file is written to after being placed into the spooling directory, Flume will print an error to its log file and stop processing. If a file name is reused at a later time, Flume will print an error to its log file and stop processing.
"""
- https://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html#spooling-directory-source

It appears that you can get around this by using ExecSource with a script or command that reads the files, but you'll have to sacrifice reliability. It may instead be worth investing in an approach that makes filenames unique (`uuidgen`-named softlinks in another folder, etc.).
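The `uuidgen` idea above could be sketched like this. This is only an illustrative sketch, not something from the thread: the inbox and spool paths are made up (shown here as temp directories so the snippet is runnable), and you would point them at your real directories.

```shell
#!/bin/sh
# Sketch: stage incoming files into Flume's spooling directory under
# unique names, so the "immutable, uniquely-named files" rule is never
# violated even if upstream reuses filenames.
# INBOX and SPOOL_DIR are placeholders (temp dirs for this demo).
INBOX=$(mktemp -d)
SPOOL_DIR=$(mktemp -d)

# Simulate an upstream file arriving in the inbox.
echo "some event data" > "$INBOX/events.log"

for f in "$INBOX"/*; do
  [ -f "$f" ] || continue
  # uuidgen gives a unique prefix; keep the original name for traceability.
  ln -s "$f" "$SPOOL_DIR/$(uuidgen)-$(basename "$f")"
done

ls "$SPOOL_DIR"
```

Whether a softlink or a hard link (or a move) is appropriate depends on whether the original file must stay in place; the point is only that the name seen by the spooling directory source is unique.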
01-23-2019
07:38 AM
Hi Bimalc, thank you very much for your answer. At this moment I can only confirm that fs.namenode.delegation.token.max-lifetime is set to 7 days. We use a gobblin keytab and have experimented with different settings of gobblin.yarn.login.interval.minutes and gobblin.yarn.token.renew.interval.minutes on the gobblin side, but with no success yet. I've started a new run of gobblin now, so we'll need to wait some time for the next failure. I'll check the logs for possible token renewal errors or any other suspicious symptoms and get back in this thread with the results. Thanks!
01-02-2019
10:50 AM
We submit the jobs something like this: env -i spark2-submit --keytab svc.keytab --principal svc@CORP.COM sample.py
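One detail about the command above worth being aware of: `env -i` runs its child with an empty environment, so any variables exported in the submitting shell (HADOOP_CONF_DIR, KRB5CCNAME, etc.) do not reach spark2-submit. A quick way to see this effect (the variable name here is just illustrative):

```shell
# `env -i` clears the environment before running the command,
# so a variable exported in this shell is invisible to the child.
export HADOOP_CONF_DIR=/etc/hadoop/conf
env -i sh -c 'echo "HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-unset}"'
# prints: HADOOP_CONF_DIR=unset
```

Whether that matters for the token-renewal problem in this thread is not established here; it is only a property of `env -i` itself.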
12-18-2018
12:49 AM
1 Kudo
Yeah, I did, tks 😄
12-07-2018
08:35 PM
Hi Alex,

Look for the log-aggregation related messages in the NodeManager log file on one of the nodes where a container was running for the application. In the normal case you should see:

2018-12-07 20:27:59,994 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping application application_1544179594403_0020
...
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1544179594403_0020
...
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_e06_1544179594403_0020_01_000001. Current good log dirs are /yarn/container-logs

Do you see these messages for the failing application, or do you see some error/exception instead? If you can paste the relevant log for the failing application, I can take a look.

Regards,
Bimal
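A small helper for the kind of check described above: filter NodeManager log text down to the log-aggregation lifecycle lines for one application. This is a hypothetical sketch (the function name and the sample input are mine, not from the thread); in practice you would feed it your actual NodeManager log file.

```shell
# Hypothetical filter: keep only log-aggregation / shuffle-service lines
# that mention a given application ID. Reads log text from stdin.
filter_app_logagg() {
  app_id=$1
  grep -E "AppLogAggregatorImpl|YarnShuffleService" | grep -F "$app_id"
}

# Demo on sample lines modeled on the messages quoted in the post.
printf '%s\n' \
  'INFO ...logaggregation.AppLogAggregatorImpl: Application just finished : application_1544179594403_0020' \
  'INFO some.other.Class: unrelated message' \
  | filter_app_logagg application_1544179594403_0020
```

On a real cluster you would pipe the NodeManager log file into the function instead of the `printf` sample, e.g. `filter_app_logagg application_... < /path/to/nodemanager.log`.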