<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Flink : Files written to HDFS are stuck in .pending when using flink api in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113147#M33871</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am doing a PoC in which I am trying to write some data to HDFS using Flink. I can see that the files are being written, but they are stuck with a ".pending" suffix. Any help would be appreciated. Also, is there a way to have only one file written?&lt;/P&gt;&lt;P&gt;StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();&lt;/P&gt;&lt;P&gt;
env.enableCheckpointing(123, CheckpointingMode.AT_LEAST_ONCE, true); &lt;/P&gt;&lt;P&gt;DataStream&amp;lt;String&amp;gt; text = env.readTextFile("D:/names.txt");&lt;/P&gt;&lt;P&gt;
DataStream&amp;lt;String&amp;gt; parsed = text.map(new MapFunction&amp;lt;String, String&amp;gt;() {
@Override
public String map(String value) {
return (value);
}
}); &lt;/P&gt;&lt;P&gt;parsed.flatMap(new FlatMapFunction&amp;lt;String, String&amp;gt;() {
public void flatMap(String value, Collector&amp;lt;String&amp;gt; out) throws Exception {
for (String s : value.split(" ")) {
out.collect(s);
}
}
}); &lt;/P&gt;&lt;P&gt;System.setProperty("HADOOP_USER_NAME", "hdfs");&lt;/P&gt;&lt;P&gt;
RollingSink&amp;lt;String&amp;gt; sink = new RollingSink&amp;lt;String&amp;gt;("hdfs://MYMACHINE:8020/flink/test8");&lt;/P&gt;&lt;P&gt;
sink.setBucketer(new NonRollingBucketer());&lt;/P&gt;&lt;P&gt;parsed.addSink(sink);&lt;/P&gt;&lt;P&gt;env.execute();&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 10:28:40 GMT</pubDate>
    <dc:creator>schauhan1</dc:creator>
    <dc:date>2022-09-16T10:28:40Z</dc:date>
    <item>
      <title>Flink : Files written to HDFS are stuck in .pending when using flink api</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113147#M33871</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am doing a PoC in which I am trying to write some data to HDFS using Flink. I can see that the files are being written, but they are stuck with a ".pending" suffix. Any help would be appreciated. Also, is there a way to have only one file written?&lt;/P&gt;&lt;P&gt;StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();&lt;/P&gt;&lt;P&gt;
env.enableCheckpointing(123, CheckpointingMode.AT_LEAST_ONCE, true); &lt;/P&gt;&lt;P&gt;DataStream&amp;lt;String&amp;gt; text = env.readTextFile("D:/names.txt");&lt;/P&gt;&lt;P&gt;
DataStream&amp;lt;String&amp;gt; parsed = text.map(new MapFunction&amp;lt;String, String&amp;gt;() {
@Override
public String map(String value) {
return (value);
}
}); &lt;/P&gt;&lt;P&gt;parsed.flatMap(new FlatMapFunction&amp;lt;String, String&amp;gt;() {
public void flatMap(String value, Collector&amp;lt;String&amp;gt; out) throws Exception {
for (String s : value.split(" ")) {
out.collect(s);
}
}
}); &lt;/P&gt;&lt;P&gt;System.setProperty("HADOOP_USER_NAME", "hdfs");&lt;/P&gt;&lt;P&gt;
RollingSink&amp;lt;String&amp;gt; sink = new RollingSink&amp;lt;String&amp;gt;("hdfs://MYMACHINE:8020/flink/test8");&lt;/P&gt;&lt;P&gt;
sink.setBucketer(new NonRollingBucketer());&lt;/P&gt;&lt;P&gt;parsed.addSink(sink);&lt;/P&gt;&lt;P&gt;env.execute();&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:28:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113147#M33871</guid>
      <dc:creator>schauhan1</dc:creator>
      <dc:date>2022-09-16T10:28:40Z</dc:date>
    </item>
    <item>
      <title>Re: Flink : Files written to HDFS are stuck in .pending when using flink api</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113148#M33872</link>
      <description>&lt;P&gt;Hi,
Unfinished buckets have the .pending extension. Once a bucket is closed (for example, with time-based bucketing, once the bucket's time period is over), the file is renamed.
Since you are using the NonRollingBucketer, the files will never be closed. I recommend using the DateTimeBucketer instead.&lt;/P&gt;&lt;P&gt;As a side note: I recommend increasing the checkpointing interval a bit. 123 milliseconds is very frequent, and the application doesn't appear to be extremely latency-critical. A value like 2000 milliseconds is probably more appropriate.&lt;/P&gt;</description>
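      <content:encoded><![CDATA[
The time-based bucketing idea mentioned in this reply can be illustrated with a small standard-library sketch. This is not Flink's DateTimeBucketer code; it only shows why a bucket derived from the clock eventually "ends" and can be closed, whereas a single fixed bucket never does. The class name TimeBucketSketch, the methods bucketPath and hasRolledOver, the hourly pattern "yyyy-MM-dd--HH", and the use of UTC are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical, stdlib-only illustration of time-based bucketing.
// It does not use or reproduce any Flink class.
public class TimeBucketSketch {

    // Assumed hourly bucket pattern; UTC is chosen so the result is deterministic.
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    // Derives the bucket directory for a record written at the given time.
    static String bucketPath(String basePath, long epochMillis) {
        return basePath + "/" + HOURLY.format(Instant.ofEpochMilli(epochMillis));
    }

    // True when two write times fall into different buckets, which is the
    // point at which the older bucket can be closed and its file finalized.
    static boolean hasRolledOver(long previousMillis, long currentMillis) {
        String previous = HOURLY.format(Instant.ofEpochMilli(previousMillis));
        String current = HOURLY.format(Instant.ofEpochMilli(currentMillis));
        return !previous.equals(current);
    }

    public static void main(String[] args) {
        long firstWrite = 0L;           // start of an hour
        long sameHour = 3_599_999L;     // last millisecond of that hour
        long nextHour = 3_600_000L;     // first millisecond of the next hour

        System.out.println(bucketPath("/flink/test8", firstWrite));
        System.out.println(bucketPath("/flink/test8", nextHour));
        System.out.println("rolled within hour: " + hasRolledOver(firstWrite, sameHour));
        System.out.println("rolled across hours: " + hasRolledOver(sameHour, nextHour));
    }
}
```

With a bucketer that always returns the same path, hasRolledOver would never be true, which matches the behaviour described above for the NonRollingBucketer.
      ]]></content:encoded>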
      <pubDate>Wed, 06 Jul 2016 15:14:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113148#M33872</guid>
      <dc:creator>rmetzger1</dc:creator>
      <dc:date>2016-07-06T15:14:45Z</dc:date>
    </item>
    <item>
      <title>Re: Flink : Files written to HDFS are stuck in .pending when using flink api</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113149#M33873</link>
      <description>&lt;P&gt;thanks Robert, it worked&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jul 2016 22:16:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Flink-Files-written-to-HDFS-are-stuck-in-pending-when-using/m-p/113149#M33873</guid>
      <dc:creator>schauhan1</dc:creator>
      <dc:date>2016-07-06T22:16:44Z</dc:date>
    </item>
  </channel>
</rss>

