Member since 02-22-2014
Posts: 2 | Kudos Received: 0 | Solutions: 0
09-08-2017 12:43 AM
I'm using Flume's HDFS sink, and it sometimes fails to close its temporary files in HDFS, so the data in them is lost. I think this happens because supervisord doesn't give the Flume agent enough time to stop: the default stopwaitsecs is 30 seconds, and after that supervisord just runs "kill -9" on it:

WARN killing '388-flume-AGENT' (7994) with SIGKILL

If I kill the process from a terminal instead, it takes a minute or so to stop fully and leaves no unrenamed temporary files. Is it possible to increase stopwaitsecs? Thanks.
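For reference, in a plain supervisord setup the grace period is set per program, so the change I have in mind would look roughly like the sketch below. This is only an illustration: the section name is copied from the log line above, the command path is a placeholder, and 120 is an arbitrary value chosen to cover the "minute or so" the agent needs. Whether and where this can be set when the supervisord configuration is generated by the cluster manager is exactly what I don't know.

```ini
; Hypothetical supervisord program section (illustration only).
; The section name mirrors the process name in the log line above;
; the command path is a placeholder, not the real start script.
[program:388-flume-AGENT]
command=/path/to/flume-agent-start-script
stopsignal=TERM      ; signal sent first when the program is stopped
stopwaitsecs=120     ; seconds to wait for exit before supervisord sends SIGKILL
```

With a setting like this, supervisord would only escalate to SIGKILL if the agent were still running after the longer wait.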
Labels: Apache Flume
02-22-2014 06:34 AM
I have a CDH 4.5 cluster, and I want to upload files to it from another server (e.g. a database server). With vanilla Hadoop and Hive, I can edit the configuration files to point the namenode and metastore settings at the remote services, and then simply run:

dba@db-001$ hadoop fs -copyFromLocal /path/to/export.tsv
dba@db-001$ hive -e "load data local inpath '/path/to/export.tsv' into table test.my_table"

But how does this work with CDH? Which components should I install on the other servers?
Labels: