Member since: 04-06-2018
Posts: 21
Kudos Received: 0
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
| 10377 | 07-12-2018 03:06 AM
05-31-2019 09:29 AM
And I found a solution by pointing job.local.dir to the directory containing the code:

spark = SparkSession \
    .builder \
    .appName('XML ETL') \
    .master("local[*]") \
    .config('job.local.dir', 'file:/home/zangetsu/proj/prometheus-core/demo/demo-1-iot-predictive-maintainance') \
    .config('spark.jars.packages', 'com.databricks:spark-xml_2.11:0.5.0') \
    .getOrCreate()

Now everything works.
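For reference, the same two settings can also be supplied on the command line instead of in the builder; a minimal sketch, assuming spark-submit is on the PATH and a script named etl.py (a hypothetical file name used only for illustration):

```shell
# Equivalent command-line form: pass the spark-xml package and the
# local job directory as options instead of in SparkSession.builder.
# 'etl.py' is a hypothetical script name.
spark-submit \
  --master "local[*]" \
  --packages com.databricks:spark-xml_2.11:0.5.0 \
  --conf job.local.dir=file:/home/zangetsu/proj/prometheus-core/demo/demo-1-iot-predictive-maintainance \
  etl.py
```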
07-13-2018 02:05 AM
The developer (on the customer side) who works with me on the cluster tried Apache Airflow, and after one week he could do everything we need (workflows, emailing/alerting, re-runs, ...) without having to load files into HDFS. Apache Airflow is running in standalone mode, and its web UI is better than the Oozie UI. It seems a better solution than Oozie; what do you think about this? As it is an incubating project I don't know if it's a good idea, but the web UI is good and it looks easy to manage. I didn't know this project before, but I think Oozie is outdated compared to Airflow. For the moment Oozie is on stand-by; they will make a choice between Oozie and Airflow, but I must admit that Airflow looks like the better solution.
04-09-2018 01:58 AM
Hi, I don't think there is a way to export Hive/Impala metadata directly to an Excel file. You can export the metastore to a SQL dump: https://discuss.pivotal.io/hc/en-us/articles/115000104847-How-to-migrate-Hive-from-one-Hadoop-cluster-to-another- Then convert the SQL dump to CSV files: https://blog.twineworks.com/converting-a-mysql-dump-to-csv-files-b5e92d7cc5dd
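Once the metadata is out of the metastore, getting it into something Excel can open is just CSV writing; a minimal sketch in Python, assuming you have already extracted rows of (database, table, column, type) — the sample rows below are hypothetical:

```python
import csv
import io

# Hypothetical metadata rows as they might come out of the metastore dump:
# (database, table, column, type).
rows = [
    ("default", "sensors", "device_id", "string"),
    ("default", "sensors", "reading", "double"),
]

# Write a CSV with a header row; Excel opens .csv files directly.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["database", "table", "column", "type"])
writer.writerows(rows)

csv_text = buf.getvalue()
print(csv_text)
```

In practice you would write to a real file (open("metadata.csv", "w", newline="")) instead of an in-memory buffer.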