Member since: 05-11-2016
Posts: 29
Kudos Received: 1
Solutions: 3

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 7369 | 09-25-2017 02:06 PM |
| | 12526 | 08-16-2016 01:41 PM |
| | 3442 | 05-18-2016 06:30 AM |
08-17-2018
11:11 AM
I have the same problem; I also tried the method from @Vinod369 and it did not resolve the issue.
01-16-2018
10:01 PM
Where should we add this WARN?
09-25-2017
02:06 PM
Sorry, I could not focus on this earlier; I was busy with production activities. I was finally able to run it successfully with the following configuration:

Jar/py name: ${nameNode}/user/solution.jar
Main class: Module.final_solution
Options list: --conf spark.yarn.jar=local:/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar
Properties:
  Spark master: yarn
  Mode: cluster
  App name: Final Solution
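For anyone defining the action by hand rather than through the Hue editor, the settings above correspond roughly to an Oozie Spark action like the sketch below. This is a hedged illustration only: the workflow name, node names, and `${jobTracker}` property are placeholders, and your schema version may differ.

```xml
<workflow-app name="final-solution-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-node"/>
  <action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>yarn</master>
      <mode>cluster</mode>
      <name>Final Solution</name>
      <class>Module.final_solution</class>
      <jar>${nameNode}/user/solution.jar</jar>
      <spark-opts>--conf spark.yarn.jar=local:/opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Spark action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```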
08-16-2017
03:07 AM
Great suggestion, thanks! But I would like to ask one question: if I want struct or array fields in the target, how should I transform the MySQL data so that it fits the HCatalog schema? The need here is simply to have nested data from the other collection instead of a foreign-key representation. Currently we are using a plain sqoop import, and we are trying to modify the query so that it is accepted by the HCatalog schema. Thanks & Regards, Mahendra
07-28-2016
06:15 PM
The Cloudera ODBC connector is available for Windows, and .NET supports ODBC: http://www.cloudera.com/downloads/connectors/hive/odbc.html There is no native HDFS client for .NET, but HDFS offers a REST API via its WebHDFS component. The API is documented at http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-hdfs/WebHDFS.html, and any .NET HTTP client can consume it.
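To make the REST-API route concrete, here is a small sketch of building a WebHDFS request URL. It is in Python for brevity, but the resulting URL can be issued from any HTTP client, including .NET's `HttpClient`. The hostname, port (50070 is the common CDH 5 NameNode HTTP port), and user name are placeholders for your cluster.

```python
from urllib.parse import urlencode, urlunsplit

def webhdfs_url(host, port, path, op, user=None, **params):
    """Build a WebHDFS REST URL, e.g. for LISTSTATUS or OPEN.

    host/port point at the NameNode's HTTP endpoint, path is the HDFS
    path, and op is the WebHDFS operation name from the docs above.
    """
    query = {"op": op}
    if user:
        query["user.name"] = user
    query.update(params)
    return urlunsplit((
        "http",
        f"{host}:{port}",
        "/webhdfs/v1" + path,
        urlencode(query),
        "",
    ))

# A simple directory listing would then be a GET against:
url = webhdfs_url("namenode", 50070, "/user/alice", "LISTSTATUS", user="alice")
```

From .NET the equivalent is a plain `GET` of the same URL; the response is JSON that you deserialize as usual.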
06-24-2016
08:34 AM
Your understanding is correct. You either need to ensure Flume can write to that directory, or create a directory that Flume owns and can write to. -pd
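As a runnable sketch of the second option, the snippet below creates a directory with restrictive permissions. The path is a temporary placeholder so the example runs anywhere; on a real node you would use the directory your agent is configured with, and (as root) hand ownership to whatever user the Flume agent runs as, often `flume` on CDH package installs.

```python
import os
import tempfile

# Placeholder location; substitute the directory from your Flume config.
spool_dir = os.path.join(tempfile.mkdtemp(), "flume-spool")

os.makedirs(spool_dir, exist_ok=True)
os.chmod(spool_dir, 0o750)  # owner rwx, group rx, others none

# On a real node (as root) you would also transfer ownership, e.g.:
#   shutil.chown(spool_dir, user="flume", group="flume")
```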
06-07-2016
04:23 PM
The Flume HTTP source creates a REST endpoint that upstream senders can POST data to. If you are looking for a source that pulls from SQL Server via its API, you'll need to write a custom source, or possibly try this: https://github.com/keedio/flume-ng-sql-source Additionally, if you don't need real-time processing, you may want to consider using Sqoop to import data in batches; it can handle incremental updates. -pd
05-26-2016
05:15 AM
Thank you for the detailed reply. I have set up Flume as a service on the edge node and it is working as expected.
05-25-2016
10:38 AM
This documentation covers stopping and starting Flume when not using Cloudera Manager. It assumes you are running packages, not parcels, on the edge node: http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_flume_run.html