Member since 07-30-2013
15 Posts, 3 Kudos Received, 2 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2449 | 05-14-2021 07:41 AM |
| | 5626 | 01-27-2020 06:51 AM |
05-14-2021
07:41 AM
Hello, you can solve this by using the Maven Shade plugin. Take a look at the Cloudera doc: https://docs.cloudera.com/runtime/7.2.9/developing-spark-applications/topics/spark-packaging-different-versions-of-libraries-with-an-application.html
Michael
07-23-2020
08:25 AM
1 Kudo
The data node question has been answered, but one tangential comment: you say you are using the Secondary NameNode service. You almost certainly do not want to be using that, because you do not get any HA with the Secondary NameNode. What you probably want is a Standby NameNode. In Cloudera Manager you can enable HA from the HDFS service actions, and that will replace your Secondary NameNode with a Standby NameNode.
01-27-2020
06:51 AM
1 Kudo
Can you use Sqoop to retrieve the data directly from the database and load it into Hive? That would solve your delimiter problem.
01-26-2020
07:35 AM
Where is the data coming from? You could use a binary format like Avro or Parquet if your source system can export that way. If you MUST have a text file with a delimiter, you need a delimiter that is not anywhere in the data.
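If a delimited text file is unavoidable, one way to check that a delimiter is "not anywhere in the data" is to scan the fields before writing. This is only a minimal sketch: the helper name `pick_safe_delimiter`, the candidate list, and the sample rows are all made up for illustration, and on a real dataset you would scan the full export rather than a two-row sample.

```python
# Hypothetical candidates, tried in order. "\x01" (Ctrl-A) is the default
# field delimiter for Hive text tables; the others are common choices.
CANDIDATES = ["\t", "|", "\x01", "\x1f"]


def pick_safe_delimiter(rows, candidates=CANDIDATES):
    """Return the first candidate that appears in no field, or None."""
    for delim in candidates:
        if not any(delim in field for row in rows for field in row):
            return delim
    return None


# Made-up sample data: the fields contain commas and a pipe.
rows = [
    ["1", "Smith, John", "a|b"],
    ["2", "Doe, Jane", "plain text"],
]

delim = pick_safe_delimiter(rows)
if delim is None:
    raise ValueError("every candidate delimiter occurs in the data")

# Join each row with the chosen delimiter; splitting on it recovers the fields.
lines = [delim.join(row) for row in rows]
```

If every candidate shows up in the data, that is a sign a binary format such as Avro or Parquet is the safer route.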
01-24-2020
06:56 AM
1 Kudo
I would recommend not using CSV in your case. If you have commas in the fields then you can't really delimit them with commas because, as you have noticed, you will have field breaks in the middle of a field. Can you get the source data exported some other way?
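To make the field-break problem concrete, here is a small illustration with an invented record (the values are not from the original question). Splitting on every comma, which is effectively what a plain comma-delimited table definition does, yields more pieces than there are fields.

```python
# Invented record with three intended fields; the name itself contains a comma.
intended = ["1", "Smith, John", "Boston"]

# Writing the fields out with a bare comma delimiter...
line = ",".join(intended)

# ...and reading them back by splitting on every comma.
fields = line.split(",")

# The name is broken in two, so every later column shifts by one.
print(len(intended), "fields written,", len(fields), "fields read back")
print(fields)
```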