Member since
02-25-2025
03-06-2025 12:25 AM
Hello, I need to read SAS files using Spark in Cloudera Private Cloud. To achieve this, I’m using the following approach: However, to run this successfully, I need to add the com.github.saurfang.sas.spark package to my Spark environment. Has anyone done this before? If so, I’d appreciate any guidance on how to set it up. Thanks in advance!
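One way to make the package available is to let Spark resolve it when the session starts, through the `spark.jars.packages` and `spark.jars.repositories` settings. The sketch below is only an illustration under assumptions: the Maven coordinates and repository URL are the ones I recall from the spark-sas7bdat project and should be checked against its README for your Spark and Scala versions, and the helper names `sas_spark_conf` and `read_sas` are hypothetical. On a cluster without internet access, the jar and its dependencies would have to be supplied locally instead (for example via `spark.jars`).

```python
# Assumed coordinates and repository; verify them for your Spark/Scala version.
SAS_PACKAGE = "saurfang:spark-sas7bdat:3.0.0-s_2.12"
SPARK_PACKAGES_REPO = "https://repos.spark-packages.org/"


def sas_spark_conf(package=SAS_PACKAGE, repo=SPARK_PACKAGES_REPO):
    """Return the Spark settings that make Spark download the SAS reader."""
    return {
        "spark.jars.packages": package,       # Maven coordinates to resolve
        "spark.jars.repositories": repo,      # extra repository to search
    }


def read_sas(path, app_name="read-sas"):
    """Start (or reuse) a session with the package configured and load a SAS file."""
    # Imported lazily so the configuration helper works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in sas_spark_conf().items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    # "com.github.saurfang.sas.spark" is the data source name from the question.
    return spark.read.format("com.github.saurfang.sas.spark").load(path)
```

The same two settings can be passed on the command line as `--packages` and `--repositories` to `spark-submit` or `pyspark`. Note that `spark.jars.packages` only takes effect if it is set before the session is created.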
Labels: Apache Spark
02-26-2025 05:52 AM
@dsender Apache NiFi is a data-agnostic service. It can move any data format through a dataflow because the content is treated as just bytes inside a FlowFile. The content only needs to be read when you want to manipulate it, extract something from it, and so on. In that case you need a processor that understands the data format.

Cloudera Flow Management does not appear to offer any SAS-specific processor components, so a custom processor would need to be developed, or perhaps you can use one of the available scripting processors. Either way, you would still need to write custom code to ingest and/or process the SAS files.

So this starts with a question: how would you pull these SAS files from the command line, outside of NiFi? Once that works, figure out how to turn that success into a custom script or processor that does the same thing.

You could also reach out to your Cloudera account owner and discuss a possible professional services offering that may be able to help with your custom needs.

Please help our community grow and thrive. If you found that any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt
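As a starting point for the "get it working on the command line first" step, here is a minimal sketch of a standalone converter that reads a `.sas7bdat` file and writes CSV. It assumes pandas is installed and uses `pandas.read_sas`; the function names, the `latin-1` default encoding, and the chunk size are illustrative assumptions, not anything NiFi or Cloudera prescribes. A script like this could later be wrapped by a processor that runs external commands or ported into a scripting processor.

```python
import argparse
import sys


def parse_args(argv):
    """Parse the command-line options for the converter."""
    parser = argparse.ArgumentParser(description="Convert a SAS7BDAT file to CSV.")
    parser.add_argument("path", help="path to the .sas7bdat file")
    # Assumed default; SAS files often use a single-byte encoding.
    parser.add_argument("--encoding", default="latin-1")
    parser.add_argument("--chunksize", type=int, default=10000)
    return parser.parse_args(argv)


def sas_to_csv(path, out, encoding="latin-1", chunksize=10000):
    """Stream the SAS file to `out` as CSV, one chunk of rows at a time."""
    # Imported lazily so argument parsing works even without pandas installed.
    import pandas as pd

    first = True
    reader = pd.read_sas(path, format="sas7bdat", encoding=encoding, chunksize=chunksize)
    for chunk in reader:
        # Write the header only once, with the first chunk.
        chunk.to_csv(out, index=False, header=first)
        first = False


def main(argv=None):
    """Entry point: convert the given file and print the CSV to stdout."""
    args = parse_args(sys.argv[1:] if argv is None else argv)
    sas_to_csv(args.path, sys.stdout, args.encoding, args.chunksize)
```

Calling `main(["input.sas7bdat"])` (or wiring `main()` to a script entry point) prints the CSV to standard output, which makes it straightforward to redirect into a file or hand to another tool.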