Member since: 02-01-2022
Posts: 269
Kudos Received: 95
Solutions: 59
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1910 | 06-12-2024 06:43 AM |
| | 2666 | 04-12-2024 06:05 AM |
| | 1977 | 12-07-2023 04:50 AM |
| | 1177 | 12-05-2023 06:22 AM |
| | 2077 | 11-28-2023 10:54 AM |
02-22-2023
08:28 AM
1 Kudo
@merlioncurry You are light on details, so I am assuming you used the Ambari UI to upload to HDFS. In that case the files are in hdfs://users/maria_dev, not in the corresponding location on the machine's local filesystem, so you will need to use hdfs commands to view them. From the sandbox prompt: hdfs dfs -ls /users/ and hdfs dfs -ls /users/maria_dev. If those do not show the files, the path you uploaded to may be different.
02-22-2023
08:16 AM
1 Kudo
@fahed The HDFS service inside of the Data Lake supports the environment and its services, for example Atlas, Ranger, Solr, and HBase. Its size is based on the environment's scale. You are correct in assuming that your end-user HDFS service is part of the Data Hubs deployed around the environment. You should not try to use the environment's HDFS service for applications and workloads that belong in the Data Hubs.
02-09-2023
06:31 AM
Click into that doc and check out the other escape option. I think you need to handle the quotes too.
02-09-2023
06:22 AM
1 Kudo
@Techie123 Well, like I said, you have to learn the AWS side of providing access to a bucket. Starting from a public bucket will show you what you have to do inside the bucket configuration to allow other systems to access it, and from there you can move to whatever access control level you ultimately need. Getting lost in that space is not necessarily a "NiFi" thing, so my recommendation is to build the NiFi flow against a public bucket, THEN, once it works, start testing the deeper access requirements. The Controller Service configuration provides multiple ways to access a bucket along with a number of settings. Make sure you have working access key credentials tested directly in the processor before moving to the Controller Service.
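If it helps, here is a minimal sketch of checking those credentials outside of NiFi first. This is not part of NiFi itself; the bucket name and keys are placeholders, and the boto3 call just confirms the key pair can list the bucket before you paste the same values into the processor or Controller Service:

```python
import boto3
from botocore.exceptions import ClientError

# Placeholder values - substitute your real bucket and key pair.
BUCKET = "my-test-bucket"
ACCESS_KEY = "AKIA-EXAMPLE"
SECRET_KEY = "example-secret"

# Build an S3 client with the same credentials you plan to give NiFi.
s3 = boto3.client(
    "s3",
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
)

try:
    # Listing a few objects is enough to prove the credentials and bucket policy work.
    resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"])
    print("Credentials can read the bucket.")
except ClientError as err:
    print(f"Access problem to fix before configuring NiFi: {err}")
```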
02-09-2023
06:14 AM
@codiste_m By default Hive uses static partitioning. Hive can also do dynamic partitioning, but I am not sure how well that works with existing data in existing folders; I believe it creates the correct partitions based on the schema and creates those partition folders as the data is inserted into the storage path. It sounds like you will need to execute a load data command for each partition you want to query, as sketched below.
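A minimal sketch of that last step, shown here through PySpark's Hive support rather than the Hive CLI; the table name, partition column, values, and staging paths are all hypothetical:

```python
from pyspark.sql import SparkSession

# Hive support is needed so LOAD DATA and partitioned Hive tables are available.
spark = (
    SparkSession.builder
    .appName("load-existing-partitions")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical partition values and staging paths - replace with your own.
partitions = ["2023-01-01", "2023-01-02", "2023-01-03"]

for dt in partitions:
    # One LOAD DATA statement per partition you want to be able to query.
    # Note: LOAD DATA INPATH moves the files into the table's storage location.
    spark.sql(
        f"LOAD DATA INPATH '/staging/sales/dt={dt}' "
        f"INTO TABLE sales PARTITION (dt='{dt}')"
    )
```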
02-09-2023
05:53 AM
@ShobhitSingh You need to handle the escape with another option: .option("escape", "\\"). You may need to experiment with the actual string you pass there (the "\\" above) to suit your needs. Be sure to check the Spark docs specific to your version, for example: https://spark.apache.org/docs/latest/sql-data-sources-csv.html
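A minimal PySpark sketch of where that option fits in a CSV read; the input path and the quote/escape characters are assumptions to adjust for your data:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("csv-escape-example").getOrCreate()

# Hypothetical input path - point this at your own CSV file.
df = (
    spark.read
    .option("header", "true")   # first line holds column names
    .option("quote", "\"")      # character that wraps quoted fields
    .option("escape", "\\")     # character that escapes quotes inside a field
    .csv("/data/input/sample.csv")
)

df.show(truncate=False)
```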
02-09-2023
05:44 AM
@Iwantkakao There are two things I see right off the bat:

Tue Jan 31 19:47:13 UTC 2023 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

^^ Consider the recommendation: add useSSL=false to the JDBC URL in your sqoop command.

23/01/31 19:47:14 ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: Access denied for user ''@'localhost' (using password: NO) java.sql.SQLException: Access denied for user ''@'localhost' (using password: NO)

^^ This error is saying that your user does not have access to MySQL. You are going to need to provide a specific user, password, and host, with permissions and grants set accordingly. If your user is root, add the username and password to the command; a sketch of both changes follows below.

Last but not least, the Sqoop project is now "in the attic", which means it is no longer actively getting support and development from the open source community. I recommend that you learn other techniques to achieve the same outcome.
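A minimal sketch of a sqoop import with both fixes applied, assembled in a small Python wrapper only to keep the flags readable; the host, database, table, credentials, and target directory are all placeholders:

```python
import subprocess

# Placeholder connection details - replace host, database, user, and password.
# useSSL=false addresses the SSL warning from the MySQL driver.
jdbc_url = "jdbc:mysql://localhost:3306/mydb?useSSL=false"

cmd = [
    "sqoop", "import",
    "--connect", jdbc_url,
    "--username", "root",          # a user that actually has grants on the table
    "--password", "my-password",   # prefer -P or --password-file outside of testing
    "--table", "my_table",
    "--target-dir", "/user/me/my_table",
    "--num-mappers", "1",
]

# Runs sqoop exactly as you would from the shell, just assembled in one place.
subprocess.run(cmd, check=True)
```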
02-09-2023
05:34 AM
@Techie123 You are going to need to provide credentials for the NiFi calls against any S3 bucket with access controls. I would recommend working in lower NiFi dev environments and using a public S3 bucket to get comfortable. This will test the basics of your flow without access issues. It will also remove confusion around NiFi flow functionality versus AWS access issues and help you learn when and where to use the different ways (an access key, or a Controller Service with credentials) to provide access from NiFi to S3.
02-07-2023
05:02 AM
@Abdulrahmants If you need to talk to someone about getting those added, please reach out in a direct message. Another approach could be to create an API input endpoint in NiFi (HandleHttpRequest/HandleHttpResponse) and make a scripted (Python, Java, etc.) process to send the file to the NiFi endpoint, along the lines of the sketch below.
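For the scripted side, a minimal Python sketch, assuming a HandleHttpRequest processor is already listening in your flow; the host, port, path, file name, and content type are placeholders:

```python
import requests

# Placeholder endpoint - the HandleHttpRequest listening port and path in your flow.
NIFI_ENDPOINT = "http://nifi-host:9090/ingest"

# Send the file as the request body; HandleHttpRequest turns it into a FlowFile.
with open("data/report.csv", "rb") as f:
    resp = requests.post(
        NIFI_ENDPOINT,
        data=f,
        headers={"Content-Type": "text/csv"},
    )

# HandleHttpResponse in the flow determines the status code returned here.
print(resp.status_code, resp.text)
```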
02-06-2023
07:25 AM
1 Kudo
I believe this is a job for MiNiFi (https://nifi.apache.org/minifi/index.html). Basically, you create a small MiNiFi flow to run on the server/network with privileged access, and that flow sends its results to NiFi.