Member since: 08-16-2015
Posts: 97
Kudos Received: 16
Solutions: 12
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| | 885 | 07-11-2021 08:05 PM |
| | 1656 | 07-11-2021 06:37 PM |
| | 39422 | 06-04-2021 12:01 AM |
| | 1042 | 06-03-2021 11:43 PM |
| | 3426 | 04-26-2021 06:58 PM |
04-09-2021
10:05 PM
Hello,
Hive timestamps support up to 9 decimal places (nanoseconds). For your case, check whether timestamps with non-zero nanoseconds, e.g. 1750-01-01 00:00:00.123456789, are exported correctly. In your example, 00:00:00.0 is equal to 00:00:00, so you did not lose any precision: the nanosecond part is zero.
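One way to sanity-check exported values is to parse them back and inspect the nanosecond field. The sketch below uses plain `java.sql.Timestamp` rather than Hive itself, and assumes the exported values are available as strings in the `yyyy-mm-dd hh:mm:ss[.fffffffff]` format; the class and method names are hypothetical.

```java
import java.sql.Timestamp;

public class TimestampPrecisionCheck {
    // Returns the fractional-second part of a timestamp literal, in nanoseconds.
    public static int nanosOf(String literal) {
        return Timestamp.valueOf(literal).getNanos();
    }

    // True when two timestamp literals denote the same value at nanosecond precision.
    public static boolean sameValue(String a, String b) {
        return Timestamp.valueOf(a).equals(Timestamp.valueOf(b));
    }

    public static void main(String[] args) {
        // A value with non-zero nanoseconds keeps all nine digits.
        System.out.println(nanosOf("1750-01-01 00:00:00.123456789")); // prints 123456789
        // A trailing ".0" carries no extra information.
        System.out.println(sameValue("1750-01-01 00:00:00.0", "1750-01-01 00:00:00")); // prints true
    }
}
```

If an exported row comes back with fewer nanoseconds than the source value, precision was lost somewhere in the export path; a zero fraction on both sides is not a loss.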
04-09-2021
09:54 PM
Hello,
According to the state management section of the ListAzureBlobStorage documentation, the processor only pulls files that are new since the last run:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-azure-nar/1.5.0/org.apache.nifi.processors.azure.storage.ListAzureBlobStorage/
State management:
Scope | Description |
---|---|
CLUSTER | After performing a listing of blobs, the timestamp of the newest blob is stored. This allows the Processor to list only blobs that have been added or modified after this date the next time that the Processor is run. State is stored across the cluster so that this Processor can be run on Primary Node only and if a new Primary Node is selected, the new node can pick up where the previous node left off, without duplicating the data. |
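The behaviour described in the documentation can be pictured with a small model. This is a simplified sketch of timestamp-based incremental listing, not NiFi's actual implementation; the class, method, and blob names are hypothetical, and the real processor keeps its state in the cluster state store rather than in a field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IncrementalListingSketch {
    // Stored state: last-modified time (epoch millis) of the newest blob seen so far.
    private long newestSeenMillis = Long.MIN_VALUE;

    // Returns the names of blobs modified after the stored timestamp,
    // then advances the stored timestamp to the newest blob in this listing.
    public List<String> listNew(Map<String, Long> lastModifiedByName) {
        List<String> fresh = new ArrayList<>();
        long newest = newestSeenMillis;
        for (Map.Entry<String, Long> entry : lastModifiedByName.entrySet()) {
            if (entry.getValue() > newestSeenMillis) {
                fresh.add(entry.getKey());
                newest = Math.max(newest, entry.getValue());
            }
        }
        newestSeenMillis = newest;
        return fresh;
    }

    public static void main(String[] args) {
        IncrementalListingSketch lister = new IncrementalListingSketch();
        Map<String, Long> container = new LinkedHashMap<>();
        container.put("a.csv", 100L);
        container.put("b.csv", 200L);
        System.out.println(lister.listNew(container)); // first run lists both blobs

        container.put("c.csv", 300L);
        System.out.println(lister.listNew(container)); // second run lists only c.csv
    }
}
```

Because only the newest timestamp is kept, a later run over an unchanged container returns nothing, which is why the processor does not re-emit files it has already listed.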
04-09-2021
06:53 PM
Hello,
The Hue distcp editor is designed to replicate data within the cluster and/or with the object store. You can click the "..." button next to the input box to see which directories your account has access to, but only within the scope of the current cluster.
For data replication between two clusters, use Cloudera Manager / Replication Manager:
https://docs.cloudera.com/cdp/latest/data-migration/topics/rm-dc-data-replication.html
04-09-2021
06:36 PM
Hello,
Have you tried tuning the performance?
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/tune_llap.html
04-09-2021
06:34 PM
1 Kudo
Hello,
According to the NiFi documentation, every time the processor performs a listing of blobs, it automatically picks up the new data since the last run:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-azure-nar/1.5.0/org.apache.nifi.processors.azure.storage.ListAzureBlobStorage/index.html
State management:
Scope | Description |
---|---|
CLUSTER | After performing a listing of blobs, the timestamp of the newest blob is stored. This allows the Processor to list only blobs that have been added or modified after this date the next time that the Processor is run. State is stored across the cluster so that this Processor can be run on Primary Node only and if a new Primary Node is selected, the new node can pick up where the previous node left off, without duplicating the data. |
04-09-2021
06:28 PM
Hello,
Do you have an active subscription with Cloudera? All platform binaries are now behind a paywall and are only available for Cloudera customers to download. Details:
https://www.cloudera.com/downloads/paywall-expansion.html
03-30-2021
02:30 AM
Hello,
On the NiFi Controller Service Details page, under the Properties tab, set a value for "Database User".
03-30-2021
02:18 AM
Hello,
Try setting the "Database User", e.g. hive. According to the HiveServer2 JDBC documentation, the user ID is required:
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-UsingJDBC
The default <port> is 10000. In non-secure configurations, specify a <user> for the query to run as. The <password> field value is ignored in non-secure mode.
Connection cnct = DriverManager.getConnection("jdbc:hive2://<host>:<port>", "<user>", "");
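For reference, here is a minimal sketch of the same call outside NiFi, with the user passed explicitly and an empty password as in the non-secure case quoted above. The host name `hs2.example.com` and the class and method names are hypothetical, and an actual connection assumes the Hive JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class HiveJdbcUserExample {
    // Default HiveServer2 port, per the documentation quoted above.
    public static final int DEFAULT_PORT = 10000;

    // Builds a HiveServer2 JDBC URL of the form jdbc:hive2://<host>:<port>.
    public static String hiveUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port;
    }

    // Opens a connection as the given user; the password is left empty
    // because it is ignored in non-secure mode.
    public static Connection connect(String host, String user) throws SQLException {
        return DriverManager.getConnection(hiveUrl(host, DEFAULT_PORT), user, "");
    }

    public static void main(String[] args) {
        // Hypothetical host and user; replace with your own values.
        try (Connection connection = connect("hs2.example.com", "hive")) {
            System.out.println("Connected: " + !connection.isClosed());
        } catch (SQLException e) {
            // Also reached when no Hive JDBC driver is available on the classpath.
            System.out.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If this works from a standalone program with a user set and fails in NiFi, the "Database User" property on the controller service is the likely difference.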
03-30-2021
02:13 AM
1 Kudo
Hello,
This is related to HDFS cache management. As described in the documentation:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html
In this architecture, the NameNode is responsible for coordinating all the DataNode off-heap caches in the cluster. The NameNode periodically receives a cache report from each DataNode which describes all the blocks cached on a given DN. The NameNode manages DataNode caches by piggybacking cache and uncache commands on the DataNode heartbeat.
If the metric is going up, one possibility is that your NameNode is too busy to handle the requests.
03-30-2021
01:53 AM
Hello,
You can follow the guide below to install the CDP Private Cloud Base trial:
https://docs.cloudera.com/cdp-private-cloud/latest/release-guide/topics/cdpdc-trial-download-information.html