Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 609 | 06-04-2025 11:36 PM |
| | 1175 | 03-23-2025 05:23 AM |
| | 579 | 03-17-2025 10:18 AM |
| | 2183 | 03-05-2025 01:34 PM |
| | 1373 | 03-03-2025 01:09 PM |
07-30-2020
03:38 PM
The unpack command will not work without that extra dash: https://stackoverflow.com/questions/34573279/how-to-unzip-gz-files-in-a-new-directory-in-hadoop/43704452

I had another try with a file name as the destination:

```
hdfs dfs -cat /user/testuser/stage1.tar.gz | gzip -d | hdfs dfs -put - /user/testuser/test3/stage1
```

and the file stage1 appeared in the test3 directory. There is something interesting. stage1.tar.gz contains three empty txt files. `hdfs dfs -cat /user/testuser/test3/-` outputs nothing, and that file's size is 0.1k, while `hdfs dfs -cat /user/testuser/test3/stage1` outputs some text, including the original file names inside, and that file's size is 10k.
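The 10k file with the original names inside can be explained locally, without HDFS: `gzip -d` only removes the gzip compression layer, so what lands in HDFS is still a tar archive (which embeds the member file names), not the extracted files. A minimal sketch, with illustrative file names:

```shell
# Reproduce locally: three empty files packed like the original stage1.tar.gz
tmp=$(mktemp -d) && cd "$tmp"
touch a.txt b.txt c.txt
tar czf stage1.tar.gz a.txt b.txt c.txt

# Same as the hdfs pipe above: strip only the gzip layer
gzip -dc stage1.tar.gz > stage1

# stage1 is a tar archive; listing it shows the original names,
# which is exactly the "text" seen in the hdfs dfs -cat output
tar tf stage1
```

To actually extract the files rather than the archive, the stream would need to go through `tar -x` as well, not just `gzip -d`.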
07-30-2020
06:20 AM
@Shelton After applying the workaround to /var/lib/ambari-agent/cache/custom_actions/scripts/remove_previous_stacks.py, I succeeded in executing the API request, but it had no effect: the old version's directory has not been removed on the servers of the cluster, and the results of the API queries GET http://localhost/api/v1/clusters/<cluster_name>/stack_versions and http://localhost/api/v1/clusters/<cluster_name>/stack_versions/<id_oldversion> are unchanged.
07-27-2020
06:08 AM
@Krpyto84 Your permission issue is linked to ZooKeeper ACLs; my best guess is that your Kafka is kerberized. ZooKeeper requires you to set up a superuser using the zookeeper.DigestAuthenticationProvider.superDigest property. I don't know how you will integrate that procedure into your Ansible playbook. You will then need to set the JVM parameters in your KAFKA_OPTS env variable:

```
export KAFKA_OPTS=-Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf
```

Please let me know whether that is your situation; if that's the case, I will try to help you out.
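A sketch of the superuser setup, assuming a kerberized Kafka; the username `super`, the password, and all paths are placeholders, and the exact classpath depends on your ZooKeeper installation:

```shell
# 1. Generate the digest for the ZooKeeper super user (run on a ZK host;
#    DigestAuthenticationProvider prints "super:<base64-digest>"):
java -cp "$ZK_CLASSPATH" \
  org.apache.zookeeper.server.auth.DigestAuthenticationProvider super:secretpassword

# 2. Add the printed value to ZooKeeper's JVM options and restart ZK:
#    -Dzookeeper.DigestAuthenticationProvider.superDigest=super:<base64-digest>

# 3. Point Kafka's tools at the JAAS file before running them:
export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf"
```

With the superDigest in place, you can authenticate to ZooKeeper as `super` and repair or bypass the ACLs that are blocking you.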
07-26-2020
11:18 AM
1 Kudo
@mike_bronson7 log.retention.bytes is a size-based retention policy for logs, i.e. the allowed size of the topic. Segments are pruned from the log as long as the remaining segments don't drop below log.retention.bytes.

You can also specify retention parameters at the topic level.

To specify a retention time period per topic, use the following command:

```
kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.ms=[DesiredRetentionTimePeriod]
```

To specify a retention log size per topic, use the following command:

```
kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.bytes=[DesiredRetentionLogSize]
```

That should resolve your problem. Happy hadooping!
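As a concrete, hedged example — the topic name `events` and ZooKeeper address `zk1:2181` are assumptions for illustration — this keeps data for 7 days or until the topic reaches 1 GiB, whichever limit is hit first:

```shell
# Time-based limit: 7 days = 7 * 24 * 60 * 60 * 1000 ms
kafka-configs.sh --zookeeper zk1:2181 --alter --entity-type topics \
  --entity-name events --add-config retention.ms=604800000

# Size-based limit: 1 GiB = 1073741824 bytes
kafka-configs.sh --zookeeper zk1:2181 --alter --entity-type topics \
  --entity-name events --add-config retention.bytes=1073741824

# Verify the topic-level overrides:
kafka-configs.sh --zookeeper zk1:2181 --describe \
  --entity-type topics --entity-name events
```

Topic-level `retention.ms`/`retention.bytes` override the broker-wide `log.retention.*` defaults for that topic only.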
07-26-2020
08:06 AM
Thanks @Shelton. But as I mentioned earlier, I have set the sync source to LDAP/AD. Per Cloudera customer support, we cannot sync Unix and LDAP/AD at the same time. For the issue I faced, I restarted Ranger KMS, tried again as the "hdfs" user, and was able to create the encryption zone.
07-23-2020
11:38 PM
I was able to restart the DataNode from the Ambari UI after restarting the ambari-agent on the servers where the DataNode runs.
07-23-2020
06:18 AM
Since the solution is scattered across many posts, I'm posting a short summary of what I did. I am running the HDP 2.6.5 image on VirtualBox.

1. Increased my virtual hard disk through the Virtual Media Manager.
2. In the guest OS, partitioned the unused space.
3. Formatted the new partition as an ext4 file system.
4. Mounted the file system.
5. Updated /etc/fstab (I couldn't do this, as I did not find that file).
6. In Ambari, under the DataNode directory config, added the newly mounted file system as a comma-separated value.
7. Restarted HDFS.

My cluster did not have any files, therefore I did not run the balancer afterwards:

```
sudo -u hdfs hdfs balancer
```

Thanks to @Shelton for his guidance.
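The guest-OS portion of the steps above can be sketched as follows; the device name /dev/sdb and the mount point are hypothetical and must be adapted to what `lsblk` shows on your VM:

```shell
# Partition the newly grown disk (interactive; create /dev/sdb1)
sudo fdisk /dev/sdb

# Format the new partition as ext4
sudo mkfs.ext4 /dev/sdb1

# Mount it where the DataNode will write
sudo mkdir -p /mnt/hdfs-data
sudo mount /dev/sdb1 /mnt/hdfs-data

# Persist the mount across reboots (create /etc/fstab if it is missing)
echo '/dev/sdb1 /mnt/hdfs-data ext4 defaults,noatime 0 0' | sudo tee -a /etc/fstab

# After adding /mnt/hdfs-data to the DataNode dirs in Ambari and restarting HDFS,
# rebalance only if the cluster already holds data:
sudo -u hdfs hdfs balancer
```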
07-22-2020
09:24 PM
The Ambari Files view (same problem for the Hue File Browser) is not the right tool if you want to upload (very) big files. It runs in a JVM, and uploading big files uses more memory: you will hit the maximum available memory very quickly and cause performance issues for other users while you are uploading. BTW, it's possible to add other Ambari view server instances to improve performance (they can be dedicated to specific teams/projects). For very big files, prefer CLI tools: scp to an edge node with a big file system followed by hdfs dfs -put, or distcp, or use an object store accessible from your Hadoop cluster with good network bandwidth.
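The CLI route described above, sketched with placeholder host names, paths, and bucket names:

```shell
# 1. Copy the file to an edge node with enough local disk, then into HDFS:
scp bigfile.dat user@edgenode:/data/staging/
ssh user@edgenode 'hdfs dfs -put /data/staging/bigfile.dat /user/testuser/'

# 2. Or, for data already in another cluster or an object store,
#    copy in parallel with distcp:
hadoop distcp s3a://mybucket/bigfile.dat hdfs:///user/testuser/
```

Unlike the Files view, `hdfs dfs -put` streams the file directly to the DataNodes, so upload size is bounded by disk, not by a view server's JVM heap.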
07-22-2020
01:25 AM
Comparison between Apache Sentry and Apache Ranger based on the features they offer:

| Feature | Apache Sentry | Apache Ranger |
|---|---|---|
| Role-Based Access Control (RBAC) | Yes | Yes |
| Deny Support | No | Yes |
| Admin Web User Interface | No | Yes |
| REST API Support | No | Yes |
| CLI Support | Yes | No |
| Audit Support | No | Yes |
| Plugins Supported | Impala, Hive, HDFS, Solr, Kafka | Impala, Hive, HDFS, Solr, Kafka, HBase, Knox, YARN, Storm, etc. |
| Tag-Based Policy | No | Yes |
| Row-Level Filtering | No | Yes |
| Column Masking | No | Yes |
| HDFS ACL Sync | Yes | No (will be supported in upcoming CDP releases) |

As we can see, Apache Ranger supports more features (tag-based policies, row-level filtering, column masking, audits, an admin web interface, and more services/plugins in the CDP stack), and that's why it is the default choice for the authorization service in CDP. For a more detailed comparison, see this article by @EricL: https://www.ericlin.me/2020/01/introduction-to-apache-ranger-part-i/
07-21-2020
06:32 AM
Hi. I tried following the steps outlined in the document, but still no progress.