Member since: 09-24-2015
816 Posts | 488 Kudos Received | 189 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2626 | 12-25-2018 10:42 PM
 | 12058 | 10-09-2018 03:52 AM
 | 4164 | 02-23-2018 11:46 PM
 | 1838 | 09-02-2017 01:49 AM
 | 2166 | 06-21-2017 12:06 AM
01-23-2016
12:23 AM
2 Kudos
After upgrading to Ambari-2.1.2.1 (or 2.2.1) and HDP-2.3.x we are going to add Kerberos and LDAP to the cluster, and we are looking for the best automated solution. Both will run on a RHEL box, but we can select components freely. What's the best way to go? I'm aware of FreeIPA, which is exactly what we want, except that it's not supported by Ambari. I don't mind using the manual Kerberos wizard, but in Ambari-2.1.2 there were some issues on clusters with manually installed Kerberos (like CSV files not appearing when adding new services, issues when adding new nodes, etc.). The other option is KDC and OpenLDAP: the KDC is fully supported by Ambari, but I'm not aware of full integration between the KDC and OpenLDAP, so when adding new users we'd have to add them twice, once to OpenLDAP and then to the KDC (possibly with scripts). Any help and ideas will be appreciated.
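For the "add them twice" scenario, a minimal provisioning sketch might look like the following. It is purely illustrative: the base DN dc=example,dc=com, the realm EXAMPLE.COM, and the account attributes are assumptions, not details from this cluster.

```
#!/usr/bin/env bash
# Hypothetical dual-provisioning sketch: create the user in OpenLDAP, then in the MIT KDC.
# Base DN, realm, and attribute values are placeholders; adjust them to your environment.
set -euo pipefail

USER="$1"
REALM="EXAMPLE.COM"
BASE_DN="dc=example,dc=com"

# 1) Add the account to OpenLDAP (prompts for the admin password)
ldapadd -x -D "cn=admin,${BASE_DN}" -W <<EOF
dn: uid=${USER},ou=people,${BASE_DN}
objectClass: inetOrgPerson
objectClass: posixAccount
uid: ${USER}
cn: ${USER}
sn: ${USER}
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/${USER}
EOF

# 2) Create the matching Kerberos principal (run on the KDC host)
kadmin.local -q "addprinc -randkey ${USER}@${REALM}"
```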
Labels:
- Apache Ambari
01-22-2016
05:53 AM
Only the first time you import with --lastmodified will all records be imported. After that, try updating a few records in MySQL and run your Sqoop job again; it should import only the updated records. If it still doesn't work, please post your MySQL table schema, a few records from the table, and your Sqoop job command.
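For reference, a minimal sketch of such an incremental job (the JDBC URL, table name, and column names below are placeholders, not taken from this thread):

```
# Hypothetical incremental-import job; adjust the JDBC URL, table and column names.
sqoop job --create incr_import -- import \
  --connect jdbc:mysql://dbhost/testdb \
  --username sqoop -P \
  --table orders \
  --incremental lastmodified \
  --check-column last_update_ts \
  --merge-key id \
  --target-dir /user/sqoop/orders

# The first run imports everything; subsequent runs pick up only rows whose
# check column is newer than the last-value saved by the job.
sqoop job --exec incr_import
```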
01-21-2016
11:08 PM
1 Kudo
@Ancil McBarnett Oh yes, supervisors definitely on dedicated nodes if you have enough nodes. I updated my answer.
01-21-2016
03:41 PM
JAAS files are needed only for kerberized Kafka; that's why I thought it was kerberized. Can you open the *.properties files and see what's there?
01-21-2016
03:13 PM
1 Kudo
@Anders Synstad Something went wrong with your upgrade. So your Kafka is kerberized, right? You'll need both kafka_client_jaas.conf and kafka_server_jaas.conf. You can find details here. It's a good idea to open those files and do a sanity check of their contents. Once you have them right and placed under /etc/kafka/conf, restart all brokers. Then create a new topic and make sure the console producer and consumer are working. For kerberized Kafka, after kinit, do this: export CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/etc/kafka/conf/kafka_client_jaas.conf" and run the producer and consumer with the "--security-protocol SASL_PLAINTEXT" option. Details are in chapters 5 and 6 of that document. After that you can try the performance tests.
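As a hedged illustration of that sequence on an HDP 2.3 kerberized Kafka (the broker/ZooKeeper hostnames, port, and topic name are placeholders; the referenced doc remains the authoritative source for the exact steps):

```
# Placeholders: broker1:6667, zk1:2181 and the topic name are examples only.
kinit your-principal

export CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/etc/kafka/conf/kafka_client_jaas.conf"

# Create a test topic
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create \
  --zookeeper zk1:2181 --replication-factor 1 --partitions 1 --topic sanity_test

# Console producer (type a few messages, then Ctrl-C)
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh \
  --broker-list broker1:6667 --topic sanity_test --security-protocol SASL_PLAINTEXT

# Console consumer in another terminal
/usr/hdp/current/kafka-broker/bin/kafka-console-consumer.sh \
  --zookeeper zk1:2181 --topic sanity_test --from-beginning --security-protocol SASL_PLAINTEXT
```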
01-21-2016
02:45 PM
@Gerd Koenig If you reformat HDFS you will be left without the whole /hdp folder and you'll have to recreate it. If you are sure everything else is now all right, you'd better just remove the corrupted files and recreate them; they are all available under /usr/hdp/<hdp-version> and you can copy them to HDFS. Details can be found in the doc given by @Neeraj Sabharwal. For example, the Hive and Pig files are given here, the Tez files here, and so on. The files under /user/ambari-qa you can simply delete; they are the result of some service checks and there's no need to recreate them.
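As an illustration of the recreate step for one of those files, a sketch for the Tez tarball (paths follow the usual HDP layout; <hdp-version> is your installed stack version, and you should verify the exact locations, ownership, and permissions against the referenced doc):

```
# Example for the Tez tarball only; repeat the same pattern for the other corrupted files.
# Run as a user with write access to /hdp (typically the hdfs user).
hdfs dfs -rm /hdp/apps/<hdp-version>/tez/tez.tar.gz
hdfs dfs -put /usr/hdp/<hdp-version>/tez/lib/tez.tar.gz /hdp/apps/<hdp-version>/tez/

# The ambari-qa leftovers can simply be removed
hdfs dfs -rm -r -skipTrash /user/ambari-qa/*
```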
01-21-2016
11:34 AM
@Raja Sekhar Chintalapati Thanks for letting us know. Did you use rolling upgrade or express upgrade? Tnx.
01-21-2016
06:14 AM
3 Kudos
Hi @Ancil McBarnett my 2 cents:
- Nothing on edge nodes; you have no idea what the guys will do there.
- Nimbus, Storm UI, and DRPC on one of the cluster master nodes. If this is a stand-alone Storm & Kafka cluster, then set up a master node and put these together with Ambari there.
- Supervisors on dedicated nodes. In an HDFS cluster you can collocate them with DataNodes.
- Dedicated Kafka broker nodes, but see below.
- Dedicated ZooKeeper for Kafka. However, with Kafka 0.8.2 or higher, if consumers don't keep offsets in ZK, traffic is low to medium, and you have at least 3 brokers, then you can start by collocating the Kafka ZK with the brokers. In that case ZK should use a dedicated disk.
01-21-2016
04:26 AM
1 Kudo
Hi @Mehdi TAZI I cannot recommend using HBase for a data lake. It's not designed for that; it's designed to provide quick access to stored data. If your total data size grows into the hundreds of terabytes or into the petabyte range, it won't work well. I mean, it cannot replace a file system. You can combine small files into SequenceFiles or something similar, but the best solution would be a kind of object store for Hadoop/HDFS. And indeed there is such a solution, called Ozone. It's under active development and is supposed to appear soon. More details can be found here.
01-21-2016
02:58 AM
You mean HDP-2.3.4, right? On the host where Pig is failing, try to install Pig manually: yum install "pig_2_3_*". Retry if needed. If it still doesn't work, check the baseurl in the /etc/yum.repos.d/HDP.repo file.
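For reference, the whole check might look like this (yum clean all is just an optional extra step in case of stale repo metadata; it is not required by the original steps):

```
# Optional: clear stale repo metadata first
yum clean all

# Install the Pig package for the HDP 2.3 stack
yum install "pig_2_3_*"

# If the install still fails, verify the repository URL
grep baseurl /etc/yum.repos.d/HDP.repo
```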