Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4076 | 10-18-2017 10:19 PM |
| | 4324 | 10-18-2017 09:51 PM |
| | 14813 | 09-21-2017 01:35 PM |
| | 1833 | 08-04-2017 02:00 PM |
| | 2410 | 07-31-2017 03:02 PM |
06-12-2017
03:24 PM
@amarnath reddy pappu @mqureshi @Kuldeep Kulkarni @Gerd Koenig @Andrew Ryan
Attachments: smaple-mapreduce-job-error.txt, mapreduce-error-in-hive-wiht-beeline.txt
We have enabled SSL/TLS on our HDP cluster by following @amarnath reddy pappu's blog
https://community.hortonworks.com/articles/52875/enable-https-for-hdfs.html and the HDP documentation.
Almost all services now come up on their HTTPS-defined ports. The only issue we are currently facing is that MapReduce jobs are not launching.
We use Hive through the Beeline connector. While executing a query we receive this error:
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty (state=08S01,code=1)
We also tried a sample MapReduce job on its own, and it failed as well. The error is long, so I am attaching it here.
I would appreciate your help. 🙂
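For reference, this trustAnchors message generally means the truststore the JVM loads is empty or cannot be read. A minimal way to inspect a truststore with keytool; the path and password below are illustrative, so use the values configured in your ssl-client.xml:

```
# List the CA certificates in the truststore used by the cluster's SSL clients
# (path and password are illustrative examples)
keytool -list \
  -keystore /etc/security/serverKeys/all.jks \
  -storepass changeit
# If this prints "Your keystore contains 0 entries", the truststore is empty,
# which is one cause of the "trustAnchors parameter must be non-empty" error.
```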
05-09-2017
08:49 AM
Thank you @mqureshi, the /etc/hosts file was not updated properly.
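For context, a well-formed /etc/hosts on a cluster node maps each host's IP to its FQDN and short name; the entries below are illustrative:

```
# Illustrative /etc/hosts entries for cluster nodes (IPs and hostnames are examples)
127.0.0.1      localhost
192.168.1.10   master1.example.com   master1
192.168.1.11   worker1.example.com   worker1
```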
04-27-2017
02:06 PM
2 Kudos
@Anishkumar Valsalam In a NiFi cluster you need to make sure you have uploaded your new custom component NARs to every node in the cluster. I do not recommend adding your custom NARs directly to the existing NiFi lib directory. While this works, it can become annoying to manage when you upgrade NiFi versions. NiFi allows you to specify an additional lib directory where you can place your custom NARs; when you upgrade, the new version can simply be pointed at this existing additional lib dir. Adding additional lib directories to your NiFi is as simple as adding an additional property to the nifi.properties file, for example:
nifi.nar.library.directory.lib1=/nars-custom/lib1
nifi.nar.library.directory.lib2=/nars-custom/lib2
Note: Each prefix must be unique (i.e. lib1 and lib2 in the above examples). These lib directories must be accessible by the user running your NiFi instance. Thanks, Matt
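A possible deployment sketch for the above approach; the hostnames, paths, NAR file name, and service user are illustrative:

```
# Create the additional lib directory on every NiFi node and copy the custom NAR there
# (hostnames, paths, NAR name, and service user are examples)
for host in nifi-node1 nifi-node2 nifi-node3; do
  ssh "$host" "sudo mkdir -p /nars-custom/lib1 && sudo chown nifi:nifi /nars-custom/lib1"
  scp my-custom-processors-1.0.nar "$host:/nars-custom/lib1/"
done
# On every node, add the property to nifi.properties and restart NiFi:
#   nifi.nar.library.directory.lib1=/nars-custom/lib1
```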
04-03-2017
05:38 PM
1 Kudo
@Revathy Mourouguessane In your Hive table properties you can specify skip.footer.line.count to remove the footer from your data. If you have just a one-line footer, set this value to 1. You specify this in your CREATE TABLE properties:
tblproperties("skip.header.line.count"="1", "skip.footer.line.count"="1");
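A minimal sketch of a full statement run through Beeline, assuming a comma-delimited text file with a one-line header and a one-line footer; the JDBC URL, table name, columns, and location are illustrative:

```
# Illustrative: create a table that skips a 1-line header and a 1-line footer
# (connection URL, table name, columns, and HDFS path are examples only)
beeline -u "jdbc:hive2://hiveserver.example.com:10000/default" -e "
CREATE EXTERNAL TABLE sales_raw (
  id INT,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/sales_raw'
TBLPROPERTIES ('skip.header.line.count'='1', 'skip.footer.line.count'='1');
"
```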
04-03-2017
06:18 PM
Are these tables external tables? In the case of external tables you would have to manually clean the folders by removing the files and folders that are referenced by the table (using the hadoop fs -rm command).
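A sketch of what that cleanup might look like; the path below is illustrative, so take the real one from the table definition first:

```
# Find the table's HDFS location, then remove the files it references
# (the path below is an example; take it from DESCRIBE FORMATTED <table> in Hive)
hadoop fs -ls /data/external/my_table
hadoop fs -rm -r -skipTrash /data/external/my_table
```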
12-07-2017
04:08 PM
Nice work! How well does this work with the numerous ORC performance enhancements? Is HIVE-14565 a concern with respect to using deeply nested structures? Thanks.
03-19-2017
03:22 PM
1 Kudo
@mqureshi @james.jones I recommend you read up on SolrCloud. The reference guide provides a good overview of how it works, starting on page 419: http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf

A SolrCloud cluster uses Zookeeper for cluster coordination: keeping track of which nodes are up, how many shards a collection has, which hosts are currently serving those shards, and so on. Zookeeper is also used to store configuration sets, i.e. the index and schema configuration files that are used for your indexes. When you create a collection using the Solr scripts, the configuration files for the collection are uploaded to Zookeeper. A collection is comprised of 1 or more shard indexes and 0 or more replica indexes.

When you use HDFS to store the indexes, it is much easier to add or remove SolrCloud nodes in your cluster. You don't have to copy the indexes, which are normally stored locally. The new SolrCloud node is configured to coordinate with Zookeeper. Upon startup, the new SolrCloud node is told by Zookeeper which shards it is responsible for and then uses the respective indexes stored on HDFS.

All of the index data itself is stored within the index directories on HDFS. These directories are self-contained. Solr stores collections within index directories where each index has its own directory within the top-level Solr index directory. This is true for local storage and HDFS. When you replicate your HDFS index directories to another HDFS cluster, all of the data is maintained within the respective index directories.

HDFS: /solr/collectionname_shard1_replica1/<index files>
HDFS: /solr/collectionname_shard2_replica1/<index files>

1. In the case of having Solr running on a DR cluster, you would need to ensure the index configuration (schemas, configuration sets, etc.) is updated in the DR Solr Zookeeper. If you create collections on your primary cluster, then you would need to similarly create collections on the DR cluster. This is primarily to ensure the collection metadata exists in both clusters. As long as these settings are in sync, copying the index directories from one HDFS cluster to another is all you need to do to keep the DR cluster in sync with the production cluster. As I mentioned above, both clusters will be configured to store indexes in an HDFS location. As long as the index directories exist, the SolrCloud nodes will read the indexes from those HDFS directories. Solr creates those index directories based on the name of the collection/index; that is how it knows which data goes with which index.

2. Yes, you should be able to do this. If you need to "restore" a collection from backup, then you have to copy each of the collection's index shards. If you create a collection with 5 shards, then you will have 5 index directories to restore from DR.

Using something like Cross Data Center Replication in SolrCloud 6 is the easiest way to get Solr DR in place. Second to that, using the native Backup/Restore functionality in SolrCloud 5 is a viable alternative. Unfortunately, SolrCloud 4 has neither of these more user-friendly approaches. I highly recommend upgrading to at least Solr 5 to get a better handle on backups and disaster recovery.
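A rough sketch of the two sync steps described above, assuming a Solr 4.x-style zkcli.sh and HDFS-backed indexes; the script location, ZooKeeper quorum, NameNode addresses, and collection/config names are illustrative:

```
# 1. Keep the collection's configuration set in sync in the DR ZooKeeper
#    (zkcli.sh location varies by Solr version; names below are examples)
zkcli.sh -zkhost dr-zk1:2181,dr-zk2:2181,dr-zk3:2181 \
         -cmd upconfig -confdir ./myconfig/conf -confname myconfig

# 2. Copy the shard index directories from the primary HDFS to the DR HDFS
hadoop distcp hdfs://prod-nn:8020/solr/collectionname_shard1_replica1 \
              hdfs://dr-nn:8020/solr/collectionname_shard1_replica1
```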
03-17-2017
09:44 PM
@mqureshi If Solr is storing the indexes on HDFS, then you have a fairly easy way of doing backups. You can use HDFS snapshots to take incremental backups of the Solr index directories on HDFS and then use distcp to copy those snapshots to another HDFS cluster. That gives you both local and remote backup copies. If you didn't want to take HDFS snapshots, you could simply use distcp to replicate the HDFS data to another cluster; however, you would lose the easy ability to restore from a local snapshot.
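A minimal sketch of that snapshot-plus-distcp flow; the paths, snapshot name, and remote NameNode address are illustrative:

```
# Enable snapshots on the Solr index directory, take one, and copy it off-cluster
# (paths, snapshot name, and the remote cluster address are examples)
hdfs dfsadmin -allowSnapshot /solr
hdfs dfs -createSnapshot /solr solr-backup-20170317
hadoop distcp /solr/.snapshot/solr-backup-20170317 \
              hdfs://backup-nn:8020/backups/solr/solr-backup-20170317
```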
03-03-2017
06:00 AM
@Adedayo Adekeye do an ls on "~/.ssh/config". Did it work? I am wondering if the ".ssh" folder even exists (it should, but it could be an issue with the VM you are using). In fact, for the user you are logged in as, is there a home folder? Just check all your permissions, including the permissions on the .ssh folder.
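A few quick checks along those lines, run as the user you log in with; exact modes may vary, but these are the usual expectations:

```
# Check that the home directory and .ssh exist and are owned by the login user
ls -ld ~ ~/.ssh
ls -l ~/.ssh
# Typical permissions: .ssh itself 700, config and private keys 600
chmod 700 ~/.ssh
chmod 600 ~/.ssh/config
```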
03-02-2017
08:23 PM
1 Kudo
@nedox nedox You will want to use one of the available HDFS processors to get data from your HDP HDFS file system.
1. GetHDFS <-- Use if standalone NiFi installation
2. ListHDFS --> RPG --> FetchHDFS <-- Use if NiFi cluster installation
All of the HDFS-based NiFi processors have a property that allows you to specify a path to the HDFS site.xml files. Obtain a copy of the core-site.xml and hdfs-site.xml files from your HDP cluster and place them somewhere on the HDF hosts running NiFi. Point to these files using the "Hadoop Configuration Resources" processor property; an example is sketched below. Thanks, Matt
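A sketch of what that setup might look like; the edge host and destination directory below are illustrative:

```
# Copy the HDP client configs to each NiFi host, then point the processor at them
# (source host and destination path are examples)
scp hdp-edge.example.com:/etc/hadoop/conf/core-site.xml /etc/nifi/hdfs-conf/
scp hdp-edge.example.com:/etc/hadoop/conf/hdfs-site.xml /etc/nifi/hdfs-conf/
# In GetHDFS / ListHDFS / FetchHDFS, set "Hadoop Configuration Resources" to:
#   /etc/nifi/hdfs-conf/core-site.xml,/etc/nifi/hdfs-conf/hdfs-site.xml
```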