Member since: 06-07-2016
Posts: 923
Kudos Received: 321
Solutions: 115

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2190 | 10-18-2017 10:19 PM |
| | 2370 | 10-18-2017 09:51 PM |
| | 10536 | 09-21-2017 01:35 PM |
| | 667 | 08-04-2017 02:00 PM |
| | 863 | 07-31-2017 03:02 PM |
10-30-2017
06:03 AM
Hi, I am writing the simplest of publishers. The publisher is on my laptop and the cluster is remote. I have tested connectivity with telnet and everything works, except I don't see my messages being published. The program starts by calling readFile() from a test main class. I am copying my code below:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class Publisher {

    private static final String FILE_NAME = "/tmp/device-data.txt";
    private static final String TOPIC = "device";
    private static final String BOOTSTRAP_SERVERS = "<my host and port>";
    private static final String PRODUCER_GROUP = "DeviceDataProducer";

    private KafkaProducer<Long, String> producer = null;

    public Publisher() {
        producer = new KafkaProducer<Long, String>(initializeKafkaConnection());
    }

    public void readFile() {
        BufferedReader bufferedReader = null;
        try {
            bufferedReader = new BufferedReader(new FileReader(FILE_NAME));
            bufferedReader.lines().forEach(i -> publishToKafka(i));
        }
        catch (IOException exception) {
            exception.printStackTrace();
        }
        finally {
            if (bufferedReader != null) {
                try {
                    bufferedReader.close();
                }
                catch (IOException exception) {
                    exception.printStackTrace();
                }
            }
            producer.flush();
            producer.close();
        }
    }

    private void publishToKafka(String newEvent) {
        long timestamp = System.currentTimeMillis();
        ProducerRecord<Long, String> record = new ProducerRecord<>(TOPIC, timestamp, newEvent);
        producer.send(record);
        System.out.println(newEvent);
    }

    private Properties initializeKafkaConnection() {
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
        properties.put(ProducerConfig.CLIENT_ID_CONFIG, PRODUCER_GROUP);
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        //properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 0);
        properties.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 0);
        //properties.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 0);
        return properties;
    }
}
Labels:
- Apache Kafka
10-26-2017
03:36 PM
@Ravikiran Dasari I would use Sqoop incremental import. Have you tried it? https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports
10-19-2017
03:21 PM
@Mateusz Grabowski Check the list of installed components; I am sure SmartSense is still installed. Hortonworks will not be collecting data, but SmartSense is still doing its job.
10-19-2017
03:10 PM
1 Kudo
@Xavier Vincelot You have to understand what bucketing really does when you are writing new records. For a high-cardinality column, for example customer_id in your case, you can have thousands of customers and bucket them into a few files (say 256 buckets). When you run a query, your customer_id is hashed and Hive searches only the bucket where your customer_id is expected to exist, instead of doing a complete table scan. That said, it is still reading the one file that contains your customer_id, and that file can be large. When you say bucketing doesn't help, I think you are testing at a small scale, or your data is not big relative to your cluster size (a 4-node cluster with 50 buckets, but data small enough for four nodes, will run the query in parallel over all the data even without bucketing, and you won't see the benefit of bucketing; do more complex operations like joins between bucketed tables and you'll start seeing the benefits). So bucketing definitely helps, but for a simple select that benefit might not be very visible.
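If it helps to see the hash-into-buckets idea in running code, here is a small sketch using Spark's DataFrameWriter.bucketBy. This is Spark's own bucketing rather than Hive's, and the table names are made up, but the mechanics are the same: a high-cardinality column is hashed into a fixed number of files.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BucketingSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("Bucketing sketch")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical source table with a high-cardinality customer_id column.
        Dataset<Row> customers = spark.table("customers_raw");

        // Hash customer_id into 256 bucket files; work that filters or joins on
        // customer_id no longer has to touch every file in the table.
        customers.write()
                .format("parquet")
                .bucketBy(256, "customer_id")
                .sortBy("customer_id")
                .saveAsTable("customers_bucketed");

        spark.stop();
    }
}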
10-18-2017
10:19 PM
@Mateusz Grabowski Those recommendations are based on SmartSense analysis of your cluster. SmartSense uses machine learning and data analysis from hundreds of clusters, and suggests optimizations based on what it has seen on those clusters in the past. If you are sure about your settings, know your workload, and know what you are doing, then you should go with that. Here is a short article that explains how SmartSense comes up with recommendations for tuning your cluster and optimizing your hardware for best use. https://hortonworks.com/blog/case-study-2x-hadoop-performance-with-hortonworks-smartsense-webinar-recap/
10-18-2017
10:10 PM
@Jason Bowles Partitioning runs after the mapper and before the reducer; there is no getting around that. But I am a little confused about what you are trying to do, so please bear with me. The way I look at it, you can create a custom partitioner where the values for each key, based on month and year, always go to the same reducer, so the reducer output will be such that values for the same month and year end up in the same file, making your access queries scan fewer files (a rough sketch is below). Am I missing something? I am still confused about what new partitioning after the reduce step would achieve. And one quick suggestion: since this seems like a new job, why not write it in Spark? The API is much easier and more flexible to work with, especially with such requirements.
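To make the custom partitioner idea concrete, here is a rough sketch. It assumes (my assumption, not yours) that the map output key starts with an ISO date such as "2017-10-05_deviceId", so the year-month prefix can be used to pick the reducer:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Sends every record whose key starts with the same "yyyy-MM" prefix to the same reducer,
// so one reducer (and therefore one output file) holds all values for that month and year.
public class MonthYearPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        // Assumes the key begins with an ISO date, e.g. "2017-10-05_deviceId".
        String yearMonth = key.toString().substring(0, 7); // "2017-10"
        return (yearMonth.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

You would register it on the job with job.setPartitionerClass(MonthYearPartitioner.class) and pick a reducer count large enough for the number of distinct month-year values you expect.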
10-18-2017
09:51 PM
2 Kudos
@Swati Sinha The exception you are getting is:
java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableLongObjectInspector.get(WritableLongObjectInspector.java:36)
at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:243)
Somewhere in your data, Hive is expecting a long and is not getting a long value. Looking at your record in the log file, the only things that jump out are the following attributes: ,"_col126":254332105,"_col183":7141271 I think this is malformed JSON and these values should be quoted just like the rest of your JSON record. I could be wrong here, but this is what it looks like right now.
10-18-2017
09:37 PM
1 Kudo
@Bertrand Goubot You need to talk to your Hortonworks account team to ask about the HDF roadmap. As for upgrading only NiFi to the latest version, that is not possible. When Hortonworks releases a new version of HDF, which includes NiFi, Kafka, Streaming Analytics Manager (powered by Apache Storm) and Schema Registry, all of these components go through integration testing to ensure an enterprise-grade software release. If you upgrade just one component, for example NiFi to 1.4, then that combination has not gone through rigorous integration testing and things might break. This is the reason why such an upgrade is not supported and will render your HDF unsupported. So, you really need to engage your account team to find out what your options are, how you "might" be able to upgrade sooner (if at all possible), and whether they can share the product roadmap with you.
10-18-2017
04:29 AM
@Neha G The answer to both your questions is yes (for Active Directory integration and for installing the Kerberos client). You first need to understand how Kerberos works and how it integrates with Hadoop before attempting to create users and connect to the cluster with users authenticated by Kerberos. The following link does a really good job of explaining how to set up Kerberos and integrate it with Active Directory. https://hortonworks.com/blog/enabling-kerberos-hdp-active-directory-integration/
10-16-2017
06:25 PM
@Neha G There are two ways to access services on your mc1 cluster from mc2. The first is to create principals and give them access to the cluster. Even this may not be necessary if you use service accounts and then allow proxy users. Once you decide on a strategy, either creating new principals and giving them access, or using service principals with proxy users, you create keytabs for the principals and use those keytabs to log in with kinit. If you are writing a Java program that runs from mc2, then use the UserGroupInformation class in your Java program.
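As a rough sketch of the UserGroupInformation route from mc2 (the principal, keytab path and the files on the classpath are placeholders; it assumes the cluster's client configs, core-site.xml and hdfs-site.xml, are available to the program):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosClient {
    public static void main(String[] args) throws IOException {
        // Picks up core-site.xml / hdfs-site.xml from the classpath; they must point at the mc1 cluster.
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Placeholder principal and keytab created for the mc2 application.
        UserGroupInformation.loginUserFromKeytab("appuser@EXAMPLE.COM", "/etc/security/keytabs/appuser.keytab");

        // Any Hadoop client call after the login runs as the authenticated principal.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.exists(new Path("/tmp")));
    }
}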
10-13-2017
05:22 PM
@Bilel Boubakri Maybe you have a wrong regex. Can you share your regex and what's happening?
10-11-2017
01:19 AM
@David Sheard Are you talking about the green button at the bottom right of the screen in your app, shown in my screenshot here? You just need to click it to run your app. Storm status is not shown in the Streaming Analytics app.
10-11-2017
12:55 AM
@Abhijeet Rajput The error you are getting is related to the tez.tez-ui.history-url.base value not being set. In my case it is set to http://<hostname>:8080/#/main/view/TEZ/tez_cluster_instance, where hostname is the same as the Ambari hostname; I am guessing for you it will be localhost, but I could be wrong. Once you set this value under the Tez config, your error should go away.
10-10-2017
03:58 PM
@Ganesh Ganjare Do you want to speed up the get process? This has to be supported by your file system. Ignore NiFi for a moment. Can you create a Java program to read a file from the local file system with multiple threads? The answer is yes. But what would really happen? The file is sitting on one spinning disk; that's your limiting factor. Each thread will send the disk head to random locations, making the read slower than it would be with one thread. That's why you should read one file with one thread, sequentially. Now, if you have a system like HDFS that distributes one file across multiple disks and multiple nodes, then you can use multiple threads in parallel to read parts of the file from each disk, but notice that on a single disk the operation is still single threaded (one mapper per disk). So no, your get operation cannot be multi-threaded.
10-10-2017
03:42 PM
1 Kudo
@sally sally Here is what you want to read. I recommend reading both links - it's the same webpage and it's not too long. https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#flowfile-repository https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#DeeperView --> the specific answer to your question.
10-10-2017
03:22 PM
@Eric England Until Hortonworks supports NiFi 1.4 and publishes a new management pack for that version, your only option is to create your own management pack. One way Hortonworks provides value to its enterprise customers is by making sure that all parts of a release are tested together - integration testing. If you upgrade one component of, let's say, HDF 3.0, that creates a risk from a support standpoint. Since everything Hortonworks does is open source, you can just create your own management pack and manage the new version of NiFi from Ambari. https://cwiki.apache.org/confluence/display/AMBARI/Management+Packs --> how to create a management pack.
10-09-2017
04:49 PM
@Mrinmoy Choudhury Pig has built-in functions for converting case. https://pig.apache.org/docs/r0.9.1/func.html#lower https://pig.apache.org/docs/r0.9.1/func.html#upper You will have to determine whether a word is upper or lower case using something like "word == LOWER(word)" or "word == UPPER(word)" and then convert it to what you desire.
10-05-2017
04:59 AM
@Leonid Fedotov In your /etc/krb5.conf file, can you please check your supported encryption types under [libdefaults]? Do they include one of the following (from your klist -e): Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
10-03-2017
02:18 PM
@nicole wells It's difficult for anyone here to speak on behalf of Splunk; you should talk to your Splunk account team. That being said, I played with Hunk back in 2014 and the licensing at the time was based on the number of mappers. I can only guess, but I think Splunk Hadoop Connect is likely using MapReduce to push data to Hadoop, and its licensing will be based on the number of mappers. If it's using Spark, then it might be based on the number of executors. Only their account team can tell you. As the other answer says, for moving data out of Splunk into Hadoop it's going to be significantly cheaper and easier to use NiFi. Its open-source nature allows you to install and test it right now, and then you can talk to your Hortonworks account team.
10-02-2017
02:40 PM
@ROHIT AILA I understand. That's why I think you need to call the kafka-consumer-groups.sh script from your Java program to get what you are looking for.
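If you do go the script route, here is a rough sketch of shelling out to it from Java; the script path, broker address and group name below are placeholders for your environment:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ConsumerGroupOffsets {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Placeholder script location, broker and group; adjust for your installation.
        ProcessBuilder pb = new ProcessBuilder(
                "/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh",
                "--bootstrap-server", "broker-host:6667",
                "--group", "my-consumer-group",
                "--describe");
        pb.redirectErrorStream(true);
        Process process = pb.start();

        // Each line of the script output shows topic, partition, current offset and lag.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        process.waitFor();
    }
}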
10-02-2017
02:23 AM
@Triffids G See if you can find this file in your local file system under /usr/hdp/current/spark2/spark2-client. If it is there, then you can copy it from this location to HDFS using the following commands (remove the existing corrupted file first):
hdfs dfs -rm /hdp/apps/2.6.0.3-8/spark2/spark2-hdp-yarn-archive.tar.gz
hdfs dfs -put /usr/hdp/current/spark2/spark2-client/<file name> /hdp/apps/2.6.0.3-8/spark2
10-01-2017
11:55 PM
@Wolfgang nobody Can you give the "yarn" user the right to impersonate in core-site.xml? Try the following:
<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>group your user is part of. Can be a comma-separated list or '*' (without quotes) for all</value>
  <description>Allow the superuser yarn to impersonate any members of the groups mentioned in the group list</description>
</property>
<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>host1,host2</value>
  <description>The superuser can connect only from host1 and host2 to impersonate a user. Make it '*' so it can connect from any host</description>
</property>
09-30-2017
09:39 PM
@ROHIT AILA
Have you tried using consumer.position(TopicPartition partition)? Or run the following script from your Java program. You can also run the script with --zookeeper instead of --bootstrap-server.
kafka-consumer-groups.sh --bootstrap-server <broker-host>:9092 --new-consumer --group groupname --describe
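For the consumer.position() route, here is a minimal sketch; the topic, partition and broker address are placeholders:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OffsetChecker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "groupname");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Assign (rather than subscribe) so position() can be asked for an explicit partition.
            TopicPartition partition = new TopicPartition("device", 0);
            consumer.assign(Collections.singletonList(partition));
            // position() returns the offset of the next record this group would fetch.
            long nextOffset = consumer.position(partition);
            System.out.println("Next offset for " + partition + " is " + nextOffset);
        }
    }
}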
09-30-2017
08:29 PM
@nitins Nobody here on the Hortonworks Community can speak on behalf of the product team. This is a question that only the product team can answer, and usually they don't like to give target dates.
09-30-2017
06:46 AM
@Chokri Ben Necib Please see my updated answer. Not sure if it will help, but it might work.
09-30-2017
06:11 AM
1 Kudo
@Tamil Selvan K Cloudbreak is used to provision, configure and elastically grow clusters in the cloud. You would use Ambari to get the cluster details you are looking for, which include the cluster name, configuration values and so on. If you need to get these values without using the interface, you can use the Ambari API. So, what you are looking for is available through Ambari, not Cloudbreak.
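If you want those details programmatically, here is a rough sketch of calling the Ambari REST API from Java; the host, port and credentials are placeholders, and /api/v1/clusters is the endpoint that lists the clusters managed by that Ambari server:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AmbariClusterInfo {
    public static void main(String[] args) throws IOException {
        // Placeholder Ambari host and credentials.
        URL url = new URL("http://ambari-host:8080/api/v1/clusters");
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes("UTF-8"));

        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.setRequestProperty("Authorization", "Basic " + auth);
        connection.setRequestProperty("X-Requested-By", "ambari");

        // The response is JSON describing each cluster registered with this Ambari server.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        connection.disconnect();
    }
}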
09-30-2017
06:06 AM
@Prabhu Muthaiyan
Here is how you would do it. First, in your spark-env.sh, set HADOOP_CONF_DIR to where your hdfs-site.xml, core-site.xml and hive-site.xml exist, so that your program can pick up these files when it runs and know how to connect to Hive. Then you basically write code similar to the below:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .config("spark.some.config.option", "some-value")
    .enableHiveSupport()
    .getOrCreate();

Dataset<Row> emp1 = spark.sql("SELECT col1, col2, col3 from emp1 where <condition goes here>");
emp1.write().saveAsTable("emp2");
// or use this: emp1.write().mode("append").saveAsTable("emp2");

You can use the following write modes:
- SaveMode.Overwrite : overwrite the existing data.
- SaveMode.Append : append the data.
- SaveMode.Ignore : ignore the operation (i.e. no-op).
- SaveMode.ErrorIfExists : the default option; throw an exception at runtime.
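For completeness, the write can also take the SaveMode enum directly instead of the string form; continuing the snippet above, this is equivalent to mode("append"):

import org.apache.spark.sql.SaveMode;

// Append to emp2 instead of failing if the table already exists.
emp1.write().mode(SaveMode.Append).saveAsTable("emp2");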
09-30-2017
03:34 AM
@sudi ts Run the following:
hadoop fs -ls /user/mzhou1/.snapshot
Then also run:
hadoop fs -du -s /user/mzhou1/.snapshot
09-29-2017
02:36 PM
I just updated my answer. Yes, Distcp can be used.