Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3146 | 10-18-2017 10:19 PM |
 | 3500 | 10-18-2017 09:51 PM |
 | 13024 | 09-21-2017 01:35 PM |
 | 1265 | 08-04-2017 02:00 PM |
 | 1612 | 07-31-2017 03:02 PM |
10-19-2017
03:21 PM
@Mateusz Grabowski Check the list of installed components; I am fairly sure SmartSense is still installed. Hortonworks will not be collecting the data, but SmartSense is still doing its job on the cluster.
10-18-2017
10:19 PM
@Mateusz Grabowski Those recommendations are based on SmartSense analysis of your cluster. SmartSense uses machine learning and data gathered from hundreds of clusters, and it suggests optimizations based on what it has seen work on those clusters in the past. If you are sure about your settings, know your workload, and know what you are doing, then you should go with that. Here is a short article that explains how SmartSense comes up with tuning recommendations for your cluster and optimizes your hardware for best use: https://hortonworks.com/blog/case-study-2x-hadoop-performance-with-hortonworks-smartsense-webinar-recap/
10-18-2017
09:51 PM
2 Kudos
@Swati Sinha The exception you are getting is:

java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableLongObjectInspector.get(WritableLongObjectInspector.java:36)
at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:243)

Somewhere in your data, Hive is expecting a long and is not getting a long value. Looking at your record in the log file, the only attributes that jump out are the following: ,"_col126":254332105,"_col183":7141271 I think this is malformed JSON: those values should be quoted, with the colon (:) outside the quotes, just like the rest of your JSON record. I could be wrong here, but this is what it looks like right now.
10-02-2017
02:40 PM
@ROHIT AILA I understand. That is why I think you need to call the kafka-consumer-groups.sh script from your Java program to get what you are looking for.
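In case it helps, here is a minimal sketch of invoking that script from Java with ProcessBuilder and reading its output; the script path, broker host, and group name are placeholders, not values from the original question:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ConsumerGroupOffsets {
    public static void main(String[] args) throws Exception {
        // Placeholder script path, broker host and group -- adjust for your installation.
        ProcessBuilder pb = new ProcessBuilder(
                "/usr/hdp/current/kafka-broker/bin/kafka-consumer-groups.sh",
                "--bootstrap-server", "broker-host:9092",
                "--new-consumer",
                "--group", "groupname",
                "--describe");
        pb.redirectErrorStream(true);          // merge stderr into stdout
        Process process = pb.start();

        // Print the script output line by line; each data line holds topic,
        // partition, current offset, log end offset and lag.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        process.waitFor();
    }
}

Note that the exact column layout of the --describe output can differ between Kafka versions, so check what your broker version prints before parsing it.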
10-01-2017
11:55 PM
@Wolfgang nobody Can you give the "yarn" user the right to impersonate in core-site.xml? Try the following:

<property>
  <name>hadoop.proxyuser.yarn.groups</name>
  <value>group your user is part of. Can be a comma-separated list, or '*' (without quotes) for all</value>
  <description>Allow the superuser yarn to impersonate any member of the groups listed here</description>
</property>
<property>
  <name>hadoop.proxyuser.yarn.hosts</name>
  <value>host1,host2</value>
  <description>The superuser can connect only from host1 and host2 to impersonate a user. Make it '*' so it can connect from any host</description>
</property>
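For context, here is a hedged Java sketch of what these proxy-user settings enable: a process logged in as the superuser (yarn) acting on HDFS as another user. The user name "alice" and the path are purely illustrative, not from the original question.

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // The real (login) user must be the configured superuser, e.g. yarn.
        UserGroupInformation realUser = UserGroupInformation.getLoginUser();

        // "alice" is a hypothetical end user belonging to one of the groups
        // listed in hadoop.proxyuser.yarn.groups.
        UserGroupInformation proxyUser =
                UserGroupInformation.createProxyUser("alice", realUser);

        // All filesystem calls inside doAs() are performed as "alice".
        proxyUser.doAs((PrivilegedExceptionAction<Void>) () -> {
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.exists(new Path("/user/alice")));
            return null;
        });
    }
}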
09-30-2017
09:39 PM
@ROHIT AILA
Have you tried using consumer.position(TopicPartition partition)? Alternatively, run the following script from your Java program. You can also run the script with --zookeeper instead of --bootstrap-server:

kafka-consumer-groups.sh --bootstrap-server <broker-host>:9092 --new-consumer --group groupname --describe
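If the consumer.position() route works for you, here is a minimal sketch of it; the broker address, group name, topic, and partition number are placeholders, not values from the original question:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ConsumerPositionExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092");   // placeholder broker
        props.put("group.id", "groupname");                    // placeholder group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Ask about partition 0 of a hypothetical topic.
            TopicPartition tp = new TopicPartition("mytopic", 0);
            consumer.assign(Collections.singletonList(tp));

            // position() returns the offset of the next record this consumer
            // would fetch from that partition.
            long offset = consumer.position(tp);
            System.out.println("Current position for " + tp + ": " + offset);
        }
    }
}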
09-30-2017
06:11 AM
1 Kudo
@Tamil Selvan K Cloudbreak is used to provision, configure, and elastically grow clusters in the cloud. You would use Ambari to get the cluster details you are looking for, which include the cluster name, configuration values, and so on. If you need these values without using the web interface, you can use the Ambari REST API. So what you are looking for is available through Ambari, not Cloudbreak.
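As an illustration, here is a minimal Java sketch of calling the Ambari REST API to list clusters; the Ambari host, port, and credentials are placeholders you would replace with your own:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Base64;

public class AmbariClustersExample {
    public static void main(String[] args) throws Exception {
        // Placeholder host and credentials -- adjust for your Ambari server.
        URL url = new URL("http://ambari-host:8080/api/v1/clusters");
        String auth = Base64.getEncoder()
                .encodeToString("admin:admin".getBytes("UTF-8"));

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("X-Requested-By", "ambari");

        // The response is JSON listing each cluster and its href; drill into
        // /api/v1/clusters/<cluster-name>/configurations for configuration values.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}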
09-30-2017
06:06 AM
@Prabhu Muthaiyan
Here is how you would do it. First, in your spark-env.sh, set HADOOP_CONF_DIR to the directory containing your hdfs-site.xml, core-site.xml, and hive-site.xml, so that your program can pick up these files at runtime and knows how to connect to Hive. Then you basically write code similar to the following:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL basic example")
.config("spark.some.config.option", "some-value")
.enableHiveSupport()
.getOrCreate();
Dataset<Row> emp1 = spark.sql("SELECT col1, col2, col3 FROM emp1 WHERE <condition goes here>");
emp1.write().saveAsTable("emp2");
// or use this: emp1.write().mode("append").saveAsTable("emp2");
The available write modes are the following:
- SaveMode.Overwrite: overwrite the existing data.
- SaveMode.Append: append the data.
- SaveMode.Ignore: ignore the operation (i.e. no-op).
- SaveMode.ErrorIfExists: default option, throws an exception at runtime.
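For completeness, here is a short sketch of picking a write mode through the SaveMode enum instead of the string form; the table names are the same illustrative ones used above:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class SaveModeExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SaveMode example")
                .enableHiveSupport()
                .getOrCreate();

        Dataset<Row> emp1 = spark.sql("SELECT col1, col2, col3 FROM emp1");

        // Replace the target table's contents entirely.
        emp1.write().mode(SaveMode.Overwrite).saveAsTable("emp2");

        // Or append to it without touching existing rows.
        emp1.write().mode(SaveMode.Append).saveAsTable("emp2");

        spark.stop();
    }
}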
09-26-2017
01:11 PM
@sally sally Is this the right XML you have copied here? It is not valid XML.