Member since: 03-01-2018
Posts: 36
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 4864 | 03-01-2018 06:51 PM |
07-24-2019
07:01 AM
@ateeq ashraf Are you using a cache? If so, can you check whether the cache is full?
... View more
07-24-2019
06:55 AM
@John You can use the REST API to access data in HBase. By default, the HBase REST server is not started, so you need to start it first.
To start the REST server in the foreground:
# su hbase
# hbase rest start -p {port to start the server}
To start it in the background:
# su hbase
# /usr/hdp/current/hbase-master/bin/hbase-daemon.sh start rest -p {port to start the server}
Ref:
http://hbase.apache.org/book.html#_rest
http://blog.cloudera.com/blog/2013/03/how-to-use-the-apache-hbase-rest-interface-part-1/
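Once the REST server is up, you can query it over HTTP. A minimal sketch, assuming it listens on port 8080 and that a table named 'users' with a row key 'row1' exists (both are example names):

# list the tables exposed by the REST server
curl -s -H "Accept: application/json" "http://localhost:8080/"
# fetch a single row; 'users' and 'row1' are placeholders
curl -s -H "Accept: application/json" "http://localhost:8080/users/row1"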
... View more
07-24-2019
06:52 AM
@jingyong zou Is it a new table or an older table created in HDP 2.x? If it is an older table created in HDP 2.x, there is a chance you are hitting HIVE-20593.
... View more
07-24-2019
06:36 AM
@Vincenzo Giangregorio Have a look at this wonderful document: https://community.hortonworks.com/articles/196257/how-to-compact-orc-files-on-hive.html
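As a quick illustration, one common approach for merging small ORC files is CONCATENATE; the connection string, table and partition names below are only examples:

# merge the small ORC files of one partition into larger ones
beeline -u jdbc:hive2://localhost:10000 -e "ALTER TABLE mydb.mytable PARTITION (dt='2018-01-01') CONCATENATE"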
... View more
07-24-2019
05:59 AM
@Rodrigo Gallacci Can you please provide information on which HDP and Hive versions you are using? It seems like a managed table; are you getting any error logs? If it is HDP 3.1, can you run beeline in debug mode as below and post the logs? beeline --verbose=true
... View more
07-24-2019
05:55 AM
@Mahendta Chouhan I can see the error: Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed. This error suggests your HMS is configured for security (Kerberos) but your login lacks a valid TGT (such as one obtained via kinit). Could you post the output of klist and confirm whether a 'hadoop fs' test works? Also check that you are logged in with the correct user, which has all the required access.
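As a quick check, something along these lines (the principal and keytab names are just placeholders):

# see whether you currently hold a valid ticket
klist
# obtain a ticket; principal and keytab here are examples
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
# simple HDFS sanity test with the same login
hadoop fs -ls /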
... View more
12-10-2018
04:32 PM
Hi @Kunal Agarwal Can you check whether the below property is enabled or not?
You can log in to Ambari and go to the YARN Configs page. Search for yarn.resourcemanager.webapp.ui-actions.enabled. If it exists, change the value to true. If it does not exist, clear the filter, add it from 'Custom yarn-site' via 'Add Property', and set the value to true. Save and restart.
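After the restart you can confirm the setting reached the ResourceManager's configuration, for example (the config path is the usual HDP default and may differ on your cluster):

# the property and its value should appear in the effective yarn-site.xml
grep -A1 'yarn.resourcemanager.webapp.ui-actions.enabled' /etc/hadoop/conf/yarn-site.xml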
... View more
10-09-2018
06:44 AM
Hi All, I am not able to connect a Kafka consumer and producer to an HDP 2.6.5 container on my Mac. I am using Eclipse to connect to sandbox-hdp.hortonworks.com:6667. I can see the port forwarding, but it is still not working.
docker ps
4f94aca8bc81   hortonworks/sandbox-proxy:1.0   "nginx -g 'daemon of…"   2 hours ago   Up 2 hours   ... 0.0.0.0:6627->6627/tcp, 0.0.0.0:6667->6667/tcp, 0.0.0.0:7777->7777/tcp, 0.0.0.0:7788->7788/tcp, 0.0.0.0:8000->8000/tcp, 0.0.0.0:8005->8005/tcp, 0.0.0.0:8020->8020/tcp, 0.0.0.0:8032->8032/tcp, 0.0.0.0:8040->8040/tcp, 0.0.0.0:8042->8042/tcp, ...   sandbox-proxy
b76d24c6c5f8   hortonworks/sandbox-hdp:2.6.5   "/usr/sbin/init"   2 hours ago   Up 2 hours   22/tcp, 4200/tcp, 8080/tcp
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

class kafkaProducer {
  val kafkaBrokers = "sandbox-hdp.hortonworks.com:6667"
  val topicName = "test"

  val props = new Properties()
  props.put("bootstrap.servers", kafkaBrokers)
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)

  def sendEvent(message: String) = {
    val key = java.util.UUID.randomUUID().toString()
    producer.send(new ProducerRecord[String, String](topicName, key, message))
  }
}
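A quick way to sanity-check the connection outside Eclipse; the kafka-broker path is the usual HDP location, so treat it as an assumption:

# from the Mac: is the forwarded broker port reachable at all?
nc -vz sandbox-hdp.hortonworks.com 6667
# from inside the sandbox container: try the console producer against the same listener
/usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh \
  --broker-list sandbox-hdp.hortonworks.com:6667 --topic test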
... View more
09-17-2018
06:16 AM
Hi @Anurag Mishra, for memory sizing see: https://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/ Did you try uber mode? mapreduce.job.ubertask.enable = true
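For a quick test, the flag can also be passed per job; the example jar path and HDFS paths below are assumptions:

# run a small job in uber mode (the AM runs the tasks in its own JVM)
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar wordcount \
  -Dmapreduce.job.ubertask.enable=true \
  /tmp/wc-in /tmp/wc-out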
... View more
09-17-2018
05:53 AM
Hi @vgarg, since 0.11 Hive has an NVL function, nvl(T value, T default_value), which returns default_value if value is null, and otherwise returns value.
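A minimal sketch of how it reads; the connection string, table and column names are made up for illustration:

# fall back to 'unknown' wherever city is NULL
beeline -u jdbc:hive2://localhost:10000 -e "SELECT nvl(city, 'unknown') FROM customers LIMIT 5"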
... View more
09-14-2018
08:51 AM
Hi @Abhijeet Rajput,
Did you try like this?
dfOrders.write.mode("overwrite").format("jdbc")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("url", "jdbc:sqlserver://server.westus.cloudapp.azure.com;databaseName=TestDB")
  .option("dbtable", "TestDB.dbo.orders")
  .option("user", "myuser")
  .option("password", "MyComplexPassword!001")
  .option("batchsize", "200000")
  .save()
Thanks
Vikas Srivastava
... View more
09-14-2018
08:39 AM
Hi @Michael Spann, this controls how long the Processor should be scheduled to run each time it is triggered. The left-hand side of the slider is marked 'Lower latency' while the right-hand side is marked 'Higher throughput'. When a Processor finishes running, it must update the repository in order to transfer the FlowFiles to the next Connection. Updating the repository is expensive, so the more work that can be done at once before updating the repository, the more work the Processor can handle (higher throughput). However, this means that the next Processor cannot start processing those FlowFiles until the previous Processor updates the repository, so the latency will be longer (the time required to process a FlowFile from beginning to end will be longer). As a result, the slider provides a spectrum from which the DFM can choose to favor lower latency or higher throughput. For more, read: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#processor_anatomy
... View more
09-14-2018
08:29 AM
@Redhouane Abdellaoui Can you please check if inodes are still available? You can check this with "df -i". Thanks, Vikas Srivastava
... View more
03-17-2018
04:55 AM
@Alpesh Virani Quick, patchy workaround: SET hive.support.concurrency=false; Then unlock the table: unlock table my_table;
... View more
03-17-2018
04:51 AM
Hi @Leonid Yapharov, does it resolve your problem?
... View more
03-17-2018
04:48 AM
Hi @Lim, are you able to install Metron? Thanks, Vikas Srivastava
... View more
03-17-2018
04:47 AM
Hi @HDave, hope you are doing well. Did you get the answer you were looking for? If yes, can you please provide feedback and mark the thread as closed? Thanks, Vikas Srivastava
... View more
03-07-2018
04:44 AM
Can you try installing Pillow (pip install pillow) and then try once again?
... View more
03-07-2018
04:38 AM
Hi @HDave When SparkSQL uses Hive: SparkSQL can use the Hive Metastore to get the metadata of the data stored in HDFS. This metadata enables SparkSQL to do better optimization of the queries it executes. Here Spark is the query processor. When Hive uses Spark: see the JIRA entry HIVE-7292. Here the data is accessed via Spark, and Hive is the query processor, so we get all the design features of Spark Core to take advantage of. This is a major improvement for Hive, but there is a version dependency between Spark and Hive. Link: https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started Here is a related thread on HCC: https://community.hortonworks.com/questions/54740/hive-on-tez-or-hive-query-using-spark-sql.html
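To make the distinction concrete, a rough sketch; the table name is an example, and Hive on Spark requires a compatible Hive/Spark build as the wiki above describes:

# Spark is the query processor, reading table metadata from the Hive metastore
spark-sql -e "SELECT count(*) FROM mydb.mytable"
# Hive is the query processor, using Spark as its execution engine
hive -e "SET hive.execution.engine=spark; SELECT count(*) FROM mydb.mytable"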
... View more
03-07-2018
04:29 AM
@LIM yup here is the link for installation. https://cwiki.apache.org/confluence/display/METRON/Installation
... View more
03-05-2018
02:41 AM
Hi @PJ As you said, it didn't launch the containers until 8:55, which means it is not getting the proper resources to start the process; the fact that there are already jobs running in the same queue supports that as well. Try decreasing the value to 0.1.
... View more
03-05-2018
02:36 AM
SET hive.support.concurrency=false; It should work.
... View more
03-04-2018
03:39 PM
set hive.compactor.initiator.on=true;
set hive.compactor.worker.threads=1;
INSERT INTO TEST VALUES (1,'a');
... View more
03-04-2018
03:29 PM
@PJ Did you try explain extended $cmd? If possible, run a repair on the table once, like msck repair table.
... View more
03-04-2018
03:21 PM
@Leonid Yapharov Can you try setting this in your terminal: export LANG=en_US.UTF-8
... View more
03-04-2018
03:13 PM
In your case, if you run it on YARN, you can use a minimum of 1G as well, like this: --master yarn-client --executor-memory 1G --executor-cores 2 --num-executors 12 You can increase the number of executors to make it better 🙂
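Put together as a full command, it would look roughly like this; the application class and jar are placeholders:

spark-submit --master yarn-client \
  --executor-memory 1G --executor-cores 2 --num-executors 12 \
  --class com.example.MyApp myapp.jar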
... View more
03-04-2018
03:00 PM
2 Kudos
Hi @SUDHIR KUMAR Can you check the hosts file on the NameNode, and whether you are able to reach all DataNodes by hostname? Thanks
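For example, from the NameNode host you could verify resolution and reachability like this; the hostname is a placeholder:

# does the DataNode hostname resolve via /etc/hosts or DNS?
getent hosts datanode1.example.com
# and is it reachable?
ping -c 1 datanode1.example.com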
... View more
03-02-2018
03:08 AM
@hema moger, you can simply do: df['Address'] = df[df.columns[3:5]].apply(lambda x: ','.join(x.map(str)), axis=1)
... View more
03-01-2018
06:51 PM
Number of cores = number of concurrent tasks an executor can run. So we might think that more concurrent tasks per executor will give better performance. But experience shows that any application with more than 5 concurrent tasks per executor tends to perform poorly, so stick to 5. Coming to the next step: with 5 cores per executor and 19 total available cores per node, we come to ~4 executors per node. So memory for each executor is 98/4 = ~24 GB. Calculating the overhead: 0.07 * 24 = 1.68 GB (here 24 is the per-executor memory calculated above). Since 1.68 GB > 384 MB, the overhead is 1.68 GB. Subtract that from the 24 GB above => 24 - 1.68 ≈ 22 GB per executor.
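Translated into submit flags, the sizing above would look roughly like this; the executor count depends on how many worker nodes you have, and the class and jar names are placeholders:

# 5 cores and ~22 GB per executor, per the calculation above
# --num-executors = executors per node x number of nodes, minus one for the ApplicationMaster (11 here is only an example)
spark-submit --master yarn \
  --executor-cores 5 --executor-memory 22G --num-executors 11 \
  --class com.example.MyApp myapp.jar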
... View more