Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5129 | 09-21-2018 09:54 PM
 | 6495 | 03-31-2018 03:59 AM
 | 1969 | 03-31-2018 03:55 AM
 | 2179 | 03-31-2018 03:31 AM
 | 4833 | 03-27-2018 03:46 PM
09-06-2016
03:22 AM
1 Kudo
@WONHEE CHOI 1) Do me a favor: create the directory /usr/lib/sqoop/lib, place your odbc6.jar there, and try again. 2) Also, what is your SQOOP_HOME environment variable set to? 3) Sqoop also has a --driver option where you can set the driver explicitly. Under normal conditions, if the SQOOP_HOME environment variable is set for your sqoop user and the library is placed in the /lib folder, you shouldn't need step 3. A sketch of these steps is below. Let me know.
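A minimal sketch of the steps above, assuming the jar sits in your home directory and is in fact Oracle's JDBC driver (both are assumptions; adjust the paths, driver class, and connect string to your setup):

```bash
# 1) Create the lib directory and place the jar there.
mkdir -p /usr/lib/sqoop/lib
cp ~/odbc6.jar /usr/lib/sqoop/lib/

# 2) Verify SQOOP_HOME points at your Sqoop installation.
echo "$SQOOP_HOME"    # e.g. /usr/lib/sqoop

# 3) Normally unnecessary once the jar is in $SQOOP_HOME/lib, but the
#    driver class can be named explicitly (class and URL are assumptions).
sqoop list-tables \
  --driver oracle.jdbc.OracleDriver \
  --connect "jdbc:oracle:thin:@db-host:1521:ORCL" \
  --username myuser -P
```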
09-06-2016
03:17 AM
4 Kudos
@sanjeevan mahajan ... and to add to what Predrag stated based on the documentation, the same is true for all other databases, including Oracle, PostgreSQL, etc. The query needs to be rewritten to achieve the expected result: first find the min(from_date) per emp_no, then, in a second step, join with e.emp_no to retrieve the other needed fields as lookups. Try this (note the GROUP BY must use emp_no, not s.emp_no, since the alias s is not visible inside the subquery):

SELECT e.emp_no, e.birth_date, e.first_name, e.last_name, e.gender, s.min_from_date
FROM employees e,
     (SELECT emp_no, min(from_date) AS min_from_date
      FROM new2_salaries
      GROUP BY emp_no) s
WHERE s.emp_no = e.emp_no;

If any of the responses to your question addressed the problem, don't forget to vote and accept the answer. If you fix the issue on your own, don't forget to post the answer to your own question. A moderator will review it and accept it.
09-06-2016
02:49 AM
1 Kudo
@Diego Campo Stop your VM and try the following network settings. 1) Use Attached to: NAT (see attached screen-shot-2016-09-05-at-103517-pm.png). Enable Network Adapter and Cable Connected should both be checked for your adapter. Start your VM and test. If it does not work, go to 2). 2) Use Attached to: Bridged Adapter. Select the name of the internet adapter you are currently using on your host machine. Under Advanced, make sure the machine is using a Desktop adapter type (e.g., Intel PRO/1000 MT Desktop). Under Advanced, make sure Promiscuous Mode is set to Allow VMs. Under Advanced, make sure Cable Connected is checked. Hit OK to save your changes, then start your VM and test; a command-line sketch of the same settings follows below. If any of the responses to your question addressed the problem, don't forget to vote and accept the answer. If you fix the issue on your own, don't forget to post the answer to your own question. A moderator will review it and accept it.
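For reference, a rough VBoxManage equivalent of the settings above; the VM name "Hortonworks Sandbox" and the host adapter en0 are assumptions (run VBoxManage list vms and VBoxManage list bridgedifs to find yours):

```bash
# 1) Attach NIC 1 to NAT.
VBoxManage modifyvm "Hortonworks Sandbox" --nic1 nat

# 2) Or attach NIC 1 to a bridged adapter, with promiscuous mode set to
#    Allow VMs and the cable connected, as in the Advanced settings above.
VBoxManage modifyvm "Hortonworks Sandbox" \
  --nic1 bridged \
  --bridgeadapter1 en0 \
  --nicpromisc1 allow-vms \
  --cableconnected1 on
```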
09-06-2016
02:25 AM
1 Kudo
@WONHEE CHOI Place the odbc6.jar in /usr/lib/sqoop/lib and retry. If it does not pick up the jar file, restart the Sqoop server and try again. If any of the responses to your question addressed the problem, don't forget to vote and accept the answer. If you fix the issue on your own, don't forget to post the answer to your own question. A moderator will review it and accept it.
09-03-2016
12:03 AM
4 Kudos
@Thomas Larsson Use the Resource Manager UI. You can get to it in two ways: http://hostname:8088, where hostname is the host name of the server where the Resource Manager service runs; or, from the Ambari UI, click YARN (left bar), then Quick Links (top middle), then select Resource Manager. You will see the memory and CPU used for each container; one container is allocated per task. Good tutorial here: http://hadooptutorial.info/yarn-web-ui/ That covers the visual route. You could also build your own Grafana dashboard making calls to the REST API: https://hadoop.apache.org/docs/r2.7.0/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html A couple of starter calls are sketched below. Please don't forget to vote/accept the best answer for your question.
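As a starting point for such a dashboard, a hedged sketch of two REST calls; rm-host is an assumption, so substitute the host running the Resource Manager:

```bash
# Cluster-wide totals (allocatedMB, allocatedVirtualCores, containersAllocated, ...).
curl -s "http://rm-host:8088/ws/v1/cluster/metrics"

# Per-application resource usage for running applications.
curl -s "http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING"
```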
09-02-2016
08:28 PM
1 Kudo
@Madhu B There are several ways to skin this cat, but they would require some classpath tricks. Before going there, could you create a Hive view of that table, e.g. create view hbase_user_act_view as select * from hbase_user_act; and test with that? Use HiveContext, please; a sketch of the test is below. Let me know. If any of the responses in this thread addressed your issue, don't forget to vote and accept the best answer.
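A minimal sketch of that test, assuming the hive CLI and spark-shell are on the PATH and the table is registered in the Hive metastore (the table and view names come from this thread):

```bash
# Create the Hive view over the HBase-backed table.
hive -e "CREATE VIEW IF NOT EXISTS hbase_user_act_view AS SELECT * FROM hbase_user_act;"

# Then, inside spark-shell (Spark 1.x), query through HiveContext, not SQLContext:
#   scala> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
#   scala> hc.sql("SELECT * FROM hbase_user_act_view").show()
```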
09-02-2016
08:22 PM
5 Kudos
@sankar rao An archive has been corrupted. You probably store compressed files (e.g. gzip or lzo) in the Hive table directory, and at least one of those files is corrupted. I would start moving files out of that folder (in HDFS) in reverse chronological order and repeat the query until it succeeds; that way you can find the corrupted archive. There are other ways to test your archives directly, and you could do that too; see the sketch below. Try it and let me know. If this response or any response in this thread was helpful, please don't forget to vote and accept the best answer.
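A minimal sketch of both approaches, assuming gzip files; the table directory and file name below are placeholders for your own:

```bash
TABLE_DIR=/apps/hive/warehouse/mytable   # placeholder: your table's HDFS directory

# Test an archive in place: gzip -t validates integrity from stdin.
hdfs dfs -cat "$TABLE_DIR/part-00042.gz" | gzip -t || echo "corrupted archive"

# Or move the newest file aside, rerun the query, and repeat file by file
# in reverse chronological order until the query succeeds.
hdfs dfs -mkdir -p /tmp/quarantine
hdfs dfs -mv "$TABLE_DIR/part-00042.gz" /tmp/quarantine/
```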
09-02-2016
08:07 PM
4 Kudos
@chandramouli muthukumaran Just to clarify, SparkSQL does not access or use the Hive engine; it only consumes the metadata of Hive data structures. Assuming that both can execute the query functionally (SparkSQL is quite limited functionally compared with Hive), but the query will need to churn through 40 TB of data, I would say Hive on Tez is likely your optimal choice. That is also driven by the cost of the additional cluster RAM Spark requires on top of Hive's requirements, because I assume you will still have cases where running Hive is needed. I have noticed that when the amount of data is less than 1 TB, SparkSQL outperforms Hive on Tez. Anyhow, be aware that with HDP 2.5, LLAP is in Tech Preview and will soon be GA. If you were asking about Hive on LLAP vs. SparkSQL, I would say without hesitation: for most queries, Hive on LLAP. Again, for some sophisticated queries on a limited amount of data and with limited functionality, SparkSQL may be a winner, but in the big picture it is too expensive to maintain both approaches, and I would still consider Hive on Tez and LLAP over SparkSQL for most cases that deal with BIG DATA. Otherwise, 1 TB does not need Hadoop for fast queries. Read more about Hive on LLAP here: http://hortonworks.com/blog/llap-enables-sub-second-sql-hadoop/ Give LLAP a shot before deciding to use SparkSQL, especially if you already have the queries written in HiveQL. If this response or any response in this thread was helpful, please don't forget to vote/accept it as the best answer.
09-01-2016
06:32 PM
@deepak sharma Crazy enough, I just reached out to this customer and a simple restart of the Kafka service addressed the issue. Kerberos was enabled recently and this service was probably not restarted. Not much to learn. Your symlink suggestion is an interesting approach which, while not applicable here, is worth remembering for other situations. Thank you for the suggestion.
09-01-2016
06:29 PM
4 Kudos
It seems that the data does not go to trash. A simple restart of the Kafka service addressed the issue. Kerberos was enabled recently and this service was probably not restarted. The symlink suggestion from deepak is an interesting approach which, while not applicable here, is worth remembering for other situations.