Member since 07-31-2019
346 Posts
259 Kudos Received
62 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2536 | 08-22-2018 06:02 PM |
| | 1527 | 03-26-2018 11:48 AM |
| | 3735 | 03-15-2018 01:25 PM |
| | 4797 | 03-01-2018 08:13 PM |
| | 1314 | 02-20-2018 01:05 PM |
11-22-2016
07:33 PM
2 Kudos
To add to what @Scott Shaw said, the biggest thing we'd be looking for initially is data skew, and there are a couple of things we can look at to help determine this. The first is the input size. With input size, we can completely ignore the min and look at the 25th percentile, median, and 75th percentile. We see that in your job they are fairly close together, and we also see that the max is never dramatically more than the median. If the max and the 75th percentile were far larger than the median, that would point to data skew.

Another indicator of data skew is the task duration. Again, ignore the minimum; we're inevitably going to get a small partition for one reason or another. Focus on the 25th percentile, median, 75th percentile, and max. In a perfect world the separation between the four would be tiny. So 6s, 10s, 11s, and 17s may seem significantly different, but they're actually relatively close. The only cause for concern would be when the 75th percentile and max are quite a bit greater than the 25th percentile and median. When I say significant, I'm talking about most tasks taking ~30s and the max taking 10 mins. That would be a clear indicator of data skew.
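As a rough illustration of the check described above, here is a minimal Python sketch that compares the max and 75th percentile against the median while ignoring the minimum. The function names and the `ratio_threshold` cutoff of 5x are assumptions made up for this example, not values taken from Spark or from the original post; tune the cutoff to your own jobs.

```python
from statistics import quantiles


def summarize(values):
    """Return (p25, median, p75, max) for a list of per-task measurements.

    The minimum is left out on purpose: a tiny partition is expected
    and says nothing about skew.
    """
    p25, median, p75 = quantiles(values, n=4, method="inclusive")
    return p25, median, p75, max(values)


def looks_skewed(p25, median, p75, maximum, ratio_threshold=5.0):
    """Flag likely data skew when the upper end dwarfs the median.

    ratio_threshold is a hypothetical cutoff chosen for illustration only.
    p25 is accepted so the summary tuple can be unpacked directly, but the
    check itself only compares p75 and max against the median.
    """
    if median <= 0:
        return False
    return (maximum / median >= ratio_threshold
            or p75 / median >= ratio_threshold)
```

With the durations from the post, `looks_skewed(6, 10, 11, 17)` is not flagged because the max is only 1.7x the median. A job where most tasks take about 30s and the max takes 600s, for example `looks_skewed(25, 30, 35, 600)`, is flagged. The same check can be applied to input sizes per task.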
11-09-2016
02:12 PM
Thanks @Andrew Grande! That worked! I feel like a noob 🙂 but appreciate all the help!
10-15-2016
02:52 PM
Hey @Daniel Rolls, no problem at all. I'm glad it's working! I'm surprised Chrome isn't working, though. I use Chrome by default and the views have worked fine so far. Thanks for the update!
05-11-2017
03:02 PM
Hi Scott, below is the error I get when I try to make an ODBC data connection. "UNABLE TO CONNECT" Encountered an error while trying to connect to ODBC Details: "ODBC: ERROR [HY000] [Hortonworks][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Internal credentials cache error).
ERROR [HY000] [Hortonworks][Hardy] (34) Error from server: SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Internal credentials cache error)." I am able to successfully test the connection from the Hortonworks Hive ODBC Driver DSN Setup. Thanks for any help
:)
09-09-2016
01:06 PM
+ @jfrazee @Matt Burgess
08-26-2016
05:14 PM
That did the trick! Thanks @Constantin Stanca!
08-01-2016
04:59 PM
Thanks, that answers all my questions. I'd be all in on HDInsight if MS would give me a free dev environment 🙂
03-20-2017
08:41 PM
8 Kudos
@Anurag Setia HDP for Windows only supports server operating systems, such as Windows Server 2012 R2. Here's the list: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2-Win/bk_HDP_Install_Win/content/ref-9bdea823-d29d-47f2-9434-86d5460b9aa9.1.html. You also need to install the required software packages prior to installation: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2-Win/bk_HDP_Install_Win/content/ref-dc3ee968-ae3c-4c41-bb26-75a165180fb5.1.html
06-13-2016
01:00 PM
Thanks for the information.
10-17-2017
02:36 PM
I tried using the external table method, but I run out of memory. My Mongo collection (table2) has 10 million records (0.755 GB), and reading from it works. After the insert task fails, a count on the native table (table1) returns 0 rows. My query looks like this: "INSERT INTO table1 SELECT * FROM table2". If I add "LIMIT 1000" it works, but I need to migrate the entire collection. I attached the output from beeline.