Member since: 09-23-2015
Posts: 800
Kudos Received: 898
Solutions: 185
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 7357 | 08-12-2016 01:02 PM |
|  | 2708 | 08-08-2016 10:00 AM |
|  | 3670 | 08-03-2016 04:44 PM |
|  | 7210 | 08-03-2016 02:53 PM |
|  | 1863 | 08-01-2016 02:38 PM |
04-08-2016
06:17 PM
"submitted by user hive to unknown queue: default" — so does your ResourceManager actually have a default queue? You can check in the ResourceManager UI on port 8088.
04-08-2016
01:39 PM
2 Kudos
In an un-Kerberized HDP cluster the HBase znode is /hbase-unsecure; in a secured cluster it changes to /hbase-secure. In this question the user did the same thing and fixed it by adding the znode to the URL: "zkUrl", "sandbox:2181:/hbase-unsecure" — https://community.hortonworks.com/questions/18228/phoenix-hbase-problem-with-hdp-234-and-java.html I doubt adding it to the Spark config helps anything (for example, only parameters prefixed with spark. get serialized). sqlline needed the /hbase-unsecure suffix before, but in the newest version it seems to take the znode from hbase-site.xml if not otherwise configured. You can check in your hbase-site.xml which znode is needed.
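As a minimal sketch of the fix above: a small hypothetical helper that builds a Phoenix JDBC URL with the znode pinned explicitly, matching the "sandbox:2181:/hbase-unsecure" pattern. The helper name and the sandbox host are illustrative, not part of the Phoenix API.

```java
// Hypothetical helper: builds a Phoenix JDBC URL that pins the HBase znode
// explicitly, mirroring the "sandbox:2181:/hbase-unsecure" fix above.
public class PhoenixUrl {
    static String phoenixJdbcUrl(String zkQuorum, int zkPort, String znode) {
        return "jdbc:phoenix:" + zkQuorum + ":" + zkPort + ":" + znode;
    }

    public static void main(String[] args) {
        String url = phoenixJdbcUrl("sandbox", 2181, "/hbase-unsecure");
        System.out.println(url);
        // On a real cluster you would pass this to
        // java.sql.DriverManager.getConnection(url).
    }
}
```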
04-08-2016
01:17 PM
Depends what you plan to do.
- Aggregation queries and analytical reports: Hive (a simple JDBC connection is supported by BIRT and Pentaho, and you can also build servlets with JDBC pools, the whole shebang).
- Selecting one record at a time (like a dashboard that shows the data of one customer): HBase with the REST API from JavaScript might work, or HBase with the Java API from a servlet. If you prefer SQL, Apache Phoenix is a cool SQL layer on top of HBase: https://phoenix.apache.org/
- Interactive reports on thousands to millions of records (not billions): Apache Phoenix. It provides some good enhancements over base HBase from a performance perspective for anything that touches more than a row, and you can also do joins, aggregations, etc. pp.

(If you want HBase but have Kerberos set up, have a look at Knox: it is an SSL-capable proxy that strips away the Kerberos requirement and replaces it with a normal web authentication setting for the HBase API.)
04-08-2016
12:29 PM
2 Kudos
@Maharaj Muthusamy You need to set the hint right after the first statement, i.e. the UPSERT statement, not the SELECT, for it to work. Just tried it.

This works and results in a sort merge join:

explain upsert /*+ USE_SORT_MERGE_JOIN */ into productsales
select productsales.product, productsales.date, productsales.amount
from sales, productsales
where sales.product = productsales.product;

This doesn't:

explain upsert into productsales
select /*+ USE_SORT_MERGE_JOIN */ productsales.product, productsales.date, productsales.amount
from sales, productsales
where sales.product = productsales.product;
04-08-2016
11:32 AM
You can build your own frontend using Angular or whatever you want (Dojo has some nice charts): https://www.sitepen.com/blog/2008/06/06/a-beginners-guide-to-dojo-charting-part-1-of-2/ However, if you need only a lower level of flexibility, you could use BIRT (http://www.eclipse.org/birt/), Pentaho, or other reporting tools. That would be easier, and BIRT, for example, provides pretty flexible report creation capabilities. (It breaks down when you want a highly interactive frontend.)
04-08-2016
11:28 AM
Normally copy and paste works. Do you use PuTTY? Also, if FileZilla fails, you could use WinSCP. But normally it tells you more than just "critical transfer error".
04-07-2016
10:17 PM
1 Kudo
Hive expects a SASL wrapper from the client (empty in your case) and doesn't seem to get one, or gets one with a wrong status. Is it possible that the ODBC driver is old? Did you use the ODBC driver from here? http://hortonworks.com/hdp/addons/
04-07-2016
09:04 PM
I assume that if you run EXPLAIN, it shows both times that the hint is ignored. I found this link where the Squirrel JDBC client was removing hints, but it looks like you use sqlline, the link says it works there, and the usage looks identical to what you do. (Could you try it once without the UPSERT to see if that helps?) https://mail-archives.apache.org/mod_mbox/phoenix-user/201503.mbox/%3cfc15b78a902a1ad5c0d66efc4a5b2342@mail.gmail.com%3e The second possibility would obviously be to increase the hash cache: 100 MB is not that much in this day and age, and 3M rows is not the world. See phoenix.query.maxServerCacheBytes. I hope Phoenix is smart enough to only build a cache from the two columns of the right side it actually needs. But since it's a description column, that is presumably bigger: 200 bytes * 3M rows would be 600 MB of data.
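The sizing above as a back-of-the-envelope sketch. The 200-byte average row width is an assumed figure for illustration, not a measured value.

```java
// Back-of-the-envelope sizing for the Phoenix server-side hash cache,
// matching the numbers above. avgRowBytes (200) is an assumed average
// width for the cached right-hand-side columns, not a measured value.
public class HashCacheEstimate {
    static long estimatedCacheBytes(long rows, long avgRowBytes) {
        return rows * avgRowBytes;
    }

    public static void main(String[] args) {
        long needed = estimatedCacheBytes(3_000_000L, 200L);
        System.out.println(needed); // 600000000 bytes, i.e. roughly 600 MB
        // Compare this against phoenix.query.maxServerCacheBytes (~100 MB here)
        // to judge whether the hash-join cache would overflow.
    }
}
```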
04-07-2016
06:16 PM
Yeah, the first approach is simple and is what I did before, so I know it works. Scanning the whole data 8 times is a bit wasteful, but the operation should be very fast (you only parse the dataset once, and filters are quick). GroupBy might be more efficient for a large number of types, but you would need to somehow implement a file save for an array, and everything for one type ends up in the memory of one executor, I think. So more work and less robust. If you go that second way, an article here would be cool.
04-07-2016
05:16 PM
So the data is in the same stream? I.e., one row will have one format and the next one another? If you had 8 Kafka streams, I suppose you wouldn't ask. In that case you have two options:

- Make an identify function, apply it, then filter the RDD 8 times, once for each type, and each time do the correct parsing and persisting in SQL. As an illustration (pseudocode):

    val inputStream = ...
    val typedStream = inputStream.map(record => (identifyType(record), record))

    val type1Stream = typedStream.filter(_._1 == "type1")
    val type2Stream = typedStream.filter(_._1 == "type2")
    ...
    val parsed1 = type1Stream.map(record => myParse1Function(record._2))
    // persist parsed1 as a DataFrame in table1
    val parsed2 = type2Stream.map(record => myParse2Function(record._2))
    // persist parsed2 as a DataFrame in table2

- Make an identify function, apply it, and then group by the type somehow. The problem is how you save the grouped values: they will all end up in the same executor, I think. It would be a bit more work but more efficient, because above you filter the same stream 8 times.

Unfortunately there is no tee yet that could split a stream apart; that would be exactly what you need (if I understood the question correctly). https://issues.apache.org/jira/browse/SPARK-13378
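The identify-then-filter pattern above, sketched on a plain in-memory collection so it runs without a Spark cluster. identifyType and the JSON-vs-CSV heuristic are made-up examples; on a real DStream the same one-filter-per-type shape applies.

```java
import java.util.List;
import java.util.stream.Collectors;

// Minimal, Spark-free sketch of the identify-then-filter pattern above.
// identifyType and the JSON-vs-CSV heuristic are made-up examples.
public class SplitByType {
    static String identifyType(String record) {
        return record.startsWith("{") ? "json" : "csv";
    }

    public static void main(String[] args) {
        List<String> records = List.of("{\"id\":1}", "2,foo", "{\"id\":3}");

        // Tag each record once via identifyType, then filter per type:
        // one pass per type, analogous to filtering the same stream
        // once per format.
        List<String> jsonRecords = records.stream()
                .filter(r -> identifyType(r).equals("json"))
                .collect(Collectors.toList());
        List<String> csvRecords = records.stream()
                .filter(r -> identifyType(r).equals("csv"))
                .collect(Collectors.toList());

        System.out.println(jsonRecords);
        System.out.println(csvRecords);
    }
}
```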