Member since
08-22-2018
5
Posts
0
Kudos Received
0
Solutions
09-26-2018
09:39 PM
Though this is an old post, want to know if there's a solution for the original query. I have concerns similar to narasimha meruva I've built an MR project in eclipse and tested in it Eclipse, where in I created a Driver class with all configurations(mapper class, number of reducer etc). Next, I packed it into a JAR and ran it as MR job(Hadoop JAR command) in my cluster to check if it works fine. Next, I simply want to execute this MR JAR in oozie and thus oozie should have provision to execute the JAR WITHOUT providing the configuration parameters again since they are already present in driver class. Please do post if anyone knows the best practice.
... View more
09-20-2018
09:20 PM
HI TomasTF, I tried 3 ways, 1. Data without newline : sample below {"id":"P121", "name":"Round Tee", "reseller":"Puma", "category":"Top Wear", "price":2379.06, "discount":22.57, "profit_percent":20.13},{"id":"P122", "name":"pant", "reseller":"Reebok", "category":"bottom", "price":1234.99, "discount":99.99, "profit_percent":67.12} 2. Data with new line and comma {"id":"P121", "name":"Round Tee", "reseller":"Puma", "category":"Top Wear", "price":2379.06, "discount":22.57, "profit_percent":20.13}, {"id":"P122", "name":"pant", "reseller":"Reebok", "category":"bottom", "price":1234.99, "discount":99.99, "profit_percent":67.12} 3. Data with new line and without comma {"id":"P121", "name":"Round Tee", "reseller":"Puma", "category":"Top Wear", "price":2379.06, "discount":22.57, "profit_percent":20.13} {"id":"P122", "name":"pant", "reseller":"Reebok", "category":"bottom", "price":1234.99, "discount":99.99, "profit_percent":67.12} all three failed with same error Next, I deleted the table and recreated with Double CREATE EXTERNAL TABLE IF NOT EXISTS testtable (Id string,Name string,Reseller string,Category string,Price Double,Discount Double,Profit_percent Double) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' location "/user/testuser/test"; with Data in below format (new line without comma) {"id":"P121", "name":"Round Tee", "reseller":"Puma", "category":"Top Wear", "price":2379.06, "discount":22.57, "profit_percent":20.13} {"id":"P122", "name":"pant", "reseller":"Reebok", "category":"bottom", "price":1234.99, "discount":99.99, "profit_percent":67.12} Now Select Query is not fetching result without error. However, am still now sure why Decimal was throwing error.
... View more
09-12-2018
02:21 AM
Hi Team, Below are my installation details Hive 1.1.0-cdh5.11.2 Hadoop 2.6.0-cdh5.11.2 I want to load JSON data into Hive table for further analysis thus I followed belwo steps, 1. created Hive table using serde - successful CREATE EXTERNAL TABLE IF NOT EXISTS testtable (name String, age int, createdon Date) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.DelimitedJSONSerDe' WITH SERDEPROPERTIES ("ignore.malformed.json" = "true") LOCATION '/user/testuser/test'; 2. Loaded JSON data into Hive table from local file - successful load data local inpath '/home/testuser/test.json' into table testtable; However, when I try to query this data using Selct command I get below error, hive> select * from testtable; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: DelimitedJSONSerDe cannot deserialize. Please help me understand the root cause of this error. Also, do suggest if this is the right way to load and query JSON data.
... View more
Labels:
- Labels:
-
Apache Hive