Member since 09-24-2015
527 Posts
136 Kudos Received
19 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2281 | 06-30-2017 03:15 PM |
| | 3306 | 10-14-2016 10:08 AM |
| | 8567 | 09-07-2016 06:04 AM |
| | 10354 | 08-26-2016 11:27 AM |
| | 1533 | 08-23-2016 02:09 PM |
02-23-2016
09:46 AM
1 Kudo
Hi: Thanks for the information. In the end I used this: `$0#'VARIABLE'#'IMP-NOMINAL-F'`. Many thanks.
02-23-2016
08:05 AM
2 Kudos
Hi: I need to read this JSON, but I can't print or access the multilevel fields; I can only read the first level. For example, I need to read the `TIPINC-F` field inside `VARIABLE`. My JSON:

```json
{"NUM-PARTICION-F":"001","NOMBRE-REGLA-F":"SAI_TIP_INC_TRN","FECHA-OPRCN-F":"2015-12-06 00:00:01","COD-NRBE-EN-F":"9998","COD-NRBE-EN-FSC-F":"9998","COD-INTERNO-UO-F":"0001","COD-INTERNO-UO-FSC-F":"0001","COD-CSB-OF-F":"0001","COD-CENT-UO-F":"","ID-INTERNO-TERM-TN-F":"A0299989","ID-INTERNO-EMPL-EP-F":"99999989","CANAL":"01","NUM-SEC-F":"764","COD-TX-F":"SAI01COU","COD-TX-DI-F":"TUX","ID-EMPL-AUT-F":"U028765","FECHA-CTBLE-F":"2015-12-07","COD-IDENTIFICACION-F":"","IDENTIFICACION-F":"","VALOR-IMP-F":"0.00","VARIABLE":{"TIPINC-F":"0","PERFIL-CAJ-F":"0","PERFIL-COM-F":"0","PERFIL-TAR-F":"0","RESPONSABLE-F":"0","RESP-EXCEP-F":"0","EXCEPCION-F":"0","STD-CHAR-01-F":"1","STD-DEC-1-F":"0","STD-DEC-2-F":"0"}}
```

My code:

```pig
A = LOAD '/RSI/staging/input/logs/log.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');
B = FOREACH A GENERATE (CHARARRAY) $0#'FECHA-OPRCN-F' AS fecha,
    (CHARARRAY) $0#'COD-NRBE-EN-F' AS entidad,
    (CHARARRAY) $0#'COD-INTERNO-UO-FSC-F' AS ofi,
    (CHARARRAY) $0#'COD-TX-F' AS ope;
```

My output:

```
(2015-12-06 00:06:40,9998,0001,DVI82OOU,)
(2015-12-06 00:06:42,9998,0001,DVI95COU,)
(2015-12-06 00:06:49,3191,9204,BDPPM1ZJ,)
(2015-12-06 00:06:49,3076,9554,STR03CON,)
(2015-12-06 00:06:53,3008,9521,BDPPM1RJ,)
```
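With elephant-bird's `-nestedLoad` option, a nested map can usually be dereferenced by chaining the `#` operator, which matches the expression the poster reported working in a later reply. A minimal sketch, assuming the same loader and the field names from the JSON above:

```pig
-- assumes A was loaded with JsonLoader('-nestedLoad') as in the question
C = FOREACH A GENERATE
    (CHARARRAY) $0#'FECHA-OPRCN-F'       AS fecha,
    (CHARARRAY) $0#'VARIABLE'#'TIPINC-F' AS tipinc;  -- second-level field
DUMP C;
```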
Labels:
- Apache Pig
02-18-2016
02:36 PM
1 Kudo
I mean the files are here: /user/dangulo/tables_pig/year=2016/month=01 And the table I created was like this:

```sql
CREATE EXTERNAL TABLE journey_v4_externa(
  CODTF string,
  CODNRBEENF string,
  FECHAOPRCNF timestamp,
  FRECUENCIA int)
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
STORED AS AVRO
LOCATION '/user/dangulo/tables_pig'
TBLPROPERTIES ("immutable"="false","avro.compress"="zlib");
```

So I don't know why there is no data in the table.
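One common cause, noted here as an assumption about this setup rather than a confirmed diagnosis: dropping and re-creating a partitioned table discards its partition metadata, so Hive sees no partitions (and therefore no rows) even though the files remain in HDFS. Re-registering the partitions from the directory layout may help:

```sql
MSCK REPAIR TABLE journey_v4_externa;
```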
02-18-2016
01:13 PM
Yes, I know; that is why, after the drop and create, I don't know why the table is empty, because the files are there. Thanks.
02-18-2016
12:27 PM
1 Kudo
Hi: I created an external table:

```sql
CREATE EXTERNAL TABLE journey_v4_externa(
  CODTF string,
  CODNRBEENF string,
  FECHAOPRCNF timestamp,
  FRECUENCIA int)
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
STORED AS AVRO
LOCATION '/user/dangulo/tables_pig'
TBLPROPERTIES ("immutable"="false","avro.compress"="zlib");
```

Then I inserted the data, then I dropped the table, but the files are still in HDFS. Then I re-created the table, but the table is empty. What can I do to get the data back into the table? Thanks.
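A sketch of one way to re-attach the existing files to the re-created table, assuming the partition path mentioned elsewhere in the thread (`/user/dangulo/tables_pig/year=2016/month=01`); each existing partition directory would need a statement like this:

```sql
ALTER TABLE journey_v4_externa ADD IF NOT EXISTS
  PARTITION (year='2016', month='01')
  LOCATION '/user/dangulo/tables_pig/year=2016/month=01';
```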
02-18-2016
11:51 AM
2 Kudos
Hi: I want to delete one column from a Hive table. My table is like this:

```sql
CREATE TABLE journey_v4(
  CODTF string,
  CODNRBEENF string,
  FECHAOPRCNF timestamp,
  FRECUENCIA int)
PARTITIONED BY (year string, month string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
STORED AS AVRO
TBLPROPERTIES ("immutable"="false","avro.compress"="zlib");
```

Then I added a new column:

```sql
ALTER TABLE journey_v4 ADD COLUMNS (EXTRA string);
```

Now I want to delete the column EXTRA to go back to the original table, but this has no effect:

```sql
ALTER TABLE journey_v4 REPLACE COLUMNS (CODTF string, CODNRBEENF string, FECHAOPRCNF timestamp, FRECUENCIA in);
```

Any suggestions? Thanks
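As an observation rather than a confirmed fix: the REPLACE COLUMNS statement above declares FRECUENCIA as `in`, which is not a Hive type and would cause the statement to fail. With the typo corrected it would read:

```sql
ALTER TABLE journey_v4 REPLACE COLUMNS (
  CODTF string,
  CODNRBEENF string,
  FECHAOPRCNF timestamp,
  FRECUENCIA int);
```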
Labels:
- Apache HCatalog
- Apache Hive
02-18-2016
10:20 AM
Hi: Yes, I used the compression, but the reducer tasks still take a long time. I think it is in the final merge task; the shuffle and sort look fine. I'll try the combiner class from R and let you know. Many thanks.
02-17-2016
06:24 PM
Hi: Yes, I used map-output compression (I forgot to mention it), but I still haven't used a combiner class. I'll try it and tell you. Many, many thanks.
02-17-2016
04:56 PM
Hi: I changed these parameters and now the job finishes after 32 minutes, but I still don't know why the reducer takes so long from 96% to 100%. Look:

```
16/02/17 17:46:32 INFO mapreduce.Job: Running job: job_1455727501370_0001
16/02/17 17:46:39 INFO mapreduce.Job: Job job_1455727501370_0001 running in uber mode : false
16/02/17 17:46:39 INFO mapreduce.Job: map 0% reduce 0%
...
16/02/17 17:53:29 INFO mapreduce.Job: map 100% reduce 92%
16/02/17 17:53:31 INFO mapreduce.Job: map 100% reduce 93%
16/02/17 17:53:46 INFO mapreduce.Job: map 100% reduce 96%
```

(and only after another 30 minutes did it finish)

The parameters I changed are:
- mapreduce.job.reduce.slowstart.completedmaps=0.8
- mapreduce.reduce.shuffle.parallelcopies
- mapreduce.reduce.shuffle.input.buffer.percent
- mapreduce.reduce.shuffle.merge.percent

From RStudio:

```r
rmr.options(backend.parameters = list(
  hadoop = list(D = "mapreduce.map.memory.mb=4096",
                D = "mapreduce.job.reduces=7",
                D = "mapreduce.reduce.memory.mb=5120")))
```

Are there any more parameters that could help me? Thanks
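A few other reduce-side properties that are sometimes tuned when the final merge is slow; the values below are illustrative assumptions for this kind of workload, not recommendations:

```properties
# compress map output to shrink shuffle traffic (the poster mentions already doing this)
mapreduce.map.output.compress=true
mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
# larger in-memory sort buffer reduces on-disk spill/merge rounds
mapreduce.task.io.sort.mb=512
# allow more merge streams per pass during the final merge
mapreduce.task.io.sort.factor=50
```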
02-17-2016
12:49 PM
This `-Dmapred.reduce.tasks=x` is for MapReduce v1; I am using MapReduce v2 with YARN, and I don't know how to change this parameter. Any suggestions? Thanks
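In MapReduce v2 the old `mapred.reduce.tasks` property was renamed to `mapreduce.job.reduces`, and it can be passed the same way on the command line. A sketch, where the jar and class names are placeholders, not taken from this thread:

```
hadoop jar my-job.jar MyMainClass -Dmapreduce.job.reduces=7 <input> <output>
```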