Member since 12-15-2015, 49 Posts, 20 Kudos Received, 0 Solutions
02-18-2019
03:24 AM
3 Kudos
Hi, We have just upgraded to HDP 3.1 and I found that the Hive 3.1 show tables command does not list views. When connecting with the Tableau ODBC driver, only tables are visible, because the driver issues a show tables command, not show views. Kindly let me know if anyone has solved this issue. Regards, Mamta Chawla
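A hedged sketch of a manual workaround, not a fix for the driver itself (host and database names below are placeholders): Hive provides a separate SHOW VIEWS statement, so the views can still be listed from beeline even though SHOW TABLES omits them.
beeline -u "jdbc:hive2://hs2-host:10000/default" -n hive \
  -e "SHOW VIEWS IN my_database;"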
02-18-2019
01:18 AM
Hi @philippe pouliot, I am facing the same issue. I just upgraded to HDP 3.1 and found that with the ODBC driver I can see only tables, not the views. Did you find any solution for this? Regards, Mamta Chawla
01-10-2019
09:25 PM
Hi, I have a list of JSON files (single-line JSON) under one HDFS folder. When I read them into a dataset using sparkContext.read.json("/x/y/z/*") and do a count operation, it takes around 50 minutes for 3 million records. Kindly let me know how I can optimize this. Regards, Mamta Chawla
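A hedged sketch of two things that often help with many small JSON files, assuming the field names below (id, payload) are placeholders for the real schema: supplying the schema up front avoids an extra full pass over the data just to infer it, and coalescing the inputs reduces per-task overhead before the count.
spark-shell --master yarn <<'EOF'
import org.apache.spark.sql.types._
// assumed schema; replace with the actual JSON fields
val schema = new StructType().add("id", StringType).add("payload", StringType)
val df = spark.read.schema(schema).json("/x/y/z/*")
println(df.coalesce(200).count())
EOF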
06-16-2018
04:46 PM
What about the scenario where you have around 200 groups and do not want to create that many groups in AD, while they can easily be managed at the Ranger level? This is really weird: we can create the groups in Ranger, but even then it does not work. I am seeing this as a bug in Ranger!!
06-11-2018
02:35 PM
Hi Ankit, we tried setting -Dsqoop.export.statements.per.transaction to 1 as well, but it does not work!! Regards, Mamta Chawla
06-08-2018
04:28 AM
Hi, The sqoop command below fails when run with -num-mappers 10, while it succeeds when -num-mappers is set to 1. Kindly suggest how to run the command with -num-mappers 10.
driver=com.sybase.jdbc4.jdbc.SybDriver
echo "${driver}"
jconnect=jdbc:sybase:Tds:USD01V-SYIQ003:7777/DATABASE=JFORSDEV
echo "${jconnect}"
sqoop export \
-Dsqoop.export.statements.per.transaction=1000 \
-verbose \
-driver "${driver}" \
-connect "${jconnect}" \
-username=tableau \
-password=Smile123 \
-direct \
-export-dir '/tmp/test' \
-input-lines-terminated-by '\n' \
-input-optionally-enclosed-by '\"' \
-fields-terminated-by '\t' \
-table tableau.sqoopExport_orc \
-columns 'id,first_name,last_name,address' \
-batch \
-num-mappers 10 \
;
Sqoop export does a partial export and fails with this error:
2018-06-08 04:17:19,385 FATAL [IPC Server handler 11 on 37311] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1528413303562_0066_m_000005_0 - exited : java.io.IOException: java.sql.BatchUpdateException: JZ0BE: BatchUpdateException: Error occurred while executing batch statement: SQL Anywhere Error -210: User 'another user' has the row in 'sqoopExport_orc' locked
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:205)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
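A hedged sketch, not a verified fix: with 10 mappers each task commits its own batches, so concurrent writers can collide with row locks held by other sessions on the target table. A common Sqoop pattern is to export into an empty staging table and let Sqoop move the rows into the real table in a single final step. The staging table name below is an assumption (it must already exist with the same layout), and --staging-table is not compatible with --direct, which is dropped here.
sqoop export \
  -Dsqoop.export.statements.per.transaction=1000 \
  --driver "${driver}" \
  --connect "${jconnect}" \
  --username tableau -P \
  --export-dir '/tmp/test' \
  --input-lines-terminated-by '\n' \
  --input-optionally-enclosed-by '\"' \
  --fields-terminated-by '\t' \
  --table tableau.sqoopExport_orc \
  --columns 'id,first_name,last_name,address' \
  --staging-table tableau.sqoopExport_orc_stage \
  --clear-staging-table \
  --batch \
  --num-mappers 10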
10-13-2017
11:32 PM
Hi, We have Kerberos with AD as the KDC. I want to generate keytabs for the service accounts.
kadmin -r <ad-domain> -p CN=kadmin,OU=Service Accounts,DC=xxxx,DC=xxxx,DC=com -w xxxxxxx -s ADSever
kadmin: Missing parameters in krb5.conf required for kadmin client while initializing kadmin interface
Kindly suggest a solution. Regards, Mamta Chawla
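A hedged sketch (principal, realm, and keytab path are placeholders): Active Directory does not expose the MIT kadmin protocol, which is one reason the kadmin client complains about missing krb5.conf parameters. Keytabs for AD-backed service accounts are usually built on the client with ktutil (or on the AD side with ktpass), using the account's password and a matching encryption type.
ktutil
ktutil:  addent -password -p svc-hadoop@AD.EXAMPLE.COM -k 1 -e aes256-cts-hmac-sha1-96
Password for svc-hadoop@AD.EXAMPLE.COM:
ktutil:  wkt /etc/security/keytabs/svc-hadoop.keytab
ktutil:  quit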
10-02-2017
03:57 PM
Hi, I am trying to get a table's schema, with data types, from the source system. Is this possible with sqoop commands? Right now I am using sqoop eval, which gets the list of columns only, but I need the data types as well. Kindly let me know if there is any solution. Regards, Mamta Chawla
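A hedged sketch: sqoop eval simply runs whatever SQL it is given, so on sources that expose an INFORMATION_SCHEMA (MySQL, SQL Server, and others) the column types can be queried directly. The connection variables and table name are placeholders, and the catalog query differs per database (Oracle uses ALL_TAB_COLUMNS, for example).
sqoop eval \
  --connect "${jconnect}" \
  --username "${user}" -P \
  --query "SELECT column_name, data_type, character_maximum_length FROM information_schema.columns WHERE table_name = 'my_table'"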
09-24-2017
08:26 PM
Hi, I have created an Avro-format external Hive table. Then I added a few more columns in the AVSC and recreated the Hive table, with the new columns having default values. But when I query the table, the old data does not appear. I created the old table as:
SET avro.output.codec=snappy;
CREATE EXTERNAL TABLE finpolicy1
PARTITIONED BY (ds string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/mamta.chawla/fin/avro'
TBLPROPERTIES ('avro.schema.url'='/user/mamta.chawla/fin/avsc/fin_1.avsc');
I created the new table as:
SET avro.output.codec=snappy;
CREATE EXTERNAL TABLE finpolicy1
PARTITIONED BY (ds string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/mamta.chawla/fin/avro'
TBLPROPERTIES ('avro.schema.url'='/user/mamta.chawla/fin/avsc/fin_2.avsc');
fin_1.avsc is:
{
"namespace": "testing.hive.avro.serde",
"name": "cards",
"type": "record",
"fields": [
{
"name":"batchID",
"type":"string",
"doc":"Order of playing the role"
},
{
"name":"color",
"type":"string",
"doc":"Order of playing the role"
},
{
"name":"suit",
"type":"string",
"doc":"card suit"
}]}
fin_2.avsc is:
{
"namespace": "testing.hive.avro.serde",
"name": "cards",
"type": "record",
"fields": [
{
"name":"batchID",
"type":"string",
"doc":"Order of playing the role"
},
{
"name":"color",
"type":"string",
"doc":"Order of playing the role"
},
{
"name":"suit",
"type":"string",
"doc":"card suit"
},
{
"name":"PIA",
"type":"string",
"doc":"last name of actor playing role",
"default":""
}]}
After the schema evolution, when I query the table with select * from finpolicy, no records are returned. I am loading data into the Avro table from a text table using:
hive -e "insert into table ${avro_table} PARTITION(ds='Q1')
select t.* from ${tabName} t;"
Is there something I am missing? Regards, Mamta Chawla
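A hedged guess at the missing step, not a confirmed diagnosis: dropping and recreating the external table also drops its partition metadata, so data already sitting under /user/mamta.chawla/fin/avro stays invisible until the ds partitions are registered against the new table, for example:
hive -e "MSCK REPAIR TABLE finpolicy1;"
# or register a single known partition explicitly
hive -e "ALTER TABLE finpolicy1 ADD IF NOT EXISTS PARTITION (ds='Q1');"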
09-15-2017
01:55 AM
Hi, In my project we store Snappy-compressed Avro files in an encryption zone. The requirement is that if someone manages to copy an Avro file outside of the Hadoop cluster and tries to deserialize it, the file should not be deserializable; it should only be deserializable inside the cluster. Is there any solution? Thanks, Mamta
07-13-2016
05:26 AM
@Sunile Manjee I am loading data from a data file as a Hive load, so I can't do the above. But as per my understanding '\N' should show up as None. Any solution, or is this a gap? Mamta
07-13-2016
03:50 AM
Hi, I have a data file in which I am replacing NULL values with '\N', but in Hive it appears as-is. Below is how my file looks:
U500010602,E","'\N'",,"'\N'",,"'\N'",,"'\N'",00010101
and below is a snapshot of how it looks in Hive. I want '\N' to appear as None. Kindly let me know how I can do this. Thanks, Mamta Chawla
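A hedged sketch (the table name is a placeholder): Hive's default null marker is the bare two-character token \N, so the extra quotes written around it in the file keep it a literal string. The simplest fix is to write a plain \N with no surrounding quotes; alternatively the table can be told which token to treat as NULL, for example:
hive -e 'ALTER TABLE my_table SET SERDEPROPERTIES ("serialization.null.format"="\\N");'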
07-10-2016
05:40 AM
Hi, Is it possible to sqoop from a SQL DB to an edge node / local Unix path instead of HDFS? Regards, Mamta Chawla
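A hedged workaround sketch rather than a native Sqoop option (paths and table name are placeholders): Sqoop mappers write to a Hadoop filesystem, so the usual approach is to land the import in HDFS first and then pull it onto the edge node.
sqoop import --connect "${jconnect}" --username "${user}" -P \
  --table my_table --target-dir /tmp/my_table_import -m 1
hdfs dfs -get /tmp/my_table_import /data/local/my_table_import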
05-30-2016
11:45 PM
Hi @Ravi Mutyala, can you tell me how to use node labels? Regards, Mamta Chawla
05-30-2016
09:19 PM
2 Kudos
Hi, I have shell scripts which contain hive commands. I am running the shell script using Oozie, but I get the error "hive: command not found".
<action name="get_run_date">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>${hive_conf_path}</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${Run_Date_File}</exec>
<argument>${run_var_path}</argument>
<file>${Run_Date_File_Path}${Run_Date_File}</file>
<capture-output />
</shell>
<ok to="Insert_Subro_Table" />
<error to="fail" />
</action>
${Run_Date_File} contains the hive command. Kindly let me know how I can execute a Hive command from a shell action in Oozie. Is this related to the shell action not running on the edge node? Regards, Mamta Chawla
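A hedged sketch of one workaround: the Oozie shell action runs on whichever NodeManager host YARN picks, where the hive client may not be installed or on the PATH. Having the script call beeline against HiveServer2 removes the dependency on a local hive binary (the JDBC URL and query below are assumptions); switching to Oozie's hive/hive2 action is another option.
#!/bin/bash
# runs inside the Oozie shell action on an arbitrary worker node
beeline -u "jdbc:hive2://hiveserver2-host:10000/default" -n "${USER}" \
  -e "SELECT MAX(run_date) FROM control.run_dates;"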
05-25-2016
07:19 PM
Hi, I have a shell script echo.sh in HDFS, which contains "echo Hello". I want to execute the shell script from the Unix shell as below:
sh <HDFS_SCRIPT_PATH>/echo.sh
This fails saying no such file. Kindly let me know how I can do it. Regards, Mamta Chawla
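A minimal sketch of two ways to do this: a local shell cannot execute a file that exists only in HDFS, so either copy it down first or stream it into bash.
hdfs dfs -get <HDFS_SCRIPT_PATH>/echo.sh /tmp/echo.sh && sh /tmp/echo.sh
# or, without keeping a local copy:
hdfs dfs -cat <HDFS_SCRIPT_PATH>/echo.sh | bash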
- Tags:
- Data Processing
- HDFS
05-16-2016
11:19 PM
Thanks @Robert lan. Looking forward to a solution for the problem. Regards, Mamta Chawla
05-15-2016
11:38 PM
Hi, I am trying a very simple Oozie workflow which executes a shell file that redirects an echo to a file, but nothing is happening. Below are my workflow, shell file, and property file.
Workflow:
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.5" name="RESERVE_FEATURE_PREDICTIVE_MODEL">
<start to="IncrementalLoad" />
<!-- Choose the script to Run -->
<!-- Delete Output Folder from HDFS -->
<action name="IncrementalLoad">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>${hive_conf_path}</job-xml>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<exec>${IncrementalScript}</exec>
<argument>${LogPath}</argument>
<file>${script_path}${IncrementalScript}#${IncrementalScript}</file>
<capture-output />
</shell>
<ok to="end" />
<error to="sendEmailKill" />
</action>
<kill name="sendEmailKill">
<message>WorkFlow Name : RESERVE_FEATURE_PREDICTIVE_MODEL == > Killed job due to error: ${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
<end name="end" />
</workflow-app>
Shell script (Incremental.sh):
#!/bin/bash -e
LogPath=$5
#echo logging
#echo logging >>$LogPath>>
Incremental.sh is in HDFS. LogPath is a local path; I even tried an HDFS path. When I execute the shell script it finishes successfully without doing anything. If I remove the # from #echo logging, it gives me a shell main-class exit error. Kindly let me know how I can get the echo/logging into a file using the Oozie shell action. Regards, Mamta Chawla
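A hedged sketch of a corrected script, assuming the single <argument> is the log path: the workflow passes only one argument, so it arrives as $1 rather than $5, and the trailing >> after $LogPath is a broken redirect. Also note that a local path is written on whichever worker node ran the action, so echoing key=value pairs for <capture-output/> is usually the easier way to get output back into the workflow.
#!/bin/bash -e
LogPath="$1"
echo "logging" >> "${LogPath}"
# the line below becomes available to the workflow via <capture-output/>
echo "run_status=ok"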
05-11-2016
08:35 PM
Hi, I have data files which need to be loaded into Hive, but there is a requirement to check that columns with non-string data types, like int, timestamp etc., actually contain data of that type (an int column has int values, a date column has dates, and so on). Kindly suggest how I can do this in Hive, or whether there is any other way to validate the file. The file is a delimited file. Regards, Mamta Chawla
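A hedged sketch of one common approach (table and column names are placeholders): load the delimited file into an all-string staging table first, then rely on CAST returning NULL for values that do not fit the target type and count the offending rows.
hive -e "
SELECT COUNT(*) AS bad_rows
FROM   staging_table
WHERE  (id_col   IS NOT NULL AND CAST(id_col   AS INT)       IS NULL)
    OR (event_ts IS NOT NULL AND CAST(event_ts AS TIMESTAMP) IS NULL);
"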
05-11-2016
05:40 AM
Thanks @Kaliyug Antagonist for the solution, it worked.
05-09-2016
10:58 PM
@mark doutre Hi Mark, can you please provide me a sample of how to add the Avro schema with the Avro data? Thanks, Mamta
05-06-2016
11:50 PM
Hi, I have a very simple AVSC file, and I generated the Avro using the xml-avro converter from GitHub, https://github.com/elodina/xml-avro/tree/master/src/ly/stealth/xmlavro/Converter.java. But when I query the table I get the error below, even though I can see the Avro file inside the table folder. Kindly let me know what I am missing.
Avro - java.io.IOException: java.io.IOException: Not a data file.
Below is my table DDL with the schema literal:
CREATE TABLE embedded
COMMENT "just drop the schema right into the HQL"
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs://csaa-aap-qa/apps/hive/warehouse/reservemodel.db/embedded'
TBLPROPERTIES (
'avro.schema.literal'='{
"fields": [
{
"name": "BillDate",
"source": "element BillDate",
"type": "string"
},
{
"name": "BillTime",
"source": "element BillTime",
"type": "string"
},
{
"name": "Remit_CompanyName",
"source": "element Remit_CompanyName",
"type": "string"
},
{
"name": "Remit_Addr",
"source": "element Remit_Addr",
"type": "string"
},
{
"name": "Remit_CityStZip",
"source": "element Remit_CityStZip",
"type": "string"
},
{
"name": "Remit_Phone",
"source": "element Remit_Phone",
"type": "string"
},
{
"name": "Remit_Fax",
"source": "element Remit_Fax",
"type": "string"
},
{
"name": "Remit_TaxID",
"source": "element Remit_TaxID",
"type": "string"
},
{
"name": "Previous_Balance",
"source": "element Previous_Balance",
"type": "string"
},
{
"name": "others",
"type": {
"type": "map",
"values": "string"
}
}
],
"name": "MetroBillType",
"namespace": "ly.stealth.xmlavro",
"protocol": "xml",
"type": "record"
} ')
;
And the XML is:
<?xml version="1.0" encoding="UTF-8" ?>
<MetroBill xmlns:xs="http://www.w3.org/2001/XMLSchema-instance" xs:noNamespaceSchemaLocation="Metrobill.xsd" >
<BillDate>02/29/2016</BillDate>
<BillTime>18:49:05</BillTime>
<Remit_CompanyName>METROPOLITAN REPORTING BUREAU</Remit_CompanyName>
<Remit_Addr>P.O. BOX 926, WILLIAM PENN ANNEX</Remit_Addr>
<Remit_CityStZip>PHILADELPHIA, PA 19105-0926</Remit_CityStZip>
<Remit_Phone>(800) 245-6686</Remit_Phone>
<Remit_Fax>(800) 343-9047</Remit_Fax>
<Remit_TaxID>23-1879730</Remit_TaxID>
<Previous_Balance>1663</Previous_Balance>
</MetroBill>
The generated Avro contains special characters:
02/29/201618:49:05:METROPOLITAN REPORTING BUREAU@P.O. BOX 926, WILLIAM PENN ANNEX6PHILADELPHIA, PA 19105-0926(800) 245-6686(800) 343-904723-18797301663
Regards, Mamta Chawla
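A hedged check rather than a fix: Hive's Avro input format expects Avro container files, which begin with the magic bytes "Obj" plus 0x01; a converter that writes raw datum bytes (which would also explain the "special chars" dump above) produces exactly this "Not a data file" error. The file name and avro-tools jar path below are placeholders.
hdfs dfs -cat /apps/hive/warehouse/reservemodel.db/embedded/part-00000 | head -c 4 | od -c
# a real container file can be inspected with avro-tools; a raw datum file cannot
java -jar avro-tools.jar getschema /tmp/embedded.avro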
- Tags:
- Avro
- Data Processing
04-22-2016
06:48 AM
1 Kudo
I have an AVSC like the one below, in which one record type is nested under another record type. When I try to create the table in Hive, the table gets created with the error-schema columns instead. Kindly suggest how to import such an AVSC into Hive.
CREATE TABLE metro
ROW FORMAT
SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.literal'='{
"namespace": "ly.stealth.xmlavro", "protocol": "xml", "type" : "record",
"name" : "MetroBillType", "fields" : [ {
"name" : "BillDate",
"type" : "string"
}, {
"name" : "BillTime",
"type" : "string"
}, {
"name" : "Remit_CompanyName",
"type" : "string"
}, {
"name" : "Remit_Addr",
"type" : "string" }, {
"name" : "Remit_CityStZip",
"type" : "string" }, {
"name" : "Remit_Phone",
"type" : "string" }, {
"name" : "Remit_Fax",
"type" : "string" }, {
"name" : "Remit_TaxID",
"type" : "string" }, {
"name" : "BillAcct_Break",
"type" : {
"type" : "record",
"name" : "BillAcct_BreakType",
"fields" : [ {
"name" : "BillAcct",
"type" : "string" }, {
"name" : "Invoice_Number",
"type" : "int"
}, {
"name" : "Acct_Break",
"type" : {
"type" : "record",
"name" : "Acct_BreakType",
"fields" : [ {
"name" : "Acct",
"type" : "string" }, {
"name" : "Items",
"type" : {
"type" : "record",
"name" : "ItemsType",
"fields" : [ {
"name" : "Item",
"type" : {
"type" : "array",
"items" : {
"type" : "record",
"name" : "ItemType",
"fields" : [ {
"name" : "Account",
"type" : "string" }, {
"name" : "Claim_Number",
"type" : "string" }, {
"name" : "Insured_Name",
"type" : "string" }, {
"name" : "Price",
"type" : "float" }, {
"name" : "Control_Number",
"type" : "int" }, {
"name" : "State",
"type" : "string" }, {
"name" : "Report_Type_Code",
"type" : "string" }, {
"name" : "Report_Type_Desc",
"type" : "string" }, {
"name" : "Policy_Number",
"type" : "string" }, {
"name" : "Date_of_Loss",
"type" : "string" }, {
"name" : "Date_Received",
"type" : "string" }, {
"name" : "Date_Closed",
"type" : "string" }, {
"name" : "Days_to_Fill",
"type" : "int" }, {
"name" : "Police_Dept",
"type" : "string"
}, { "name" : "Attention",
"type" : "string" }, {
"name" : "RequestID",
"type" : "int" }, {
"name" : "ForceDup",
"type" : "string" }, {
"name" : "BillAcct",
"type" : "string" }, {
"name" : "BillCode",
"type" : "string" } ]
}
}
} ]
}
}, {
"name" : "Acct_Total",
"type" : "float"
}, {
"name" : "Acct_Count",
"type" : "int"
} ]
}
}, {
"name" : "Bill_Total",
"type" : "float"
}, {
"name" : "Bill_Count",
"type" : "int"
} ]
}
}, {
"name" : "Previous_Balance",
"type" : "int"
} ]
}');
Thanks, Mamta
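A hedged sketch of a sanity check: when Hive cannot parse avro.schema.literal it silently falls back to the error_error_error... column set, so it is worth validating the schema JSON on its own before pasting it into the DDL. The schema file name and avro-tools jar path are placeholders; nested records themselves are supported and show up as Hive struct columns.
python -c 'import json,sys; json.load(sys.stdin)' < metro.avsc && echo "well-formed JSON"
java -jar avro-tools.jar compile schema metro.avsc /tmp/avro-gen   # fails if the file is not a valid Avro schema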
- Tags:
- Avro
- Data Processing
03-22-2016
04:53 PM
@Saurabh Kumar I am looking at both databases, using sqoop:
sqoop list-databases --connect jdbc:mysql://sandbox.hortonworks.com/hive --username hive --password hive
sqoop list-databases --connect jdbc:mysql://sandbox.hortonworks.com/mysql --username hive --password hive
I got these URLs from hive-site.xml. Is there any other connection string I need to refer to? Mamta
03-22-2016
05:11 AM
Hi, sqoop list-databases --connect jdbc:mysql://sandbox.hortonworks.com/hive --username hive --password hive shows only the system databases and public ones. How can I see the databases I created in Hive? Mamta
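A hedged pointer rather than a list-databases fix (connection values copied from the post): databases created in Hive are just metadata rows in the metastore's DBS table inside that single MySQL "hive" database, not separate MySQL databases, so list-databases will never show them.
hive -e "SHOW DATABASES;"
# or read the metastore directly
sqoop eval --connect jdbc:mysql://sandbox.hortonworks.com/hive \
  --username hive --password hive --query "SELECT NAME FROM DBS"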
- Tags:
- Data Processing
- Sqoop
03-16-2016
07:19 PM
1 Kudo
This is the error message: org.apache.hadoop.security.AccessControlException: Permission denied: user=hue does not have privilage to create Database. It works from the CLI, though. Mamta
03-16-2016
06:50 PM
1 Kudo
How do I know whether Ranger is enabled?
03-16-2016
05:48 PM
1 Kudo
I have created a DB from the CLI using the user hue, who is also a superuser, and granted hue owner permission as well. But when I go to the Hive UI / Beeline UI and try to create a table, it gives me a permission denied error. Kindly let me know how I can solve it.
03-01-2016
06:53 PM
@Neeraj Sabharwal Thanks for your response. But the query I am trying is a nested query; simple queries are running fine for me. Why is it failing for nested queries?