Member since: 07-03-2017
Posts: 29
Kudos Received: 21
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4819 | 10-27-2017 07:50 AM
06-29-2018
08:23 AM
@Javert Kirilov When a managed table is dropped, the underlying data is cleaned up along with the metadata. For an external table, only the table's metadata is removed and the data stays in place as it is. Could you please confirm whether you are using a managed or an external table?
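For illustration, a rough sketch of the difference (the table name mytable and the HDFS path /data/mytable are just placeholders):
hive -e "CREATE EXTERNAL TABLE mytable (id INT) LOCATION '/data/mytable';"
hive -e "DROP TABLE mytable;"
hdfs dfs -ls /data/mytable
After dropping the external table, the data files are still listed; for a managed table the files under its warehouse directory would be deleted as well.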
06-27-2018
06:21 PM
@Robert Lake
Can you please add the YARN Capacity Scheduler configs here? Thanks!
03-28-2018
07:27 AM
@PJ All Hive queries are executed in the queue specified by the "tez.queue.name" property. You should try passing the -Dtez.queue.name=A argument to your sqoop import job.
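For reference, a rough sketch of how the property can be passed as a Hadoop generic option (the connection string and table name are placeholders):
sqoop import -Dtez.queue.name=A --connect jdbc:mysql://<host>/<db> --username <user> --password <password> --table <table> --hive-import
Note that -D options need to appear immediately after the tool name (sqoop import) to be picked up as generic options.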
01-10-2018
06:23 PM
@Mohammed Syam You can check which host HiveServer2 is running on by going to the Ambari UI and clicking on HiveServer2 under the Hive service section. Log in to that host and run the script. Before running it, you can confirm that Hive is available by using the command 'which hive', as mentioned above.
01-10-2018
06:56 AM
1 Kudo
@Vijay Tiwari As I can see from the command, the --dataset parameter is not supplied with any argument. Can you confirm whether this is a typo? Otherwise, please share the complete logs along with the command you are running.
01-10-2018
06:48 AM
1 Kudo
@Revathy Mourouguessane This link should be useful for changing the field delimiter without having to copy the data to another table: https://stackoverflow.com/questions/21234370/how-to-change-the-field-terminated-value-for-an-existing-hive-table
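As a rough sketch of the approach described in that link (assuming the table uses the default LazySimpleSerDe; mytable is a placeholder):
hive -e "ALTER TABLE mytable SET SERDEPROPERTIES ('field.delim' = '|');"
This only changes how the existing files are parsed; it does not rewrite the data.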
01-10-2018
06:41 AM
1 Kudo
@Revathy Mourouguessane No, you cannot use multiple delimiters with Sqoop export. Enclosed-by (#) and field-terminated-by (|) might not work with your dataset. You could try copying the data to another Hive table using "\b" (backspace) as the delimiter, and then you should be able to export the data to MySQL.
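A rough sketch of that approach (hive_table, staging_table and the MySQL connection details are placeholders):
hive -e "CREATE TABLE staging_table ROW FORMAT DELIMITED FIELDS TERMINATED BY '\b' AS SELECT * FROM hive_table;"
sqoop export --connect jdbc:mysql://<host>/<db> --username <user> --password <password> --table <mysql_table> --export-dir /apps/hive/warehouse/staging_table --input-fields-terminated-by '\b'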
01-09-2018
07:29 PM
1 Kudo
@sivareddy akkala Can you post the sqoop job command and the full logs here? The error states that it is unable to find the file at the HDFS location: java.io.FileNotFoundException: File hdfs://Intra1cluster/intraday/stage/lte/vertica/rops/fdd5/_temporary/1 does not exist.
10-31-2017
06:58 PM
1 Kudo
@Chaitanya D Can you try adding the --bindir option to your sqoop command? --bindir <path to sqoop home>/lib
e.g. /usr/lib/sqoop/sqoop-1.4.6/lib/
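For illustration, a rough sketch of where --bindir fits in the command (the connection details and table name are placeholders):
sqoop import --connect jdbc:mysql://<host>/<db> --username <user> --password <password> --table <table> --target-dir <hdfs_dir> --bindir /usr/lib/sqoop/sqoop-1.4.6/lib/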
10-31-2017
06:40 PM
1 Kudo
@Andrew Duncan Can you try running the sqoop job with all the options related to incremental append placed before the --connect option? Something like this:
sqoop job --create myjob -- import --append --check-column <column> --incremental append --last-value -9999999 --connect "jdbc:sqlserver://<server>;database=<database>" --username <databaseUser> --password <databasePassword> --table <table> --target-dir "<someDataDirectory>" --hive-import --hive-database <hiveDatabase> --autoreset-to-one-mapper --hive-delims-replacement " " --outdir <outputdir> --bindir <bindir> -- -- --schema <databaseSchema>
Do let me know if it works.
10-31-2017
06:12 AM
1 Kudo
@aparna aravind From the error message, it seems IDENTITY_INSERT is OFF for the target table. Can you set IDENTITY_INSERT to ON for the target table, as shown below, before executing the sqoop job?
SET IDENTITY_INSERT <target table> ON
10-29-2017
06:02 AM
@Kishore Jasthi Can you try using '--hcatalog-table' instead of '--hive-table'?
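For illustration, a rough sketch using the HCatalog options instead (the connection details, database and table names are placeholders):
sqoop import --connect jdbc:mysql://<host>/<db> --username <user> --password <password> --table <source_table> --hcatalog-database <hive_db> --hcatalog-table <hive_table>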
10-29-2017
05:49 AM
@Ravikiran Dasari Can you please share the sqoop command you are trying to run?
10-27-2017
07:50 AM
2 Kudos
@Omkar Nalawade As per the Sqoop documentation, "The --null-string and --null-non-string arguments are optional. If not specified, then the string "null" will be used." So if you are using --null-string or --null-non-string, you need to pass a value to these arguments. If you want the string "null" to be used in place of null values, you can omit these arguments. You can also use '\N' or "\\N" as the null representation. https://community.hortonworks.com/content/supportkb/49673/null-strings-not-handled-by-sqoop-export.html
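For example, a rough sketch of an export that maps Hive's \N null markers back to SQL NULLs (the connection details and table names are placeholders):
sqoop export --connect jdbc:mysql://<host>/<db> --username <user> --password <password> --table <mysql_table> --export-dir /apps/hive/warehouse/<hive_table> --input-null-string '\\N' --input-null-non-string '\\N'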
10-19-2017
12:08 AM
1 Kudo
Hi, did you try this?
agent.sources.localsource.interceptors = search-replace
agent.sources.localsource.interceptors.search-replace.type = search_replace
agent.sources.localsource.interceptors.search-replace.searchPattern = ^INFO:
agent.sources.localsource.interceptors.search-replace.replaceString = Log msg:
09-16-2017
05:38 PM
1 Kudo
@kishan vantakala Can you please run the command with the --verbose option and paste the full logs here?
09-14-2017
03:07 PM
2 Kudos
@Gayathri Devi You can export a Hive table to MySQL using sqoop. You can refer to the syntax below:
sqoop export --connect jdbc:mysql://127.0.0.1/export --username hive --password hive --table exported --direct --export-dir /apps/hive/warehouse/export_table --driver com.mysql.jdbc.Driver
Please make sure the table is created in MySQL before exporting the data. Please give it a try and let me know if you face any issues.
09-13-2017
04:35 AM
2 Kudos
@Sami Ahmad Hive Sink is part of Flume 1.5.2 https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/ds_flume/FlumeUserGuide.html#hive-sink
09-13-2017
04:27 AM
@Biswajit Chakraborty Did you get a chance to try this? Please update.
09-13-2017
04:26 AM
1 Kudo
@Sami Ahmad Did you get a chance to try this config and does it work? Please update.
09-07-2017
01:58 PM
2 Kudos
@Sami Ahmad You can specify multiple sinks in the agent.sinks config. Something like this:
flume1.sinks = hdfs-sink-1 hive-sink-2
flume1.sinks.hive-sink-2.channel = hdfs-channel-1
flume1.sinks.hive-sink-2.type = hive
flume1.sinks.hive-sink-2.hive.metastore = thrift://localhost:9083
flume1.sinks.hive-sink-2.hive.database = <flumedb>
flume1.sinks.hive-sink-2.hive.table = <flumetable>
flume1.sinks.hive-sink-2.serializer = DELIMITED
flume1.sinks.hive-sink-2.serializer.delimiter = ,
flume1.sinks.hive-sink-2.serializer.fieldnames = <field_names>
flume1.sinks.hive-sink-2.batchSize = 10
Modify and add the configs accordingly and please give it a try.
09-06-2017
12:50 PM
1 Kudo
@Andres Urrego Can you please use the --verbose option and share the full logs of the failed job?
09-06-2017
11:57 AM
@kumar pmaha Can you please update on this and close the issue if resolved?
09-06-2017
11:06 AM
1 Kudo
@Biswajit Chakraborty Can you try using the tail command with the -F option rather than -f? -F keeps following the file even if it is rotated or recreated.
09-04-2017
06:21 PM
@D Giri Is it working for you now across the various scenarios?
08-30-2017
06:30 AM
@kumar pmaha You should use 'sqoop command > logfile 2>&1' instead. You can achieve the same functionality with 'sqoop command &> logfile'. Please let me know if you face any issues.
08-29-2017
03:01 PM
1 Kudo
@D Giri I have tried this scenario with a combination of a Kafka channel and a Kafka sink, with pstream as the source: Source (pstream) -> Channel (kafkaChannel) -> Sink (kafkaSink). Please find below the configs that I used:
agent1.sources = pstream
agent1.channels = kafkaChannel
agent1.sinks = kafkaSink
agent1.sources.pstream.type = exec
agent1.sources.pstream.channels = kafkaChannel
agent1.sources.pstream.command = python print_number.py
agent1.channels.kafkaChannel.type = org.apache.flume.channel.kafka.KafkaChannel
agent1.channels.kafkaChannel.kafka.topic = dharmik-test
agent1.channels.kafkaChannel.parseAsFlumeEvent = false
agent1.channels.kafkaChannel.kafka.bootstrap.servers = <kafkaBroker:port>
agent1.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafkaSink.channel = kafkaChannel
agent1.sinks.kafkaSink.batchSize = 1
agent1.sinks.kafkaSink.topic = dharmik-test
agent1.sinks.kafkaSink.requiredAcks = 1
agent1.sinks.kafkaSink.kafka.topic.metadata.refresh.interval.ms = 1000
agent1.sinks.kafkaSink.brokerList = <kafkaBroker:port>
print_number.py just prints numbers between 0 and 100. You can check whether the data is properly produced by the kafkaSink using the console consumer CLI:
sh /usr/hdp/2.6.2.0-152/kafka/bin/kafka-console-consumer.sh --new-consumer --topic dharmik-test --bootstrap-server <kafkaBroker:port>
Please give it a try and let me know if you face any issues.