Member since: 12-21-2016
Posts: 83
Kudos Received: 5
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 13676 | 02-08-2017 05:56 AM
 | 3246 | 01-02-2017 11:05 PM
12-20-2020
04:38 PM
How to add a new column to an existing Parquet table, and how to update it?
04-24-2020
09:32 AM
Traceback (most recent call last):
  File "consumer.py", line 8, in <module>
    consumer = KafkaConsumer('test',
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/consumer/group.py", line 355, in __init__
    self._client = KafkaClient(metrics=self._metrics, **self.config)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/client_async.py", line 242, in __init__
    self.config['api_version'] = self.check_version(timeout=check_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/client_async.py", line 907, in check_version
    version = conn.check_version(timeout=remaining, strict=strict, topics=list(self.config['bootstrap_topics_filter']))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 1228, in check_version
    if not self.connect_blocking(timeout_at - time.time()):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 337, in connect_blocking
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 426, in connect
    if self._try_handshake():
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 505, in _try_handshake
    self._sock.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)

I am getting the above error after running the program. Any inputs?
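The error means Python's default trust store does not contain the cluster's self-signed CA. A hedged sketch of one way to address it, assuming you can export the broker's CA chain to a PEM file ("ca.pem" below is a placeholder path); disabling verification is shown only as a test-time fallback:

```python
import ssl

# Build an SSL context for kafka-python that trusts a self-signed CA.
# "ca.pem" is a placeholder: export the broker's CA certificate chain
# to that file first. Disabling verification is a test-only fallback.
def make_kafka_ssl_context(cafile=None, verify=True):
    ctx = ssl.create_default_context(cafile=cafile)
    if not verify:
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx

# Pass the context to the consumer: KafkaConsumer(..., ssl_context=ctx)
ctx = make_kafka_ssl_context(verify=False)
print(ctx.verify_mode == ssl.CERT_NONE)  # True
```

Alternatively, kafka-python accepts `ssl_cafile='ca.pem'` directly instead of a hand-built context.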
04-23-2020
10:35 AM
WebHDFS is disabled for our cluster. Are there any other options?
04-21-2020
04:41 PM
Hi,
I am trying to connect from my local machine to a Kerberized Kafka cluster using Python, but I am unable to connect with the credentials below. Could anyone guide me? Your help is appreciated.
consumer = KafkaConsumer(
    'test',
    bootstrap_servers='XXX:1234',
    # client_id='kafka-python-' + __version__,
    request_timeout_ms=30000,
    connections_max_idle_ms=9 * 60 * 1000,
    reconnect_backoff_ms=50,
    reconnect_backoff_max_ms=1000,
    max_in_flight_requests_per_connection=5,
    receive_buffer_bytes=None,
    send_buffer_bytes=None,
    # socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)],
    sock_chunk_bytes=4096,  # undocumented experimental option
    sock_chunk_buffer_count=1000,  # undocumented experimental option
    retry_backoff_ms=100,
    metadata_max_age_ms=300000,
    security_protocol='SASL_SSL',
    ssl_context=None,
    ssl_check_hostname=True,
    ssl_cafile=None,
    ssl_certfile=None,
    ssl_keyfile=None,
    ssl_password=None,
    ssl_crlfile=None,
    api_version=None,
    api_version_auto_timeout_ms=2000,
    # selector=selectors.DefaultSelector,
    sasl_mechanism='GSSAPI',
    # sasl_plain_username=None,
    # sasl_plain_password='XXX',
    sasl_kerberos_service_name='XXX',
    # metrics configs
    metric_reporters=[],
    metrics_num_samples=2,
    metrics_sample_window_ms=30000,
)
for msg in consumer:
    print(msg)
Please guide me; your help is appreciated.
Thanks
04-20-2020
04:37 PM
Hi,
I am trying to connect to and authenticate with a Kerberized cluster using a Python program and read HDFS files. Could anyone help me achieve this?
Your help is appreciated.
Thanks
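A hedged sketch of one route, assuming pyarrow with Hadoop's native client libraries is installed and a Kerberos ticket already exists from `kinit`. The host, port, and path are placeholders, and the `connect` hook is only there so the sketch can be exercised without a live cluster:

```python
import io

# Read an HDFS file through pyarrow's libhdfs binding, which reuses
# the Kerberos ticket cache created by `kinit`.
def read_hdfs_file(host, port, path, connect=None):
    if connect is None:
        # Deferred import: requires pyarrow plus Hadoop native libs.
        from pyarrow import fs
        connect = lambda h, p: fs.HadoopFileSystem(h, p)
    hdfs = connect(host, port)
    with hdfs.open_input_stream(path) as f:
        return f.read()

# Exercising the sketch with a stand-in file system:
class FakeFS:
    def open_input_stream(self, path):
        return io.BytesIO(b'hello')

print(read_hdfs_file('namenode.example.org', 8020, '/tmp/x',
                     connect=lambda h, p: FakeFS()))  # b'hello'
```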
04-20-2020
03:17 PM
Hi, I am trying to connect from my local machine to a Kerberized Kafka cluster through Python as a client. Could you please let me know which properties to include along with the bootstrap server?

consumer = KafkaConsumer(
    'test',
    bootstrap_servers='XXX.ORG:XXXX',
    # client_id='kafka-python-' + __version__,
    request_timeout_ms=30000,
    connections_max_idle_ms=9 * 60 * 1000,
    reconnect_backoff_ms=50,
    reconnect_backoff_max_ms=1000,
    max_in_flight_requests_per_connection=5,
    receive_buffer_bytes=None,
    send_buffer_bytes=None,
    # socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)],
    sock_chunk_bytes=4096,  # undocumented experimental option
    sock_chunk_buffer_count=1000,  # undocumented experimental option
    retry_backoff_ms=100,
    metadata_max_age_ms=300000,
    security_protocol='SASL_SSL',
    ssl_context=None,
    ssl_check_hostname=True,
    ssl_cafile=None,
    ssl_certfile=None,
    ssl_keyfile=None,
    ssl_password=None,
    ssl_crlfile=None,
    api_version=None,
    api_version_auto_timeout_ms=2000,
    # selector=selectors.DefaultSelector,
    sasl_mechanism='GSSAPI',
    # sasl_plain_username=None,
    # sasl_plain_password='XXXX',
    sasl_kerberos_service_name='XXXX',
    # metrics configs
    metric_reporters=[],
    metrics_num_samples=2,
    metrics_sample_window_ms=30000,
)

Your help is appreciated. Thanks
04-20-2020
03:07 PM
Hi All, I am trying to connect from my local machine to a Kerberized Kafka cluster through Python. Can anyone help with the properties to specify in the krb5.conf file, and any other properties? Your help is appreciated.
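A hedged sample of the client-side krb5.conf; the realm, KDC host, and domains below are placeholders for your cluster's values. Point the `KRB5_CONFIG` environment variable at the file if it is not in the default location:

```
[libdefaults]
    default_realm = EXAMPLE.ORG
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    forwardable = true

[realms]
    EXAMPLE.ORG = {
        kdc = kdc.example.org
        admin_server = kdc.example.org
    }

[domain_realm]
    .example.org = EXAMPLE.ORG
```

On the Python side, kafka-python then needs `security_protocol='SASL_SSL'`, `sasl_mechanism='GSSAPI'`, and `sasl_kerberos_service_name` set to the brokers' Kerberos service name, plus a ticket obtained with `kinit`.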
12-12-2019
04:10 PM
We even ran MSCK REPAIR TABLE, but still no luck. Any other options?
12-12-2019
02:17 PM
I am unable to create an external Hive table after manually deleting the table's underlying files at its HDFS location.
When a DESCRIBE statement is issued, it returns the table description, but when a SELECT is performed on the table, we get "table doesn't exist". So we issued a DROP statement.
After issuing the DROP statement, we tried to create the table again, but we get "table already exists". Do we need to manually delete it from the Hive metastore, or is there any way to forcefully re-create the table? Please let me know.
09-13-2018
12:24 PM
I tried with both formats (Avro and Parquet), but no luck.
09-13-2018
11:31 AM
If we use a query, the problem is that we would be applying the replacement function to only one column, and at times these kinds of characters can come in other columns as well. At runtime, these characters can appear in any column. Using an argument would apply the handling to the entire table instead of just one specific column, so it would be nice to have an argument that applies to the whole table. Thanks, Praveen
09-13-2018
11:20 AM
How to escape the ^M character in data while doing a Sqoop import from an Oracle database?
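One hedged option (connection details below are placeholders): Sqoop's Hive delimiter options drop or replace `\n`, `\r` (the ^M character), and `\01` inside string fields during import, e.g.:

```
sqoop import \
  --connect jdbc:oracle:thin:@db.example.org:1521/ORCL \
  --username scott -P \
  --table EMPLOYEES \
  --hive-import \
  --hive-drop-import-delims
```

`--hive-delims-replacement <string>` is the variant that substitutes a replacement string instead of dropping the characters.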
08-17-2017
05:21 PM
I am getting an error while trying to check whether an HDFS directory exists. I am checking it through an Oozie fs action; below is the code. Appreciate any help on this.

</action>
<decision name="deleteFrompraveenPostCondition">
    <switch>
        <case to="Export">${fs:exists(/dev/praveen/test/*)}</case>
        <default to="statusLog"/>
    </switch>
</decision>

ERROR: Encountered "/", expected one of [<INTEGER_LITERAL>, <FLOATING_POINT_LITERAL>, <STRING_LITERAL>, "true", "false", "null", "(", ")", "-", "not", "!", "empty", <IDENTIFIER>]
08-17-2017
05:15 PM
I am getting an error while trying to check whether an HDFS directory exists. I am checking it through an Oozie fs action; below is the code. Appreciate any help on this.

</action>
<decision name="deleteFrompraveenPostCondition">
    <switch>
        <case to="Export">${fs:exists(/dev/praveen/test/*)}</case>
        <default to="statusLog"/>
    </switch>
</decision>

ERROR: Encountered "/", expected one of [<INTEGER_LITERAL>, <FLOATING_POINT_LITERAL>, <STRING_LITERAL>, "true", "false", "null", "(", ")", "-", "not", "!", "empty", <IDENTIFIER>]
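The EL parse error (Encountered "/") arises because `fs:exists()` takes its path as a quoted string literal; also, as far as I know, the function checks a single path rather than a glob. A hedged sketch of the corrected decision node, with the wildcard dropped (you may also need the full `hdfs://` URI depending on your Oozie setup):

```
<decision name="deleteFrompraveenPostCondition">
    <switch>
        <case to="Export">${fs:exists('/dev/praveen/test')}</case>
        <default to="statusLog"/>
    </switch>
</decision>
```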
08-17-2017
10:41 AM
I am getting an error while trying to check whether an HDFS directory exists. I am checking it through an Oozie fs action; below is the code. Appreciate any help on this.

</action>
<decision name="deleteFrompraveenPostCondition">
    <switch>
        <case to="Export">${fs:exists(/dev/praveen/test/*)}</case>
        <default to="statusLog"/>
    </switch>
</decision>

ERROR: Encountered "/", expected one of [<INTEGER_LITERAL>, <FLOATING_POINT_LITERAL>, <STRING_LITERAL>, "true", "false", "null", "(", ")", "-", "not", "!", "empty", <IDENTIFIER>]
07-21-2017
07:31 PM
Hive: I would like to calculate the percentage of each value in a column and, based on that percentage, load the data into another table (if the percentage of 'n' is less than 20%), or else not load it.

colA
y
y
y
n
------------------
Output (this is what I am expecting):
y 80%
n 20%
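In Hive this is typically a GROUP BY with a window-function total, e.g. `count(*) * 100.0 / sum(count(*)) over ()`. The same logic sketched in Python for illustration (note that the four sample rows shown work out to 75%/25%; the 80%/20% presumably reflects a larger dataset):

```python
from collections import Counter

# Percentage of each distinct value in a column, mirroring the Hive
# query: select colA, count(*) * 100.0 / sum(count(*)) over ()
#        from t group by colA
def percentages(values):
    total = len(values)
    return {v: round(100 * c / total) for v, c in Counter(values).items()}

print(percentages(['y', 'y', 'y', 'n']))  # {'y': 75, 'n': 25}
```

The load-or-skip decision could then compare the 'n' percentage against the 20% threshold before running the INSERT.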
05-17-2017
06:22 AM
Can you elaborate in more detail? I am facing the same issue, and when I checked I see java-json.jar in the Oozie shared lib path; however, I don't see it in the sqoop-client/lib path on the gateway.
04-27-2017
06:24 PM
Replication is for data-node failure. When a human deletes the data, the data is lost wherever it resides, on any number of nodes; it is moved into the trash, and if needed we can get it back within a certain time interval.
04-25-2017
06:07 PM
Thanks, and yes, I can re-write it; however, I am looking for any way to get it back. When I drop the table, the commit to the metastore happens immediately, which might be why the Hive table schema cannot be recovered. Any other alternative options?
04-25-2017
05:31 PM
I have a Hive external table, and unfortunately the schema of the table got dropped; I want to get the schema back. Is there any way to do so? I do understand that HDFS is a file system; however, I am trying to see if there are any possibilities.
- Tags:
- Hadoop Core
- HDFS
- hiverserver2
03-10-2017
06:41 PM
Thanks. This is done at the folder level of encryption; however, I am looking at field-level encryption rather than encrypting the entire file. I know Ranger has this feature, but that only helps with Hive column-level encryption when I query it; when I look at the raw file, I can still see the sensitive data.
03-09-2017
09:55 PM
I have a requirement to encrypt certain sensitive data before landing/ingestion into Hadoop. I just want to understand how Hadoop processes this kind of encrypted data (be it in Hive, Pig, or any MapReduce). Do we need to write specific programs to read these kinds of encrypted files in Hadoop, or do we need to set any parameters on the Hive table or Pig session to read them? Any ideas, thoughts, or suggestions?
- Tags:
- Data Processing
- Mapreduce
- Pig
- Sqoop
02-16-2017
03:15 AM
I found a solution to export this kind of data to any RDBMS in UTF-8 or any other character set, by giving the specific character set after the database/host name.
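For example, with the Teradata JDBC driver the character set goes into the connection URL as the `CHARSET` parameter (host and database names below are placeholders):

```
jdbc:teradata://td.example.org/DATABASE=mydb,CHARSET=UTF8
```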
02-15-2017
11:40 PM
Yes, it is displaying the special characters in a readable format after adding the serialization encoding property. However, while exporting the data to Teradata with a Sqoop statement using a connection manager, I get non-readable characters in Teradata. Attached is the screenshot (teradat.png). I suspect Sqoop is not recognizing the special characters correctly, or do I need to use any specific Teradata JARs while exporting the data? I have attached the ingested data (after-ingestion-data-into-hadoop.png) and the data shown in Hive after adding the encoding property (after-adding-encoding-to-hive-table.png); the same data does not look the same in Teradata. I would like to see the same characters in Teradata as well. Any help appreciated.
02-14-2017
07:47 PM
I have a requirement to handle a file which contains special characters (like trademarks, non-UTF, and so on).
02-11-2017
12:21 AM
Could you let me know how to handle it if the data is not in quotes? Below is an example:

column 1|column 2
first|second|last

In the above example, first|second is actually one column. Could you let me know how to handle it when the data is not in quotes and the delimiter is part of the data? Any suggestion or help is appreciated.
02-09-2017
03:03 AM
Thanks, and yeah, the OpenCSV SerDe will do it. However, I am looking to see if there are any other alternatives.
02-08-2017
06:35 AM
In Hive, one of my columns contains a pipe ('|') as part of the data; however, while exporting data from this table, we need to use the pipe ('|') as the delimiter between fields. How do we handle delimiters that are part of the data while creating a flat file from the Hive table?
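One common approach, sketched here in Python under the assumption that the consumer of the flat file can honor backslash escapes (Hive's own equivalent is the `ESCAPED BY` clause of `ROW FORMAT DELIMITED`): escape embedded delimiters before writing so field boundaries stay unambiguous.

```python
# Escape a field so that an embedded '|' (or the escape character
# itself) survives a pipe-delimited export unambiguously.
def escape_field(s, delim='|', esc='\\'):
    return s.replace(esc, esc + esc).replace(delim, esc + delim)

print(escape_field('a|b'))  # a\|b
```

The reader then splits on unescaped pipes only, which restores the original fields.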
02-08-2017
06:04 AM
Just curious: why does Sqoop not allow us to create an external table while sqooping data from an RDBMS?