Member since: 01-03-2017
Posts: 181
Kudos Received: 44
Solutions: 24
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1852 | 12-02-2018 11:49 PM
| 2474 | 04-13-2018 06:41 AM
| 2043 | 04-06-2018 01:52 AM
| 2349 | 01-07-2018 09:04 PM
| 5696 | 12-20-2017 10:58 PM
12-12-2017
05:47 AM
Hi @Karan Alang, it looks like the column name "format" is a reserved word, which is causing the problem. Please exclude it from the selection and try again.
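If you do need that column, a minimal workaround sketch (assuming Hive with hive.support.quoted.identifiers=column, the default; the table name below is a placeholder) is to escape the reserved word with backticks:
SELECT `format` FROM my_table;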
12-11-2017
11:01 PM
1 Kudo
Hi @Sebastien F, A couple of things need to be considered to achieve this. The optimizer will not know, until execution time, which values "currentpartitiontable" (t2) is going to produce, so it cannot generate the optimal explain plan on its own. We therefore need to make sure the cost-based optimizer (CBO) is aware of the data profile. In this context we know that t2 is very small, hence the optimizer should opt for a map-side join using t2. MAPJOINs are processed by loading the smaller table into an in-memory hash map and matching keys against the larger table as it is streamed through. We can give Hive hints so that it is fully aware of the data demographics; there are two ways to do this.

With hints:

SELECT /*+ MAPJOIN(t2) */ COUNT(*)
FROM mypartitionedtable t1
INNER JOIN currentpartitiontable t2 ON t1.YEAR = t2.YEAR (and so on for the other partition columns)

With auto join conversion:

set hive.auto.convert.join=true;
-- When auto join is enabled, there is no longer a need to provide the map-join hints in the query.
-- The auto join option can be tuned with two configuration parameters:
set hive.auto.convert.join.noconditionaltask = true;
set hive.auto.convert.join.noconditionaltask.size = 10000000;

Also ensure that statistics are up to date so that the optimizer knows the data demographics. The other alternative is to pass the values programmatically (much easier to achieve in some cases): if the values are few (year, month and day), the simplest approach is to retrieve them programmatically and pass them as variables from the user program (this might not be possible in all cases, e.g. BI utilities). More on join optimization can be found in the Hive Language Manual. Hope this helps !!
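For the statistics step, a minimal sketch (assuming a reasonably recent Hive; the table name is taken from the query above):

ANALYZE TABLE currentpartitiontable COMPUTE STATISTICS;
ANALYZE TABLE currentpartitiontable COMPUTE STATISTICS FOR COLUMNS;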
12-11-2017
10:36 PM
2 Kudos
Hi @Karan Alang, For an external partitioned table, we need to update the partition metadata, as Hive will not be aware of these partitions unless they are explicitly registered. That can be done with either:

ALTER TABLE power_k1 RECOVER PARTITIONS;
-- or
MSCK REPAIR TABLE power_k1;

More on this can be found in the Hive DDL Language Manual. Hope this helps !!
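If only a handful of partitions are new, a hedged alternative is to register each one explicitly (the partition column dt and the location below are hypothetical):

ALTER TABLE power_k1 ADD PARTITION (dt='2017-12-11') LOCATION '/data/power_k1/dt=2017-12-11';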
12-06-2017
04:18 AM
2 Kudos
Hi @Mayan Nath, Can you please ensure "Strict Host Key Checking" is set to False, and also remove the known_hosts entries for the target host (under ~/.ssh/known_hosts for the nifi user)? The best thing to do is run the sftp manually from the command line first; once that is working, incorporate it into the NiFi cluster. (Please ensure that you have performed the same across all the hosts.) Hope this helps !!
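A minimal sketch of that manual test (host and user names are placeholders), run as the nifi user on each node:

ssh-keygen -R target-sftp-host
sftp sftpuser@target-sftp-host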
12-04-2017
01:32 AM
Hi @Julià Delos, There are two ways you can handle this scenario. Old way: read the files, split the JSON on a per-message basis, use EvaluateJsonPath to extract the attributes (id, type, login) with the values [$.id, $.type, $.actor.login], use a ReplaceText processor to replace the entire content of each message flowfile with ${id},${type},${login}, then concatenate the flowfiles and write the data; the same has been documented in an HCC KB article. New way: use the record reader and writer, which should convert your JSON automatically; more on this can be found on the NiFi blog. Hope this helps !!
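To illustrate the old way, a hypothetical input message and the line that ReplaceText would produce from it:

{"id": "123", "type": "PushEvent", "actor": {"login": "jdelos"}}
123,PushEvent,jdelos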
12-04-2017
01:19 AM
1 Kudo
Hi @Erkan ŞİRİN, From the error I can see this is an access issue: for MySQL, server2 is not allowing the user coming from server1 (where you initiated the sqoop connection, user@server1). The solution is to grant appropriate MySQL access to the user initiating the connection from server1:

GRANT <PRIV LIST> PRIVILEGES ON *.* TO '<user>'@'%' IDENTIFIED BY 'PASSWORD' WITH GRANT OPTION;
FLUSH PRIVILEGES;

% indicates all hosts; it is good practice to grant access to the range of cluster hosts, as the connect string you supply will be used on TaskTracker nodes throughout your MapReduce cluster (any node of your cluster may initiate the connection). Before you initiate the connection you can verify the communication (if you have the mysql client installed on server1) with:

mysql -h <Server2> -u <User1> -p

(this will prompt for the password). Hope this helps !!
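Once connected, you can also confirm which privileges the server actually granted you (a standard MySQL statement):

SHOW GRANTS FOR CURRENT_USER();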
11-26-2017
06:22 AM
Hi @John Doo, Apparently the job is unable to pick up the table from the ZooKeeper znode you have provided: you have given the HBase ZooKeeper znode information for Phoenix to use when retrieving the table information. Can you please check the Phoenix znode by changing it to just the ZooKeeper quorum? (You can get the precise value from the hbase-site.xml file, to validate whether your ZooKeeper is running on localhost or sandbox.hortonworks.com.) On another note, Phoenix column names are automatically converted to capital letters (if you choose to create a view on top of an HBase table), hence use capital letters on both sides (HBase and Phoenix); alternatively, you may use quotes to resolve this too. Hope this helps !!
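A minimal sketch of both points (the znode path /hbase-unsecure is an assumption based on a default HDP sandbox; the view and column names are placeholders):

sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure
SELECT "myColumn" FROM "my_view"; -- quoted identifiers keep their lower case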
10-31-2017
06:49 AM
Hi @Jacqualin jasmin, This is due to the SSL trust between the LDAP server and the Ranger host. For this, can you please import the LDAP certificate:

echo | openssl s_client -connect free-ipa-dev-01.uat.txdc.datastax.com:636 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /tmp/ldaps.pem

Once the file is extracted (/tmp/ldaps.pem), import it into the truststore of the usersync process:

keytool -import -alias "ldapserver" -file /tmp/ldaps.pem -keystore {value of your ranger.usersync.truststore.file (jks)} -storepass <changeit, or the other password you specified>

Then redo the test; that should fix the problem if the issue is with SSL. Alternatively, for testing purposes, you can use the non-SSL port on the IPA server, i.e. "ldap://free-ipa-dev-01.uat.txdc.datastax.com:389". Hope this helps !!
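You can verify the import afterwards (keystore path and password are whatever you used above):

keytool -list -keystore <ranger.usersync.truststore.file> -storepass <password> | grep ldapserver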
10-21-2017
01:03 PM
1 Kudo
Hi @Timothy Spann, That looks like an error caused by file system unavailability. Could you please check the file system size (that it is not 100% full), or, if it is on NAS/SAN, ensure it was not disconnected at that time (this can be verified from /var/log/messages)?
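A couple of hedged checks (the mount point /data is a placeholder):

df -h /data
grep -i 'error' /var/log/messages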
10-21-2017
12:41 PM
And please ensure that there are no non-numeric values (such as stray commas) in the salary column (if it is defined as Decimal/Integer in the database).