Member since: 05-02-2017
Posts: 360
Kudos Received: 65
Solutions: 22
My Accepted Solutions
Title | Views | Posted
---|---|---
| 13503 | 02-20-2018 12:33 PM
| 1532 | 02-19-2018 05:12 AM
| 1891 | 12-28-2017 06:13 AM
| 7191 | 09-28-2017 09:25 AM
| 12258 | 09-25-2017 11:19 AM
04-26-2017
11:08 PM
I can see the latest version of Java; I have checked it already. Java is available, and I have the environment variable set as well.
04-26-2017
08:03 PM
@Matt Clarke @Wynner I downloaded NiFi and ran run-nifi.bat from the command line. A new cmd window opens and throws "Error: Could not find or load main class Files\NiFi\nifi-1.1.2\bin\..\\logs". Any suggestion on how to get rid of this error?
04-26-2017
07:05 AM
Thanks @Matt Clarke and @Wynner. I will try it out and check if it works for me.
04-25-2017
06:22 PM
Hi @mqureshi This question just popped into my mind. I know that a table, once dropped, can't be retrieved. We do have replication in Hadoop, so if the data is deleted, I believe all the replicas will be deleted as well. Is there a way to retrieve the data if it is deleted or a managed Hive table is dropped? In case of failures or errors, replication helps us recover, but what is the possibility of retrieving the data when it is deleted by human mistake?
04-25-2017
06:17 PM
@PPR Reddy At present there is no way of retrieving it; you need to re-create the table. At least you still have your data, so I would say you are lucky! I like your idea of retrieving it, but Hive does not have any commit/rollback option; if it did, what you are expecting would be possible. For now, the only way is to re-create it.
04-25-2017
05:56 PM
@PPR Reddy I don't think there is any way to retrieve the table DDL after it is dropped. You need to re-create it.
04-24-2017
10:38 PM
1 Kudo
If the table has a primary key that identifies unique records, use that key to pull the data in chunks and load it into Hive. Sqoop works well for bulk imports, but when the data is very large it is not recommended to import it in one shot; it also depends on your source RDBMS. I ran into the same issue: I was able to import a 20 TB table from Teradata into Hive and it worked perfectly fine, but when the table size grew to 30 TB I could not import it in a single stretch. In such cases I import in multiple chunks, or import the table using the primary key as the split-by column and increase the number of mappers; that should hold good for your scenario as well. A sketch of such a chunked import is shown below.
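A minimal sketch of one chunked import run, assuming a hypothetical source table ORDERS with a numeric primary key ORDER_ID; the connection string, credentials, and key range are placeholders, not details from the question.

  # Sketch only: connection string, credentials, table and key names are
  # illustrative placeholders, not values from the original question.
  # --where pulls one slice of the key range per run; repeat with the next range.
  # --split-by uses the primary key so each mapper reads a disjoint slice,
  # and --num-mappers raises parallelism for larger chunks.
  sqoop import \
    --connect "jdbc:teradata://td-host/DATABASE=sales" \
    --username etl_user \
    --password-file /user/etl/.password \
    --table ORDERS \
    --where "ORDER_ID >= 0 AND ORDER_ID < 100000000" \
    --split-by ORDER_ID \
    --num-mappers 32 \
    --hive-import \
    --hive-table sales.orders

Re-run the same command with the next ORDER_ID range until the ranges together cover the whole table.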
04-24-2017
10:31 PM
Could someone help me with links to install NiFi on Windows so that it can access HDFS in the sandbox? Thanks in advance!
Labels:
- Apache Hadoop
- Apache NiFi
04-20-2017
05:56 PM
Hi @Simran Kaur, There is no way the first column can be treated as the column name. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file; even if the structure changes, there is no need to modify the old data, and new data can be inserted into the same Hive table. Points to note:
1. An external table has to be used.
2. You might need a stage table, in Avro/Parquet format, before loading into the external Hive table.
Steps:
1. Create the external table, with the columns you have, as Avro/Parquet.
2. Load the CSV into the stage table and then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2.
This way no manual work is needed to modify the existing data, as Avro by default will show 'null' for columns that exist in the table but not in the file; the only manual work is to drop and re-create the table DDL. Let me know if you need any details, and if you feel this answers your question, please accept the answer.
04-20-2017
10:26 AM
There is no way the first column can be treated as the column name. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file; even if the structure changes, there is no need to modify the old data, and new data can be inserted into the same Hive table. Points to note:
1. An external table has to be used.
2. You might need a stage table, in Avro/Parquet format, before loading into the external Hive table.
Steps:
1. Create the external table, with the columns you have, as Avro/Parquet.
2. Load the CSV into the stage table and then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2. This way no manual work is needed to modify the existing data, as Avro by default will show 'null' for columns that exist in the table but not in the file; the only manual work is to drop and re-create the table DDL. A short HiveQL sketch of these steps is shown below. Let me know if you need any details, and if you feel this answers your question, please accept the answer.
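To make the steps concrete, here is a minimal HiveQL sketch; the database, table, and column names and the HDFS paths (analytics.events, staging.events_csv, /data/analytics/events, ...) are illustrative assumptions, not details from the question.

  -- 1. External target table stored as Avro (Parquet works the same way).
  CREATE EXTERNAL TABLE analytics.events (
    id     STRING,
    name   STRING,
    amount STRING
  )
  STORED AS AVRO
  LOCATION '/data/analytics/events';

  -- 2. Stage table matching the raw CSV, then copy into the external table.
  CREATE TABLE staging.events_csv (
    id     STRING,
    name   STRING,
    amount STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE;

  LOAD DATA INPATH '/landing/events/file1.csv' INTO TABLE staging.events_csv;
  INSERT INTO TABLE analytics.events SELECT * FROM staging.events_csv;

  -- 3. When a new column appears, drop only the table definition and
  --    re-create it with the extra field; older files then show NULL
  --    for the new column.
  DROP TABLE analytics.events;
  CREATE EXTERNAL TABLE analytics.events (
    id      STRING,
    name    STRING,
    amount  STRING,
    country STRING
  )
  STORED AS AVRO
  LOCATION '/data/analytics/events';

  -- 4. Load the next file through the stage table as in step 2.

Because the table is external, dropping it in step 3 only removes the metadata; the Avro files already written stay on HDFS and are picked up again by the re-created table.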