Member since: 05-02-2017
Posts: 360
Kudos Received: 65
Solutions: 22
My Accepted Solutions
Title | Views | Posted
---|---|---
| 13503 | 02-20-2018 12:33 PM
| 1532 | 02-19-2018 05:12 AM
| 1891 | 12-28-2017 06:13 AM
| 7191 | 09-28-2017 09:25 AM
| 12258 | 09-25-2017 11:19 AM
04-26-2017
11:08 PM
I can see the latest version of Java; I have checked it already. Java is available, and I have the environment variable set as well.
04-26-2017
08:03 PM
@Matt Clarke @Wynner I downloaded NiFi and ran run-nifi.bat from the command line. A new cmd window opens and throws "Error: Could not find or load main class Files\NiFi\nifi-1.1.2\bin\..\\logs". Any suggestion on how to get rid of this error?
04-26-2017
07:05 AM
Thanks @Matt Clarke and @Wynner. I will try it out and check if it works for me.
04-25-2017
06:22 PM
Hi @mqureshi This question just popped into my mind. I know that a table, once dropped, can't be retrieved. We do have replication in Hadoop, so if the data is deleted, I believe all the replicas will be deleted as well. Is there a way to retrieve the data if it is deleted or a managed Hive table is dropped? In case of failures or errors, replication helps us recover, but what is the possibility of retrieving the data when it is deleted by human mistake?
04-25-2017
06:17 PM
@PPR Reddy At present there is no way of retrieving it; you need to re-create the table. At least you still have your data, so I would say you are lucky! I like your idea of retrieving it, but Hive does not have any commit/rollback option; if it did, what you are expecting would be possible. For now, the only way is to re-create it.
04-25-2017
05:56 PM
@PPR Reddy I don't think there is any way to retrieve the table DDL after it is dropped. You need to re-create it.
04-24-2017
10:38 PM
1 Kudo
If the table has a primary key that identifies unique records, use that key to pull the data in chunks and load it into Hive. Sqoop works well for bulk imports, but when the data is very large it is not recommended to import it in one shot; it also depends on your source RDBMS. I ran into the same issue: I was able to import a 20 TB table from Teradata into Hive and it worked perfectly fine, but when the table size grew to 30 TB I could not import it in a single stretch. In such cases I import in multiple chunks, or import the table using the primary key as the split-by column and increase the number of mappers; that should hold good for your scenario as well. A sketch of such a chunked import is shown below.
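A minimal sketch of one chunked import run, assuming a hypothetical source table ORDERS with a numeric primary key ORDER_ID; the connection string, credentials, and key range are placeholders, not details from the question.

  # Sketch only: connection string, credentials, table and key names are
  # illustrative placeholders, not values from the original question.
  # --where pulls one slice of the key range per run; repeat with the next range.
  # --split-by uses the primary key so each mapper reads a disjoint slice,
  # and --num-mappers raises parallelism for larger chunks.
  sqoop import \
    --connect "jdbc:teradata://td-host/DATABASE=sales" \
    --username etl_user \
    --password-file /user/etl/.password \
    --table ORDERS \
    --where "ORDER_ID >= 0 AND ORDER_ID < 100000000" \
    --split-by ORDER_ID \
    --num-mappers 32 \
    --hive-import \
    --hive-table sales.orders

Re-run the same command with the next ORDER_ID range until the ranges together cover the whole table.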
04-24-2017
10:31 PM
Could someone help me with links to install NiFi on Windows so that it can access HDFS in the sandbox? Thanks in advance!
Labels:
- Apache Hadoop
- Apache NiFi
04-20-2017
05:56 PM
Hi @Simran Kaur, There is no way the first column can be treated as the column name. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file; even if the structure changes, there is no need to modify the old data, and new data can be inserted into the same Hive table. Points to note:
1. An external table has to be used.
2. You might need a stage table, in Avro/Parquet format, before loading into the external Hive table.
Steps:
1. Create the external table, with the columns you have, as Avro/Parquet.
2. Load the CSV into the stage table and then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2.
This way no manual work is needed to modify the existing data, as Avro by default will show 'null' for columns that exist in the table but not in the file; the only manual work is to drop and re-create the table DDL. Let me know if you need any details, and if you feel this answers your question, please accept the answer.
04-20-2017
10:26 AM
There is no way the first column can be treated as the column name. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file; even if the structure changes, there is no need to modify the old data, and new data can be inserted into the same Hive table. Points to note:
1. An external table has to be used.
2. You might need a stage table, in Avro/Parquet format, before loading into the external Hive table.
Steps:
1. Create the external table, with the columns you have, as Avro/Parquet.
2. Load the CSV into the stage table and then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2. This way no manual work is needed to modify the existing data, as Avro by default will show 'null' for columns that exist in the table but not in the file; the only manual work is to drop and re-create the table DDL. A short HiveQL sketch of these steps is shown below. Let me know if you need any details, and if you feel this answers your question, please accept the answer.
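To make the steps concrete, here is a minimal HiveQL sketch; the database, table, and column names and the HDFS paths (analytics.events, staging.events_csv, /data/analytics/events, ...) are illustrative assumptions, not details from the question.

  -- 1. External target table stored as Avro (Parquet works the same way).
  CREATE EXTERNAL TABLE analytics.events (
    id     STRING,
    name   STRING,
    amount STRING
  )
  STORED AS AVRO
  LOCATION '/data/analytics/events';

  -- 2. Stage table matching the raw CSV, then copy into the external table.
  CREATE TABLE staging.events_csv (
    id     STRING,
    name   STRING,
    amount STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE;

  LOAD DATA INPATH '/landing/events/file1.csv' INTO TABLE staging.events_csv;
  INSERT INTO TABLE analytics.events SELECT * FROM staging.events_csv;

  -- 3. When a new column appears, drop only the table definition and
  --    re-create it with the extra field; older files then show NULL
  --    for the new column.
  DROP TABLE analytics.events;
  CREATE EXTERNAL TABLE analytics.events (
    id      STRING,
    name    STRING,
    amount  STRING,
    country STRING
  )
  STORED AS AVRO
  LOCATION '/data/analytics/events';

  -- 4. Load the next file through the stage table as in step 2.

Because the table is external, dropping it in step 3 only removes the metadata; the Avro files already written stay on HDFS and are picked up again by the re-created table.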