Member since 11-21-2017
			
      
70 Posts · 5 Kudos Received · 0 Solutions
			
    
	
		
		
01-14-2021 11:54 PM
Hi ravikirandasar1, I also have the same query. Could you please let me know how you automated this job using crontab to download the files to HDFS every day?
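For context, a minimal sketch of what such a crontab setup might look like; the script path, data paths, and schedule below are all hypothetical:

#!/bin/bash
# /home/user/ingest.sh -- copy the day's files into HDFS (paths are hypothetical)
hdfs dfs -put /data/incoming/*.csv /user/hive/landing/

# crontab -e entry: run the script every day at 01:00, appending output to a log
0 1 * * * /home/user/ingest.sh >> /var/log/ingest.log 2>&1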
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-10-2019 01:15 PM
I have currently set up HDF 3.3.1 with NiFi on a standalone machine. I want to go ahead and install HDFS for storage purposes. Can I work with the latest version of HDP? Please advise! @Matt Burgess
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
04-16-2018 11:03 AM
Hi Salvator, I'm facing the same problem. Did you find any solution for this?
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-23-2018 12:35 PM
		1 Kudo
		
	
				
		
	
		
					
							 @Ravikiran Dasari   Please accept the answer if it addresses your query 🙂 or let me know if you need any further information. 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-15-2018 12:00 PM
@Jay Kumar SenSharma Thank you. Do you have any idea about installing NiFi on an HDP cluster?
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-10-2018 03:10 PM
@Ravikiran Dasari Create a sqoop job for your import as:

sqoop job --create <job-name> -- import --connect "jdbc:sqlserver://10.21.29.15:1433;database=db;username=ReportingServices;password=ReportingServices" --check-column batchid --incremental append -m 1 --hive-table mmidwpresentation.journeypositions_archive --table JourneyPositions --hive-import --schema safedrive

Once you create the sqoop job, Sqoop stores the last value of batchid (the --check-column argument); whenever you run the job again, Sqoop pulls only the new records after that last saved value.

Sqoop job arguments:

$ sqoop job
  --create <job-name>   Define a new saved job with the specified job-id (name). A second Sqoop command line, separated by a --, should be specified; this defines the saved job.
  --delete <job-name>   Delete a saved job.
  --exec <job-name>     Given a job defined with --create, run the saved job.
  --show <job-name>     Show the parameters for a saved job.
  --list                List all saved jobs.

These are the arguments you can use with the sqoop job command to create, execute, list, and delete jobs. Use the --password-file option to set the path to a file containing the authentication password when creating sqoop jobs.
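As an illustration of the --password-file note, a sketch using a hypothetical local password file (the file should be protected with 400 permissions):

# keep the password out of the job definition: store it in a restricted file
echo -n "ReportingServices" > /home/user/.sqoop.pwd && chmod 400 /home/user/.sqoop.pwd

# same import as above, with --password-file instead of an inline password
sqoop job --create <job-name> -- import \
  --connect "jdbc:sqlserver://10.21.29.15:1433;database=db;username=ReportingServices" \
  --password-file file:///home/user/.sqoop.pwd \
  --table JourneyPositions --check-column batchid --incremental append -m 1 \
  --hive-import --hive-table mmidwpresentation.journeypositions_archive

# run the saved job; sqoop resumes from the last saved batchid value
sqoop job --exec <job-name>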
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-09-2018 08:58 AM
@Ravikiran Dasari: You can see all parameters for hive with "hive -H":

hive -H
usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
 -h <hostname>                    Connecting to Hive Server on remote host
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -p <port>                        Connecting to Hive Server on port number
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

You can add two or more tables to the same schema if they have different names (which will be the case if you use the timestamp). If you run your create script in parallel, you can always generate a new timestamp in case a table with the current one already exists. If needed, you can add the date stamp as well:

curr_timestamp=`date +%Y%m%d_%s`
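For illustration, a sketch of passing such a timestamped name into Hive with --hivevar (the table and column names here are made up):

curr_timestamp=`date +%Y%m%d_%s`
# hand the generated name to Hive as a variable
hive --hivevar tbl=staging_${curr_timestamp} \
     -e 'CREATE TABLE ${hivevar:tbl} (id INT, name STRING);'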
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
12-12-2017 01:08 AM
No. 840 GB means each node has almost 120 GB of RAM, and allocating all of it is not an ideal way to run the system, because each node needs some free memory for other services, such as OS processes and the agents used by Ambari. Start with 90 GB to 100 GB, then adjust gradually from there.
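Assuming the figure refers to the YARN NodeManager allocation, a sketch of the corresponding setting (normally edited via Ambari under YARN > Configs; the 96 GB value is just an example within the suggested range):

# yarn-site.xml on each worker node: memory YARN may hand out,
# leaving headroom for the OS and Ambari agents
yarn.nodemanager.resource.memory-mb=98304   # 96 GB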
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
12-06-2017 05:16 PM
Yes, continuously and automatically. By default it polls for new files every 60 seconds; you can shrink that. You can also convert those files to Apache ORC and auto-build new Hive tables on them if the files are CSV, TSV, Avro, Excel, JSON, XML, EDI, HL7, or C-CDA. Install Apache NiFi on an edge node; there are ways to combine HDP 2.6 and HDF 3 with the new Ambari, but it's easiest to start with a separate node for Apache NiFi. You can also just download NiFi, unzip it, and run it on a laptop that has JDK 8 installed: https://www.apache.org/dyn/closer.lua?path=/nifi/1.4.0/nifi-1.4.0-bin.zip
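As a rough sketch of the kind of flow being described (these are standard NiFi processors, but the concrete properties are assumptions):

ListFile (Run Schedule: 60 sec, watching the landing directory)
  -> FetchFile
  -> ConvertRecord (CSVReader -> AvroRecordSetWriter)
  -> ConvertAvroToORC (also emits a hive.ddl attribute for building the matching Hive table)
  -> PutHDFS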
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
12-14-2018 01:05 PM
Hi @Jordan Moore, what option would you suggest if you have 100 different SFTP sources and 10-15 files in each of them? Configuring individual NiFi processes is not an option here. I've played around with NiFi processors and they are not very good at working with parameters. Would Spark be a good solution for my case? Thanks, Farid
				
			
			
			
			
			
			
			
			
			
		 
         
					
				