Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 927 | 06-04-2025 11:36 PM |
| | 1529 | 03-23-2025 05:23 AM |
| | 757 | 03-17-2025 10:18 AM |
| | 2728 | 03-05-2025 01:34 PM |
| | 1808 | 03-03-2025 01:09 PM |
03-15-2021
12:59 PM
@emeric Can you copy and paste the new flume.conf? For clarity I have split it into the different parts.

Flow Diagram

Configuring the flume.conf

# Naming the components on the current agent
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
# Configuring the source
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = <consumerKey>
TwitterAgent.sources.Twitter.consumerSecret = <consumerSecret>
TwitterAgent.sources.Twitter.accessToken = <accessToken>
TwitterAgent.sources.Twitter.accessTokenSecret = <accessTokenSecret>
TwitterAgent.sources.Twitter.keywords = <keyword>
# Configuring the sink
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 600
# Configuring the channel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
# Binding the source and sink to the channel
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.HDFS.channel = MemChannel
$ bin/flume-ng agent --conf ./conf/ -f /home/cloudera/flume.conf -n TwitterAgent

Please let me know if it runs successfully.
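If it runs, a quick way to confirm events are landing is to list the target directory (a minimal check; FlumeData is the default hdfs.filePrefix, so adjust the pattern if you changed it):

# list the files Flume has rolled into the configured hdfs.path
hdfs dfs -ls /user/flume/tweets/
# peek at the first few events of the rolled files
hdfs dfs -cat /user/flume/tweets/FlumeData.* | head -n 5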
03-15-2021
12:00 PM
@sandipkumar Think about it: Impala uses HMS, so the Hive metastore database is required for Impala to function. If HMS is not running, then no Impala query/job should be launched. Hope that helps
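As a quick sanity check before launching Impala work, you can verify the metastore service is actually listening (a rough sketch; 9083 is the default HMS Thrift port, and the hostname is a placeholder for your metastore host):

# succeeds only if the Hive Metastore Thrift port is reachable
nc -z -w 5 <metastore-host> 9083 && echo "HMS is up" || echo "HMS is down - hold the Impala jobs"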
03-15-2021
11:49 AM
@ryu How is your cluster set up? How many nodes, and which HDP version? Are you running your HQL from the edge node? Give as much information as possible.
03-14-2021
12:04 PM
@Jay2021 Impala and Hive share the same metadata catalog, i.e. the Hive Metastore. When a database/table is created in Hive it is readily available to Hive users, but not to Impala! To successfully query a table or database created in Hive there is a caveat: you need to run INVALIDATE METADATA from the impala-shell before the table is available for Impala queries. INVALIDATE METADATA marks the metadata as stale, and the next time the current Impala node performs a query against that table it reloads all the metadata needed for the query; if you skip this step you will definitely run into errors. In the common case where you only add new data files to an existing table you could use REFRESH instead: it reloads the metadata immediately, but only loads the block location data for the newly added data files, making it a less expensive operation overall.

INVALIDATE METADATA [[db_name.]table_name]

Example

$ impala-shell
> INVALIDATE METADATA new_db_from_hive.new_table_from_hive;
> SHOW TABLES IN new_db_from_hive;
+---------------------+
| new_table_from_hive |
+---------------------+

That should resolve your issue. Happy hadooping
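One more thing: if later you only add new data files to that same table (from Hive or a direct HDFS put), REFRESH is sufficient and much cheaper. A minimal sketch reusing the same names:

# pick up the newly added files without a full metadata reload
impala-shell -q "REFRESH new_db_from_hive.new_table_from_hive"
impala-shell -q "SELECT COUNT(*) FROM new_db_from_hive.new_table_from_hive"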
03-13-2021
02:11 PM
1 Kudo
@SnehasishRSC REFRESH covers the common case where you add new data files to an existing table: it reloads the metadata immediately, but only loads the block location data for the newly added data files, making it a less expensive operation overall. It is recommended to run COMPUTE STATS once about 30% of the data in a table has been altered, where altered means the addition or deletion of files/data. INVALIDATE METADATA is a relatively expensive operation compared to the incremental metadata update done by the REFRESH statement, so in the common scenario of adding new data files to an existing table, prefer REFRESH. INVALIDATE METADATA marks the metadata for one or all tables as stale; the next time the Impala service performs a query against a table whose metadata is invalidated, Impala reloads the associated metadata before the query proceeds. Hope that helps
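To make the distinction concrete, a minimal sketch using impala-shell (the table name sales_db.orders is just an illustration):

# cheap, incremental: pick up data files just added to a table Impala already knows about
impala-shell -q "REFRESH sales_db.orders"
# recommended once roughly 30% of the table's data has been added or deleted
impala-shell -q "COMPUTE STATS sales_db.orders"
# expensive full reload: only when tables/databases were created or changed outside Impala
impala-shell -q "INVALIDATE METADATA sales_db.orders"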
03-01-2021
01:06 PM
@raghurok Bad news. As of February 1, 2021, all downloads of CDH and Cloudera Manager require a username and password and use a modified URL. You must use the modified URL, including the username and password, when downloading the Cloudera repository contents. Hope that helps
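For illustration, the paywalled URLs take the form below (the exact repo path is only an example; take the real path from the Cloudera download documentation for your version, and substitute the credentials from your Cloudera account):

# <username> and <password> are placeholders for your Cloudera account credentials
wget https://<username>:<password>@archive.cloudera.com/p/cm6/6.3.4/redhat7/yum/cloudera-manager.repo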
03-01-2021
12:58 PM
@ryu My advice is just don't attempt it, because the HDP software is closely wired together. Vigorous unit testing and compatibility checks are carried out before a version is certified. HDP is packaged software: when you update it's all or nothing, you can't update only a single component, except Ambari and the underlying databases for Hive, Oozie, Ranger etc. Yes, the good old days of real open source are gone. I loved HWX. If you are running production clusters then you definitely need a subscription. Hope that helps
03-01-2021
12:41 PM
@totti1 You will need to copy the HDFS core-site.xml to a local path accessible to your Windows machine. You will also need to update your hosts file entry to make the VM reachable from the Windows machine; you should be able to ping your VM from the Windows machine and vice versa. Edit the core-site.xml and hdfs-site.xml files and change the FQDN:8020 to an IP, e.g. for a class C network something like 192.168.10.201:8020, then restart the processors and let me know. Hope that helps?
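A rough sketch of those steps (the IP, hostname and paths are examples for a typical quickstart VM, not your actual values):

# On the Windows machine, add the VM to C:\Windows\System32\drivers\etc\hosts:
#   192.168.10.201   quickstart.cloudera
# then confirm the VM is reachable
ping 192.168.10.201

# In the copied core-site.xml, point the NameNode at the IP instead of the FQDN:
#   <property>
#     <name>fs.defaultFS</name>
#     <value>hdfs://192.168.10.201:8020</value>
#   </property>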
03-01-2021
10:24 AM
@Alex_IT From my Oracle knowledge, there are two options for migrating the same ORACLE_HOME [DB] from 12c to 19c. If you are running 12.1.0.2 then you have the direct upgrade path, see the attached matrix; with this option you won't need to change the hostname. The other option is to export your current schemas (CM, Oozie, Hive, Hue, Ranger etc.), install a fresh Oracle 19c box with an empty database, and import the old schemas. This could be a challenge as you might have to rebuild indexes or recompile some database packages etc., but both are doable. Hope that helps
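If you go the export/import route, Oracle Data Pump is the usual vehicle. A sketch of the idea (the schema names, directory object and credentials are placeholders to adapt to your environment):

# on the 12c source: export the CM/CDH service schemas
expdp system/<password> schemas=SCM,HIVE,HUE,OOZIE,RANGER directory=DATA_PUMP_DIR dumpfile=cdh_schemas.dmp logfile=cdh_exp.log

# on the fresh 19c target: import them into the empty database
impdp system/<password> schemas=SCM,HIVE,HUE,OOZIE,RANGER directory=DATA_PUMP_DIR dumpfile=cdh_schemas.dmp logfile=cdh_imp.log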