Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 483 | 06-04-2025 11:36 PM |
|  | 1013 | 03-23-2025 05:23 AM |
|  | 537 | 03-17-2025 10:18 AM |
|  | 2013 | 03-05-2025 01:34 PM |
|  | 1259 | 03-03-2025 01:09 PM |
03-18-2021 02:00 AM
@emeric Cool, so how will you send it safely? Check my LinkedIn; it should be easy to connect. 🙂
03-17-2021 02:22 PM
@emeric Twitter has made it difficult to register any app, so I am waiting for approval. Honestly, I couldn't stand the 200-word essay explaining what I intend to do, whether I am part of a government, etc. I just copied and pasted some text from a website, so I hope I pass the review 🙂 By Friday I should be good to go.
03-17-2021 12:01 AM
@emeric Could you try substituting the current value in flume.conf with each of the paths below?

hdfs://10.0.2.15:8020/user/flume/tweets/
hdfs://127.0.0.1:8020/user/flume/tweets/
hdfs://192.168.56.101:8020/user/flume/tweets/

Let me know.
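To be explicit, only the HDFS sink path needs to change between attempts; a minimal sketch, assuming the sink is still named HDFS as in the configuration I posted earlier:

# Try each candidate address in turn; only this line changes between attempts.
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://10.0.2.15:8020/user/flume/tweets/
# TwitterAgent.sinks.HDFS.hdfs.path = hdfs://127.0.0.1:8020/user/flume/tweets/
# TwitterAgent.sinks.HDFS.hdfs.path = hdfs://192.168.56.101:8020/user/flume/tweets/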
03-16-2021 01:26 PM
@emeric What is the output of the command below from the Quickstart sandbox CLI?

$ ifconfig

I think we are on the right path. If you don't succeed, I will download a sandbox tomorrow and try to reproduce your situation. Happy hadooping.
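As a shortcut, you can narrow the output to just the address lines; a minimal sketch, assuming the sandbox's primary interface is eth0 (adjust the interface name if yours differs):

# Print only the inet address lines for the primary interface
$ ifconfig eth0 | grep "inet"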
03-16-2021 04:48 AM
@emeric That looks like a hostname issue. This looks like the offending line:

TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/flume/tweets/

Can you replace quickstart.cloudera:8020/user/flume/tweets/ with <Sandbox-IP>:8020/user/flume/tweets/? Please let me know.
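In other words, the sink path line would become the following, where <Sandbox-IP> is a placeholder for the sandbox's actual IP address (for example, the one reported by ifconfig):

# Substitute the sandbox IP for the quickstart.cloudera hostname
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://<Sandbox-IP>:8020/user/flume/tweets/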
03-15-2021 01:49 PM
@ryu Try the following solution. Always note all the changes you make in case you need to revert them. Follow these steps to resolve the issue:
1. Open Ambari.
2. Go to TEZ / Configs / Advanced tez-site.
3. Locate the configuration tez.history.logging.service.class.
4. Replace the value org.apache.tez.dag.history.logging.ats.ATSV15HistoryLoggingService with the new value org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.
5. Save the configuration changes.
6. Restart all services that Ambari asks you to restart.

Then retry:

[root@test02 ~]# hive

Please revert.
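For reference, here is what that setting corresponds to in the underlying tez-site.xml; this is just a sketch for context, and on an Ambari-managed cluster the change should still be made through Ambari so it is not overwritten:

<!-- tez-site.xml: switch the Tez history logging service away from ATS v1.5 -->
<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>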
03-15-2021 01:36 PM
@totti1 This is all about HMS (Hive Metastore) metadata refreshing. Spark SQL caches Parquet metadata for better performance. When Hive metastore Parquet table conversion is enabled, the metadata of those converted tables is also cached. If these tables are updated by Hive or other external tools, you need to refresh them manually to ensure consistent metadata.

from os.path import expanduser, join
from pyspark.sql import SparkSession
from pyspark.sql import Row
# warehouse_location points to the default location for managed databases and tables
warehouse_location = 'spark-warehouse'
spark = SparkSession \
.builder \
.appName("Python Spark SQL Hive integration example") \
.config("spark.sql.warehouse.dir", warehouse_location) \
.enableHiveSupport() \
.getOrCreate()
# spark is an existing SparkSession
spark.sql("CREATE TABLE IF NOT EXISTS totti (key INT, value STRING)")
# Load some data here
spark.sql("LOAD DATA LOCAL INPATH 'path/to/the/table/totti.txt' INTO TABLE totti")
# Refresh Spark's cached metadata for the table (spark is the existing SparkSession)
spark.catalog.refreshTable("totti")
# Queries are expressed in HiveQL
spark.sql("SELECT * FROM totti").show() In the above example, you will need to connect to the database to create the table totti. Notice I run the refresh before the select so that the Metadata is invalidated and fetched from the databases else I will get no table found etc
03-15-2021 12:59 PM
@emeric Can you copy and paste the new flume.conf? For clarity I have split out the different parts.

Flow Diagram

Configuring the flume.conf

# Naming the components on the current agent (the Twitter source was added)
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS
# Configuring the source
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = <consumerKey>
TwitterAgent.sources.Twitter.consumerSecret = <consumerSecret>
TwitterAgent.sources.Twitter.accessToken = <accessToken>
TwitterAgent.sources.Twitter.accessTokenSecret = <accessTokenSecret>
TwitterAgent.sources.Twitter.keywords = <keyword>
# Configuring the sink
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://quickstart.cloudera:8020/user/flume/tweets/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 600
# Configuring the channel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
# Binding the source and sink to the channel
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.HDFS.channel = MemChannel
$ bin/flume-ng agent --conf ./conf/ -f /home/cloudera/flume.conf -n TwitterAgent

Please let me know if it runs successfully.
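Once the agent is up, a quick way to confirm that the sink is actually writing is to list the target directory; a minimal sketch, assuming the hdfs.path from the configuration above (the HDFS sink names its files FlumeData.* by default):

# New files should appear here as events are collected and rolled
$ hdfs dfs -ls /user/flume/tweets/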
03-15-2021 12:00 PM
@sandipkumar Think about it: Impala uses HMS, and the Hive metastore database is required for Impala to function. So if HMS is not running, no Impala query/job should be launched. Hope that helps.
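As a quick sanity check before launching Impala jobs, you can test whether the metastore service is reachable; a minimal sketch, where metastore-host is a placeholder and 9083 is the usual default HMS Thrift port (adjust both for your cluster):

# Exit status 0 means the Hive Metastore port is accepting connections
$ nc -z metastore-host 9083 && echo "HMS reachable" || echo "HMS not reachable"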
03-15-2021 11:49 AM
@ryu How is your cluster set up? How many nodes, and which HDP version? Are you running your HQL from the edge node? Give as much information as possible.