Subject Author Views Posted
Running environment: 12 executors (8 GB each) with 8 cores. The data is roughly 20 GB (in Parquet) ne...
23 ‎11-21-2017 01:13 AM
In PySpark, using mapPartitionsWithIndex, I can iterate over each partition of an RDD and apply actio...
30 ‎11-21-2017 12:27 AM
Hello, I would like to retrieve the column names created after using PolynomialExpansion. My co...
33 ‎11-20-2017 09:45 AM
I have Hive partitions based on date (yyyy-mm-dd). I want to run a script every day that can delete a...
43 ‎11-19-2017 03:28 AM
Greetings, We upgraded Spark2 from 2.1 to 2.2 on Cloudera Hadoop 5.12.1. We have a job executing...
31 ‎11-18-2017 08:54 AM
I upgraded CDH from CDH5.8.3 to CDH5.10.2 last month on one of my clusters. We started noticing job...
37 ‎11-17-2017 10:36 AM
Hi, deployed Spark 2.2 on CDH 5.12. After the deployment and service restart, got the error below. ...
31 ‎11-11-2017 07:38 PM
After checking some resources in the community, we were advised to use sparklyr to support R in Spa...
95 ‎11-09-2017 07:10 AM
Hi, I have just installed Spark2 in CDH 5.13.0. After enabling spark.authenticate I get an err...
94 ‎11-06-2017 01:43 PM
I am running Spark on my PC in standalone mode. I am trying to load a 1.6 GB file as below ...
60 ‎11-05-2017 07:49 AM
  I have an existing table with data which is already partitioned and I am inserting data from a d...
78 ‎11-05-2017 07:25 AM
Hello, I'm trying to run a Spark SQL query which reads data from a Hive table, and it fails when ...
99 ‎11-03-2017 07:23 AM
Hi All, Currently I am working on an ETL project, which uses Spark largely for all of its computati...
100 ‎10-27-2017 09:37 AM
I installed the SPARK2 parcel/service (SPARK2-2.1.0.cloudera1-1.cdh5.7.0.p0.120904) on my cluster and...
98 ‎10-24-2017 04:00 PM
Dear community, I'm getting an error in PyCharm (CDH 5.8.0 and Spark 1.6.2) with the following ...
100 ‎10-23-2017 01:22 PM
Hi, When we try to run spark-submit on our Kerberos cluster, we get the error mentioned below. ...
99 ‎10-23-2017 10:30 AM
Hi, I want to merge small Avro files into one single Avro file. The code which I followed is comp...
99 ‎10-23-2017 05:25 AM
Hi, I'm working with Spark Streaming using Python and recently I started trying to use the package Gr...
99 ‎10-22-2017 04:53 PM
I have a cluster with Spark 2.2 on CDH 5.12 with RHEL and I am trying to set up IPython to use with...
90 ‎10-22-2017 05:22 AM
I am getting the following error at the YARN application level. Stack trace: ExitCodeException e...
99 ‎10-20-2017 07:18 AM