Member since 10-02-2015

76 Posts
80 Kudos Received
8 Solutions
            
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2505 | 11-15-2016 03:28 PM |
| | 3804 | 11-15-2016 03:15 PM |
| | 2903 | 07-25-2016 08:03 PM |
| | 2285 | 05-11-2016 04:10 PM |
| | 4659 | 02-02-2016 08:09 PM |

12-08-2015 09:50 PM
I am trying to run regression on a dataset, but I ran into two issues:

1. When I try to split the dataset, which I imported from a text file, I get the following error:

   java.lang.NumberFormatException: For input string: "[34"

   That's because the text file has the data in the format: [x, y, z ...] [a, b, c ...]

2. So I tried to use Spark SQL to create a DataFrame that I can then convert to an RDD using xRDD = x.rdd, but I get a type mismatch error:

   found   : org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
   required: org.apache.spark.rdd.RDD[org.apache.spark.mllib.regression.LabeledPoint]

How should I resolve this?
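A minimal sketch of one way around both issues, assuming each bracket-delimited line holds a label followed by the feature values (the field layout and file path here are assumptions, not from the post): strip the brackets while parsing, build LabeledPoint records directly, and split the resulting RDD[LabeledPoint] with randomSplit, so no Row-to-LabeledPoint conversion is needed.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical input path; each line looks like "[34, 1.2, 0.5, ...]".
val raw = sc.textFile("hdfs:///tmp/regression-input.txt")

val points = raw.map { line =>
  // Remove the surrounding brackets first -- they are what makes
  // toDouble fail with NumberFormatException on "[34".
  val fields = line.stripPrefix("[").stripSuffix("]")
    .split(",").map(_.trim.toDouble)
  // Assumption: the first value is the label, the rest are features.
  LabeledPoint(fields.head, Vectors.dense(fields.tail))
}

// randomSplit works directly on RDD[LabeledPoint].
val Array(training, test) = points.randomSplit(Array(0.7, 0.3), seed = 42L)
```

If you do go through a DataFrame instead, df.rdd returns RDD[Row], so map each Row into a LabeledPoint (e.g. reading the label with row.getDouble(0)) before handing it to MLlib.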
						
					
Labels:
- Apache Spark
 
			
    
	
		
		
12-03-2015 08:58 PM

1 Kudo
I am attempting to use Sqoop to import a HANA table of size 180 TB compressed (800 TB on disk) into a Hive table. When I pass LIMIT in the query argument, the number of rows I get is four times the amount passed to LIMIT: a LIMIT of 250 fetched 1000 rows, and they are not duplicates.

Another issue I am facing is with fetch-size. When I pass the fetch size, the process errors out with the message "Search Limit exceeded".
						
					
Labels:
- Apache Sqoop
 
			
    
	
		
		
10-31-2015 02:28 AM

1 Kudo
			
				
						
Labels:
- Apache Ambari
 
			
    
	
		
		
10-27-2015 06:23 PM

1 Kudo
[root@M2MOCHDPK001 tmp]# rpm -ivh hive-odbc-native-2.0.5.1005-1.el6.x86_64.rpm
error: Failed dependencies:
        libsasl2.so.2()(64bit) is needed by hive-odbc-native-2.0.5.1005-1.x86_64

Currently the OS has the following library files:

[root@M2MOCHDPK001 tmp]# ls -lrta /usr/lib64/libsasl*
-rwxr-xr-x. 1 root root 121288 Jan 24  2014 /usr/lib64/libsasl2.so.3.0.0
lrwxrwxrwx. 1 root root     17 Aug 13 09:09 /usr/lib64/libsasl2.so.3 -> libsasl2.so.3.0.0
						
					
Labels:
- Apache Hive
 
			
    
	
		
		
10-23-2015 02:33 PM
See the attached stack trace: spark-phoenix-stack-trace.txt. I ran:

spark-submit --master yarn
						
					
    
	
		
		
10-22-2015 11:17 PM

3 Kudos
Trying to connect Spark with Phoenix using JDBC. I appended the location of phoenix-client.jar to SPARK_CLASSPATH in spark-env.sh. When I launch the Spark shell, I get the following errors:

<console>:10: error: not found: value sqlContext
       import sqlContext.implicits._
              ^
<console>:10: error: not found: value sqlContext
       import sqlContext.sql
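As a hedged workaround (not a root-cause fix): when spark-shell fails to create its implicit sqlContext, often a side effect of classpath conflicts that SPARK_CLASSPATH entries can introduce, one option is to construct the context by hand once the shell is up. This sketch assumes a Spark 1.x shell where sc exists; the ZooKeeper quorum and table name are placeholders, not values from the post.

```scala
// Build a SQLContext manually from the shell's SparkContext (sc),
// since the automatic one failed to initialize.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

// Hypothetical Phoenix read over JDBC; URL and table are placeholders.
val df = sqlContext.read.format("jdbc").options(Map(
  "url"     -> "jdbc:phoenix:zk-host:2181",
  "driver"  -> "org.apache.phoenix.jdbc.PhoenixDriver",
  "dbtable" -> "MY_TABLE"
)).load()
```

Passing the Phoenix client jar via --jars on spark-shell may also be cleaner than editing SPARK_CLASSPATH, which newer Spark 1.x releases warn is deprecated.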
						
					
Labels:
- Apache Phoenix
- Apache Spark
 
			
    
	
		
		
10-22-2015 02:29 PM
CREATE TABLE PDMS.PROTOCOL
(
  FLDNUM NUMBER NOT NULL,
  PROTOCOL VARCHAR2(130 CHAR),
  ON_PDMS VARCHAR2(130 CHAR),
  TITLE VARCHAR2(550 CHAR),
  CHAIRMAN NUMBER,
  MANAGER NUMBER,
  MULTICENTER_TRIAL VARCHAR2(130 CHAR),
  OB_BLIND_STUDY_YN VARCHAR2(130 CHAR),
  CREATE_DTM DATE,
  CREATE_USER VARCHAR2(30 BYTE),
  LAST_UPDATED_DTM DATE,
  LAST_UPDATE_USER VARCHAR2(30 BYTE),
  CONSTRAINT PROTOCOL_PK PRIMARY KEY (FLDNUM) ENABLE
)
						
					
    
	
		
		
10-22-2015 01:32 AM
After fixing the empty schema bug, when trying to pull data from an Oracle server to PutFile I get the following exception:

2015-10-20 15:59:59,859 ERROR [Timer-Driven Process Thread-9] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] failed to process due to org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence; rolling back session: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence
2015-10-20 15:59:59,860 ERROR [Timer-Driven Process Thread-9] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] failed to process session due to org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence
2015-10-20 15:59:59,860 WARN [Timer-Driven Process Thread-9] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] Processor Administratively Yielded for 1 sec due to processing failure
2015-10-20 15:59:59,860 WARN [Timer-Driven Process Thread-9] o.a.n.c.t.ContinuallyRunProcessorTask Administratively Yielding ExecuteSQL[id=f92d313d-fc87-42a9-ac24-ce7d9b6972c9] due to uncaught Exception: org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence
2015-10-20 15:59:59,864 WARN [Timer-Driven Process Thread-9] o.a.n.c.t.ContinuallyRunProcessorTask
org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence
  at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:296) ~[na:na]
  at org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:87) ~[na:na]
  at org.apache.nifi.processors.standard.ExecuteSQL$1.process(ExecuteSQL.java:142) ~[na:na]
  at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1937) ~[nifi-framework-core-0.3.0.jar:0.3.0]
  at org.apache.nifi.processors.standard.ExecuteSQL.onTrigger(ExecuteSQL.java:136) ~[na:na]
  at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-0.3.0.jar:0.3.0]
  at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1077) ~[nifi-framework-core-0.3.0.jar:0.3.0]
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:127) [nifi-framework-core-0.3.0.jar:0.3.0]
  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:49) [nifi-framework-core-0.3.0.jar:0.3.0]
  at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:119) [nifi-framework-core-0.3.0.jar:0.3.0]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79]
  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_79]
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_79]
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_79]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79]
  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.CharSequence
  at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:213) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:208) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:76) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:114) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:104) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66) ~[na:na]
  at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58) ~[na:na]
  at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:290) ~[na:na] 
						
					
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
 