New Contributor
Posts: 2
Registered: 12-09-2014

Spark SQL

Hi,

I am new to Spark. I am trying to query an RDD using Spark SQL, and I am getting the following error:

The type scala.reflect.api.TypeTags$TypeTag cannot be resolved. It is indirectly referenced from required .class files

at this line:

JavaSQLContext sqlCtx = new JavaSQLContext(ctx);

I have downloaded the spark-sql_2.10 JAR from:

http://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10/1.1.1
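
For reference, a sketch of the equivalent Maven dependency for that artifact, matching the 1.1.1 version in the link (note that scala-reflect, the type named in the error, is not inside the spark-sql JAR itself; with a Maven build it would normally come in as a transitive dependency, whereas a hand-downloaded single JAR leaves it off the build path):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.1.1</version>
</dependency>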

 

 

import java.io.Serializable;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

import org.apache.spark.sql.api.java.JavaSQLContext;
import org.apache.spark.sql.api.java.JavaSchemaRDD;
import org.apache.spark.sql.api.java.Row;

public class SparkSQLApp {

    // JavaBean backing the "people" table; it must be public and serializable
    // so Spark can infer the schema from its getters.
    public static class Person implements Serializable {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("JavaSparkSQL");
        JavaSparkContext ctx = new JavaSparkContext(sparkConf);
        JavaSQLContext sqlCtx = new JavaSQLContext(ctx);

        System.out.println("=== Data source: RDD ===");
        // Load a text file and convert each line to a Java Bean.
        JavaRDD<Person> people = ctx.textFile("examples/src/main/resources/people.txt").map(
            new Function<String, Person>() {
                @Override
                public Person call(String line) {
                    String[] parts = line.split(",");

                    Person person = new Person();
                    person.setName(parts[0]);
                    person.setAge(Integer.parseInt(parts[1].trim()));

                    return person;
                }
            });

        // Apply a schema to an RDD of Java Beans and register it as a table.
        JavaSchemaRDD schemaPeople = sqlCtx.applySchema(people, Person.class);
        schemaPeople.registerTempTable("people");

        // SQL can be run over RDDs that have been registered as tables.
        JavaSchemaRDD teenagers = sqlCtx.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19");

        // The results of SQL queries are SchemaRDDs and support all the normal RDD operations.
        // The columns of a row in the result can be accessed by ordinal.
        List<String> teenagerNames = teenagers.map(new Function<Row, String>() {
            @Override
            public String call(Row row) {
                return "Name: " + row.getString(0);
            }
        }).collect();
        for (String name : teenagerNames) {
            System.out.println(name);
        }

        ctx.stop();
    }
}
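
For reference, with the stock people.txt that ships with the Spark examples (Michael, 29; Andy, 30; Justin, 19), I would expect the output:

=== Data source: RDD ===
Name: Justin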

 

 

Please help.

Cloudera Employee
Posts: 366
Registered: 07-29-2013

Re: Spark SQL

This sounds like you are somehow trying to run a Spark program on its own, without the Scala classes on the classpath; that is why the scala.reflect types cannot be resolved. You should instead run apps with spark-submit.
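
For example, a minimal sketch of such an invocation, assuming the application above is packaged into a hypothetical sparksql-app.jar:

spark-submit \
  --class SparkSQLApp \
  --master local[2] \
  sparksql-app.jar

Running through spark-submit puts the Spark assembly, including the Scala runtime classes, on the classpath for you.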