PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Expert Contributor

Hi,

I am new to Pig.

I am trying to load a Kafka topic into Pig using

tweets = LOAD 'kafka://hostname:6667/tweets' USING PigStorage();

but it fails with this error:

java.lang.RuntimeException: Failed to create DataStorage

java.io.IOException: No FileSystem for scheme: kafka

I registered the following jar files before running this command:

REGISTER '/root/PIG/kafka-hadoop-producer-0.8.1.2.2.0.0-2041.jar'

REGISTER '/usr/hdp/current//pig-client/lib/avro-1.7.5.jar'

REGISTER '/usr/hdp/current//pig-client/lib/piggybank.jar'

REGISTER '/usr/hdp/current/kafka-broker/libs/kafka_2.10-0.9.0.2.3.4.7-4.jar'

REGISTER '/usr/hdp/current//pig-client/lib/json-simple-1.1.jar'

REGISTER '/usr/hdp/current//pig-client/lib/piggybank.jar'

I also tried another command:

tweets = LOAD 'kafka://hostname:6667/tweets' USING org.apache.pig.piggybank.storage.JsonLoader();

but I still get the same error.

Please point out what I am doing wrong.

Thanks.

Mohan.V

Re: PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Super Guru
@Mohan V

You are consuming data from Kafka into Hadoop using Pig, right? I think you need the Kafka Hadoop consumer on your classpath: the jar file should be kafka-hadoop-consumer, not the producer. You should be able to download it from the following Maven repository.

https://mvnrepository.com/artifact/org.apache.kafka/kafka-hadoop-consumer/0.8.1

Re: PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Expert Contributor

Thanks for your reply.

I did what you suggested: I downloaded the jar and registered it. But when I ran the command

tweets = LOAD 'kafka://localhost:6667/tweets' USING PigStorage();

the same error occurred:

Caused by: java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:53)
    at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:109)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:189)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:538)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    ... 16 more
Caused by: java.io.IOException: No FileSystem for scheme: kafka
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:70)
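
Reading the trace: PigStorage.getSchema goes through HDataStorage.init into FileSystem.get, and the failure is raised from FileSystem.getFileSystemClass. That suggests the LOAD location is being treated as a Hadoop filesystem URI, and that nothing is registered for the kafka scheme. The Python sketch below is only a toy model of that kind of scheme lookup, not Hadoop's actual code; the KNOWN_SCHEMES set and the resolve_scheme function are hypothetical names made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical registry: stands in for the URI schemes a Hadoop client
# happens to have a FileSystem implementation for. Not Hadoop's real list.
KNOWN_SCHEMES = {"hdfs", "file"}


def resolve_scheme(location, default_scheme="hdfs"):
    """Return the scheme used for a location, or fail like the trace does."""
    # A location without a scheme falls back to the (assumed) default.
    scheme = urlparse(location).scheme or default_scheme
    if scheme not in KNOWN_SCHEMES:
        # Mirrors the message in the stack trace above.
        raise IOError("No FileSystem for scheme: " + scheme)
    return scheme


if __name__ == "__main__":
    print(resolve_scheme("hdfs:///tmp/tweets"))
    try:
        resolve_scheme("kafka://hostname:6667/tweets")
    except IOError as err:
        print(err)
```

If the model is right, registering more jars would change the outcome only if one of them supplies a filesystem (or a loader that does not go through FileSystem at all) for kafka:// locations; whether the consumer jar does that is not established in this thread.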
