PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Expert Contributor

Hi,

I am new to Pig.

I am trying to load a Kafka topic into Pig using

tweets = LOAD 'kafka://hostname:6667/tweets' USING PigStorage();

but it fails with this error:

java.lang.RuntimeException: Failed to create DataStorage

java.io.IOException: No FileSystem for scheme: kafka

I registered the following jar files before running this command:

REGISTER '/root/PIG/kafka-hadoop-producer-0.8.1.2.2.0.0-2041.jar'

REGISTER '/usr/hdp/current//pig-client/lib/avro-1.7.5.jar'

REGISTER '/usr/hdp/current//pig-client/lib/piggybank.jar'

REGISTER '/usr/hdp/current/kafka-broker/libs/kafka_2.10-0.9.0.2.3.4.7-4.jar'

REGISTER '/usr/hdp/current//pig-client/lib/json-simple-1.1.jar'

REGISTER '/usr/hdp/current//pig-client/lib/piggybank.jar'

I also tried another command:

tweets = LOAD 'kafka://hostname:6667/tweets' USING org.apache.pig.piggybank.storage.JsonLoader();

but I still get the same error.

Please point out what I am doing wrong.

Thanks.

Mohan.V

Re: PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Super Guru
@Mohan V

You are consuming data from Kafka into Hadoop using Pig, right? I think you need the Kafka Hadoop consumer on your classpath: the jar should be kafka-hadoop-consumer, not kafka-hadoop-producer. You should be able to download it from the following Maven repository.

https://mvnrepository.com/artifact/org.apache.kafka/kafka-hadoop-consumer/0.8.1

Re: PIG LOAD ERROR NO FILESYSTEM SCHEME FOR KAFKA

Expert Contributor

Thanks for your reply.

I did what you suggested.

I downloaded the jar and registered it, but when I ran the command

tweets = LOAD 'kafka://localhost:6667/tweets' USING PigStorage();

the same error occurred:

Caused by: java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:53)
    at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:109)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:189)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:538)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
    at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    ... 16 more
Caused by: java.io.IOException: No FileSystem for scheme: kafka
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2644)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:70)
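
For anyone reading the trace above: the frames show PigStorage.getSchema handing the LOAD location to HDataStorage, which calls Hadoop's FileSystem.get, and that call fails because it finds no filesystem for the "kafka" scheme. The Python sketch below is only a simplified model of that lookup, not Hadoop's actual code. The function name resolve_filesystem, the REGISTERED_SCHEMES table and the namenode address are hypothetical and exist purely for illustration.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the schemes a Hadoop installation knows about.
# The class names are real Hadoop classes, but this table is illustrative only.
REGISTERED_SCHEMES = {
    "hdfs": "org.apache.hadoop.hdfs.DistributedFileSystem",
    "file": "org.apache.hadoop.fs.LocalFileSystem",
}


def resolve_filesystem(location, conf=None):
    """Model of the scheme lookup: explicit fs.<scheme>.impl setting first,
    then the registered schemes, otherwise fail like the trace does."""
    conf = conf or {}
    scheme = urlparse(location).scheme
    impl = conf.get("fs.%s.impl" % scheme) or REGISTERED_SCHEMES.get(scheme)
    if impl is None:
        # Same message as the java.io.IOException in the trace.
        raise IOError("No FileSystem for scheme: %s" % scheme)
    return impl


if __name__ == "__main__":
    # An hdfs:// location resolves to a filesystem class.
    print(resolve_filesystem("hdfs://namenode:8020/data/tweets"))
    # The kafka:// location from the question does not.
    try:
        resolve_filesystem("kafka://hostname:6667/tweets")
    except IOError as err:
        print(err)
```

If this model is close to what happens, the failure depends only on the scheme of the location string, which would be consistent with the error staying the same after registering a different jar.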