Impala is developed and only tested against the supported version of CDH.
All of the APIs we use are available in Apache Hadoop so it should be possible
to run Impala with little to no changes.
What version of Impala works with Apache Hadoop 2.7.x?
What are the recommended configurations to be made for Impala to work with Apache Hadoop 2.7.x?
I have Impala 2.1 installed alongside Apache Hadoop 2.7.2. impalad fails to start with the following error:
E0701 11:11:19.187844 17507 impala-server.cc:210] Could not read the HDFS root directory at hdfs://namenode:9000. Error was:
Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status
Right now, Impala is only tested against CDH. We'd like to change this in the upstream Apache project - if that sounds like something you'd like to help with, we'd love to see you on the mailing list! (firstname.lastname@example.org)
We don't support running on Apache Hadoop right now, but there's no specific reason it shouldn't work, so there isn't really a recommended configuration.
I'm not sure why you're getting that error; the wire protocol for Apache Hadoop should have those fields.
Is there some specific reason why you want to run Impala 2.1? That's a fairly old version and you're missing out on a lot of improvements.
I'm working on setting up Open Network Insight to test for possible adoption of this technology.
It is using Impala as part of its visualization layer.
Internally, we are already using Apache Hadoop to store our logs and network data. So, I will have to try to get Impala to work with our Apache Hadoop.
I'm now trying to download the source code so that I can compile the latest version.
The problem I'm facing is that our Apache Hadoop version is 2.7.2, and Impala 2.1.0 does not support it.
The protobuf error only shows that the communication failed because of the message format, not that protobuf itself is the problem.
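For what it's worth, here is a toy Python sketch of why a format mismatch surfaces as "missing required fields". It is not the real protobuf library or Hadoop's RPC code. The field numbers (1 for callId, 2 for status), the helper names, and the sample bytes are assumptions made purely for illustration. The point is that the decoder itself works fine; it simply never sees the fields it expects when the peer writes a different layout.

```python
# Toy illustration only: NOT the real protobuf library or Hadoop's RPC code.
# Field numbers below are assumed for the sake of the example.
REQUIRED_FIELDS = {1: "callId", 2: "status"}


def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_header(buf):
    """Parse varint-typed fields and fail if a required one is absent."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = read_varint(buf, pos)
        fields[field_number] = value
    missing = [name for num, name in REQUIRED_FIELDS.items() if num not in fields]
    if missing:
        raise ValueError("Message missing required fields: " + ", ".join(missing))
    return fields


# A header carrying field 1 (callId=7) and field 2 (status=0) parses cleanly.
good = bytes([0x08, 0x07, 0x10, 0x00])
print(parse_header(good))

# A peer that writes a different layout (here, only field 3) trips the check.
mismatched = bytes([0x18, 0x01])
try:
    parse_header(mismatched)
except ValueError as err:
    print(err)
```

In this sketch the second message is perfectly valid on the wire, yet the required-field check still rejects it, which is consistent with the error pointing at a protocol or version mismatch between the two sides rather than at protobuf.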