Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

HAWQ - Error while querying external HDFS table (PXF)

Highlighted

HAWQ - Error while querying external HDFS table (PXF)

Rising Star

I have a problem with PXF and external tables in HAWQ.

I have seven data nodes in cluster (and seven HAWQ segments) and I installed PXF on each of them. It looks like that:

Node1 - NameNode, HiveMetastore, HAWQ master, Hive Client, HCat Client, PXF

Node2 - SNameNode, DataNode, HAWQ Segment, PXF

Node3-7 - DataNode, HAWQ Segment, PXF

Node8-9 - HiveClient, HCat Clients

I created table based on "Test PXF" on this site. But I have a problem with accesing data. When I try to run a simple query:

SELECT * FROM pxf_hdfs_textsimple;

I get error:

ERROR:  remote component error (0): (libchurl.c:897)

That's all. In HAWQ Master Node Log I see this:

2018-01-04 18:43:06.902998 CET,"hawq","postgres",p19781,th-319940160,"[local]",,2018-01-04 18:43:06 CET,16768,con26,,seg-10000,,,x16768,sx1,"LOG","00000","no master mirroring standby configuration found",,,,,,,0,,"cdblink.c",159,
2018-01-04 18:43:10.759145 CET,"hawq","poligon",p19820,th-319940160,"[local]",,2018-01-04 18:43:10 CET,0,con27,,seg-10000,,,,,"LOG","00000","getLocalTmpDirFromMasterConfig session_id:27 tmpdir:/tmp",,,,,,,0,,"postinit.c",470,
2018-01-04 18:43:20.600046 CET,"hawq","poligon",p19820,th-319940160,"[local]",,2018-01-04 18:43:10 CET,16773,con27,cmd7,seg-10000,,,x16773,sx1,"ERROR","XX000","remote component error (0): (libchurl.c:897)",,,,,,"select * from pxf_hdfs_textsimple ;",0,,"libchurl.c",897,"Stack trace:
1    0x8c165e postgres errstart (??:?)
2    0x8c34fb postgres elog_finish (??:?)
3    0x5124d6 postgres check_response_code (??:?)
4    0x512686 postgres churl_read_check_connectivity (??:?)
5    0x517b22 postgres <symbol not found> (pxfutils.c:?)
6    0x517d66 postgres call_rest (??:?)
7    0x5168c0 postgres <symbol not found> (pxfmasterapi.c:?)
8    0x516f97 postgres get_data_fragment_list (??:?)
9    0x512ff5 postgres map_hddata_2gp_segments (??:?)
10   0x73f8a2 postgres <symbol not found> (createplan.c:?)
11   0x73fdb5 postgres <symbol not found> (createplan.c:?)
12   0x741dec postgres create_plan (??:?)
13   0x74d1a6 postgres <symbol not found> (planner.c:?)
14   0x74eb3c postgres subquery_planner (??:?)
15   0x74f177 postgres <symbol not found> (planner.c:?)
16   0x74f72e postgres planner (??:?)
17   0x7e496a postgres pg_plan_queries (??:?)
18   0x7e4e05 postgres <symbol not found> (postgres.c:?)
19   0x7e6560 postgres PostgresMain (??:?)
20   0x799860 postgres <symbol not found> (postmaster.c:?)
21   0x79c5e9 postgres PostmasterMain (??:?)
22   0x4a2dff postgres main (??:?)
23   0x7fdfe9e2ab35 libc.so.6 __libc_start_main (??:0)
24   0x4a2e7c postgres <symbol not found> (??:?)
"

When I try access data in Hive I get error too. Query:

select count(*) from hcatalog.default.aaa;

Error:

ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception report   message   java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException    description   The server encountered an internal error that prevented it from fulfilling this request.    exception   javax.servlet.ServletException: java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException (libchurl.c:897)
LINE 1: select count(*) from hcatalog.default.aaa;

Does anyone know what I am doing wrong? What can cause a problem with accesing data from HDFS, Hive by PXF?

3 REPLIES 3

Re: HAWQ - Error while querying external HDFS table (PXF)

New Contributor

Please take a look at pxf-private.classpath to make sure the hive jars have the correct path. Also, take a look at pxf-service.log under tomcat where PXF is running. The above error is being returned from the PXF java server to C client.

Re: HAWQ - Error while querying external HDFS table (PXF)

Rising Star

@Lav Jain

Thanks for the answer. I checked my pxf-private.classpath. The problem was, that I installed PXF on all data nodes, but there are no Hive Clients installed (no Hive jars too). I copied jars on all machines and it works now. But there is another problem.

I have a problem with querying tables which have different format than textfile. If I execute query:

select count(*) from hcatalog.default.sales_info;

And the table sales_info is text file I got correct results:

 count
-------  
 7540 
(1 row)

But if table format is ORC I got this error:

ERROR:  pxfwritable_import_beginscan function was not found (nodeExternalscan.c:310)  (seg0 Node4:40000 pid=42109) (dispatcher.c:1805)

Does anyone know how to query ORC data?

And I have still a problem with external tables. When I create a table:

CREATE EXTERNAL TABLE pxf_hdfs_textsimple(location text, month text, num_orders int, total_sales float8)
LOCATION ('pxf://Node1:51200/tmp/pxf_test/pxf_hdfs_simple.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');

I am not able to query this table. There is no error, I can see only this:

ERROR:  remote component error (0):(libchurl.c:897)

I checked HAWQ segments logs, HAWQ master logs, Catalina and PXF-service logs, but I cannot find anything about this error. I would appreciate if anyone could help me.

Re: HAWQ - Error while querying external HDFS table (PXF)

New Contributor

@grabowski14

What kind of platform is your Hadoop cluster running under (e.g. OSX, Centos6, etc)? You must have PXF server running on all the data-nodes as well as on the name node. It is best if you use Ambari to install HAWQ and PXF as it guides you with the setup.