Member since: 11-07-2016
Posts: 11
Kudos Received: 4
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1001 | 08-13-2018 12:53 PM
 | 2321 | 11-07-2016 11:41 AM
07-28-2020
12:43 PM
I know this is old, but maybe it will help someone. Check the firewall between the CM node and the MySQL node: you should be able to telnet from the CM node to the MySQL host on port 3306 (or whatever host/port you defined in CM's db.properties).
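A quick way to verify, run from the CM host (the hostname below is a placeholder, and the db.properties path assumes a default Cloudera Manager install):

grep 'com.cloudera.cmf.db.host' /etc/cloudera-scm-server/db.properties
telnet mysqlhost.domain.com 3306

If telnet cannot connect, it is usually the firewall or MySQL's bind-address.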
08-28-2018
05:05 PM
1 Kudo
Did you try this? Fetch the HDFS usage report: https://cloudera.github.io/cm_api/apidocs/v19/path__clusters_-clusterName-_services_-serviceName-_reports_hdfsUsageReport.html

It works for me:

https://<cm_host>:7183/api/v19/clusters/<ClusterName>/services/<ServiceName>/reports/hdfsUsageReport
https://cmmanager.domain.com:7183/api/v19/clusters/Production%20US/services/hdfs/reports/hdfsUsageReport

Notice that for ClusterName I use %20, which represents a blank (my ClusterName is "Production US"), so the URL is encoded correctly.

Regards,
Nic
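If you want to script it, something like this should work from the command line (the credentials and host below are placeholders, not values from the original post):

curl -k -u admin:admin \
  "https://cmmanager.domain.com:7183/api/v19/clusters/Production%20US/services/hdfs/reports/hdfsUsageReport" \
  -o hdfs_usage_report.json

The -k flag skips TLS certificate verification; drop it if your CM certificate is trusted.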
08-16-2018
07:49 AM
Worth looking at: https://qna.dwh4u.com/index.php?qa=11682&qa_1=couldnt-fetch-delegation-sqoop-import-action-oozie-workflow

You might be hitting bug SQOOP-3177. If that is your case, it looks like you can work around it by using --target-dir to load the data into HDFS, and then loading the Hive table from HDFS.

Regards,
Nic
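A rough sketch of that workaround (connection string, table, and paths below are placeholders, not from the original thread):

sqoop import --connect jdbc:oracle:thin:@dbhost:1521/ORCL --username scott -P \
  --table EMP --target-dir /user/etl/staging/emp --fields-terminated-by ',' -m 1
hive -e "LOAD DATA INPATH '/user/etl/staging/emp' INTO TABLE mydb.emp"

The idea, per the workaround above, is to skip Sqoop's Hive import step and load the Hive table separately from HDFS.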
08-13-2018
02:47 PM
Check whether the following helps you (add --split-by to the Sqoop command): https://issues.cloudera.org/browse/SQOOP-107

Clarification: Sqoop has no logic to handle composite primary keys. This applies to Sqoop imports. If you do not specify --split-by, Sqoop defaults to the PK. When the table has a composite PK, it uses only the first column of the composite PK to split. This is confirmed behavior with Oracle; it needs to be tested with other databases. The workaround is to specify --split-by, and then Sqoop will not care about the PK.

Regards,
Nic
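For example (connection string, table, and column names are only illustrative):

sqoop import --connect jdbc:oracle:thin:@dbhost:1521/ORCL --username scott -P \
  --table ORDERS --split-by ORDER_ID --target-dir /user/etl/orders -m 4

Pick a --split-by column with a reasonably even distribution of values, otherwise the mappers get unbalanced slices.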
08-13-2018
02:07 PM
You didn't say what challenges you are facing. You can use an inventory (bookkeeping) table in the Oracle database to track at all times which date/day/hour you have already imported into Hadoop, or need to reload. You keep feeding this table with information about what needs to be loaded.

On the Hadoop side you can use Perl DBI to connect to the Oracle database and read the bookkeeping table, so you know what is next for loading, and mark a flag/column imported = 'Y' in the bookkeeping table if the import is successful (mark other flags based on events such as failures or errors in the same table; you can even capture the errors and log them into the same bookkeeping table for the current import, so at the end you can simply query the bookkeeping table to look for errors).

After reading the bookkeeping table, you can then dynamically build, in the same Perl script, the Sqoop query to load the data into Hadoop (at this stage you know what period you want, for what table, etc.; it is all in the bookkeeping table, so you only need to loop over it and Sqoop it). Something like this to build the Sqoop command:

sub GetSqoopCommandString
{
    my $astrDir     = shift (@_);
    my $aStrAdaptor = shift (@_);
    my $astrTable   = shift (@_);
    my $astrQuery   = shift (@_);
    my $astrConn    = "";
    my $delimiter   = $DefaultDelimiter;

    UpdateLogIfTrace("Building Sqoop command string", 1);

    $delimiter = $AdapterDelimiter if ( $AdapterDelimiter );

    $astrConn = "sqoop import $SQOOP_OPTS --connect \"" . $StrConnectSqoop . "\" --username $StrUserSqoop --password $StrPasswordSqoop "
              . "--append --query \"" . $astrQuery . ' " --target-dir ' . $astrDir
              . " --fields-terminated-by '$delimiter' -m 1 --null-string '\\\\N' --null-non-string '\\\\N' --hive-delims-replacement ''";
    $astrConn .= " --driver $dbDriver" if ( $dbDriver );

    return($astrConn);
}

# loop over the bookkeeping table
$sqoopCmd = GetSqoopCommandString($tmpDir, $owner, $table_name, $sqlQuery);
Execute ($sqoopCmd, "Exception", \$sqoopDuration, $StrLogFile, \@retCode);
# end loop over the bookkeeping table

Regards,
Nic
08-13-2018
12:53 PM
Did you try INVALIDATE METADATA? https://www.cloudera.com/documentation/enterprise/5-14-x/topics/impala_invalidate_metadata.html

Nic
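For example, from impala-shell (the host, database, and table names here are placeholders):

impala-shell -i impalad_host.domain.com -q "INVALIDATE METADATA mydb.mytable"

Scoping it to one table is cheaper than a global INVALIDATE METADATA, which reloads metadata for all tables.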
07-27-2018
12:01 PM
Maybe I don't understand your problem, but isn't that data already stored in HDFS?

Nic
07-26-2018
09:11 AM
You can check the Hive Metastore logs for any errors. You can also try to telnet to port 9083 on the host where the metastore runs (and make sure the metastore is actually listening on that port):

ps -ef | grep HiveMetaStore
netstat -tulpn | grep 9083

Nic
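The connectivity check from a client node would be something like this (the hostname is a placeholder; 9083 is the default metastore port):

telnet metastorehost.domain.com 9083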
07-25-2018
03:55 PM
1 Kudo
Someone might find the link below useful. I used it when I had to migrate MySQL to different machines.

https://gist.github.com/bugcy013/91cd7767e5306d1703e5

Regards,
Nic
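The core of that migration is the usual dump-and-restore; a minimal sketch, not a substitute for the full steps in the gist (hosts and options are illustrative):

mysqldump -u root -p --all-databases --routines --events > all_databases.sql    # on the old host
mysql -u root -p < all_databases.sql                                            # on the new host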
11-07-2016
11:41 AM
2 Kudos
Hello Esteban,

My problem was that I couldn't get more than 100 records for export, due to the CM web page limitation, with the same message you showed: "More queries match your filter than can be displayed. Try narrowing your search filter or decreasing the time range over which you are searching". Below is an answer I got from Cloudera on how to get more rows, and for a bigger period, using the API. The way I handled the output afterwards is a bit customized, but pretty quick for me as I'm an Oracle admin: I created an external table in an Oracle database over the output file and then just extracted the SQL text, or any other info I needed, from there.

Here is the answer from Cloudera:

<<
Here is a little bit of instruction on extracting Impala queries out of CM:
1. Click the Support dropdown in the top right.
2. Click API Documentation.
3. Find the endpoint "impalaQueries" and click this link.
Use the endpoint address it specifies in the docs, add the filter information to the URL, and set the limit parameter to a value greater than 100 (the default). If it takes a very long time to dump this data you can also page through it using the 'offset' parameter, or you could use 'from' and 'to' with a small sliding window to grab the time period you're interested in. You can use the URL that CM directs you to when doing your export and simply add a "&limit=200" to the end of it to get more than 100.
>>

It worked for me using:

...api/v7/clusters/cluster/services/impala/impalaQueries?from=2016-11-07T18%3A50%3A21.758Z&to=2016-11-07T19%3A20%3A21.758Z&limit=200&filter=

## Parsing the output using an Oracle external table:
##################################################

drop table admin_task.impala;

create table admin_task.impala
( column_text varchar2(4000) )
organization external
( type oracle_loader
  default directory data_pump_dir
  access parameters
  ( records delimited by newline
    fields terminated by '#'
    missing field values are null )
  location ('impala.txt')
);

alter table admin_task.impala reject limit unlimited;

create table admin_task.impala_text (text varchar2(4000));

create or replace procedure admin_task.generate_impala_sql as
  cursor c is select * from admin_task.impala;
  v_count1 number(10);
  v_count2 number(10);
  v_query  varchar2(4000);
begin
  for r in c loop
    if r.column_text like '%"statement"%' then
      v_count1 := instr(upper(r.column_text), 'SELECT', 1, 1);
      v_count2 := instr(r.column_text, '",', 1, 1) - v_count1;
      v_query  := substr(r.column_text, v_count1, v_count2) || ';' || chr(13) || chr(10);
      insert into admin_task.impala_text values (v_query);
      dbms_output.put_line(r.column_text || ' / ' || v_count1 || ' / ' || v_count2 || ' / ' || v_query);
      -- dbms_output.put_line(v_query);
    end if;
  end loop;
  commit;
end;
/
show errors procedure admin_task.generate_impala_sql;

set serveroutput on
set serveroutput on size 5000000
set line 10000
exec admin_task.generate_impala_sql;

Regards,
nicusan
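For reference, pulling that endpoint into the impala.txt file read by the external table can be done with something like this (the CM host and credentials are placeholders; the query string is the one above):

curl -k -u admin:admin \
  "https://cm_host.domain.com:7183/api/v7/clusters/cluster/services/impala/impalaQueries?from=2016-11-07T18%3A50%3A21.758Z&to=2016-11-07T19%3A20%3A21.758Z&limit=200&filter=" \
  -o impala.txt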