Member since: 02-01-2017
Posts: 42
Kudos Received: 1
Solutions: 0
02-07-2023
04:47 AM
Indeed, there is no metadata tab. It is really annoying that it is nearly impossible to find your own queries in this view, and the filtering options are not great either.
10-06-2019
03:13 PM
@gimp077 , did you mean that "REFRESH" takes some time, and eventually you can see the updated data, just with a delay? How big is the table, in terms of number of partitions and number of files in HDFS? Eric
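If it helps, here is a rough way to check (just a sketch; the database, table, and warehouse path below are placeholders for your own):
  impala-shell -q "SHOW PARTITIONS my_db.my_table"         # number of partitions
  hdfs dfs -count /user/hive/warehouse/my_db.db/my_table   # directory count, file count, total bytes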
09-18-2019
12:01 AM
You can do an ALTER like I mentioned before: ALTER TABLE test CHANGE col1 col1 int COMMENT 'test comment'; But I do not think you can remove the comment entirely; you can only empty it. Cheers Eric
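If the goal is just to blank the comment out, something along these lines should do it (same hypothetical test table and column type as above):
  ALTER TABLE test CHANGE col1 col1 int COMMENT '';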
03-14-2018
03:11 PM
Try hdfs dfs -ls /
02-27-2018
04:42 AM
Hi @gimp077 I think there are two ways to do it: 1- You can put the output of the Impala query into HDFS, once you have it in a local file, with an HDFS put command: sudo -u hdfs hdfs dfs -put "${3}" hdfs_path 2- You can use a direct insert into a result_table (stored in HDFS) just before your select statement: INSERT INTO result_table YOUR_QUERY
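Roughly, the two options look like this (a sketch only; the file paths, table names, and the query itself are placeholders):
  # option 1: dump the query result to a local file, then push that file into HDFS
  impala-shell -B -q "SELECT col1, col2 FROM my_table" -o /tmp/result.txt
  sudo -u hdfs hdfs dfs -put /tmp/result.txt /user/hdfs/results/
  -- option 2: have Impala write the result straight into an HDFS-backed table
  INSERT INTO result_table SELECT col1, col2 FROM my_table;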
10-10-2017
08:10 PM
Another option I forgot to mention: if your table is partitioned, and your insert query uses dynamic partitioning, it will generate 1 file per partition: insert into table2 partition(par1,par2) select col1, col2 .. colN, par1, par2 from table1; ... again up to the max parquet file size currently set, so you can play with that max to achieve 2 files per partition. https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_partitioning.html#partition_static_dynamic
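For example, lowering the Parquet file size limit in the same session before running the insert is one way to push each partition toward more, smaller files (a sketch; the byte value is only an illustration, tune it against your actual partition sizes):
  SET PARQUET_FILE_SIZE=134217728;  -- e.g. 128 MB per file instead of the current maximum
  insert into table2 partition(par1,par2) select col1, col2, ..., colN, par1, par2 from table1;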
08-23-2017
08:59 PM
2 Kudos
https://issues.apache.org/jira/browse/IMPALA-1570 That feature has been available since Impala 2.8 (CDH 5.11).
04-03-2017
04:51 PM
Try adding some arguments to your Oozie run command, like so: $ oozie job -oozie http://localhost:11000/oozie -config job.properties -run If those changes don't work for you, you might try the following: Put your job.properties out in HDFS in the same directory as your workflow, then use the Hue File Browser to execute the workflow and see if that works. To do that, just check the workflow.xml and a button will appear for you to take an action such as submit. Reduce your workflow down to a simple email action, then test... add the SSH action, then test... keep adding and testing along the way. If things fail at the first and most simple test (the email action), then we've eliminated the other actions as the culprit, and likely quite a few of your job.properties variables too.
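For example, something along these lines should put the properties file next to the workflow (the HDFS path is just a placeholder for your own workflow directory):
  hdfs dfs -put job.properties /user/yourname/apps/my-workflow/
  hdfs dfs -ls /user/yourname/apps/my-workflow/   # should now list workflow.xml and job.properties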
02-16-2017
12:38 PM
I don't know specifically, but yes, it is most likely because the libraries used were not built for a distributed system. For instance, if you had three executors running the code in the library, then all three would be reading from the SFTP site and directory, all vying for the same files and copying them to the destination. It would be a mess.
02-15-2017
06:46 AM
Thanks for the response, really good and detailed. Could you give a little bit of a lower-level response as well, say, how would I add data from a DataFrame in Spark to a table in Hive efficiently? The goal is to improve the speed by using Spark instead of Hive or Impala for DB insertions. Thanks.
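To make the question concrete, here is roughly what I am after (just a sketch, assuming Spark 2.x with Hive support enabled; the app name, source path, and table name are made up):
  import org.apache.spark.sql.SparkSession
  val spark = SparkSession.builder().appName("df-to-hive").enableHiveSupport().getOrCreate()
  val df = spark.read.parquet("/some/input/path")         // placeholder source data
  df.write.mode("append").insertInto("my_db.my_table")    // existing Hive table with matching columns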