Member since: 05-10-2016
Posts: 184
Kudos Received: 60
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4079 | 05-06-2017 10:21 PM
 | 4092 | 05-04-2017 08:02 PM
 | 5005 | 12-28-2016 04:49 PM
 | 1240 | 11-11-2016 08:09 PM
 | 3318 | 10-22-2016 03:03 AM
05-06-2017
04:13 PM
1 Kudo
@Raj B It's true that it won't change the layout. However, you can create an external table on top of the same location, declared in the data's original format, e.g. create external table abc (col1 int, col2 string) stored as textfile (or whatever the original format was) location '/location/of/modified/orc/table'. If the underlying data has not changed, you should be able to view it and then read it or do a CTAS from that table.
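A minimal sketch of the idea, assuming hypothetical names (abc, col1, col2, abc_restored) and that the files at that location are still in their original text layout:
-- external table declared in the original format, on top of the existing location
create external table abc (col1 int, col2 string)
stored as textfile
location '/location/of/modified/orc/table';
-- verify the rows read back correctly, then materialize a copy via CTAS
create table abc_restored as select * from abc;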
05-04-2017
08:27 PM
GOAL
To change the default log location for Hue from /var/log/hue to another directory.
Steps
By default, the Hue service writes its log files to the /var/log/hue directory. This is enforced in the code, which states: The ``log_dir`` will replace the %LOG_DIR% in log.conf. If not specified, we look for the DESKTOP_LOG_DIR environment variable, and then default to the DEFAULT_LOG_DIR. However, setting either variable, i.e. DESKTOP_LOG_DIR or DEFAULT_LOG_DIR, does not seem to work. We can, however, replace %LOG_DIR% with an absolute path within /etc/hue/conf/log.conf. Here is the relevant output of log.conf with the log directory replaced by an absolute path:
#args=('%LOG_DIR%/access.log', 'a', 1000000, 3)
args=('/opt/log/hue/access.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/error.log', 'a', 1000000, 3)
args=('/opt/log/hue/error.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/%PROC_NAME%.log', 'a', 1000000, 3)
args=('/opt/log/hue/%PROC_NAME%.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/shell_output.log', 'a', 1000000, 3)
args=('/opt/log/hue/shell_output.log', 'a', 1000000, 3)
--
#args=('%LOG_DIR%/shell_input.log', 'a', 1000000, 3)
args=('/opt/log/hue/shell_input.log', 'a', 1000000, 3)
NOTE: Ensure that the new log directory exists and is owned by the hue user and group. After the changes, a restart of the Hue service should be enough to route new logs to the new location.
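A minimal sketch of the preparation and restart, assuming the /opt/log/hue path used above and a service-based install:
mkdir -p /opt/log/hue           # create the new log directory
chown -R hue:hue /opt/log/hue   # hue must own it, as noted above
service hue restart             # pick up the edited log.conf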
05-04-2017
08:02 PM
@Raagul Sukumar This works if you change/replace the %LOG_DIR% variable with something else. You can modify /etc/hue/conf/log.conf and replace every occurrence of %LOG_DIR% with something like the following. In my case I first created a directory /opt/log/hue, then made these changes:
#args=('%LOG_DIR%/shell_input.log', 'a', 1000000, 3)
args=('/opt/log/hue/shell_input.log', 'a', 1000000, 3)
Once the changes are made, restart the Hue service with "service hue restart". You can then tail access.log; it should advance with every click in the Hue UI. [Please accept the answer if this works for you]
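To replace every occurrence in one go, a one-liner sketch (paths assumed from above; keep a backup of the file first):
cp /etc/hue/conf/log.conf /etc/hue/conf/log.conf.bak
sed -i 's|%LOG_DIR%|/opt/log/hue|g' /etc/hue/conf/log.conf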
05-04-2017
07:28 PM
Glad you were able to figure it out, @mliem.
05-04-2017
05:44 PM
Error
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 110
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.planReadPartialDataStreams(RecordReaderImpl.java:914)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:958)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:793)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rows(ReaderImpl.java:585)
at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:291)
at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:261)
at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:127)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Cause
The issue is caused by a missing fix in RecordReaderImpl.java in 2.3.4 and earlier versions when reading streams of data. It lies with the "includedColumns[column]" check, where the size of the output exceeds the size of the array variable, around line 916.
Resolution
This issue was fixed in versions 2.4 and above. It's possible that an intermediate fix is available in one of the versions higher than 2.3.4.7, but it's safer to upgrade to 2.4.x, or better, 2.6.
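For reference, the stack trace above comes from the ORC FileDump utility; the failure can be reproduced against a suspect file with the following (the HDFS path is a placeholder):
hive --orcfiledump /hdfs/path/to/orc/file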
05-03-2017
04:30 AM
@mliem Have you tried this?
ranger.usersync.ldap.username.caseconversion=lower
ranger.usersync.ldap.groupname.caseconversion=lower
Then restart Ranger.
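These are Usersync properties, so the Ranger Usersync component is what needs restarting. On an Ambari-managed cluster, restart it from Ambari; on a plain service install, something like the following may work (service name assumed):
service ranger-usersync restart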
05-02-2017
10:59 PM
@Kumar Veerappan You've got all the pointers; you just need to do the math: 1 TB of MS SQL data = 3 TB on HDFS with the default replication factor of 3. If one data node has 10 TB of storage, you should consider leaving about 2 TB on each data node for other operations, which leaves 8 TB of usable HDFS storage per node. With this config you'd need 24 data nodes (24 x 8 TB = 192 TB of HDFS capacity), which is good for 92 days or more, provided you enable compression. Run the same calculation for larger data volumes to get the numbers for storing data over a 1-5 year period.
Compression also depends on the type of data (i.e., columns, types, and rows); hence, it would be ideal to test this on a sandbox/single-node cluster, which should give you good estimates on compression for overall planning.
05-01-2017
06:46 PM
@zkfs On which node are you changing the permissions? You should do that on the NameNode.
04-16-2017
08:31 AM
2 Kudos
Seems like a known bug, https://issues.apache.org/jira/browse/TEZ-3336, which was fixed in 2.6.