
aniston's Posts

Hi everyone,

I'm an Impala developer and I need some help with an Impala issue. I'm using Impala 1.4 with CDH 5.1.2. I'm working on a view X in which every field is of type String.

When I run this query:

select length(FIELD) from X;

the result is, for example, 4.

Part of the content of the field is "\0", i.e. the NUL character. For example, if I run this query:

select count(FIELD) from X where FIELD like '%\0%'

the result of the count is N.

When I materialize this view, the length of that field is doubled: running the same length(FIELD) query against the materialized table returns 8, which is exactly 4*2.

This suggests that Impala is doubling the length of that string in the materialized view. I don't know why, and I need some help with this issue.

Thank you very much,
Alberto
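One way to check whether embedded NUL characters account for the doubled length is to compare the raw length with the length after stripping them. This is only a diagnostic sketch: it assumes regexp_replace accepts the \x00 escape for the NUL byte (Impala 1.x used the Boost regex engine, so the escape syntax may need adjusting); FIELD and X are the names from the post above.

-- Compare the raw length with the length after removing NUL characters;
-- if the difference matches the extra bytes in the materialized table,
-- the NULs are being duplicated on write.
select length(FIELD) as raw_len,
       length(regexp_replace(FIELD, '\\x00', '')) as len_without_nul
from X;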
Hi,

I have a Solr batch-indexing job that needs to run every X amount of time to index new files. I'm currently using a crontab, but I would like to use Oozie. Is the right approach a shell action or a Java action? Is there any example of such a Java action to follow?

I'm trying to build a shell action, but I get errors in a morphline where I embedded Java code:

Stdoutput Caused by: org.kitesdk.morphline.api.MorphlineCompilationException: Cannot compile script near: {
Stdoutput # GrpHdr.conf: 29
Stdoutput "code" : "String msg = record.get(message).toString();if (!msg.contains(<GrpHdr)) {return false;}return child.process(record);",
Stdoutput # GrpHdr.conf: 28
Stdoutput "imports" : "import java.util.*;import java.lang.String;"
Stdoutput }
Stdoutput at org.kitesdk.morphline.stdlib.JavaBuilder.build(JavaBuilder.java:54)
Stdoutput at org.kitesdk.morphline.base.AbstractCommand.buildCommand(AbstractCommand.java:302)
Stdoutput at org.kitesdk.morphline.base.AbstractCommand.buildCommandChain(AbstractCommand.java:249)
Stdoutput at org.kitesdk.morphline.stdlib.TryRulesBuilder$TryRules.<init>(TryRulesBuilder.java:82)
Stdoutput at org.kitesdk.morphline.stdlib.TryRulesBuilder.build(TryRulesBuilder.java:59)
Stdoutput at org.kitesdk.morphline.base.AbstractCommand.buildCommand(AbstractCommand.java:302)
Stdoutput at org.kitesdk.morphline.base.AbstractCommand.buildCommandChain(AbstractCommand.java:249)
Stdoutput at org.kitesdk.morphline.stdlib.Pipe.<init>(Pipe.java:46)
Stdoutput at org.kitesdk.morphline.stdlib.PipeBuilder.build(PipeBuilder.java:40)
Stdoutput at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:126)
Stdoutput at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:55)
Stdoutput at org.apache.solr.hadoop.morphline.MorphlineMapRunner.<init>(MorphlineMapRunner.java:157)
Stdoutput at org.apache.solr.hadoop.morphline.MorphlineMapper.setup(MorphlineMapper.java:75)
Stdoutput at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
Stdoutput at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
Stdoutput at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
Stdoutput at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
Stdoutput at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
Stdoutput at java.util.concurrent.FutureTask.run(FutureTask.java:262)
Stdoutput at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
Stdoutput at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Stdoutput at java.lang.Thread.run(Thread.java:744)
Stdoutput Caused by: javax.script.ScriptException: Cannot compile script: String msg = record.get(message).toString();if (!msg.contains(<GrpHdr)) {return false;}return child.process(record); caused by compilation failed: > expected
Stdoutput ')' expected

The java command fails to accept the code. I've tried putting everything on one line and switching between single and double quotes, but I cannot find the solution.

I'm using CDH 5.0.2.
Thanks,
Albert
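For what it's worth, the compiler errors ("> expected", "')' expected") point at the two unquoted string literals inside the embedded Java: record.get(message) and msg.contains(<GrpHdr) both need their arguments in double quotes. A hedged sketch of the corrected java command follows; it assumes the field really is named "message", and uses a HOCON triple-quoted string, which may span lines and needs no escaping for the inner double quotes:

{
  java {
    imports : "import java.util.*;"
    # A triple-quoted HOCON string keeps the inner double quotes
    # around "message" and "<GrpHdr" intact for the Java compiler.
    code : """
      String msg = record.get("message").toString();
      if (!msg.contains("<GrpHdr")) {
        return false;
      }
      return child.process(record);
    """
  }
}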
Hi, I changed the instance size of the machines in AWS. I also changed some YARN parameters, but the real fix was having more powerful machines.
From the Sqoop guide:

"Oracle also includes the additional date/time types TIMESTAMP WITH TIMEZONE and TIMESTAMP WITH LOCAL TIMEZONE. To support these types, the user's session timezone must be specified. By default, Sqoop will specify the timezone "GMT" to Oracle. You can override this setting by specifying a Hadoop property oracle.sessionTimeZone on the command-line when running a Sqoop job. For example: $ sqoop import -D oracle.sessionTimeZone=America/Los_Angeles \"

I suppose that when I put this option into my command:

-D oracle.sessionTimeZone=Europe/Rome

the Sqoop metastore automatically saves the last-value in UTC, or in some other timezone-dependent form (in this example UTC+2).

But that hasn't fixed my problem. Is my understanding correct?
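One detail worth noting when trying this: -D is a generic Hadoop argument, and Sqoop only honors it when it appears immediately after the tool name, before any tool-specific options. A sketch with placeholder connection details:

sqoop import \
  -D oracle.sessionTimeZone=Europe/Rome \
  --connect jdbc:oracle:thin:@HOST:PORT/SERVICE \
  --username USER --password-file /user/USER/pw \
  --table MYTABLE -m 1 \
  --incremental lastmodified --check-column DTTMCREATED \
  --last-value "2014-07-31 00:00:00"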
Hello,

I'm importing my data with a Sqoop job inside a workflow. My timezone is Europe/Rome, i.e. UTC+2. My source is an Oracle DB with timezone UTC. I don't know what happens in Sqoop's metastore, but I can see that my last-value comes out different, and I can't close the gap. From the log I have this configuration:

org.apache.sqoop.manager.OracleManager - Time zone has been set to GMT
Lower bound value: TO_TIMESTAMP('2014-07-31 17:50:48.0', 'YYYY-MM-DD HH24:MI:SS.FF')
Upper bound value: TO_TIMESTAMP('2014-07-31 17:53:04.0', 'YYYY-MM-DD HH24:MI:SS.FF')
user.timezone=Europe/San_Marino

These are the things I have tried:
1) setting a property: name: oracle.sessionTimeZone, value: Europe/Rome
2) setting it in the connection string: -Doracle.sessionTimeZone=GMT

And other attempts, but now I want to understand how Sqoop actually works instead of guessing.
Hi,

I can't understand why my log doesn't appear in the Hue interface. If I click on the log while the job is running I can see it, but if I click after the job has finished, nothing appears. Is there some configuration to fix this problem?
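A hedged guess based on the symptom: container logs that are visible while a job runs but gone once it finishes usually mean YARN log aggregation is disabled, so the NodeManagers delete the local logs at completion. These are the yarn-site.xml properties involved (in Cloudera Manager they correspond to the log aggregation settings of the YARN service):

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<property>
    <!-- How long aggregated logs are retained in HDFS, in seconds. -->
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>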
Hi, when I launch my workflow with Hue I always get the same error, "Error getting logs ...", in stdout, stderr and syslog.
My problem is https://issues.apache.org/jira/browse/SQOOP-857

I tried to pass hbase-site.xml to Oozie:

<job-xml>hbase-site.xml</job-xml>
<file>hbasesite.xml#hbasesite.xml</file>

But I still have the same problem.

Is this the right procedure?
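For comparison, a minimal sketch of how this is usually wired into a Sqoop action, assuming hbase-site.xml has been uploaded next to workflow.xml in HDFS (the import command is elided; note that the names in job-xml and file, and on both sides of the #, must match the uploaded file exactly, which the snippet above does not):

<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Merged into the action's Configuration before it runs. -->
        <job-xml>hbase-site.xml</job-xml>
        <command>import ...</command>
        <!-- Ships the file into the task's working directory. -->
        <file>hbase-site.xml#hbase-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
</action>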
Hi,

I'm doing a manual import on one host, and I'm putting all the HBase jars in /sqoop/lib/, but that is a bad way to solve the problem. What should I do to set HBASE_HOME permanently? In /etc/profile? And if I use CDH 5 with the standard installation, what is my HBASE_HOME? Sorry for the simple question, but I want to be sure.
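A minimal sketch of one common way to make it permanent, assuming a parcel-based CDH 5 install (the parcel symlink below is the usual default; a package-based install would use /usr/lib/hbase instead):

# /etc/profile.d/hbase.sh -- sourced by login shells for all users
export HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase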
I'm trying to launch a Sqoop job that writes to HBase. The job definition is:

job --meta-connect jdbc:hsqldb:hsql://XXX:16000/sqoop --create jobA1 -- import --connect jdbc:oracle:thin:@XXX:XXX/XXX --username XXX --password-file /user/U0H8047/XXX.pw --table XXX -m 1 --incremental lastmodified --last-value "2014-07-07" --check-column DTTMCREATED --append --hbase-table hello --column-family original --hbase-row-key ID

If I write to HDFS everything works, but I get this problem when writing to HBase. I launch the job on the host running the HBase Master.

14/07/14 16:30:42 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:389)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:366)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:247)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:183)
at org.apache.sqoop.mapreduce.HBaseImportJob.jobSetup(HBaseImportJob.java:143)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:245)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:614)
at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:436)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:387)
... 17 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:195)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:801)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:633)
... 22 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
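A hedged reading of the root cause: ClassNotFoundException: org.cloudera.htrace.Trace means the htrace-core jar that HBase depends on is not on Sqoop's classpath. Setting HBASE_HOME before launching usually fixes it, assuming Sqoop's launcher script adds HBase's lib directory to the classpath when the variable is set (parcel path assumed):

export HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase
# The sqoop launcher picks up $HBASE_HOME and appends its jars,
# including htrace-core, to the classpath.
sqoop job --exec jobA1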
Hi,

I'm importing data from an Oracle DB into HBase with Sqoop. Using the Sqoop command line, everything works:

sqoop import --connect jdbc:oracle:thin:@XXX:port/XXX --username XXX --password XXX --table XXX -m 1 --incremental lastmodified --last-value '2014-06-23' --check-column XXX --append --hbase-table XXX --column-family info --hbase-row-key XXX --hbase-bulkload

I tried with Oozie, and I got this error in a loop:

INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

My cluster has 5 hosts: master, slave1, slave2, slave3, slave4
Sqoop 1 client: slave1, slave2, slave3, slave4
ZooKeeper server: master, slave1, slave2
HBase roles:
- HBase REST server -> master
- HBase Thrift server -> master
- HBase Master -> master
- RegionServer -> slave1, slave2, slave3, slave4

I have some doubts about my configuration. If the configuration itself is fine, my question is: is it correct that the client tries to connect to a ZooKeeper server at localhost/127.0.0.1:2181? I only have the 3 ZooKeeper servers listed above, none on localhost, and I think this is my problem.

Also, is there some SQOOP_CLASSPATH or HBASE_CLASSPATH to set? Doesn't Oozie initialize everything?

Thanks
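A hedged reading of the loop: localhost:2181 is the HBase client's built-in default, so the Sqoop action launched by Oozie apparently never sees hbase-site.xml and falls back to it. The missing setting is the ZooKeeper quorum (host names taken from the cluster layout above); shipping an hbase-site.xml containing it with the action should change the address in the log:

<property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
</property>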
Hello,

I'm using CDH 5.0.1 and I'm trying to import data into HBase using Sqoop. I launch this Sqoop command line:

sudo -u hdfs sqoop import --connect jdbc:mysql://10.0.0.221/db --username XXX --password XXX --table test -m 1 --target-dir /user/import --incremental lastmodified --check-column date --append --hbase-table forHive --column-family infos

In the log I get this error:

Error during import: HBase jars are not present in classpath, cannot import to HBase!

I set my $HBASE_HOME:

export HBASE_HOME=/usr/lib/hbase

After that, since I use CDH parcels, I changed my HBASE_HOME to:

export HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase

and I also tried:

export HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase/lib

But in every case I get the same error:

ERROR tool.ImportTool: Error during import: HBase jars are not present in classpath, cannot import to HBase!

What HBASE_HOME should I set? And is it possible to use HBase bulk load with Sqoop for massive imports?

Thanks
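One hedged possibility worth ruling out: sudo resets the environment by default, so an HBASE_HOME exported in the calling shell never reaches the sqoop process running as hdfs. Two ways to pass it through (the first depends on the sudoers policy allowing -E):

# Preserve the caller's environment ...
sudo -E -u hdfs sqoop import ...

# ... or set the variable inside the sudo'd command itself.
sudo -u hdfs HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase sqoop import ...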
I'm testing this workflow on AWS EC2 machines. I upgraded my 3-node mini-cluster to more powerful machines (x.medium -> x2.large) and now my workflow is fine. I searched the internet for guidance, but I can't find any tuning guidelines; is there a link for this? For now I set the virtual cores and the other settings by intuition, but I'd like something more rigorous.
I think that could be the right idea! In Cloudera Manager I changed the maximum running apps for the user root under Clusters -> Dynamic Resource Pools, and I also changed the Java heap space in the YARN configuration. Is there something else to change, maybe in the Oozie configuration?
Hello,

I'm building a simple Oozie workflow with a Sqoop action. This is the XML:

<workflow-app name="Sqoop" xmlns="uri:oozie:workflow:0.4">
    <start to="Sqoop"/>
    <action name="Sqoop">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://10.0.0.221/db --username cloudera --password cloudera --table test -m 1 --target-dir /user/albert</command>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

When I launch the job with Hue or from the command line, the oozie-launcher job starts. My oozie-launcher never finishes; it always stays at 95%. If I kill the oozie-launcher job, the oozie-sqoop job starts and does its work very well. But I don't understand why the oozie-launcher doesn't work properly!

This is my output:

>>> Invoking Sqoop command line now >>>

12280 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
12510 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.4-cdh5.0.1
12640 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
12748 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
13131 [main] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
13131 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
15107 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
15202 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
15206 [main] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/hadoop-mapreduce
23112 [main] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/ab1d242ba1b3a0869bab06c1eb20c02f/test.jar
23140 [main] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
23140 [main] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
23140 [main] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
23140 [main] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
23151 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test
23232 [main] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
25731 [main] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
Heart beat
Heart beat
Heart beat
... (the "Heart beat" line repeats until the launcher is killed)

Can someone help me? Thank you
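A hedged sketch of the usual cause: the oozie-launcher is itself a one-map MapReduce job, and the Sqoop import it spawns is a second one; if the resource pool only admits one application at a time, the launcher holds its slot while waiting for its child, and both hang at ~95%. Besides raising the pool's maximum running apps, a common workaround is to route launchers into a separate queue through the action's configuration (the queue name below is a placeholder that must exist in the scheduler):

<sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
        <property>
            <!-- oozie.launcher.* properties apply only to the launcher
                 job, so it cannot starve the child MapReduce job. -->
            <name>oozie.launcher.mapred.job.queue.name</name>
            <value>launchers</value>
        </property>
    </configuration>
    <command>import ...</command>
</sqoop>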
OK, I'll try setting the repo config to 0. Where can I find this setting?
Hi,

I already hit this problem during the cluster installation. With an internet connection enabled I can see in the details that the step

sudo yum info jdk

downloads absolutely nothing, but without internet I'm blocked at this step. The problem I see is:

Installing jdk package...
BEGIN sudo yum list installed jdk
Loaded plugins: amazon-id, rhui-lb, security
Error: No matching Packages to list
END (1)
BEGIN sudo yum info jdk
Loaded plugins: amazon-id, rhui-lb, security
Could not contact any CDS load balancers: rhui2-cds01.eu-west-1.aws.ce.redhat.com, rhui2-cds02.eu-west-1.aws.ce.redhat.com.
Could not contact CDS load balancer rhui2-cds01.eu-west-1.aws.ce.redhat.com, trying others.
END (1)
remote package jdk is not available, giving up
waiting for rollback request

Thanks
Hi,

I'm trying to install a Hadoop cluster without an internet connection. I'm following "Installation Path C - Installation Using Tarballs" in the Cloudera CDH 5.0.1 documentation. My trouble comes during the cluster installation; the failing step is:

BEGIN sudo yum info jdk

Cloudera Manager searches for information about the jdk package, but I installed the JDK from an rpm, not with yum. I'm doing my installation on AWS machines in security groups without internet access.

Thank you
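A hedged workaround for hosts with no internet access: serve the required rpms from a local yum repository so that yum can resolve the jdk package without reaching Red Hat's mirrors (createrepo is the standard tool; the host name and directory below are placeholders):

# On a web server reachable by the cluster, build a repo from the rpms.
sudo yum install createrepo
mkdir -p /var/www/html/localrepo
cp jdk-*.rpm /var/www/html/localrepo/
createrepo /var/www/html/localrepo

# On every cluster node, add /etc/yum.repos.d/local.repo:
#   [localrepo]
#   name=Local repository
#   baseurl=http://repohost/localrepo
#   enabled=1
#   gpgcheck=0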
I'm trying to write a custom source for my Flume application. For single-line logs I used the NetCat source, but now I have multi-line logs to ingest. I understand that I must write my own source, and that it must extend AbstractSource and implement EventDrivenSource. I want to change how the event is built and put more than one line into a single event, which I think is possible. Now my question: I looked at the flume-core code and I don't understand which piece of code I should edit. Something in the run() method? Or in the serialization?
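For orientation, a minimal sketch of the shape such a source can take (MultiLineSource, processLine and the startPattern property are names invented here, and the thread that reads the wire and calls processLine is omitted): extend AbstractSource, implement EventDrivenSource, buffer lines until the start of the next record, and hand each assembled body to the channel processor.

import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

import org.apache.flume.Context;
import org.apache.flume.EventDrivenSource;
import org.apache.flume.channel.ChannelProcessor;
import org.apache.flume.conf.Configurable;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.AbstractSource;

/**
 * Groups consecutive lines into one Flume event. A new logical record
 * is assumed to begin with a line matching startPattern; everything up
 * to the next match belongs to the current event.
 */
public class MultiLineSource extends AbstractSource
        implements EventDrivenSource, Configurable {

    private Pattern startPattern;
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void configure(Context context) {
        // e.g. agent.sources.r1.startPattern = ^\\d{4}-\\d{2}-\\d{2}
        startPattern = Pattern.compile(
                context.getString("startPattern", "^\\S"));
    }

    /** Called with each raw line by the thread that reads the input. */
    public synchronized void processLine(String line) {
        if (startPattern.matcher(line).find() && buffer.length() > 0) {
            flush();                      // previous record is complete
        }
        buffer.append(line).append('\n'); // accumulate the current record
    }

    private void flush() {
        ChannelProcessor cp = getChannelProcessor();
        cp.processEvent(EventBuilder.withBody(
                buffer.toString(), StandardCharsets.UTF_8));
        buffer.setLength(0);
    }

    @Override
    public synchronized void stop() {
        if (buffer.length() > 0) {
            flush();                      // emit the trailing record
        }
        super.stop();
    }
}

So, to the question above: the grouping does not happen in serialization; it happens in the source's own read loop, before the event is handed to the channel.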
Thanks, but this is not wonderful news. I'll try to set up Kerberos; it is not intuitive. I also read that I can use LDAP: does it replace Kerberos, or is it an additional tool?