I have been working as a Hadoop Administrator for the past 2.5 years; originally I am a Teradata Administrator. One of our important clients requested a Big Data solution for processing more than 100 million CSV records, so I entered the Big Data world (a REALLY BIG world) and I am still learning like a student :)
Once you enter the Big Data world, you can't skip Hadoop, so for the past 2.5 years I have been working with Hadoop tools (Hue, Hive, HBase, Pig, MapReduce, Sqoop, Kafka, Storm, etc.).
Initially I really struggled to survive in the Hadoop environment (new tools, including Java), but now I am really happy to be here in the community. I am learning, and waiting for the big future too.
I am very passionate about playing cricket.
To introduce myself:
I have over 18 years of experience in information architecture, data/data-warehouse architecture (OLTP and OLAP), data modeling and governance, business intelligence design, database design, database development and performance tuning, operations and infrastructure management, and technical project management,
as well as executing POCs for performance benchmarking in multi-tier, multi-terabyte data warehouses.
I also have experience establishing the direction and data governance of major change programs in their use of data.
I am really excited to be part of the Cloudera community, and it is a privilege to interact with so many talented professionals here. I am quite new to this group and can't wait to take a deep dive into the Big Data and Hadoop world; I am sure to have a lot of fun in the future.
I love travelling, walking, and playing cricket, and I have a passion for community service; I have been associated with a non-profit organization for more than 7 years.
You can find me on LinkedIn: http://www.linkedin.com/pub/deepak-lal/9/696/720
My name is Brant. I'm a researcher at Johns Hopkins. My background is in biomedical research with an emphasis on machine learning and natural language processing. I'm coming over from mostly large shared memory and MPI machines/clusters. I do have lots of Java programming experience but most of our applications aren't Java based. I've worked on Hadoop/Accumulo based applications before but am new to the dev ops/configuration/deployment aspects.
I'm currently trying to get my Dockerized applications deployed via Hadoop Streaming on one of our clusters, and I'm eventually hoping to get longer-running Dockerized GPU applications running in Slider. We're using Docker because we can encapsulate our C++/Python/etc. dependencies with our programs.
I am Mohan, based in Singapore and working with Singtel Pvt Limited (a telecom company). I recently joined this company as a Big Data architect. Our company uses the Cloudera 5.4 Enterprise Edition.
I have 10+ years of experience, including more than 2 years with the Cloudera distribution, working on sizing, storage formats, and data analytics.
Hello! I'm very new to the Hadoop / Cloudera environment and hoping to learn much from this community. My career arc has been software developer, software architect, and enterprise architect working entirely in the agribusiness industry. I just started a new job and have been given the task of identifying big data use cases to drive value from our Cloudera environment. So, here we go!
Hi Cloudera community!
Happy to join your community!
I'm a sysadmin who loves my job and likes working on new technologies. So, I'm on Cloudera now!
For some tests, we created a 3-node cluster in a lab: 1 node for Cloudera Manager, 1 node for both NameNode and DataNode, and the last one as a DataNode only.
It's a lab to discover the new Cloudera 5.5 release, so it's just for running some tests, not for production!
We installed these services: HDFS, Hive, Hue, Impala, Oozie, ZooKeeper, MapReduce2 (YARN), and Sqoop 1.
One of our developers tried to import some data into Hive, but we got an error.
Here is the command line used by our developer:
sqoop import --connect jdbc:mysql://our.database.url/database --username user --password passwordtest --table table_product --target-dir /path/to/db --split-by product_id --hive-import --hive-overwrite --hive-table table_product
The command starts successfully and we can see the mappers complete to 100%, but when the job finishes, we get this error:
16/02/12 15:37:57 WARN hive.TableDefWriter: Column last_updated had to be cast to a less precise type in Hive
16/02/12 15:37:57 INFO hive.HiveImport: Loading uploaded data into Hive
16/02/12 15:37:57 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/02/12 15:37:57 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
I searched the configuration files for HIVE_CONF_DIR and didn't find anything weird.
I can't find a solution and I'm blocked on it... so our developer can't continue his tests.
I also searched in the Cloudera Manager configuration.
Do you have any idea about this? I've searched the web with no success.
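For reference, here is roughly what I checked from the shell. The parcel path and the classpath workaround at the end are assumptions based on a default CDH parcel install, not a confirmed fix:

```shell
# Sketch of the checks I ran (paths assume the default CDH parcel layout
# under /opt/cloudera/parcels/CDH -- adjust if your install differs).

# Is HIVE_CONF_DIR visible in the shell that launches sqoop?
echo "HIVE_CONF_DIR=${HIVE_CONF_DIR:-<unset>}"

# Does a jar containing org.apache.hadoop.hive.conf.HiveConf exist?
ls /opt/cloudera/parcels/CDH/lib/hive/lib/hive-common-*.jar 2>/dev/null \
  || echo "hive-common jar not found under the parcel path"

# A workaround sometimes suggested for this ClassNotFoundException
# (untested here): export the Hive conf dir and put the Hive client
# libs on the classpath before re-running the import.
export HIVE_CONF_DIR=/etc/hive/conf
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:/opt/cloudera/parcels/CDH/lib/hive/lib/*"
```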
Thanks a lot for your help!