Member since: 08-05-2016
Posts: 52
Kudos Received: 1
Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 3574 | 07-21-2017 12:22 PM
01-17-2021 12:41 PM

Hi @vjain ,

To configure the BucketCache, the description mentions two JVM properties. Which one should be used, please: HBASE_OPTS or HBASE_REGIONSERVER_OPTS?

The doc says: in the hbase-env.sh file for each RegionServer, or in the hbase-env.sh file supplied to Ambari, set the -XX:MaxDirectMemorySize argument for HBASE_REGIONSERVER_OPTS to the amount of direct memory you wish to allocate to HBase. In the configuration for the example discussed above, the value would be 241664m (-XX:MaxDirectMemorySize accepts a number followed by a unit indicator; m indicates megabytes). But then the example uses the other property:

HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=241664m"

Thanks,
Helmi KHALIFA
02-22-2020 11:35 AM

Hi, I am facing the same problem. Did you find a solution to your problem?

Best, Helmi Khalifa
11-14-2019 02:42 AM

Hi @avengers , if it works for you, would you be kind enough to accept the answer, please?

Best, Helmi KHALIFA
11-08-2019 08:42 AM

Hi @avengers ,

You would need to share variables between two Zeppelin interpreters, and I don't think we can do that between %spark and %sql. I found an easier way, using a sqlContext inside the same %spark interpreter:

%spark
val df = spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/somefile.csv")
df.createOrReplaceTempView("csvTable")

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val resultat = sqlContext.sql("select * from csvTable lt join hiveTable rt on lt.col = rt.col")
resultat.show()

I tried it and it works!

Best, Helmi KHALIFA
11-06-2019 02:08 AM

Hi @av ,

Here are the links to the Hive and Spark interpreter docs:

https://zeppelin.apache.org/docs/0.8.2/interpreter/hive.html
https://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html

Best, Helmi KHALIFA
11-05-2019 01:21 AM

Hi @Rak , here is the script:

CREATE EXTERNAL TABLE IF NOT EXISTS sample_date (
  sc_code string, ddate timestamp, co_code DECIMAL, high DECIMAL, low DECIMAL,
  open DECIMAL, close DECIMAL, volume DECIMAL, no_trades DECIMAL,
  net_turnov DECIMAL, dmcap DECIMAL, return DECIMAL, factor DECIMAL,
  ttmpe DECIMAL, yepe DECIMAL, flag string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/lab/itim/ccbd/helmi/sampleDate'
tblproperties('skip.header.line.count'='1');

ALTER TABLE sample_date SET SERDEPROPERTIES ("timestamp.formats"="MM/dd/yyyy");

(Note that timestamp.formats takes a Java SimpleDateFormat pattern, which is case-sensitive: MM/dd/yyyy, not MM/DD/YYYY.)

Could you accept the answer, please?

Best, Helmi KHALIFA
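As a quick sanity check of the pattern above (a sketch in Python, not part of the Hive setup itself): Hive's timestamp.formats follows Java SimpleDateFormat, where MM/dd/yyyy corresponds to Python's %m/%d/%Y:

```python
from datetime import datetime

# Java SimpleDateFormat "MM/dd/yyyy" is month/day/year;
# the equivalent Python strptime pattern is "%m/%d/%Y".
def parse_us_date(text: str) -> datetime:
    return datetime.strptime(text, "%m/%d/%Y")

print(parse_us_date("07/21/2017"))  # 2017-07-21 00:00:00
```

If the column's raw values look like 07/21/2017, this is the pattern the serde needs to interpret them as timestamps.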
11-04-2019 03:03 AM

Hi @Ra ,

You have to change the column names and types as you can see below:

sc_code string
ddate date
co_code double
high double
low double
open double
close double
volume double
no_trades double
net_turnov double
dmcap double
return double
factor double
ttmpe double
yepe double
flag string

I tried it and it works well for me.

Best, Helmi KHALIFA
10-31-2019 06:17 AM

Hi @Rak ,

Can you show us a sample, say the first 5 rows, of your csv file, please?

Best, Helmi KHALIFA
10-25-2019 01:18 AM
1 Kudo

Hi @RNN ,

The best solution is to convert the month names to integers, like:

-Oct- => -10-
-Dec- => -12-

Here is what I tested, as you can see in my file below:

$ hdfs dfs -cat /lab/helmi/test_timestamp_MM.txt
1,2019-10-14 20:00:01.027898
2,2019-12-10 21:00:01.023
3,2019-11-25 20:00:01.03
4,2019-01-06 20:00:01.123

Create a Hive table:

hive> CREATE EXTERNAL TABLE ttime(id int, t string)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE LOCATION '/lab/helmi/';

hive> select * from ttime;
OK
1    2019-10-14 20:00:01.027898
2    2019-12-10 21:00:01.023
3    2019-11-25 20:00:01.03
4    2019-01-06 20:00:01.123
Time taken: 0.566 seconds, Fetched: 4 row(s)

Finally, I created another table with the right format:

hive> create table mytime as select id, from_utc_timestamp(date_format(t,'yyyy-MM-dd HH:mm:ss.SSSSSS'),'UTC') as datetime from ttime;

Best, Helmi KHALIFA
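The month-name-to-number conversion described above can also be done before loading, outside Hive. A minimal Python sketch (the exact input layout, day-month-year with an abbreviated English month, is an assumption about the source data):

```python
from datetime import datetime

# Convert e.g. "14-Oct-2019 20:00:01" to "2019-10-14 20:00:01".
# %b matches abbreviated month names (Oct, Dec, ...) in the C locale.
def month_name_to_number(text: str) -> str:
    parsed = datetime.strptime(text, "%d-%b-%Y %H:%M:%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(month_name_to_number("14-Oct-2019 20:00:01"))  # 2019-10-14 20:00:01
```

Running such a pass over the raw file first produces strings Hive can ingest directly as timestamps, with no serde tweaking needed.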
09-24-2019 02:18 AM

Hi @hadoopguy ,

Yes, there is an impact: you will have longer processing times and the operations will be queued. You have to handle the timeouts in your jobs carefully.

Best, @helmi_khalifa
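One way to "handle the timeout carefully" is to put an explicit upper bound around each submitted job. A hypothetical Python sketch (the job callable and the limit are illustrative assumptions, not from this thread):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
import time

# Run a (hypothetical) job with an upper time limit, so a queued or
# slow cluster does not block the caller indefinitely.
def run_with_timeout(job, timeout_s):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(job)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            return None  # caller decides whether to retry, requeue, or fail

print(run_with_timeout(lambda: "done", 1.0))  # done
```

Returning None on timeout keeps the policy decision (retry vs. fail) with the caller; note the executor still waits for the worker thread on exit, so a real job should also be cancellable.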