Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 14170 | 11-14-2017 01:11 AM
 | 60613 | 11-03-2017 06:53 AM
 | 4321 | 11-03-2017 06:18 AM
 | 13554 | 09-12-2017 05:51 AM
 | 1991 | 09-08-2017 02:50 AM
01-16-2018
01:33 AM
Hi, it's been a while! If I remember correctly, we did not find any solution back then (with CDH 5.3.0), at least other than recreating the collection and re-indexing the data. But after upgrading to a CDH version whose Solr supports the "ADDREPLICA" and "DELETEREPLICA" actions in the Collections API, you can add another replica and then delete the one which is down (a rough sketch of the calls is below). regards, mathieu
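For reference, a rough sketch of the two Collections API calls; the host, collection, shard, node and replica names below are placeholders, adapt them to your cluster:

# add a new replica of shard1 on another node, then drop the dead replica
curl "http://solr-host:8983/solr/admin/collections?action=ADDREPLICA&collection=my_collection&shard=shard1&node=new-host:8983_solr"
curl "http://solr-host:8983/solr/admin/collections?action=DELETEREPLICA&collection=my_collection&shard=shard1&replica=core_node2"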
01-12-2018
09:18 AM
2 Kudos
You can use the PURGE option to delete the data files as well as the partition metadata, but it only works on INTERNAL/MANAGED tables:
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec PURGE;
External tables need a two-step process: alter table drop partition, then remove the files:
ALTER TABLE table_name DROP [IF EXISTS] PARTITION partition_spec;
hadoop fs -rm -r <partition file path>
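For illustration, a hedged example with a hypothetical table sales partitioned by dt (the table names, partition column and HDFS path are placeholders):

-- managed table: PURGE removes the files along with the partition metadata
ALTER TABLE sales DROP IF EXISTS PARTITION (dt='2017-11-01') PURGE;
-- external table: drop the partition metadata first, then remove the files yourself
ALTER TABLE sales_ext DROP IF EXISTS PARTITION (dt='2017-11-01');
hadoop fs -rm -r /data/sales_ext/dt=2017-11-01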
12-13-2017
12:05 AM
I think the keytab you used has expired. Try to kinit again with a fresh keytab for your code, and the issue should be solved.
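Something along these lines should work (the keytab path and principal are placeholders for your environment):

kinit -kt /path/to/user.keytab user@EXAMPLE.COM
# klist lets you verify the new ticket and its expiry date
klist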
11-29-2017
06:26 AM
Hi. Yes, the jobs are scheduled as Oozie coordinators. The problem is that the job description says the job will be executed at every even hour, but the job starts executing at every odd hour. Do you know any other way to fix this? It happened on our production environment, and "reinstalling" would take a lot of time, which could lead to downtime. Thank you in advance.
11-13-2017
11:31 AM
I continued the resolution of this issue in another thread specific to the error: "ls: Operation category READ is not supported in state standby". The solution is marked on that thread; a quick summary is that I needed to add the Failover Controller role to a node in my cluster, enable Automatic Failover, and then restart the cluster for it all to kick in.
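If you hit the same error, a quick way to check which NameNode is active is hdfs haadmin (assuming nn1 and nn2 are the NameNode IDs configured for your nameservice in hdfs-site.xml):

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2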
11-13-2017
09:14 AM
Great! Nice debugging.
11-06-2017
07:28 PM
@mathieu.d, thanks for the reply, will try this and let you know.
11-03-2017
06:53 AM
1 Kudo
The timestamp column is not "suitable" for a partition (unless you want thousands and thousands of partitions). What is suitable:
- create a Hive table on top of the current, non-partitioned data,
- create a second Hive table for hosting the partitioned data (the same columns + the partition column, see the sketch at the end of this post),
- then load the data from the first table into the second one with a query that "parses" the timestamp column and extracts a suitable value for the partition column (for example the year, or the year-and-month, ...).
Example:
INSERT INTO TABLE my_partitioned_table PARTITION (part_col_name) SELECT *, year(to_date(my_timestamp_column)) FROM my_not_partitioned_table;
You don't have to put the partition value in the INSERT statement if you enable dynamic partitioning in Hive:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
And your sample is not working properly because you didn't parse the timestamp column, you used it as is. Each unique value creates a partition, and for timestamps almost every value is unique.
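As an illustration of the two-table approach above, here is a hedged sketch of what the tables could look like (the column names and types other than my_timestamp_column are placeholders, adjust them to your schema):

CREATE TABLE my_not_partitioned_table (id INT, my_timestamp_column TIMESTAMP, payload STRING);
CREATE TABLE my_partitioned_table (id INT, my_timestamp_column TIMESTAMP, payload STRING)
PARTITIONED BY (part_col_name INT);
-- with dynamic partitioning enabled, the INSERT ... SELECT above fills part_col_name from year(to_date(my_timestamp_column))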