Member since: 07-16-2015
Posts: 177
Kudos Received: 28
Solutions: 19

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 9494 | 11-14-2017 01:11 AM |
| | 54548 | 11-03-2017 06:53 AM |
| | 3544 | 11-03-2017 06:18 AM |
| | 11690 | 09-12-2017 05:51 AM |
| | 1368 | 09-08-2017 02:50 AM |
01-16-2018
01:33 AM
Hi, it's been a while! If I remember correctly, we did not find any solution back then (with CDH 5.3.0), at least other than recreating the collection and re-indexing the data. But after upgrading CDH to a version whose Solr supports the "ADDREPLICA" and "DELETEREPLICA" actions in the Collections API, you can add another replica and then delete the one which is down. Regards, Mathieu
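For reference, the two Collections API calls would look roughly like this (the host, collection, shard and replica names below are placeholders; the dead replica's core_node name can be found with action=CLUSTERSTATUS):

    # Add a new replica of the affected shard on a healthy node
    curl "http://solr-host:8983/solr/admin/collections?action=ADDREPLICA&collection=my_collection&shard=shard1&node=healthy-node:8983_solr"
    # Once the new replica is active, delete the dead one
    curl "http://solr-host:8983/solr/admin/collections?action=DELETEREPLICA&collection=my_collection&shard=shard1&replica=core_node3"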
12-08-2017
03:09 AM
Hello, the ticket you acquire from the keytab has an expiry date and a maximum renewable lifetime. So if you see that error after a few days, it might just be that (the ticket expired or reached its maximum renewable lifetime). You need to handle these cases. Regards, Mathieu
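A common way to handle it is to re-run kinit from the keytab (for example from cron) before the ticket expires; the principal and keytab path below are placeholders:

    # Re-acquire a fresh ticket from the keytab
    kinit -kt /etc/security/keytabs/myapp.keytab myapp@EXAMPLE.COM
    # Check the ticket's "Expires" and "renew until" dates
    klist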
11-29-2017
02:53 AM
Hi, how are these jobs scheduled? If they use Oozie coordinators then it is more an Oozie issue, and I don't think Oozie handles daylight saving time well. I guess the workaround is to "reinstall" the coordinators. Regards, Mathieu
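Reinstalling a coordinator from the command line is basically a kill followed by a resubmit (the Oozie URL, coordinator ID and properties file below are placeholders):

    # Kill the existing coordinator
    oozie job -oozie http://oozie-host:11000/oozie -kill 0000001-171129000000000-oozie-oozi-C
    # Resubmit it from its job.properties (with corrected start/end times)
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run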
11-21-2017
07:58 AM
1 Kudo
This error (exit code 143) usually means that the container was killed because it tried to use more memory than configured:

    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143

A misconfiguration of YARN could lead to this. Check your configuration of container memory and task memory (map & reduce). Regards, Mathieu
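As an illustration (the values are arbitrary, and the driver is assumed to use ToolRunner so -D properties are honored), the map/reduce container sizes and their JVM heaps are the usual suspects:

    # Per-job override of task container memory; the -Xmx heap must stay below the container size
    hadoop jar my-job.jar MyDriver \
      -Dmapreduce.map.memory.mb=2048 \
      -Dmapreduce.map.java.opts=-Xmx1638m \
      -Dmapreduce.reduce.memory.mb=4096 \
      -Dmapreduce.reduce.java.opts=-Xmx3276m \
      /input /output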
11-16-2017
06:57 AM
For this particular case we have used HAProxy + Keepalived. The cluster shouldn't have to know which instance is active, but your load balancer needs to know.
11-15-2017
03:16 AM
With Cloudera I don't think it is supported. Check whether "slider" appears in the latest version of CDH, but I don't think so.
11-14-2017
01:11 AM
Hi, well, for deleting corrupted blocks there is an option on the hdfs fsck command: add the option "-delete" and it should delete all corrupted (or missing) files. You might need to leave safe mode before deleting the corrupted files. If you want to "restore" them instead, try to follow the guidance here: https://stackoverflow.com/questions/19205057/how-to-fix-corrupt-hdfs-files Most corrupted files cannot be restored. Regards, Mathieu
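A minimal sequence would be (run as a user with HDFS superuser rights):

    # List the corrupted/missing files first
    hdfs fsck / -list-corruptfileblocks
    # Leave safe mode if the namenode is still in it
    hdfs dfsadmin -safemode leave
    # Delete the corrupted files (their data is lost)
    hdfs fsck / -delete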
11-13-2017
09:14 AM
Great! Nice debugging.
11-10-2017
01:11 AM
First: why do you suspect that HiveServer2 needs to be up and running for Impala queries to work? Did you observe that Impala queries worked while HiveServer2 was running? By the way, the Hive Metastore is running, right? Second: you should go look at the Hue server log files; you should see a more useful error there. Regards, Mathieu
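Assuming a default CDH layout (the path may differ on your install), tailing the Hue error log while reproducing the query usually shows the real cause:

    # Watch the Hue server error log while re-running the failing Impala query
    tail -f /var/log/hue/error.log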
11-06-2017
02:52 AM
Is the log you showed correlated with an observed restart of the agent? If yes, I would investigate this "flood" service that seems to restart constantly. A possible cause of a never-ending restart loop: out of memory > the agent kills the service > the agent restarts the service > out of memory > repeat. Regards, Mathieu
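Assuming the default log location, the Cloudera Manager agent log should show the kill/restart loop for that role:

    # Watch the agent log for repeated restarts of the flooding role (default path)
    tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log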
11-03-2017
10:15 AM
For the HDFS command, try explicitly targeting the active namenode: hdfs dfs -ls hdfs://host:8020/
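If you are not sure which one is active, you can ask HDFS directly (nn1/nn2 are the default namenode service IDs and may differ in your configuration):

    # Show which namenode currently holds the active role
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2
    # Then target the active one explicitly
    hdfs dfs -ls hdfs://active-nn-host:8020/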
11-03-2017
08:39 AM
1 Kudo
Before fixing the situation, I would try to start only one namenode (the one with data in its directory). It should be considered the active namenode if it is alone, as long as it can start successfully.
11-03-2017
06:53 AM
1 Kudo
The timestamp column is not "suitable" for a partition (unless you want thousands and thousands of partitions). What is suitable:
- create a Hive table on top of the current, non-partitioned data,
- create a second Hive table for hosting the partitioned data (the same columns + the partition column),
- finally, load the data from the first table into the second one using a query that "parses" the timestamp column and extracts a suitable value for the partition column (for example the year, or the year-and-month, ...).

Example:

    INSERT INTO TABLE my_partitioned_table PARTITION (part_col_name) SELECT *, year(to_date(my_timestamp_column)) FROM my_not_partitioned_table;

You don't have to put the partition value in the INSERT statement if you enable dynamic partitioning in Hive:

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;

And your sample is not working properly because you didn't parse the timestamp column, you used it as is. Each unique value creates a partition, and for timestamps almost every value is unique.
11-03-2017
06:45 AM
1 Kudo
Did you check the supervisor log?
11-03-2017
06:43 AM
This issue just means that your shell action has exited with an error code (different from 0). If you want to know the reason, you need to add logging inside the shell script to find out what happened. Be aware that the script executes locally on a data-node; any log file the script writes will be on that particular data-node.
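One simple approach (a sketch, with made-up paths) is to write to stdout/stderr, since those streams are captured in the YARN container logs of the shell action:

    #!/bin/bash
    # Fail fast and trace each step so the container log explains any non-zero exit code
    set -e
    echo "step 1: checking input directory" >&2
    hdfs dfs -test -d /tmp/my_input || { echo "input dir missing" >&2; exit 1; }
    echo "step 1 done" >&2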
11-03-2017
06:38 AM
Alternatively you could look into "YARN queues" and resource allocation. This will not "restrict" the number of mappers or reducers, but it will control how many can run concurrently by giving the job access to only a subset of the available resources.
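For example, once a dedicated queue/pool exists (the name below is a placeholder, and the driver is assumed to use ToolRunner so -D is honored), submitting the job to it caps how much of the cluster it can use at once:

    # Submit the job to a small dedicated YARN queue
    hadoop jar my-job.jar MyDriver -Dmapreduce.job.queuename=small_pool /input /output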
11-03-2017
06:31 AM
First: save the namenode dir content. Second: can you launch only the second namenode? Does it start? If yes, you should be able to start the data-nodes and get access to the data.
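Saving the content can be as simple as archiving the directory while the namenode is stopped (the path below is a placeholder; use the dfs.namenode.name.dir value from your configuration):

    # Back up the namenode metadata directory before changing anything
    tar -czf /root/namenode-dir-backup-$(date +%F).tar.gz /dfs/nn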
11-03-2017
06:27 AM
CM is composed of the CM server and the CM agents (one agent per node managed by CM). When you use the UI to restart a service, it will:
- take the command at the CM server level,
- dispatch the command to the correct host(s) (the CM server asks the relevant CM agent(s) to execute the task),
- the task is then executed by the relevant CM agent(s).
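Both pieces run as ordinary system services, so on a systemd host you can check them like this (service names are the defaults):

    # On the Cloudera Manager host
    sudo systemctl status cloudera-scm-server
    # On every managed node
    sudo systemctl status cloudera-scm-agent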
11-03-2017
06:18 AM
1 Kudo
Hi, the concept of a Hive partition does not map to HBase tables, so if you want HBase as the storage you will need to work around your use case. You could try to use one HBase table with a row key constructed from the partition value; that way you can query your HBase table using the row key and avoid a full scan of the table. Or you could have one HBase table per "partition" (which also means one Hive table per partition). Or you could conclude that HBase does not answer your need and stay in Hive? Regards, Mathieu
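As an illustration of the first option (the table name and key layout are made up): if the row key starts with the "partition" value, a prefix scan reads only that slice instead of the whole table:

    # Row keys like "2017-11|order123" allow a per-month prefix scan
    echo "scan 'my_table', {ROWPREFIXFILTER => '2017-11'}" | hbase shell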
11-03-2017
06:11 AM
Cloudera Search provides a utility for administering Solr (solrctl). https://www.cloudera.com/documentation/enterprise/latest/topics/search_solrctl_ref.html One of its commands uploads a collection's configuration files into ZooKeeper: "instancedir --create".
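A typical sequence looks like this (the config and collection names are placeholders):

    # Generate a template configuration directory locally
    solrctl instancedir --generate $HOME/my_config
    # Upload it into ZooKeeper under the name "my_config"
    solrctl instancedir --create my_config $HOME/my_config
    # Create a collection that uses that configuration
    solrctl collection --create my_collection -s 2 -c my_config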
10-25-2017
02:57 AM
I think what you are looking for is a configuration located inside the "core-site.xml" file (in the HDFS configuration). Search for "proxyuser" in the Cloudera documentation. Regards, Mathieu
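The properties follow the hadoop.proxyuser.<user>.hosts / hadoop.proxyuser.<user>.groups pattern; to see what is already deployed on a client host (the path below is the usual client config location):

    # Show existing proxyuser entries in the deployed client configuration
    grep -A1 "hadoop.proxyuser" /etc/hadoop/conf/core-site.xml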
10-25-2017
02:51 AM
For Impala I don't know, but for Hive yes, that was the case when I last tested. You need to give proper permissions to the "hive" user in HBase, since all access from Hive to HBase (a Hive table using the HBase storage handler) is performed as the "hive" user. You can then handle end-user permissions on the Hive side for this use case. Regards, Mathieu
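Granting those permissions is done in the HBase shell, roughly like this (the table name is a placeholder; scope the grant to what you actually need):

    # Give the hive user read/write access to the backing HBase table
    echo "grant 'hive', 'RW', 'my_hbase_table'" | hbase shell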
10-25-2017
02:44 AM
Hi, unless it has changed since I last used Sentry (which is possible), it is a little different from how Ranger works. In Ranger you can explicitly define security rules for HDFS. In Sentry, there is a plugin that synchronizes the Hive/Impala security rules with HDFS ACLs (on a list of HDFS directories). What does that mean?
- If you grant "SELECT" permission on a table to a group, it will give "read" permission on HDFS on the folder of that table.
- If you grant "INSERT" permission on a database to a group, it will give "write" permission on HDFS on the root folder of the database.
- etc.
https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hdfs_sentry_sync.html Regards, Mathieu
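You can observe the result of that synchronization directly on HDFS (the warehouse path below is the CDH default and may differ):

    # Show the ACLs the Sentry HDFS sync plugin maintains on a table directory
    hdfs dfs -getfacl /user/hive/warehouse/my_db.db/my_table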
10-13-2017
12:47 AM
Hi, A "Connection Refused" error message can means two things mainly : - There is no service behind the "IP:PORT" you have specified (wrong ip, wrong port, the service is currently down, ...) - There is a firewall blocking the connection. It doesn't seem to be related to running the job as hbase user (or some other user).
09-19-2017
08:55 AM
My question 1: is there any way to solve the problem that asks me for a password when I use the command "su hdfs", because I have no idea what the default password is?
>> Ask your administrator. I'm not sure he will be willing to give you that.
My question 2: I did some research, and I have seen people say that the lack of a default password is on purpose, because common users shouldn't have permission to log in as hdfs. So if I'm a common user called "permission" and I don't have any directory in HDFS (as in the screenshot), but I want to create a directory just like the other users, what can I do?
>> Ask the administrator of the platform to provide you a directory with the correct permissions so that you can work in it.
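On the administrator side, that typically amounts to something like this (the user name is taken from the question; the /user/<name> path is the usual convention):

    # As the HDFS superuser: create the user's home directory and hand it over
    sudo -u hdfs hdfs dfs -mkdir -p /user/permission
    sudo -u hdfs hdfs dfs -chown permission:permission /user/permission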
09-12-2017
06:03 AM
I don't know if such a function exists, but if not, you can create your own UDF to do this quickly. I do agree that you should not use a "CASE" expression in the query. Too complex 🙂
09-12-2017
05:58 AM
Never heard of that. I would guess the architect didn't bother placing them explicitly, or just forgot them. Regards, Mathieu
09-12-2017
05:51 AM
Not sure this information is available. You could go with the "yarn logs" command, or go the basic way using the command line:
- pdsh to run the same command on every data-node,
- a find on the container ID.
Both are sketched below. Regards, Mathieu
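Something along these lines (the application/container IDs, node list and log directory are placeholders):

    # Pull the aggregated logs for the whole application
    yarn logs -applicationId application_1504000000000_0042
    # Or look for the container's log directory on every data-node
    pdsh -w dn[01-20] "find /yarn/container-logs -maxdepth 2 -name '*_0042_01_000003*'"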
09-12-2017
05:39 AM
Hi, 1/ These articles describe some useful information about the rowkey and its design:
- http://archive.cloudera.com/cdh5/cdh/5/hbase-0.98.6-cdh5.3.8/book/rowkey.design.html
- https://www.linkedin.com/pulse/performance-tuning-hbase-part-1-rowkey-crux-kuldeep-deshpande/
2/ If you need to query your data in HBase by a cell's value, it will be totally inefficient. Cloudera Search can help you in these cases, but you will need to index the data into Cloudera Search.
3/ Well, I don't have the answer to that, but I would recommend you stick to querying by the row key only if you need some performance. Hope this helps.
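To illustrate point 2/ (the table name and values are made up): a get by row key touches a single row, while filtering on a cell value forces a full table scan:

    # Fast: direct lookup by row key
    echo "get 'my_table', 'row-000123'" | hbase shell
    # Slow: every row is read just to evaluate the value filter
    echo "scan 'my_table', {FILTER => \"ValueFilter(=, 'binary:some_value')\"}" | hbase shell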
09-08-2017
02:50 AM
I believe this wait time of 30 seconds is hard-coded into the Cloudera agent. I don't think we can alter it other than with a really dirty modification, which I wouldn't recommend. Regards, Mathieu