Generally, if a file has many small stripes with many columns, or only a few columns but very large values in a string column, the stored metadata (column statistics) can grow beyond the protobuf message size limit.
In ORC, index data helps filter out stripes using the min/max of each column.
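As a rough illustration of that min/max filtering, here is a minimal Python sketch. It is not ORC's reader code; `stripes_to_read` and the `(min, max)` tuples are hypothetical names used only to show the idea that a stripe can be skipped when its value range cannot match the query filter.

```python
from typing import List, Tuple


def stripes_to_read(stripe_ranges: List[Tuple[int, int]], lo: int, hi: int) -> List[int]:
    """Return the indexes of stripes whose [min, max] range overlaps [lo, hi].

    Hypothetical helper: stripe_ranges holds one (min, max) pair per stripe,
    standing in for the per-stripe column statistics.
    """
    selected = []
    for index, (col_min, col_max) in enumerate(stripe_ranges):
        # A stripe can only contain matching rows if its range overlaps the filter range.
        if col_max >= lo and col_min <= hi:
            selected.append(index)
    return selected


# Three stripes with disjoint value ranges; a filter of 12..15 only needs the second one.
print(stripes_to_read([(1, 10), (11, 20), (21, 30)], 12, 15))
```

Without min/max statistics (as with a binary column), every stripe would have to be read for such a filter.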
For a string column, the min and max values and the sum of the value lengths are recorded in the metadata. With huge values, the min/max entries grow larger than the protobuf max limit, which is what throws this exception.
For a binary column, only the total length of all the binary values is recorded in the metadata. This will not exceed the protobuf max limit, but there are also no min/max values to filter stripes with.
If the column values are huge and queries will not filter on those columns, it's better to use the binary data type, since it stores less information in the metadata.
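To make the size difference concrete, here is a small Python sketch that models the two statistics layouts described above. It is only an illustration under stated assumptions: the class names and `approx_bytes` are hypothetical, the 8 bytes charged for the length sum is an arbitrary fixed cost, and it assumes the writer keeps the full min/max values rather than truncating them. It does not reproduce ORC's actual protobuf encoding.

```python
from dataclasses import dataclass
from typing import List

SUM_FIELD_BYTES = 8  # assumed fixed cost for the integer length sum (illustrative)


@dataclass
class StringStats:
    minimum: str
    maximum: str
    sum_of_lengths: int

    def approx_bytes(self) -> int:
        # The min and max are kept as whole values, so their size tracks the data size.
        return (len(self.minimum.encode("utf-8"))
                + len(self.maximum.encode("utf-8"))
                + SUM_FIELD_BYTES)


@dataclass
class BinaryStats:
    total_length: int

    def approx_bytes(self) -> int:
        # Only a single length total is kept, regardless of how large the values are.
        return SUM_FIELD_BYTES


def string_stats(values: List[str]) -> StringStats:
    lengths = sum(len(v.encode("utf-8")) for v in values)
    return StringStats(min(values), max(values), lengths)


def binary_stats(values: List[bytes]) -> BinaryStats:
    return BinaryStats(sum(len(v) for v in values))


# Two large values: the string statistics grow with them, the binary ones do not.
texts = ["a" * 1000, "b" * 2000]
print(string_stats(texts).approx_bytes())
print(binary_stats([t.encode("utf-8") for t in texts]).approx_bytes())
```

In this model the string statistics cost about as much as the smallest and largest values combined, per stripe and per column, while the binary statistics stay constant. That is the trade-off: smaller metadata in exchange for losing min/max stripe filtering on that column.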