`filter_file_pattern` and `filter_file_rule` return a `Vec` because one path can yield multiple documents. In practice, though, most files contain only one language. Using `smallvec` would avoid a heap allocation in that common single-document case and could improve performance.
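
To illustrate the idea, here is a minimal std-only sketch of the "store one item inline, spill to the heap only when there are more" behavior that `SmallVec<[T; 1]>` from the `smallvec` crate provides. It is not the actual code of either function: `Doc`, `OneOrMany`, and `docs_for_langs` are hypothetical names made up for this example, and a real change would most likely use `smallvec` directly rather than a hand-rolled type.

```rust
// Hypothetical stand-in for a document produced from one path.
#[derive(Debug, PartialEq)]
struct Doc {
    lang: &'static str,
}

/// Keeps zero or one item inline and only allocates a Vec for two or more.
/// This mimics what `SmallVec<[T; 1]>` does, as an illustration only.
#[derive(Debug)]
enum OneOrMany<T> {
    Empty,
    One(T),
    Many(Vec<T>),
}

impl<T> OneOrMany<T> {
    fn new() -> Self {
        OneOrMany::Empty
    }

    fn push(&mut self, item: T) {
        // Temporarily take the current state so the stored item can be moved.
        *self = match std::mem::replace(self, OneOrMany::Empty) {
            OneOrMany::Empty => OneOrMany::One(item),
            // Second item: this is the only point where the heap is touched.
            OneOrMany::One(first) => OneOrMany::Many(vec![first, item]),
            OneOrMany::Many(mut items) => {
                items.push(item);
                OneOrMany::Many(items)
            }
        };
    }

    fn as_slice(&self) -> &[T] {
        match self {
            OneOrMany::Empty => &[],
            OneOrMany::One(item) => std::slice::from_ref(item),
            OneOrMany::Many(items) => items.as_slice(),
        }
    }

    fn len(&self) -> usize {
        self.as_slice().len()
    }

    /// True once the items have moved to a heap-allocated Vec.
    fn spilled(&self) -> bool {
        matches!(self, OneOrMany::Many(_))
    }
}

// Hypothetical helper: builds one document per language found in a path.
fn docs_for_langs(langs: &[&'static str]) -> OneOrMany<Doc> {
    let mut docs = OneOrMany::new();
    for &lang in langs {
        docs.push(Doc { lang });
    }
    docs
}
```

With this shape, a single-language file never allocates, while a file with embedded languages (for example HTML containing JavaScript) still works and simply falls back to a `Vec`.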