Cannot read Hive-partitioned Parquet files with s3 and the RawBlob and One formats #87515

@alsugiliazova

Description

Company or project name

No response

Describe what's wrong

Starting from 25.8, I cannot read Hive-partitioned Parquet files through the s3 table function with the RawBlob or One formats when use_hive_partitioning = 1.

SELECT *
FROM s3('http://minio:9000/warehouse/data/**/**.parquet', 'admin', 'password', 'RawBlob')
SETTINGS use_hive_partitioning = 1
Code: 36. DB::Exception: Received from localhost:9000. DB::Exception: This input format is only suitable for tables with a single column of type String but the number of columns is 2: While executing S3(_table_function.s3)ReadStep. (BAD_ARGUMENTS)

Does it reproduce on the most recent release?

Yes

How to reproduce

  1. Put a Hive-partitioned Parquet file into object storage.
  2. Run the following query:
SELECT *
FROM s3('http://minio:9000/warehouse/data/**/**.parquet', 'admin', 'password', 'One')
SETTINGS use_hive_partitioning = 1
Code: 36. DB::Exception: Received from localhost:9000. DB::Exception: One input format is only suitable for tables with a single column of type UInt8 but the number of columns is 2: While executing S3(_table_function.s3)ReadStep. (BAD_ARGUMENTS)

With use_hive_partitioning = 0, the same query works correctly:

SELECT *
FROM s3('http://minio:9000/warehouse/data/**/**.parquet', 'admin', 'password', 'One')
SETTINGS use_hive_partitioning = 0
    ┌─dummy─┐
 1. │     0 │
 2. │     0 │
 3. │     0 │
 4. │     0 │
 5. │     0 │
 6. │     0 │
 7. │     0 │
 8. │     0 │
 9. │     0 │
10. │     0 │
11. │     0 │
12. │     0 │
13. │     0 │
14. │     0 │
    └───────┘

14 rows in set. Elapsed: 0.019 sec. 

Expected behavior

The query runs without an exception and returns the correct result, as it does with use_hive_partitioning = 0.

Error message and/or stacktrace

[clickhouse1] 2025.09.23 16:51:14.119734 [ 41 ] {2932d212-ada5-45ed-9525-4927be0a64f0} <Error> executeQuery: Code: 36. DB::Exception: This input format is only suitable for tables with a single column of type String but the number of columns is 2: While executing ReadFromObjectStorage. (BAD_ARGUMENTS) (version 25.8.4.13 (official build)) (from 127.0.0.1:45920) (query 1, line 1) (in query: SELECT * FROM s3('http://minio:9000/warehouse/data/**/**.parquet', 'admin', '[HIDDEN]', 'RawBlob') SETTINGS use_hive_partitioning = 1), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x00000000133aa85f
1. DB::Exception::Exception(String&&, int, String, bool) @ 0x000000000c8559ce
2. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000c855480
3. DB::Exception::Exception<unsigned long>(int, FormatStringHelperImpl<std::type_identity<unsigned long>::type>, unsigned long&&) @ 0x000000000e293cab
4. std::shared_ptr<DB::IInputFormat> std::__function::__policy_invoker<std::shared_ptr<DB::IInputFormat> (DB::ReadBuffer&, DB::Block const&, DB::RowInputFormatParams const&, DB::FormatSettings const&)>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<DB::registerInputFormatRawBLOB(DB::FormatFactory&)::$_0, std::shared_ptr<DB::IInputFormat> (DB::ReadBuffer&, DB::Block const&, DB::RowInputFormatParams const&, DB::FormatSettings const&)>>(std::__function::__policy_storage const*, DB::ReadBuffer&, DB::Block const&, DB::RowInputFormatParams const&, DB::FormatSettings const&) (.llvm.7030994725452934287) @ 0x0000000019d46bab
5. DB::FormatFactory::getInput(String const&, DB::ReadBuffer&, DB::Block const&, std::shared_ptr<DB::Context const> const&, unsigned long, std::optional<DB::FormatSettings> const&, std::shared_ptr<DB::FormatParserSharedResources>, std::shared_ptr<DB::FormatFilterInfo>, bool, DB::CompressionMethod, bool) const @ 0x0000000019a69271
6. DB::StorageObjectStorageSource::createReader(unsigned long, std::shared_ptr<DB::IObjectIterator> const&, std::shared_ptr<DB::StorageObjectStorageConfiguration> const&, std::shared_ptr<DB::IObjectStorage> const&, DB::ReadFromFormatInfo&, std::optional<DB::FormatSettings> const&, std::shared_ptr<DB::Context const> const&, DB::SchemaCache*, std::shared_ptr<Poco::Logger> const&, unsigned long, std::shared_ptr<DB::FormatParserSharedResources>, std::shared_ptr<DB::FormatFilterInfo>, bool) @ 0x0000000016333fd3
7. DB::StorageObjectStorageSource::generate() @ 0x0000000016331334
8. DB::ISource::tryGenerate() @ 0x0000000019abf0de
9. DB::ISource::work() @ 0x0000000019abec76
10. DB::ExecutionThreadContext::executeTask() @ 0x0000000019add642
11. DB::PipelineExecutor::executeStepImpl(unsigned long, DB::IAcquiredSlot*, std::atomic<bool>*) @ 0x0000000019acf7d0
12. void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads(std::shared_ptr<DB::IAcquiredSlot>)::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x0000000019ad3583
13. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::worker() @ 0x0000000013508aab
14. void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000001350fe26
15. ThreadPoolImpl<std::thread>::ThreadFromThreadPool::worker() @ 0x0000000013505a92
16. void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x000000001350d55a
17. ? @ 0x0000000000094ac3
18. ? @ 0x0000000000126850

Additional context

Just in case, this is what the data looks like when read without an explicit format argument:

SELECT *
FROM s3('http://minio:9000/warehouse/data/**/**.parquet', 'admin', 'password')
    ┌─name────┬─double─┬─integer─┐
 1. │ Grace   │  56.78 │      80 │
 2. │ Charlie │  67.89 │      40 │
 3. │ Frank   │  12.34 │      70 │
 4. │ Ivan    │  34.56 │     100 │
 5. │ Karl    │  23.45 │     120 │
 6. │ Heidi   │  90.12 │      90 │
 7. │ Mallory │  11.12 │     140 │
 8. │ Eve     │  89.01 │      60 │
 9. │ Alice   │ 195.23 │      20 │
10. │ Leo     │  67.89 │     130 │
11. │ Judy    │   78.9 │     110 │
12. │ Nina    │  34.56 │     150 │
13. │ David   │  45.67 │      50 │
14. │ Bob     │ 123.45 │      30 │
    └─────────┴────────┴─────────┘

14 rows in set. Elapsed: 0.020 sec. 
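
To illustrate what "hive partitioned" means for the object keys: directory segments of the form key=value carry column values in the path itself. The sketch below is plain Python, not ClickHouse's implementation. The object key, the column name `region`, and the helper `parse_hive_partitions` are hypothetical, since the real layout under warehouse/data/ is not shown in this report. If each such segment is added as a column next to the single column that RawBlob or One expect, the count would reach 2 as in the error message, but that is a guess at the cause, not a confirmed diagnosis.

```python
from urllib.parse import unquote


def parse_hive_partitions(object_key: str) -> dict[str, str]:
    """Collect key=value directory segments from a Hive-style object key."""
    partitions: dict[str, str] = {}
    # The last segment is the file name, so only directory segments are inspected.
    for segment in object_key.split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            # Values may be URL-encoded in the path; decode them for display.
            partitions[key] = unquote(value)
    return partitions


# Hypothetical key: the actual object names are not part of this report.
example_key = "warehouse/data/region=eu/part-0.parquet"
partition_columns = parse_hive_partitions(example_key)
print(partition_columns)

# One format column (e.g. `dummy` for One) plus the path-derived columns.
print(1 + len(partition_columns))
```

With one partition directory in the key, the second print gives 2, which matches the "number of columns is 2" wording in both exceptions above.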

Metadata

Assignees

No one assigned

    Labels

    potential bug: To be reviewed by developers and confirmed/rejected.
