[Feature](iceberg) Support Iceberg JDBC Catalog #59502
base: master
Pull request overview
This PR adds support for the Iceberg JDBC Catalog, which stores Iceberg metadata in a relational database (PostgreSQL, MySQL, SQLite) as an alternative to HMS or other catalog types. The implementation follows the pattern used by the existing Iceberg catalog types and plugs into Doris's catalog framework through the same factory registration and serialization hooks.
Key Changes:
- Added JDBC catalog type with comprehensive property handling and S3-compatible storage support
- Integrated JDBC catalog into factory registration and serialization infrastructure
- Added unit tests for property validation and JDBC-specific configuration passthrough
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| IcebergJdbcMetaStoreProperties.java | Core implementation handling JDBC catalog properties, URI configuration, and storage integration |
| IcebergJdbcExternalCatalog.java | External catalog class following standard pattern for JDBC catalog type |
| IcebergJdbcMetaStorePropertiesTest.java | Unit tests validating property handling, passthrough, and required field checks |
| IcebergPropertiesFactory.java | Registered "jdbc" catalog type in the factory |
| IcebergExternalCatalogFactory.java | Added JDBC catalog creation in the factory switch statement |
| GsonUtils.java | Registered IcebergJdbcExternalCatalog for JSON serialization |
| IcebergScanNode.java | Added JDBC catalog type to supported scan sources |
| IcebergExternalCatalog.java | Added ICEBERG_JDBC constant definition |
```java
private static void toFileIOProperties(List<StorageProperties> storagePropertiesList,
        Map<String, String> fileIOProperties, Configuration conf) {
    for (StorageProperties storageProperties : storagePropertiesList) {
        if (storageProperties instanceof AbstractS3CompatibleProperties) {
            toS3FileIOProperties((AbstractS3CompatibleProperties) storageProperties, fileIOProperties);
        } else if (storageProperties.getHadoopStorageConfig() != null) {
            conf.addResource(storageProperties.getHadoopStorageConfig());
        }
    }
}

private static void toS3FileIOProperties(AbstractS3CompatibleProperties s3Properties,
        Map<String, String> options) {
    // Set S3FileIO as the FileIO implementation for S3-compatible storage
    options.put(CatalogProperties.FILE_IO_IMPL, "org.apache.iceberg.aws.s3.S3FileIO");

    if (StringUtils.isNotBlank(s3Properties.getEndpoint())) {
        options.put(S3FileIOProperties.ENDPOINT, s3Properties.getEndpoint());
    }
    if (StringUtils.isNotBlank(s3Properties.getUsePathStyle())) {
        options.put(S3FileIOProperties.PATH_STYLE_ACCESS, s3Properties.getUsePathStyle());
    }
    if (StringUtils.isNotBlank(s3Properties.getRegion())) {
        options.put(AwsClientProperties.CLIENT_REGION, s3Properties.getRegion());
    }
    if (StringUtils.isNotBlank(s3Properties.getAccessKey())) {
        options.put(S3FileIOProperties.ACCESS_KEY_ID, s3Properties.getAccessKey());
    }
    if (StringUtils.isNotBlank(s3Properties.getSecretKey())) {
        options.put(S3FileIOProperties.SECRET_ACCESS_KEY, s3Properties.getSecretKey());
    }
    if (StringUtils.isNotBlank(s3Properties.getSessionToken())) {
        options.put(S3FileIOProperties.SESSION_TOKEN, s3Properties.getSessionToken());
    }
}
```
Copilot
AI
Dec 31, 2025
The methods toFileIOProperties and toS3FileIOProperties are duplicated from IcebergRestProperties with only minor differences (the problematic FILE_IO_IMPL line). Consider extracting these common methods to the AbstractIcebergProperties base class to eliminate code duplication and ensure consistency across all Iceberg catalog implementations. This would make maintenance easier and prevent similar issues in the future.
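A minimal sketch of what the suggested extraction could look like, written without Doris or Iceberg on the classpath so it compiles on its own. `S3Settings` is a hypothetical stand-in for `AbstractS3CompatibleProperties`, and `SharedS3FileIOMapping` stands in for methods hoisted into `AbstractIcebergProperties`; neither name comes from the PR. The string keys are the values behind the Iceberg constants used in the diff (`S3FileIOProperties.ENDPOINT` and so on), written as literals here only to keep the example self-contained. The sketch deliberately omits the `FILE_IO_IMPL` line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for AbstractS3CompatibleProperties: only the fields the diff reads.
final class S3Settings {
    final String endpoint;
    final String usePathStyle;
    final String region;
    final String accessKey;
    final String secretKey;
    final String sessionToken;

    S3Settings(String endpoint, String usePathStyle, String region,
               String accessKey, String secretKey, String sessionToken) {
        this.endpoint = endpoint;
        this.usePathStyle = usePathStyle;
        this.region = region;
        this.accessKey = accessKey;
        this.secretKey = secretKey;
        this.sessionToken = sessionToken;
    }
}

// Hypothetical shared helper; in the PR this logic would live in a common base class
// so that the REST and JDBC property classes call the same code.
final class SharedS3FileIOMapping {
    // Same intent as StringUtils.isNotBlank: null, empty, and whitespace-only count as blank.
    static boolean isNotBlank(String value) {
        return value != null && !value.trim().isEmpty();
    }

    static void putIfNotBlank(Map<String, String> options, String key, String value) {
        if (isNotBlank(value)) {
            options.put(key, value);
        }
    }

    // Copies S3 connection settings into Iceberg FileIO options.
    // Note: no "io-impl" entry is written here.
    static void toS3FileIOProperties(S3Settings s3, Map<String, String> options) {
        putIfNotBlank(options, "s3.endpoint", s3.endpoint);
        putIfNotBlank(options, "s3.path-style-access", s3.usePathStyle);
        putIfNotBlank(options, "client.region", s3.region);
        putIfNotBlank(options, "s3.access-key-id", s3.accessKey);
        putIfNotBlank(options, "s3.secret-access-key", s3.secretKey);
        putIfNotBlank(options, "s3.session-token", s3.sessionToken);
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder values for illustration only.
        toS3FileIOProperties(
                new S3Settings("http://minio:9000", "true", "us-east-1", "ak", "sk", null),
                options);
        System.out.println(options);
    }
}
```

With one shared method, a fix such as dropping the `FILE_IO_IMPL` line would only need to be made once rather than per catalog type.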
```java
// Set S3FileIO as the FileIO implementation for S3-compatible storage
options.put(CatalogProperties.FILE_IO_IMPL, "org.apache.iceberg.aws.s3.S3FileIO");
```
Copilot
AI
Dec 31, 2025
Setting FILE_IO_IMPL unconditionally to S3FileIO when S3 storage is detected may break functionality for non-S3 storage types. Looking at IcebergRestProperties, the toS3FileIOProperties method does not set FILE_IO_IMPL and relies on Iceberg's automatic FileIO selection via CatalogUtil.buildIcebergCatalog. This line should be removed to maintain consistency with other catalog implementations and allow Iceberg to handle FileIO selection automatically based on the storage properties provided.
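One concrete consequence of an unconditional `put` can be shown with plain maps. The sketch below assumes, purely for illustration, that user-supplied catalog properties are copied into the options map before the S3 mapping runs; the PR does not state the actual ordering. `FileIoImplOverrideDemo` and its method names are hypothetical. `"io-impl"` is the value of `CatalogProperties.FILE_IO_IMPL`.

```java
import java.util.HashMap;
import java.util.Map;

final class FileIoImplOverrideDemo {
    static final String FILE_IO_IMPL = "io-impl";
    static final String S3_FILE_IO = "org.apache.iceberg.aws.s3.S3FileIO";

    // Mirrors the line under review: io-impl is always overwritten with S3FileIO.
    static Map<String, String> withForcedFileIO(Map<String, String> userProps) {
        Map<String, String> options = new HashMap<>(userProps);
        options.put(FILE_IO_IMPL, S3_FILE_IO);
        return options;
    }

    // Variant with the line removed: whatever the user set (or did not set) is preserved.
    static Map<String, String> withoutForcedFileIO(Map<String, String> userProps) {
        return new HashMap<>(userProps);
    }

    public static void main(String[] args) {
        Map<String, String> userProps = new HashMap<>();
        // Example of an explicit user choice; the value is illustrative.
        userProps.put(FILE_IO_IMPL, "org.apache.iceberg.io.ResolvingFileIO");

        System.out.println("forced:   " + withForcedFileIO(userProps).get(FILE_IO_IMPL));
        System.out.println("unforced: " + withoutForcedFileIO(userProps).get(FILE_IO_IMPL));
    }
}
```

In the forced variant the user's explicit `io-impl` is silently replaced; in the other it survives, and when the key is absent the catalog falls back to whatever default it applies on its own.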
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Currently, Doris supports multiple Iceberg catalog types (HMS, REST, Hadoop, Glue, DLF, S3Tables) but lacks support for JDBC Catalog. This PR adds support for Iceberg JDBC Catalog, which allows users to store Iceberg metadata in relational databases like PostgreSQL, MySQL, and SQLite.
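To make the shape of a JDBC catalog configuration concrete, here is a hedged sketch of the option map that Iceberg's own `JdbcCatalog` expects: `uri` for the JDBC connection string, `warehouse` for the data location, and `jdbc.`-prefixed keys that are handed to the JDBC driver. The class `JdbcCatalogOptionsSketch`, its method, and all values are hypothetical; the property names Doris exposes to users in `CREATE CATALOG` are not shown in this PR excerpt and may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JdbcCatalogOptionsSketch {
    // Builds Iceberg-level catalog options for a JDBC-backed catalog.
    // Assumption: a missing URI is rejected up front, similar in spirit to the
    // "required field checks" the PR's unit tests are described as covering.
    static Map<String, String> build(String uri, String warehouse, String user, String password) {
        if (uri == null || uri.trim().isEmpty()) {
            throw new IllegalArgumentException("uri is required for an Iceberg JDBC catalog");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("type", "jdbc");          // Iceberg catalog type shorthand
        options.put("uri", uri);              // JDBC connection string
        if (warehouse != null && !warehouse.trim().isEmpty()) {
            options.put("warehouse", warehouse);
        }
        // Keys with the "jdbc." prefix are passed through to the JDBC driver.
        if (user != null) {
            options.put("jdbc.user", user);
        }
        if (password != null) {
            options.put("jdbc.password", password);
        }
        return options;
    }

    public static void main(String[] args) {
        // Placeholder connection details for illustration only.
        Map<String, String> options = build(
                "jdbc:postgresql://localhost:5432/iceberg_meta",
                "s3://example-bucket/warehouse",
                "iceberg",
                "secret");
        options.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```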
Key Changes:
- IcebergJdbcMetaStoreProperties class to handle JDBC catalog configurations
- IcebergJdbcExternalCatalog class for JDBC catalog operations
Benefits:
Release note
Features
Check List (For Author)
Manual Test Steps:
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)