Skip to content

Conversation

@cccs-cat001
Copy link
Contributor

Here is a preliminary pass at fixing #3038, I've added in an option to have the sts use assumeRoleWithWebIdentity and pass along the user credentials. This will allow the STS to know who is asking for credentials instead of the request coming from some shared service credentials.

We've found that using our on-prem S3 solution that the way Polaris accesses the STS results in connectivity errors, and the recommended way of accessing the STS is with a web identity token (A.K.A. access token), so adding this optional feature will result in Polaris being usable by more S3 appliances than just AWS :)

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

cccs-cat001 and others added 30 commits October 24, 2025 08:16
…ting/DefaultMetricsReporter.java

Co-authored-by: Alexandre Dutra <[email protected]>
…ting/DefaultMetricsReporter.java

Co-authored-by: Alexandre Dutra <[email protected]>
…test/PolarisRestCatalogIntegrationBase.java

Co-authored-by: Alexandre Dutra <[email protected]>
@dimas-b dimas-b requested a review from adutra November 28, 2025 17:31
@dimas-b
Copy link
Contributor

dimas-b commented Nov 28, 2025

@adutra : could you double check quarkus auth integration changes (they LGTM)

@cccs-cat001 cccs-cat001 requested a review from dimas-b November 28, 2025 18:39
@dimas-b
Copy link
Contributor

dimas-b commented Dec 2, 2025

@cccs-cat001 : please resolve conflicts... otherwise CI does not run 🤷

Copy link
Contributor

@adnanhemani adnanhemani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the response here, I'm afraid I will be a -1 on this PR.

If the user has the ability to communicate with the "STS" service directly and bypass any access controls rules that the Polaris server has set, then Polaris should not be in the business of vending credentials to this client. By creating this approach, we are introducing a security loophole that is completely against Polaris' security modelling today.

If there is a dire need to use such Polaris authentication tokens with an on-prem "AWS-compatible" service using only the user's tokens, then the client should be modified to talk to STS directly to gain credentials without Polaris' involvement. IMO Polaris should not be involved in abetting ANY security loopholes.

Or we can look towards creating a similar security model as we have today whereby using the Polaris service credential or setting Polaris up as an intermediate token broker which the "STS" service trusts, a new credential can be minted for the client to access the storage layer.

(This is not a new problem that we are setting out to solve - look into IAM Roles for Service Accounts (IRSA) used by AWS EKS, which at its core, solves a very similar base problem. Trying to fit a simplistic solution to quick-solve a problem for a system we don't even fully support is a very dangerous precedence to set. Speeding through a solution for functionality is one thing, but to risk security postures while doing that is a whole different situation.)

@adutra
Copy link
Contributor

adutra commented Dec 3, 2025

@adutra : could you double check quarkus auth integration changes (they LGTM)

I find the changes in this PR a bit invasive tbh. I would rather explore Quarkus OIDC token propagation techniques, and more specifically, I would try to inject io.quarkus.oidc.client.Tokens wherever necessary:

https://quarkus.io/guides/security-openid-connect-client-reference#inject-tokens

But I don't want to delay this PR that is extremely useful. I think we can go with the approach taken here and then switch to token propagation in a follow-up PR.

@adutra
Copy link
Contributor

adutra commented Dec 3, 2025

By creating this approach, we are introducing a security loophole that is completely against Polaris' security modelling today.

I am completely failing to see what security loopholes we would be introducing by leveraging AssumeRoleWithWebIdentity. Propagating the user's access token to AWS STS using AssumeRoleWithWebIdentity is the standard pattern recommended by AWS itself for federated OIDC access.

the client should be modified to talk to STS directly to gain credentials without Polaris' involvement. IMO Polaris should not be involved in abetting ANY security loopholes.

Are you serious about allowing clients to talk to STS directly? THAT, indeed, would be a giant security loophole.

Or we can look towards [...] setting Polaris up as an intermediate token broker which the "STS" service trusts, a new credential can be minted for the client to access the storage layer.

IMHO this is unrealistic, and over-engineered. That would require a form of token exchange and would be extremely hard to implement for little added-value. And again: let's please stop considering Polaris as an OAuth2 token broker. This is legacy behavior.

Trying to fit a simplistic solution to quick-solve a problem for a system we don't even fully support is a very dangerous precedence to set.

Can you clarify what "simplistic solution" you are talking about and what is this "system [that] we don't [...] fully support"?

@dimas-b
Copy link
Contributor

dimas-b commented Dec 3, 2025

I support @adutra 's point about Polaris not acting as a token broker or authorization server. Polaris, in general, should be a resource server in OAuth2 terms.

@cccs-cat001
Copy link
Contributor Author

I find the changes in this PR a bit invasive tbh. I would rather explore Quarkus OIDC token propagation techniques, and more specifically, I would try to inject io.quarkus.oidc.client.Tokens wherever necessary:

https://quarkus.io/guides/security-openid-connect-client-reference#inject-tokens

But I don't want to delay this PR that is extremely useful. I think we can go with the approach taken here and then switch to token propagation in a follow-up PR.

that looks pretty simple compared to all the changes I'm making tbh. Is everything set up so that I can just @Inject Token token;? 🤔

@dimas-b
Copy link
Contributor

dimas-b commented Dec 3, 2025

Is everything set up so that I can just @Inject Token token;? 🤔

Injecting in runtime/service classes should be ok, but polaris-core should remain CDI-neutral by convention (i.e. no CDI annotations in core).

That is to say, I think current PolarisPrincipal changes will probably have to remain.

@adutra
Copy link
Contributor

adutra commented Dec 3, 2025

that looks pretty simple compared to all the changes I'm making tbh. Is everything set up so that I can just @Inject Token token;? 🤔

You may need to add the io.quarkus:quarkus-oidc-client dependency, I don't remember if it's already there or not.

@adutra
Copy link
Contributor

adutra commented Dec 3, 2025

That is to say, I think current PolarisPrincipal changes will probably have to remain.

True, at some point we'd need to pass the token as a parameter. @cccs-cat001 please don't rush into using token propagation if you can't find a simple path forward – what you have currently is perfectly acceptable.

@tokoko
Copy link

tokoko commented Dec 3, 2025

To be honest I'm very confused about the implementation in the PR as well. Not sure about other s3-compatible systems, but in the case of minio, (if I'm reading this correctly...) this option will only work if all catalog users have the ability to assume the role that's configured centrally on the catalog level and which also needs to have read/write privileges on all catalog locations. I fail to see how this can be useful to anyone. why would anyone hand out assume role privileges on what's basically a superuser?

If the problem that the PR is trying to solve is simply to support those systems that only support AssumeRoleWithWebIdentity, the more straightforward solution would be to enable the catalog to acquire required web identity instead of acquiring it from a user. The same way that the standard AssumeRole option relies on AWS environment variables to authenticate the call, the new AssumeRoleWithWebIdentity solution should read necessary oauth configs and have an internal background process that obtains and refreshes a token needed to authenticate AssumeRoleWithWebIdentity calls. I'm not sure why we would complicate this with some sort of token pass-through mechanism.

@dimas-b
Copy link
Contributor

dimas-b commented Dec 3, 2025

@tokoko : AFAIK (and @cccs-cat001 may have a better answer) this feature is mostly intended for custom systems. Using the workflow it enables with AWS, for example, may be possible, yet whether it is the best approach with AWS is subject to specific user demands. I tend to agree that this may not be the best approach for MinIO or AWS S3.

Nonetheless, I do not see why Polaris as an OSS project should not open this use case for deployments what may benefit from it. The feature is well isolated under a catalog property, there is no impact to other use cases.

For reference, here's the related dev ML thread: https://lists.apache.org/thread/tm76ntbgdqt31r6402dro8vb7m4pdzzq

@adnanhemani
Copy link
Contributor

Propagating the user's access token to AWS STS using AssumeRoleWithWebIdentity is the standard pattern recommended by AWS itself for federated OIDC access.

This is only the standard pattern recommended by AWS for federated OIDC access - if you do not have a governance system in between. If you feel that's wrong, please feel free to drop links to the AWS documentation here that support your claim.

Are you serious about allowing clients to talk to STS directly? THAT, indeed, would be a giant security loophole.

(Also mentioned in Laurent's email to the Dev ML regarding this approach) Isn't this possible today with how the code is written? If it isn't, what stops it? Is it a network policy? And as a result, are we betting the whole of Polaris' security posture on such a network policy? Is that truly wise?

Let's compare this to the AssumeRole path we have today. In our best practices, are IAM roles allowed to be assumed by the clients directly? That is clearly not the model that Polaris has today - so why are we breaking that security model for this feature?

Nonetheless, I do not see why Polaris as an OSS project should not open this use case for deployments what may benefit from it. The feature is well isolated under a catalog property, there is no impact to other use cases.

Not sure I agree, we are enabling a use case here that does not align with our current security best practices. And to be clear, this is not as a result of there not being a better way of doing this, but rather because this is the fastest way of achieving this goal. Trampling security best practices for velocity is not something that we have a track record of within this project.

I agree that this use case is a valid one - and that we should be supporting it. But I highly disagree in the way that this implementation does so. Rather than bending the security model that we have today (which generally can be stated like: Polaris user who is unprivileged with regards to storage credentials, authenticates and authorizes with Polaris in order to gain storage credentials) to fit this use case, we should be tailoring the implementation of this use case to fit the security model.

@cccs-cat001 cccs-cat001 marked this pull request as draft December 4, 2025 18:17
@cccs-cat001
Copy link
Contributor Author

Marking as draft as tests are failing for unknown reasons, they'll potentially be fixed with #3203, so we'll wait for that.

@cccs-cat001
Copy link
Contributor Author

Closing this PR in favour of #3224 and #3236

@cccs-cat001 cccs-cat001 closed this Dec 8, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board Dec 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants