Skip to content

Add license and notice to spark client jar and push jar to maven #1830

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 7, 2025

Conversation

gh-yzou
Copy link
Contributor

@gh-yzou gh-yzou commented Jun 7, 2025

This PR does following:

  1. add default license and notice to spark client jar
  2. update the spark client jar name to align with polaris published jar
  3. enable publish spark client jar to maven

@@ -71,7 +71,7 @@ bin/spark-shell \
--conf spark.sql.catalog.<spark-catalog-name>.scope='PRINCIPAL_ROLE:ALL' \
--conf spark.sql.catalog.<spark-catalog-name>.token-refresh-enabled=true
```
Assume the released Polaris Spark client you want to use is `org.apache.polaris:polaris-iceberg-1.8.1-spark-runtime-3.5_2.12:1.0.0`,
Assume the released Polaris Spark client you want to use is `org.apache.polaris:polaris-spark-3.5_2.12:1.0.0`,
Copy link
Contributor

@flyrain flyrain Jun 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure spark can download the bundle jar with the following config? Can we test it out? I'm not sure how do we test it against local mvn repo though, one way is to merge this into nightly maven repo first, then test it out.

--packages org.apache.polaris:polaris-spark-3.5_2.12:1.0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested it out yet. once it is into nightly maven repo, i can test it out and update the doc if needed.

archiveClassifier = null
archiveBaseName =
"polaris-iceberg-${icebergVersion}-spark-runtime-${sparkMajorVersion}_${scalaVersion}"
archiveClassifier = "bundle"
Copy link
Contributor

@flyrain flyrain Jun 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think a bit more, I think it'd be nice to have the iceberg version in the package name. Otherwise, debugging would be tricky. Users couldn't figure out the Iceberg lib version easily.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we could apply the Iceberg version as we did for Scala version.

Copy link
Contributor Author

@gh-yzou gh-yzou Jun 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the jar now seems <project_name>-<polaris_version>-.jar, and the scala_version is embedded in the project_name, such as spark-3.5_2.12. I don't think it is a good idea to have iceberg version as project name, since the iceberg version might gets updated more frequently.
i don't think we can touch polaris_version also, then classifier is the only place . however, it seems wired that the iceberg version is after the polaris version. I think what we can do is publish on the webpage about the iceberg client compatibility. Given in long term, we may not need iceberg version in the package name also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can open up the jar file to figure out the embedded Iceberg version. But my concern is that a lot of users don't know that or couldn't do that easily. I think it's still valuable to provide the iceberg versions as it's embedded. Unless we figure out a way to not embed it.

I don't think it is a good idea to have iceberg version as project name, since the iceberg version might gets updated more frequently.

We need to release a new version for this plugin every time we update the embedded Iceberg version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems wired that the iceberg version is after the polaris version.

True. The only option is to play with the project name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a blocker to me for this PR. We could figure it out later.

Copy link
Member

@jbonofre jbonofre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LICENSE and NOTICE are not correct. I can help on that.

I propose:

  1. We merge this PR as it is and I will fix LICENSE and NOTICE in another PR
  2. You give me access to you branch and I directly push a commit fixing LICENSE and NOTICE

@flyrain
Copy link
Contributor

flyrain commented Jun 7, 2025

LICENSE and NOTICE are not correct. I can help on that.

I propose:

  1. We merge this PR as it is and I will fix LICENSE and NOTICE in another PR
  2. You give me access to you branch and I directly push a commit fixing LICENSE and NOTICE

+1 on option 1. Let's get this in first.

@flyrain flyrain enabled auto-merge (squash) June 7, 2025 16:19
@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jun 7, 2025
@flyrain flyrain merged commit be3fee8 into apache:main Jun 7, 2025
9 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jun 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants