Make Chains objects display only information and not statistical eval #307

Merged

Conversation

PaulinaMartin96
Contributor

@PaulinaMartin96 PaulinaMartin96 commented Jun 10, 2021

As suggested on #246, Chains objects now display only information and not statistical evaluations.

Before:
[image: MCMCChainsnotmodified]

After:
[image: MCMCChainsmodified]

Also, the describe function now displays both the Summary Statistics and Quantiles tables.

Before:
[image: describenotmodified]

After:
[image: describemodified]

@PaulinaMartin96 PaulinaMartin96 marked this pull request as ready for review June 12, 2021 03:34
@cpfiffer
Member

cpfiffer commented Jul 6, 2021

Woops, I missed this one as reviewable. If I forget, as I do often, you can flag me by requesting a review over here:

[image]

Member

@cpfiffer cpfiffer left a comment

For PRs like this where a UI change is involved, try to include a printout of the new behavior so someone can quickly look at the PR to see what it looks like. For example, I would want to see the results of show(chain), which is the primary change introduced by this PR.

Could you add that for me so I can take a look at the end result that a user might see?

Also, we follow ColPrac, one key feature of which is always making sure that the version number of whatever package you submit a PR to is increased according to semantic versioning. You can find the version number for MCMCChains in Project.toml.

For a change like this where the default behavior changes, I would recommend bumping us up to MCMCChains 4.14.0.

Flag me after you add a little description of what's changing and what I should expect to see and I'll do another pass on this one.

@PaulinaMartin96 PaulinaMartin96 changed the title Make Chains objects display only information and not statistical eval… Make Chains objects display only information and not statistical eval Jul 7, 2021
@PaulinaMartin96
Contributor Author

@cpfiffer thank you so much for the suggestions! I'll consider them in the future. I think the PR is ready. There's a conflict with Project.toml. Let me know if further changes are required.

@PaulinaMartin96 PaulinaMartin96 requested a review from cpfiffer July 7, 2021 05:24
@cpfiffer
Member

cpfiffer commented Jul 7, 2021

Okay, thanks for adding that stuff. Looks good to me.

When there's a merge conflict, it's typically the PR author's job to resolve it. Fortunately this one is really easy -- you can use the GitHub web editor or your local git merge tool to fix the issues. GitHub has links on how to do this at the bottom of the PR page:

[image]

This is a good learning opportunity since it's one of the easier merge conflicts. Go ahead and try to fix this issue. In this case, you basically want to fix the version number to the one you specified (4.14.0) and not the one that's currently in master.

@PaulinaMartin96
Contributor Author

Sorry for that, and again, thank you so much for your advice!

@cpfiffer
Member

cpfiffer commented Jul 7, 2021

No problem, all part of the learning process. I've just enabled tests and will check back to make sure they've passed.

src/stats.jl Outdated
@@ -171,6 +171,15 @@ function describe(
return dfs
end

function Base.show(io::IO, mime::MIME"text/plain", cs::Vector{ChainDataFrame})
Member

This only captures the abstract Vector{ChainDataFrame} but not any concretely typed Vector{<:ChainDataFrame}. In general, I am a bit worried about changing the display of vectors of ChainDataFrames - it seems wrong to completely opt out of the default display mechanism of vectors in Julia (I also wonder if it causes any problems) just to change the way in which describe(chain) is displayed. Maybe rather describe should not return a vector of ChainDataFrame.
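
The invariance point above can be checked in a few lines. This is a minimal sketch that uses a hypothetical parametric type `Frame{T}` as a stand-in for ChainDataFrame (assumed here to be parametric, which is why concretely typed vectors exist at all); `which_method` is likewise an illustrative name and not part of MCMCChains.

```julia
# Hypothetical stand-in for a parametric type such as ChainDataFrame.
struct Frame{T}
    data::T
end

# Matches only vectors whose element type is exactly the abstract `Frame`.
which_method(::Vector{Frame}) = "Vector{Frame}"
# Matches vectors with any element type that is a subtype of `Frame`.
which_method(::Vector{<:Frame}) = "Vector{<:Frame}"

mixed = [Frame(1), Frame("a")]   # element type is the abstract `Frame`
concrete = [Frame(1), Frame(2)]  # element type is the concrete `Frame{Int}`

# Type parameters are invariant, so the concrete vector is not a Vector{Frame}.
println(Vector{Frame{Int}} <: Vector{Frame})    # false
println(Vector{Frame{Int}} <: Vector{<:Frame})  # true
println(which_method(mixed))
println(which_method(concrete))
```

If only the first method were defined, calling it with `concrete` would raise a MethodError, which is the gap the comment points out.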

Member

Yeah, I'm down with this. My understanding of your suggestion is that describe should be a pure IO function -- we should make display(io::IO, chn::Chain) and the output is all the stuff inside the Base.show definition above.

Member

We should not define display (it calls show(io, MIME("text/plain"), x) which should be implemented), but we should just implement DataAPI.describe(io, chain) if we want to display the summary statistics in a nice way: https://juliastats.org/StatsBase.jl/latest/scalarstats/#Summary-Statistics-1
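
To illustrate the point about display, here is a small sketch with a hypothetical `Demo` type (not part of MCMCChains). Implementing the three-argument show method for `MIME"text/plain"` is enough to customize what the REPL and display print, and `repr` with the same MIME type returns that text as a string.

```julia
# Hypothetical type used only for illustration.
struct Demo
    n::Int
end

# Human-readable output: this is the method that `display` ends up calling.
function Base.show(io::IO, ::MIME"text/plain", d::Demo)
    print(io, "Demo with n = ", d.n)
end

# `repr` with the text/plain MIME type captures that output as a String.
text = repr(MIME("text/plain"), Demo(3))
println(text)  # Demo with n = 3
```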

Contributor Author

Thank you so much for your help! So, to double-check that I'm understanding these suggestions: instead of returning a Vector{ChainDataFrame}, describe should be an implementation of StatsBase.describe and return something like this?

chn = Chains(rand(100, 2, 2), [:a, :b])
chn_arr = Array(chn)
sections = chn.name_map[:parameters]
for i in 1:length(sections)
    println("Parameter $(sections[i])")
    describe(chn_arr[:,i])
    println()
end

[image: StatsBase _describe]

Member

I think it's fine if summarystats returns different summary statistics (so this method does not have to be changed) but we should make sure that describe just prints these summary statistics in a pretty way to be consistent with how describe and summarystats are defined in StatsBase. I.e., in particular describe should not return anything but only print to IO and it should not print the quantiles if they are not part of summarystats (here it might actually be better to include them in summarystats as well and return two dataframes, possibly as a named tuple).
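
The split described above (functions that return the statistics, plus a describe that only prints them) might look like the following sketch. `Samples`, `my_summarystats`, `my_quantiles`, and `my_describe` are hypothetical names for illustration; they are not the MCMCChains or StatsBase API, and only the Statistics standard library is used.

```julia
using Statistics

# Hypothetical container for the draws of a single parameter.
struct Samples
    values::Vector{Float64}
end

# Return the data, so callers and tests can work with the numbers directly.
my_summarystats(s::Samples) = (mean = mean(s.values), std = std(s.values))
function my_quantiles(s::Samples)
    q = quantile(s.values, [0.25, 0.5, 0.75])
    return (q25 = q[1], q50 = q[2], q75 = q[3])
end

# Pure IO: print both results and return nothing.
function my_describe(io::IO, s::Samples)
    println(io, "Summary Statistics: ", my_summarystats(s))
    println(io, "Quantiles: ", my_quantiles(s))
    return nothing
end
my_describe(s::Samples) = my_describe(stdout, s)

s = Samples([1.0, 2.0, 3.0, 4.0, 5.0])
stats = my_summarystats(s)        # named tuple, usable in tests
printed = sprint(my_describe, s)  # printed text captured as a String
```

With this shape, tests can assert on the returned named tuples, while the printing function is checked (if at all) by capturing its output with `sprint`.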

Member

To clarify a little bit, there's no need to switch to the code you've posted:

chn = Chains(rand(100, 2, 2), [:a, :b])
chn_arr = Array(chn)
sections = chn.name_map[:parameters]
for i in 1:length(sections)
    println("Parameter $(sections[i])")
    describe(chn_arr[:,i]) # <- This is not what we want, we want to print the results of `describe(chain)` here instead
    println()
end

Basically we want to change the stuff that is currently in show to describe, so that describe becomes a pure IO function and not a weird Vector{<:ChainDataFrame} thing that we have now.

Maybe one way is to do something like

function DataAPI.describe(io::IO, chains::Chains)
    print(io, "Chains ", chains, ":\n\n", header(chains))

    summstats = summarystats(chains)
    qs = quantiles(chains)

    println(io)
    show(io, summstats)

    println(io)
    show(io, qs)
end

which won't actually return anything; it just prints stuff out to the screen. There's probably a much saner way to do this, but it's a rough sketch to get you started.

@delete-merged-branch delete-merged-branch bot deleted the branch TuringLang:main December 24, 2021 10:30
@shravanngoswamii shravanngoswamii force-pushed the pm/MCMCChains_Display branch from 41ad4e5 to 7c75fd3 Compare May 8, 2025 04:48
@shravanngoswamii
Member

Can someone please review this PR? I have made these changes:

  • Implemented DataAPI.describe()
  • display() only prints the header now, with a suggestion to use describe()

See:

julia> using MCMCChains

julia> chn = Chains(rand(100, 2, 2), [:a, :b])
Chains MCMC chain (100×2×2 Array{Float64, 3}):

Iterations        = 1:1:100
Number of chains  = 2
Samples per chain = 100
parameters        = a, b

Use `describe(chains)` for summary statistics and quantiles.


julia> display(chn)
Chains MCMC chain (100×2×2 Array{Float64, 3}):

Iterations        = 1:1:100
Number of chains  = 2
Samples per chain = 100
parameters        = a, b

Use `describe(chains)` for summary statistics and quantiles.


julia> describe(chn)
Chains MCMC chain (100×2×2 Array{Float64, 3}):

Iterations        = 1:1:100
Number of chains  = 2
Samples per chain = 100
parameters        = a, b

Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64    Float64    Float64   Float64       Missing 

           a    0.4691    0.2997    0.0210   181.4680   106.1712    1.0190       missing
           b    0.4721    0.3108    0.0223   205.3635   184.7170    1.0254       missing

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

           a    0.0123    0.1906    0.4581    0.7287    0.9712
           b    0.0217    0.1789    0.4507    0.7560    0.9812

julia> 

cc @yebai

@yebai yebai requested a review from penelopeysm May 18, 2025 16:14
Member

@penelopeysm penelopeysm left a comment

Isn't this a breaking change to the behaviour of describe? Does this need a major version bump?

@shravanngoswamii
Member

shravanngoswamii commented May 18, 2025

Isn't this a breaking change to the behaviour of describe? Does this need a major version bump?

Yep, it does need a version bump! I just committed the version bump from 6.0.7 to 6.1.0. Is that fine?

@penelopeysm
Member

Technically it should be 7.0.0
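
For reference, the two bumps being discussed can be written with Julia's built-in `VersionNumber`. The `bump` helper below is a hypothetical illustration of the semantic versioning rule (a breaking change raises the major version and resets the lower components); it is not a function from Pkg or MCMCChains.

```julia
# Hypothetical helper: bump one component of a version and reset the lower ones.
function bump(v::VersionNumber, part::Symbol)
    if part === :major
        return VersionNumber(v.major + 1, 0, 0)
    elseif part === :minor
        return VersionNumber(v.major, v.minor + 1, 0)
    elseif part === :patch
        return VersionNumber(v.major, v.minor, v.patch + 1)
    else
        throw(ArgumentError("part must be :major, :minor, or :patch"))
    end
end

current = v"6.0.7"
println(bump(current, :minor))  # 6.1.0
println(bump(current, :major))  # 7.0.0
```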

@shravanngoswamii
Member

shravanngoswamii commented May 21, 2025

Technically it should be 7.0.0

I have never made a big change to any package, so I was unaware of how versions are bumped! Is it fine now?
Any other suggestions, @penelopeysm?

@shravanngoswamii
Member

The format workflow is failing due to permission issues since this PR is from a fork. I suggest we skip running the workflow in this case.

Member

@penelopeysm penelopeysm left a comment

That looks good to me!

@devmotion
Member

devmotion commented May 21, 2025

The format workflow is failing due to permission issues since this PR is from a fork.

Only the posting of the suggested changes fails. The action would succeed if the code were formatted correctly.


codecov bot commented May 22, 2025

Codecov Report

Attention: Patch coverage is 95.31250% with 3 lines in your changes missing coverage. Please review.

Project coverage is 85.55%. Comparing base (6080787) to head (e584217).
Report is 1 commits behind head on main.

Files with missing lines   Patch %   Lines
src/chains.jl              95.00%    2 Missing ⚠️
src/stats.jl               95.83%    1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #307      +/-   ##
==========================================
+ Coverage   83.75%   85.55%   +1.80%     
==========================================
  Files          20       20              
  Lines        1071     1073       +2     
==========================================
+ Hits          897      918      +21     
+ Misses        174      155      -19     

@shravanngoswamii shravanngoswamii merged commit ad4942d into TuringLang:main May 22, 2025
10 checks passed
@shravanngoswamii
Member

shravanngoswamii commented May 22, 2025

Only posting of the suggested changes fails. The action would succeed if the code would be formatted correctly.

I ran JuliaFormatter on changed files and it's fixed now, thank you!
I have also updated some tests for table to use summary stats and quantiles instead of describe, since describe no longer returns anything; it's a pure IO function now.

I am merging this PR since the tests have passed and @penelopeysm has already approved the main changes!

Thank you, @PaulinaMartin96 @cpfiffer @devmotion @penelopeysm @yebai!

@shravanngoswamii
Member

shravanngoswamii commented May 22, 2025

New version should be released once this is merged: JuliaRegistries/General#131488
