Description
Describe the issue:
The API docs for the OrderedLogistic class recommend using the ordered transform to provide cutpoints for the ordinal regression. But the provided example fails with an AttributeError reporting that a 'RandomGeneratorSharedVariable' object has no attribute 'shape'.
Reproducible code example:
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import pytensor as pt
# Generate data for a simple 1 dimensional example problem
n1_c = 300; n2_c = 300; n3_c = 300
cluster1 = np.random.randn(n1_c) + -1
cluster2 = np.random.randn(n2_c) + 0
cluster3 = np.random.randn(n3_c) + 2
x = np.concatenate((cluster1, cluster2, cluster3))
y = np.concatenate((1*np.ones(n1_c),
                    2*np.ones(n2_c),
                    3*np.ones(n3_c))) - 1
# Ordered logistic regression
with pm.Model() as model:
    cutpoints = pm.Normal("cutpoints", mu=[-1,1], sigma=10, shape=2,
                          transform=pm.distributions.transforms.Ordered)
    y_ = pm.OrderedLogistic("y", cutpoints=cutpoints, eta=x, observed=y)
    idata = pm.sample()
Error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[10], line 14
12 # Ordered logistic regression
13 with pm.Model() as model:
---> 14 cutpoints = pm.Normal("cutpoints", mu=[-1,1], sigma=10, shape=2,
15 transform=pm.distributions.transforms.Ordered)
16 y_ = pm.OrderedLogistic("y", cutpoints=cutpoints, eta=x, observed=y)
17 idata = pm.sample()
File ~/mambaforge/envs/pymc_examples_new/lib/python3.11/site-packages/pymc/distributions/distribution.py:312, in Distribution.__new__(cls, name, rng, dims, initval, observed, total_size, transform, *args, **kwargs)
308 kwargs["shape"] = tuple(observed.shape)
310 rv_out = cls.dist(*args, **kwargs)
--> 312 rv_out = model.register_rv(
313 rv_out,
314 name,
315 observed,
316 total_size,
317 dims=dims,
318 transform=transform,
319 initval=initval,
320 )
322 # add in pretty-printing support
323 rv_out.str_repr = types.MethodType(str_for_dist, rv_out)
...
---> 95 y = at.zeros(value.shape)
96 y = at.inc_subtensor(y[..., 0], value[..., 0])
97 y = at.inc_subtensor(y[..., 1:], at.log(value[..., 1:] - value[..., :-1]))
AttributeError: 'RandomGeneratorSharedVariable' object has no attribute 'shape'
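For reference, the three lines at the bottom of the traceback look like the forward pass of the ordered transform: the first element is kept and every later element is replaced by the log of its gap to the previous one. Below is a minimal sketch of that mapping and its inverse in plain Python. The function names are hypothetical and this is not the PyMC implementation, only an illustration of the arithmetic.

```python
import math


def ordered_forward(value):
    """Map a strictly increasing sequence to unconstrained reals."""
    # Keep the first element; encode each later element as a log-gap.
    out = [value[0]]
    for prev, cur in zip(value[:-1], value[1:]):
        out.append(math.log(cur - prev))
    return out


def ordered_backward(y):
    """Inverse map: add exponentiated gaps, so the result is increasing."""
    out = [y[0]]
    for gap in y[1:]:
        out.append(out[-1] + math.exp(gap))
    return out


cutpoints = [-1.0, 1.0]  # same values as mu in the example above
unconstrained = ordered_forward(cutpoints)
recovered = ordered_backward(unconstrained)
```

Because the inverse adds strictly positive gaps, any unconstrained vector maps back to increasing cutpoints, which is what the sampler needs.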
PyMC version information:
Last updated: Fri Mar 17 2023
Python implementation: CPython
Python version : 3.11.0
IPython version : 8.11.0
pytensor: 2.10.1
numpy : 1.24.2
arviz : 0.15.1
matplotlib: 3.7.1
pandas : 1.5.3
pymc : 5.1.1
pytensor : 2.10.1
Watermark: 2.3.1
Context for the issue:
I was going to write up docs on the technique of ordinal regression, but I think the failure of the ordered transform makes the entire class of models less straightforward to implement. I'm pretty sure it's related to this line:
pymc/pymc/distributions/transforms.py
Line 89 in c7279b5
But I don't know enough about the random variable implementation to know why the shape attribute is not available now.
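One possible reading of the error, offered as an unverified assumption rather than a diagnosis: the example passes the class Ordered, not an instance of it. If the model calls something like transform.forward(value, *inputs), then calling that method on the class would bind the cutpoints variable to self and push the first input (the random generator) into value, which has no shape. The toy classes below are hypothetical stand-ins that reproduce only this argument shift; they do not use PyMC.

```python
class ToyRng:
    """Stand-in for the RNG input of a random variable; it has no `shape`."""


class ToyValue:
    """Stand-in for the cutpoints variable; it does have a `shape`."""
    shape = (2,)


class ToyOrdered:
    """Hypothetical transform with an assumed forward(value, *inputs) signature."""

    def forward(self, value, *inputs):
        return value.shape


value, rng = ToyValue(), ToyRng()

# Called on an instance: `self` is the transform, `value` is the cutpoints.
instance_result = ToyOrdered().forward(value, rng)

# Called on the class: the cutpoints are bound to `self`, the RNG to `value`.
try:
    ToyOrdered.forward(value, rng)
    class_error = None
except AttributeError as err:
    class_error = str(err)
```

If this reading is right, passing an instance of the transform instead of the class would be worth trying. Whether the module exposes a ready-made lowercase ordered instance in 5.1.1 is an assumption that has not been checked here.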