Closed

Description

`Alloc` provides the same functionality as `BroadcastTo`, and seems to be the default that PyTensor rewrites introduce in graphs like this:
```python
import pytensor
import pytensor.tensor as pt

x = pt.scalar("x")
out = x + [5, 5, 5]
fn = pytensor.function([x], out)
pytensor.dprint(fn, print_type=True)
```

```
Alloc [id A] <Vector(float64, shape=(3,))> 2
 ├─ Add [id B] <Vector(float64, shape=(1,))> 1
 │  ├─ [5.] [id C] <Vector(float64, shape=(1,))>
 │  └─ ExpandDims{axis=0} [id D] <Vector(float64, shape=(1,))> 0
 │     └─ x [id E] <Scalar(float64, shape=())>
 └─ 3 [id F] <Scalar(int64, shape=())>
```
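In NumPy terms, the `Alloc` node in the graph above just materializes the broadcast into a freshly allocated array. A minimal sketch (the helper name `alloc_like` is made up for illustration; it is not a PyTensor API):

```python
import numpy as np

def alloc_like(value, *shape):
    # Rough NumPy model of PyTensor's Alloc: broadcast `value`
    # into a freshly allocated, writeable array of the given shape.
    out = np.empty(shape, dtype=np.asarray(value).dtype)
    out[...] = value  # relies on NumPy broadcasting rules
    return out

# Mirrors Alloc(Add([5.], ExpandDims(x)), 3) for x = 0.0:
res = alloc_like(np.array([5.0]) + 0.0, 3)
```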
This is usually introduced by this helper:
pytensor/pytensor/tensor/rewriting/basic.py
Lines 90 to 117 in 36161e8
It doesn't make sense to have two operators for the same functionality, so we should decide which one to support.
This was added in Aesara in aesara-devs/aesara#145
The original issue mentions the alloc vs. view question: aesara-devs/aesara#36, but it seems that could easily be achieved by a single `Op` by manipulating the view flag.
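The view-vs-allocation distinction the original issue raises shows up directly in NumPy, where `np.broadcast_to` returns a read-only zero-copy view while `np.full` allocates fresh, writeable memory:

```python
import numpy as np

x = np.array(5.0)
view = np.broadcast_to(x, (3,))  # zero-copy view onto x; marked read-only
fresh = np.full((3,), x)         # fresh allocation; safe to write in place

# The view shares memory with x and refuses in-place writes, which is
# why a view-returning Op needs extra destructive-safety machinery.
```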
ricardoV94 commented on Jul 3, 2023
Actually this may touch on a more general question of when to allow `Op`s to be views vs. require new allocations for the output. This also showed up in #344.

I guess this depends on other inplace operations. For instance, if you have a `set_subtensor` operation downstream you might as well allocate the outputs in new arrays from the get-go.

ricardoV94 changed the title from "Alloc vs BroadcastTo" to "Alloc vs BroadcastTo vs Second"

ricardoV94 commented on Jul 4, 2023
There's also `Second` (aliased to `Fill`), which is a hackish way of doing broadcasting via an `Elemwise` operation so that it can be present in gradient graphs (as those must all be defined in terms of scalar operations).

pytensor/pytensor/scalar/basic.py
Lines 826 to 830 in e20dd0b
pytensor/pytensor/scalar/basic.py
Lines 3850 to 3864 in e20dd0b
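A rough NumPy model of what `Second`/`Fill` computes, per the description above (the values of the first argument are ignored; only its shape participates in broadcasting). This is an illustrative sketch, not PyTensor code:

```python
import numpy as np

def second(x, y):
    # Sketch of Elemwise Second/Fill semantics: ignore the values of x;
    # broadcast y to the combined shape of x and y.
    shape = np.broadcast_shapes(np.shape(x), np.shape(y))
    return np.broadcast_to(np.asarray(y, dtype=np.result_type(x, y)), shape)

out = second(np.zeros((2, 3)), 5.0)
```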
It seems that there is a rough organization in the rewrites, where `Second` is used during canonicalization and then removed during specialization.

pytensor/pytensor/tensor/rewriting/basic.py
Lines 408 to 416 in e20dd0b
pytensor/pytensor/tensor/rewriting/math.py
Lines 2045 to 2046 in e20dd0b
It would be useful to understand why these were defined as the "canonical" forms. Maybe it is easier to merge multiple equivalent broadcasts than if they were represented as `Alloc`?
I am pretty sure we don't need 3 separate Ops to do the same thing here :)
ricardoV94 commented on Jul 10, 2023
`BroadcastTo` might be the only `Op` that returns a non-writeable output by default. It necessitated the addition of `tag.indestructible` to prevent other `Op`s from trying to write in place in aesara-devs/aesara#368. Otherwise, I imagine we would need that.
We could simply remove it and continue having `Alloc` always return a fully allocated output.
More discussion in #361 (comment).

Linked issue: `shape_unsafe` tag to rewrites that can hide shape errors (#381)