[dask-on-ray] ValueError on read-only memory #10124

Open
@stephanie-wang

What is the problem?

Ray version and other system information (Python version, TensorFlow version, OS): 0.9dev

I was trying out the new Ray backend plugin for Dask and ran into a read-only memory error (a ValueError). It looks like the Dask code tries to write in place, but Ray doesn't allow that because its objects are immutable.
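For anyone triaging, here is a minimal sketch of the suspected failure mode using NumPy only (no Ray or Dask involved). It assumes that an array handed back by an immutable store behaves like a NumPy array whose writeable flag has been cleared; that is an assumption for illustration, not something verified against the Ray internals here. The helpers `make_read_only` and `ensure_writable` are hypothetical names, not part of Ray, Dask, or NumPy.

```python
import numpy as np


def make_read_only(arr: np.ndarray) -> np.ndarray:
    """Return a read-only view of arr (stand-in for data owned by an immutable store)."""
    view = arr.view()
    view.setflags(write=False)
    return view


def ensure_writable(arr: np.ndarray) -> np.ndarray:
    """Return arr unchanged if it is writable, otherwise a private writable copy."""
    return arr if arr.flags.writeable else arr.copy()


# Hypothetical stand-in for an array fetched from an immutable object store.
shared = make_read_only(np.zeros(4))

try:
    shared[0] = 1.0  # in-place write into read-only memory
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)

# Possible workaround: copy before mutating, leaving the shared data untouched.
local = ensure_writable(shared)
local[0] = 1.0
print(local, shared)
```

If this is indeed what happens, the fix would probably have to copy the data somewhere before the in-place step, at the cost of extra memory per partition.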

Reproduction (REQUIRED)

Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):

import time

import numpy as np
import pandas as pd

import dask
import dask.dataframe as dd
import ray
from ray.experimental.dask import ray_dask_get

N_ROWS = 1000
N_PARTITIONS = 100

# Start a local Ray instance.
ray.init()

# Build a small random two-column DataFrame and split it into Dask partitions.
df = dd.from_pandas(pd.DataFrame(np.random.randn(N_ROWS, 2), columns=['a', 'b']), npartitions=N_PARTITIONS)

# Run the groupby on the Ray scheduler; this is the call that raises the error.
start = time.time()
print(df.groupby('b').a.sum().compute(scheduler=ray_dask_get))
end = time.time()
print("ray", end - start)

If we cannot run your script, we cannot fix your issue.

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.

Labels

  • P2: Important issue, but not time-critical
  • bug: Something that is supposed to be working; but isn't
  • pending-cleanup: This issue is pending cleanup. It will be removed in 2 weeks after being assigned.
