Closed
Description
The following fails. In pytest.ini we specify two cores (-n 2) so the tests run in parallel.
py.test -s tests/keras/layers/cudnn_recurrent_test.py
Some observed error messages from TensorFlow:
UnknownError: Fail to find the dnn implementation.
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(32, 2), b.shape=(2, 2), m=32, n=2, k=2
e = InternalError(), message = 'GPU sync failed', m = None
If we disable pytest-xdist (-n 0) or use just a single core (-n 1), it works OK:
py.test -n 1 tests/keras/layers/cudnn_recurrent_test.py
Note that the CuDNN tests require a GPU (@pytest.mark.skipif) and are not run on Travis CI, so this problem only appears when the tests are invoked manually.
A workaround is to run this test file in a single process (with -n 1 as above), which should be documented somewhere.
A better solution would be to enforce serial execution for tests in this file. So far it doesn't seem that pytest-xdist supports that directly.
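One possible way to approximate serial execution without changing the command line is a cross-process file lock, so that only one xdist worker at a time runs a test from this file. The following is a minimal sketch, not something tested against this failure: the names exclusive_lock and LOCK_PATH are hypothetical, it relies on the Unix-only fcntl module, and it only serializes test execution. If the errors come from a worker keeping GPU resources allocated between tests, a lock alone may not be enough.

```python
import contextlib
import fcntl
import os
import tempfile

# Hypothetical lock path; any location visible to all xdist workers would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "cudnn_tests.lock")


@contextlib.contextmanager
def exclusive_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    with open(path, "w") as handle:
        # Blocks until no other process (or file handle) holds the lock.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


# In the test module this could back an autouse fixture (assumed usage):
#
#     @pytest.fixture(autouse=True)
#     def serialize_gpu_tests():
#         with exclusive_lock():
#             yield
```

With such a fixture, each test in the file would wait for the lock before running, so two workers would never execute these tests at the same moment, while the rest of the suite could still run in parallel.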
System info:
- latest Keras master: 950e5d0
- tensorflow-gpu==1.4.1, CUDA 8.0, CuDNN 6.0