dask_ml.model_selection.GridSearchCV errors for keras model #534
I believe I ran into a similar issue here: https://gist.github.com/TomAugspurger/33efb49efe611701ef122f577d0e0430. It seems to be difficult to serialize and deserialize a Keras estimator backed by TensorFlow when there are multiple processes / threads. So I guess that applies to me too!
I checked the previous post: dask/dask-searchcv#69
@MikeChenfu does that mean it should work fine with multiple processes, but a single thread per process?
Yes, it seems to work when training the model, but the problem occurs when finalizing the work. Here is my demo code:

# dask-worker $ip --nprocs 2 --nthreads 1
from dask_ml.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier

model = KerasClassifier(build_fn=create_model, verbose=1)
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [100]
batches = [512]
param_grid = dict(epochs=epochs, batch_size=batches)
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=2)
grid_result = grid.fit(x, y)

My progress results are shown below.
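The demo above assumes a create_model function and a running dask scheduler, neither of which appear in the thread. A minimal sketch of what they might look like (the layer sizes, feature count, and scheduler address are placeholders, not from the original post):

from dask.distributed import Client
from keras.models import Sequential
from keras.layers import Dense

def create_model(optimizer='rmsprop', init='glorot_uniform'):
    # Small binary classifier; layer sizes and input_dim are placeholders.
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Connect to the scheduler that the dask-worker processes above registered with
# (the address is a placeholder).
client = Client('tcp://scheduler-address:8786')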
I suspect the problem is not related to multiple processes. By chance it works for several runs when I have two dask-workers, and then it fails again. I also used one dask-worker to run it, but got the same problem.
When it works, I get warnings like the following:

WARNING: Logging before flag parsing goes to stderr.
W0814 20:00:21.673532 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
W0814 20:00:21.701011 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
W0814 20:00:21.714846 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:131: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
W0814 20:00:21.715923 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:133: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.
W0814 20:00:21.724133 139838015661824 deprecation.py:506] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
W0814 20:00:21.778590 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
W0814 20:00:24.539319 139838015661824 deprecation_wrapper.py:119] From /conda/lib/python3.7/site-packages/keras/optimizers.py:790: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
W0814 20:00:24.551965 139838015661824 deprecation.py:323] From /root/.local/lib/python3.7/site-packages/tensorflow/python/ops/nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Tensorflow (or Keras?) does something strange with models that are
serialized between processes. I haven't been able to figure out
how to get things working. Any help would be appreciated.
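One way to see the serialization issue in isolation, outside GridSearchCV, is to round-trip the fitted wrapper through cloudpickle the way the distributed scheduler would. A rough sketch, reusing the hypothetical create_model above; whether the round trip fails outright or yields a model bound to the wrong TensorFlow graph depends on the Keras/TensorFlow versions:

import numpy as np
import cloudpickle
from keras.wrappers.scikit_learn import KerasClassifier

est = KerasClassifier(build_fn=create_model, epochs=1, batch_size=32, verbose=0)
x = np.random.rand(64, 8)
y = np.random.randint(0, 2, size=64)
est.fit(x, y)

# Simulate the estimator crossing a process boundary.
blob = cloudpickle.dumps(est)    # may raise, depending on versions
clone = cloudpickle.loads(blob)
print(clone.predict(x[:5]))      # may hit "Tensor ... is not an element of this graph"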
On Wed, Jan 15, 2020 at 11:09 AM Ben Weinstein wrote:
Any update here? I'm looking for best practices for applying a trained
keras model to data in parallel. With a dask client I'm also getting
"Tensor %s is not an element of this graph".
I'll update here when I find a solution. It seems likely that it relates
to multiprocessing, because if you look at the traceback you end up in TensorFlow's
with self._lock:
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
For those interested, I have an example of using a trained keras model to predict with dask.
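The example itself is not linked in the thread. A common pattern, sketched here under the assumption that the trained model has been saved to a file every worker can read (the file paths and the meta are placeholders), is to load the model inside the task so each worker process builds its own graph instead of deserializing one from the driver:

import pandas as pd
import dask.dataframe as dd
from dask.distributed import Client

def predict_partition(df, model_path="model.h5"):   # path is a placeholder
    # Importing and loading inside the function keeps graph construction
    # local to the worker process that runs the task.
    from keras.models import load_model
    model = load_model(model_path)
    preds = model.predict(df.to_numpy())
    return pd.Series(preds.ravel(), index=df.index)

client = Client()                             # or the cluster address
ddf = dd.read_parquet("features.parquet")     # placeholder input
predictions = ddf.map_partitions(
    predict_partition, meta=("prediction", "f8")
).compute()

Reloading the model per partition is wasteful, but it sidesteps sending a live TensorFlow graph between processes.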
There's support for Keras serialization now in SciKeras, which brings a scikit-learn API to Keras. This is mentioned explicitly in the documentation at https://ml.dask.org/keras.html. We're trying to merge serialization support upstream into TensorFlow: tensorflow/tensorflow#39609, tensorflow/community#286
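A rough sketch of what that looks like with SciKeras and dask-ml (the model architecture and parameter grid are illustrative, not taken from this thread):

from dask.distributed import Client
from dask_ml.model_selection import GridSearchCV
from scikeras.wrappers import KerasClassifier
from tensorflow import keras

def build_model():
    # Tiny binary classifier; sizes are placeholders.
    model = keras.Sequential([
        keras.layers.Dense(12, activation="relu", input_shape=(8,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam")
    return model

client = Client()   # SciKeras estimators pickle, so workers can deserialize them
est = KerasClassifier(model=build_model, verbose=0)
grid = GridSearchCV(est, param_grid={"batch_size": [32, 64], "epochs": [5]}, cv=2)
grid.fit(x, y)      # x, y as in the earlier examples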
I am trying to fit a Keras model with dask_ml.model_selection.GridSearchCV. If I do not set a client, it works fine. However, I get errors if I have two dask workers. It seems to be unable to deserialize something.
I would appreciate it if anyone has suggestions about this problem.