
InvalidArgumentError: Invalid name: An op that loads optimization parameters into HBM for embedding. (ConfigureTPUEmbeddingHost) #23386

Closed
neopostmodern opened this issue Oct 30, 2018 · 13 comments
Labels: comp:keras (Keras related issues), stat:awaiting response (Status - Awaiting response from author)

@neopostmodern

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.10
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: No
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version: 1.12.0-rc0
  • Python version: 3.6.7rc1
  • Bazel version (if compiling from source): 0.19.0
  • GCC/Compiler version (if compiling from source): 7.3.0
  • CUDA/cuDNN version: 10.0.130 / 7.3.1.20
  • GPU model and memory: Nvidia M1000M / ~4GB

Describe the current behavior
The PyPI packages for tensorflow-gpu didn't work for me on Ubuntu 18.10, but I (somehow) managed to compile the latest TensorFlow from source. I then tried to run some code I had been working on (and which used to run fine) and got the following error:


Using TensorFlow backend.
Tensorflow: 1.12.0-rc0
Traceback (most recent call last):
  File "/home/neopostmodern/.PyCharm2018.1/config/scratches/scratch_3.py", line 10, in <module>
    model.add(CuDNNLSTM(64, input_shape=(10, 39), return_sequences=True))
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/keras/engine/sequential.py", line 165, in add
    layer(x)
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/keras/layers/recurrent.py", line 532, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/keras/engine/base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/keras/layers/cudnn_recurrent.py", line 425, in build
    from tensorflow.contrib.cudnn_rnn.python.ops import cudnn_rnn_ops
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/__init__.py", line 40, in <module>
    from tensorflow.contrib import distribute
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/distribute/__init__.py", line 34, in <module>
    from tensorflow.contrib.distribute.python.tpu_strategy import TPUStrategy
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/distribute/python/tpu_strategy.py", line 29, in <module>
    from tensorflow.contrib.tpu.python.ops import tpu_ops
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/tpu/__init__.py", line 69, in <module>
    from tensorflow.contrib.tpu.python.ops.tpu_ops import *
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/ops/tpu_ops.py", line 39, in <module>
    resource_loader.get_path_to_datafile("_tpu_ops.so"))
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/home/neopostmodern/.local/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid name: 
An op that loads optimization parameters into HBM for embedding. Must be
preceded by a ConfigureTPUEmbeddingHost op that sets up the correct
embedding table configuration. For example, this op is used to install
parameters that are loaded from a checkpoint before a training loop is
executed.

parameters: A tensor containing the initial embedding table parameters to use in embedding
lookups using the Adagrad optimization algorithm.
accumulators: A tensor containing the initial embedding table accumulators to use in embedding
lookups using the Adagrad optimization algorithm.
table_name: Name of this table; must match a name in the
  TPUEmbeddingConfiguration proto (overrides table_id).
num_shards: Number of shards into which the embedding tables are divided.
shard_id: Identifier of shard for this operation.
table_id: Index of this table in the EmbeddingLayerConfiguration proto
  (deprecated).
 (Did you use CamelCase?); in OpDef: name: "\nAn op that loads optimization parameters into HBM for embedding. Must be\npreceded by a ConfigureTPUEmbeddingHost op that sets up the correct\nembedding table configuration. [...]" input_arg { name: "[...]" description: "[...]" type: DT_FLOAT type_attr: "[...]" number_attr: "[...]" type_list_attr: "[...]" } input_arg { name: "[...]" description: "[...]" type: DT_FLOAT type_attr: "[...]" number_attr: "[...]" type_list_attr: "[...]" } attr { name: "[...]" type: "[...]" default_value { i: -1 } description: "[...]" has_minimum: true minimum: -1 } attr { name: "[...]" type: "[...]" default_value { s: "" } description: "[...]" } attr { name: "[...]" type: "[...]" description: "[...]" } attr { name: "[...]" type: "[...]" description: "[...]" } summary: "[...]" description: "[...]" is_stateful: true

(Every "[...]" above stands for the same multi-line op documentation string quoted at the start of the error message; in the corrupted OpDef that string has overwritten the name, description, and attr fields of every entry.)

Describe the expected behavior
It should run, as it did before.

Code to reproduce the issue
Reduced to the minimum:

import tensorflow as tf
from keras.models import Sequential
from keras.layers import CuDNNLSTM

tf.set_random_seed(42)

print(f'Tensorflow: {tf.__version__}')

model = Sequential()
model.add(CuDNNLSTM(64, input_shape=(10, 39), return_sequences=True))
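
For anyone checking whether a given build is affected: the traceback above shows the failure is triggered as soon as tensorflow.contrib is imported (contrib -> distribute -> tpu_strategy -> tpu_ops -> load_op_library), so the check does not need Keras at all. A minimal sketch, assuming a TF 1.x source build:

from tensorflow.python.framework.errors_impl import InvalidArgumentError

try:
    # Importing tensorflow.contrib loads _tpu_ops.so and registers its ops,
    # which is exactly where the tracebacks in this issue fail.
    import tensorflow.contrib  # noqa: F401
    print("tensorflow.contrib imported cleanly; this build looks unaffected.")
except InvalidArgumentError as error:
    # Affected builds raise the corrupted-OpDef error during op registration.
    print("Corrupted OpDef registration detected:", str(error).splitlines()[0])
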
@frreiss
Contributor

frreiss commented Oct 31, 2018

I am seeing the same issue on my build server. Configuration is as follows:

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04 x64
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version: master branch on Github
  • Python version: 3.6.7rc1
  • Bazel version (if compiling from source): 0.19.0
  • GCC/Compiler version (if compiling from source): gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
  • CUDA/cuDNN version: n/a

I traced the problem to the function RegisterPerTableLoadOpsForAlgorithmBody() in contrib/tpu/ops/tpu_embedding_ops.cc. This function somehow manages to create an OpDef C++ proto object with incorrect string pointers. I suspect a compiler bug.

@frreiss
Contributor

frreiss commented Oct 31, 2018

Update: Recreating my Linux test VM from scratch made the problem go away for me.

@neopostmodern
Author

Unfortunately I can't recreate my machine since it's my regular one. Any hints on how to proceed? (Or on how to recreate the relevant parts...)

@ymodak ymodak self-assigned this Nov 5, 2018
@ymodak ymodak added the comp:keras Keras related issues label Nov 5, 2018
@ymodak
Contributor

ymodak commented Nov 5, 2018

The code snippet you provided works fine for me, so this doesn't look like a bug on TF's end. You could try posting it on Stack Overflow.

@ymodak ymodak added the stat:awaiting response Status - Awaiting response from author label Nov 5, 2018
@neopostmodern
Author

Yes, the code snippet is supposed to work. It used to work on my machine, and it works on @frreiss's machine too. But, as observed on both my machine and @frreiss's, under certain circumstances when building TF from source it doesn't.
So I don't see why this would not be a TF issue, although it may be build related. This comment hints at a possible origin, but I'm not savvy enough to investigate that further.

@neopostmodern
Author

Update:
If I replace CuDNNLSTM with LSTM, the above code runs without errors. That doesn't make it any less weird, but maybe it's CUDA-related, although in the other occurrence of this behavior no CUDA was present.
@frreiss, could you maybe share the code you were running when you observed the error?
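
A minimal sketch of that workaround, keeping the layer size and input shape from the snippet in the original report:

from keras.models import Sequential
from keras.layers import LSTM  # plain LSTM instead of CuDNNLSTM

model = Sequential()
# Same configuration as the failing CuDNNLSTM layer above; a plain LSTM layer
# avoids the tensorflow.contrib.cudnn_rnn import that triggers the error.
model.add(LSTM(64, input_shape=(10, 39), return_sequences=True))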

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Nov 6, 2018
@lesniewski
Contributor

I also hit this issue when upgrading from 1.10 to 1.12.
Adding --define=framework_shared_object=true to the bazel build suppressed it, so this may be a bug in the static build.
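
For reference, the define goes on the bazel build command line alongside whatever configuration is already in use, e.g. (the other flags here are only illustrative):

bazel build --config=opt --define=framework_shared_object=true //tensorflow/tools/pip_package:build_pip_package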

@aprimostka
Contributor

I can confirm the issue is rooted in the static build.
After removing --config=monolithic I can build TF without problems.

muendelezaji added a commit to muendelezaji/tensorflow-on-arm that referenced this issue Jan 15, 2019
@ymodak
Contributor

ymodak commented Feb 16, 2019

@neopostmodern Apologies for the delay in response. Is this still an issue for you? Do you see the same behavior against latest TF?

@ymodak ymodak added the stat:awaiting response Status - Awaiting response from author label Feb 16, 2019
@ymodak
Contributor

ymodak commented Feb 23, 2019

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

@ymodak ymodak closed this as completed Feb 23, 2019
@versusvoid

versusvoid commented Feb 23, 2019

Just got it on 1.13.0-rc2 with monolithic build:

  export USE_DEFAULT_PYTHON_LIB_PATH=1
  export TF_DOWNLOAD_CLANG=0
  export TF_ENABLE_XLA=0
  export TF_NEED_COMPUTECPP=0
  export TF_NEED_CUDA=0
  export TF_NEED_HDFS=0
  export TF_NEED_IGNITE=0
  export TF_NEED_MPI=0
  export TF_NEED_OPENCL_SYCL=0
  export TF_NEED_ROCM=0
  export TF_NEED_TENSORRT=0
  export TF_SET_ANDROID_WORKSPACE=0
  export CC_OPT_FLAGS="-march=haswell"

  bazel build -j 6 \
      --config=opt \
      --config=mkl \
      --config=monolithic \
      --config=noaws \
      --config=nogcp \
      --config=nohdfs \
      --config=noignite \
      --config=nokafka \
      --config=nonccl \
      //tensorflow:libtensorflow.so \
      //tensorflow:libtensorflow_cc.so \
      //tensorflow:install_headers \
      //tensorflow/tools/pip_package:build_pip_package

Bazel 0.19.2, Linux
Tried to run the example script from hand3d:

> python run.py
Traceback (most recent call last):
  File "run.py", line 47, in <module>
    keypoints_scoremap_tf, keypoint_coord3d_tf = net.inference(image_tf, hand_side_tf, evaluation)
  File "/home/versus/Programming/gesture/hand3d/nets/ColorHandPose3DNetwork.py", line 78, in inference
    hand_scoremap = self.inference_detection(image)
  File "/home/versus/Programming/gesture/hand3d/nets/ColorHandPose3DNetwork.py", line 152, in inference_detection
    x = ops.conv_relu(x, 'conv%d_%d' % (block_id, layer_id+1), kernel_size=3, stride=1, out_chan=chan_num, trainable=train)
  File "/home/versus/Programming/gesture/hand3d/utils/general.py", line 57, in conv_relu
    tensor = cls.conv(in_tensor, layer_name, kernel_size, stride, out_chan, trainable)
  File "/home/versus/Programming/gesture/hand3d/utils/general.py", line 45, in conv
    tf.contrib.layers.xavier_initializer_conv2d(), trainable=trainable, collections=['wd', 'variables', 'filters'])
  File "/usr/lib/python3.7/site-packages/tensorflow/python/util/lazy_loader.py", line 61, in __getattr__
    module = self._load()
  File "/usr/lib/python3.7/site-packages/tensorflow/python/util/lazy_loader.py", line 44, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/__init__.py", line 43, in <module>
    from tensorflow.contrib import distribute
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/distribute/__init__.py", line 33, in <module>
    from tensorflow.contrib.distribute.python.tpu_strategy import TPUStrategy
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/distribute/python/tpu_strategy.py", line 27, in <module>
    from tensorflow.contrib.tpu.python.ops import tpu_ops
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/tpu/__init__.py", line 69, in <module>
    from tensorflow.contrib.tpu.python.ops.tpu_ops import *
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/tpu/python/ops/tpu_ops.py", line 39, in <module>
    resource_loader.get_path_to_datafile("_tpu_ops.so"))
  File "/usr/lib/python3.7/site-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/usr/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 62, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid name: 
An op that loads optimization parameters into HBM for embedding. Must be
preceded by a ConfigureTPUEmbeddingHost op that sets up the correct
embedding table configuration. For example, this op is used to install
parameters that are loaded from a checkpoint before a training loop is
executed.

parameters: A tensor containing the initial embedding table parameters to use in embedding
lookups using the Adagrad optimization algorithm.
accumulators: A tensor containing the initial embedding table accumulators to use in embedding
lookups using the Adagrad optimization algorithm.
table_name: Name of this table; must match a name in the
  TPUEmbeddingConfiguration proto (overrides table_id).
num_shards: Number of shards into which the embedding tables are divided.
shard_id: Identifier of shard for this operation.
table_id: Index of this table in the EmbeddingLayerConfiguration proto
  (deprecated).
 (Did you use CamelCase?); in OpDef: name: "\nAn op that loads optimization parameters into HBM for embedding. Must be\npreceded
...

@puneet336

puneet336 commented Feb 26, 2019

Got the same issue with TensorFlow 1.12.0 (compiled from source with GCC 4.8 + Python 2.7).
As -march=native (the default) was giving me lower performance with cifar10, I used the following to build TensorFlow:
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --define=grpc_no_ares=true -k //tensorflow/tools/pip_package:build_pip_package

While running
python -c 'from tensorflow.examples.tutorials.mnist import input_data'
I got
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid name:
The full error log is attached: mnist_test_error.log

UPDATE: the build works after adding --define=framework_shared_object=true to the bazel build command line, thank you @lesniewski.

[puneet@mach MNIST]$ python -c 'from tensorflow.examples.tutorials.mnist import input_data'
[puneet@mach MNIST]$ echo $?
0

@caandewiel

caandewiel commented Nov 30, 2021

I ran into this as well, and I'm documenting it here for anyone who encounters it in the future. Do not add @com_google_protobuf//:protobuf as a Bazel dependency; use @com_google_protobuf//:protobuf_headers instead. Otherwise you end up linking protobuf twice, which causes all mutable protobuf strings to point to the same memory address, so the last string manipulation overwrites every other protobuf string. The name you got appears to have been overwritten by parameter_descriptions. It's a very obscure error message and is described in a bit more detail here.

The reason a monolithic build fixes this is that it packages everything into a single binary, which sidesteps the double-linking in the first place. But I don't think that's a very good solution, as it only avoids the problem instead of actually fixing it.
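
A sketch of the kind of dependency change meant here, for a hypothetical custom-op target (the target and file names are illustrative, not from this issue):

# BUILD (sketch): depend on the protobuf headers only, so that only one copy
# of the protobuf runtime ends up linked into the final binary.
cc_library(
    name = "my_custom_ops",       # hypothetical target
    srcs = ["my_custom_ops.cc"],  # hypothetical source
    deps = [
        "@com_google_protobuf//:protobuf_headers",  # headers only
        # NOT "@com_google_protobuf//:protobuf": linking the protobuf runtime
        # both here and in TensorFlow's own shared libraries duplicates its
        # mutable string storage and produces the corrupted OpDefs above.
    ],
)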
