dynamic shape #328
@OValery16 Did you convert with onnx2trt, or parse with your own plugin?
I used the following command:
Same problem with PyTorch 1.3 and TRT 6.0.1.5: [TensorRT] WARNING: Explicit batch network detected and batch size specified, use enqueue without batch size instead.
Hi @OValery16 , did you solve the problem? I have a similar issue here...
I've looked into the sample code "sampleDynamicReshape.cpp" to see how to add an optimization profile during the conversion.
I'm having the same issue as @cocoyen1995: using keras2onnx to convert a tf.keras model to ONNX, and then attempting to use onnx2trt to create the inference engine. The result is the same dynamic-inputs error. It seems that the general ONNX parser cannot handle dynamic batch sizes. From the TensorRT C++ API documentation: "Note: In TensorRT 7.0, the ONNX parser only supports full-dimensions mode, meaning that your network definition must be created with the explicitBatch flag set."

In the Working with Dynamic Shapes section, there is no explicit mention of the ONNX parser. I believe that the dynamic shape specification is only for non-batch shapes (e.g. H, W), in which case one would need to build the optimization profile following the given instructions. This kind of defeats the purpose of onnx2trt (easy construction of the engine from ONNX networks), forcing one to go through the C++/Python API.

I find this all a little strange, as batching is critical for good inference performance, and the UFF format has been deprecated as of TensorRT 7. What other options do we have besides importing from Caffe, or (God forbid) building TRT network definitions from scratch and manually loading in weights?
Any updates?
Hi @OValery16, you can peek at the code here: https://github.com/rmccorm4/tensorrt-utils/blob/master/classification/imagenet/onnx_to_tensorrt.py as a rough reference for converting ONNX models to TensorRT with dynamic batch sizes in mind. For example, you could try something like
And that should create some default optimization profiles with various batch sizes. You can tweak these numbers manually in the script, or make your own script based on it.

@cwentland0 dynamic shape refers to any dimension with a value of -1, including the batch dimension (when considering an explicit batch network). UFF models do not support explicit batch; they are implicit batch only. I'm not sure about UFF support for dynamic shapes off the top of my head, but I don't think it's supported. You could try my script mentioned above as a reference. All of the logic in that script can be applied to a C++ version if necessary, as it's the same API.

Additionally, I believe TensorRT's trtexec comes with a little extra logic on top of onnx2trt to create a default optimization profile for you if none were specified (which is the error you're getting). I would generally stick with trtexec over onnx2trt for simplicity.
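For reference, a hedged sketch of what such a trtexec invocation might look like with TensorRT 7 (the tensor name `input` and all shape values below are placeholder assumptions, not recommendations):

```shell
# Sketch: build an engine from an explicit-batch ONNX model with one
# optimization profile covering batch sizes 1..32 (names/values assumed).
trtexec --onnx=model.onnx \
        --explicitBatch \
        --minShapes=input:1x3x224x224 \
        --optShapes=input:8x3x224x224 \
        --maxShapes=input:32x3x224x224 \
        --saveEngine=model.engine
```

Adjust the min/opt/max specs to cover every batch size you actually intend to run; shapes outside the profile's range will be rejected at runtime.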
Thank you for your help. The link is wrong. |
Fixed, thanks. |
@rmccorm4 I attempted to use your code, as I am at my wits' end trying to get this working. However, your script produced the same result, so I'm 100% sure I'm just being an idiot. Even if I generate an engine with entirely fixed dimensions, the error is the same.
Hi @cwentland0
I believe something like this, for an explicit batch ONNX model with a dynamic batch dimension (-1):
would be roughly equivalent. At least, that's my understanding. I hope this clears some things up.
I am dealing with an ONNX model used for segmentation. All the discussion above is about the batch axis, but I'm facing dynamic sizes on the other axes. Specifically, can I use ONNX and onnx2trt to handle a tensor shape of (?, 1, ?, ?)? Yes, the width and height need to be dynamic in this case. Thanks very much.
Hi @zheng-xing, As long as you adhere to the restrictions here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#rest_dynamic_shapes, I think it should work. You could probably quickly test this with something like:
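A hedged example of what such a trtexec command might look like for an input of shape (batch, 1, H, W) with dynamic height and width (the tensor name `input` and the shape values are illustrative assumptions only):

```shell
# Sketch: dynamic H and W, fixed batch=1 and channels=1 (values assumed).
trtexec --onnx=seg_model.onnx \
        --explicitBatch \
        --minShapes=input:1x1x128x128 \
        --optShapes=input:1x1x256x512 \
        --maxShapes=input:1x1x512x512 \
        --saveEngine=seg_model.engine
```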
I chose arbitrary dimensions above; be sure to choose min/opt/max shape values for batch/height/width that fit your use case.
Hi @rmccorm4, thanks! I really appreciate your suggestion. I had tried this, and the command does run without any problem; an engine is generated. However, the segmentation results are wrong for any shape that is within the range of minShapes and maxShapes but differs from the input shape specified in the ONNX model file. I think the reason is related to how I use the inference engine. My code looks like the following:

const int dim1 = 256, dim2 = 512;           // const so the array dimensions are compile-time constants
static float data[dim1][dim2];              // some gray-scale image
float* mydata = &(data[0][0]);
float* predictions = new float[2 * dim1 * dim2]; // two-class segmentation: background and foreground
IExecutionContext* context = engine->createExecutionContext();
assert(engine->getNbBindings() == 2);
void* buffers[2];
int inputIndex, outputIndex;
printf("Bindings after deserializing:\n");
for (int bi = 0; bi < engine->getNbBindings(); bi++)
{
    if (engine->bindingIsInput(bi))
    {
        inputIndex = bi;
        printf("Binding %d (%s): Input.\n", bi, engine->getBindingName(bi));
    }
    else
    {
        outputIndex = bi;
        printf("Binding %d (%s): Output.\n", bi, engine->getBindingName(bi));
    }
}

// Create GPU buffers on the device
cudaMalloc(&buffers[inputIndex], dim1 * dim2 * sizeof(float));
cudaMalloc(&buffers[outputIndex], 2 * dim1 * dim2 * sizeof(float));

for (int i = 0; i < 1; i++) {
    // Create a stream
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    // DMA input data to the device, infer asynchronously, and DMA the output back to the host
    cudaMemcpyAsync(buffers[inputIndex], mydata, dim1 * dim2 * sizeof(float), cudaMemcpyHostToDevice, stream);
    context->enqueueV2(buffers, stream, nullptr);
    cudaMemcpyAsync(predictions, buffers[outputIndex], 2 * dim1 * dim2 * sizeof(float), cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);
}

Because in GPU memory the data is a 1-D array, an input of shape 256 x 512 cannot be distinguished from an input of shape 512 x 256, which causes problems. Do you have any idea how to use TensorRT in my scenario? Thanks.
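The ambiguity described above can be shown with a tiny, self-contained example (plain Python, no TensorRT): a flat buffer carries no shape information, so the same bytes read as different elements depending on which shape you assume. This is exactly why the runtime dimensions must be set on the context explicitly.

```python
# Illustration only (no TensorRT): a row-major flat buffer cannot tell
# (rows, cols) apart from (cols, rows) without external shape metadata.

def flatten(image):
    """Row-major (C-order) flattening, as the GPU buffer stores it."""
    return [pixel for row in image for pixel in row]

def index(flat, shape, r, c):
    """Read element (r, c) from a flat buffer interpreted with `shape`."""
    rows, cols = shape
    assert r < rows and c < cols
    return flat[r * cols + c]

# A tiny 2x3 "image"
img = [[1, 2, 3],
       [4, 5, 6]]
flat = flatten(img)

# Interpreted with the correct shape, indexing works...
assert index(flat, (2, 3), 1, 0) == 4
# ...but with the transposed shape the same buffer yields a different element.
assert index(flat, (3, 2), 1, 0) == 3
```

The same arithmetic applies at 256x512 vs 512x256: identical buffer, different interpretation, hence the wrong segmentation results.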
Hi @zheng-xing,
Shouldn't the input shape in the ONNX model file be (-1, 1, -1, -1)? Otherwise, I don't think the optimization profiles would work correctly / make sense. Generally, flattening to 1-D and re-expanding later shouldn't cause any issues.
I think that's the point of setting the binding dimensions at runtime. You should specify something like this:
or
depending on the input. See this section: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#runtime_dimensions

If you have a small number of possible input shapes and enough memory, I believe it's more performant to create a context for each possible input shape and set the binding dimensions once on each context, then select the context to execute on based on the input shape, rather than setting the binding dimensions before every inference. That said, this is a trade-off that depends on your resources and use case.
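The per-shape context idea can be sketched in plain Python (conceptual stand-ins only; DummyContext is a placeholder for a TensorRT execution context, and the real call to configure it would be setBindingDimensions):

```python
# Conceptual sketch: cache one prepared context per input shape instead
# of re-setting binding dimensions before every inference call.

class DummyContext:
    """Stand-in for an execution context whose binding dims are set once."""
    def __init__(self, shape):
        self.shape = shape  # binding dimensions, fixed at creation

def get_context(cache, shape):
    """Return a context prepared for `shape`, creating it on first use."""
    if shape not in cache:
        cache[shape] = DummyContext(shape)  # pay the setup cost only once
    return cache[shape]

cache = {}
a = get_context(cache, (1, 1, 256, 512))
b = get_context(cache, (1, 1, 512, 256))
c = get_context(cache, (1, 1, 256, 512))
assert a is c      # a repeated shape reuses the prepared context
assert a is not b  # a different shape gets its own context
```

The memory cost is one context per distinct shape, which is why this only pays off for a small, known set of input shapes.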
Thanks @rmccorm4 for your detailed explanations! I believe setBindingDimensions is the key here. Just one more question: I tried this with a 3-D segmentation case, and the trtexec command gives me the following error messages:
That means dynamic shapes are not supported for 3-D segmentation yet, right? Thanks.
I don't know the specific ops a 3-D segmentation is composed of, but given that (from the discussion above) I'm assuming your channel dimension is constant, it sounds like maybe a
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#rest_dynamic_shapes

Is there more to the error that says which op the assertion failed for?
Hi @rmccorm4, yes, the input has a fixed channel size of 1, and I only use batch size 1. The only things I want to change are the input width, height, and depth. The network has no fully connected layers; it has convolution layers with different dilations.
@rmccorm4 I ran into a problem with the ONNX version when trying to run my model in dynamic shape mode.
I have the following questions:
Here are my environment settings:
static nvinfer1::ICudaEngine*
create_onnx_engine(const std::string& model_file, int max_batch_size, nvinfer1::DataType dtype)
{
    destroy_ptr<nvinfer1::IBuilder> builder(nvinfer1::createInferBuilder(gLogger));
    destroy_ptr<nvinfer1::INetworkDefinition> network(builder->createNetworkV2(0));
    destroy_ptr<nvonnxparser::IParser> parser(nvonnxparser::createParser(*network, gLogger));
    if (!parser->parseFromFile(model_file.c_str(), static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        gLogger.log(
            nvinfer1::ILogger::Severity::kERROR,
            ("Failed to parse ONNX in data type: " + to_string(dtype)).c_str());
        exit(1);
    }
    builder->setMaxBatchSize(max_batch_size);
    destroy_ptr<nvinfer1::IBuilderConfig> config(builder->createBuilderConfig());
    config->setMaxWorkspaceSize((1 << 20) * 512); // TODO: a better way to set the workspace
    auto engine = builder->buildEngineWithConfig(*network, *config);
    if (nullptr == engine) {
        gLogger.log(nvinfer1::ILogger::Severity::kERROR, "Failed to create engine");
        exit(1);
    }
    return engine;
}

Thanks for your patience in advance.
Hi @ganler ,
Probably from this line:

destroy_ptr<nvinfer1::INetworkDefinition> network(builder->createNetworkV2(0));

Try setting the EXPLICIT_BATCH flag here instead:

const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
More details on dynamic shapes in this post: https://forums.developer.nvidia.com/t/tensorrt-7-onnx-models-with-variable-batch-size/115302/6?u=nves_r
@rmccorm4 Thanks for your help. I still want to know how to get the output shape for a dynamic-shape input. Is there any material I can reference?
Hi @ganler ,
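Broadly, once the input binding dimensions are set on the execution context, the output shapes become concrete and can be queried from the context. The underlying idea can be sketched in plain Python (everything here is illustrative, not the TensorRT API; the convolution output-size formula is the standard one):

```python
# Conceptual sketch: a layer's output dims follow from its parameters
# once the runtime input dims are known (here, a 2-D convolution).

def conv2d_out(size, kernel, stride=1, padding=0, dilation=1):
    """Standard convolution output-size formula for one spatial dim."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# "Dynamic" H and W: unknown at build time, known at runtime.
h, w = 256, 512
out_h = conv2d_out(h, kernel=3, padding=1)  # 3x3 conv, 'same' padding
out_w = conv2d_out(w, kernel=3, padding=1)
assert (out_h, out_w) == (256, 512)

# With stride 2 the spatial dims halve.
assert conv2d_out(h, kernel=3, stride=2, padding=1) == 128
```

In TensorRT terms, the runtime performs this propagation for you: after setting the input dimensions on the context, querying the output binding's dimensions returns the concrete shape rather than -1.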
Hi @rmccorm4,
Hi @rmccorm4, I tried your code to convert a model from PyTorch >> ONNX >> TensorRT. But when running inference with a batch size smaller than max_batchsize, it throws an exception: [TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::387, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 16, but engine max batch size was: 1. I am trying to convert an ONNX model with a dynamic batch dimension to TensorRT. Could you tell me what the problem is?
@rmccorm4 I have the same question |
@rmccorm4 Is there a way to export a model where one layer has a dynamic channel count, determined by the number of instances detected in the image, which is only known at runtime?
@ChauncyJin @chen-san https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleDynamicReshape is a C++ sample showing the dynamic shape concepts.

@thancaocuong It looks like you're using

@jinfagang you can generally export any of the dimensions to be dynamic, as long as the layers in the model support that dimension being dynamic. To satisfy the behavior you describe, you would need to define the min/max instances you expect to see in your optimization profile dimensions, and it's up to your application code to set the IExecutionContext's dimensions at runtime based on the number of instances detected.
@rmccorm4 Thanks for your answer. But I have one more question: the input shape of my model is a 5-D tensor, so how can I call setBindingDimensions for my model?
Error: [TensorRT] ERROR: Parameter check failed at: engine.cpp::enqueue::393, condition: bindings[x] != nullptr

How can I solve this problem?
I would recommend that folks who are having trouble importing their ONNX networks with dynamic shapes into TensorRT first use our CLI binary trtexec with the latest TensorRT version, to rule out misuse of the TRT APIs. If your model still cannot be parsed or run, please open a new issue and attach information about your model for our team to look at. I'll be closing this issue for now, as it has branched off into many individual discussions.
I am using
Which gives,
To recap, the model is generated as
My goal is to export the resnet18 model from PyTorch to TensorRT. For the sake of experimentation, I use the resnet18 from torchvision, and I define the input/output data as variables of dynamic shape (batch_size, 3, 224, 224).
After exporting my model to ONNX, I used onnx-tensorrt to convert it to TensorRT and got the following error: TensorRT failed to convert it, stating that the network has dynamic or shape inputs, but no optimization profile has been defined.
I exported the resnet18 model from PyTorch to ONNX via the following code: