tfio.experimental.image.draw_bounding_boxes has inconsistent shape constraints #2046

Open
AndreiMoraru123 opened this issue Jul 31, 2024 · 0 comments

DrawBoundingBoxesV3Op can essentially only draw one text output per image

I think this issue was first mentioned in #1088.

However, it was labelled as an enhancement, although it looks more like a limitation imposed by the current implementation.

A simple example that works is having a single box inside a single image, with a single color code and a single text output:

import tensorflow as tf
import tensorflow_io as tfio

width = 560
height = 320
channels = 3

images = tf.random.uniform((height, width, channels), dtype=tf.float32)
images = tf.expand_dims(images, axis=0)

boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9]]], dtype=tf.float32)
texts = tf.constant(["hello_world!"], dtype=tf.string)
colors = tf.constant([[255, 0, 0]], dtype=tf.float32)

print("Shapes of inputs:")
print("Images:", images.shape)
print("Boxes:", boxes.shape)
print("Texts:", texts.shape)
print("Colors:", colors.shape)

output = tfio.experimental.image.draw_bounding_boxes(images, boxes, texts, colors)
print("Output:", output.shape)

This prints:

Shapes of inputs:
Images: (1, 320, 560, 3)
Boxes: (1, 1, 4)
Texts: (1,)
Colors: (1, 3)
Output: (1, 320, 560, 3)

We already know from tensorflow/io/tensorflow_io/core/kernels/image_font_kernels.cc that there is no point in trying without a batch dimension, as there is a check for the image rank to be 4:

OP_REQUIRES(context, images.dims() == 4, 
			errors::InvalidArgument("The rank of the images should be 4"));

This is also what PR #254 by @yongtang, which added this feature, demonstrates.

It is also essentially the same as the one test available in the code at tensorflow/io/tests/test_image.py.

Now, still within a batch size of 1 (one image), we could have more boxes, each with their own text labels and colors, but this does not work:

import tensorflow as tf
import tensorflow_io as tfio


width = 560
height = 320
channels = 3

images = tf.random.uniform((height, width, channels), dtype=tf.float32)
images = tf.expand_dims(images, axis=0)

boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9], [0.3, 0.3, 0.6, 0.6]]], dtype=tf.float32)
texts = tf.constant(["hello_world!", "hello_world_part_2"], dtype=tf.string)
colors = tf.constant([[255, 0, 0], [0, 255, 0]], dtype=tf.float32)

print("Shapes of inputs:")
print("Images:", images.shape)
print("Boxes:", boxes.shape)
print("Texts:", texts.shape)
print("Colors:", colors.shape)

output = tfio.experimental.image.draw_bounding_boxes(images, boxes, texts, colors)
print("Output:", output.shape)

This fails with:

Shapes of inputs:
Images: (1, 320, 560, 3)
Boxes: (1, 2, 4)
Texts: (2,)
Colors: (2, 3)
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-1-3eebb25ba193> in <cell line: 22>()
     20 print("Colors:", colors.shape)
     21 
---> 22 output = tfio.experimental.image.draw_bounding_boxes(images, boxes, texts, colors)
     23 print("Output:", output.shape)

1 frames
<string> in io_draw_bounding_boxes_v3(images, boxes, colors, texts, font_size, name)

/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   5881 def raise_from_not_ok_status(e, name) -> NoReturn:
   5882   e.message += (" name: " + str(name if name is not None else ""))
-> 5883   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   5884 
   5885 

InvalidArgumentError: {{function_node __wrapped__IO>DrawBoundingBoxesV3_device_/job:localhost/replica:0/task:0/device:CPU:0}} The batch sizes should be the same [Op:IO>DrawBoundingBoxesV3] name:

"The batch sizes should be the same" refers to the batch sizes of images and texts, which are required to match in tensorflow/io/tensorflow_io/core/kernels/image_font_kernels.cc:

        OP_REQUIRES(
            context, images.dim_size(0) == texts_tensor.dim_size(0),
            errors::InvalidArgument("The batch sizes should be the same"));

Yet, interestingly, no such requirement exists for colors.

Okay, then let's try to make the batch size of texts match the image batch size, as the op requires:

import tensorflow as tf
import tensorflow_io as tfio


width = 560
height = 320
channels = 3

images = tf.random.uniform((height, width, channels), dtype=tf.float32)
images = tf.expand_dims(images, axis=0)

boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9], [0.3, 0.3, 0.6, 0.6]]], dtype=tf.float32)
texts = tf.constant(["hello_world!", "hello_world_part_2"], dtype=tf.string)
colors = tf.constant([[255, 0, 0], [0, 255, 0]], dtype=tf.float32)

# let's also expand for text
texts = tf.expand_dims(texts, axis=0)

print("Shapes of inputs:")
print("Images:", images.shape)
print("Boxes:", boxes.shape)
print("Texts:", texts.shape)
print("Colors:", colors.shape)

output = tfio.experimental.image.draw_bounding_boxes(images, boxes, texts, colors)
print("Output:", output.shape)

But then we hit this error:

Shapes of inputs:
Images: (1, 320, 560, 3)
Boxes: (1, 2, 4)
Texts: (1, 2)
Colors: (2, 3)
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-2-f83aad0cb29d> in <cell line: 25>()
     23 print("Colors:", colors.shape)
     24 
---> 25 output = tfio.experimental.image.draw_bounding_boxes(images, boxes, texts, colors)
     26 print("Output:", output.shape)

1 frames
<string> in io_draw_bounding_boxes_v3(images, boxes, colors, texts, font_size, name)

/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   5881 def raise_from_not_ok_status(e, name) -> NoReturn:
   5882   e.message += (" name: " + str(name if name is not None else ""))
-> 5883   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   5884 
   5885 

InvalidArgumentError: {{function_node __wrapped__IO>DrawBoundingBoxesV3_device_/job:localhost/replica:0/task:0/device:CPU:0}} The rank of the texts tensor should be 1 [Op:IO>DrawBoundingBoxesV3] name:

"The rank of the texts tensor should be 1" comes from another check in the same file, tensorflow/io/tensorflow_io/core/kernels/image_font_kernels.cc:

        OP_REQUIRES(context, texts_tensor.dims() == 1,
                    errors::InvalidArgument(
                        "The rank of the texts tensor should be 1"));
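Taken together, the three checks quoted so far leave no valid way to pass more than one text for a single image. The sketch below is only a pure-Python model of those checks, operating on shape tuples instead of tensors. The function name `check_draw_bounding_boxes_shapes` is hypothetical, and the order in which the kernel runs its checks is an assumption; the error messages are the ones quoted above.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]


def check_draw_bounding_boxes_shapes(
    images_shape: Shape, texts_shape: Shape
) -> Optional[str]:
    """Return the error message for these shapes, or None if they pass.

    Hypothetical helper: it models only the three OP_REQUIRES checks
    quoted in this issue, not the full kernel.
    """
    # Check 1: images must carry a batch dimension (rank 4).
    if len(images_shape) != 4:
        return "The rank of the images should be 4"

    # Check 2: texts must be a flat vector of strings.
    # (Assumption: this runs before the batch-size check.)
    if len(texts_shape) != 1:
        return "The rank of the texts tensor should be 1"

    # Check 3: texts.shape[0] is compared with the image batch size,
    # not with the number of boxes.
    if texts_shape[0] != images_shape[0]:
        return "The batch sizes should be the same"

    return None


# The three text-carrying experiments from this issue:
for texts_shape in [(1,), (2,), (1, 2)]:
    print(texts_shape, check_draw_bounding_boxes_shapes((1, 320, 560, 3), texts_shape))
```

With one image, a texts tensor of shape (2,) trips the batch-size check and a tensor of shape (1, 2) trips the rank check, so only shape (1,) gets through, regardless of how many boxes are passed.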

But does it work with colors only and no text? Yes:

import tensorflow as tf
import tensorflow_io as tfio


width = 560
height = 320
channels = 3

images = tf.random.uniform((height, width, channels), dtype=tf.float32)
images = tf.expand_dims(images, axis=0)

boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9], [0.3, 0.3, 0.6, 0.6]]], dtype=tf.float32)
colors = tf.constant([[255, 0, 0], [0, 255, 0]], dtype=tf.float32)


print("Shapes of inputs:")
print("Images:", images.shape)
print("Boxes:", boxes.shape)
print("Colors:", colors.shape)

output = tfio.experimental.image.draw_bounding_boxes(images, boxes, None, colors)
print("Output:", output.shape)

This prints:

Shapes of inputs:
Images: (1, 320, 560, 3)
Boxes: (1, 2, 4)
Colors: (2, 3)
Output: (1, 320, 560, 3)

To sum up, I think this is a limitation of the current implementation: since multiple colors work across bounding boxes, multiple texts should work as well. I could not spot anything that would force only one text per image.

If you also agree, I would volunteer to help with a fix attempt @yongtang @terrytangyuan
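Until the kernel accepts one text per box, a possible stopgap for a single image is to issue one call per box, since the single-box, single-text case is the one that works. The sketch below is untested against tfio and makes several assumptions: `draw_one_box_per_call` and `draw_fn` are hypothetical names, `draw_fn` is expected to take `(images, boxes, texts, colors)` in the same order as the calls above, and feeding the output of one call back in as the next call's image is assumed to be valid.

```python
from typing import Any, Callable, Sequence


def draw_one_box_per_call(
    draw_fn: Callable[[Any, Any, Any, Any], Any],
    image: Any,
    boxes: Sequence[Sequence[float]],
    texts: Sequence[str],
    colors: Sequence[Sequence[float]],
) -> Any:
    """Draw N labelled boxes on one batched image via N single-box calls.

    Hypothetical workaround: `image` is a batch of exactly one image,
    and `boxes`, `texts`, `colors` hold one entry per box.
    """
    if not (len(boxes) == len(texts) == len(colors)):
        raise ValueError("boxes, texts and colors must have the same length")

    for box, text, color in zip(boxes, texts, colors):
        # Each call gets boxes shaped (1, 1, 4), texts shaped (1,) and
        # colors shaped (1, 3), matching the working example in this issue.
        image = draw_fn(image, [[list(box)]], [text], [list(color)])
    return image
```

Here `draw_fn` could be a small wrapper that converts the three list arguments with tf.constant, using the dtypes from the examples above, and forwards them to tfio.experimental.image.draw_bounding_boxes. This costs one op call per box, so it is a stopgap rather than a substitute for fixing the shape checks.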

Here is a link to the demo notebook with the above cells:
https://colab.research.google.com/drive/1rSder84urmOGF21rtWGb7TDEu-7zq1MP?usp=sharing
