Add CopyAndPaste transform #1225

Open
ternaus opened this issue Jul 13, 2022 · 6 comments
@ternaus
Collaborator

ternaus commented Jul 13, 2022

  1. Specify the object as a PNG with an empty (transparent) background.
  2. Specify the background image.
  3. Cut out a region in the background image that is larger than the object.
  4. Paste the object into the cut-out region.
  5. Use Poisson blending to inpaint the space between the pasted object and the image.
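
A minimal sketch of the pasting part of the steps above, assuming the object arrives as an RGBA array whose alpha channel marks the empty background. The helper name `paste_rgba` and its arguments are hypothetical, and plain alpha compositing stands in for the cut-out and Poisson blending steps (OpenCV's `cv2.seamlessClone` would be one candidate for the blending, not shown here).

```python
import numpy as np


def paste_rgba(background: np.ndarray, obj_rgba: np.ndarray, top: int, left: int):
    """Alpha-composite an RGBA object onto an RGB background.

    Returns the composited image and a boolean mask of the pasted pixels.
    Hypothetical helper: no Poisson blending is performed here.
    """
    h, w = obj_rgba.shape[:2]
    if top < 0 or left < 0 or top + h > background.shape[0] or left + w > background.shape[1]:
        raise ValueError("object does not fit inside the background at this position")

    # astype() returns a new array, so the input background is left untouched.
    out = background.astype(np.float32)
    alpha = obj_rgba[..., 3:4].astype(np.float32) / 255.0  # shape (h, w, 1)
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = alpha * obj_rgba[..., :3] + (1.0 - alpha) * region

    # The mask of pasted pixels can be appended to the sample's masks target.
    mask = np.zeros(background.shape[:2], dtype=bool)
    mask[top:top + h, left:left + w] = obj_rgba[..., 3] > 0
    return out.round().astype(np.uint8), mask
```

The returned mask is what a transform would need in order to add a new instance mask (and a bounding box derived from it) for the pasted object.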
@ternaus added the enhancement and feature request labels and removed the enhancement label Jul 13, 2022
@ternaus added the good first issue label Jul 22, 2022
@i-aki-y
Contributor

i-aki-y commented Sep 3, 2022

@ternaus Hi, I'm interested in the feature.

How would we specify the pasted object in an albumentations transform? Do we need to introduce a new target keyword such as 'paste_image':

transform(image=image, paste_image=paste_image, ...)

or should we sample segments from the same target image, as yolov5 does?

https://github.com/ultralytics/yolov5/blob/15e82d296720d4be344bf42a34d60ffd57b3eb28/utils/dataloaders.py#L706

@Dipet
Collaborator

Dipet commented Sep 3, 2022

I think we could support this:

  • Add a paste_image_key param to the transform: if this key is set, we use the provided images.
  • Add a paste_image_dir param to the transform: if this key is set, we sample a random image from this directory.
  • Otherwise, sample a random segment from the image itself.
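
The fallback order in the list above could be sketched roughly as follows. This is only an illustration: `resolve_paste_source` is a hypothetical helper, not an albumentations function, and treating a configured key that is missing from the call data as "fall through to the next option" is an assumption.

```python
import random
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def resolve_paste_source(
    data: Dict[str, Any],
    paste_image_key: Optional[str] = None,
    paste_image_dir: Optional[str] = None,
    rng: Optional[random.Random] = None,
) -> Tuple[str, Any]:
    """Decide where the pasted object comes from (hypothetical helper).

    Returns a (kind, payload) pair where kind is "key", "dir", or "self".
    """
    rng = rng or random.Random()
    # 1. An image passed explicitly under the configured key wins.
    if paste_image_key is not None and paste_image_key in data:
        return "key", data[paste_image_key]
    # 2. Otherwise pick a random file from the configured directory.
    if paste_image_dir is not None:
        files = sorted(p for p in Path(paste_image_dir).iterdir() if p.is_file())
        if files:
            return "dir", rng.choice(files)
    # 3. Fall back to sampling a segment from the input image itself.
    return "self", data["image"]
```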

@i-aki-y
Contributor

i-aki-y commented Sep 23, 2022

@Dipet thank you for your reply.

Do you mean that paste_image_key would be used to pass the path of a PNG file as a target?

transform(image=image, paste_image_key="path/to/png", ...)

If so, I think paste_image_dir is the better choice, since it can be a parameter of the constructor. That way we avoid introducing new targets and keep the standard usage.

Ex.

transform = Compose([
    CopyAndPaste(paste_image_dir=objects_dir, …),
...
])

transform(image=image, …)   # we can still use only the standard targets bboxes, and masks.

To get a feel for it, I made a working example in PR #1297 (still a work in progress).

@zetyquickly
Contributor

Feature description

Add CopyAndPaste Augmentation from https://arxiv.org/abs/2012.07177

Used in github.com/ultralytics/yolov5/blob/ac6c4383bc0c7a2a4f7ca18f8733821b49e916bd/utils/augmentations.py#L19

I checked the yolov5 code. It looks like it does not implement the method the paper describes.

Paper quote:

Our approach for generating new data using Copy-Paste is very simple. We randomly select two images and apply random scale jittering and random horizontal flipping on each of them. Then we select a random subset of objects from one of the images and paste them onto the other image. Lastly, we adjust the ground-truth annotations accordingly: we remove fully occluded objects and update the masks and bounding boxes of partially occluded objects.
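
To make the annotation update in the quote concrete, here is a minimal sketch under simplifying assumptions: both images already have the same size, masks are boolean arrays, and scale jittering, flipping, and the random choice of objects are left to the caller. The function names are hypothetical and this is not the paper's reference implementation.

```python
import numpy as np


def copy_paste(dst_img, dst_masks, src_img, src_masks):
    """Paste the given source instances onto the destination image.

    Masks are boolean arrays of shape (H, W). Returns the new image and
    the updated list of instance masks (kept destination masks first).
    """
    if src_masks:
        paste_region = np.logical_or.reduce(src_masks)
    else:
        paste_region = np.zeros(dst_img.shape[:2], dtype=bool)

    out_img = dst_img.copy()
    out_img[paste_region] = src_img[paste_region]

    # Destination objects lose the pixels now covered by pasted objects;
    # fully occluded objects are dropped.
    kept = [m & ~paste_region for m in dst_masks]
    kept = [m for m in kept if m.any()]
    return out_img, kept + [m.copy() for m in src_masks]


def mask_to_bbox(mask):
    """Tight (x_min, y_min, x_max, y_max) box around a non-empty boolean mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1
```

Recomputing each box from its updated mask with `mask_to_bbox` is one simple way to keep the boxes of partially occluded objects consistent with their masks.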

What yolov5 does instead:

  • Take an image and its segments (masks of objects on the image).
  • Mirror these segments relative to the center of the whole image.
  • Paste the new mirrored segments back onto the image (only if their IoA, intersection over area, with existing objects is low).
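
A simplified, mask-based sketch of the three steps above. It is not the yolov5 code: the function name and the 0.3 threshold are assumptions, a horizontal flip is used as the mirroring, and IoA is taken here as the overlap with existing objects divided by the area of the mirrored segment.

```python
import numpy as np


def mirrored_self_paste(image, masks, ioa_thresh=0.3):
    """Mirror each instance horizontally and paste it back where space is free.

    `masks` is a list of boolean (H, W) arrays. Returns the new image and
    the list of masks for the instances that were actually pasted.
    """
    flipped_img = image[:, ::-1]
    if masks:
        occupied = np.logical_or.reduce(masks)
    else:
        occupied = np.zeros(image.shape[:2], dtype=bool)

    out = image.copy()
    new_masks = []
    for m in masks:
        fm = m[:, ::-1]  # mirrored segment
        area = fm.sum()
        if area == 0:
            continue
        # Fraction of the mirrored segment that lands on existing objects.
        ioa = (fm & occupied).sum() / area
        if ioa < ioa_thresh:
            out[fm] = flipped_img[fm]
            new_masks.append(fm.copy())
    return out, new_masks
```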

@zetyquickly
Contributor

Regarding this PR.

I haven't checked the implementation in detail, but there are two points so far:

  • It implements more than what the "Simple Copy-Paste ..." paper suggests, which is great.
  • Its call signature could be improved. The parameters:
paste_image_dir=object_dir,
get_label_from_path=get_label_from_path

appear suboptimal. It would be better to handle everything in memory, working directly with already loaded images, labels, and masks.

@ternaus
Collaborator Author

ternaus commented Apr 6, 2024

@zetyquickly

Whether to load everything into memory or read from disk is a matter of personal preference.

The latest transform that needs to load extra data from disk handles it like this:

https://github.com/albumentations-team/albumentations/blob/main/albumentations/augmentations/mixing/transforms.py

We have a pair of parameters:

reference_data (Optional[Union[Generator[ReferenceImage, None, None], Sequence[Any]]]):
            A sequence or generator of dictionaries containing the reference data for mixing
            If None or an empty sequence is provided, no operation is performed and a warning is issued.

and

read_fn (Callable[[ReferenceImage], Dict[str, Any]]):
            A function to process items from reference_data. It should accept items from reference_data
            and return a dictionary containing processed data:
                - The returned dictionary must include an 'image' key with a numpy array value.
                - It may also include 'mask', 'global_label' each associated with numpy array values.
            Defaults to a function that assumes input dictionary contains numpy arrays and directly returns it.

reference_data can be a generator or a sequence of ids, paths, or images already loaded into memory,

and read_fn is a function that maps an element of reference_data to something the transform can use.

=> If you want to load everything into memory beforehand, put it into reference_data and use lambda x: x as read_fn.

If you want to read on the fly, let all the work happen in read_fn.
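
A small sketch of how the reference_data / read_fn pair covers both cases. `next_reference` is a hypothetical helper written for this comment, not the albumentations implementation, and the "loader" in the usage example is a stand-in for real image decoding.

```python
import random
from collections.abc import Sequence
from typing import Any, Callable, Dict, Optional


def next_reference(
    reference_data,
    read_fn: Callable[[Any], Dict[str, Any]] = lambda x: x,
    rng: Optional[random.Random] = None,
) -> Optional[Dict[str, Any]]:
    """Pick one reference item and turn it into a dict with an 'image' key."""
    rng = rng or random.Random()
    if reference_data is None:
        return None
    if isinstance(reference_data, Sequence):
        if len(reference_data) == 0:
            return None
        item = rng.choice(reference_data)  # ids, paths, or loaded dicts
    else:
        item = next(reference_data, None)  # generator: take the next item
        if item is None:
            return None
    result = read_fn(item)
    if "image" not in result:
        raise KeyError("read_fn must return a dict with an 'image' key")
    return result


# Everything in memory: items are already dicts, read_fn is the identity.
in_memory = next_reference([{"image": "pixels-a"}])

# Read on the fly: items are paths, read_fn does the loading (faked here).
on_the_fly = next_reference(["a.png"], read_fn=lambda p: {"image": "loaded:" + p})
```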


It looks like a lot of different functionality is added in that PR. I would probably split it into separate PRs.

@ternaus ternaus changed the title Add CutAndPaste transform Add CopyAndPaste transform Apr 10, 2024