_single_size_image_list removing duplicates also removing the corresponding filename/label #801

Open
LIEeOoNn opened this issue May 28, 2024 · 0 comments
Labels
enhancement 💡 New feature or request

Comments

@LIEeOoNn
Contributor

Is your feature request related to a problem?

When duplicates are removed from an ImageList, its length no longer matches the length of the corresponding filenames/labels list, which causes problems when training a neural network.

Desired solution

The remove_duplicate_images() method should also accept the filenames/labels as a list[str] parameter, so that both are updated consistently.
The following solution is also much faster than the old one:

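def remove_duplicate_images(self, filenames: list[str]) -> tuple[ImageList, list[str]]:
    # Keep the first occurrence of each image together with its filename/label.
    image_list = self.to_images()
    if len(filenames) != len(image_list):
        raise ValueError("filenames must have the same length as the image list")
    images_without_duplicates: list[Image] = []
    filenames_new: list[str] = []
    seen_bytes: set[bytes] = set()
    for image, filename in zip(image_list, filenames):
        # The raw tensor bytes act as a hashable key; this relies on all images
        # in the list sharing one size. .cpu() is a no-op for tensors already on the CPU.
        tensor_bytes = image._image_tensor.cpu().numpy().tobytes()
        if tensor_bytes not in seen_bytes:
            seen_bytes.add(tensor_bytes)
            images_without_duplicates.append(image)
            filenames_new.append(filename)
    return ImageList.from_images(images_without_duplicates), filenames_new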
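
To illustrate the idea independently of the library, here is a minimal standard-library sketch of paired deduplication. The helper name `dedupe_with_labels`, its `key` parameter, and the sample data are hypothetical and not part of the ImageList API; raw `bytes` buffers stand in for the image tensors.

```python
from collections.abc import Callable, Hashable, Sequence
from typing import TypeVar

T = TypeVar("T")
L = TypeVar("L")


def dedupe_with_labels(
    items: Sequence[T],
    labels: Sequence[L],
    key: Callable[[T], Hashable],
) -> tuple[list[T], list[L]]:
    """Keep the first occurrence of each item and the label at the same index."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    seen: set[Hashable] = set()
    kept_items: list[T] = []
    kept_labels: list[L] = []
    for item, label in zip(items, labels):
        k = key(item)  # hashable fingerprint of the item, e.g. its raw bytes
        if k not in seen:
            seen.add(k)
            kept_items.append(item)
            kept_labels.append(label)
    return kept_items, kept_labels


# Stand-in "images": raw pixel buffers (made-up data, not real files).
pixels = [b"\x00\x01", b"\xff\xff", b"\x00\x01", b"\x10\x20"]
names = ["a.png", "b.png", "a_copy.png", "c.png"]

unique_pixels, unique_names = dedupe_with_labels(pixels, names, key=bytes)
print(len(unique_pixels), unique_names)  # 3 ['a.png', 'b.png', 'c.png']
```

Because each label is appended in the same step as its item, the two returned lists always have equal length. The set lookup keeps the pass linear in the number of images, which is presumably where the speed-up over a pairwise comparison comes from.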

Possible alternatives (optional)

No response

Screenshots (optional)

No response

Additional Context (optional)

No response

@LIEeOoNn LIEeOoNn added the enhancement 💡 New feature or request label May 28, 2024
@LIEeOoNn LIEeOoNn changed the title ImageList removing duplicates also removing the corresponding filename/label _single_size_image_list removing duplicates also removing the corresponding filename/label May 28, 2024
Projects
Status: Backlog