AI and Ethics: AI Image Generator and Data

TongRo Images
Apr 13, 2023

Updated: Jul 6, 2023

We are living in an age of fierce AI rivalry. Midjourney, DeepAI, OpenAI, and various other companies are releasing their own text-to-image (TTI) systems. Add to this the news that the American company Shutterstock began using DALL·E 2 to create stock images on January 25th, and it is not hard to see that AI technology is moving closer and closer to our daily lives.

DALL·E 2 is at the center of attention because of its ability to understand both vision and language. To explain further, DALL·E 2 ‘understands’ text written in natural language better than other models and can generate images that combine diverse styles, objects, backgrounds, locations, and concepts. So how does an AI image generator work? We will look into it by focusing on today’s main character, DALL·E.

https://www.youtube.com/watch?v=qTgPSKKjfVg

For a child to understand the concept of a ‘cat’, the child has to grasp that a cat is still a ‘cat’ regardless of its size and breed. To get there, the child needs to see and learn from cats of various sizes, shapes, and kinds. Likewise, an AI’s accuracy generally improves with the amount and variety of data it learns from. The crucial task in developing an AI is therefore collecting a ‘dataset’, that is, a set of such data.

The dataset an AI image generator needs consists of pairs: an image and a tag that describes the image. The old way of creating these pairs was to write a matching description for each image by hand, which took far too much time and effort. However, the technology has evolved to the point where a model can process and learn from a huge volume of data and then produce output based on it in a short time.
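To make the idea of an image and tag pair concrete, here is a minimal Python sketch. It is only an illustration: the `ImageTextPair` class, the `build_pairs` helper, and the file names are hypothetical and are not taken from DALL·E or any real dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageTextPair:
    """One training example: an image location and the text that describes it."""
    image_path: str
    caption: str


def build_pairs(records):
    """Keep only records that have both an image and a non-empty description."""
    pairs = []
    for image_path, caption in records:
        caption = (caption or "").strip()
        if image_path and caption:
            pairs.append(ImageTextPair(image_path, caption))
    return pairs


# Hypothetical records: an image file plus the text found next to it.
records = [
    ("cats/kitten.jpg", "a small grey kitten sleeping on a sofa"),
    ("cats/maine_coon.jpg", "a large Maine Coon cat sitting in the snow"),
    ("misc/unknown.jpg", ""),  # no description, so it cannot form a pair
]

dataset = build_pairs(records)
for pair in dataset:
    print(pair.image_path, "->", pair.caption)
```

A real dataset would hold millions or billions of such pairs rather than three, which is why writing the descriptions by hand does not scale.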

But when we consider where this ‘data’ comes from, we cannot help but feel concerned. As Gil Appel and co-authors pointed out in Harvard Business Review, DALL·E 2, Stable Diffusion, and Midjourney draw on a large dataset from the German non-profit organization LAION (Large-scale Artificial Intelligence Open Network). That dataset is built on Common Crawl, a collection of data scraped from the web “indiscriminately”, which means these tools cannot escape the charge that they infringe intellectual property rights.

Moreover, the call for ‘detoxification’ of the dataset in the paper by Abeba Birhane and colleagues, “Multimodal datasets: misogyny, pornography, and malignant stereotypes”, points to problems beyond intellectual property. The paper examines the LAION-400M dataset, which was filtered using CLIP (Contrastive Language-Image Pre-training), the model that DALL·E 2 also builds on. It depicts jarring, hate-fueled content and ultimately makes us realize that feeding an AI data from the internet simply because there is a lot of it, without due consideration, would be “devastating on marginalized communities”.

In conclusion, the AI image generator is one of the finest offspring of the ‘sea of information’ and is full of possibilities. Nevertheless, precisely because of where it comes from, it needs to be navigated with the utmost care.

References

1. Shutterstock Introduces Generative AI to its All-In-One Creative Platform

https://www.shutterstock.com/press/20465

2. DALL·E: Creating images from text

https://openai.com/research/dall-e

3. AI Art Generators and the Online Image Market

https://www.eff.org/deeplinks/2023/04/ai-art-generators-and-online-image-market

4. Stable Bias: Analyzing Societal Representations in Diffusion Models

https://arxiv.org/abs/2303.11408

5. Generative AI Has an Intellectual Property Problem

https://hbr.org/2023/04/generative-ai-has-an-intellectual-property-problem

6. Multimodal datasets: misogyny, pornography, and malignant stereotypes