Kickstarting A Generated Family of Man: Experimenting with Synthetic Photography
Ever since we created Version 2 of A Flickr of Humanity, we’ve been brainstorming different ways to develop the project at the Flickr Foundation headquarters. Along the way, a question came up: what would happen if we used AI image generators to recreate The Family of Man?
- What could this reveal about the current generative AI models and their understanding of photography?
- How might it create a new interpretation of The Family of Man exhibition?
- What issues or problems would we encounter with this uncanny approach?
We didn’t know the answers to these questions, or what we might even find, so we decided to jump on board for a new journey to Version 3. (Why not?!)
We split our research into three main stages:
- Research into different AI image generators
- Exploring machine-generated image captions
- Challenges of using source photography responsibly in AI projects
We also decided to see whether current captioning and image generation technologies could fully regenerate The Family of Man for our Version 3.
Stage 1. Researching different AI image generators
With the rapid advances in generative artificial intelligence over the last couple of years, hundreds of image-generating applications, such as DALL-E 2 and Midjourney, have launched. In the initial research stage, we tested different platforms by writing short captions for roughly ten images from The Family of Man and observing the resulting outputs.
Stage 1 Learnings:
- Image generators are better at creating photorealistic images of landscapes, objects, and animals than close-up shots of people.
- Most image generators, especially the free ones, cap the number of images that can be produced in a day, which slows down production.
- Some captions had to be altered because they violated the platforms’ terms and policies; certain algorithms would censor prompts with the potential to create unethical or explicit images (e.g. in a Section A photo caption, the word “naked” could not be used with Microsoft Bing).
We decided to use Microsoft Bing’s image generator for this project because it produced the highest-quality images (across all image categories) and had the most flexible limits on the number of images that could be generated. We also tested other tools, including Dezgo, Veed.io, Canva, and Picsart.
Stage 2. Exploring image captions: AI Caption Generators
Image generators today primarily operate based on text prompts. This realisation meant we should explore caption generation software in more depth. There was much less variety in the caption-generating platforms compared to image generators. The majority of the websites we found seemed intended for social media use.
Experiment 1: Human vs machine captions
We ran a series of experiments in which we rearranged and compared different types of captions (human-written and artificially generated) with images, to observe how the caption changes the generated images, their expression and, in some cases, their meaning.
Stage 2 Learnings:
- It was quite difficult to find caption-generating software that produced different styles of captions, because most platforms only generated “cheesy” social media captions.
- On the platforms that generated other styles of captions (not for social media), we found the depth and accuracy of the descriptions to be quite limited; for example, “a mountain range with mountains.”
Stage 3. Challenges of using AI to experiment with photography?!
Since both the concept and the process of using AI to regenerate The Family of Man are experimental, we encountered several dilemmas along the way:
1. Copyright Issues with Original Photo Use
- It’s very difficult to obtain proper permission to use photos from the original publication of The Family of Man, since the exhibition contains photos from 200+ photographers, taken in different locations and for different publications. Hence, we’ve decided not to include the original photos of The Family of Man in the Version 3 publication.
- This is disappointing because having the original photo alongside the generated versions would allow us to create a direct visual comparison between authentic and synthetic photographs.
- All original photos of The Family of Man used in this blog post were photographed from the physical catalogue in our office.
2. Caption Generation
- Even to generate captions, we have to plug in the original photo from The Family of Man, so we’ve had to take screenshots of the online catalogue available on the Internet Archive. This could still violate copyright policies because we’re using the image within our process, even if we don’t explicitly display the original image. We also have a copy of The Family of Man publication, purchased by the Flickr Foundation, here at the office.
3. Moving Forward
Keeping these dilemmas in mind, we will do our best to respect the original photographs and photographers throughout our project. We’ll also keep applying this process to the rest of the images in The Family of Man to create a new Version 3 of our A Flickr of Humanity project.