On being a research fellow

By Jenn Phillips-Bacher

A half-year of fellowing

Six months is not a very long time when you’re working with a team whose ambition is to secure a vast digital archive for the next 100 years. That’s only two quarters of a single calendar year; twelve two-week sprints, if you’re counting in agile time. It’s said that it takes around six months to feel competent in a new job. Six months is both no time at all and just enough time to get a grasp of your work environment. 

Now it’s time for me, at the end of these six months, to look back at what I’ve done as the Flickr Foundation’s first Research Fellow. I’d like to share some reflections on my time here: what it was like to be a fellow, what I accomplished, and four crucial work-life lessons. 

How could I be a researcher?

When I joined Flickr Foundation, I was curious but apprehensive about what I could deliver for the organization. I didn’t come to this fellowship with an existing research practice. My last run of ‘proper’ education was at the turn of the century! 

The thing about being the first is that there is no precedent, no blueprint I could follow to be a good fellow. And what flavor of research should I even be doing? I was open to the fellowship taking its own course, and my research and other day-to-day activities were largely led by the team’s 2024 plan. Having an open agenda meant that I could steer myself where I would be most useful, with an emphasis on the Flickr Commons and Content Mobility programs. 

My mode of research followed my experience working in product management: conduct discovery and design research, synthesize it, and then extract and share insights that inform Flickr Foundation’s next steps. I had the freedom to adopt a research approach that clicked for me.

New perspectives, old inclinations

My main motivation in this fellowship was to get fresh perspectives on work I’d been doing for several years; that is, building digital services to help people find and use cultural heritage collections. I’d been doing it first as a librarian, then through managing several projects and products within the GLAM sector.

When I worked as a Product Manager on a multidisciplinary team within a large company, my professional development was tightly tied to my role. Because my daily focus was on team dynamics and delivering against quarterly and annual plans, my self-development was geared toward optimizing those things. I didn’t get much time to look into the more distant future while working to shorter timelines. I relished the idea that I’d get some long-overdue, slow thinking time during my fellowship.

But I’ll be the first to admit that I couldn’t shake the habit of being in delivery mode. I was most drawn to practical, problem-solving work that was well within my comfort zone, like mapping an improved workflow for new members to register for Flickr Commons, or doing a content audit and making recommendations for improvements on flickr.org. 

I stretched myself in other ways, particularly in my writing practice. I won’t claim to be prolific in publishing words to screen, but I wrote several things for Flickr Foundation, including: 

In the midst of all the doing, I read. A lot. (Here’s some of what I covered). Some days it felt like I was spending all day looking at text and video, which is both a luxury and a frustration. I felt an urge to put new knowledge into immediate practice. But when this knowledge is situated in the context of a 100-year plan, ‘immediate’ isn’t necessary. It’s a mindset shift that can challenge anyone’s output-focussed ways of working.

“Aim low, succeed often” and other career lessons 

I want to conclude with four important takeaways from my fellowship. I’m confident they’ll find their way into the next stage of my career. Thank you to George Oates, Alex Chan, Jessamyn West, Ewa Spohn and all of the people I met through the Flickr Foundation for being part of this learning journey.

All technology is about people

Whether someone is the subject of data, is subjected to a technology, or is the worker behind the tech, it’s always about people.

Take the Data Lifeboat project as an illustration. If the idea is to decouple Flickr photos from the flickr.com context in order to preserve them for the long term, you have to think about the people in the past who took the photos and their creative rights. What about the people depicted in the photos – did they consent? What would they think about being part of an archive? And what about the institutions that might be a destination for a Data Lifeboat? Guess what, they’re totally made of people who work in complex organizations with legacies, guidelines, policies and strategies of their own (and those governance structures are made by people too).

To build technology responsibly is a human endeavor, not a mere technical problem. We owe it to each other to do it well.

Slow down, take a longer view

One of the team mottos is “Aim low, succeed often”; meaning, take on only what feels sustainable and take your time to make it achievable. We’re working to build an organization that’s meant to steward our photographic heritage for a century and beyond. We can’t solve everything at once. 

This was a much-needed departure from the hamster-wheel of agile product delivery and the planning cycles that can accompany it. But it also fits nicely into product-focussed ways of working – in fact, it’s the ideal. Break down the hard stuff into smaller, achievable aims but have that long-term vision in mind. Work sustainably.

A strong metaphor brings people along

How do you describe a thing that doesn’t exist yet? If you want to shape a narrative for a new product or service, a strong metaphor is your friend. Simple everyday concepts that everyone can understand are perfect framing devices for novel work that’s complex and ambiguous. 

The Data Lifeboat project is held afloat with metaphor. We dock, we build a network of Safe Harbors. If you’re a visual kind of person, the metaphor draws its own picture. Metaphor is helping us to navigate something that’s not perfectly defined.

Meeting culture is how you meet

Everyone in almost every office-based workplace complains about all of the meetings. Having spent time in a small office with three other Flickr Foundation staff, I’ve learned that meeting culture is just how you meet. If you sit around a table and have coffee while talking about what you’re working on, that’s a meeting. It’s been a joy to break out of the dominant, pre-scheduled, standing one-hour meeting culture. Let’s go for a walk. Let’s figure out the best way to get stuff done together. 

Flickr Turns 20 London Photowalk
This is me, captured by George at a Flickr Photowalk in London.

Data Lifeboat: Deeper research into the challenge of archiving social media objects

By Jenn Phillips-Bacher

For all of us at Flickr Foundation, the idea of Flickr as an archive in waiting inspires our core purpose. We believe the billions of photos that have amassed on Flickr in the last 20 years have the potential to be the material of future historical research. With so much of our everyday lives being captured digitally and posted to public platforms, we – both the Flickr Foundation and the wider cultural heritage community – have begun figuring out how to proactively gather, make available, and preserve digital images and their metadata for the long term.

In this blog post, I’m setting my sights beyond technology to consider the institutional and social aspects that enable the collection of digital photography from online platforms.

It’s made of people

Our Data Lifeboat project is now underway. Its goal is to build a mechanism to make it possible to assemble and decentralize slivers of Flickr photos for potential future users. (You can read project update 1 and project update 2 for the background). The outcome of the first project phase will be one or more prototypes we will show to our Flickr Commons partners for feedback. We’re already looking ahead to the second phase where we will work with cultural heritage institutions within the wider Flickr Commons network to make sure that anything we put into production best suits cultural heritage institutions’ real-world needs.

We’ve been considering multiple possible use cases for creating, and importantly, docking a Data Lifeboat in a safe place. The two primary institutional use cases we see are:

  1. Cultural heritage institutions want to proactively collect born digital photography on topics relevant to their collections
  2. In an emergency situation, cultural heritage institutions (and maybe other Flickr members) want to save what they can from a sinking online platform – either photos they’ve uploaded themselves or, more generously, whatever else they can rescue. (And let me be clear: Flickr.com is thriving! But it’s better to design for a worst-case scenario than to find ourselves scrambling for a solution with no time to spare.)

We are working towards our Flickr Commons members (and other interested institutions) being able to accept Data Lifeboats as archival materials. For this to succeed, “dock” institutions will need to:

  • Be able to use a Data Lifeboat, and have the technology to accept it
  • Already have a view on collecting born digital photography, and ideally this type of media is included in their collection development strategy. (This is probably more important.)

This isn’t just a technology problem. It’s a problem made of everything else the technology is made of: people who work in cultural heritage institutions, their policies, organizational strategies, legal obligations, funding, commitment to maintenance, the willing consent of people who post their photos to online platforms and lots more.

To preserve born digital photos from the web requires the enthusiastic backing of institutions—which are fundamentally social creatures—to do what they’re designed to do, which is to save and ensure access to the raw material of future research.

Collecting social photography

I’ve been doing some background research to inform the early stages of Data Lifeboat development. I came across the 2020 Collecting Social Photography (CoSoPho) research project, which set out to understand how photography is used in social media in order to be able to develop methods for collection and transmission to future generations. Their report, ‘Connect to Collect: approaches to collecting social digital photography in museums and archives’, is freely available as a PDF.

The project collaborators were:

  • The Nordic Museum / Nordiska Museet
  • Stockholm County Museum / Stockholms Läns Museum
  • Aalborg City Archives / Aalborg Stadsarkiv
  • The Finnish Museum of Photography / Finland’s Fotografiska Museum
  • Department of Social Anthropology, Stockholm University

The CoSoPho project was a response to the current state of digital social photography and its collection/acquisition – or lack thereof – by museums and archives.

Implicit in the team’s research is that digital photography from online platforms is worth collecting. Three big questions were centered in their research:

  1. How can data collection policies and practices be adapted to create relevant and accessible collections of social digital photography?
  2. How can digital archives, collection databases and interfaces be relevantly adapted – considering the character of the social digital photograph and digital context – to serve different stakeholders and end users?
  3. How can museums and archives change their role when collecting and disseminating, to increase user influence in the whole life circle of the vernacular photographic cultural heritage?

There’s a lot in this report that is relevant to the Data Lifeboat project. The team’s research focussed on ‘digital social photography’, taken to mean any born digital photos that are taken for the purpose of sharing on social media. It interrogates Flickr alongside Snapchat, Facebook, Instagram, as well as region-specific social media sites like IRC-Galleria (a very early 2000s Finnish social media platform).

I would consider Flickr a bit different to the other apps mentioned, because that definition doesn’t cover other Flickr-specific use cases such as:

  • Showcasing photography as craft
  • Using Flickr as a public photo repository or image library where photos can be downloaded and re-used outside of Flickr, unlike walled garden apps like Instagram or Snapchat.

The ‘massification’ of images

The CoSoPho project highlighted the challenges of collecting digital photos of today while simultaneously digitizing analog images from the past, the latter of which cultural heritage institutions have been actively doing for many years. Anna Dahlgren describes this as a “‘massification’ of images online”. The complexities of digital social photos, with their continually changing and growing dynamic connections, combined with the unstoppable growth of social platforms, pose certain challenges for libraries, archives and museums to collect and preserve.

To collect digital photos requires a concerted effort to change the paradigm:

  • from static accumulation to dynamic connection
  • from hierarchical files to interlinked files
  • and from pre-selected quantities of documents to aggregation of unpredictably variable image and data objects.

Dahlgren argues that “…in order to collect and preserve digital cultural heritage, the infrastructure of memory institutions has to be decisively changed.”

The value of collecting and contributing

“Put bluntly, if images on Instagram, Facebook or any other open online platform should be collected by museums and archives what would the added value be? Or, put differently, if the images and texts appearing on these sites are already open and public, what is the role of the museum, or what is the added value of having the same contents and images available on a museum site?” (A. Dahlgren)

Those of us working in the cultural heritage sector can imagine many good responses to this question. At the Flickr Foundation, we look to our recent internet history and how many web platforms have been taken offline. Our digital lives are at risk of disappearing. Museums, libraries and archives have that long-term commitment to preservation. They are repositories of future knowledge, and expect to be there to provide access to it.

Cultural heritage institutions that choose to collect from social online spaces can forge a path for a multiplicity of voices within collections, moving beyond standardized metadata toward richer, more varied descriptions from the communities from which the photos are drawn. There is significant potential to collect in collaboration with the publics the institution serves. This is a great opportunity to design a more inclusive ethics of care into collections.

But what about potential contributors whose photos are being considered for collection by institutions? What values might they apply to these collections?

CoSoPho uncovered useful insights about how people participating in community-driven collecting projects considered their own contributions. Contributors wanted to be selective about which of their photos would make it into a collection; this could be for aesthetic reasons (choosing the best, most representative photos) or concerns for their own or others’ anonymity. Explicit consent to include one’s photos in a future archive was a common theme – and one which we’re thinking deeply about.

Overall, people responded positively to the idea of cultural institutions collecting digital social photos – they too can be part of history! – and also think it’s important that the community from which those photos are drawn have a say in what is collected and how it’s made available. Future user researchers at Flickr Foundation might want to explore contributor sentiment even further.

What’s this got to do with Data Lifeboats?

As an intermediary between billions of Flickr photos and cultural heritage institutions, we need to create the possibilities for long-term preservation of this rich vein of digital history. These considerations will help us to design a system that works for Flickr members and museums and archives.

Adapting collection development practices

All signs point to cultural heritage institutions needing to prepare to take on born digital items. Many are already doing this as part of their acquisition strategies, but most often this born digital material comes entangled in a larger archival collection.

If institutions aren’t ready to proactively collect born digital material from the public web, this is a risk to the longevity of this type of knowledge. And if this isn’t a problem that currently matters to institutions, how can we convince them to save Flickr photos?

As we move into the next phase of the Data Lifeboat project, we want to find out:

  • Are Flickr Commons member institutions already collecting, or considering collecting, born digital material?
  • What kinds of barriers do they face?

Enabling consent and self-determination

CoSoPho’s research surfaced the critical importance of consent, ownership and self-determination in determining how public users/contributors engage with their role in creating a new digital archive.

  • How do we address issues of consent when preserving photos that belong to creators?
  • How do we create a system that allows living contributors to have a say in what is preserved, and how it’s presented?
  • How do we design a system that enables the informed collection of a living archive?
    Is there a form of donor agreement or an opt-in to encourage this ethics of care?

Getting choosy

With 50 billion Flickr photos, not all of them visible to the public or openly licensed, we are working from the assumption that the Data Lifeboat needs to enable selective collecting.

  • Are there acquisition practices and policies within Flickr Commons institutions that can inform how we enable users to choose what goes into a Data Lifeboat?
  • What policies for protecting data subjects in collections need to be observed?
  • Are there existing paradigms for public engagement for proactive, social collecting that the Data Lifeboat technology can enable?

Co-designing usable software

Cultural heritage institutions have massively complex technical environments with a wide variety of collection management systems, digital asset management systems and more. This complexity often means that institutions miss out on chances to integrate community-created content into their collections.

The CoSoPho research team developed a prototype for collecting digital social photography. That work was attempting to address some of these significant tech challenges, which Flickr Foundation is already considering:

  • Individual institutions need reliable, modern software that interfaces with their internal systems; few institutions have internal engineering capacity to design, build and maintain their own custom software
  • Current collection management systems don’t have a lot of room for community-driven metadata; this information is often wedged into local data fields
  • Collection management systems lack the ability to synchronize data with social media platforms (and vice versa) if the data changes. That makes it more difficult to use third-party platforms for community description and collecting projects.

So there’s a huge opportunity for the Flickr Foundation to contribute software that works with this complexity to solve real challenges for institutions. Co-design – that is, a design process that draws on your professional expertise and institutional realities – is the way forward!

We need you!

We are working on the challenge of keeping Flickr photos visible for 100 years and we believe it’s essential that cultural heritage institutions are involved. Therefore, we want to make sure we’re building something that works for as many organizations as possible – both big and small – no matter where you are in your plans to collect born digital content from the web.

If you’re part of the Flickr Commons network already, we are planning two co-design workshops for Autumn 2024, one to be held in the US and the other likely to be in London. Keep your eyes peeled for Save-the-Date invitations, or let us know you’re interested, and we’ll be sure to keep you in the loop directly.

Introducing Eryk Salvaggio, 2024 Research Fellow

Eryk Salvaggio is a researcher and new media artist interested in the social and cultural impacts of artificial intelligence. His work, which is centered in creative misuse and the right to refuse, critiques the mythologies and ideologies of tech design that ignore the gaps between datasets and the world they claim to represent. A blend of hacker, policy researcher, designer and artist, he has been published in academic journals, spoken at music and film festivals, and consulted on tech policy at the national level.

Ghosts in the Archives Become Ghosts in the Machines

I’m honored to be joining the Flickr Foundation to imagine the next 100 years of Flickr, thinking critically about the relationships between datasets, history, and archives in the age of generative AI. 

AI is thick with stories, but we tend to only focus on one of them. The big AI story is that, with enough data and enough computing power, we might someday build a new caretaker for the human race: a so-called “superintelligence.” While this story drives human dreams and fears—and dominates the media sphere and policy imagination—it obscures the more realistic story about AI: what it is, what it means, and how it was built.

The invisible stories of AI are hidden in its training data. They are human: photographs of loved ones, favorite places, things meant to be looked at and shared. Some of them are tragic or traumatic. When we look at the output of a large language model (LLM), or the images made by a diffusion model, we’re seeing a reanimation of thousands of points of visual data — data that was generated by people like you and me, posting experiences and art to other people over the World Wide Web. It’s the story of our heritage, archives and the vast body of human visual culture. 

I approach generated images as a kind of seance, a reanimation of these archives and data points which serve as the techno-social debris of our past. These images are broken down — diffused — into new images by machine learning models. But what ghosts from the past move into the images these models make? What haunts the generated image from within the training data? 

In “Seance of the Digital Image” I began to seek out the “ghosts” that haunt the material that machines use to make new images. In my residency with the Flickr Foundation, I’ll continue to dig into training data – particularly, the Flickr Commons collection – to see the ways it shapes AI-generated images. These will not be one-to-one correlations, because that’s not how these models work.

So how do these diffusion models work? How do we make an image with AI? The answer to this question is often technical: a system of diffusion, in which training images are broken down into noise and reassembled. But this answer ignores the cultural component of the generated image. Generative AI is a product of training datasets scraped from the web, and entangled in these datasets are vast troves of cultural heritage data and photographic archives. When training data-driven AI tools, we are diffusing data, but we are also diffusing visual culture. 

 

Eryk Salvaggio: Flowers Blooming Backward Into Noise (2023) from ARRG! on Vimeo.

 

In my research, I have developed a methodology for “reading” AI-generated images as the products of these datasets, as a way of interrogating the biases that underwrite them. Since then, I have taken an interest in this way of reading for understanding the lineage, or genealogy, of generated images: what stew do these images make with our archives? Where does a model learn the concept of what represents a person, or a tree, or even an archive? Again, we know the technical answer. But what is the cultural answer to this question? 

By looking at generated images and the prompts used to make them, we’ll build a way to map their lineages: the history that shapes and defines key concepts and words for image models. My hope is that this endeavor shows us new ways of looking at generated images, and surfaces new stories about what such images mean.

As the tech industry continues building new infrastructures on this training data, our window of opportunity for deciding what we give away to these machines is closing, and understanding what is in those datasets is difficult, if not impossible. Much of the training data is proprietary, or has been taken offline. While we cannot map generated images to their true training data, massive online archives like Flickr give us insight into what they might be. Through my work with the Flickr Foundation, I’ll look at the images from institutions and users to think about what these images mean in this generated era. 

In this sense, I will interrogate what haunts a generated image, but also what haunts the original archives: what stories do we tell, and which do we lose? I hope to reverse the generated image in a meaningful way: to break the resulting image apart, tackling correlations between the datasets that train these models, the archives that built those datasets, and the images that emerge from those entanglements.

Research diary: long-term thinking and lots of reading

New Research Fellow Jenn Phillips-Bacher shares what she’s been working on at the Flickr Foundation

It’s hard to believe that it’s already been two months since I joined the Flickr Foundation as a Research Fellow. Now that I’m settled in at HQ, I’m ready to share what I’ve been working on.

My starting point for this fellowship was to explore the long-term implications of digital collections access. I wanted to spend some time on the idea of tending to an ‘end of life’ for a collection – whether that’s through intentional institutional policies like digital weeding, or catastrophic loss through climate change. 

One of the first pieces I read was Dr. Temi Odumosu’s article The Crying Child: On Colonial Archives, Digitization, and Ethics of Care in the Cultural Commons, where she writes:

“…the opportunities for intervening both in back-end collections practices and web user experience, which insists on a more conscientious data flow around the commons, feels like something approximating practical ethics.” 

The phrase conscientious data flow has become a generative force for my research so far—I might as well have it tattooed on my arm. It’s made me think about the whole lifecycle of a digital object: how a photo or other object is selected for digitization and public access, what happens to it when people view and interact with it, and what traces it leaves behind.

In focussing on the lifecycle of an object, my reading has coalesced around three main areas:

  • Ethics of care throughout the life of a digital object
  • Responsible data stewardship and radical transparency
  • Climate impacts of unconstrained digital collections

Alongside these themes, I’m also getting more familiar with AI (no, really, what have I missed?), the decentralized web and the indieweb, Personal Knowledge Management systems, and generally how to be a good, care-full citizen of the Web. 

Here are some highlights:

Ethics of care

Following on from Dr. Odumosu’s work, I delved into the brilliant work of The Shift Collective, who work directly with small community-based archives and memory workers to explore the cultural, financial and technological systems in which they operate. Their extensive research demonstrates how those systems must change to enable autonomy, equity and sustainability for the communities they serve. 

I’ve also been digging into the CARE Principles for Indigenous Data Governance, principles that set out how, as researchers or institutions working with Indigenous or marginalized communities, we can put autonomy back into the hands of those whose data (or content, objects or cultural heritage) is in the public realm. 

These principles are crucial grounding for the Flickr Foundation. We need to be aware of potential imbalances of power as a non-profit tech company that builds software for Flickr Commons and its preservation. To embody the CARE Principles in Flickr Foundation’s work means to design interventions that allow community control over their digital heritage and its preservation. 

Responsible data stewardship 

Flickr Foundation’s mission to keep Flickr photos visible for 100 years implies that we need new mechanics to move content around the web (whatever the web looks like in 20, 50, 75 years) and keep it somewhere where people can find it. What contextual information needs to travel with the Flickr photos to enable future generations to use them? And are we at that point even now, given that there are Flickr APIs that allow programmatic access to the Flickr corpus? What documentation is needed to support the ethical use of any slice of Flickr’s content? 

My introduction to this topic was the hot-off-the-press Datasheets for Digital Cultural Heritage (October 2023), which proposes a standardized template for cultural institutions to collaboratively document their open data sets derived from the digitization process. I’ve been working my way back through the history of the datasheet as a method of transparency, looking at the work of Emily Bender, Timnit Gebru, Mahima Pushkarna and other significant researchers in academia and industry, and following it through to current uses by Hugging Face and the Smithsonian.

I’ve been working on mocking up a datasheet specifically for Flickr Commons. I’ll be ready to make this available for feedback from the Flickr Commons community in January 2024. 

Climate impact of digital collections

Though perhaps not as directly connected to the work of the Flickr Foundation, I’m keen to find out what the GLAM sector is doing to understand and plan for the long-term preservation of, and short-term access to, its digital collections in light of the climate crisis. My sense is that culture workers are in the early days of considering the carbon costs of digital activities. And while climate change is a systemic issue that must be addressed through global cooperation, government policy and regulation, every one of us will need to make changes in the future.

If every job is a climate job, what does that mean for people working in cultural heritage? Will energy considerations make their way into collection development and retention policies, for example? 

And that’s not all

I’m fortunate to be based in the London office with 2/3 of the permanent team. Being part of Flickr Foundation HQ gives me a well-rounded picture of the breadth of its activities, and gives me a chance to work on software projects. I’ve chipped in on some user interface design ideation, helped to test Flickypedia before its launch, and started working up some design ideas for a Flickr to IIIF toy to help Flickr Commons members make their Flickr photos interoperable with alternative platforms.

If you’re interested in following along with what I’m reading, I’m keeping a list on my Pinboard.

And if there’s something you think is a must-read, send it my way!

Image credit: Bedbril / Glasses for reading in bed. Nationaal Archief / Flickr Commons.