The Age of Augmented Media

What happened

We have been thinking about three interconnected technologies. The first is OpenAI’s report on the ongoing development of Sora, an artificial intelligence tool that can generate highly realistic synthetic video from text descriptions. As OpenAI collaborates with artists to explore its boundaries, Sora’s capabilities are growing. One of those artists was Shy Kids, a Toronto-based collective that set out to “paint a portrait of a single character using Sora.” The result was a short film they called Airhead.

Not everything in Airhead was made by Sora. Shy Kids added the music and the voice-over. Combining the human elements with the content the AI generates is what makes the film feel alive, says Walter Woodman, a member of Shy Kids. “The technology is nothing without you,” he says. “It is a powerful tool, but you are the person driving it.”

On the other hand, Paul Trillo created a video that is all Sora. “Working with Sora is the first time I’ve felt unchained as a filmmaker,” he said. “Not restricted by time, money, other people’s permission, I can ideate and experiment in bold and exciting ways.”

The second technology is another announcement from OpenAI, this one about its new Voice Engine, a tool that can mimic human voices with startling accuracy based on just a 15-second sample of someone speaking. The Voice Engine can even use a voice sample in one language to create a replica voice in other languages.

The last technology is Apple Vision Pro, Apple’s new mixed-reality headset, or “spatial computer.” As Axios’s Ina Fried concludes in her review, “the headset represents a significant improvement in the state of the art for mixed reality” and offers a glimpse of an intriguing future, one in which we may see ubiquitous AR glasses or even holographic projection systems that could render 3D visualizations seamlessly overlaying our environment.

So what

Image created in Poe with the help of PromptGenius and DALL-E-3

Over the next 12 years, the convergence of highly realistic synthetic video from tools like Sora, the Voice Engine's ability to generate human-like voices, and extended reality technologies like Apple Vision Pro could open up entirely new creative and persuasive media that did not previously exist.

We may see the rise of "AI Storytellers" - engines that generate unique films, videos or interactive narratives tailored to each individual viewer's preferences, allowing them to shift perspectives within the story, or even to become part of the story themselves. This blurring of real and fictional media could catalyze new forms of art and expression. Entire new economies could arise around ubiquitous augmented reality and bespoke synthetic experiences. Contentious debates are also likely around data rights, consent protocols, and ensuring equitable sharing of the economic benefits as AI is trained on our lives.

While enabling powerful new storytelling capabilities and new ways to express one’s identity, the normalization of hybrid realities could lead some to disconnect from the physical world and “analog” relationships. Such uncharted territory might necessitate new practices and mental health support systems to help people skillfully navigate coexisting synthetic/physical experiences.

The blurring of real and fictional media could also amplify misinformation by making it easier to create misleading content that is indistinguishable from reality. We could see sophisticated "reality hacking": coordinated disinformation operations that use deepfakes to incite chaos, undermine institutions, and shape perception around major events. If we cannot discern authentic reality from an AI-generated simulation, how will we be able to trust our institutions and each other?

It’s this challenge that many state governments are grappling with ahead of the 2024 election. Over the last year, state legislatures have considered 101 bills addressing AI and election disinformation, with five states – Oregon, Wisconsin, New Mexico, Indiana, and Utah – enacting laws. However, new decentralized models for curating credible information and fact-checking may emerge to replace traditional, centralized authorities.

Artifacts from a Future of Augmented Media

Storyscape is an AI-powered "narrative engine" that allows users to co-create and immerse themselves in dynamic, evolving story worlds. It combines real-time video synthesis, audio generation, simulation, and mixed reality display technologies. The story evolves collaboratively, with the AI responding to the user’s speech, gestures, and even tracked emotional state.

Polyidra generates first-person narratives from different perspectives that attempt to reconstruct differing experiences and understandings of current and historical events. Polyidra creates each perspective as an interactive world that users can immerse themselves in. The goal is to promote empathy, develop a rich understanding of complex topics, and counteract the balkanization of information.

Image created in Poe with the help of PromptGenius and DALL-E-3

Reality Mediation Councils. These broadly representative councils, comprising experts and citizens from across the community, use technological forensics and an in-depth assessment of potential impact to certify content and uphold standards that align with the community's narratives, experiences, and value systems. The goal is to help communities develop a shared understanding of reality in a complex hybrid media landscape.

Learning in an Augmented Media Future

As synthetic audio/video becomes increasingly indistinguishable from reality, cultivating advanced multimedia literacy and critical reasoning abilities will be vital for navigating this complex landscape. Citizens will need:

  • Robust self-awareness and grounding practices to stay anchored in their core identity and values

  • Skills to detect misleading or fabricated media through multi-modal analysis (audio, visual, context cues); see the illustrative sketch after this list

  • Understanding of how to use multimodal generative tools to communicate effectively

  • Media fluencies to comprehend how synthetic media is constructed and its implicit biases

  • Critical thinking to distinguish truth from fiction and authentic vs orchestrated information flows

  • Ethical literacies around consent, IP rights and norms for responsibly creating/sharing synthetic media
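To make the multi-modal analysis skill above a bit more concrete, here is a minimal, purely illustrative sketch in Python of how separate audio, visual, and context checks might be combined into a single credibility estimate. The cue names, weights, and threshold are invented assumptions for the sake of the example, not a real or recommended detection system.

    # Purely illustrative sketch: combining multi-modal cues into one
    # credibility estimate. The cues, weights, and threshold are invented
    # for illustration; real detection pipelines are far more sophisticated.
    from dataclasses import dataclass

    @dataclass
    class MediaCues:
        audio_consistency: float   # 0-1: voice free of splicing/cloning artifacts?
        visual_consistency: float  # 0-1: lighting, lip sync, frame-to-frame coherence
        context_support: float     # 0-1: claim corroborated by independent sources?

    WEIGHTS = {"audio": 0.3, "visual": 0.3, "context": 0.4}  # assumed weights
    FLAG_THRESHOLD = 0.5  # below this, flag for closer human review

    def credibility_score(cues: MediaCues) -> float:
        """Weighted combination of the three cue scores into a 0-1 estimate."""
        return (WEIGHTS["audio"] * cues.audio_consistency
                + WEIGHTS["visual"] * cues.visual_consistency
                + WEIGHTS["context"] * cues.context_support)

    def needs_review(cues: MediaCues) -> bool:
        """Flag items whose combined score falls below the threshold."""
        return credibility_score(cues) < FLAG_THRESHOLD

    # Example: convincing audio, shaky visuals, no corroboration -> flag it
    clip = MediaCues(audio_consistency=0.8, visual_consistency=0.4, context_support=0.2)
    print(f"score={credibility_score(clip):.2f}, review={needs_review(clip)}")

The point of the sketch is the habit it models: no single cue is decisive, and it is the combination of audio, visual, and contextual evidence that should determine whether a piece of media deserves a closer look.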

Such skills are at the heart of Washington’s MisInfo Day. Founded in 2019, the event this year drew more than 500 students from around the state for a day of activities designed to help them become better at navigating “the proliferation of exaggeration, spin, and outright lies that could pass for facts and evidence online.”

While it comes with serious risks related to misinformation and social cohesion, AI-driven video, audio, and mixed reality also enables powerful learning possibilities:

  • Interactive learning experiences driven by AI storytelling engines

  • Personalized tutors and learning that adapts to each student's background, interests, pace, etc.

  • New possibilities for student expression through generative multimedia

  • Simulation-based learning leveraging synthetic scenarios and environments

  • Making abstract concepts viscerally understandable through AI visualization

Moreover, it just might be, as Steve Johnson recently suggested in Inside Higher Ed, that these technologies “signal . . . essential changes in the realms of human thinking, learning, writing and communicating . . . that these new tools open paths to multimodal, nonlinear thinking and communication that humans have not fully explored.”

Food for thought

  • As generative AI allows infinite simulations of anyone's likeness or personal experiences, how do we instill a robust understanding of one’s identity and what is authentically "real" vs. illusion?

  • How might we make learning to detect AI-generated synthetic media as engaging and intuitive as playing the latest videogame or AR app? What kinds of immersive experiences could help people build these skills from an early age?

  • With AI's generative potential extending into such ethically murky areas, what processes should communities use to collaboratively set boundaries and rules around synthetic media's role in education and society? How might we get student voices driving those norms?

  • Envisioning a future in which AIs can simulate virtually any reality, what skills do students need to "hold their own" and intentionally render the worlds and narratives they want to experience? What skills will educators themselves need in order to be effective guides and mentors for students?

  • As AI-enabled virtual learning environments become more common, how might we redesign all hardware, software, and data policies to prioritize authentication, privacy, and ethical oversight from the ground up? What new technological architectures will be required?

 
