This recent Twitter post by @emmanuel_2m demonstrates how Generative AI is changing gaming. In it, he explores using Stable Diffusion + Dreambooth (popular 2D Generative AI models) to create images of potions for a hypothetical game.
What’s transformative about this work is not just that it saves time and money while also delivering quality – thus smashing the classic “you can only have two of cost, quality, or speed” triangle. Artists can now create high-quality images in hours, compared to weeks if they were to do it manually. What’s truly transformative is that:
- This creative power is now available to anyone willing to learn a few tools.
- These tools allow for endless possibilities in a highly iterative fashion.
- Once trained, the process is real-time – results are available near instantaneously.
There hasn’t been a technology this revolutionary for gaming since real-time 3D. Spend any time talking to game developers and the sense of excitement and wonder is palpable. Where is this technology taking us? How will it transform gaming? First, though, let’s review what Generative AI is.
What is Generative AI?
Generative artificial intelligence is a sub-category of machine learning that allows computers to generate new content based on user prompts. Text and images are the most common applications of this technology today, but work is underway in many other creative domains, including animation, sound effects, music, and virtual characters with fully fleshed-out personalities.
AI is not a new concept in games. Even early games, like Atari’s Pong, had computer-controlled opponents to challenge the player. These virtual foes were not AI as we know it today, however. They were scripted programs created by game designers. They simulated an artificially intelligent opponent, but they couldn’t learn, and they were only as good as the programmers who built them.
What’s different now is the amount of computing power available, thanks to faster microprocessors and the cloud. With this power, it’s possible to build large neural networks that can identify patterns and representations in highly complex domains.
This blog post is divided into two parts:
- Part I is a compilation of our observations, predictions, and conclusions in the field of Generative AI for gaming.
- Part II is our market map of the space, which outlines the different segments and names key companies within each.
Assumptions
First, let’s explore some assumptions underlying the rest of this blog post:
1. AI research will continue to expand, resulting in ever more efficient techniques.
This graph shows the number of papers on Machine Learning and Artificial Intelligence published in the arXiv archives each month.
As you can see, the number of papers keeps growing at an exponential rate. And this just includes published papers – much of the research is never even published, going directly to open source models or product R&D. This has led to an explosion of interest and innovation.
2. Of all forms of entertainment, games will be the most affected by Generative AI.
Games are the most complicated form of entertainment in terms of the number of asset types involved. They are also the most interactive, with a heavy emphasis on real-time experiences. This creates a significant barrier to entry for new game designers, as well as a high price to produce a modern, top-selling game. It also creates a large opportunity for Generative AI disruption.
Red Dead Redemption 2 is one of the most expensive games ever made, costing nearly $500 million. It’s easy to see why – it has one of the most beautiful, fully realized virtual worlds of any game on the market. It took almost 8 years to create, has more than 1000 non-playable characters (each one with their own personality, artwork and voice actor), covers a world of nearly 30 square miles, has more than 100 missions, and features nearly 60 hours worth of music by more than 100 musicians. Everything about this game is big.
Now compare Red Dead Redemption 2 to Microsoft Flight Simulator, which is not just big, it’s enormous. Microsoft Flight Simulator allows players to fly all over the planet Earth, all 197 million square miles of it. How did Microsoft create such a huge game? By letting an AI do it. Microsoft partnered with blackshark.ai to train an AI to create a photorealistic 3D world from 2D satellite images.
This is an example of a game that AI makes possible. Furthermore, these models can be improved continuously over time. For example, they can enhance the “highway cloverleaf overpass” model, re-run the entire build process, and suddenly all the highway overpasses on the entire planet are improved.
3. Each asset that is involved in game production will be covered by a generative AI model
So far, 2D image generators such as Stable Diffusion and Midjourney have captured most of the popular excitement around Generative AI, thanks to the eye-catching images they produce. But Generative AI models already exist for almost all game assets, including 3D models, character animations, dialog, and music. The second part of this blog post includes a market map highlighting companies focused on each type of content.
4. In some cases, the price of content may drop to almost nothing.
Game developers who have been experimenting with Generative AI in their production process tell us that the most exciting aspect is the drastic reduction in cost and time. One developer shared with us that their time to produce concept art for a single image, from start to end, has decreased from 3 weeks to one hour. That’s a 120-to-1 drop. We believe similar savings will be possible across the entire production pipeline.
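The 120-to-1 figure works out if the three weeks are counted as working hours. The quick check below assumes a 5-day week of 8-hour days; that schedule is an assumption made here for illustration, since the developer did not specify one.

```python
# Sanity check of the 120-to-1 figure.
# Assumption: "3 weeks" means working time at 5 days/week and 8 hours/day.
WEEKS_BEFORE = 3
DAYS_PER_WEEK = 5   # assumed
HOURS_PER_DAY = 8   # assumed

hours_before = WEEKS_BEFORE * DAYS_PER_WEEK * HOURS_PER_DAY
hours_after = 1

ratio = hours_before / hours_after
print(f"{hours_before} hours -> {hours_after} hour: {ratio:.0f}-to-1")
```

Counted in calendar hours instead (24 hours a day, 7 days a week), the same three weeks would give a much larger ratio, so the working-hours reading is the one that matches the stated number.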
To be clear, artists are not at risk of being replaced. What changes is that artists no longer have to do all of the work themselves. They can set the initial creative direction and then leave the technical execution and time-consuming tasks to an AI. In this, they are like cel painters from the early days of hand-drawn animation in which highly skilled “inkers” drew the outlines of animation, and then armies of lower-cost “painters” would do the time-consuming work of painting the animation cels, filling in the lines. It’s the “auto-complete” for game creation.
5. This revolution is still in its infancy and many practices need to be improved.
Despite all the excitement, we’re still only at the start. We still have a lot of work to do before we can harness the new technology for gaming. Companies who are quick to move into this space will reap enormous benefits.
Predictions
Given these assumptions, here are some predictions for how the game industry might change.
1. Learning to use Generative AI effectively will become a highly marketable skill.
We are already seeing some people use Generative AI more effectively than others. Making the most of this new technology requires a variety of tools and techniques, and knowing how to move between them. We predict this will become a marketable skill that combines the creative vision of an artist with the technical skills of a programmer.
Chris Anderson is famous for saying, “Every abundance creates a new scarcity.” As content becomes abundant, we believe it’s the artists who know how to work most collaboratively and effectively with the AI tools who will be in the shortest supply.
Using Generative AI for production artwork presents special challenges, including:
- Coherence. Any production asset needs to be editable down the line. With an AI tool, that means being able to reproduce the asset from the same prompt so that you can then make changes. This can be difficult, as the same prompt can produce very different results.
- Style. It’s important for all art in a given game to have a consistent style – which means your tools need to be trained on or otherwise tied to your given style.
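The coherence problem can be illustrated with a toy example. The sketch below is not a real image model or library API: `toy_generate` is a hypothetical stand-in that returns a few pseudo-random "pixels", and the prompt is made up. It only shows that the output depends on a random seed as well as the prompt, so recording the seed alongside the prompt is what makes a result repeatable. Real image generators typically expose a seed setting that plays a similar role.

```python
import hashlib
import random

def toy_generate(prompt: str, seed: int, size: int = 4) -> list[int]:
    """Hypothetical stand-in for an image model: returns `size` pseudo-pixels."""
    # Derive the random stream from both the prompt and the seed, so the
    # pair (prompt, seed) fully determines the output.
    digest = hashlib.sha256(f"{seed}:{prompt}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.randrange(256) for _ in range(size)]

prompt = "a glowing health potion, game icon"  # example prompt, not from the post

first = toy_generate(prompt, seed=1)
again = toy_generate(prompt, seed=1)  # same prompt, same seed
other = toy_generate(prompt, seed=2)  # same prompt, different seed

print("same seed reproduces the asset:", first == again)
print("different seed, same prompt:", other)
```

In a production pipeline this would suggest storing the prompt, the seed, and the model version with every generated asset, so it can be regenerated and edited later.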
2. Lowering barriers will result in more risk-taking and creative exploration
We may soon be entering a new “golden age” of game development, in which a lower barrier to entry results in an explosion of more innovative and creative games. This is not only because lower production costs mean lower risk, but because these tools enable the creation of high-quality content for a wider audience. Which leads to the next prediction…
3. A rise in AI-assisted “micro game studios”
Armed with Generative AI tools and services, we will start to see more viable commercial games produced by tiny “micro studios” of just 1 or 2 employees. The idea of a small indie game studio is not new – hit game Among Us was created by studio Innersloth with just 5 employees – but the size and scale of the games these small studios can create will grow. This will result in…
4. An increase in the number of games released each year
Roblox’s success has shown that providing powerful creative tools results in more games being built. Generative AI will lower the bar even further, creating an even greater number of games. The industry already suffers from discovery challenges – more than 10,000 games were added to Steam last year alone – and this will put even more pressure on discovery. However, we will also see…
5. New game types that were not possible before Generative AI
New game genres will emerge that would not have been possible without Generative Artificial Intelligence. We already talked about Microsoft’s flight simulator, but there will be entirely new genres invented that depend on real-time generation of new content.
Take Arrowmancer by Spellbrush. This RPG game features AI-created characters that allow for almost unlimited gameplay possibilities.
Another game developer uses AI to allow players to create their own avatar. Previously they had a collection of hand-drawn avatar images that players could mix-and-match to create their avatar – now they have thrown this out entirely, and are simply generating the avatar image from the player’s description. Letting players generate content through an AI is safer than letting them upload their own content, since the AI can be trained to avoid offensive content while still giving players a greater sense of ownership.
6. AI tools that are specific to the industry will have more value than foundational models.
Although the excitement surrounding foundational models such as Midjourney and Stable Diffusion is driving eye-popping valuations and attention, the constant flood of research all but guarantees that new models will come and go. Consider the website search traffic to three popular Generative AI models, including Stable Diffusion and Midjourney. Each new model has its turn in the spotlight.
Another approach is to create industry-aligned tools that are focused on the Generative AI needs of a specific industry. This would include a deep understanding of a particular audience and rich integration with existing production processes (such as Unity and Unreal for gaming).
Runway is a good example. It targets the needs of video creators with AI-assisted tools such as video editing, motion tracking, inpainting and green screen removal. Tools like this can build and monetize an audience, adding new models over time. We have not yet seen a suite such as Runway emerge for games, but we know it’s a space of active development.
7. There are legal challenges ahead
The common thread in all these Generative AI models is the use of large datasets of content. Many of them are created by scraping the Internet. Stable Diffusion is an example of this. It was trained on more than 5 billion image/caption pairs scraped from various websites.
At the moment these models are claiming to operate under the “fair use” copyright doctrine, but this argument has not yet been definitively tested in court. It is clear that there will be legal challenges which will change the landscape for Generative AI.
It’s possible that large studios will seek competitive advantage by building proprietary models on internal content they have clear right and title to. Microsoft, for example, is well positioned here, with 23 first-party studios today and 7 more after its acquisition of Activision.
8. Programming will not be disrupted as deeply as artistic content – at least not yet
Software engineering is the other major cost of game development, but as our colleagues on the a16z Enterprise team have shared in their recent blog post, Art Isn’t Dead, It’s Just Machine-Generated, generating code with an AI model requires more testing and verification, and thus has a smaller productivity improvement than generating creative assets. Coding tools like Copilot may provide moderate performance improvements for engineers, but won’t have the same impact… at least anytime soon.
Recommendations
These predictions have led us to the following recommendations:
1. Start exploring Generative AI now
It’s going to take a while to figure out how to fully leverage the power of the coming Generative AI revolution. Companies that start now will have an advantage later. We are aware of several studios that have ongoing experimental projects to test how these techniques could impact production.
2. Look for market map opportunities
Some parts of our market map are very crowded already, like Animations or Speech & Dialog, but other areas are wide open. We encourage entrepreneurs interested in this space to focus their efforts on the areas that are still unexplored, such as “Runway for Games”.
Current state of the market
We have created a market map to capture a list of the companies we’ve identified in each of the categories where we see Generative AI impacting games. This blog post will go through each of these categories, providing more information and highlighting the most interesting companies in each.
2D Images
One of the most common applications of generative AI is the creation of 2D images from text. Tools such as Stable Diffusion and Midjourney can create high-quality 2D images from text prompts, and they are already being used at various stages of game production.
Concept Art
Generative AI tools are excellent at “ideation”, helping non-artists such as game designers explore concepts and ideas very quickly to generate concept artwork, a key part of the production process. One studio, which is remaining anonymous, uses several of these tools in combination to drastically speed up the concept art process. They can create an image in a matter of days, where it used to take as many as three weeks.
- Their game designers use Midjourney first to explore new ideas and create images that they find inspiring.
- These get turned over to a professional concept artist who assembles them together and paints over the result to create a single coherent image – which is then fed into Stable Diffusion to create a bunch of variations.
- They discuss these variations, pick one, paint in some edits manually – then repeat the process until they’re happy with the result.
- At that stage, they pass this image back into Stable Diffusion one last time to “upscale” it to create the final piece of art.
2D Production Art
Some studios are already trying out the same tools to create in-game production artwork. Here’s an example from Albert Bozesan on using Stable Diffusion to create in-game 2D assets.
3D Artwork
3D assets are the core of modern games as well as the future metaverse. A virtual world (or game level) is basically a collection of 3D assets, placed and modified to populate the environment. However, creating a 3D asset is more complicated than creating a 2D image. It involves several steps, including creating a 3D model and adding textures and effects. For animated characters, it also involves creating an internal “skeleton”, and then creating animations on top of that skeleton.
We’re seeing several different startups going after each stage of this 3D asset creation process, including model creation, character animation, and level building. This is not yet a solved problem, however – none of the solutions are ready to be fully integrated into production yet.
3D assets
Hypothetic, Mirage and Kaedim are three startups attempting to solve the 3D modeling problem. Larger companies are also looking at the problem, including Nvidia’s Get3D and Autodesk’s ClipForge. Kaedim is focused on image-to-3D, while Get3D and ClipForge are focused on text-to-3D. Hypothetic is interested in both text-to-3D search and image-to-3D.
3D Textures
A 3D model only looks as real as the textures and materials applied to it. The look and feel of a scene can be drastically altered by the choice of texture, such as a weathered, mossy one. Textures include metadata that describes how light reacts with the material (e.g. roughness, shininess, etc.). Letting artists create textures easily from text or image prompts will greatly increase the speed of iteration within the creative process. BariumAI, Ponzu and ArmorLab are all looking into this possibility.
Animation
Creating great animation is one of the most time-consuming, costly, and skill-intensive parts of the game creation process. Motion capture is one way to lower costs and create realistic animation. This involves putting an actor or dancer into a motion capture suit and recording them on a specially equipped motion capture stage.
We’re now seeing Generative AI models that can capture animation straight from a video. This is a much more efficient method of capturing animation, as it eliminates the need for expensive motion capture equipment. These models can also apply filters to existing animations, making them look happy, old, or drunk. Plask, DeepMotion and RADiCAL are some of the companies that are pursuing this market.
Level design & world building
One of the most time-consuming aspects of creating a game is building its world, a task that generative AI should be well suited to. Games like Minecraft, No Man’s Sky, and Diablo are already famous for using procedural techniques to generate their levels, in which levels are created randomly, different every time, but following rules laid down by the level designer. The new Unreal 5 game engine also offers a collection of open-world design tools, such as foliage placement.
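To make "created randomly, but following rules laid down by the level designer" concrete, here is a minimal sketch of procedural generation. It is an illustration only and not how any of the games named above work: the grid format, the `generate_level` function, and its two rules are all assumptions chosen for brevity.

```python
import random

def generate_level(seed: int, width: int = 12, height: int = 6,
                   wall_chance: float = 0.25) -> list[str]:
    """Build a random grid level that obeys two designer rules."""
    rng = random.Random(seed)  # the seed makes each level repeatable
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            on_border = x in (0, width - 1) or y in (0, height - 1)
            # Rule 1: the level is always enclosed by walls.
            if on_border or rng.random() < wall_chance:
                row.append("#")
            else:
                row.append(".")
        rows.append(row)
    # Rule 2: the start (S) and exit (E) tiles always exist and are never walls.
    rows[1][1] = "S"
    rows[height - 2][width - 2] = "E"
    return ["".join(row) for row in rows]

for line in generate_level(seed=7):
    print(line)
```

Changing the seed produces a different layout, while the two rules hold for every seed. This sketch does not check that the exit is reachable from the start; a real generator would add rules like that, and a generative model would aim to learn such constraints from example levels instead of having them hand-coded.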
We’ve seen a few initiatives in the space, like Promethean, MLXAR, or Meta’s Builder Bot, and think it’s only a matter of time before generative techniques largely replace procedural techniques. Academic research has been ongoing in this area for some time, including the development of generative techniques in Minecraft and level design in Doom.
A compelling reason to be excited about generative AI tools for level design is the ability to create levels and worlds in different styles. You could imagine asking tools to generate a world in 1920’s flapper-era New York, vs. a dystopian Blade Runner-esque future, vs. a Tolkien-esque fantasy world.
The following concepts were generated by Midjourney using the prompt, “a game level in the style of…”
Audio
Sound and music are integral to the gameplay experience. We’re starting to see companies using Generative AI to generate audio to complement the work already happening on the graphics side.
Sound Effects
AI has a lot of potential in sound effects. There have been academic papers exploring the idea of using AI to generate “foley” in film, but there are few products available in the gaming market.
We think this is only a matter of time, since the interactive nature of games makes this an obvious application for generative AI, both creating static sound effects as part of production (“laser gun sound, in the style of Star Wars”), and creating real-time interactive sound effects at run-time.
Consider something as simple as generating footstep sounds for the player’s character. Most games include a limited number of pre-recorded footstep sounds, such as walking on grass or running on grass. These are tedious to generate and manage, and they can sound repetitive at run-time.
A better approach would be to use real-time generative AI models for foley sound effects. These models can generate sound effects that are slightly different each time and that respond to game parameters such as the character’s gait and weight.
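As a rough illustration of "slightly different each time" and "responsive to game parameters", the sketch below varies playback settings per footstep. It is not a generative audio model, only a hand-written stand-in; the `Footstep` type, the `next_footstep` function, and every numeric value in it are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Footstep:
    pitch: float   # playback-rate multiplier, 1.0 = unchanged
    volume: float  # 0.0 (silent) to 1.0 (full)

# Hypothetical base pitch per gait; the values are illustrative only.
GAIT_PITCH = {"walk": 1.0, "run": 1.15}

def next_footstep(weight_kg: float, gait: str, rng: random.Random) -> Footstep:
    """Return slightly varied playback settings for one step."""
    # Heavier characters sound louder, clamped to the valid range.
    base_volume = min(1.0, 0.4 + weight_kg / 200.0)
    # Small random jitter so that consecutive steps vary.
    pitch = GAIT_PITCH[gait] * rng.uniform(0.95, 1.05)
    volume = min(1.0, base_volume * rng.uniform(0.9, 1.0))
    return Footstep(pitch=pitch, volume=volume)

rng = random.Random(42)
for _ in range(3):
    print(next_footstep(weight_kg=80, gait="walk", rng=rng))
```

A generative model would go further by synthesizing the waveform itself, but the interface would likely look similar: game state goes in, and a fresh sound comes out for every step.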
Music
Music has always been a challenge for games. It’s important, since it can help set the emotional tone just as it does in film or television, but since games can last for hundreds or even thousands of hours, it can quickly become repetitive or annoying. Also, due to the interactive nature of games, it can be hard for the music to precisely match what’s happening on screen at any given time.
Adaptive music has been a topic in game audio for more than two decades, going all the way back to Microsoft’s “DirectMusic” system for creating interactive music. DirectMusic was not widely adopted due to the difficulties of creating music in this format. Only a few games, like Monolith’s No One Lives Forever, created truly interactive scores.
Now we’re seeing a number of companies trying to create AI generated music, such as Soundful, Musico, Harmonai, Infinite Album, and Aiva. And while some tools today, like Jukebox by OpenAI, are highly computationally intensive and can’t run in real-time, the majority can run in real-time once the initial model is built.
Speech and Dialog
Many companies are trying to create realistic voices for in-game characters. This is not surprising, given the long history of using speech synthesis to give computers a voice. Sonantic, Coqui and Replica Studios are just a few of them.
There are several advantages to using generative AI for speech, which partly explains why this space is so crowded:
- Generate dialog on-the-fly. Most speech in games is recorded from voice actors, which limits characters to canned lines. With generative AI dialog, characters can say anything – which means they can fully react to what players are doing. Combined with more intelligent AI models for NPCs (outside the scope of this blog, but an equally exciting area of innovation right now), the promise of games that are fully reactive to players is coming soon.
- Role playing. Many players wish to pretend to be fantasy characters with little to no resemblance to real-life. However, this fantasy is broken when players use their own voices. Using a generated voice that matches the player’s avatar maintains that illusion.
- Control. You can adjust the tone of the speech as it is generated.
- Localization. It allows dialogue to be translated into other languages and spoken in the same voice. Deepdub is a company that specializes in this niche.
NPCs and player characters
Many startups are exploring the use of generative AI to create believable characters that you can interact with. It’s a crowded market, partly because it has so many applications beyond games, such as virtual assistants and receptionists.
Since the inception of AI research, efforts to make believable characters have been ongoing. In fact, the definition of the classic “Turing Test” for artificial intelligence is that a human should be unable to distinguish between a chat conversation with an AI versus a human.
There are currently hundreds of companies developing general-purpose chatbots, many of them powered by GPT-3-like language models. A smaller number are building chatbots for entertainment, such as Replika and Anima, which are trying to build virtual friends. The idea of having a virtual girlfriend, as explored in the movie Her, may be closer than you think.
Now, we are seeing the next generation of chatbot platforms like Charisma.ai and Convai.com. These platforms allow for fully rendered 3D characters. This is important if they’re going to fit within a game or have a narrative place in advancing the plot forward, versus purely being window dressing.
All-in one platforms
Runwayml.com is one of the most popular generative AI tools, because it combines a wide range of creator tools into a single package. No comparable platform serves video games today, and we believe this is an overlooked opportunity. We would like to invest in a solution which features:
- A complete set of generative AI tools covering the production process (code, asset generation, textures, audio, descriptions, etc.).
- Tight integration with popular game engines, such as Unity or Unreal.
- A design that fits into a typical game production pipeline.
Conclusion
This is a fantastic time to be a game creator! Thanks in part to the tools described in this blog post, it has never been easier to generate the content needed to build a game – even if your game is as large as the entire planet!
One day it may even be possible to imagine an entire personalized game, created just for the player, based on exactly what the player wants. This has been in science fiction for a long time – like the “AI Mind Game” in Ender’s Game, or the holodeck in Star Trek. But with the tools described in this blog post advancing as quickly as they are, it’s not hard to imagine this reality is just around the corner.
If you are a founder, or potential founder, interested in building an AI for Gaming business, we would love to hear from you!
***
The views expressed here are those of the AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Some information in this document has been obtained from third parties, including portfolio companies of funds managed by a16z. Although the information is believed to be reliable, a16z has not independently verified it and makes no representations about its current or enduring accuracy or its suitability for a particular situation. This content may also contain third-party advertisements. a16z is not responsible for reviewing such advertisements and does not endorse any advertising content.
This content is for informational purposes only and should not be regarded as legal, financial, investment or tax advice. For these matters, you should consult your own advisors. References to digital assets or securities are intended to be illustrative and do not constitute an investment recommendation. This content is not intended to be used by investors or potential investors and should not be relied on when making a decision to invest in any fund managed by a16z. An offer to invest in an a16z fund will be made only by the private placement memorandum and subscription agreement of that fund, which should be read in their entirety. The portfolio companies or investments mentioned or referred to are not representative of all investments in vehicles managed by a16z. There is no guarantee that future investments will have similar characteristics and results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.
These graphs and charts are provided for informational purposes only and should not be used to make investment decisions. Past performance does not necessarily indicate future results. The information is current as of the date indicated. These materials may contain projections, estimates, forecasts, targets, prospects and opinions. They are subject to change without notice. Please see https://a16z.com/disclosures for additional important information.