Did We Just Change Animation Forever?
Wouldn’t it be cool if you could film yourself and easily turn yourself into anything you want, like a cartoon character? What if filming could just be a means of capturing a performance, and afterward you could visualize whatever your imagination wants? There are basically no limits. Right now, that kind of creativity is only accessible to films and animations with multi-million dollar budgets, but it’s part of our humanity to try to visualize things that don’t exist.
Let’s talk about traditional 2D animation. The most creatively liberated medium is also the least democratized: it takes incredibly skilled people drawing every single frame of your movie to make it happen. But I think we came up with a new way to animate, a way to turn reality into a cartoon. It’s one more step toward true creative freedom, where we can easily create anything we want.
So how do we do this? How do we turn a video into a cartoon? Well, we’ve been making a series of videos where we experiment with AI image processing. This is similar to AI image generation, but in our case we’re transforming images rather than generating them from scratch. One of the newest pieces of technology is a machine learning process known as diffusion. At its core, the diffusion process lets a computer generate an image from noise, much like how we imagine an image in an inkblot or in the clouds. So if we take a picture we already have and put a little bit of noise on top, the computer can clear up that noise while drawing in new details that weren’t there before. It’s a lot like squinting at a picture and trying to imagine you’re looking at something else: the more you squint, the fuzzier everything gets, and the easier it is to imagine you’re looking at something different. Current technology can do this amazingly well with a single image.
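As a rough illustration of that image-to-image idea, here’s a minimal sketch using Hugging Face’s diffusers library. To be clear, this is an assumption for illustration: the shots in this video were processed through a Stable Diffusion UI, not this exact script, and the model name and prompt here are just placeholders.

```python
# A minimal img2img sketch with Hugging Face's diffusers library.
# (Illustrative only: the video's workflow runs through a Stable
# Diffusion UI, not this script.)
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init = Image.open("frame.png").convert("RGB").resize((512, 512))

# `strength` is the "squint" knob: it sets how much noise is layered on
# top of the input before the model cleans it up and imagines new detail.
out = pipe(
    prompt="cel animation, cartoon character, flat shading",
    image=init,
    strength=0.5,        # 0.0 keeps the photo, 1.0 is pure imagination
    guidance_scale=7.5,
).images[0]
out.save("frame_cartoon.png")
```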
So why haven’t we seen this applied to video? Shouldn’t there be mind-blowing visuals all over the internet? Unfortunately, the moment people tried applying this to video, everything fell apart. Because the first step of the process requires us to noise up our image, every single frame ends up looking different, and the video gets super flickery. The very nature of this task seems to make it impossible to ever work with video, and I nearly gave up on using it. But Dean here at the studio kept experimenting, and he showed that with a little bit of VFX problem solving this might be something we could actually overcome. Around that time, Dean and Fenner went out and made that Spider-Verse short, and I set off on a personal quest to use this tech to make my very own anime.
So it was time to Sherlock Holmes our way through this and engineer a solution to this flickering problem. The first clue was Jurassic Park. Six months ago, a YouTube user named hops uploaded an experiment: he had processed Jurassic Park to look like low-poly Zelda, using an interesting new technique to noise up the image, and he achieved excellent results. So let me explain. Every time you process an image, the noise you put on top changes, and therefore the forms in the image change every frame. If we freeze that noise, things become a lot more solid, but now it feels like the image is moving underneath a weird warp layer, and the details stay inconsistent. The new trick used in that little Jurassic Park video is simple: if we can turn noise into an image, let’s just reverse an image back into the noise it would have come from. Literally just do it backwards. That way, the noise is no longer randomly changing every frame, nor is it locked: if two frames are nearly the same, so are their fuzzy, noised-up versions, and therefore they get interpreted the same way. (There’s a rough code sketch of this idea below.)

So we had one piece of the puzzle solved: no more random noise on our image. But when I tried to make my video look like a cartoon, every frame was being drawn in a different cartoon style. We had fixed one problem just to encounter another. Dead end? Well, no, because around this time style models were becoming a thing in the Stable Diffusion space, and a person named nitrosocke started creating some amazing diffusion models built to convert your image into one specific style. Then it dawned on us: this was the key to eliminating the style flicker. Imagine telling a hundred different people to draw a cartoon dog; you’re going to get a hundred different dogs back. Now imagine giving everyone a character style sheet and saying, “Draw a cartoon dog exactly like this.” The images are going to look a lot more similar. So we had to train our own model on only the one style we wanted to replicate.

Another big step forward, but it only uncovered another new problem: in tests I performed on myself, the features of the face were still changing and jumping all over the place. So here was the idea: we once made a video where we trained ourselves into a diffusion model so that we could tell an epic fantasy story, so why not do that here? I trained a model to not only replicate a specific style but also specifically know a character: me. I trained it on images of me wearing the same clothes, on the same green screen background I was using for my test sequence, and suddenly, boom, everything locked in. The consistency between frames was hugely improved. Well, almost; it still wasn’t quite perfect. But we’re VFX artists who have worked with crappy video files; surely we must have a tool in our arsenal to deal with light flickering. So we applied the Deflicker plugin in DaVinci Resolve, set it to remove flickering light, and it was that simple. Suddenly, we were there. It was working: we had a consistent, moving, emotive cartoon character, all driven just by video of us on a green screen. I think we’ve cracked a workflow here for getting something that looks like a cartoon, and it’s pretty bonkers that it’s working. You know, I think we could do maybe a couple more experiments, but it’s just about ready to rock. I think Nico has figured out the key to creating a consistent character.
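Here’s the promised sketch, with one simplifying assumption: instead of a full DDIM-style inversion of each frame back into its own noise (which is what the trick above describes), it just freezes the noise by reusing one seed on every frame, so near-identical frames get near-identical noise. Model name and prompt are placeholders.

```python
# Sketch: per-frame img2img with the noise frozen via a fixed seed.
# (Simplification: the trick described above actually inverts each frame
# back into the noise it would have come from; a fixed seed is the
# simpler "frozen noise" variant.)
import glob
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

for i, path in enumerate(sorted(glob.glob("plates/*.png"))):
    noise = torch.Generator("cuda").manual_seed(1234)  # same noise every frame
    frame = Image.open(path).convert("RGB").resize((512, 512))
    styled = pipe(
        prompt="cel animation of a bearded man, flat shading",
        image=frame,
        strength=0.45,
        generator=noise,
    ).images[0]
    styled.save(f"stylized/{i:04d}.png")
```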
So now it was time to make that anime I’ve always wanted to make, because as you’ve seen on the channel, we have our anime videos: anime fidget spinners, anime self-driving cars, anime baseball. But they’re all filmed for real, because we don’t run an animation studio, and it’s very hard to do one of these for real. So I’ve had this idea forever: anime rock paper scissors. What is the ultimate rock paper scissors game? Two twin princes, born at the same time, with equal claim to the throne. It must be decided that day, for the king has died, and rock paper scissors must happen. So we wrote this short film. The next step is to treat it like a cartoon, which means recording the dialogue first.
Nico, who are you playing today?
I’m playing Flip, and I have no idea what voice I’m going to use. I figured we’d just start, go through a few lines, and then we’ll throw a filter or two on it to help nail the vibe.
My character is Jules. For my research, I just remembered the shot of Clint in here screaming into a microphone, so I’m just going to channel Clint today. Nico said, “You go, you whisper, and then you scream, and there’s no in-between.” Full dynamics: you go really quiet, then you get really loud, and you do full volume compensation on it, so it just sounds like you have this incredible presence where even your quietest moments are as loud as your screams.
Then we designed the costumes for each character; basically, we just went on Etsy and bought some goofy clothes. Well, look at this high-level, um, costuming here. Oh no, we lost like half the tape. You see all these doodads here? They’re not going to be great for the anime style, because this kind of intricate detail wouldn’t be included in the illustrations; it would just be more pencil mileage for the animators. So we’re covering all that up with a color that’s similar to the rest of the costume. Once this is put into the animation space, all this jank gets covered up with beautiful illustrated lines.
What we did on the green screen is we basically imagined that we were puppets, right? We’re puppeteering a cartoon character, so we’re posing like the cartoon character, and we’re not doing any audio recording. Instead, the audio is already recorded, like a real cartoon, and we’re out there basically just being puppets. Once again, we’re trying to treat it like a cartoon, so there’s a specific way we have to film this. We’re basically gathering assets; obviously, we’re gathering the paper cutouts that we’re going to then pose in our comic book. A lot of these shots are just us standing in a position, and we just need a single frame, because a lot of animation is just single frames of characters standing there against the background while you hear dialogue.
And there are a couple of rules we’ve discovered we have to follow for this. First and foremost, the style we’re matching basically has single-direction lighting. It’s not like the characters are painted with a beautiful edge light, a key light, a fill light from a different direction, or an underlight. To actually paint that stuff onto a cel would be a lot of work, so a lot of cartoons don’t do it. They just stick to a basic light-tone, dark-tone shading system at most, and oftentimes it’s just a single paint color.
At the end of the day, we get to take risks. We get to try things that might break it because, boy, look at this giant crew we have. Oh, wait, we have like two people and an actor, right? In Hollywood, you’d have 100 people on set doing a big movie. We don’t need a costume department; we don’t need a makeup department; we don’t need a crazy camera department. Right now, we’re just being very, very simple and very, very straightforward, just focusing on the concept.
Sam and I spent a lot of time conceptualizing these shots because, at the end of the day, anime is really all about tying your visual language to the story: really stylizing it, thinking about things metaphorically, all that kind of stuff. Once again, it’s your ideas that matter, your direction, your story, because the AI takes it the rest of the way. I love it; it’s such an awesome way to work.
Hey guys, so we are commemorating the launch of this anime rock paper scissors video with an awesome limited edition shirt. It’s only available for the next seven days, in tee and long-sleeve tee. It’s super cool, it was designed by a human, and once again, it’s seven days only. These are limited edition, so snag one right now.
Also, if you want a little tip, come over here. Subscriptions to our website cost $3.99 a month, but subscribers get a 15% discount on all merch. So if you really want to, you could go subscribe to our website and then use that discount, and then basically it’s the same price, plus now you have a free month of subscription to our website. Just saying. Don’t tell anyone I told you that. This is between us, cool?
Alright, so now that I’ve walked you through the theory and the techniques we’re going to apply here, let me show you the workflow we came up with that led us to 120 effects shots for this piece.
So here’s the video of me on the green screen, in costume, doing all the acting with the lip sync, right? What we want to do is have the AI basically trace this in a cartoon style. So we need to train a model that knows both what I look like and what style we want to apply. When we were done filming on the green screen, we also took a bunch of pictures of me: my face, my body, different poses, some full-body shots, pictures of my back, different lighting. A picture of Nico Perringer: “Hey there, I am doing, like, the weird Fonzie point.”
So then we went and took a bunch of frames from Vampire Hunter D: Bloodlust, an anime that came out around the year 2000. It’s actually free on YouTube; you can just go watch it if you want to, and it’s pretty cool. We tried to grab frames of different people: some face shots, some torso shots, full-body shots, hands, hair, even some abstract things like flowers. With each picture effectively being a different object, a different character, the model isn’t going to learn any single subject when we train it. Instead, it’s going to learn the style in which all these subjects were drawn.
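As a sketch of that dataset-gathering step: the frames in the video were hand-picked for variety, but the mechanical part of grabbing them could look something like this (the filename is a hypothetical local file).

```python
# Sketch: sample frames from the reference film for a style dataset.
# Many different subjects, one consistent drawing style, so the model
# learns the style rather than any single subject.
import os
import cv2

os.makedirs("style_dataset", exist_ok=True)
cap = cv2.VideoCapture("vampire_hunter_d_bloodlust.mp4")  # hypothetical file
step = int(cap.get(cv2.CAP_PROP_FPS) * 5)  # one candidate every ~5 seconds
i = saved = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if i % step == 0:
        cv2.imwrite(f"style_dataset/{saved:04d}.png", frame)
        saved += 1
    i += 1
cap.release()
```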
One fun little thing here: if you look at the Vampire Hunter D dataset, you’ll notice there are no characters with full beards. There aren’t any characters with full beards in the movie, and this was actually a problem when we first made the model and it tried to represent me. Sometimes I’d have a mustache, sometimes I’d have mutton chops; it was all over the place. So I generated a bunch of pictures of me. Some of them looked good, most of them didn’t, but I took the good ones, re-added them to the dataset, and trained the model one more time. I made a new one: Nico Perringer, in the style of Vampire Hunter D. There it is, me as an anime man. And notice it’s got the costume details correct, it’s got my facial features correct, my beard. It’s the same character every single time, or at least consistent enough every single frame.
So we got the model done, and now we can use it for all the shots of me. I’ve just received a fantastic plate from Nico. What I’m going to do now is take the individual images and run each one through Stable Diffusion to get an animated character.
We have our positive prompt: Vampire Hunter D aesthetic style, cel animation of Nico Perringer, man, beard, profile, fist, hand. But we also have this negative prompt section, which can steer us away from certain things: detailed, intricate, lazy eye, photography, render, CGI. These are all the things we want to avoid in the output. And then there’s a bunch of freaking sliders, and it’s really boring, and Nico goes into all that stuff in way more depth over on CorridorDigital.com, because we don’t want to bog you down here with the really silly, specific stuff. But there are a bunch of processes layered on top of that image-to-image process, and the output is this: a really cool animated frame of Nico. There’s actually an interpretive element to what the AI is doing: there’s blue shading on the fur on his shoulders, and basically two-tone shading on his face. It’s a really incredible process that we’ve gotten way too used to at this point; we kind of forget how crazy this technology is. What’s great is that we can now run the entire image sequence with this prompt and these settings. It’s working really well, but there are a few janky frames, like his head popping in for a second and then popping back out: little inconsistencies, but they only exist on a single-frame basis. If only there were a way to remove a little of this flicker and smooth out this sequence into a nice, consistent character.
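Before the deflicker step, here’s roughly what that per-shot batch could look like in code form. The checkpoint path is a hypothetical stand-in for the custom-trained model, and the settings are placeholders; the real version happens in a Stable Diffusion UI with those “boring sliders.”

```python
# Sketch: batch img2img over a shot with the positive/negative prompts
# described above, using a custom-trained checkpoint (hypothetical path).
import glob
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./models/nico-vampire-hunter-d", torch_dtype=torch.float16  # hypothetical
).to("cuda")

positive = ("vampire hunter d aesthetic style, cel animation of "
            "nico perringer, man, beard, profile, fist, hand")
negative = "detailed, intricate, lazy eye, photography, render, cgi"

for i, path in enumerate(sorted(glob.glob("plates/shot_010/*.png"))):
    noise = torch.Generator("cuda").manual_seed(1234)  # frozen noise, as before
    frame = Image.open(path).convert("RGB").resize((512, 512))
    styled = pipe(
        prompt=positive,
        negative_prompt=negative,  # steers the output away from these looks
        image=frame,
        strength=0.5,
        generator=noise,
    ).images[0]
    styled.save(f"out/shot_010/{i:04d}.png")
```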
Alright, so Dean’s given me this whole image sequence. Let’s check it out. Not bad! It’s a little flickery, but it’s consistent, right? You can see the character, you can see the mouth moving. So we take the shot and I pop it open in Fusion here in Resolve. The next thing we need to add is the Deflicker. There are two different modes, one for timelapse and one for fluorescent lights, and if we turn on the fluorescent-light Deflicker, check out the difference: much better. And if it’s not good enough, here’s how I fix it: copy, paste, paste, paste. Now, with all these Deflickers stacked up, look how stable my face and my costume are. The next step is just to pull a green screen key and remove the background, and then finally reduce the frame rate from 24 frames per second to 12 to look a little more like animation and remove even more of that flickering feeling.
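That frame-rate drop is simple enough to sketch in a few lines. In the actual workflow it happens inside Resolve; this is just the same idea, keeping every other frame, with placeholder folder names.

```python
# Sketch: 24 fps -> 12 fps ("animating on twos") by keeping every other
# deflickered frame. Done in DaVinci Resolve in practice.
import glob
import os
import shutil

os.makedirs("on_twos", exist_ok=True)
frames = sorted(glob.glob("deflickered/*.png"))
for i, path in enumerate(frames[::2]):  # keep frames 0, 2, 4, ...
    shutil.copy(path, f"on_twos/{i:04d}.png")
```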
Now that we have an anime man, we need an anime world for him to live in. That’s where Sam comes in. Very early on in the process, we were thinking about how to get consistency in these shots: we want one environment, one location. So I’m using an environment in Unreal Engine as the foundation for everything. Just like how we’re taking a video still frame and applying a style to it, I’m taking a render from Unreal and applying a style to that. That allows us to have consistency: we can go close up, we can do wide angles, and all the same objects in the scene stay consistent. Here’s our cathedral; I think it was called the Gothic Interior Megapack. The moment I saw the screenshot of this environment on the marketplace, I knew it was perfect.

I’ve done a little tweaking, redone some of the lighting, little modifications here and there to make it match the vibe a little better. It is just bleeding with sweet, intricate, detailed Gothic style, and it’s ripe for taking cool pictures of. I have a camera placed for every shot in the piece, and if we go back out here, look at that: look at all these cameras. Every single one of these cameras is a different angle for a different background plate. So shot 250, it’s a waist-up shot, maybe something like here-ish. Yeah, there we go. I’m not even going to render these out with the normal renderer; I’m literally just going to take screenshots of this stuff. So, screenshot. All right, it’s a whip pan, so let’s pan over. Pan over, boop, and maybe one more just for kicks. All right, so now I’ve got four images, and Stable Diffusion kind of takes it from there. Let’s go to my handy-dandy prompt sheet of all the different little prompts I’ve tried out. I believe it’s this one right here: expressive oil painting, dark beautiful Gothic cathedral interior, hyper-detailed brush strokes, expressive Japanese 1990s anime movie background, oil painting, matte painting. Negative prompt: blurry. I’m all set; let’s take a look.
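Restyling those screenshots is the same image-to-image call as before, just with the background prompt from the sheet. A sketch, with placeholder paths and settings; the consistency across angles comes from the locked 3D scene, not the model.

```python
# Sketch: restyle Unreal screenshots into painted anime backgrounds.
# The fixed 3D scene keeps every angle consistent; Stable Diffusion
# repaints each one in the prompt-sheet style.
import glob
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

bg_prompt = ("expressive oil painting, dark beautiful gothic cathedral "
             "interior, hyper-detailed brush strokes, expressive japanese "
             "1990s anime movie background, oil painting, matte painting")

for i, path in enumerate(sorted(glob.glob("screenshots/shot_250/*.png"))):
    shot = Image.open(path).convert("RGB").resize((768, 512))
    painted = pipe(
        prompt=bg_prompt,
        negative_prompt="blurry",
        image=shot,
        strength=0.6,  # high enough to repaint, low enough to keep layout
    ).images[0]
    painted.save(f"backgrounds/shot_250/{i:02d}.png")
```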
All right, so that’s basically what the process looks like for all these background plates. Now that they’re processed, the shot has everything it needs to get composited. Basically, in an anime there are no 3D moves; it’s all either painted backgrounds or cel-animated characters, so we wanted to stay true to that. For these shots, we have a script created by Nico that has everything we need: it’s got lens distortion, it’s got glows, it’s got light rays, it’s got all this cool stuff. Basically, we just drop in our foreground and our background, and the final output is pretty close to what you want; then you go in, animate little things, and add a bit of flourish.
Here you can see the backgrounds that Sam made. I actually went into Photoshop and blended them together, so you can see the edges smooth out. I took those backgrounds and just literally animated them scrolling past. You can see we’ve also added some directional blur and some lens blur, and then we land on these two windows and do a final little push-in. Once we drop Nico on top of that, you can see there’s this very dynamic effect that’s achieved just by moving 2D images behind him. On top of that, we have this cool plugin called Light Rays, which creates these beautiful rays of light, and we’re using the plate of Nico to actually obscure those rays, so it really marries Nico to the environment.
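That scrolling-background move is easy to sketch: slide a crop window across the wide, blended painting, one frame at a time. Filenames and timing are placeholders; the real version was animated in the compositing app with blurs layered on top.

```python
# Sketch: fake a camera pan by sliding a 1920x1080 crop window across a
# wide, Photoshop-blended background painting, one frame at a time.
import os
from PIL import Image

os.makedirs("pan", exist_ok=True)
bg = Image.open("blended_background.png")  # hypothetical wide plate
W, H = 1920, 1080
n_frames = 24  # two seconds at 12 fps
for i in range(n_frames):
    x = int(i / (n_frames - 1) * (bg.width - W))  # assumes bg.width > W
    bg.crop((x, 0, x + W, H)).save(f"pan/{i:04d}.png")
```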
To emphasize this motion even more, we took some of the 3D elements, the candelabras from the Unreal scene, isolated them, and turned them into animated candelabras too. You can see here, as the camera’s moving, you just whip a candelabra by as if it’s in the foreground, to sell the idea that the camera is literally orbiting around in classic anime fashion. To really emphasize the motion, we bring in these speed lines right here at the end. We have a couple more anime lines to emphasize the point, because this is a very important character moment: he’s challenging my character. So all of that together, with a little bit of lens effects, and then the final piece: these glows, which emulate a film camera, because that’s how anime used to be shot back in the day. Animation cels would be filmed through a film camera, so we try to emulate those glows. After all that, we get a sweet shot that I can send back to Nico, and he can drop it right into the edit.
One thing I want to talk about is the democratization of this process. We’re looking at a piece of software that’s free, that anyone has access to, and a process that we’re sharing openly with everyone, because everyone has openly shared their knowledge with us. The way this technology is developing, I want to be able to create really cool things, but the only way I can is with the help of other people making the technology that enables those really cool things. And so, since we’ve learned from and are using an open-source program built on a lot of people’s contributions, I want to give back. I want to contribute our knowledge and put it out there so that you can go make animations, and people can experiment and improve upon the process, helping all of us get better.
So in this video, I’m trying to really go through the process in a way that’s interesting, but I’m not getting into a click-by-click tutorial. For those of you who want that, we have it on our website, CorridorDigital.com; I’ll have a link in the description below. And of course, it’s your support that makes this video even possible: it’s why we’re able to take four or five people and spend two months just working on an anime about rock paper scissors. It’s thanks to all the support from the members at CorridorDigital.com. And with practice, because we’ve got this process dialed in shot by shot, we just kept getting faster and better, and at this point we are blowing through the piece, and it is working. It’s incredibly exciting.
Now, most people in the office haven’t actually seen much of this yet. So we’re going to get it wrapped up here with the sound, the music, and all that good stuff. I’m going to show it to them; let’s see what they think, let’s see if we can blow their minds. It’s done. Theoretically, everything’s dropped in. I haven’t watched it yet, I just exported it, and everybody’s going to watch it with us. I think it’s safe to say that most of us here have not seen anything. I’ve seen one completed shot, and it absolutely blew my mind, and I can’t wait to see the rest of it. I have not seen it with the sound design, and that is what I’m super excited for, because Kevin Senzaki is so good at what he does. And man, well, I’m here for Kevin. Damn right you are. [Music]
It feels like a real anime, in real life, and that’s what it is. There’s all sorts of wacky AI jank that creeps in, but you’re so used to it that it doesn’t feel like jank anymore. The song that Sam found for the ending; that amazing shot Fenner made where my fist comes down and there’s just this anime blast that happens; and then you have this drum roll coming in. Yeah, it all just melts together, and you just get kind of lost. Are we going to get another one? Dean is coming. Vengeance. At the end of all the anime videos, we always do one of those montages of, “And now here’s all the things that happened afterwards,” which is supposed to be the end-credits joke. But this one is the ultimate fast forward: a whole season in this montage. And, you know, it subconsciously makes you go, “Oh yeah, there’s a whole story here,” but we were just spitballing random shots. Do we get to do another one? It’s entirely up to CorridorDigital.com subscribers. We have a tutorial on how we do this whole thing, plus a whole bunch of other cool stuff. If you want to check it out, support us and help us make more stuff like this.
Super cool, you know. This is a fun video and all, but I’m genuinely curious. Oh, brother. I’m proud of you, though. Yeah, I’ll make you proud. Would you dare do one in real life? It happens at normal speed, okay? One, two, three. One, two, three, shoot. [Music] Okay. [Music]