John Gaeta burst onto the visual effects radar as part of the Oscar-winning visual effects team of "The Matrix." Sporting a costume not unlike that of the film's protagonist at the 1999 Oscars, Gaeta declared it the dawn of the age of virtual reality. "'The Matrix' is real," he said.
Five years later, Gaeta still champions the use of virtual reality as a filmmaking tool. And he has the technology to make a strong case. Using a method they call Universal Capture, Gaeta and his team at ESC peopled entire sequences in "The Matrix Revolutions" with believable CG humans.
Relaxing at his home in Northern California, Gaeta talked to VFXPro about virtual cinematography, optical flow algorithms and other household tools employed in the making of 'Matrix' sequels 'Reloaded' and 'Revolutions.'
What was your role in the making of the film?
I was the senior visual effects supervisor within Eon. It is Larry and Andy's company and it is the same company that did all of the other media. My department oversaw the development and the hiring and subcontracting of all of the visual effects facilities.
But those lines get fuzzy. On films this complicated, there is not one of anything. When you are talking about producing, you are talking about three producers -- two line producers, and an overall producer.
What did your job consist of, aside from hiring the subcontractors?
I see myself first as a designer. I was also an animation director, and the liaison between Larry and Andy and all of the other visual effects supervisors.
At times during the shoot they were unavailable because they were locked on a stage and just could not get out. We had a 270-day shooting period -- that is almost 270 days of blackout. But that does not mean information doesn't go out -- I would find them at any hour of the day or night and they would lay out their vision for me. Then I went back to my team and articulated it to them.
I worked closely with visual effects supervisors Dan Glass and John Desjardins. Desjardins was applied to all of the material related to the real world, and Dan Glass was on all of the material related to the Matrix.
And they both worked for Eon as well?
Yes. In terms of our communication structure, I could devise an Eon plan with either of those fellows with respect to coordinating the interactions between all of the vendors and the overall film departments. They were the logistics foundation, so to speak. That allowed me to do what I mentioned, which was overall design, but also to administrate. And I could work on technical aims and objectives as opposed to having to scrutinize how to synchronize lightning flashes to motion control.
When you sat down to plan for the films, what did you do first?
There was a one-year development phase, which began in January of 2000. It was then that we launched conceptual art and design, storyboarding, 3D planning for scene and location concepts, and visualization and R&D.
That was a nice experience because all of the solid relationships among the designers and Larry and Andy were in place on day one. Also, by the time we finished the first 'Matrix,' we started becoming very engaged in the notion of creating some sort of information architecture for the production. We needed to have a very hard connection between production design and visual effects, the two departments that led a lot of the overall planning strategies.
Did you create that information architecture from scratch?
Yes, we did. We were originally investigating this software called Tea and Cakes which was created in the UK for ad firms. We would have had to modify it extensively because it was never used for features. We wanted to be able to use it for the two features, the game, the ten animes and anything else we were working on.
We wound up feeling like it was becoming so customized that we should just write it. So we hired a couple of fellows, one of them is Charles Henrich and the other is Tim Bicio. They wrote the whole digital dailies and asset management system.
Did that simplify things?
It was the foundation of order. We were doing nearly 2,000 visual effects shots and had a lot of multimedia going on all sides. There was really no other way to go to try to keep sanity there.
There were something like 150 sets, and they were a kind of jigsaw puzzle that we had to try to distribute through the FOX Studios lot in Sydney. Almost all of them needed some kind of digital extension or augmentation. The entire schedule for the film was dictated by the order in which the sets could come off the assembly line. Some of the sets appeared in both films, so we shot them together.
What were some of your aims and objectives technologically?
What is beautiful about this project is that Larry and Andy like a diversified slate of effects. And of course, they allow for creative freedom; they are pragmatic because they have a lot of things to do. So we work on a lot of experimental effects and a lot of stuff that is more reliable, that we know we will get results with.
The beginning of the process was to discard the Bullet-time technology from the first film, because it was not remotely capable of delivering the type of content that was being requested in the new scripts.
Bullet-time is a concept that was in the first script. The idea is that the time and space of the camera is detached from that of its subject, which makes it seem virtual. The object is real, but you have a sort of God's eye perspective or the control you might have in a game or a virtual reality simulation.
However, the bullet-time technology of the late nineties was restricted to the camera paths that you determined in advance, using pre-viz. There was no straying from that path. Thusly it was not really virtual, it just suggested the virtual.
What we wanted to do was try to create the technology that was being suggested, and what that really involved was attempting to create virtual components of human beings doing dynamic things within the locations and sets that were being made for the film. So we took the beginnings of the possibilities of the image-based rendering method that was developed by George Borshukov and Dan Piponi and Kim Libreri on the backgrounds in the first film and tried to create a real-time performance capture system. It needed to acquire the shapes of human beings and the textures and all of that, but also to process them into virtual humans that gave exactly the same performances that we were acquiring from the actors. The result of that is virtual cinema and virtual effects.
Would you say that is the cornerstone technology of the film?
It is not the technology, it is the content type. If I had to define virtual cinema, I would say it is somewhere between a live-action film and a computer-generated animated film. It is computer generated, but it is derived from real world people, places and things.
What do you think the value of that content is?
I think it is limitless. I think it is the beginning of the visual side of virtual reality. When you see a film, it is a passive experience, but that technology and that content type will quickly move into interactive media like games and simulation for any purpose you can imagine, whether it be military, scientific or pornographic. It is going to move that way in the next couple of decades.
I do not claim that the technology is done -- it is not. But you can have near total realism with humans in close-up shots, and they can speak articulately by way of the universal capture process -- this is what I think is the difference with regards to filmmaking. This is not an animator in a dark room creating a performance or pushing a performance through a shell. If you wanted to simulate a performance by Jack Nicholson, you could have the precise performance. That is different. It has not been done quite that way yet.
How is that valuable for filmmaking?
Well, of course you are not going to do "Shakespeare in Love" in this way. But there are definitely going to be visionary filmmakers out there that will approach subjects in ways that have not been done before. Someone like Kubrick or Hitchcock -- even Mamoru Oshii or Spike Jonze. Imagine Spike Jonze exploring the subconscious with the freedom to move the viewer's perspective around events in a fashion that is not physically possible. The possibilities are as limitless as the visual imagination of the director and his team.
You do not have a projected outcome?
I think it is inevitable that there will be a form of animated film that is not necessarily computer-generated animation. I think there will be virtual films at some point. And they could be interactive. For example, you could take a Pixar movie and allow the kids to move off of the original interest point.
But I think there is going to be a stage after that where there is going to be a virtual film. And a virtual film does not necessarily have to be realistic, it can blend stylistic elements. It is cinema, you want to make it visually entertaining and engaging.
For example, the Superpunch sequence is the idea of a virtual effect applied over a virtual human. As in, we have a performance captured from Hugo Weaving that we have augmented with a dynamic event to distort his face.
Or, for example, when Neo and Smith crash to the Earth after the flying fight, and the entire city starts coming down, that is a virtual environment taken from Sydney. It is an actual corner in Sydney. It looks photographed but it is not. It is a virtual location, 3D, and we applied destruction effects over top of it. That is an example of a virtual effect.
Virtual cinema is a new content type, virtual cinematography is a process. I think the term 'virtual cinematography' first appeared on VFXPro in 1999 -- we made it up at that time. We wanted to define what we were doing.
Virtual cinematography is flexible enough to let technologies change underneath the umbrella of the term. What we may do with universal capture today will certainly be a different type of technology a few years from now. It will probably be high-speed, higher-resolving, high-def cameras that are arrayed in a different pattern, but it will still be virtual cinematography, the process of it.
How did you create the Superpunch?
Universal Capture. The way it works, in a simple form, is this: It is five hi-def cameras, shooting in real time. They are all synchronized, they all have extreme zoom lenses, and it takes very heavy calibration to find their position. All the images are captured uncompressed to disk, which is like a small NASA set-up on stage. Every single frame of every view -- or at least most views that are relevant to the perspectives of the shot -- is analyzed through optical flow algorithms that were evolved up from algorithms that were applied in the making of "What Dreams May Come."
Those algorithms enabled us to extrapolate shape and form from complex objects such as leaves blowing in the wind. They are applied to calculate the spatial location of every pixel in the frame. And of course with the high-def views, you get textural information -- every detail, every color that a pixel will be over time. Those are the derivative components of the performance. It is every single pixel in the frame, not 400 points that are marked on a face. It is everything.
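Editor's note: the interview does not describe how Universal Capture turns optical flow into positions, so the following is only a minimal sketch of the general idea of recovering a 3D location for a pixel from its displacement between two calibrated views. It assumes an idealized, rectified two-camera pinhole setup; the function names, focal length, baseline and image center are hypothetical values chosen for illustration and are not taken from the production.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by two rectified pinhole cameras.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal shift of the pixel between the two views,
                    e.g. the horizontal component of an optical flow vector
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def pixel_to_point(u, v, flow_u, focal_px, baseline_m, cx, cy):
    """Back-project pixel (u, v) to a 3D point (x, y, z) in camera space.

    flow_u is the pixel's horizontal displacement between the two views;
    (cx, cy) is the image center (principal point), also from calibration.
    """
    z = depth_from_disparity(focal_px, baseline_m, abs(flow_u))
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


if __name__ == "__main__":
    # Hypothetical numbers: a 1920x1080 frame, 2000 px focal length,
    # cameras half a meter apart, and a pixel that shifts 40 px between views.
    print(pixel_to_point(1160, 540, 40.0, 2000.0, 0.5, 960.0, 540.0))
```

Repeating this for every pixel with a flow vector, rather than for a few hundred tracked markers, is what would give a dense surface in a scheme like this one.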
Was the Superpunch in the script?
Larry and Andy became focussed on the idea that you have never actually seen a punch, everything you have seen has been a fake. They were also thinking about the definitive moment where Neo appears to succeed over Smith. They were basing it on a punch called the Suzy-Q that was thrown by Rocky Marciano in the '50s. I got some pictures of it for reference -- it looks totally devastating.
Geoff Darrow drew this extreme caricature of Smith that was so over the top -- his jaw was bending 20-degrees beyond any physical limit of a jaw. It was very funny. People do not understand that Larry and Andy have a real sense of humor about everything that they are making. There is a wit that has bubbled through all sorts of schlock science fiction, good science fiction, comic books, anime -- all of this stuff is filtering through their brains. They are not trying to make "Gone with the Wind."
So, Geoff draws this hilarious caricature of a massive punch, which Larry and Andy showed to Warner Bros. as the 'future of filmmaking.' This, of course, drew placid, dumbfounded looks.
We got universal capture of the performance that Hugo had done, which was a directed piece. We had photosonics photography at 300 fps of Hugo being shot by air cannons in the side of his face, so you could see every nuance and jiggle of his face and his lips. We used that for reference. We had a sculpture that was made by way of a head cast that was modified to look like Geoff's drawing, which was digitized and put in the U-cap system as well.
The shot was pre-visualized before any photography or U-cap occurred. We had the concepts for the camera move, which included an extreme contra-zoom -- that would create a vertigo effect as the camera approached the fist. The background for that was a virtual background. The rain, of course, was computer-generated rain. The Superbrawl scene that the punch is part of was conceived through the idea of zero-gravity, so there are stylizations occurring in it that are more apropos of flying.
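Editor's note: a contra-zoom (often called a dolly zoom) keeps the subject the same size in frame while the camera distance changes, by changing the field of view to compensate. The sketch below shows that relationship for a simple pinhole camera, where the width framed at distance d with field of view theta is 2 * d * tan(theta / 2). The function names and numbers are hypothetical and are not taken from the production's pre-visualization.

```python
import math


def framed_width(distance, fov_degrees):
    """Width of the scene covered at a given distance for a given field of view."""
    return 2.0 * distance * math.tan(math.radians(fov_degrees) / 2.0)


def fov_for_distance(subject_width, distance):
    """Field of view (degrees) that keeps subject_width filling the frame."""
    return 2.0 * math.degrees(math.atan(subject_width / (2.0 * distance)))


if __name__ == "__main__":
    # Camera approaches a subject 2 units wide; the lens widens as it closes in,
    # so the subject stays the same size while the background appears to stretch.
    for distance in (8.0, 4.0, 2.0, 1.0):
        print(distance, round(fov_for_distance(2.0, distance), 2))
```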
It turns out that our virtual humans hold up much better in close-up shots than they do further away. They are built with extreme detail for that kind of a situation. The Superpunch sequence meant taking the camera and putting it as close as possible on a slow motion shot.
Were there other technological advances?
Our studios had an across-the-board escalation of semi-automatic effects generation. What I mean by that is, animation cues that trigger chains of events that have rules. If an APU fires a gun, as soon as that gun is fired in animation, the effects people have written code that says, the gun is going to discharge smoke, it is going to have a flash, it is going to eject a projectile, which has a certain trajectory, it is going to have tracer smoke. Those things are procedurally guided.
I think Tippett Studios did the most sophisticated computer graphics work in their history on 'Revolutions.' If you dig below the surface of the film, you will see that they are doing large-scale procedural model generation in the making of Machine City. It is highly detailed, based on Geoff Darrow's designs, but what is significant is that the details have been populated across landscapes in this sort of growth algorithm. I find it to be oddly reflective of how machines might reproduce themselves. Algorithmically growing, as if they had a biological architecture.
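Editor's note: the idea of an animation cue triggering a rule-driven chain of effects can be sketched as a small lookup table. This is an illustration only, assuming a simple cue-to-effects mapping with frame delays; the cue name, effect names and delays are hypothetical and do not describe the studios' actual code.

```python
# Each cue maps to a list of (effect_name, delay_in_frames) rules.
# The names and delays here are made up for illustration.
RULES = {
    "gun_fired": [
        ("muzzle_flash", 0),
        ("projectile", 0),
        ("smoke", 1),
        ("tracer_smoke", 2),
    ],
}


def expand_cues(cues, rules=RULES):
    """Turn animation cues into a frame-ordered list of effect events.

    cues  -- iterable of (frame, cue_name) pairs set by the animators
    rules -- mapping from cue name to (effect_name, delay) pairs
    Cues with no rules produce no events.
    """
    events = []
    for frame, cue in cues:
        for effect, delay in rules.get(cue, []):
            events.append((frame + delay, effect))
    return sorted(events)


if __name__ == "__main__":
    # An animator marks a single shot at frame 10; the rules fill in the rest.
    for frame, effect in expand_cues([(10, "gun_fired")]):
        print(frame, effect)
```

The point of a design like this is that the animator sets one cue and the effects follow from rules, instead of each flash and puff of smoke being placed by hand.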
Imageworks has created some of the most capable atmospheric shaders that I have ever seen for the scenes with the ships flying in the sewer systems (they are called sewer craft). It is all moving atmosphere in there, and the entire tunnel system is procedurally generated, shaped, formed and textured.
The immensity of Geoff Darrow's designs -- they are extremely complicated and highly detailed -- was the foundation laid at our feet. It was a subject that confronted us in 2000 -- how on Earth do we create a Darrow shader? One that actually grows, algorithmically grows detail ad infinitum, but is not so heavy that it crashes everything? All of the firms involved were asked to pursue methods that would allow us to semi-automatically create environments.
So there you go. Great creature animation, phenomenal environments, groundbreaking methods to acquire performance and create virtual humans. Along with 2,000 shots -- 800 for 'Revolutions' and 1,200 for 'Reloaded' -- and 270 days of photography (the longest shoot in film history). And building a visual effects firm from the ground up that equals any in the industry.