March 28, 2011

Of Hacks and Helpers

What’s a Google Chrome Helper, and what has it ever done for me? Also: Google Chrome Framework?

When my team brought Google Chrome to Mac OS X, we were faced with some interesting engineering challenges. One of these was integrating Chrome’s multi-process architecture with the Mac’s application model.

Ping-Pong

Chrome’s multi-process design is such that there’s a single main process, called the “browser” process. (For the technically declined, a “process” is simply a program that’s running.) The browser process communicates with the user by displaying things on the screen, and by taking the appropriate action when the user moves the mouse or presses keys. It’s also a sort of hub connecting Chrome’s other processes.

Whenever Chrome is open, you’ll have exactly one browser process running. If any web sites are open, you’ll also have one or more “renderer” processes. Renderers are responsible for turning web sites into something that you can see and interact with. When you type a web address into the omnibox, the browser starts up a renderer if necessary, and then asks it to load the site. In turn, the renderer asks the browser to go out and get the data it needs from the network, so the browser makes these requests, and ferries the responses back to the renderer. As the data comes in, the renderer builds up an image of what you should see. It passes this image back to the browser, which dutifully displays it, and the web site shows up on your screen. When you click a link, the browser receives the click, and passes it off to the renderer, which might take action by giving the browser something new to display, asking the browser for more data from the network, or both. In the business, we call this “interprocess communication” (or IPC for short), but “ping-pong” is just as good a description.
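For the technically inclined, here is a minimal sketch of a single round of ping-pong. It is written in Python and is not how Chrome does it: a pipe carrying lines of text stands in for Chrome’s real IPC machinery, and the names RENDERER_SOURCE and load_in_renderer are made up for illustration. The parent plays the browser, starts a child to play the renderer, sends it a web address, and waits for the reply.

```python
import subprocess
import sys

# Program run by the child ("renderer") process: read one request line,
# answer with one reply line. A stand-in, not real rendering.
RENDERER_SOURCE = r"""
import sys
url = sys.stdin.readline().strip()
sys.stdout.write("rendered:" + url + "\n")
sys.stdout.flush()
"""


def load_in_renderer(url):
    """Start a child process, send it a URL, and return its one-line reply."""
    child = subprocess.Popen(
        [sys.executable, "-c", RENDERER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() writes the request, closes the pipe, and collects the reply.
    reply, _ = child.communicate(url + "\n")
    return reply.strip()


if __name__ == "__main__":
    print(load_in_renderer("http://example.com/"))
```

A real game of ping-pong keeps the connection open and trades many messages in both directions. This sketch stops after one serve and one return.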

Chrome’s design includes other process types, too. If you’re looking at a site that uses a plug-in like Flash, there’ll be a “plug-in process” in the mix, whose job is to load and run the plug-in. It communicates (IPC again) with both the browser and a renderer.

Capture the Flag

Chrome’s multi-process guts deviate from traditional application norms. Simple applications exist in a single process. On the Mac, an application’s process is associated with an icon in the Dock. One application, one process, one Dock icon. Easy. In Chrome’s case, our goal is to arrive at one application, many processes, one Dock icon. The special sauce is tying everything together so that this works seamlessly.

When a Chrome browser process needs to start up a renderer, plug-in, or any other child process, it does so by starting a new process that will launch Chrome from the very beginning again, but with a flag telling the newborn process how it should specialize itself. One of the first things that any Chrome process does is check this flag. If there’s no flag, it becomes a browser, which is what happens when you start Chrome up yourself. If the flag says “renderer,” it will become a renderer, find the browser that started it, and set up a new game of ping-pong.
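The flag check can be sketched in a few lines. This is an illustration in Python, not Chrome’s code: the --type= spelling of the flag and the function names are assumptions made for the example. The point is only that the very same program decides what to become by looking at how it was started.

```python
import sys


def process_type(argv):
    """Return the role named by a --type=<role> flag, or "browser" if absent."""
    prefix = "--type="  # assumed spelling, for illustration only
    for arg in argv[1:]:
        if arg.startswith(prefix):
            return arg[len(prefix):]
    # No flag at all: this is what happens when the user starts the program.
    return "browser"


def child_command_line(executable, role):
    """Build the command line a browser would use to start a child in a given role."""
    return [executable, "--type=" + role]


if __name__ == "__main__":
    print("specializing as", process_type(sys.argv))
```

Running it with no arguments reports a browser, and running it with --type=renderer reports a renderer.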

Chrome performs this “capture the flag” protocol as a precursor to doing anything else, but nothing else on the Mac is aware of this convention, because we made it up. One aspect of software engineering is that when the only things you need to interact with are entirely within your own control, you can very easily invent whatever scheme you need to get the job done. Another aspect is that when you do something like this, you might miss another detail, like the fact that your application needs to run within a larger system, and you might need to interact with that system somehow, too.

Will the Real Chrome Icon Please Stand Up?

The problem in this case was that the rest of the Mac system—including the Dock—couldn’t distinguish between the Chrome browser and all of its child processes, like renderers. As far as the Dock was concerned, each Chrome process was just another instance of Chrome. Each would get its own icon. In fact, every time Chrome would start another process, another icon would show up, and it would persist as long as the new process was still running. Imagine seeing a half dozen Chrome icons down there, dancing around while you worked. If you clicked on any of them except for the one associated with the browser process, nothing interesting would happen, because they weren’t instructed to specialize as browsers. If the Dock couldn’t tell the difference between a genuine Chrome browser process and those impostors, how would you ever be able to?

You wouldn’t. And I like and respect you too much to subject you to that kind of harrowing experience.

The problem isn’t restricted to the Dock. If you use the Command-Tab keyboard shortcut to switch between applications, all of those extra Chromes would show up there, too. Without additional care, the Mac might even mistakenly assume that the extra processes are “stuck” or “hung” or “not responding” because they don’t behave as it expects proper UI applications to behave.

Let’s do better.

The Lying King

The first thing we tried, which was an embarrassing, messy, temporary stop-gap solution (known in the biz as a “hack”), was to just flat-out lie. Chrome identified itself to the system as something less than a fully-fledged UI application, because only fully-fledged UI applications qualify for Dock icons. (In Mac parlance, Chrome declared itself as an LSUIElement.) Of course, there was still no differentiation from the Dock’s perspective: now, instead of each Chrome process getting an icon, none would. Not even the browser. That was obviously bad.

The second step of the hack was to admit having lied about the browser process not being a full UI application. Since the capture-the-flag strategy was able to distinguish between process types with ease, when Chrome detected that it was being launched as a browser, it turned itself into a proper UI application by calling TransformProcessType, essentially undoing what LSUIElement did and doing what LSUIElement didn’t.

This approach still had its share of problems:

  • Since the browser process started its life as an LSUIElement, certain things that should have happened really early on while it was starting up didn’t happen. For example, Chrome wouldn’t start up in the foreground. It’d take an extra click just to start working with the first Chrome window. We had to add another hack to account for that deficiency. (Hacks are nasty creatures. They have a tendency to multiply in this way.)
  • The hack that brought the Chrome window to the foreground caused Chrome to take a measurably longer time to launch. It’s slower to start out in the background and switch to the foreground than it is to just start out in the foreground.
  • The Dock icon wouldn’t bounce when Chrome was starting up. By the time the browser process was able to properly identify itself as a UI application, any bouncing would have concluded. Interestingly, our icon not bouncing caused people to perceive that Chrome was launching faster than it actually was, when in reality, the effects of the compound hacks made it slower.
  • Perhaps worst of all, it made me feel terrible.

Help!

The real solution to all of these problems was to split Chrome up so that the rest of the Mac system would see it as two applications: one for the browser, and one for everything else. We named the “everything else” application the “helper.” The browser application can only specialize as a browser process, and the helper application can specialize as anything else. The browser application declares itself as a UI application, while the helper is declared as LSBackgroundOnly (which is like LSUIElement with even more restrictions). Since applications on the Mac are actually just directories, the helper application is nested inside the browser, so that wherever the browser goes, it takes its helper with it.
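These declarations live in each application’s Info.plist file, where LSUIElement and LSBackgroundOnly are ordinary keys. As a rough sketch, assuming a bare-bones property list with nothing but a name (real ones carry much more), Python’s plistlib can show the difference between the two applications. The info_plist function and the names passed to it are hypothetical.

```python
import plistlib


def info_plist(name, hidden_key=None):
    """Build a minimal Info.plist as bytes.

    hidden_key is None for a regular UI application, or "LSUIElement" or
    "LSBackgroundOnly" for an application that should not get a Dock icon.
    """
    info = {"CFBundleName": name}
    if hidden_key is not None:
        info[hidden_key] = True
    return plistlib.dumps(info)


if __name__ == "__main__":
    # The browser makes no special declaration; the helper opts out of the Dock.
    print(info_plist("Browser").decode("utf-8"))
    print(info_plist("Helper", "LSBackgroundOnly").decode("utf-8"))
```

The only difference between the two outputs, apart from the name, is the single extra key in the helper’s property list.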

Framers at Work

Having two copies of what is essentially the same application sounds like it might use twice as much space as a single-application approach. It doesn’t, because neither application actually contains any Chrome code. All of the true Chrome code exists in only one place, a shared library that we call the “framework.” The framework is another component nested inside the browser, alongside the helper. The only thing that the browser and helper applications do is find the framework, load it, and then jump to it.
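The “find the framework” step amounts to a little path arithmetic. The sketch below is a guess at the shape of it, not Chrome’s actual loader: it assumes the conventional layout in which an application’s executable sits at Contents/MacOS inside the .app directory, and the directory passed as nested_dir is a placeholder, since the real location inside the browser isn’t spelled out here. Loading the library and jumping into it are left out.

```python
from pathlib import PurePosixPath

FRAMEWORK_NAME = "Google Chrome Framework.framework"


def bundle_root(executable):
    """Return the .app directory for an executable at <app>/Contents/MacOS/<name>."""
    return PurePosixPath(executable).parents[2]


def framework_from_helper(helper_executable):
    """The helper and the framework are siblings, so look next to the helper app."""
    return bundle_root(helper_executable).parent / FRAMEWORK_NAME


def framework_from_browser(browser_executable, nested_dir):
    """Look inside the browser app, in whichever directory holds the nested pieces."""
    return bundle_root(browser_executable) / nested_dir / FRAMEWORK_NAME


if __name__ == "__main__":
    nested_dir = "Contents/Frameworks"  # hypothetical location, for illustration
    browser = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    helper = (
        "/Applications/Google Chrome.app/" + nested_dir
        + "/Google Chrome Helper.app/Contents/MacOS/Google Chrome Helper"
    )
    print(framework_from_browser(browser, nested_dir))
    print(framework_from_helper(helper))
```

Both lookups land on the same directory, which is the whole point: two tiny applications, one copy of the code.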

This architecture means that the actual code that lives in the Chrome browser application is about as small as anything you’ll ever encounter.

Serving Process

You can see all of Chrome’s different processes at work on your computer by choosing Tools:Task Manager from the wrench menu. Everything listed in the Task Manager is actually a process. You can also use whatever tools your system provides for examining processes. On a Mac, the Activity Monitor will show you all of the Chrome processes.

You can also examine the innards of the Chrome browser application on the Mac. If you control-click the Chrome icon in the Finder and choose Show Package Contents, and then you poke around enough, you’ll find both the Google Chrome Helper and its companion, the Google Chrome Framework. You might also notice other unconventional aspects in its structure. I plan to explain some of those in a future article.

Postscript

While writing this article, I discovered that I had declared the helper app as an LSUIElement instead of LSBackgroundOnly. While both result in no Dock icon for the application, the subtle difference between these two is that LSUIElement applications are permitted to create user interface elements (putting things on the screen), whereas LSBackgroundOnly applications are not. Chrome’s helpers never need to create any UI, so LSBackgroundOnly is the more appropriate choice. I checked in a change to Chrome to correct this oversight.

Why had we used LSUIElement as opposed to LSBackgroundOnly in the first place? At this point, I don’t even remember. The hack was initially conceived over two years ago and was removed after six months; LSUIElement, eradicated earlier this month, may be its last remnant. That’s another problem with hacks: they tend to outstay their welcome, and can leave detritus in their wake long after they’re forgotten.

Post-Postscript

March 29, 2011

It seems that LSBackgroundOnly was a bit too restrictive. Some plug-ins have a legitimate reason to display windows. The Flash plug-in, for example, can show a file-open dialog. Gmail uses this feature to attach files to messages. With LSBackgroundOnly, the plug-in process was allowed to show the window, but the window couldn’t be brought to the foreground. This presented an obvious problem: the attachment window would most likely be hidden behind the main browser window, and even if a user did manage to expose it, it would have been difficult to interact with. As of earlier today, Chrome is back to using LSUIElement. Why LSBackgroundOnly processes are allowed to show any windows at all remains a mystery.

Columnize This

Interesting things happen to me. I thought you might find them interesting too, so I’m going to share them. Hi, I’m Mark Mentovai. I’m a software engineer at Google, and I work on Google Chrome, serving as the Mac version’s tech lead. Here’s a picture of me with my team. You can recognize me by my tech lead’s uniform, an all-black get-up complete with a helmet and a flag. In this picture, I’m standing on what appears to be a tank, technically leading my team to obliterate a garbage can.

As the Jackass of All Trades, I’ve got other interests. I’m not just going to be writing about Chrome, but the first few articles I’ve planned are about Chrome, and why certain things were designed the way they were. The most interesting (and enduring) aspects of these stories are the problems that my team encountered along the way and how we solved them. I’ve also got a theory that it’s possible to write for a technical and non-technical audience at the same time, and produce something that each group can take something away from without talking down to anyone or making anyone else feel like they’re in over their head. I’m going to put that idea to work in my articles as best as possible.

Finally, I don’t like the sound of “blog” or “blogger,” so I’m going to call myself a “columnist.” Affectation? Maybe, but thanks for indulging me.