Web Intents/Activities as The Future of Web Development - A Fundamental Shift

After reading both Paul Kinlan's article on Web Intents and Ben Adida's on what is happening with activities at Mozilla Labs, I thought it would be a great time to write an article about how this is currently being played out at Mozilla.

As Paul points out in the closing line of his article: "This project will fundamentally change and improve the way we build applications on the web today for our users." His words perfectly describe how we are feeling at Mozilla as well. Change is coming, and the change is going to be nothing short of revolutionary.

If you have not read anything on the purpose of Web Intents/Activities, Mike Hanson from Mozilla has a very thorough article outlining the need for service discovery in web applications. Paul has written an excellent introduction on how intents can solve these issues. Ben from Mozilla shows a demo of what can be done using the New York Times and Flickr as an example.

The Problem - Lack of Flexibility, Cluttered Interfaces

To explain the problem, I'll continue with the example that Paul mentions.

Imagine that you are a developer writing an image editing web app at "imageeditor.com". You want to allow the user to select a photo to edit, perform some sort of modification, and then save the image. But you want the user to be able to select a photo that is stored on SmugMug, Picasa, Flickr, Instagram, or their hard drive. Using current technologies, a button is presented to import a photo from each of these potential services, whether the user has an account there or not. In this scenario five buttons are displayed, when the user may only be interested in one. The extra buttons clutter up the interface, increase cognitive load on the user, and in the end may not even include the service the user actually uses - perhaps all their photos are stored on Facebook. So, as the app developer, you add a new button for Facebook. Now there are six buttons displayed. Tomorrow, another user requests a button for Google+. This madness is referred to as the NASCAR effect. There has to be a simpler way.

Let's Simplify

From a user's perspective, a better way is to present the user with one button - "Get Photo". Pressing this button then presents the user with a list of services that are actually relevant to them. The user selects the photo from their site of choice and finally the photo data is returned to the image editing application.

From the developer's perspective, a better way is to write code once and only once. As a developer, I do not want to add a new button every time a new image storage application comes online. As a developer, I do not want to have to write login routines nor store OAuth information for a myriad of services. I want things to be clean, usable, secure, and automatic. As a developer, I want to say "the user needs an image", and let the rest of the work be taken care of for me.

A pipe dream? Thankfully, no.

The coming model is going to change all of this. Currently, every component that a site needs must be built by the app developer or included from a third party library. Services and Intents aim to change this. Instead of having to directly include code to provide a component, the app can ask the browser whether any installed third party application provides it. In other words, a page will be able to request information and functionality from a third-party app rather than building, providing, or bundling every necessary piece itself.

The Coming Reality - Installable Apps and Web Activities/Intents

Both Mozilla and Google are working to make this a reality, and sooner than you think.

The idea that both companies have is that you should be able to "install" a web app into the browser. These apps are nothing more than standard HTML5 pages, pages that use open, standards based HTML5 technologies that you are already using today. The difference is that these pages provide some extra information that allows the browser to handle them as installed applications.

These apps can either provide or consume bits of functionality. For instance, an installed Flickr app might provide functionality to get an image. When a second app needs an image, it can ask the browser to get one. The browser, knowing that the Flickr app is both installed and can provide images, uses it to get image data. If multiple installed apps provide image data, the browser presents a dialog listing only those services and asks the user to select one.

Where can this be used? The possibilities are endless, but common scenarios include "Share with", file retrieval/save, "save to calendar", contact list management, and profile management. Imagine having one definitive profile that can be used to feed profile information to all other apps requesting this information.

The Mozilla Open Web Apps API

Note, this API IS going to change. Both Mozilla and Google are working hard to create a common interface.

So far, Mozilla's take on using a service is slightly different from Google's, and what follows describes the Mozilla side as it stands today.

Mozilla Labs is rapidly piecing together the OpenWebApps plugin for Firefox that provides native browser functionality for their proposed API. Unfortunately for us at Mozilla, not the entire world runs Firefox. For browsers that support a minimum of the HTML5 window.postMessage and localStorage, there is a polyfill that can be used to provide the same functionality. The polyfill can be included directly from https://myapps.mozillalabs.com/jsapi/include.js

The Mozilla API is documented on MDN. Getting Started gives a very good introduction into the goals of the project. The window.navigator object is augmented to provide the app API under:

navigator.apps.*
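
As a rough sketch, a page could check which of these paths applies before doing anything else. The helper name detectAppsSupport and its three return values are made up for this illustration and are not part of the Mozilla API; the only assumptions taken from above are that the API lives at navigator.apps and that the polyfill needs window.postMessage and localStorage.

```javascript
// Hypothetical helper: report how the apps API could be made available.
// `win` is a window-like object, passed in so the logic can run outside a browser.
function detectAppsSupport(win) {
    var nav = win.navigator || {};
    if (nav.apps && typeof nav.apps.install === "function") {
        return "available";   // navigator.apps is already present
    }
    var hasPostMessage = typeof win.postMessage === "function";
    var hasLocalStorage = false;
    try {
        // Reading localStorage can throw when storage is disabled.
        hasLocalStorage = !!win.localStorage;
    } catch (e) {
        hasLocalStorage = false;
    }
    if (hasPostMessage && hasLocalStorage) {
        return "polyfill";    // include.js could be loaded to supply navigator.apps
    }
    return "unsupported";
}
```

In a real page this would be called as detectAppsSupport(window), and the "polyfill" result would be the cue to load include.js.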

Application Manifest

The first portion of a web app that a developer will care about is the manifest. The application manifest is a completely separate beast from the W3C offline application manifest, but they can be used together.

The application manifest is a JSON object that provides the information necessary to install and run the app. It includes things like application name, author name, author page, as well as a list of services that the application "provides," and may include in the coming months a list of services that the application "consumes." More on what it means to provide or consume services later.
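
To make this concrete, here is a sketch of what a manifest for a hypothetical photo provider might contain, written as a JavaScript object and then serialized to JSON. The app name, author, and URLs are invented, and the field names outside of "experimental" are my assumptions about the format, which is still changing; check MDN for the current list.

```javascript
// Illustrative manifest for a hypothetical photo provider app.
// Field names other than "experimental.services" are assumptions and may
// not match the final manifest format.
var manifest = {
    name: "Example Photo Site",
    developer: {
        name: "Example Author",
        url: "http://photos.example.com/about"
    },
    experimental: {
        services: {
            "image.get": { endpoint: "/service_imageGet.html" }
        }
    }
};

// The manifest is served as JSON, so serialize it the way a build step might.
var manifestJson = JSON.stringify(manifest, null, 4);
```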

Application Installation

As an app developer, you are going to want users to install your app. To do this, the navigator.apps.install function must be called, generally in response to a button press. Calling it will cause the browser to ask the user whether they want to install your app. If the user approves, your application's manifest will be requested and the app will be installed.

This call takes the form of (taken from MDN):

navigator.apps.install({
    url: "http://path.to/my/example.webapp",  /* path to the manifest */
    onsuccess: installCallback,
    onerror: errorCallback
});

So that you don't annoy users, before presenting them with an install button, you can call navigator.apps.amInstalled to check whether your app is installed or not.

/* Check if the current page is installed as an app */
navigator.apps.amInstalled(callback);
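
Putting the two calls together, an install button could be wired up roughly like this. This is only a sketch: setUpInstallButton is a made-up helper, the apps parameter stands in for navigator.apps, and I am assuming that the amInstalled callback receives something truthy when the app is installed and something falsy otherwise.

```javascript
// Hypothetical helper: show the install button only when the app is not installed.
// `apps` stands in for navigator.apps; `button` is any object with `hidden` and `onclick`.
function setUpInstallButton(apps, button, manifestUrl, onInstalled, onError) {
    button.hidden = true;    // stay out of the way until the install state is known
    apps.amInstalled(function (installRecord) {
        // Assumption: a truthy record means "installed", a falsy value means "not installed".
        if (installRecord) {
            return;
        }
        button.hidden = false;
        button.onclick = function () {
            apps.install({
                url: manifestUrl,
                onsuccess: function () {
                    button.hidden = true;    // nothing left to install
                    if (onInstalled) {
                        onInstalled();
                    }
                },
                onerror: onError
            });
        };
    });
}
```

Passing the API object in as a parameter keeps the helper easy to try out with a fake object before pointing it at the real navigator.apps.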

MDN provides information on many other functions, but these are mainly of interest to Dashboards and Stores.

Service Invocation

The most important feature not yet outlined in MDN relates to service invocation. This goes back to a photo editing site requesting a photo.

Earlier, I mentioned the notions of service providers and service consumers. What do these mean? In this scenario, "imageeditor.com" is the consumer of a service, the service of getting a photo. Flickr, Picasa, or SmugMug are providers of a service, the service of giving a photo. There is the notion of a mediator as well. This role is normally taken care of by the browser.

Continuing with our example, this is how service invocation works:

  1. imageeditor.com is the consumer. The site makes a request to invoke a service, which we will call "image.get."
  2. The mediator (normally the browser) sees that a service invocation request came from imageeditor.com and looks in its list of installed applications for any that provide "image.get." It sees that the user has Flickr and Picasa installed. The mediator asks the user to choose one of these, or to cancel the request. Cancelling effectively denies imageeditor.com access to the service. For this example, assume the user selects Flickr.
  3. Flickr is the provider. The Flickr app is informed that a request has been made for "image.get," without ever knowing which app/site the request came from. The Flickr app then presents the user with an interface where the user can select an image. Once the user selects an image, the app sends the image data back to the mediator, which proxies the data back to the originating app.

An important thing to know is that Flickr and imageeditor.com know nothing of each other unless they explicitly share this information. imageeditor.com knows that it requested and received a photo. Flickr knows that somebody requested a photo, so it sent one.
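
The three steps above can be modelled with a toy mediator in plain JavaScript. To be clear, this is not how the browser or the OpenWebApps plugin implements mediation; createMediator, register, invoke, and the chooseProvider callback are all names invented for this sketch. It only shows the shape of the exchange: the consumer names a service, the user picks a provider, and the result is proxied back without either side learning about the other.

```javascript
// Toy in-memory mediator; in practice the browser plays this role.
// `chooseProvider` stands in for the dialog shown to the user: it receives the
// names of matching apps and returns the chosen name, or null to cancel.
function createMediator(chooseProvider) {
    var providers = {};    // service name -> list of { appName, handler }

    return {
        // Called on behalf of an installed app that provides `serviceName`.
        register: function (serviceName, appName, handler) {
            providers[serviceName] = providers[serviceName] || [];
            providers[serviceName].push({ appName: appName, handler: handler });
        },

        // Called on behalf of a consumer. The provider never sees who is asking.
        invoke: function (serviceName, callData, onSuccess, onFailure) {
            var candidates = providers[serviceName] || [];
            if (candidates.length === 0) {
                onFailure({ reason: "no provider" });
                return;
            }
            // Step 2: the user picks one provider, or cancels.
            var names = candidates.map(function (c) { return c.appName; });
            var chosen = chooseProvider(names);
            var match = candidates.filter(function (c) {
                return c.appName === chosen;
            })[0];
            if (!match) {
                onFailure({ reason: "cancelled" });
                return;
            }
            // Step 3: the provider does its work and the result is proxied back.
            match.handler(callData, onSuccess);
        }
    };
}
```

Note that the provider's handler only ever receives the call data and a callback, never the identity of the consumer, which mirrors the isolation described above.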

The name of the service, "image.get", must be agreed upon by the two applications for the invocation to happen. At this point there is no standards organization or process to decide on names. The hope is that naming conventions will develop naturally and de facto standards will emerge.

Provider Side - Back to the Manifest

For an app to be discovered by the mediator, it must declare which services it provides. Mozilla's proposal is to have this declaration take place in the application manifest. Services are still experimental, so their declarations are placed under the "experimental" section. The exact form of this is still under heavy development and will likely change.

"experimental": {
    "services": {
        "service_name": {
            "endpoint": uri
        }
    }
}

An example of this for our Flickr provider could be:

"experimental": {
    "services": {
        "image.get": {
            "endpoint": "/service_imageGet.html"
        }
    }
}

This means that, once installed, the Flickr app can provide the service "image.get". Whenever another app requests "image.get", "/service_imageGet.html" on the application's server will be called to complete the task.
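
To illustrate how a mediator might use these declarations, here is a sketch that scans the manifests of installed apps for a given service. The findProviders helper and the { origin, manifest } shape of an installed app record are assumptions made for this example. Because the declaration format is still in flux, the helper accepts either an object with an "endpoint" property or a bare path string.

```javascript
// Hypothetical helper: list the apps whose manifests declare a given service.
// Each installed app is assumed to look like { origin: "...", manifest: {...} }.
function findProviders(installedApps, serviceName) {
    var found = [];
    installedApps.forEach(function (app) {
        var experimental = app.manifest.experimental || {};
        var services = experimental.services || {};
        if (!Object.prototype.hasOwnProperty.call(services, serviceName)) {
            return;    // this app does not provide the service
        }
        var declared = services[serviceName];
        // Accept either { endpoint: "/path" } or a bare "/path" string.
        var path = typeof declared === "string" ? declared : declared.endpoint;
        found.push({ origin: app.origin, endpoint: app.origin + path });
    });
    return found;
}
```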

Setting Up Communication

Within "/service_imageGet.html" there has to be logic that says how to handle the service invocation. At the moment, this too is under heavy development and being cleaned up. The provider page tells the mediator which function to call by registering a handler for the service. This is done using navigator.apps.services.registerHandler.

The form of the call is:

navigator.apps.services.registerHandler(serviceName, serviceHint, callback);

More concretely:

navigator.apps.services.registerHandler('image.get', 'getImage', 
    function(args, callback) {
        // get image, place into imageData;
        var imageData = getImageData();
        callback({image: imageData});
    }
);

Consumer Side - Invoking a Service

For a consumer site or app to request to use a service, it uses navigator.apps.invokeService. The form of the call is:

/**
* Request to use a service
* @param {string} serviceName - name of the service to invoke
* @param {object} callData (optional) - optional data to pass to service provider
* @param {function} onSuccess - function to call on success.
* @param {function} onFailure - function to call on failure.
*/
navigator.apps.invokeService(
    serviceName,
    callData,
    onSuccess,
    onFailure
);

In our example, imageeditor.com wants an image that it can edit. It uses navigator.apps.invokeService:

navigator.apps.invokeService(
    "image.get", null,
    function(data) {
        // handle success case.
        // data.image will be the imageData returned from above.
    },
    function(data) {
        // handle failure case
    }
);

Conclusion

So there it is, an overview of what is coming down the line. Unfortunately, at the moment, I don't have personally written examples of all of this in action. Mozilla Labs has an example application page that shows how app installation and manifests work. The Mozilla account on GitHub has several examples that can be run, most notably the site/tests directory in openwebapps and the openwebapps-photosite-connector.

Shorter posts with more detail are planned as these APIs are solidified. These posts will have fuller explanations and examples.

Finally, I agree with Paul: these new technologies are going to fundamentally change the webapps ecosystem as we know it. It is an exciting time to be a developer. See you soon!