
Loading binary data from different sites in GreaseMonkey

One benefit of the GreaseMonkey function GM_xmlhttpRequest over the regular XMLHttpRequest is the ability to load content from different domains. This allows scripts to load information from a third-party site and display it on a different site. It can also be used to load binary data such as images from other sites, for example to get around referer restrictions.

Recently I tried to extend the ULTIMATE Newzbin IMDB Suite to display poster images from IMDB on Newzbin. The basics are simple: I created a regular expression that parsed the output from IMDB for the poster URL, and created a JavaScript Image object to load the image. It looked like this:

var posterURL = responseDetails.responseText.match(posterRegex);
var poster = document.createElement("img");
poster.src = posterURL[1];

But loading the image always failed. It seems that IMDB doesn’t allow embedding their images on other sites. Apparently, they check the referer when the image is requested.

To overcome this restriction, you have to do three things:

  1. Use GM_xmlhttpRequest to load the image with a fake referer while preserving the byte values. This was the hardest part, since GreaseMonkey seems to mangle binary data. There is a way to deal with it, though.
  2. Convert the result to Base64
  3. Use the data URI scheme to display the image
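To make step 3 concrete, here is a minimal sketch of how a data URI is assembled from a MIME type and an already Base64-encoded payload. The helper name toDataUri is hypothetical and only used for illustration.

```javascript
// Hypothetical helper: builds a data URI of the form
// data:<mime type>;base64,<Base64 payload>
function toDataUri(mimeType, base64Data) {
    return "data:" + mimeType + ";base64," + base64Data;
}

// "/9j/" is the Base64 encoding of the bytes FF D8 FF,
// which is how a JPEG file starts.
var example = toDataUri("image/jpeg", "/9j/");
```

Assigning such a string to the src attribute of an img element makes the browser decode the payload directly instead of requesting it from a server.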

So here’s the code to load an image from IMDB. Note the line defining the overrideMimeType. This one is crucial to get proper binary data. The parameter imdbURL can be any URL starting with “http://www.imdb.com”. The parameter poster should be the img-element that was created above.

function loadPosterImage(poster, posterURL, imdbURL) {
	GM_xmlhttpRequest( {
		method:"GET",
		url:posterURL,
		// send an IMDB page as referer so the image request is accepted
		headers: { Referer:imdbURL, "User-agent":"Mozilla/4.0 (compatible)" },
		overrideMimeType:'text/plain; charset=x-user-defined',
		onload:function(response) {
			if(response.status==200) {
				var text = response.responseText;
				var b64 = Base64.encode(text);
				poster.src = "data:image/jpeg;base64,"+b64;
			}
			else {
				alert("failed loading image " + response.status);
			}
		}
	});
}
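Why the overrideMimeType line matters can be shown with a small simulation. With charset=x-user-defined, the browser maps bytes 0x00 to 0x7F to themselves and bytes 0x80 to 0xFF to the code points U+F780 to U+F7FF, so the original byte is always in the low 8 bits of each character code. The sketch below assumes that mapping; the function names decodeUserDefined and toBytes are made up for this illustration and are not part of GreaseMonkey.

```javascript
// Simulates how a browser decodes raw bytes under charset=x-user-defined:
// 0x00-0x7F stay as they are, 0x80-0xFF become U+F780-U+F7FF.
function decodeUserDefined(bytes) {
    var out = "";
    for (var i = 0; i < bytes.length; i++) {
        var b = bytes[i];
        out += String.fromCharCode(b < 0x80 ? b : 0xF780 + (b - 0x80));
    }
    return out;
}

// Recovers the original byte values by keeping only the low 8 bits
// of every character code.
function toBytes(text) {
    var bytes = [];
    for (var i = 0; i < text.length; i++) {
        bytes.push(text.charCodeAt(i) & 0xff);
    }
    return bytes;
}

// Round trip for a few byte values, including the JPEG start bytes FF D8.
var original = [0xff, 0xd8, 0x00, 0x7f, 0x80];
var recovered = toBytes(decodeUserDefined(original));
```

This is the reason the Base64 encoder has to mask every character code with 0xff before encoding it.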

For the Base64 encoding, I used a modified version of the webtoolkit.base64.js to deal with the restrictions described in http://mgran.blogspot.com/2006/08/downloading-binary-streams-with.html. Here’s the code for that:

var Base64 = {
    // private property
    _keyStr : "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=",

    // public method for encoding
    encode : function (input) {
        var output = "";
        var chr1, chr2, chr3, enc1, enc2, enc3, enc4;
        var i = 0;

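        while (i < input.length) {

            // keep only the low byte of each character code; reading past
            // the end yields NaN, which the mask turns into 0
            chr1 = input.charCodeAt(i++) & 0xff;
            chr2 = input.charCodeAt(i++) & 0xff;
            chr3 = input.charCodeAt(i++) & 0xff;

            enc1 = chr1 >> 2;
            enc2 = ((chr1 & 3) << 4) | (chr2 >> 4);
            enc3 = ((chr2 & 15) << 2) | (chr3 >> 6);
            enc4 = chr3 & 63;

            // isNaN() cannot detect the end of the input after masking,
            // so decide on the padding by comparing i with the length
            if (i > input.length + 1) {
                enc3 = enc4 = 64;
            } else if (i > input.length) {
                enc4 = 64;
            }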
            output = output +
            this._keyStr.charAt(enc1) + this._keyStr.charAt(enc2) +
            this._keyStr.charAt(enc3) + this._keyStr.charAt(enc4);

        }
        return output;
    }
}

I’ll provide the complete script soon.

This approach basically allows loading anything from anywhere, but it also has some downsides. One of them is the missing caching: images that are loaded this way are not stored in the browser cache. Keep that in mind if you’re planning to load lots of images from different sites.
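One possible way to soften the caching problem is to remember the generated data URIs inside the script. The following is only a sketch under assumptions: makeCachedLoader and the fetchDataUri callback are hypothetical names, and fetchDataUri stands for any function that loads a URL and hands the resulting data URI to a callback, for example a variant of loadPosterImage.

```javascript
// Hypothetical helper: wraps an asynchronous loader so that every URL
// is fetched only once while the page is open.
function makeCachedLoader(fetchDataUri) {
    var cache = {};   // maps URL -> data URI
    return function (url, callback) {
        if (Object.prototype.hasOwnProperty.call(cache, url)) {
            callback(cache[url]);   // served from memory, no request
            return;
        }
        fetchDataUri(url, function (dataUri) {
            cache[url] = dataUri;
            callback(dataUri);
        });
    };
}
```

This cache only lives as long as the page does. Persisting entries across page loads, for example with GM_setValue, might be an option for small images, but that is untested here.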
