Printed from www.sean.co.uk. © Sean McManus.
August 2008

Javascript string compression tutorial, including search and replace

While I was creating my Writing Wisdom widget, I spent some time looking at how I could optimise the size of a Javascript file that consisted predominantly of text data.

The approach I used was simple, and a project that included much more text might justify a more advanced approach. But using these simple techniques, I was able to chop about 15% off my file size (and therefore off my hosting bill for that particular file).

Note that I'm not talking about compressing the Javascript code itself - I'm talking about compressing text data. There's plenty written elsewhere about optimising Javascript code.

This tutorial includes instructions on how to use split and join in Javascript.

Doing the splits with Javascript

One of the most useful commands in Javascript is split, which is used to divide a string into an array. So you could take a sentence, and then use split to put all the words into different elements of an array, like this:

<script type="text/javascript">

var sentence="This is where music goes to die.";
var words=sentence.split(" ");

// Write each element number, followed by the word stored in that element
for (var i=0; i<words.length; i++)
{
document.write( i+ words[i] +"<BR>");
}

</script>

Below is the output you get when that script runs. The number at the start of each line is the element number of the array, so words[0] contains "This", for example.

0This
1is
2where
3music
4goes
5to
6die.

Here we've separated words where the spaces are, but you can use any character or sequence of characters.
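
As a small sketch of that last point, the separator can be several characters long, and an empty string splits the text into single characters. The variable names below are made up for illustration, and console.log is used in place of document.write so the snippet also runs outside a browser.

```javascript
// Split on a two-character separator: a comma followed by a space.
const shopping = "apples, pears, bananas";
const items = shopping.split(", ");
console.log(items.length); // 3
console.log(items[1]);     // pears

// An empty separator puts each character into its own element.
const letters = "die".split("");
console.log(letters.length); // 3

// If the separator never appears, the whole string ends up in element 0.
const whole = "nospaceshere".split(" ");
console.log(whole[0]); // nospaceshere
```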

Compressing array declarations in Javascript

I often see Javascript routines that define array pairs, like this:

question=new Array();
answer=new Array();

question[0]="What do you call a donkey with three legs?";
answer[0]="A wonkey";

question[1]="What do you get if you cross an elephant with a rhinoceros?";
answer[1]="An eleoceros";

There's a lot of repetition in the array definitions there. As I said, this article isn't really about compressing Javascript itself, but in array and text heavy scripts it can save a lot of space if you define the arrays like this instead:

qanda=new Array("What do you call a donkey with three legs?|A wonkey", "What do you get if you cross an elephant with a rhinoceros?|An eleoceros");

You've kept the question and answer pairs together, while also taking advantage of the shorter way to declare arrays. When you need to use the data, you can then separate the question and answer by splitting at the bar character ("|"). Note that the split command discards the separator, in this case the bar character.

var chosenjoke=1;
var spl=qanda[chosenjoke].split("|");

document.write("Question: "+spl[0]+"<BR>");
document.write("Answer: "+spl[1]);
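
One thing to watch with this approach: if a question or answer ever contains the bar character itself, split will return more than two pieces. The sketch below shows one possible way round that, by cutting the string at the first bar only. The function name splitPair is my own invention for this example, not part of Javascript, and console.log stands in for document.write so the code runs outside a browser too.

```javascript
// Hypothetical helper: split "question|answer" at the FIRST bar only,
// so any later bar characters stay inside the answer.
function splitPair(pair) {
  const bar = pair.indexOf("|");
  if (bar === -1) {
    // No separator found: treat the whole string as the question.
    return { question: pair, answer: "" };
  }
  return {
    question: pair.substring(0, bar),
    answer: pair.substring(bar + 1)
  };
}

const qanda = [
  "What do you call a donkey with three legs?|A wonkey",
  "What do you get if you cross an elephant with a rhinoceros?|An eleoceros"
];

const joke = splitPair(qanda[1]);
console.log("Question: " + joke.question);
console.log("Answer: " + joke.answer); // Answer: An eleoceros
```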

Spotting the Join with Javascript

There is an opposite to split: join will take the contents of an array and combine it into a string, using whatever separator you specify between the different data items.

fruits=new Array("apples","pears","bananas","oranges");
var fruitlist=fruits.join(", ");
document.write("I like "+fruitlist);

This produces the following output. Note the space after the comma in both the code above, and in the resulting output:

I like apples, pears, bananas, oranges
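
The separator you pass to join matters. As a quick sketch (using console.log rather than document.write, so it runs anywhere), here is what happens with no separator argument, with an empty string, and when split and join use the same separator:

```javascript
const fruits = ["apples", "pears", "bananas", "oranges"];

// With no argument, join uses a comma and no space.
console.log(fruits.join());   // apples,pears,bananas,oranges

// An empty string joins the items with nothing between them.
console.log(fruits.join("")); // applespearsbananasoranges

// Splitting and joining with the same separator returns the original text.
const list = "apples, pears, bananas, oranges";
console.log(list.split(", ").join(", ") === list); // true
```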

Javascript search and replace

By using split and join together, it's possible to create a search and replace routine.

var oldstring="The Spectrum is the best computer ever. Spectrums rule!";

var newstring=oldstring.split("Spectrum").join("Amstrad");
document.write(newstring);

The second line there creates a new string. It starts by separating the old string, using the word 'Spectrum' as the separator. The separator is removed automatically by the split command, which means the word 'Spectrum' is removed and the parts of the sentence on each side of that word are put into different array elements.

We then use join to combine all those array elements together into a single string again, but we use the separator 'Amstrad' between the different sentence parts. The end result is that every occurrence of Spectrum is replaced by Amstrad in the string, including the one inside 'Spectrums'. Here's the resulting output:

The Amstrad is the best computer ever. Amstrads rule!
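
If you use this trick in several places, it can be wrapped in a small function. This is only a sketch, and the name replaceEvery is one I've made up for the example. It also shows why the trick is handy: Javascript's built-in replace, when given a plain string to search for, only changes the first match.

```javascript
// Hypothetical helper: replace every occurrence of search with replacement.
function replaceEvery(text, search, replacement) {
  return text.split(search).join(replacement);
}

const oldstring = "The Spectrum is the best computer ever. Spectrums rule!";

console.log(replaceEvery(oldstring, "Spectrum", "Amstrad"));
// The Amstrad is the best computer ever. Amstrads rule!

// By contrast, replace with a plain string pattern stops after the first match.
console.log(oldstring.replace("Spectrum", "Amstrad"));
// The Amstrad is the best computer ever. Spectrums rule!
```

The search is case sensitive and matches the exact characters you give it, so "spectrum" in lower case would be left alone.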

Javascript text compression using search and replace

I compressed the text in my quotes widget by looking for frequently repeated sequences of words or characters and replacing them with symbols. It's a technique I first saw used in the Amstrad game Spellbound years back.

I used a text analyser to identify those words that were used most often. I also looked at sequences of letters that were frequently repeated and replaced those with symbols. The extent to which you'll be able to compress your text will depend on how many recurring patterns there are. In my case, the words writing, book and author came up very often.
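
If you don't have a text analyser to hand, a rough word counter is easy to build with split. The sketch below is not the tool I used. The function names countWords and mostCommon and the sample sentence are all invented for this example, and it only handles unaccented English letters.

```javascript
// Hypothetical helper: count how often each word appears in a text.
function countWords(text) {
  const counts = Object.create(null); // plain lookup table with no inherited keys
  // Lower-case the text, then split on anything that isn't a letter or apostrophe.
  const words = text.toLowerCase().split(/[^a-z']+/);
  for (const word of words) {
    if (word === "") continue; // split can leave empty strings at the ends
    counts[word] = (counts[word] || 0) + 1;
  }
  return counts;
}

// Hypothetical helper: list the words by frequency, most common first.
function mostCommon(text, howMany) {
  const counts = countWords(text);
  return Object.keys(counts)
    .sort(function (a, b) { return counts[b] - counts[a]; })
    .slice(0, howMany);
}

const sample = "The author kept writing. Writing the book took the author a year.";
console.log(countWords(sample).author); // 2
console.log(mostCommon(sample, 1)[0]);  // the
```

Counting single words is only a starting point. Longer repeated phrases save more characters per replacement, so it's worth looking at the text by eye as well.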

For the purposes of an example, let's use this nursery rhyme, which has plenty of repetition in it:

Peter Piper picked a peck of pickled peppers
A peck of pickled peppers Peter Piper picked
If Peter Piper picked a peck of pickled peppers
Where's the peck of pickled peppers Peter Piper picked ?

This might be how you would normally handle it in Javascript:

var sourcestring="Peter Piper picked a peck of pickled peppers<BR>A peck of pickled peppers Peter Piper picked<BR>If Peter Piper picked a peck of pickled peppers<BR>Where's the peck of pickled peppers Peter Piper picked ?"

document.write(sourcestring);

Now, here it is using a search and replace routine:

replacestring=new Array("1|Peter Piper ","2| picked ","3| pickled peppers ","4| peck ");

var sourcestring="12a4of3<br>A4of312<br>If 12a4of3<br>Where's the4of312?";

for (i=0;i<replacestring.length;i++)
{
tempstring=replacestring[i].split("|");
sourcestring=sourcestring.split(tempstring[0]).join(tempstring[1]);
}
document.write(sourcestring);
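
Here is the same decompression loop sketched as a reusable function, using the dictionary from the example above. The function name expand is made up for this example, and console.log replaces document.write so it runs outside a browser. JSON.stringify is only there to put quotes around the result, which makes the spaces visible.

```javascript
// The dictionary from the example: each entry is "symbol|replacement text".
const replacestring = ["1|Peter Piper ", "2| picked ", "3| pickled peppers ", "4| peck "];

// Hypothetical helper: expand every symbol in the text using the dictionary.
function expand(text, dictionary) {
  for (const entry of dictionary) {
    const parts = entry.split("|"); // parts[0] = symbol, parts[1] = phrase
    text = text.split(parts[0]).join(parts[1]);
  }
  return text;
}

const firstLine = expand("12a4of3", replacestring);
console.log(JSON.stringify(firstLine));
// "Peter Piper  picked a peck of pickled peppers "
```

Notice the doubled space after Piper and the space at the end. Browsers collapse runs of spaces when they display HTML, so this doesn't show on a web page, but it would matter if you used the text somewhere that preserves spaces. The symbols must also be characters that never appear in the real text or in the replacement phrases, otherwise they would be expanded by mistake.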

The text string has been compressed from 207 characters to 59 characters for sourcestring plus 62 characters for the new replacestring. That represents a saving of about 38% on the text space, but clearly the Javascript required to decompress the text will carry a file size penalty. In this example, the code has gone from 254 characters for the uncompressed text to 345 characters for the compressed version (it can be cut to 234 using shorter variable names).

As more text is added, though, the compression routine starts to pay for itself. New words can easily be added to the search and replace routine, using punctuation symbols and short character combinations (e.g. !1, !2, !3) to replace recurring phrases in new text that's added.
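
You don't have to build the compressed string by hand. The sketch below runs the search and replace in the other direction to produce it, using !1 style symbols, and then checks that decompressing gives back the original text. The function names, the dictionary layout (pairs of symbol and phrase) and the choice of phrases are all assumptions made for this example, not the code from my widget.

```javascript
// Hypothetical dictionary: each entry is [symbol, phrase].
const dictionary = [
  ["!1", "Peter Piper picked"],
  ["!2", "peck of pickled peppers"]
];

// Replace each phrase with its symbol, working through the dictionary in order.
function compress(text, dict) {
  for (const [symbol, phrase] of dict) {
    // Basic safety check: a symbol already in the text could not be told apart later.
    if (text.indexOf(symbol) !== -1) {
      throw new Error("Symbol " + symbol + " already appears in the text");
    }
    text = text.split(phrase).join(symbol);
  }
  return text;
}

// Undo compress by expanding the symbols in the reverse order.
function decompress(text, dict) {
  for (let i = dict.length - 1; i >= 0; i--) {
    text = text.split(dict[i][0]).join(dict[i][1]);
  }
  return text;
}

const original = "Peter Piper picked a peck of pickled peppers<br>" +
  "A peck of pickled peppers Peter Piper picked";
const packed = compress(original, dictionary);
console.log(packed); // !1 a !2<br>A !2 !1
console.log(decompress(packed, dictionary) === original); // true
```

You would run compress once while preparing the file, and ship only the packed string, the dictionary and the decompress routine. With two-character symbols like these, stick to fewer than ten of them, or !1 will also match the start of !10.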

You'll need to weigh up the circumstances in which this script is useful to you, but for applications that include a lot of natural language text, this script could create substantial savings.

As an additional benefit, it can be a handy way to make Javascript text strings difficult to understand. It falls way short of Javascript encryption, but if you were writing an adventure game, it could be enough to stop people working out where the treasure's buried.
