The joys of regex

December 11th, 2007

So a couple of days ago someone (I’ve forgotten who, sorry!) put up an interesting post about using regex (regular expressions) in Eclipse to save you loads of repetitive typing. As a follow-up I thought I’d show a real-life example of using regex that just saved me a whole heap of laborious grunt work.

The problem

A client asked me to update the images on a site I’m building for them. The image details are all in a simple xml file with some details and a link to the image file, e.g…

<images>
<image file="img1.jpg"><client>Foobar</client><by>Someone</by></image>
<image file="img2.jpg"><client>Barfoo</client><by>Somebody Else</by></image>
...
</images>

The client, however, supplied me with a text document that looked something like…

1. Foobar/Someone
2. Barfoo/Somebody Else
....

…and a bunch of jpeg files that had been named to match the document, so they were actually called ‘1. Foobar_Someone.jpg’ etc and needed to be renamed for safe use on the web (I never like having mixed case and spaces in web filenames).

Now as there were around 80 of these files it could have been a long and boring ‘rename & save’ job, then a whole bunch of cutting and pasting, so instead I used Eclipse and Perl’s regex powers.

Eclipse solution

The first thing I did was load the text file in Eclipse, then hit CTRL + F for the find & replace dialogue and checked the ‘Regular Expressions’ box. In the ‘find’ box I put

^(\d*)\. (.*)/(.*)$

This is a fairly simple pattern match using brackets to capture matching ‘groups’ that we can use later. Taking it from the beginning…

  • the ^ character matches the start of a line, and the (\d*) captures the number at the start
  • the \. matches a literal dot (the backslash is an escape character, as the dot normally means match any character)
  • It may be hard to see here, but there’s then a space, which is matched but not captured
  • the (.*)/(.*) captures the two groups of words either side of the slash
  • and the $ matches the end of the line
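
If you want to see what each group grabs without opening Eclipse, here is a minimal sketch in Python. Python and its `re` module are my choice for illustration only; the function name `split_line` is made up for this example, and the sample line comes from the client's list above.

```python
import re

# Same pattern as the one typed into the Eclipse 'find' box.
LINE_PATTERN = re.compile(r"^(\d*)\. (.*)/(.*)$")

def split_line(line):
    """Return (number, client, by) for a line like '1. Foobar/Someone', or None."""
    match = LINE_PATTERN.match(line)
    return match.groups() if match else None

print(split_line("2. Barfoo/Somebody Else"))
# prints ('2', 'Barfoo', 'Somebody Else')
```

Lines that don't fit the pattern (such as the '....' filler) simply don't match, so they would be left alone by a find & replace.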

Then in the replace box I put

<image file="img$1.jpg"><client>$2</client><by>$3</by></image>

The dollar+number means use the contents of capture group n, so you can see I’m simply ‘pasting’ the captured bits in the correct places.

Then just hit ‘replace all’ and job done!
(Hint: You can also use CTRL+SPACE in the find & replace input boxes to remind you of the regex syntax)
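
For the curious, the same find & replace can be sketched outside Eclipse. This is a rough Python equivalent, not what I actually ran: `to_image_tags` is a hypothetical helper name, and note that Python writes group references as \1 where Eclipse uses $1.

```python
import re

# MULTILINE makes ^ and $ match at every line, as they do in Eclipse's dialogue.
FIND = re.compile(r"^(\d*)\. (.*)/(.*)$", re.MULTILINE)

# Eclipse's $1, $2, $3 become \1, \2, \3 in a Python replacement string.
REPLACE = r'<image file="img\1.jpg"><client>\2</client><by>\3</by></image>'

def to_image_tags(text):
    """Turn every 'N. Client/Author' line in text into an <image> element."""
    return FIND.sub(REPLACE, text)

sample = "1. Foobar/Someone\n2. Barfoo/Somebody Else"
print(to_image_tags(sample))
```

Run on the two sample lines, this should print the same two <image> lines shown in the xml example at the top of the post.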

Perl solution

Of course I still needed to rename all the files, so next I used the ‘rename’ command. I’m working on Linux, where the ‘rename’ I have is the one that comes with Perl. Be aware that some systems ship a different ‘rename’ with its own syntax, so check which one you have before assuming it will work the same.

The syntax for the rename command is…

rename perlexpr [ files ]

…and basically runs the regular expression ‘perlexpr’ on the filename of all files matching [files]. The expression I used here is…

rename -v 's/(\d*)\..*/img$1.jpg/' *.jpg

The regular expression is the messy looking bit inside the inverted commas, and the *.jpg at the end applies it to all the .jpg’s in the folder. Again taking the regex from the top…

  • The “s” means substitute. The syntax is s/old/new/, i.e. substitute the old with the new
  • The (\d*) captures the initial number in the filename
  • The \. matches a literal dot
  • The .* matches anything else after it
  • We then discard all the other crap apart from the captured number, and use it in the substituted filename.

(Hint: more info : http://tips.webdesign10.com/how-to-bulk-rename-files-in-linux-in-the-terminal)
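
If you don't have the Perl ‘rename’ to hand, the same idea can be sketched in Python. This is an illustration under my own assumptions rather than the command I ran: the names `web_safe_name` and `rename_images` are invented here, and I've used \d+ instead of \d* so that files without a leading number are left untouched.

```python
import re
from pathlib import Path

# Mirrors the s/old/new/ expression: leading digits, a literal dot, then anything.
NAME_PATTERN = re.compile(r"^(\d+)\..*$")

def web_safe_name(filename):
    """Map '1. Foobar_Someone.jpg' to 'img1.jpg'; return other names unchanged."""
    return NAME_PATTERN.sub(r"img\1.jpg", filename)

def rename_images(folder, dry_run=True):
    """Rename every .jpg in folder; with dry_run=True only report the plan."""
    planned = []
    for path in sorted(Path(folder).glob("*.jpg")):
        new_name = web_safe_name(path.name)
        if new_name != path.name:
            planned.append((path.name, new_name))
            if not dry_run:
                path.rename(path.with_name(new_name))
    return planned
```

Calling rename_images on the folder first with the default dry_run=True returns the list of (old, new) names without touching anything, which is a cheap way to check the pattern before letting it loose on 80 files.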

Conclusion

Well, it’s taken me a hell of a lot longer to write this post than it did to rename all those files. I wouldn’t like to guess how long it would have taken by hand, but I’m sure it was much easier this way.

Try a bit of regex yourself, you just might like it! :)

LFPUG Linux Presentation Slides

January 30th, 2007

Way back in November last year I did a presentation at the London Flash Platform Usergroup. I ran through a quick intro to Linux, what makes it different, and why you may (or may not) want to try it out, as well as giving a demo of my Beryl desktop.

I also showed how to install the (back then brand new) Beta Linux Flash Player, and how to set up a completely Open Source Flash development environment using Eclipse, MTASC & ASDT.

I had a lot of fun doing the pres, and gave out a whole stack of Ubuntu CDs afterwards. The slides were supposed to go up on the LFPUG site at the same time as the video, but that all got rather delayed… so I thought I’d post my slides here instead. Better late than never :)


Download “Getting Flash on Linux” presentation slides

(The useful Flash info is from about slide 17 onwards)

Edit - the video is up on the LFPUG site too now, bit dark but check it out

FDT - Templates are your friend!

July 22nd, 2006

Jason Nussbaum suggests on his blog the use of a standard naming convention in all your get and set functions to save you a few seconds thinking time. This is all good, but how about taking it one step further by ensuring you always use standard names while also saving some typing? If you’re using FDT I just might be able to help…

A nifty feature of FDT that not many people seem to know about is Templates. Templates allow you to easily insert and customise chunks of code, anything from one line to whole classes, in a few keystrokes. Being a lazy git at heart I have a lot of love for this feature, and use it often. The best thing is you can also easily make your own Templates to automate your regular tasks, and in this post I’ll show you how to add a simple Template to write a pair of ‘get’ & ‘set’ data access functions in record time…


Thanks for the dead tree!

June 25th, 2006

(Update: it was of course Stuart Eccles not Stefan Richter who did the Rails talk, my bad!)

Wanted to say thanks to Stuart Eccles and O’Reilly for the copy of Agile Web Development With Rails I won at the LFPUG the other night.

I already owned the PDF version and it’s always nice to have the ‘real’ version to flick through. It goes without saying that this is the book to get when you want to get started with Rails, and will remain a handy reference long after you’ve finished reading it. The second version is available now as a constantly evolving ‘beta’ PDF. Kinda like ‘Agile Authoring’ :)

There seemed to be a lot of interest in Rails at LFPUG, so I’ll try to get proper posts up on some things Stuart didn’t have time to mention. For instance, I use RadRails, which is an extremely cool Eclipse plugin, and there’s a new Live CD that may be worth a play with.

Stay tuned…