Tuesday, January 10, 2012

Find and move files on a Mac OS X Lion hard disk using Perl File::Find

There are times in your life when a bit of programming is a real, time-saving convenience. These are the times when even your wife would agree that being a programmer has its advantages.

Now, a couple of aeons ago I had an iBook G4 running Panther. When I switched from that machine, I just dumped its hard disk onto a Western Digital external drive. Then I used a Thinkpad, and the Thinkpad disk got dumped onto the same WD external drive when I moved to my current MBP.

The problem now is that locating photos or music on this WD external drive can be a pain. I had created a deeply nested "I am organized" directory structure on my old G4 and Thinkpad. (The argument is settled in my mind now: never create directories more than 2 levels deep, no matter what people say. Search is always faster than locating a file by traversal. All your little ontological schemes are totally arbitrary, and you are certain to forget your arbitrary "conventions" after 3 years.)

I am looking for a way to "flatten" this nested directory tree and move all the photos scattered across this disk into one place. Ditto for music, PDF files and whatever else I care about. I want a new directory structure that is flat, and for that I need to find files in the old tree and move them to the new tree. Find files on a hard disk and move them to a new directory tree. What can be simpler?

First I tried the find and mv commands with xargs:



find /Volumes/Elements/iBook/Music/ -name '*.MP3' -print0 | xargs -0 -I {} mv {} /Volumes/Elements/Music/ibook/{}




The problems with this scheme are:

  • There can be two files with the same name but different content (the result of importing from 2 digital cameras)
  • find hands xargs the full path, so {} is not a bare file name, and extracting the base name is difficult when the name contains spaces (none of the suggested tricks worked with the bash shipped with my Mac OS X Lion)
  • You may want to run some rules on the source as well as the target, and that kind of logic is difficult to program in Bash
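
The first two problems are easy to handle once you are in Perl. Here is a minimal sketch, not the script I actually ran: unique_target is a hypothetical helper name, and the source path in the usage line is made up for illustration. It splits off the base name (spaces need no special treatment) and appends a counter until the target name is free.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Spec;

# Hypothetical helper: build a target path that does not clobber an existing file.
sub unique_target {
    my ($target_dir, $source) = @_;
    # Split "IMG 0001.JPG" into the name "IMG 0001" and the suffix ".JPG"
    my ($name, $dir, $suffix) = fileparse($source, qr/\.[^.]*/);
    my $candidate = File::Spec->catfile($target_dir, $name . $suffix);
    my $n = 0;
    # Append _1, _2, ... until no file with that name exists in the target
    while (-e $candidate) {
        $n++;
        $candidate = File::Spec->catfile($target_dir, "${name}_$n$suffix");
    }
    return $candidate;
}

# Made-up source path, only to show the call
print unique_target('/tmp', '/some/old/tree/IMG 0001.JPG'), "\n";
```

The counter only kicks in on a real name clash, so most files keep their original names.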

find and mv with xargs will work for simple cases, and I suggest using them there. However, for my find-and-move case I found Perl File::Find to be a better fit.

  • Perl File::Find is fast - no complaints about speed
  • I have all of Perl's programming logic at hand, like attaching a counter to a file name. Programming Perl is preferable to programming Bash
  • With Perl File::Find and closures, I can easily reuse the logic across different sources
  • The world is full of ready-made Perl File::Find examples
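
To make the point about closures concrete, here is a small sketch under a few assumptions: make_planner is a hypothetical name, the photos target directory is invented for the example, and the callbacks only record what they would move instead of touching any file. Each call to the factory freezes one pattern and one target directory into a callback that find can run over any source tree.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical factory: returns a File::Find callback bound to one
# file name pattern, one target directory and one shared plan list.
sub make_planner {
    my ($pattern, $target_dir, $plan) = @_;
    return sub {
        # File::Find chdirs into each directory, so $_ is the bare file name
        return unless -f $_ && $_ =~ $pattern;
        push @$plan, [ $File::Find::name, "$target_dir/$_" ];
    };
}

my @plan;
my $music  = make_planner(qr/\.mp3$/i,   '/Volumes/Elements/media',  \@plan);
my $photos = make_planner(qr/\.jpe?g$/i, '/Volumes/Elements/photos', \@plan);  # assumed target

# The same two closures are reused for every source tree that is mounted
for my $source ('/Volumes/Elements/iBook', '/Volumes/Elements/Thinkpad-R60') {
    next unless -d $source;
    find($music,  $source);
    find($photos, $source);
}
print "$_->[0] -> $_->[1]\n" for @plan;
```

Collecting a plan first also gives you a chance to eyeball the result before anything is moved.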

With about 20 minutes of work I was able to whip up my script and find and move the files into the desired directory structure. Now I can easily browse the dumps of my old hard disks and decide what to keep and what to throw away.




#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Find;

our $count = 0 ;

my $ibook_dir = "/Volumes/Elements/iBook";
my $thinkpad_dir = "/Volumes/Elements/Thinkpad-R60";

my $wanted = make_wanted(\&move_media,'/Volumes/Elements/media');
find($wanted, $thinkpad_dir);


#http://www.perlmonks.org/?node_id=109068

sub make_wanted {
    my $wanted = shift;                     # get the "real" wanted function
    my @args = @_;                          # "freeze" the arguments
    my $sub = sub { $wanted->(@args); };    # generate the anon sub
    return $sub;                            # return it
}


sub move_media {
    my @args = @_;                   # $args[0] is the target directory
    my $file = $File::Find::name;    # full path, already using forward slashes

    return unless -f $file;          # skip directories
    return if $file =~ /THUMBS/i;
    # match the extension at the end of the name only
    return unless $file =~ /\.(?:mp3|avi|mov|mpeg|mpg|mpeg4|mp4|3gp|3gpp|h264|wmv|flv)$/i;

    my $mvname = fileparse($file);   # base name without the directory part
    # quote source and target - otherwise the mv command fails on spaces
    print "mv \"$file\"  \"$args[0]/$mvname\" \n";
}

sub move_photo {
    my @args = @_;                   # $args[0] is the target directory
    my $file = $File::Find::name;    # full path, already using forward slashes

    return unless -f $file;          # skip directories
    return if $file =~ /THUMBS/i;
    return if $file =~ /\.svn/;
    return if $file =~ /gloodev/;
    return if $file =~ /pgsem/i;
    return if $file =~ /DMC/i;

    # match the extension at the end of the name only
    return unless $file =~ /\.jpe?g$/i;

    my $x = fileparse($file);        # base name without the directory part
    $count++;
    my $mvname = $count."_".$x;      # counter prefix keeps duplicate names apart
    # quote source and target - otherwise the mv command fails on spaces
    print "mv \"$file\"  \"$args[0]/$mvname\" \n";
}

  1. The fileparse routine extracts the base name from the full file name
  2. The global count variable is declared with our, which makes it a package variable
  3. The script prints mv commands instead of running them. The shell that runs them splits unquoted arguments on spaces, so both file names are quoted
  4. The closure that turns our own parameters into a File::Find wanted function is taken from the PerlMonks site (link in the code)
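
If you would rather skip the printed mv commands and the quoting altogether, the move can be done from Perl itself. This is only a sketch of that alternative, not what my script does: move_into is a hypothetical helper, and it uses move from the core File::Copy module, which never passes the name through a shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Copy qw(move);
use File::Spec;

# Hypothetical helper: move one file into $target_dir without a shell,
# so spaces and quotes in the name need no escaping.
sub move_into {
    my ($source, $target_dir) = @_;
    my $target = File::Spec->catfile($target_dir, basename($source));
    if (-e $target) {
        # Never overwrite: leave the source where it is and report it
        warn "skipping $source: $target already exists\n";
        return;
    }
    move($source, $target) or die "cannot move $source to $target: $!";
    return $target;
}

# Usage: pass the files to move on the command line
move_into($_, '/Volumes/Elements/media') for @ARGV;
```

Printing the commands first, as the script does, has the advantage of being a dry run; moving directly is the safer route for odd file names.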


