Some thoughts on file names

Posted in file storage by Paul Butzi on January 8, 2007

“What’s in a name?  That which we call a rose/by any other word would smell as sweet.”

While my internet connection was down, I was motivated to do things that didn’t require the ‘net, like cleaning up.  One of the things I started to do was to organize the VAST lump of computer files.

Far and away the most daunting thing was that I’ve been very bad about avoiding duplication, so it’s often the case that I have the very same image, stored in multiple locations.  I decided that this was a Bad Thing, and I set about finding duplicates and eliminating them.  And to my dismay, this was a Very Hard Thing to Do, because often the duplicates would have different names.

Here are some lessons I learned:

  1. Every image should have a unique name, even if it becomes removed from the directory that provides identifying context.  So it’s no good to have a directory named “July 2001” and a directory named “August 2001”, each of which has files named PICT0001.JPG, PICT0002.JPG, etc. in it, especially if the same-named files in the two directories are not identical.  More on this later.
  2. Files which are derived from some original file should make it clear what that original file was in the file name if at all possible.
  3. When you’re reading files off CF cards, you should read the files off the card, and immediately ensure that they are stored in TWO locations on separate hard disks, for safety.  Having done that, you should immediately format the darn CF card, so that you don’t end up duplicating the files when you copy files off the card the next time.
  4. Unix is far better than Windows/XP for letting me cobble together tools to find all the files that are byte for byte the same.
  5. It will be far easier to avoid duplication in the future (and easier to detect dupes) than it has been to eradicate them on a filesystem with tens of thousands of files and some large but unknown number of duplicates.
  6. My rate of accumulation of files has increased exponentially over the past few years, so that I am accumulating files much much faster now than I was even just one year ago.
  7. Don’t tackle tasks like this when you already have a headache.  Trust me on this one.
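
To make lesson 4 concrete, here is a minimal Python sketch of one way to find files with identical contents even when their names differ. It is not the set of Unix tools referred to above (those aren't described); the function names `file_digest` and `find_duplicates` are made up for illustration. It groups files by size first, then by SHA-256 digest, on the assumption that matching digests mean matching bytes. If you want to be strictly certain before deleting anything, a final byte-by-byte comparison would be a sensible extra step.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root whose contents hash the same."""
    # Pass 1: bucket files by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with at least one other file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("/path/to/pictures")` (a placeholder path) returns a list of lists, each inner list holding the paths of files that appear to be copies of one another. Note that two files both named PICT0001.JPG end up in the same group only if their contents match, which is exactly the distinction the file names alone can't make.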

Right now, for image files which I am creating now, when I copy the files off the CF card (or SD card, whatever.  Sit down in back, there, and be quiet) I immediately rename ALL the files using the following scheme: <camera name>-<YYYYMMDD>-<sequence number>.

Camera name is a unique, short name I attach to each camera I own.  For my Canon EOS-5d, the name is “5d”.  For the Canon A95 I own, the name is “a95”.  You get the idea.

YYYYMMDD is the date the image was captured.  This is recorded in the EXIF data, so it’s easy for the tool I use (Adobe Bridge) to find it and use it to rename the file.  It’s also easy for any other tool people use to do it, so if I switch from using Bridge, I don’t have to switch naming conventions.  I use the order year, month, day, with fixed widths, because it makes it easy to get various tools to sort the files in date order.

Finally, the sequence number is the one attached to the file by the camera.  On the 5d, it’s a four digit number and when it hits 9999, it just wraps back to 0000.  You’ll note that as long as I don’t have a camera body which wraps the number in a single day, I’ll never get different files with names that collide.
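
The scheme is simple enough to sketch in a few lines of Python. This is only an illustration, not what Bridge does internally: `capture_name` is a made-up helper, the capture date is passed in directly rather than read from EXIF, and keeping the original file extension is an assumption, since the scheme above doesn't mention one.

```python
from datetime import date


def capture_name(camera, captured, sequence, extension):
    """Build '<camera name>-<YYYYMMDD>-<sequence number>' plus the file extension.

    camera    -- short unique camera name, e.g. "5d"
    captured  -- capture date (in practice this would come from the EXIF data)
    sequence  -- the camera's own counter, zero-padded to four digits
    extension -- original extension including the dot (assumed to be kept as-is)
    """
    return f"{camera}-{captured:%Y%m%d}-{sequence:04d}{extension}"


# Example: frame 42 shot on the 5d on January 8, 2007.
name = capture_name("5d", date(2007, 1, 8), 42, ".CR2")
print(name)  # 5d-20070108-0042.CR2
```

Because the year, month, and day are fixed width and ordered from most to least significant, a plain alphabetical sort of names from one camera is also a sort by capture date, with the sequence number breaking ties within a day.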

  1. Ed Richards said, on January 8, 2007 at 7:12 am

    Breeze Systems makes a terrific utility – Downloader Pro – that automatically manages downloads from cards, renames any way you want, and prevents duplicates. Was free, but well worth the current low cost:

  2. Ed Richards said, on January 8, 2007 at 7:15 am

    You should carefully plan your directory structure, and try to set it up so that it can be refined without having to completely redo the whole thing. I think it useful to have a limited number of root directories – I use one for all images, with as many subs off of that as necessary. Then I can use a directory sync utility to keep that root in sync with the backup devices. If you are scanning from negatives, make sure that the filename ties back to the negative, and that this is preserved with subsequent edits.

  3. Gordon said, on January 8, 2007 at 8:45 am

    I went through this about a year ago and came up with something similar, but slightly different.

    I have a folder : Pictures

    in that folder is a folder for each year

    In the appropriate year folder, goes a directory:


    This way (YYYYMMDD) means that they sort in date order, when you sort by filename.

    In each of these date/ shoot folders, I copy the files to a directory called ‘originals’
    Each capture gets renamed to

    Starts from 0 for each directory. The software I use to copy from a CF card does this automatically, to a working directory and to network storage.

    I also create two other directories, ‘edits’ and ‘output’

    The ‘edits’ directory is where I save any photoshop edits to the file, all the layers still there, full resolution. filename_as_above.psd

    The output directory gets web versions, print versions etc. I typically wipe that directory when I run out of space, but keep the originals and edits directories. The ‘output’ can be quickly recreated most times from the version in the ‘edits’ folder, and I have the originals separate as well.

    I find with this naming convention, I can quickly re-locate files, based on the name and also work out duplicates and which files to get rid of.

  4. theoldmoose said, on January 11, 2007 at 1:23 pm

    I highly recommend The DAM Book (Digital Asset Management).

    It tends to be centered around the supposition that you will eventually get iView Media Pro to organize all your files, but since that tool doesn’t cost all that much, I decided to spring for it.

    Peter’s book is one you will want to study thoroughly, as it provides all the tools and ideas you’ll need to get a handle on this.

