Command Line Fun: How to Flatten a Folder Hierarchy

Updated March 7, 2012: A reader pointed out that the command line solution presented here only works if all files in the folder hierarchy have unique names.  Please see the comments to this post for two ways to work around this limitation.

Once upon a time, I knew most every command line operation that was available on a computer running Windows.  Back in the day I even held a Microsoft Certified Professional title for MS-DOS.  However, as time went by and Windows matured, I paid less and less attention to the command line and the enhancements that were made to it.

Recently I was asked if I knew how to “flatten” a folder hierarchy.  That is, copy all of the files contained within a folder tree to the root of the tree (or to any single folder).  Off the top of my head I didn’t know how to achieve this, but assumed that there was a utility app out there that would do the trick.

Well, I did find a way to do it, and lo and behold, no utility app is needed.  Instead, all you need is the good old command prompt.  I found this post on Stack Overflow that shows how to flatten a folder hierarchy:

http://stackoverflow.com/questions/1502170/how-to-copy-files-from-folder-tree-dropping-all-the-folders-with-robocopy

I’ve successfully tested the solution described there on both Windows XP and Windows 7.

Basic Command (the details)

I set up a hierarchy of folders with a few files at various levels of the hierarchy.  The folder hierarchy looked something like this:

Folder0
          Folder1
                    File1.1.txt
                    File1.2.txt
                    FolderA
                              FileA.txt
                    FolderB
                              FileB.1.txt
                              FileB.2.txt
          Folder2
                    FolderC
                              FileC.txt

Then, at a command prompt I navigated to the root of the hierarchy (Folder0).  From there, I ran the command…

for /r %f in (*) do @copy "%f" .

… which copied all of the files scattered throughout the hierarchy to the root (Folder0).  (Note that the trailing period in the command is significant… it designates the current folder as the destination for the copy operation.)

The contents of the root folder of my folder hierarchy now looked like this:

Folder0
          File1.1.txt
          File1.2.txt
          FileA.txt
          FileB.1.txt
          FileB.2.txt
          FileC.txt
          Folder1
          Folder2

So how does this work?  If you type “for /?” at a command prompt, you find quite extensive information about the “FOR” command.  To understand the specific command I used to flatten the hierarchy, this is the key information about the FOR command:

Runs a specified command for each file in a set of files.

FOR %variable IN (set) DO command [command-parameters]

  %variable  Specifies a single letter replaceable parameter.
  (set)      Specifies a set of one or more files.  Wildcards may be used.
  command    Specifies the command to carry out for each file.
  command-parameters
             Specifies parameters or switches for the specified command.

And this is also relevant:

FOR /R [[drive:]path] %variable IN (set) DO command [command-parameters]

    Walks the directory tree rooted at [drive:]path, executing the FOR
    statement in each directory of the tree.  If no directory
    specification is specified after /R then the current directory is
    assumed.  If set is just a single period (.) character then it
    will just enumerate the directory tree.

So in other words,

for /r %f in (*) do @copy "%f" .

means to start at the current folder and walk the folder tree (/r), and for every (*) file (%f) found, copy the file to the current folder (@copy "%f" .).

Specify a Source and Destination Folder

A start folder and a destination folder can also be specified.  To try this out, I created an additional folder named DestFolder at the same level as Folder0.  With those folders in place, I executed the following command to copy all of the files in the Folder0 hierarchy into DestFolder:

for /r C:\Folder0 %f in (*) do @copy "%f" C:\DestFolder

Notice that this time a source folder (C:\Folder0) and a different destination folder (C:\DestFolder) were included in the command that was executed.

Move Instead of Copy

Finally, copying the files is useful, but sometimes the better operation is to move the files, rather than preserving the original files in their original locations.  The Windows command prompt provides a command for moving files (MOVE), and it can be used in place of the “COPY” command used in the earlier examples.

The following command will move all of the files from the Folder0 hierarchy into DestFolder:

for /r C:\Folder0 %f in (*) do @move "%f" C:\DestFolder

There you have it. Simple, fast, and requires nothing more than a command prompt.  Wonder what other command line goodies I’ve been ignoring?

31 Responses to Command Line Fun: How to Flatten a Folder Hierarchy

  1. a says:

    thanks for the excellent article!
    can you advise please how to rename instead of overwriting files, when flattening folders?

    • mlichtenberg says:

      Good catch! I should have noted that my solution assumes that all files have unique names. I’ll update the post to make this clear.

      There is really no foolproof way to do what you want using the Windows command line tools, though there is one possibility. The documentation of the FOR command shows how the variable references can be replaced with more than just a filename. For example…

      for /r %F in (*) do copy "%F" ".\%~znxF"

      …will copy each file (%F) to the current directory (.) with a new file name that incorporates the filename, extension, and file size (.\%~znxF, where z = file size, n = filename, and x = file extension). This may get you what you need, but it does rename EVERY file and it relies on the assumption that files with the same name do not also have the same size.

      PowerShell is a better tool for flattening a folder hierarchy when you have files with the same name. See http://powershell.com/cs/media/p/1253.aspx or http://www.vistax64.com/powershell/207284-copy-move-file-rename-if-destination-exist.html for some PowerShell scripts that may help you get started.
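For readers who would rather script the rename-on-collision case, here is a minimal sketch in Python (not PowerShell, and not taken from the linked scripts). The function name flatten_copy and the "_1", "_2" suffix scheme are assumptions made for illustration. Unlike the file-size trick above, it renames only files whose names actually collide.

```python
import os
import shutil


def flatten_copy(src_root, dest_dir):
    """Copy every file under src_root into dest_dir, renaming on name collisions."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_abs = os.path.abspath(dest_dir)
    copied = []
    for folder, subfolders, files in os.walk(src_root):
        # Do not descend into the destination if it lives inside the source tree.
        subfolders[:] = [d for d in subfolders
                         if os.path.abspath(os.path.join(folder, d)) != dest_abs]
        if os.path.abspath(folder) == dest_abs:
            continue  # files already in the destination stay where they are
        for name in sorted(files):
            stem, ext = os.path.splitext(name)
            target = os.path.join(dest_dir, name)
            counter = 1
            # Assumed naming scheme: report.txt, report_1.txt, report_2.txt, ...
            while os.path.exists(target):
                target = os.path.join(dest_dir, f"{stem}_{counter}{ext}")
                counter += 1
            shutil.copy2(os.path.join(folder, name), target)
            copied.append(target)
    return copied
```

Calling flatten_copy(r"C:\Folder0", r"C:\DestFolder") would mirror the copy example from the post. Which of two same-named files keeps the original name depends on the order in which the folders are visited, so treat the suffixes as arbitrary.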

      • karlhale says:

        Can this be modified to add the containing folder to the final file name? As in:

        Current structure:
        RootFolder
        – FolderA
        – – File1.jpg
        – – File2.jpg
        – FolderB
        – – File1.jpg
        – – File3.jpg

        Final structure:
        RootFolder
        – FolderA_File1.jpg
        – FolderA_File2.jpg
        – FolderB_File1.jpg
        – FolderB_File3.jpg

        Thanks for your help!

      • mlichtenberg says:

        I’ve looked at this for a bit this evening, and unfortunately do not see a way to accomplish what you ask. Essentially you need a way to manipulate the path and filename values that are available via the variable references. Perhaps something like this…

        for /r %f in (*) do @copy "%f" replace(%f, "\", "-")

        … would work if the “replace” function was a real thing. You might have to investigate something like PowerShell.
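Outside the command prompt, the "replace the backslashes" idea is easy to express. The following is a rough Python sketch, with the function name flatten_with_folder_prefix and the underscore separator chosen only for illustration. It copies each file to the root and folds the relative folder path into the file name, so FolderA\File1.jpg becomes FolderA_File1.jpg.

```python
import os
import shutil


def flatten_with_folder_prefix(root, separator="_"):
    """Copy files from subfolders into root, prefixing each with its folder path."""
    root = os.path.abspath(root)
    created = []
    for folder, _subfolders, files in os.walk(root):
        if folder == root:
            continue  # files already at the root keep their names
        # "FolderA\Sub" relative to the root becomes the prefix "FolderA_Sub".
        prefix = os.path.relpath(folder, root).replace(os.sep, separator)
        for name in files:
            target = os.path.join(root, f"{prefix}{separator}{name}")
            shutil.copy2(os.path.join(folder, name), target)
            created.append(os.path.basename(target))
    return sorted(created)
```

This sketch copies rather than moves, and it does not guard against two different paths producing the same flattened name (for example A_B\f.txt and A\B\f.txt), so try it on a backup first.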

      • karlhale says:

        Bummer. Thanks for looking into it!

    • Tim says:

      I am so sorry to change the subject slightly here, but I know you guys can help based on the complexity of the solutions provided in this thread.

      I am using the following as a .bat file to update two different text files with file names in a directory:

      cd "Graphics\Manufacturing – Holding Area"
      dir /b>..\filelist.txt
      dir /b>filelist.txt

      My question is, is there a way to output a CSV file instead of a text file (so there are commas, like this: FIELDTERMINATOR = ',' and ROWTERMINATOR = '\n')?

      Also, is it possible to return two columns of data in the text/csv file? I would like to put the immediate folder name in one column (without the full path) and the filename in the other column.

      Is this possible using windows CMD prompt?
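This question goes unanswered in the thread. One possible approach outside the command prompt is a short Python script; the sketch below is only an illustration, and the function name write_file_list_csv and the column headers are assumptions. It writes one row per file with the immediate folder name (no full path) in the first column and the file name in the second, using a comma as the field separator and '\n' as the row terminator.

```python
import csv
import os


def write_file_list_csv(root, csv_path):
    """Write a two-column CSV (immediate folder name, file name) for all files under root."""
    rows = []
    for folder, _subfolders, files in os.walk(root):
        # Only the last component of the path, not the full path.
        folder_name = os.path.basename(os.path.abspath(folder))
        for name in sorted(files):
            rows.append((folder_name, name))
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["folder", "filename"])  # assumed header names
        writer.writerows(rows)
    return rows
```

The file list is gathered before the CSV is opened, so the output file does not list itself even when it is written inside the folder being scanned.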

  2. Soap Distant says:

    Thank you for this, very useful

  3. Chris D says:

    Yes! This was fantastic and easy! Know if there’s a way to force overwrite existing files if duplicates exist in the target dir? Thanks! -Chris

    • mlichtenberg says:

      To force overwrite of existing files in the destination folder, just include the /Y switch to the COPY/MOVE command. That switch tells both the COPY and MOVE commands to suppress the overwrite prompt. In the examples I’ve given, the /Y switch belongs right after the @copy command. For example,

      for /r C:\Folder0 %f in (*) do @copy /Y "%f" C:\DestFolder

      Hope that helps.

  4. Tim says:

    Is there a way to suppress the overwrite prompt to overwrite the file, but only if the copied file is newer?

    • mlichtenberg says:

      Not to my knowledge. That may require something a bit more sophisticated. I suspect PowerShell could do the job, although I’m not too familiar with that tool.

      • Tim says:

        Sorry, I don’t mean just suppress the prompt only if it’s newer. I just mean, instead of copying all of the files every time, can it copy only the files that have a newer modification date than the copies in the destination folder? Is this what you are saying would need to be done in some utility app, as opposed to the command line?

      • mlichtenberg says:

        Turns out there is a way to do this… using XCOPY instead of COPY. For suppressing the prompt, the XCOPY command accepts the same /Y switch that the COPY command does. In addition, XCOPY accepts the /D switch, which “copies only those files whose source time is newer than the destination time”.

        So, an example of using the FOR…IN…DO commands to copy only newer files is:

        for /r %f in (*) do @xcopy "%f" . /D /Y
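The "copy only if the source is newer" rule can also be spelled out explicitly. The Python sketch below is an illustration of that rule, not a description of how XCOPY is implemented. The function name copy_newer_files is hypothetical, and the sketch assumes the destination folder lies outside the source tree.

```python
import os
import shutil


def copy_newer_files(src_root, dest_dir):
    """Flatten src_root into dest_dir, copying a file only if it is new or newer."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for folder, _subfolders, files in os.walk(src_root):
        for name in sorted(files):
            source = os.path.join(folder, name)
            target = os.path.join(dest_dir, name)
            # Copy when the target is missing or the source was modified later.
            if (not os.path.exists(target)
                    or os.path.getmtime(source) > os.path.getmtime(target)):
                shutil.copy2(source, target)  # copy2 also carries over the timestamp
                copied.append(name)
    return copied
```

Because copy2 preserves the modification time, running the function a second time with no changes in the source should copy nothing.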

  5. Tim says:

    Is there a way to do this but, instead of copying or moving the files, get a single (recursive) list of all of the (*) files… similar to the dir /b>filelist.txt command?

    • mlichtenberg says:

      Yep. Just plug your “DIR” command into the FOR…IN…DO command in place of the COPY/MOVE. For example:

      for /r %f in (*) do @dir /B > Filelist.txt

      Or, if you prefer to list only files (and not folders), use this:

      for /r %f in (*) do @dir /B /A-d > FilelistNoDir.txt

      • mlichtenberg says:

        Wow, no, I got that completely wrong. That’s what I get for trying to reply to a blog comment when I should be working. :-)

        Let’s try that again.

        You can get a list of the files by executing the following command:

        for /r %f in (*) do @echo %~nxf >> filelist.txt

        Instead of replacing the COPY/MOVE commands with DIR (as I had mistakenly replied earlier today), replace them with ECHO and use the >> operator to append the output to filelist.txt. Essentially, this command operates on each file in the folder hierarchy, echoing the filename and redirecting the output to a file. The %~nxf parameter results in an output of only filenames and extensions… to include file paths, use simply %f.

        Hope that helps.

  6. Tim says:

    Absolutely outstanding. This is great. Thanks so much for the help!

  7. Rick Bombaci says:

    Mike,

    I found this thread quite interesting, and appreciated your clear descriptions. How about this for a twist? Suppose I have a multi-level directory structure (let’s say there are three levels), and I want to aggregate all of the third level folders along with their contents into the first (top) level folder so that I can then delete the now-empty second level folders? I’d love to see a command prompt solution!

    Rick

    • mlichtenberg says:

      Hmmm… before I think about a solution, let me make sure I understand the scenario.

      Is this what you are describing… a root folder (Folder1) that contains only a subfolder (Folder2) that contains only another subfolder (Folder3) that contains one or more files? And you want to move Folder3 (and its contents) into Folder1, and then delete Folder2?

      So in a sense you want to rearrange the folder hierarchy, rather than flattening it?

      • Yes, although there are multiple Level 2 folders, each of which contains multiple Level 3 folders. End product is all of the Level 3 folders residing directly beneath Level 1 (so they are now Level 2 folders), with their former parent folders (now empty) being their siblings. Which reminds me: Ever hear of the song “I’m My Own Grandpa”?

        Thanks for thinking about this!

      • mlichtenberg says:

        I think I have a solution for you, but I’ve only attempted a few basic tests, so PLEASE remember to back up your files before giving this a try.

        My test case (for which the following worked fine) had a root folder with two subfolders. The first subfolder had two subfolders of its own, and the second subfolder had one subfolder of its own. The three “leaf” subfolders contained random numbers of files.

        First, to move all of the subfolders and their contents to the root folder, navigate to the root folder and execute the following:

        for /r %f in (.) do @move "%f" .

        The key is to use the period rather than the asterisk to indicate that the %f variable should represent folders rather than files. This ensures that the “move” command moves folders, rather than just files.

        Now that everything has been moved to the root, the root folder contains both empty and non-empty subfolders. To clean up the empty subfolders, execute this:

        for /r %f in (.) do @rmdir "%f"

        In my case, each of these commands produced a few error messages due to the “move” and “rmdir” commands attempting to work on folders to which they did not have rights (for example, “rmdir” without options will not remove a non-empty folder). Those errors are ignored by the “for” command, and can be safely ignored by you as well.

        As I mentioned, this worked for my single basic test case; you may not have the same results. Let me know if you have success.

      • mlichtenberg says:

        One more thing… it turns out that the solution I gave should work fine for a three-level folder hierarchy, but needs a slight modification if there are more than three levels.

        Consider the case of three levels: folder1 is the root, and it has a subfolder of folder2 that has a subfolder of its own named folder3. The first “for” command executes a “move” on each folder in the hierarchy. folder1 is considered first, but as it is the root the “move” command has no effect. folder2 is next. The “move” command again has no effect, as folder2 is already a subfolder of folder1. folder3 is processed last, and is moved to folder1. Just what we want.

        Now consider a four-level folder hierarchy… extend our example to have a folder4 that is a subfolder of folder3. Once folder3 is moved to folder1, folder4 is not parsed by the “for” command. The “for” command was parsing folder1 > folder2 > folder3 > folder4. But, moving folder3 to folder1 takes folder4 along with it. Because folder4 is no longer where it was when the “for” command started processing the folder hierarchy, it gets “missed”. The end result after the “for” command completes is that folder1 has subfolders of folder2 and folder3, and folder3 still has a subfolder of folder4.

        Hope you followed that… read it a couple times if necessary.

        The solution, it seems, is that the first “for” command in my original solution (the one that executes the “move” command) needs to be run N – 2 times, where N is the highest number of levels in your folder hierarchy. Consider the previous example, which has four levels. If the “for” command was run a second time, folders 1, 2, and 3 would be unchanged, and folder4 would be moved into folder1. That would give us the result we were trying to achieve.

        Does this help, or have I lost you along the way?

  8. I understand you perfectly. Thanks for working on this. When I tried it, I got an error message (from FOR or MOVE): “. [period] was unexpected at this time.” So the “dot” is not being recognized properly. I’ll test this some more, but I’m going for a quick walk, as the sun has just come out for the first time in a few days. FYI, I’m running Win 7, ran this in a cmd window batch file. Maybe I should test it manually at the command prompt, because, if memory serves me right, it’s necessary to add some additional % symbols or something when you run this inside a batch file. Will let you know if I have success.

  9. When entered manually at the command line, your solution works great. But I’ve been unable to get it to work properly in a batch file, despite adding the extra % symbols, in which case the command pauses as if processing, but ends up doing nothing, and yields no error message.

    Any ideas?

    • mlichtenberg says:

      These are great questions; I should probably create a new blog post on just this topic.

      By simply doubling up on the percent signs (%) I was able to get things to work in a batch file (Windows 7). Specifically, placing a batch file containing just the following two lines into the root of a three-level folder hierarchy worked fine:

      for /r %%f in (.) do @move "%%f" .
      for /r %%f in (.) do @rmdir "%%f"

      Expanding on that to make it work on folder hierarchies of any depth was easy as well. Using the post found at http://tom.paschenda.org/blog/?p=26 as a guide, I expanded the batch file to the following:

      @echo off
      SET /a i = 0

      :loop
      IF %i%==%1 GOTO END

      for /r %%f in (.) do @move "%%f" .

      SET /a i=%i%+1
      GOTO LOOP

      :end

      for /r %%f in (.) do @rmdir "%%f"

      To handle hierarchies of any depth, place that batch file into the root folder and call it with the maximum number of levels in the hierarchy. For example, use “batch.bat 4” for a four-level hierarchy. (If you look closely, you’ll see that a value of “2” is really all that is needed to process a four-level hierarchy. That’s the N-2 executions of the “move” command that I mentioned in an earlier comment. Since the two extra executions do nothing, it is easiest just to remember to supply the total number of levels rather than following the N-2 rule.)

      • Rick says:

        Well, I fiddled with my batch file – exactly like your test case, and the darn thing just doesn’t work. Suspecting a rights issue, I ran the batch file as admin. Still didn’t work. Clueless at this point …

      • mlichtenberg says:

        Sorry to hear that you are still having problems. I’m not sure what else to suggest… you mentioned Windows 7, which is the same platform I am using. I created the test files/folders and batch file using a “regular” (non-admin) command prompt, and ran the batch file the same way. So I’m not sure what the problem could be.

        Let me know if you can (or want to) provide more specifics about your situation; we can continue the conversation via email if you prefer.

      • Howdy and thanks for your offer. Unless you prefer, I’m fine using the blog as a medium. I’ve had to backburner this for a bit, catching up on other stuff. I’ll shoot you a line when I have a chance. Thanks again.

  10. Patti says:

    Just wanted to drop a line and say – WOW, did I NEED this! Thank you! I have been struggling with robocopy, but with 1000’s of sub-folders, I was slowly losing my mind. Many thanks.

  11. Mike says:

    Thank you, I have been struggling to flatten a directory structure with 300k files (unique names) in 300k folders. Everything else I tried ended up crashing.
