Login

Please fill in your details to login.





collect files from drive into one folder

This funky Google Apps script searches drive for files and collects them in a folder you specify.
Scenario: There are literally thousands of PDF documents in my Google drive folder and I want to be able to search for them and collect a copy together in a single folder.
Problem: Searching manually in Google drive, especially when you've got lots of results, can be slow as can copying. The other issue is that if I don't complete this all in one go, I can't remember which files I've copied and which I haven't!
Solution: Embedded Google Apps script, of course!

Part of the solution to this is to list the files I've found with their original location into a spreadsheet, so it seems to make perfect sense to use an embedded Google Apps Script. I can also use the spreadsheet to provide the parameters and also dialogue feedback along the way. Perfick.

Here is a link to make a copy of my spreadsheet so you can have a play yourself.

The Spreadsheet


The Google Sheet has two tabs - one for setting up the search and cataloguing the results and the second to hold a list of all the common media/MIMEtypes for the files which may be on drive. You don't have to search using mediatype but it may be easier to use this to narrow down your search, especially if you are looking just for Google drive files which have the mediatypes...

application/vnd.google-apps.audio
application/vnd.google-apps.document (Google Docs)
application/vnd.google-apps.drive-sdk (3rd party shortcut)
application/vnd.google-apps.drawing (Google Drawing)
application/vnd.google-apps.file (Google Drive file)
application/vnd.google-apps.folder (Google Drive folder)
application/vnd.google-apps.form (Google Forms)
application/vnd.google-apps.fusiontable (Google Fusion Tables)
application/vnd.google-apps.map (Google My Maps)
application/vnd.google-apps.photo
application/vnd.google-apps.presentation (Google Slides)
application/vnd.google-apps.script (Google Apps Scripts)
application/vnd.google-apps.shortcut (Shortcut)
application/vnd.google-apps.site (Google Sites)
application/vnd.google-apps.spreadsheet (Google Sheets)
application/vnd.google-apps.unknown
application/vnd.google-apps.video

In fact, I've had to add these vendor specific mediatypes into the list as IANA doesn't seem to list them 🤔...

image

If you wish to use media types, select the category in cell
B1
and then the media type from cell
B2
.

Type the search term in cell
B3
- this can be any part of the title of the file including the file extension. I find that sometimes, searching for partial filenames is a little hit and miss. Sometimes it finds all the files, sometimes some and often none. No idea why this is...

The Google Drive ID for the folder in which you would like to collect the copies of all the files going into cell
B4
. This is easy to find in the URL of the folder you create to collect the files...

image

Simply copy and paste the ID (only the part after https://drive.google.com/drive/folders/) into cell
B4
.

Next, specify a 'Collected prefix' in cell
B5
- a flag which the script will place at the start of the filename of each file it finds after it has copied it into the collection folder. This serves two purposes. If you see a file with this prefix, it means that you already made a copy of it. Secondly, the script ignores files with this prefix when it searches again with the same parameters. As a tip, don't use anything too obvious, or something that might already be in a filename. For instance, I might use something like
x__
or something like that.

If you've got hundreds of files and you'd like to perform the search in batches, then specify a throttling limit in cell
B6
. The maximum number of files (I think) is 1000 so it has to be less than or equal to that. You have to have something in here or there will be no results.

Lastly, specify whether you would like 'Shared with me' results in cell
B7
. Bear in mind that I think you need this to be TRUE for Shared Drives.

Now it's time to run the script!

Click the button 'Search and Collect Files'.
You'll most likely have to authorise the script to access your documents. Choose an account to authorise. Because it's not verified by Google, you'll also get a scary warning...

image

Just continue to 'collect_files' - all is safe, promise. On the next dialogue, just click 'Allow' and we're ready to go. You may have to click 'Search and Collect Files' once more to get the script running.

Firstly, if this is not the first time you've run the script and there are results on the sheet already, you will be warned that they will be deleted. If you are happy, click 'OK' otherwise cancel.

If you've not specified the collection folder ID or the supplied folder ID is invalid, then you will get a warning dialogue. You will *not* be warned if you haven't specified a 'collected prefix' - this is entirely up to you to supply; I would suggest it or else, if you run the script again with the same search, you will collect the same files again. It's up to you. But yeah.

So, if the script finds any files matching your query, it will tell you how many and give you one last chance to back out before it makes the copies. Remember, it's hard (but not impossible) to undo the operation.

So, if you are ready, click 'OK'...

The script will run and populate the results table with every file it finds and copies, including the filename, media type and *original* file location in case it's relevant. For instance, you may have lots of files with the same name stored in different locations.

Finally, if you wish, you can copy the results onto a new sheet in the spreadsheet by clicking
Copy results to new sheet
. The sheet will be named with the date and time so it will never be overwritten in the future.

The Script


I'm particularly pleased with this script. I've used lots of different techniques to achieve this including a little recursive function. I've added comments below the script so as not to clutter up the magic. I've tested it in a 'private' Google Drive environment and in a Workspace environment as well. There were a few tricky bits concerning shared files but I think it's working ok. As always, feedback is welcome.

function collect_files() {

  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var ui = SpreadsheetApp.getUi();

  // Grab settings from spreadsheet
  var mediatype = ss.getRangeByName("list_mediatype").getValue();
  var search = ss.getRangeByName("list_search").getValue();
  var collection_folder_id = ss.getRangeByName("list_collection_folder_id").getValue();
  var collected_prefix = ss.getRangeByName("list_collected_prefix").getValue();
  var limit = ss.getRangeByName("list_limit").getValue();
  var shared_with_me = ss.getRangeByName("list_shared_with_me").getValue();
  var list = ss.getRangeByName("list_results");

  // Clear the results range
  if(!list.isBlank()){
    var result = ui.alert(
      "STOP!",
      "There are still results in this sheet. \
      Continuing with clear the list. Are you ok with that? \
      If you click 'Cancel', you can copy the existing results to a new sheet if you like.",
      ui.ButtonSet.OK_CANCEL
    );
    if(result == ui.Button.CANCEL){
      return;
    }
  }
  list.clear();
  SpreadsheetApp.flush();
  var row = list.getRow(); // Start of the range.

  // Check collection folder exists
  try{
    var collection_folder = DriveApp.getFolderById(collection_folder_id);
  } catch(e){
    ui.alert(
      "Whoops!",
      "Sorry, the collection folder ID seems to be wrong. \
      Check it and try again.",
      ui.ButtonSet.OK
    );
    return;
  }

  // Set up variables for script
  var query = "title contains '" + search + "' " 
            + "and not '" + collection_folder_id + "' in parents "
            + "and trashed=false "
            + "and not mimeType='application/vnd.google-apps.folder'"; // Exclude folders
  if(collected_prefix.length > 1){
    query = query + " and not title contains '" + collected_prefix + "' ";
  }
  if(mediatype.length > 1){
    query = query + " and mimeType = '"+mediatype+"'";
  }

  // Actually perform the search!
  var search = Drive.Files.list( // This will also return items 'shared with me'.
  {
    corpora: "allDrives",
    includeItemsFromAllDrives: true, 
    supportsAllDrives: true, 
    maxResults: 1000, 
    q: query
  }).items;

  // Clean up search results for shared files and limit
  var found = [];
  var i = 0;
  var j = 0;

  while(j <= limit-1 && i <= search.length-1){
    if(search[i].parents.length == 0){ // I think this is a good indicator of shared files
      if(shared_with_me){
        found.push(search[i]);
        j++;
      }
    } else {
      found.push(search[i])
      j++;
    }
    i++;
  }

  // Now do the copy, if there are any results to copy
  if(found.length > 0){
    var result = ui.alert(
      "Continue?",
      "I'm going to copy "+found.length+" item(s). \
      This operation cannot be undone. Is this OK?",
      ui.ButtonSet.OK_CANCEL
    );
    if(result == ui.Button.CANCEL){
      return;
    } else {
      found.forEach(e => {
        var item = DriveApp.getFileById(e.id);
        var drive_mime_type = item.getMimeType();
        if(e.parents.length > 0){ // This file has parents
          var parent = item.getParents().next();
          var parent_folder_name = parent.getName();
          var parent_folder_link = parent.getUrl();
          var parent_folder_id   = parent.getId();
          var parent_folder_tree = get_folder_tree(parent_folder_id,[parent_folder_name])
                                   .reverse()
                                   .join(" > ");
          var parent_folder_cell = '=HYPERLINK("'+parent_folder_link+'","'+parent_folder_tree+'")'
        } else {
          var parent_folder_cell = 'File has no parents (probably shared)';
        }

        var copy = item.makeCopy(collection_folder); // Make a copy and grab the object.

        try{ // Might not be able to rename shared files
          if(copy.getMimeType().startsWith("application/vnd.google-apps.")){ // It's a Google document
            copy.setName(copy.getName().slice(8)); // Remove 'copy of' from Google files
          }
        } catch(error) {
          var shared = true; // Don't need this but have to have something in the catch
        }

        try{ // Might not be able to rename shared files
          item.setName(collected_prefix + item.getName());
        } catch(error) {
          var shared = true; // Don't need this but have to have something in the catch
        }

        var data = [[
          '=HYPERLINK("'+copy.getUrl()+'","'+copy.getName()+'")',
          drive_mime_type,
          parent_folder_cell,
          '=HYPERLINK("'+item.getUrl()+'","'+item.getName()+'")',
        ]]
        ss.getRange("A"+row+":D"+row).setValues(data);
        SpreadsheetApp.flush();
        row += 1;
      });
    }
  } else {
    ui.alert(
      "Nowt!",
      "I can't find anything with those search parameters. \
      Change them and try again if you like.",
      ui.ButtonSet.OK
    );
    return;
  }
  ui.alert(
    "Done!",
    "That's it. Copies of all the files I found are in the collection folder \
    and all the originals are marked with the collection prefix.",
    ui.ButtonSet.OK
  );
  return;
}

// Funky little recursive function to populate the folder tree.
function get_folder_tree(folder_id,tree=[]){
  var folder = DriveApp.getFolderById(folder_id);
  var parent = folder.getParents();
  if(parent.hasNext()){
    var next_parent = parent.next();
    tree.push(next_parent.getName());
    get_folder_tree(next_parent.getId(),tree);
  }
  return tree;
}

// Copy results to new sheet
function copy_results(){
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var ui = SpreadsheetApp.getUi();
  var new_sheet = ss.insertSheet();
  var name = Utilities.formatDate(new Date(), "GMT", "yyyy/MM/dd HH:mm:ss");
  new_sheet.setName(name);
  var results_range = ss.getRangeByName('list_results');
  var results = results_range.getValues();
  results_range.copyTo(new_sheet.getRange(1,1,results.length,results[0].length));
  SpreadsheetApp.flush();
  ui.alert(
    "Done.",
    "Your results have been copied to a new sheet.",
    ui.ButtonSet.OK
  );
}


Prerequisites

DriveAPI
SheetsAPI

Comments

3 : Get the active spreadsheet object so we can manipulate it using it's methods.
4 : Get the UI (user interface) object of the spreadsheet so we can display dialogue boxes and stuff.
7-13 : Grab the settings from the named ranges in the sheet.
16-27 : Check to see whether the results list in the spreadsheet is blank. If it isn't give the user a warning and allow them to back out.
28 : Clear the results range.
29 : Write changes to the spreadsheet.
30 : Get the first row of the results range so we can write the results into the sheet.
33-43 : Check whether the collection folder exists. We try to grab the collection folder object (we need this later anyway) and, if it doesn't exist, it will throw an exception which we 'catch' and give a one way out dialogue box which stops the script.
46-49 : Set up the main part of the query.
50-52 : If the collected prefix is specified, exclude matches from the search results.
53-55 : If the media type is specified, add this to the search query.
58-65 : Perform the search. Notice that
maxResults
is set to 1000 here because we filter the results down to the limit supplied in the next step.
68-83 : Filter the results down for shared and limit.
87-95 : Give the user one last chance to back out.
96-137 : Perform the copy operation.

Last modified: February 26th, 2022
The Computing Café works best in landscape mode.
Rotate your device.
Dismiss Warning