Python Scripts to Delete the Files Regularly

Cleaning up the file system by hand on a regular basis is tedious. Automate it!

Deleting files and folders manually is not as exciting a task as one may think. It makes sense to automate it.

Here comes Python to make our lives easier. Python is an excellent programming language for scripting. We are going to take advantage of Python to finish our task without any obstacle. First, you should know why Python is a good choice.

  • Python is an all-time favorite language for automating tasks
  • Less code compared to other programming languages
  • Python is compatible with all major operating systems. You can run the same code on Windows, Linux, and Mac.
  • Python has a module called os that helps us to interact with the operating system. We are going to use this module to complete our automation of deleting the files.

We can automate any annoying or repetitive system task using Python. Writing a script for a specific system task is a piece of cake if you know Python. Let’s look at the following use cases.

Note: the following scripts are tested on Python 3.6+.

Removing files/folders older than X days

Often you don’t need old logs, and you regularly need to clean them to make storage available. It could be anything and not just logs.

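The os module has a function called stat that returns the details of a file or folder, including the time of last access (st_atime), last modification (st_mtime), and last metadata change (st_ctime). Note that st_ctime is the metadata change time on Unix but the creation time on Windows. All three attributes hold the time in seconds since the epoch. You can find more details about the epoch in the documentation of the time module.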
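To make the three timestamps concrete, here is a minimal sketch that reads them for a throwaway temporary file. The helper name describe_timestamps and the dictionary keys are made up for this illustration; only os.stat and its st_atime, st_mtime, and st_ctime attributes come from the os module.

```python
import os
import tempfile
import time


def describe_timestamps(path):
    """Return the three timestamps of `path` as seconds since the epoch.

    The function name and the dictionary keys are illustrative only.
    """
    info = os.stat(path)
    return {
        "accessed": info.st_atime,  # last access
        "modified": info.st_mtime,  # last content modification
        "changed": info.st_ctime,   # metadata change on Unix, creation on Windows
    }


if __name__ == "__main__":
    # a temporary file keeps the example independent of any real path
    with tempfile.NamedTemporaryFile() as tmp:
        for name, seconds in describe_timestamps(tmp.name).items():
            # time.ctime() converts epoch seconds to a readable local time
            print(f"{name}: {seconds:.0f} -> {time.ctime(seconds)}")
```

The scripts in this article compare st_ctime with a cutoff. If you care about when the content last changed rather than the metadata, st_mtime may be the better attribute to compare.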

We will use a method called os.walk(path) for traversing through the subfolders of a folder.
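As a small illustration of how os.walk traverses a tree, the sketch below collects the full path of every subfolder and file under a starting folder. The helper name list_tree is hypothetical; os.walk itself yields one (folder path, subfolder names, file names) tuple per folder it visits.

```python
import os
import tempfile


def list_tree(path):
    """Collect the full paths of all folders and files below `path`.

    list_tree is an illustrative helper, not part of the os module.
    """
    found_folders, found_files = [], []
    for root_folder, folders, files in os.walk(path):
        # names are relative to root_folder, so join them to get full paths
        found_folders.extend(os.path.join(root_folder, name) for name in folders)
        found_files.extend(os.path.join(root_folder, name) for name in files)
    return found_folders, found_files


if __name__ == "__main__":
    # build a tiny throwaway tree so the example runs anywhere
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "logs", "old"))
        open(os.path.join(top, "logs", "app.log"), "w").close()
        folders, files = list_tree(top)
        print("folders:", folders)
        print("files:", files)
```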

Follow the steps below to write code that deletes files/folders based on their age in days.

  • Import the modules time, os, shutil
  • Set the path and days to the variables
  • Convert the number of days into seconds and subtract the result from the current time, which you get from time.time(). This gives the cutoff time
  • Check whether the path exists or not using the os.path.exists(path) function
  • If the path exists, then get the list of files and folders present in the path, including subfolders. Use os.walk(path), which returns a generator that yields, for every folder it visits, the folder path, the names of its subfolders, and the names of its files
  • Get the path of the file or folder by joining both the current path and file/folder name using the method os.path.join()
  • Get the ctime from the os.stat(path) method using the attribute st_ctime
  • Compare the ctime with the cutoff time we have calculated previously
  • If the ctime is less than or equal to the cutoff, the item is older than the desired number of days, so check whether it is a file or a folder. If it is a file, use os.remove(path); otherwise, use shutil.rmtree(path)
  • If the path doesn’t exist, print a not found message
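The age check at the heart of these steps can be sketched in isolation as follows. The names SECONDS_PER_DAY, cutoff_timestamp, and is_older_than, as well as the optional now parameter, are assumptions introduced for this example so the comparison can be tried without waiting for a file to age.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def cutoff_timestamp(days, now=None):
    """Return the epoch time `days` days before `now` (default: current time)."""
    now = time.time() if now is None else now
    return now - days * SECONDS_PER_DAY


def is_older_than(path, days, now=None):
    """Return True if the ctime of `path` is at or before the cutoff."""
    return os.stat(path).st_ctime <= cutoff_timestamp(days, now)


if __name__ == "__main__":
    # this script file itself serves as a convenient existing path
    print(is_older_than(__file__, days=30))
```

Passing a later value for now, for example time.time() + 31 * SECONDS_PER_DAY, simulates the passage of time, which is handy when testing.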

Let’s see the code in detail.

# importing the required modules
import os
import shutil
import time


# main function
def main():
    # initializing the count
    deleted_folders_count = 0
    deleted_files_count = 0

    # specify the path
    path = '/PATH_TO_DELETE'

    # specify the days
    days = 30

    # computing the cutoff time in seconds since the epoch
    # time.time() returns current time in seconds
    seconds = time.time() - (days * 24 * 60 * 60)

    # checking whether the file is present in path or not
    if not os.path.exists(path):
        # file/folder is not found
        print(f"'{path}' is not found")

    elif os.path.isfile(path):
        # the path is a single file, comparing its age with the cutoff
        if seconds >= get_file_or_folder_age(path):
            if remove_file(path):
                deleted_files_count += 1  # incrementing count

    elif seconds >= get_file_or_folder_age(path):
        # the top-level folder itself is old enough, removing it entirely
        if remove_folder(path):
            deleted_folders_count += 1  # incrementing count

    else:
        # iterating over each and every folder and file in the path
        for root_folder, folders, files in os.walk(path):

            # checking the subfolders of the current folder
            # iterating over a copy because the list is modified below
            for folder in list(folders):
                folder_path = os.path.join(root_folder, folder)

                # comparing with the cutoff
                if seconds >= get_file_or_folder_age(folder_path):
                    if remove_folder(folder_path):
                        deleted_folders_count += 1  # incrementing count
                        # os.walk must not descend into a removed folder
                        folders.remove(folder)

            # checking the current directory files
            for file in files:
                file_path = os.path.join(root_folder, file)

                # comparing with the cutoff
                if seconds >= get_file_or_folder_age(file_path):
                    if remove_file(file_path):
                        deleted_files_count += 1  # incrementing count

    print(f'Total folders deleted: {deleted_folders_count}')
    print(f'Total files deleted: {deleted_files_count}')


def remove_folder(path):
    # removing the folder and everything inside it
    # shutil.rmtree() returns None and raises OSError on failure
    try:
        shutil.rmtree(path)
    except OSError as error:
        # failure message
        print(f'Unable to delete the {path}: {error}')
        return False
    # success message
    print(f'{path} is removed successfully')
    return True


def remove_file(path):
    # removing the file
    # os.remove() returns None and raises OSError on failure
    try:
        os.remove(path)
    except OSError as error:
        # failure message
        print(f'Unable to delete the {path}: {error}')
        return False
    # success message
    print(f'{path} is removed successfully')
    return True


def get_file_or_folder_age(path):
    # getting ctime of the file/folder
    # time will be in seconds since the epoch
    return os.stat(path).st_ctime


if __name__ == '__main__':
    main()

You need to adjust the following two variables in the above code based on the requirement.

days = 30
path = '/PATH_TO_DELETE'

Removing files larger than X GB

Let’s search for the files that are larger than a particular size and delete them. The script is similar to the previous one, but it uses size instead of age as the criterion for deletion. Note that the script takes the size limit in MB, so use 1024 for a limit of 1 GB.
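The size check itself is small enough to try on its own. The sketch below compares the size reported by os.path.getsize (in bytes) with a limit given in MB; the names BYTES_PER_MB and is_at_least are hypothetical and exist only for this example.

```python
import os
import tempfile

BYTES_PER_MB = 1024 * 1024


def is_at_least(path, size_mb):
    """Return True if the file at `path` is `size_mb` megabytes or larger."""
    # os.path.getsize() reports bytes, so convert the limit to bytes first
    return os.path.getsize(path) >= size_mb * BYTES_PER_MB


if __name__ == "__main__":
    # write a throwaway 1 MB file and compare it with two limits
    with tempfile.TemporaryDirectory() as top:
        sample = os.path.join(top, "sample.bin")
        with open(sample, "wb") as handle:
            handle.write(b"\0" * BYTES_PER_MB)
        print(is_at_least(sample, 1))     # limit of 1 MB
        print(is_at_least(sample, 1024))  # limit of 1 GB
```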

# importing the os module
import os


# function that returns size of a file
def get_file_size(path):
    # getting file size in bytes
    return os.path.getsize(path)


# function to delete a file
def remove_file(path):
    # deleting the file
    # os.remove() returns None and raises OSError on failure
    try:
        os.remove(path)
    except OSError as error:
        # error
        print(f'Unable to delete the {path}: {error}')
    else:
        # success
        print(f'{path} is deleted successfully')


def main():
    # specify the path
    path = 'ENTER_PATH_HERE'

    # put max size of file in MBs
    size = 500

    # converting size to bytes
    size = size * 1024 * 1024

    if os.path.isdir(path):
        # traversing through the subfolders
        for root_folder, folders, files in os.walk(path):

            # iterating over the files list
            for file in files:
                # getting file path
                file_path = os.path.join(root_folder, file)

                # checking the file size
                if get_file_size(file_path) >= size:
                    # invoking the remove_file function
                    remove_file(file_path)

    elif os.path.isfile(path):
        # path is not a dir, checking the file directly
        if get_file_size(path) >= size:
            # invoking the remove_file function
            remove_file(path)

    else:
        # path doesn't exist
        print(f"{path} doesn't exist")


if __name__ == '__main__':
    main()

Adjust the following two variables.

path = 'ENTER_PATH_HERE'
size = 500

Removing files with a specific extension

There might be a scenario where you want to delete files by their extension, say all .log files. We can find the extension of a file using the os.path.splitext(path) function. It returns a tuple containing the path without the extension and the extension itself, including the leading dot.
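A quick sketch of what os.path.splitext returns for a few typical names may help here. The helper has_extension is a hypothetical wrapper written for this example; it does the same exact, case-sensitive comparison as the script that follows.

```python
import os


def has_extension(path, extension):
    """Return True if `path` ends with exactly `extension` (e.g. '.log')."""
    # index 1 of the tuple is the extension, including the leading dot
    return os.path.splitext(path)[1] == extension


if __name__ == "__main__":
    print(os.path.splitext("app.log"))         # ('app', '.log')
    print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
    print(os.path.splitext(".bashrc"))         # ('.bashrc', '')
    print(has_extension("app.log", ".log"))    # True
    print(has_extension("app.log.gz", ".log")) # False
```

Only the last suffix counts as the extension, and a name that starts with a dot and has no other dot is treated as having no extension at all.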

# importing os module
import os


# main function
def main():
    # specify the path
    path = 'PATH_TO_LOOK_FOR'

    # specify the extension
    extension = '.log'

    # checking whether the path exists or not
    if os.path.exists(path):

        # check whether the path is directory or not
        if os.path.isdir(path):

            # iterating through the subfolders
            for root_folder, folders, files in os.walk(path):

                # checking of the files
                for file in files:
                    # file path
                    file_path = os.path.join(root_folder, file)

                    # extracting the extension from the filename
                    file_extension = os.path.splitext(file_path)[1]

                    # checking the file_extension
                    if extension == file_extension:
                        # deleting the file
                        # os.remove() returns None and raises OSError on failure
                        try:
                            os.remove(file_path)
                        except OSError as error:
                            # failure message
                            print(f'Unable to delete the {file_path}: {error}')
                        else:
                            # success message
                            print(f'{file_path} deleted successfully')

        else:
            # path is not a directory
            print(f'{path} is not a directory')

    else:
        # path doesn't exist
        print(f"{path} doesn't exist")


if __name__ == '__main__':
    # invoking main function
    main()

Don’t forget to update the path and extension variables in the above code to meet your requirements.

I would suggest testing the scripts in a non-production environment first. Once you are satisfied with the results, you can schedule them with a job scheduler such as cron (if using Linux) to run periodically for maintenance work. Python is great for this kind of task.
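One cautious way to try any of these scripts is a dry run that only lists what would be deleted. The sketch below is an assumption about how such a check could look, not part of the scripts above: find_candidates and its should_delete parameter are hypothetical names, and the function never removes anything.

```python
import os


def find_candidates(path, should_delete):
    """Yield every file below `path` for which should_delete(file_path) is true.

    Nothing is deleted here; the caller decides what to do with the result.
    """
    for root_folder, _folders, files in os.walk(path):
        for file in files:
            file_path = os.path.join(root_folder, file)
            if should_delete(file_path):
                yield file_path


if __name__ == "__main__":
    # placeholder path, replace it with the folder you want to inspect
    path = 'PATH_TO_LOOK_FOR'

    # example rule: every file with the .log extension
    for candidate in find_candidates(path, lambda p: os.path.splitext(p)[1] == '.log'):
        print(f'Would delete {candidate}')
```

Once the printed list looks right, the same rule can be used with os.remove in place of the print call.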