
Python Practice for Intermediates
Program to extract dates and types of a list of files
This Python practice for intermediates consists of writing a script that counts how many files of each type there are, extracts the distinct dates from the file names, and prints the results. It is a Python project for practicing strings along with lists.
The "filelists" list contains the names of several files from a directory on your laptop.
The names of the files always follow the same pattern: name + "_" + date + "." + file extension:
filelists = ['salescars_20240331.csv', 'bookingshotels_20231209.csv', 'visitorsmuseumscities_20240501.xlsx',
             'helathcareassistants_20231209.csv', 'regulatorylaws_20240615.json',
             'unknown_20220914.csv',
             'iris-datascience_20240218.csv', 'universityprofessors_20240731.xlsx',
             'studentsgrades_20240501.json', 'residentscountrieseurope_20231209.xlsx',
             'carregistrations_20231231.xlsx', 'cellphonesusers_20231117.xlsx',
             'surveyresponses_20240218.xlsx', 'pollresults_20240615.csv',
             'defaultratings_20240423.csv', 'trafficfigures_20240104.xlsx',
             'hairstudy_20240806.xlsx', 'electricityconsumption_20231027.xlsx',
             'gasinventories_20240501.csv',
             'quartersales_20231001.csv']
Your task is to create a Python program that goes through the list, shows the distinct dates included in the file names, and counts how many CSV, Excel (xlsx), and JSON files there are in the list.
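To make the naming pattern concrete, here is a minimal sketch that takes apart a single file name from the list. It assumes the name contains exactly one underscore and one dot, as the pattern above states; the variable names are only illustrative.

```python
# One file name taken from the list above
file = "salescars_20240331.csv"

# Split at the underscore: the part before it is the name,
# the part after it still holds "date.extension"
name, rest = file.split("_")

# Split the remainder at the dot to separate the date from the extension
date, file_type = rest.split(".")

print(name)       # salescars
print(date)       # 20240331
print(file_type)  # csv
```

The unpacking into two variables only works because each split produces exactly two parts; a name with an extra underscore or dot would raise a ValueError here.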
Guidelines
- Create an empty dictionary to store the count of each file type.
- Create an empty set to store the distinct dates.
- Iterate over each file in the "filelists" list:
  - Split the file name using the underscore (_) as the separator.
  - Extract the file type by splitting the second part of "file_parts" using the dot (.) as the separator.
  - Extract the date by removing the file type from the second part of "file_parts".
  - Increment the count of the file type in the "file_type_count" dictionary.
  - Add the date to the "distinct_dates" set.
- Print the number of each type of file:
  - Iterate over the items in the "file_type_count" dictionary.
  - Print the file type and its count.
- Print the distinct dates:
  - Iterate over the dates in the "distinct_dates" set and display the results.
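Two of the steps above, counting with a dictionary and collecting distinct values in a set, can be sketched on toy data that is unrelated to the exercise. The list "words" and the names "counts" and "seen" are made up for this illustration.

```python
# Toy data, not part of the exercise
words = ["a", "b", "a", "c", "a"]

counts = {}   # maps each value to how many times it appeared
seen = set()  # keeps each value only once

for w in words:
    # get() returns 0 the first time a key is seen, so no KeyError occurs
    counts[w] = counts.get(w, 0) + 1
    # adding a value that is already in the set has no effect
    seen.add(w)

print(counts)        # {'a': 3, 'b': 1, 'c': 1}
print(sorted(seen))  # ['a', 'b', 'c']
```

The set is sorted before printing because a set on its own does not guarantee any particular order.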
Solutions
filelists = ['salescars_20240331.csv', 'bookingshotels_20231209.csv', 'visitorsmuseumscities_20240501.xlsx',
             'helathcareassistants_20231209.csv', 'regulatorylaws_20240615.json', 'unknown_20220914.csv',
             'irisdatascience_20240218.csv', 'universityprofessors_20240731.xlsx',
             'studentsgrades_20240501.json', 'residentscountrieseurope_20231209.xlsx',
             'carregistrations_20231231.xlsx', 'cellphonesusers_20231117.xlsx',
             'surveyresponses_20240218.xlsx', 'pollresults_20240615.csv', 'defaultratings_20240423.csv',
             'trafficfigures_20240104.xlsx', 'hairstudy_20240806.xlsx', 'electricityconsumption_20231027.xlsx',
             'gasinventories_20240501.csv', 'quartersales_20231001.csv']

file_type_count = {}      # file extension -> number of files
distinct_dates = set()    # each date is stored only once

for file in filelists:
    # "name_date.ext" -> ["name", "date.ext"]
    file_parts = file.split("_")
    # "date.ext" -> extension is the part after the dot, date the part before it
    file_type = file_parts[1].split(".")[1]
    date = file_parts[1].split(".")[0]
    file_type_count[file_type] = file_type_count.get(file_type, 0) + 1
    distinct_dates.add(date)

# Print the number of files of each type
for file_type, count in file_type_count.items():
    print(f"{file_type}: {count}")

# Print the distinct dates
for date in distinct_dates:
    print(date)
Expected Output
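csv: 9
xlsx: 9
json: 2
20231117
20240806
20240423
20240615
20231001
20240331
20240731
20231209
20240501
20231231
20240218
20220914
20231027
20240104
The dates may appear in a different order when you run the script, because a Python set does not guarantee any ordering.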
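As a variation on the solution, the sketch below uses collections.Counter for the counting and sorted() to print the dates in a stable order. This is one possible alternative, not part of the original exercise: the function name "summarize" and the three-item sample list are hypothetical, and rpartition is swapped in for split so that a name containing an extra underscore is still handled.

```python
from collections import Counter


def summarize(files):
    """Return (Counter of file extensions, sorted list of distinct dates)."""
    type_count = Counter()
    dates = set()
    for file in files:
        # rpartition splits at the LAST occurrence of the separator,
        # so extra "_" or "." characters inside the name do not break it
        stem, _, file_type = file.rpartition(".")
        _, _, date = stem.rpartition("_")
        type_count[file_type] += 1
        dates.add(date)
    # Dates written as YYYYMMDD sort chronologically when sorted as strings
    return type_count, sorted(dates)


# Small hypothetical sample that follows the same naming pattern
sample = ["sales_cars_20240331.csv",
          "bookingshotels_20231209.csv",
          "regulatorylaws_20240615.json"]

counts, dates = summarize(sample)
print(dict(counts))  # {'csv': 2, 'json': 1}
print(dates)         # ['20231209', '20240331', '20240615']
```

Sorting the dates makes the output reproducible from run to run, which is convenient when comparing against an expected result.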