Program to extract dates and types of a list of files

Python Practice for Intermediates


This Python practice for intermediates consists of writing a script that counts how many files of each type a list contains, extracts the distinct dates from the file names, and prints both. It is a Python project to practice strings along with lists.

The "filelists" list contains the names of several files from a directory on your laptop.
The names of the files always follow the same pattern: name + “_” + date (in YYYYMMDD form) + “.” + file extension:

filelists = ['salescars_20240331.csv', 'bookingshotels_20231209.csv', 'visitorsmuseumscities_20240501.xlsx',
             'helathcareassistants_20231209.csv', 'regulatorylaws_20240615.json',
             'unknown_20220914.csv',
             'iris-datascience_20240218.csv', 'universityprofessors_20240731.xlsx',
             'studentsgrades_20240501.json', 'residentscountrieseurope_20231209.xlsx',
             'carregistrations_20231231.xlsx', 'cellphonesusers_20231117.xlsx',
             'surveyresponses_20240218.xlsx', 'pollresults_20240615.csv',
             'defaultratings_20240423.csv', 'trafficfigures_20240104.xlsx',
             'hairstudy_20240806.xlsx', 'electricityconsumption_20231027.xlsx',
             'gasinventories_20240501.csv',
             'quartersales_20231001.csv']
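
To see how the naming pattern breaks down, it can help to take one name from the list and split it by hand before writing the full program. The snippet below is only an illustrative sketch; the variable names "sample", "name", "rest", "date" and "extension" are made up for this example and are not part of the exercise.

```python
# Illustrative only: take one file name from the list and split it by hand.
sample = "salescars_20240331.csv"

# Split at the underscore: the name comes first, the rest holds date and extension.
name, rest = sample.split("_")        # "salescars", "20240331.csv"

# Split the rest at the dot: date first, file extension second.
date, extension = rest.split(".")     # "20240331", "csv"

print(name, date, extension)
```

This works because every name in the list contains exactly one underscore and exactly one dot. A name with extra underscores or dots would need a different approach.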

Your task is to create a Python program that goes through the list, shows the distinct dates included in the file names, and counts how many CSV (csv), Excel (xlsx), and JSON (json) files the list contains.

Guidelines

  1. Create an empty dictionary, "file_type_count", to store the count of each file type.
  2. Create an empty set, "distinct_dates", to store the distinct dates.
  3. Iterate over each file in the "filelists" list:
    • Split the file name using the underscore (_) as the separator and store the result in "file_parts".
    • Extract the file type by splitting the second part of "file_parts" using the dot (.) as the separator and taking the piece after the dot.
    • Extract the date by taking the piece before the dot from the second part of "file_parts".
    • Increment the count of the file type in the "file_type_count" dictionary.
    • Add the date to the "distinct_dates" set.
  4. Print the number of each type of file:
    • Iterate over the items in the "file_type_count" dictionary.
    • Print the file type and its count.
  5. Print the distinct dates:
    • Iterate over the dates in the "distinct_dates" set and display the results.
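
The counting in step 3 can be tried on a tiny made-up list before tackling the real one. The sketch below assumes the dictionary's get() method is used to supply a default of 0, which is one common way to do it; the names "extensions" and "counts" are invented for this example.

```python
# Toy data, not the exercise list: count how often each extension appears.
extensions = ["csv", "xlsx", "csv", "json", "csv"]

counts = {}
for ext in extensions:
    # get() returns 0 the first time an extension is seen,
    # so no separate "is the key already there?" check is needed.
    counts[ext] = counts.get(ext, 0) + 1

print(counts)  # {'csv': 3, 'xlsx': 1, 'json': 1}
```

A set handles the distinct dates in a similar way: adding a value that is already present simply leaves the set unchanged.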

 

Solutions


filelists = ['salescars_20240331.csv', 'bookingshotels_20231209.csv', 'visitorsmuseumscities_20240501.xlsx',
             'helathcareassistants_20231209.csv', 'regulatorylaws_20240615.json',
             'unknown_20220914.csv',
             'irisdatascience_20240218.csv', 'universityprofessors_20240731.xlsx',
             'studentsgrades_20240501.json', 'residentscountrieseurope_20231209.xlsx',
             'carregistrations_20231231.xlsx', 'cellphonesusers_20231117.xlsx',
             'surveyresponses_20240218.xlsx', 'pollresults_20240615.csv',
             'defaultratings_20240423.csv', 'trafficfigures_20240104.xlsx',
             'hairstudy_20240806.xlsx', 'electricityconsumption_20231027.xlsx',
             'gasinventories_20240501.csv',
             'quartersales_20231001.csv']

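# Dictionary for the count of each file type and set for the distinct dates
file_type_count = {}
distinct_dates = set()

for file in filelists:
    # "salescars_20240331.csv" -> ["salescars", "20240331.csv"]
    file_parts = file.split("_")
    # "20240331.csv" -> date "20240331" and file type "csv"
    date, file_type = file_parts[1].split(".")

    # get() returns 0 the first time a file type is seen
    file_type_count[file_type] = file_type_count.get(file_type, 0) + 1
    distinct_dates.add(date)

# Print the number of files of each type
for file_type, count in file_type_count.items():
    print(f"{file_type}: {count}")

# Print the distinct dates (a set has no guaranteed order)
for date in distinct_dates:
    print(date)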

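Expected Output

The three file-type counts appear as shown. The 14 dates come from a set, so the order in which they are printed may differ from run to run.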

csv: 9
xlsx: 9
json: 2
20231117
20240806
20240423
20240615
20231001
20240331
20240731
20231209
20240501
20231231
20240218
20220914
20231027
20240104
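
Because a set has no defined order, the dates above are not printed chronologically. One possible variant, sketched here under a few assumptions, sorts the dates and converts them to real date objects. It uses collections.Counter and datetime.strptime from the standard library; the function name "summarize" and the short "sample" list are hypothetical and exist only for this illustration.

```python
from collections import Counter
from datetime import datetime


def summarize(filenames):
    """Hypothetical helper: return (extension counts, sorted distinct dates)."""
    counts = Counter()
    dates = set()
    for filename in filenames:
        # rpartition splits at the LAST dot, then at the LAST underscore,
        # so extra underscores earlier in the name would not break it.
        stem, _, extension = filename.rpartition(".")
        _, _, date_text = stem.rpartition("_")
        counts[extension] += 1
        # Assumes the date is always written as YYYYMMDD.
        dates.add(datetime.strptime(date_text, "%Y%m%d").date())
    return counts, sorted(dates)


# A short sample taken from the exercise list.
sample = ["salescars_20240331.csv",
          "bookingshotels_20231209.csv",
          "regulatorylaws_20240615.json"]

counts, dates = summarize(sample)
print(dict(counts))
for d in dates:
    print(d.isoformat())
```

Parsing the text into date objects also catches malformed dates early, since strptime raises a ValueError when the text does not match the expected format.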
