Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1284 +/- ##
==========================================
- Coverage 97.49% 96.58% -0.92%
==========================================
Files 234 235 +1
Lines 2793 2866 +73
==========================================
+ Hits 2723 2768 +45
- Misses 70 98 +28 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull Request Overview
This PR introduces time span detection functionality to the dateparser library, allowing it to recognize and parse expressions like "past month" and "last week" to return start and end date ranges instead of single dates.
- Adds a new
RETURN_TIME_SPANsetting to enable time span detection - Implements configurable start-of-week and days-in-month settings for customizing span calculations
- Creates comprehensive test coverage for various time span scenarios
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
tests/test_search.py |
Adds comprehensive test cases for time span detection functionality |
docs/settings.rst |
Documents the new time span settings |
docs/introduction.rst |
Adds introductory documentation and examples for time span detection |
dateparser_data/settings.py |
Adds default values for new time span settings |
dateparser/utils/time_spans.py |
New utility module implementing time span detection and date range generation |
dateparser/search/search.py |
Integrates time span detection into the main search functionality |
dateparser/conf.py |
Adds validation rules for new settings |
README.rst |
Updates feature list to mention time span detection |
| end_date = base_date | ||
|
|
||
| if span_type == "month": | ||
| start_date = end_date - relativedelta(days=days_in_month) |
There was a problem hiding this comment.
Using relativedelta(days=days_in_month) for month calculations can lead to unexpected behavior. The relativedelta class's days parameter adds a fixed number of days, which doesn't account for varying month lengths. Consider using relativedelta(months=1) for more accurate month calculations or document this intentional behavior.
| start_date = end_date - relativedelta(days=days_in_month) | |
| start_date = end_date - relativedelta(months=1) |
There was a problem hiding this comment.
Would this make sense?
I wonder if we can get rid of DEFAULT_DAYS_IN_MONTH altogether and be smarter, but no strong opinion.
| start_date = base_date | ||
|
|
||
| if span_type == "month": | ||
| end_date = start_date + relativedelta(days=days_in_month) |
There was a problem hiding this comment.
Similar to line 120, using relativedelta(days=days_in_month) for future month calculations may not behave as expected. Consider using relativedelta(months=1) for month-based calculations.
| end_date = start_date + relativedelta(days=days_in_month) | |
| end_date = start_date + relativedelta(months=1) |
| start_date = end_date - relativedelta(days=days_in_month) | ||
| elif span_type == "week": | ||
| week_start = get_week_start(end_date, start_of_week) | ||
| start_date = week_start - timedelta(days=7) |
There was a problem hiding this comment.
The calculation for 'last week' appears incorrect. For 'last week', the start should be the beginning of the previous week, and the end should be the end of the previous week. Currently, end_date = week_start - timedelta(days=1) makes the end date the day before the current week starts, but start_date = week_start - timedelta(days=7) would be 7 days before the current week starts, creating an 8-day span instead of 7 days.
| elif span_type == "week": | ||
| week_start = get_week_start(end_date, start_of_week) | ||
| start_date = week_start - timedelta(days=7) | ||
| end_date = week_start - timedelta(days=1) |
There was a problem hiding this comment.
The end date calculation for 'last week' should be the end of the previous week, not the day before the current week starts. Consider using end_date = week_start + timedelta(days=6) - timedelta(days=7) or similar logic to ensure a proper 7-day week span.
| end_date = week_start - timedelta(days=1) | |
| end_date = start_date + timedelta(days=6) |
Gallaecio
left a comment
There was a problem hiding this comment.
I also wonder about the return type, i.e. I wonder if it should be new classes, and it maybe each pair should be a single instance (we could implement __iter__ to support something like start, end = *span). No strong opinion either, though; we can always allow a different, more complex output in the future with some new setting.
Close #388 #1270