-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
API: to_datetime(ints, unit) give requested unit #63347
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
73c582c
0eda6d9
916da40
f9fd868
a5d7e51
a7955da
b62cc4f
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1312,7 +1312,11 @@ def _try_convert_to_date(self, data: Series) -> Series: | |
| date_units = (self.date_unit,) if self.date_unit else self._STAMP_UNITS | ||
| for date_unit in date_units: | ||
| try: | ||
| return to_datetime(new_data, errors="raise", unit=date_unit) | ||
| # Without this as_unit cast, we would fail to overflow | ||
| # and get much-too-large dates | ||
| return to_datetime(new_data, errors="raise", unit=date_unit).dt.as_unit( | ||
| "ns" | ||
|
Comment on lines
+1315
to
+1318
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not directly understanding that comment.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is inside a block that tries large units and if they overflow then tries smaller units. This PR makes the large units not-overflow in cases where this piece of code expects them to. Without this edit, e.g. |
||
| ) | ||
| except (ValueError, OverflowError, TypeError): | ||
| continue | ||
| return data | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -955,7 +955,7 @@ def test_date_format_frame_raises(self, datetime_frame): | |
| ], | ||
| ) | ||
| def test_date_format_series(self, date, date_unit, datetime_series): | ||
| ts = Series(Timestamp(date).as_unit("ns"), index=datetime_series.index) | ||
| ts = Series(Timestamp(date), index=datetime_series.index) | ||
| ts.iloc[1] = pd.NaT | ||
| ts.iloc[5] = pd.NaT | ||
| if date_unit: | ||
|
|
@@ -964,7 +964,7 @@ def test_date_format_series(self, date, date_unit, datetime_series): | |
| json = ts.to_json(date_format="iso") | ||
|
|
||
| result = read_json(StringIO(json), typ="series") | ||
| expected = ts.copy() | ||
| expected = ts.copy().dt.as_unit("ns") | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not for this PR, but so this is another case where we currently return ns unit but could change to use us by default? |
||
| tm.assert_series_equal(result, expected) | ||
|
|
||
| def test_date_format_series_raises(self, datetime_series): | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -502,7 +502,7 @@ def test_groupby_resample_empty_sum_string( | |
| result = gbrs.sum(min_count=min_count) | ||
|
|
||
| index = pd.MultiIndex( | ||
| levels=[[1, 2, 3], [pd.to_datetime("2000-01-01", unit="ns")]], | ||
| levels=[[1, 2, 3], [pd.to_datetime("2000-01-01", unit="ns").as_unit("ns")]], | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did this PR change that? (that this no longer returns nanoseconds) |
||
| codes=[[0, 1, 2], [0, 0, 0]], | ||
| names=["A", None], | ||
| ) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me wonder about a potential corner case: what if the original actually has values for a lower resolution, like actual microseconds or nanoseconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this needs fixing:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this actually happens on main too. The offending line is