Closed
Description
I'm working with Microsoft SQL Server 2019. When I read a datetime column with read_sql and return_type='pandas', the values arrive in the pandas DataFrame correctly, and converting that DataFrame to an Arrow table keeps them intact. Loading directly into an Arrow table with return_type='arrow', however, truncates the datetime values to midnight: the column comes back as date64[ms] instead of a timestamp type. I'd like to drop the pandas dependency and load directly into an Arrow table. Is there a way to do this without truncating the datetimes?
Example Code:
import connectorx as cx
import pyarrow as pa
con_string = 'mssql://user:[email protected]%5CG:1439/database'
print('------- Pandas Table -------')
query = 'SELECT top 5 Datum FROM Termine WHERE Datum>getdate()'
pandas_table = cx.read_sql(con_string, query, return_type='pandas')
print(pandas_table)
print('------- Arrow table from pandas -------')
arrow_table_from_pandas = pa.Table.from_pandas(pandas_table)
print(arrow_table_from_pandas)
print('------- Arrow Table -------')
arrow_table = cx.read_sql(con_string, query, return_type='arrow')
print(arrow_table)
Example Output:
------- Pandas Table -------
Datum
0 2022-02-06 07:30:00
1 2022-02-07 00:00:00
2 2022-02-07 00:00:00
3 2022-02-07 07:00:00
4 2022-02-07 07:30:00
------- Arrow table from pandas -------
pyarrow.Table
Datum: timestamp[ns]
----
Datum: [[2022-02-06 07:30:00.000000000,2022-02-07 00:00:00.000000000,2022-02-07 00:00:00.000000000,2022-02-07 07:00:00.000000000,2022-02-07 07:30:00.000000000]]
------- Arrow Table -------
pyarrow.Table
Datum: date64[ms]
----
Datum: [[2022-02-06,2022-02-07,2022-02-07,2022-02-07,2022-02-07]]
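To illustrate why the date64[ms] column above shows only the day: as far as I understand the Arrow format, date64 holds milliseconds since the UNIX epoch and is meant to contain whole days only, so it has no room for a time of day. The sketch below uses only the standard library and mimics that with plain integers. The helper names (to_epoch_ms, truncate_to_day, from_epoch_ms) are made up for this illustration; this is not connectorx's actual conversion code, which I have not inspected.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
MS_PER_DAY = 86_400_000  # a date64 value is expected to be a multiple of this


def to_epoch_ms(dt: datetime) -> int:
    """Milliseconds since the UNIX epoch for a naive datetime (hypothetical helper)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def truncate_to_day(epoch_ms: int) -> int:
    """Drop the intraday part, leaving a whole number of days in milliseconds."""
    return epoch_ms - epoch_ms % MS_PER_DAY


def from_epoch_ms(epoch_ms: int) -> datetime:
    """Inverse of to_epoch_ms (hypothetical helper)."""
    return EPOCH + timedelta(milliseconds=epoch_ms)


# First row of the pandas output above
original = datetime(2022, 2, 6, 7, 30)

as_timestamp_ms = to_epoch_ms(original)       # what a timestamp[ms] column would hold
as_date_ms = truncate_to_day(as_timestamp_ms)  # what a whole-day date64 value would hold

print(from_epoch_ms(as_timestamp_ms))  # time of day survives
print(from_epoch_ms(as_date_ms))       # time of day is gone, midnight remains
```

If that reading is right, the time is lost because of the column type chosen for the Arrow result, not because of how the table is printed, so the result would need to come back as a timestamp type for the time to survive.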