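[2023-04-04] Pandas 2.0 has been released.
Hello, this is 클스.
Pandas is widely used for data analysis in Python, and it has a broad ecosystem.
It is also updated continuously. Polars came out for fast processing of large data, but the Pandas ecosystem
is still the larger one, so you basically need to know Pandas.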
Visualization libraries such as plotly lean on it too: at the time of writing, plotly.express will not even import unless pandas is installed.
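Polars also provides compatibility functions such as to_pandas.
The weak points of Pandas, though, were large-data handling, memory use, and speed. Polars appeared to address exactly these.
But Pandas is not just sitting idle... In 2.0, PyArrow support was brought in to improve speed.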
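Note) Installing 2.0 into an existing environment uninstalls Pandas 1.5.3, so I recommend creating a separate virtual env and installing it there.
Preparing the virtual environment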
(py3.11.2) [~/quasar]$ pyenv virtualenv 3.11.2 pd20
(py3.11.2) [~/quasar]$ pyenv local pd20
(py3.11.2) [~/quasar]$ pyenv global pd20
(pd20) [~]$ pyenv activate pd20
(pd20) [~/quasar]$ pip install --upgrade setuptools
(pd20) [~/quasar]$ pip install "pandas[all]>=2.0.0"
Installing
polars : https://pola-rs.github.io/polars-book/user-guide/introduction.html
https://towardsdatascience.com/pandas-vs-polars-a-syntax-and-speed-comparison-5aa54e27497e
https://levelup.gitconnected.com/pandas-vs-polars-vs-pandas-2-0-fight-7398055372fb
(py3.11.2) [~/quasar]$ pip install "pandas>=2.0.0"
Collecting pandas>=2.0.0
Using cached pandas-2.0.0-cp311-cp311-macosx_10_9_x86_64.whl (11.6 MB)
Requirement already satisfied: python-dateutil>=2.8.2 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas>=2.0.0) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas>=2.0.0) (2022.7.1)
Requirement already satisfied: tzdata>=2022.1 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas>=2.0.0) (2022.7)
Requirement already satisfied: numpy>=1.21.0 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas>=2.0.0) (1.24.2)
Requirement already satisfied: six>=1.5 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas>=2.0.0) (1.16.0)
Installing collected packages: pandas
Attempting uninstall: pandas
Found existing installation: pandas 1.5.3
Uninstalling pandas-1.5.3:
Successfully uninstalled pandas-1.5.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
streamlit 1.20.0 requires pandas<2,>=0.25, but you have pandas 2.0.0 which is incompatible.
datasets 1.14.0 requires huggingface-hub<0.1.0,>=0.0.19, but you have huggingface-hub 0.13.3 which is incompatible.
Successfully installed pandas-2.0.0
It installed fine~ You can ignore the parts shown in blue.
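For anyone curious why pip complains about streamlit, here is a rough sketch of the pin it quotes (pandas<2,>=0.25), written with plain tuple comparison. The helper names are hypothetical, and it assumes purely numeric version strings; it is not how pip itself resolves versions.

```python
def parse_release(version_string):
    """Turn '2.0.0' into (2, 0, 0); assumes purely numeric components."""
    return tuple(int(part) for part in version_string.split("."))


def satisfies_streamlit_pin(pandas_version):
    """Check the pin quoted in pip's message: pandas<2,>=0.25."""
    release = parse_release(pandas_version)
    return parse_release("0.25") <= release < parse_release("2")


for candidate in ("1.5.3", "2.0.0"):
    print(candidate, satisfies_streamlit_pin(candidate))
```

So the message only matters if streamlit is actually used in the same environment as pandas 2.0.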
(py3.11.2) [~/quasar]$ pip freeze | grep pandas
pandas==2.0.0
(py3.11.2) [~/quasar]$
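The same check can be done from Python instead of pip freeze. This is a small sketch using only the standard library's importlib.metadata; the function names are my own, and it simply reports None when pandas is not installed.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.0'."""
    return int(version_string.split(".")[0])


def installed_pandas_version():
    """Return the installed pandas version string, or None if it is absent."""
    try:
        return metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return None


found = installed_pandas_version()
if found is None:
    print("pandas is not installed in this environment")
else:
    print(f"pandas {found} (2.x or later: {major_version(found) >= 2})")
```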
Installation problems
pip install "pandas>=2.0.0" installs fine, but adding the performance extra as shown in the official docs produces the error below.
(py3.11.2) [~/quasar]$ pip install "pandas[performance]>=2.0.0"
Requirement already satisfied: pandas[performance]>=2.0.0 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (2.0.0)
Requirement already satisfied: python-dateutil>=2.8.2 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas[performance]>=2.0.0) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas[performance]>=2.0.0) (2022.7.1)
Requirement already satisfied: tzdata>=2022.1 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas[performance]>=2.0.0) (2022.7)
Requirement already satisfied: numpy>=1.21.0 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas[performance]>=2.0.0) (1.24.2)
Requirement already satisfied: bottleneck>=1.3.2 in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (from pandas[performance]>=2.0.0) (1.3.7)
Collecting numba>=0.53.1
Using cached numba-0.56.4.tar.gz (2.4 MB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/private/var/folders/jp/p8jz6zy904b_8ygt30z_cxgw0000gp/T/pip-install-ss86xpj5/numba_76b310847f61409091ec20307360fc52/setup.py", line 51, in <module>
_guard_py_ver()
File "/private/var/folders/jp/p8jz6zy904b_8ygt30z_cxgw0000gp/T/pip-install-ss86xpj5/numba_76b310847f61409091ec20307360fc52/setup.py", line 48, in _guard_py_ver
raise RuntimeError(msg.format(cur_py, min_py, max_py))
RuntimeError: Cannot install on Python version 3.11.2; only versions >=3.7,<3.11 are supported.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
(py3.11.2) [~/quasar]$
(py3.11.2) [~/quasar]$ pip install --upgrade setuptools
Requirement already satisfied: setuptools in /Users/keunsookim/.pyenv/versions/3.11.2/envs/py3.11.2/lib/python3.11/site-packages (65.5.0)
Collecting setuptools
Downloading setuptools-67.6.1-py3-none-any.whl (1.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 7.6 MB/s eta 0:00:00
Installing collected packages: setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 65.5.0
Uninstalling setuptools-65.5.0:
Successfully uninstalled setuptools-65.5.0
Successfully installed setuptools-67.6.1
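Hmm, the error keeps occurring in the existing environment~ As the traceback shows, numba 0.56.4 refuses to install on Python 3.11.2 (it supports only >=3.7,<3.11), so upgrading setuptools does not help.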
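The RuntimeError in the log above comes down to a version-range check. Here is a minimal sketch of that check, using the range printed in the message (>=3.7,<3.11); the function name is hypothetical and this is not numba's actual setup.py code.

```python
import sys


def numba_0564_supports(python_version):
    """Apply the range from the RuntimeError message: >=3.7,<3.11."""
    major_minor = tuple(python_version[:2])
    return (3, 7) <= major_minor < (3, 11)


# The interpreter from the log above falls outside the supported range.
print(numba_0564_supports((3, 11, 2)))
# The interpreter running this snippet.
print(numba_0564_supports(sys.version_info))
```

In other words, the performance extra would need either an older Python or a numba release that supports 3.11.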
What's new : https://pandas.pydata.org/docs/whatsnew/v2.0.0.html