[pandas] Sort
Road to Data Analyst/Python 2023. 1. 18. 09:50
In pandas, there are two basic ways to sort: (1) sort by index and (2) sort by value.
1) Sort by index
import pandas as pd

df = pd.DataFrame({'a': [2,3,2,7,4], 'b': [2,1,3,5,3], 'c': [1,1,2,3,5]})

# ascending
print(df.sort_index())

# descending
print(df.sort_index(ascending=False))

To reset the index so that it follows the new sorted order:

df.sort_index(ascending=False, inplace=True)

# (1)
print(df.reset_index())

# (2)
print(df.reset_index(drop=True))
(1) reset_index() creates a new default index (0, 1, 2, ...). The original index is not lost; it is kept as a new column named 'index'.
(2) With drop=True, the original index is discarded and fully replaced by the new default index.
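The difference between (1) and (2) can be checked directly. This is a minimal sketch that assumes the same example DataFrame as above; the names df_desc, kept, and dropped are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [2, 3, 2, 7, 4], 'b': [2, 1, 3, 5, 3], 'c': [1, 1, 2, 3, 5]})

# Sort by index in descending order: the index labels become 4, 3, 2, 1, 0
df_desc = df.sort_index(ascending=False)

# (1) The old index is kept as a regular column named 'index'
kept = df_desc.reset_index()

# (2) The old index is discarded
dropped = df_desc.reset_index(drop=True)

print(kept.columns.tolist())     # ['index', 'a', 'b', 'c']
print(kept['index'].tolist())    # [4, 3, 2, 1, 0]
print(dropped.columns.tolist())  # ['a', 'b', 'c']
print(dropped.index.tolist())    # [0, 1, 2, 3, 4]
```

Neither call modifies df_desc here, because inplace=True is not passed to reset_index().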
2) Sort by value
df = pd.DataFrame({'a': [2,3,2,7,4], 'b': [2,1,3,5,3], 'c': [1,1,2,3,5]})

# sort by value, using one column as the key, e.g., column 'a'
df.sort_values(by=['a'], inplace=True)
print(df)

# descending
# with inplace=True, sort_values() returns None, so sort first and then print
df.sort_values(by=['a'], ascending=False, inplace=True)
print(df)

# sort by value, using several columns as keys
df.sort_values(by=['a', 'b'], inplace=True)
print(df)
When sorting by several columns, it is easy to assume that columns 'a' and 'b' are each sorted in ascending order independently, giving 'a': [2,2,3,4,7] and 'b': [1,2,3,3,5].
That is not what happens. With by=['a', 'b'], pandas first sorts the rows by 'a'. Only when rows share the same value in 'a' (e.g., the two rows where 'a' is 2) are those rows then ordered by 'b'.
Here is the result.
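A minimal sketch that reproduces this result, assuming the same example DataFrame as above (the name result is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [2, 3, 2, 7, 4], 'b': [2, 1, 3, 5, 3], 'c': [1, 1, 2, 3, 5]})

# Sort by 'a' first; 'b' only decides the order among rows with the same 'a'
result = df.sort_values(by=['a', 'b'])
print(result)
#    a  b  c
# 0  2  2  1
# 2  2  3  2
# 1  3  1  1
# 4  4  3  5
# 3  7  5  3
```

Column 'b' comes out as [2, 3, 1, 3, 5], which is not sorted as a whole. It is ordered only within the two rows where 'a' is 2.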
After sorting by value, the index labels are no longer in order, so it is recommended to reset the index.

df.reset_index(drop=True, inplace=True)
print(df)
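One reason resetting helps: after a sort, label-based access (loc) and position-based access (iloc) no longer point at the same row. The sketch below assumes the same example DataFrame; the variable names are illustrative, and the descending two-column sort is chosen only so that the row order is unambiguous.

```python
import pandas as pd

df = pd.DataFrame({'a': [2, 3, 2, 7, 4], 'b': [2, 1, 3, 5, 3], 'c': [1, 1, 2, 3, 5]})
df.sort_values(by=['a', 'b'], ascending=False, inplace=True)

# The old labels travel with their rows
print(df.index.tolist())           # [3, 4, 1, 2, 0]

label_zero = int(df.loc[0, 'a'])   # the row labeled 0, which is now the last row
first_row = int(df.iloc[0]['a'])   # the row at position 0, i.e. the largest 'a'
print(label_zero, first_row)       # 2 7

# After resetting, label 0 and position 0 refer to the same row again
df.reset_index(drop=True, inplace=True)
after_reset = int(df.loc[0, 'a'])
print(after_reset)                 # 7
```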