[pandas] add new column / delete column
Road to Data Analyst/Python 2023. 1. 21. 10:37
1. Column
import pandas as pd

df = pd.DataFrame({'a': [1,1,3,4,5], 'b': [2,3,2,3,4], 'c': [3,4,7,6,4]})

# 1. to simply add new column
df['d'] = [1,3,6,4,8]

# 2. add one number in that column (will be filled out with that number)
df['e'] = 1

# +) calculated result can be also created as a new column.
# check datatype first
print(df.dtypes)
df['f'] = df['a'] + df['b'] - df['c']

# delete a column
df.drop(['d', 'e', 'f'], axis=1, inplace=True)
When you delete a column, remember to pass 'axis=1' so that pandas knows the labels refer to columns. The default, 'axis=0', looks for them in the row index instead.
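As a small sketch of the same idea: `drop` also accepts a `columns=` keyword, which avoids having to remember `axis=1`. The example below rebuilds the frame used above and checks that both spellings give the same result. The variable names `by_axis` and `by_keyword` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3, 4, 5],
                   'b': [2, 3, 2, 3, 4],
                   'c': [3, 4, 7, 6, 4]})
df['d'] = [1, 3, 6, 4, 8]
df['e'] = 1

# axis=1 tells drop() that the labels are column names
by_axis = df.drop(['d', 'e'], axis=1)

# columns= says the same thing more explicitly
by_keyword = df.drop(columns=['d', 'e'])

print(list(by_axis.columns))
print(by_axis.equals(by_keyword))

# df itself is unchanged here because inplace=True was not passed
print(list(df.columns))
```

Without `inplace=True`, `drop` returns a new DataFrame, so you need to reassign the result (for example `df = df.drop(columns=['d', 'e'])`) if you want to keep the change.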
2. Row
import pandas as pd

df = pd.DataFrame({'a': [1,1,3,4,5], 'b': [2,3,2,3,4], 'c': [3,4,7,6,4]})

# simply add a new row. DataFrame.append was removed in pandas 2.0, so use pd.concat;
# 'ignore_index=True' renumbers the rows so the new one gets the next index
df = pd.concat([df, pd.DataFrame([{'a': 6, 'b': 7, 'c': 8}])], ignore_index=True)

# add a new row by using loc (6 is not an existing label, so the row is added after the last row)
df.loc[6] = [7,8,9]

# however, if n in loc[n] is an existing label,
# the new data replaces that existing row
df.loc[1] = [7,8,9]
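The difference between adding and overwriting with `loc` can be checked by watching the row count. This is a minimal sketch on a fresh copy of the frame. The `rows_*` variable names are just for illustration, and the `len(df)` trick assumes the default 0..n-1 index.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3, 4, 5],
                   'b': [2, 3, 2, 3, 4],
                   'c': [3, 4, 7, 6, 4]})

rows_before = len(df)

# label 5 does not exist yet, so this adds a new row
df.loc[5] = [6, 7, 8]
rows_after_add = len(df)

# label 1 already exists, so this overwrites that row instead of inserting
df.loc[1] = [7, 8, 9]
rows_after_overwrite = len(df)

# one way to avoid picking a label by hand: use the current row count
# (only safe with the default 0..n-1 index)
df.loc[len(df)] = [9, 9, 9]

print(rows_before, rows_after_add, rows_after_overwrite, len(df))
print(df.loc[1].tolist())
```

The row count grows only when the label is new, so it is worth checking `df.index` before assigning with `loc` if you do not want to lose existing data.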
There are several ways to delete (drop) rows.
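# drop() returns a new DataFrame and leaves df unchanged unless you reassign it,
# so each example below starts from the same df
# (reassigning after every step would raise KeyError for labels that are already gone)

# delete a certain row - similar to a list index, but by label:
# df.drop(1) drops the row labeled 1 (the second row with the default index)
print(df.drop(1))

# delete several rows at a time
print(df.drop([0,1]))

# delete a range of rows at a time
# range(4) covers labels 0 to 3; range(m, n) works for any contiguous run of labels
print(df.drop(range(4)))

# more flexible way: drop the rows that match a condition
print(df.drop(df[df['a'] < 4].index))

# more than one condition
print(df.drop(df[(df['a'] < 3) & (df['c'] == 4)].index))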
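Dropping by condition can also be written the other way round: keep the rows where the condition is false. The sketch below, on a fresh copy of the frame, compares the two approaches. The names `to_remove`, `dropped`, `kept` and `renumbered` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 3, 4, 5],
                   'b': [2, 3, 2, 3, 4],
                   'c': [3, 4, 7, 6, 4]})

# the condition for the rows we want to remove
to_remove = (df['a'] < 3) & (df['c'] == 4)

# approach from the text: look up the matching labels and drop them
dropped = df.drop(df[to_remove].index)

# alternative: keep every row where the condition is False
kept = df[~to_remove]

print(dropped.equals(kept))
print(kept.index.tolist())

# optional: renumber the rows 0..n-1 after removing some
renumbered = kept.reset_index(drop=True)
print(renumbered.index.tolist())
```

Both versions leave gaps in the index, so `reset_index(drop=True)` is useful if later code relies on the labels running from 0 without gaps.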