按大小与计数进行汇总
size
和 count
的区别在于:
df = pd.DataFrame(
{"Name":["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"],
"City":["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
"Val": [4, 3, 3, np.nan, np.nan, 4]})
df
# Output:
# City Name Val
# 0 Seattle Alice 4.0
# 1 Seattle Bob 3.0
# 2 Portland Mallory 3.0
# 3 Seattle Mallory NaN
# 4 Seattle Bob NaN
# 5 Portland Mallory 4.0
df.groupby(["Name", "City"])['Val'].size().reset_index(name='Size')
# Output:
# Name City Size
# 0 Alice Seattle 1
# 1 Bob Seattle 2
# 2 Mallory Portland 2
# 3 Mallory Seattle 1
df.groupby(["Name", "City"])['Val'].count().reset_index(name='Count')
# Output:
# Name City Count
# 0 Alice Seattle 1
# 1 Bob Seattle 1
# 2 Mallory Portland 2
# 3 Mallory Seattle 0