Data Analysis 2

drink_df['continent'] = drink_df['continent'].fillna('ETC')

fillna: fills missing values (NaN) with a specified value.
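
A minimal sketch of what fillna does, using a made-up three-row frame. toy_df and its values are assumptions for illustration, not the real drinks data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "continent": ["AS", np.nan, "EU"],
})

# Count the missing values before filling.
missing_before = toy_df["continent"].isna().sum()

# Replace every missing continent with the placeholder 'ETC'.
toy_df["continent"] = toy_df["continent"].fillna("ETC")

print(missing_before)
print(toy_df["continent"].tolist())
```

fillna returns a new Series, so the result has to be assigned back to the column, as in the line above.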

plt.pie(the data values, labels=list of labels for the data)

plt.pie(pie_values, labels=pie_labels, autopct='%.02f%%')

plt.title('Percentage of each continent')
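
The notes do not show where pie_values and pie_labels come from. One possible way to build them is from value_counts(). This is only a sketch: toy_df is made-up data and the value_counts() step is an assumption.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data standing in for drink_df.
toy_df = pd.DataFrame({"continent": ["AF", "AF", "AF", "EU", "EU", "AS"]})

# Count rows per continent: the index holds the labels, the values hold the counts.
counts = toy_df["continent"].value_counts()
pie_labels = counts.index.tolist()
pie_values = counts.tolist()

# autopct formats each wedge's share as a percentage with two decimals.
plt.pie(pie_values, labels=pie_labels, autopct="%.02f%%")
plt.title("Percentage of each continent")
```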

name.groupby('column to group by')['column to examine'].statistic_function()

drink_df.groupby('continent')['beer_servings'].mean()

drink_df.groupby('continent')['wine_servings'].describe()
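
A small sketch of the groupby pattern on made-up numbers, so the result can be checked by hand. toy_df is hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU"],
    "beer_servings": [10, 30, 100, 200],
})

# Group rows by continent, select beer_servings, then average within each group.
beer_mean = toy_df.groupby("continent")["beer_servings"].mean()
print(beer_mean)

# describe() returns count, mean, std, min, quartiles, and max for each group.
beer_summary = toy_df.groupby("continent")["beer_servings"].describe()
print(beer_summary)
```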

# Find the continents whose mean alcohol consumption is at or above the overall mean.

total_mean = drink_df.total_litres_of_pure_alcohol.mean()

continent_mean = drink_df.groupby('continent')['total_litres_of_pure_alcohol'].mean()

continent_over_mean = continent_mean[continent_mean >= total_mean]

print(continent_over_mean)
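
The filtering step works by boolean indexing: comparing a Series with a number gives a True/False Series, which then selects rows. A sketch with made-up values (toy_df is hypothetical):

```python
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU", "AF"],
    "total_litres_of_pure_alcohol": [1.0, 3.0, 9.0, 11.0, 1.0],
})

# Mean over all rows: (1 + 3 + 9 + 11 + 1) / 5 = 5.0
total_mean = toy_df["total_litres_of_pure_alcohol"].mean()

# Mean per continent: AF 1.0, AS 2.0, EU 10.0
continent_mean = toy_df.groupby("continent")["total_litres_of_pure_alcohol"].mean()

# Boolean mask, True where the continent mean is at or above the overall mean.
mask = continent_mean >= total_mean
continent_over_mean = continent_mean[mask]
print(continent_over_mean)
```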

# Find the continent with the highest mean wine_servings.

wine_continent = drink_df.groupby('continent').wine_servings.mean().idxmax()

print(wine_continent)
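
idxmax() returns the index label of the largest value, not the value itself. A sketch on made-up data (toy_df is hypothetical):

```python
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU"],
    "wine_servings": [5, 15, 200, 100],
})

wine_mean = toy_df.groupby("continent")["wine_servings"].mean()

top_continent = wine_mean.idxmax()  # label of the group with the largest mean
top_value = wine_mean.max()         # the largest mean itself
print(top_continent, top_value)
```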

drink_df.groupby('continent').wine_servings.agg(['mean', 'min', 'max', 'sum'])

# agg() computes several aggregations for each group at once. To explore per-continent statistics for a column such as wine_servings, pass agg() the list of statistics you want.
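
A sketch of agg() with a list of statistic names, on made-up data (toy_df and wine_table are hypothetical). Each name in the list becomes one column of the result.

```python
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU"],
    "wine_servings": [0, 10, 100, 300],
})

# One row per continent, one column per requested statistic.
wine_table = toy_df.groupby("continent")["wine_servings"].agg(["mean", "min", "max", "sum"])
print(wine_table)
```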

beer_servings_table.index.to_list()

As the name suggests, this converts the index to a list and returns it.

beer_servings_table['mean'].to_list()

Likewise, this returns the mean column of beer_servings_table as a list.

continents = beer_servings_table.index.to_list()

values = beer_servings_table['mean'].to_list()

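index = range(len(continents))  # assumed definition (index is not defined in these notes): one x position per continent
plt.bar(index, values, width=0.2, color='g')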

plt.xticks(index, continents)  # set the x-axis tick labels

plt.show()  # not needed in Colab
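
The bar-chart steps can be put together as follows. This is a sketch under assumptions: the notes never show how beer_servings_table or index are created, so the agg() call and the np.arange() positions here are guesses, and toy_df is made-up data.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data, not the real drinks dataset.
toy_df = pd.DataFrame({
    "continent": ["AS", "AS", "EU", "EU", "AF"],
    "beer_servings": [10, 30, 100, 200, 50],
})

# Assumed definition of beer_servings_table (not shown in the notes).
beer_servings_table = toy_df.groupby("continent")["beer_servings"].agg(["mean", "min", "max", "sum"])

continents = beer_servings_table.index.to_list()
values = beer_servings_table["mean"].to_list()

# Assumed definition of index: one numeric x position per continent.
index = np.arange(len(continents))

bars = plt.bar(index, values, width=0.2, color="g")
plt.xticks(index, continents)  # replace the numeric ticks with continent names
plt.ylabel("mean beer_servings")
```

groupby sorts the group keys by default, so the bars appear in alphabetical order of continent.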
