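Contents
[1 Scatter plot plt.scatter()](#1 Scatter plot plt.scatter())
[2 Bar charts](#2 Bar charts)
[2.1 Vertical bar chart plt.bar()](#2.1 Vertical bar chart plt.bar())
[2.2 Horizontal bar chart plt.barh()](#2.2 Horizontal bar chart plt.barh())
[2.3 Grouped bar charts](#2.3 Grouped bar charts)
[2.3.1 Groups along the x-axis](#2.3.1 Groups along the x-axis)
[2.3.2 Groups along the y-axis](#2.3.2 Groups along the y-axis)
[3 Histograms](#3 Histograms)
[3.1 Frequency histogram](#3.1 Frequency histogram)
[3.1.1 Basic usage](#3.1.1 Basic usage)
[3.1.2 Adjusting the bin width](#3.1.2 Adjusting the bin width)
[3.2 Proportion histogram](#3.2 Proportion histogram)
[3.3 Simulating a histogram with a bar chart](#3.3 Simulating a histogram with a bar chart)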
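1 Scatter plot plt.scatter()
- `x_value_list[::3]` takes every third element of `x_value_list`, so only every third tick is labelled.
- March 31 and October 1 would otherwise sit right next to each other on the x-axis, so leave an arbitrary gap of unused x values between them (here 32 to 40).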
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
# A font file with Chinese glyphs, so the Chinese labels render correctly.
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
plt.figure(figsize=(15,6))
march_value = [11,17,16,11,12,11,12,6,6,7,8,9,12,15,14,17,18,21,16,17,20,14,15,15,15,19,21,22,22,22,23]
october_value = [26,26,28,19,21,17,16,19,18,20,20,19,22,23,17,20,21,20,22,15,11,15,5,13,17,10,11,13,12,13,6]
# March occupies x = 1..31 and October x = 41..71; 32..40 stays empty as a gap.
plt.scatter(range(1,32),march_value,label='3月')
plt.scatter(range(41,72),october_value,label='10月')
# Tick positions: every x value except the gap.
x_value_list = []
for i in range(1,72):
    if i > 31 and i < 41:
        continue
    x_value_list.append(i)
# Tick labels for the x-axis (one per position, despite the y_ prefix in the name).
y_value_list = []
for i in range(1,32):
    y_value_list.append('3月'+str(i)+'日')
for i in range(1,32):
    y_value_list.append('10月'+str(i)+'日')
plt.xticks(x_value_list[::3],y_value_list[::3],rotation=45,fontproperties=my_font)
plt.legend(prop=my_font)
plt.xlabel("时间",fontproperties=my_font,size=24)
plt.ylabel("温度",fontproperties=my_font,size=24)
plt.title("三月与十月温度情况",fontproperties=my_font,size=24)
plt.show()
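The tick bookkeeping in the example above can also be written with list comprehensions. This is only a minimal sketch of the same idea; the names `GAP_START`, `GAP_END`, `tick_positions` and `tick_labels` are illustrative and do not appear in the original code.

```python
# March occupies x = 1..31 and October x = 41..71; 32..40 is the empty gap.
GAP_START, GAP_END = 32, 40

# Every x position that carries a data point (the gap is skipped).
tick_positions = [i for i in range(1, 72) if not GAP_START <= i <= GAP_END]

# One label per position: 31 days of March followed by 31 days of October.
tick_labels = [f"3月{d}日" for d in range(1, 32)] + [f"10月{d}日" for d in range(1, 32)]

# Keep every third tick so that neighbouring labels do not overlap.
shown_positions = tick_positions[::3]
shown_labels = tick_labels[::3]

print(len(tick_positions), len(shown_positions))
print(shown_positions[:3], shown_labels[:3])
```

The two sliced lists can then be passed to `plt.xticks(shown_positions, shown_labels, ...)` exactly as in the example above.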
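2 Bar charts
2.1 Vertical bar chart plt.bar()
- `plt.bar()` uses a default `width` of 0.8 and draws blue bars by default.
- If an x label is too long, break it onto two lines with `\n`, as done for the Transformers title below.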
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
plt.figure(figsize=(15,6))
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
x = ["战狼2","速度与激情8","功夫瑜伽","西游伏妖篇","变形金刚5:\n最后的骑士","摔跤吧!爸爸","加勒比海盗5:死无对证","金刚:骷髅岛","极限特工:终极回归","生化危机6:终章","乘风破浪","神偷奶爸3","智取威虎山","大闹天竺","金刚狼3:殊死一战","蜘蛛侠:英雄归来","悟空传","银河护卫队2","情圣","新木乃伊"]
y = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
plt.xticks(range(1,21),x,fontproperties=my_font,rotation=90,size=15)
plt.bar(range(1,21),y,width=0.2,color='orange')
plt.ylabel("票房(单位:亿)",fontproperties=my_font,size=12)
plt.title("2017年内地电影top20",fontproperties=my_font,size=20)
plt.show()
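Inserting the line break by hand works for one title; for a longer list it can be automated. The helper below is a hypothetical sketch (`wrap_label` is not part of Matplotlib or of the original code) and assumes the titles use an ASCII colon as separator.

```python
def wrap_label(label, sep=":"):
    """Break a label onto two lines after the first separator, if there is one."""
    head, found, tail = label.partition(sep)
    # `found` is empty when the separator does not occur in the label.
    return f"{head}{found}\n{tail}" if found else label


titles = ["战狼2", "变形金刚5:最后的骑士", "金刚:骷髅岛"]
wrapped = [wrap_label(t) for t in titles]
print(wrapped)
```

The wrapped list can be passed to `plt.xticks()` in place of `x`.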
2.2 Horizontal bar chart plt.barh()
- If the labels on the left feel cramped, make the figure taller.
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
plt.figure(figsize=(10,10),dpi=200)
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
y = ["战狼2","速度与激情8","功夫瑜伽","西游伏妖篇","变形金刚5:最后的骑士","摔跤吧!爸爸","加勒比海盗5:死无对证","金刚:骷髅岛","极限特工:终极回归","生化危机6:终章","乘风破浪","神偷奶爸3","智取威虎山","大闹天竺","金刚狼3:殊死一战","蜘蛛侠:英雄归来","悟空传","银河护卫队2","情圣","新木乃伊"]
x = [56.01,26.94,17.53,16.49,15.45,12.96,11.8,11.61,11.28,11.12,10.49,10.3,8.75,7.55,7.32,6.99,6.88,6.86,6.58,6.23]
plt.yticks(range(1,21),y,fontproperties=my_font,size=15)
plt.barh(range(1,21),x,height=0.8,color='orange')
plt.xlabel("票房(单位:亿)",fontproperties=my_font,size=12)
plt.title("2017年内地电影top20",fontproperties=my_font,size=20)
plt.grid(alpha=0.3)
plt.show()
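2.3 Grouped bar charts
Putting several bars side by side really just means placing extra bars at small offsets around each tick. For example, to put bars on both sides of the tick at 1 with a bar width of 0.2, draw them at 0.8 and 1.2.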
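The offsets can be computed for any number of series. The function below is a sketch under the assumption that ticks sit at consecutive integers starting from `first_tick`; the name `grouped_positions` is made up for illustration.

```python
def grouped_positions(n_groups, n_series, bar_width=0.2, first_tick=1):
    """Return one list of bar centres per series, centred on each tick."""
    positions = []
    for k in range(n_series):
        # Offset of series k from the tick, symmetric around zero.
        offset = (k - (n_series - 1) / 2) * bar_width
        positions.append([first_tick + g + offset for g in range(n_groups)])
    return positions


# Three series around ticks 1..4: left of, on, and right of each tick.
left, middle, right = grouped_positions(n_groups=4, n_series=3)
print(left, middle, right)
```

Each returned list serves as the x positions of one `plt.bar()` call (or the y positions of one `plt.barh()` call).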
2.3.1 Groups along the x-axis
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
plt.figure(figsize=(15,6))
bar_width = 0.2
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
x = ["猩球崛起3:终极之战","敦刻尔克","蜘蛛侠:英雄归来","战狼2"]
b_16_y=[15746,312,4497,319]
b_15_y=[12357,156,2045,168]
b_14_y=[2358,399,2358,362]
b_14_x = []
b_15_x = []
b_16_x = []
# The 15th sits on the tick; the 14th and 16th are shifted one bar width to either side.
for i in range(1,5):
    b_15_x.append(i)
    b_14_x.append(i-bar_width)
    b_16_x.append(i+bar_width)
plt.xticks(b_15_x,x,fontproperties=my_font,size=15)
plt.bar(b_14_x,b_14_y,width=bar_width,color='orange',label="14日票房")
plt.bar(b_15_x,b_15_y,width=bar_width,color='red',label="15日票房")
plt.bar(b_16_x,b_16_y,width=bar_width,color='green',label="16日票房")
plt.legend(prop=my_font)
plt.ylabel("票房(单位:亿)",fontproperties=my_font,size=12)
plt.show()
2.3.2 Groups along the y-axis
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
plt.figure(figsize=(15,6))
bar_width = 0.2
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
x = ["猩球崛起3:终极之战","敦刻尔克","蜘蛛侠:英雄归来","战狼2"]
b_16_y=[15746,312,4497,319]
b_15_y=[12357,156,2045,168]
b_14_y=[2358,399,2358,362]
b_14_x = []
b_15_x = []
b_16_x = []
# The 15th sits on the tick; the 14th goes above it and the 16th below it.
for i in range(1,5):
    b_15_x.append(i)
    b_14_x.append(i+bar_width)
    b_16_x.append(i-bar_width)
plt.yticks(b_15_x,x,fontproperties=my_font,size=15)
plt.barh(b_14_x,b_14_y,height=bar_width,color='orange',label="14日票房")
plt.barh(b_15_x,b_15_y,height=bar_width,color='red',label="15日票房")
plt.barh(b_16_x,b_16_y,height=bar_width,color='green',label="16日票房")
plt.legend(prop=my_font)
plt.xlabel("票房(单位:亿)",fontproperties=my_font,size=12)
plt.show()
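3 Histograms
A histogram counts how often the values in a data set fall into each interval. It therefore usually involves a lot of data, and plotting a large data set takes correspondingly longer. The examples below use a list of film durations.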
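3.1 Frequency histogram
3.1.1 Basic usage
`bin_width` is the width of each bin and `num_bins` is the number of bins. For an ordinary histogram we usually use 5 to 6 bins.
The code below requires the difference between the largest and smallest value to be evenly divisible by `bin_width`; otherwise the bars and the tick marks end up misaligned.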
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
time_list = [131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121,142, 127, 130, 124, 101, 110,116,110, 128, 128, 115, 99, 136, 126,134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 95, 144, 105, 126,130,126, 130, 126, 116, 123, 106, 112,138,123,86, 101,99, 136,123,83, 94, 146,133, 101,131,116,111,84,137,115,122,106,144,109,123, 116, 111,111,133, 150]
bin_width = 12
num_bins = int((max(time_list)-min(time_list))/bin_width)
print(max(time_list)-min(time_list))
plt.hist(time_list,num_bins)
plt.xticks(list(range(min(time_list),max(time_list)+bin_width))[::bin_width],rotation=45)
plt.grid(linestyle="-.",alpha=0.5)
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
plt.xlabel("电影时长",fontproperties=my_font,size=12)
plt.ylabel("出现次数",fontproperties=my_font,size=12)
plt.show()
The frequencies let us draw conclusions, for example that most film durations fall between 102 and 138.
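One way to sidestep the divisibility requirement is to compute the bin edges explicitly and round the number of bins up. The sketch below is an assumption about how one might do this, not the original author's method; `bin_edges` is an illustrative helper name.

```python
import math


def bin_edges(values, bin_width):
    """Return evenly spaced bin edges that cover every value."""
    low, high = min(values), max(values)
    # Round the bin count up so the last edge never falls below the maximum.
    num_bins = max(1, math.ceil((high - low) / bin_width))
    return [low + i * bin_width for i in range(num_bins + 1)]


durations = [78, 95, 101, 116, 131, 150]  # a few values taken from time_list
print(bin_edges(durations, 12))  # range 72 divides evenly by 12
print(bin_edges(durations, 10))  # range 72 does not divide evenly by 10
```

Passing the same list as `bins` to `plt.hist()` and as tick positions to `plt.xticks()` keeps bars and ticks aligned, because both are built from the same edges.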
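3.1.2 Adjusting the bin width
The example above uses evenly spaced bins. To use other bins, define the bin edges yourself and pass the sequence to `plt.hist()` as its `bins` argument; the edges do not have to be evenly spaced. This is rarely done by hand, because edges that do not match the ticks make the plot look misaligned.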
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
time_list = [131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121,142, 127, 130, 124, 101, 110,116,110, 128, 128, 115, 99, 136, 126,134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 95, 144, 105, 126,130,126, 130, 126, 116, 123, 106, 112,138,123,86, 101,99, 136,123,83, 94, 146,133, 101,131,116,111,84,137,115,122,106,144,109,123, 116, 111,111,133, 150]
# Explicit bin edges; passing them as `bins` makes hist use exactly these boundaries.
bin_edges = (78,90,102,114,126,138,150)
plt.hist(time_list,bins=bin_edges)
plt.xticks(bin_edges,rotation=45)
plt.grid(linestyle="-.",alpha=0.5)
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
plt.xlabel("电影时长",fontproperties=my_font,size=12)
plt.ylabel("出现次数",fontproperties=my_font,size=12)
plt.show()
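To see what counting with explicit, uneven edges looks like, the pure-Python sketch below reproduces the binning convention that `plt.hist()` inherits from NumPy: every bin is half-open `[a, b)` except the last, which also includes its right edge. The helper `count_per_bin`, the sample values and the edges are hypothetical.

```python
from bisect import bisect_right


def count_per_bin(values, edges):
    """Count values per bin; bins are [a, b) except the last, which is [a, b]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # values outside all bins are ignored
        # Index of the last edge that is <= v, clamped so that a value equal
        # to the final edge lands in the last bin.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


sample = [78, 83, 90, 95, 101, 120, 126, 150]  # hypothetical sample
uneven_edges = [78, 90, 126, 150]              # bin widths 12, 36 and 24
print(count_per_bin(sample, uneven_edges))
```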
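3.2 Proportion histogram
Set `density=True` in `plt.hist()`. The bar heights are then normalized so that the total area of the bars equals 1, which means each height is the bin's share of the data divided by the bin width.
- Older Matplotlib versions call this parameter `normed` instead of `density`; check the documentation for your version.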
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
time_list = [131, 98, 125, 131, 124, 139, 131, 117, 128, 108, 135, 138, 131, 102, 107, 114, 119, 128, 121,142, 127, 130, 124, 101, 110,116,110, 128, 128, 115, 99, 136, 126,134, 95, 138, 117, 111,78, 132, 124, 113, 150, 110, 95, 144, 105, 126,130,126, 130, 126, 116, 123, 106, 112,138,123,86, 101,99, 136,123,83, 94, 146,133, 101,131,116,111,84,137,115,122,106,144,109,123, 116, 111,111,133, 150]
print(len(time_list))
bin_width = 6
num_bins = int((max(time_list)-min(time_list))/bin_width)
plt.hist(time_list,num_bins,density=True)
plt.xticks(list(range(min(time_list),max(time_list)+bin_width))[::bin_width],rotation=45)
plt.grid(linestyle="-.",alpha=0.5)
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
plt.xlabel("电影时长",fontproperties=my_font,size=12)
plt.ylabel("出现占比",fontproperties=my_font,size=12)
plt.show()
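The arithmetic behind `density=True` can be checked by hand. The sketch below uses hypothetical counts and edges, and the helper name `density_heights` is illustrative; it shows that the heights are densities whose area sums to 1, while the plain proportions are the counts divided by the total.

```python
def density_heights(counts, edges):
    """Bar heights under density normalization: count / (total * bin width)."""
    total = sum(counts)
    widths = [b - a for a, b in zip(edges, edges[1:])]
    return [c / (total * w) for c, w in zip(counts, widths)]


counts = [2, 4, 2]          # hypothetical counts per bin
edges = [78, 90, 126, 150]  # bin widths 12, 36 and 24

heights = density_heights(counts, edges)
# Area of all bars: height times width, summed over the bins.
area = sum(h * (b - a) for h, a, b in zip(heights, edges, edges[1:]))
# Share of the data in each bin, independent of the bin width.
proportions = [c / sum(counts) for c in counts]
print(heights, area, proportions)
```

If the y-axis should show proportions rather than densities, one option is to leave `density` off and give every value the weight `1/len(time_list)` through the `weights` argument of `plt.hist()`.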
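3.3 Simulating a histogram with a bar chart
All the data above is raw: a pile of numbers that nobody has counted yet. Pre-counted data instead tells you how many values fall between 78 and 84, how many between 84 and 90, and so on. That kind of data cannot be fed to `plt.hist()` directly, but a bar chart can stand in for the histogram.
For example, in the data below 836 values fall between 0 and 5, 2737 between 5 and 10, and 3723 between 10 and 15.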
python
from matplotlib import pyplot as plt
from matplotlib import font_manager
plt.figure(figsize=(15,6))
my_font = font_manager.FontProperties(fname="./STSONG.TTF")
x_true_value =[0,5,10,15,20,25,30,35,40,45,60,90]
width =[5,5,5,5,5,5,5,5,5,15,30,60]
y =[836,2737,3723,3926,3596,1438,3273,642,824,613,215,47]
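# Bars are centred on x_true_value, so bar i ends at x_true_value[i] + width[i]/2; put a tick there.
x_show_value_position = []
x_labels = []
for i in range(len(x_true_value)):
    x_labels.append(str(x_true_value[i]))
    x_show_value_position.append(x_true_value[i]+width[i]/2)
# Append the label for the end of the last bin and prepend the left edge of the first bar (-2.5).
x_labels = x_labels + [str(x_true_value[-1] + width[-1])]
x_show_value_position = [-2.5] + x_show_value_position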
plt.xticks(x_show_value_position,x_labels,fontproperties=my_font,size=15)
plt.bar(x_true_value,y,width=width)
plt.grid()
plt.show()
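The x-axis of this plot is a little involved. `x_true_value` holds the real x values of the bars, `x_show_value_position` holds the positions where the tick labels are placed, and `x_labels` holds the label texts.
In short, the tick labelled 0 actually sits at -2.5 and the tick labelled 5 actually sits at 2.5; the real coordinate 0 lies midway between them, at the centre of the first bar. Note that this shift only lines up cleanly while the bars have equal widths: the last three bars are wider but are still centred on their x values, so they overlap their neighbours.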
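An alternative that avoids shifted tick labels is to anchor each bar at its left edge with the `align="edge"` parameter of `plt.bar()`, so that the ticks can sit at the real bin boundaries and bars of different widths no longer overlap. The sketch below is a variation on the example above, not the original author's approach; `boundaries` and `draw_binned_bars` are illustrative names, and the Chinese font setup is omitted because the ticks are plain numbers.

```python
left_edges = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 60, 90]
widths = [5, 5, 5, 5, 5, 5, 5, 5, 5, 15, 30, 60]
counts = [836, 2737, 3723, 3926, 3596, 1438, 3273, 642, 824, 613, 215, 47]

# Every bin boundary: all left edges plus the right edge of the last bar.
boundaries = left_edges + [left_edges[-1] + widths[-1]]


def draw_binned_bars():
    """Draw the pre-counted data with bars anchored at their left edges."""
    # Imported here so the lists above can be inspected without Matplotlib.
    from matplotlib import pyplot as plt

    plt.figure(figsize=(15, 6))
    # align="edge" makes each bar start at its x value instead of being centred on it.
    plt.bar(left_edges, counts, width=widths, align="edge")
    plt.xticks(boundaries)
    plt.grid(alpha=0.3)
    plt.show()
```

Calling `draw_binned_bars()` shows the figure. Each bar then spans exactly from its left edge to the next boundary.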