Pandas Basics - Querying Data with pandas
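Querying data here plays the same role as the subset function in R: with boolean indexing you can pick out a targeted subset of the original data, such as specific rows or specific columns. We start by building a student dataset.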
```python
import pandas as pd

# Build the student dataset from a dict of equal-length columns
stu_dic = {'Age':[14,13,13,14,14,12,12,15,13,12,11,14,12,15,16,12,15,11,15],
'Height':[69,56.5,65.3,62.8,63.5,57.3,59.8,62.5,62.5,59,51.3,64.3,56.3,66.5,72,64.8,67,57.5,66.5],
'Name':['Alfred','Alice','Barbara','Carol','Henry','James','Jane','Janet','Jeffrey','John','Joyce','Judy','Louise','Marry','Philip','Robert','Ronald','Thomas','Willam'],
'Sex':['M','F','F','F','M','M','F','F','M','M','F','F','F','F','M','M','M','M','M'],
'Weight':[112.5,84,98,102.5,102.5,83,84.5,112.5,84,99.5,50.5,90,77,112,150,128,133,85,112]}
student = pd.DataFrame(stu_dic)
```
Query the first 5 or last 5 rows
```python
print(student.head())
print(student.tail())
# Output
#    Age  Height     Name Sex  Weight
# 0   14    69.0   Alfred   M   112.5
# 1   13    56.5    Alice   F    84.0
# 2   13    65.3  Barbara   F    98.0
# 3   14    62.8    Carol   F   102.5
# 4   14    63.5    Henry   M   102.5
#     Age  Height    Name Sex  Weight
# 14   16    72.0  Philip   M   150.0
# 15   12    64.8  Robert   M   128.0
# 16   15    67.0  Ronald   M   133.0
# 17   11    57.5  Thomas   M    85.0
# 18   15    66.5  Willam   M   112.0
```
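Both head and tail accept an optional row count, so you are not limited to 5 rows. The sketch below uses a small stand-in frame built from the first few students (the name `df` is only for illustration) to show a custom count and a negative count.

```python
import pandas as pd

# Small stand-in frame: the first seven students from the dataset
df = pd.DataFrame({'Name': ['Alfred', 'Alice', 'Barbara', 'Carol', 'Henry', 'James', 'Jane'],
                   'Age': [14, 13, 13, 14, 14, 12, 12]})

first_three = df.head(3)         # first 3 rows
last_two = df.tail(2)            # last 2 rows
all_but_last_two = df.head(-2)   # negative n: every row except the last 2

print(first_three)
print(last_two)
print(all_but_last_two)
```

A negative count is handy when you know how many trailing rows to drop but not how long the frame is.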
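Query specific rows
student.iloc[[0,2,4,5,7]] selects rows by integer position. Note that iloc is an indexer rather than an ordinary function, so it is used with square brackets [], not parentheses.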
```python
print(student.iloc[[0,2,4,5,7]])
# Output
#    Age  Height     Name Sex  Weight
# 0   14    69.0   Alfred   M   112.5
# 2   13    65.3  Barbara   F    98.0
# 4   14    63.5    Henry   M   102.5
# 5   12    57.3    James   M    83.0
# 7   15    62.5    Janet   F   112.5
```
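In the student data the index labels happen to equal the row positions, so iloc (by position) and loc (by label) would return the same rows. The difference only shows up once the index is something else. The following is a minimal sketch with a made-up index (the labels 10, 20, 30, 40 are an assumption for illustration, not part of the student data).

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions
df = pd.DataFrame({'Name': ['Alfred', 'Alice', 'Barbara', 'Carol'],
                   'Age': [14, 13, 13, 14]},
                  index=[10, 20, 30, 40])

by_position = df.iloc[[0, 2]]   # rows at positions 0 and 2
by_label = df.loc[[10, 30]]     # rows whose index labels are 10 and 30

print(by_position)
print(by_label)
print(by_position.equals(by_label))
```

Both selections pick Alfred and Barbara. With recent pandas versions, df.loc[[0, 2]] on this frame should raise a KeyError, because no row carries the label 0 or 2.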
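Query specific columns
student[['Name','Height','Weight']].head() selects several columns at once. To select more than one column, pass a list of column names inside the indexing brackets, which is why the brackets appear doubled.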
```python
print(student[['Name','Height','Weight']].head())
# Output
#       Name  Height  Weight
# 0   Alfred    69.0   112.5
# 1    Alice    56.5    84.0
# 2  Barbara    65.3   102.5
# 3    Carol    62.8   102.5
# 4    Henry    63.5   102.5
```
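The bracket style also decides what type comes back: a single column name returns a Series, while a list of names returns a DataFrame, even when the list holds only one name. A short sketch on a two-row stand-in frame (`df` is an illustrative name):

```python
import pandas as pd

# Two-row stand-in frame with the same column names as the student data
df = pd.DataFrame({'Name': ['Alfred', 'Alice'],
                   'Height': [69.0, 56.5],
                   'Weight': [112.5, 84.0]})

one_col_series = df['Name']         # single brackets: a Series
one_col_frame = df[['Name']]        # list with one name: a one-column DataFrame
two_cols = df[['Name', 'Height']]   # list with two names: a two-column DataFrame

print(type(one_col_series).__name__)
print(type(one_col_frame).__name__)
print(list(two_cols.columns))
```

This matters when later code expects DataFrame methods or a two-dimensional shape.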
Exercise: find all girls older than 12
```python
print(student[(student['Sex']=='F') & (student['Age']>12)])
# Output
#     Age  Height     Name Sex  Weight
# 1    13    56.5    Alice   F    84.0
# 2    13    65.3  Barbara   F    98.0
# 3    14    62.8    Carol   F   102.5
# 7    15    62.5    Janet   F   112.5
# 11   14    64.3     Judy   F    90.0
# 13   15    66.5    Marry   F   112.0
```
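The query logic above is simple. The one thing to watch for is that in a multi-condition query, each condition on either side of & (and) or | (or) must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as == and >.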
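To make the pieces of a multi-condition query visible, the sketch below builds each boolean mask separately and then combines them with &, | and ~ (not). It uses a five-row stand-in frame drawn from the student data. The last line shows DataFrame.query as one possible alternative spelling of the same filter, not the approach used in this article.

```python
import pandas as pd

# Five-row stand-in frame drawn from the student data
df = pd.DataFrame({'Name': ['Alfred', 'Alice', 'Jane', 'Janet', 'Joyce'],
                   'Sex': ['M', 'F', 'F', 'F', 'F'],
                   'Age': [14, 13, 12, 15, 11]})

is_girl = df['Sex'] == 'F'   # boolean Series, one True/False per row
over_12 = df['Age'] > 12

girls_over_12 = df[is_girl & over_12]                        # both conditions hold
boys_or_over_12 = df[(df['Sex'] == 'M') | (df['Age'] > 12)]  # at least one holds
everyone_else = df[~(is_girl & over_12)]                     # negation of the first filter

# Alternative spelling with a query string
same_via_query = df.query("Sex == 'F' and Age > 12")

print(girls_over_12)
print(boys_or_over_12)
print(everyone_else)
```

Naming the masks first avoids the parenthesis problem entirely, since each comparison is evaluated before the masks are combined.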
For more, see my Gitee repository: Python基础练习 (Python basics exercises)