How to plot a map of the United States from US Zip-codes with plotly and geopandas

For me, this need came from analyzing the user data in the Movielens-1m dataset:

UserID::Gender::Age::Occupation::Zip-code
1::F::1::10::48067
2::M::56::16::70072
3::M::25::15::55117
4::M::45::7::02460
5::M::25::20::55455
6::F::50::9::55117
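
One detail in the sample above is worth a short sketch: Zip-codes such as 02460 start with a zero, so they need to stay text rather than become integers. The helper names below (`User`, `normalize_zip`, `parse_user`) are hypothetical and not part of the dataset or any library. The ZIP+4 handling is an assumption about what the full file may contain, since the sample shows only five-digit codes.

```python
from typing import NamedTuple, Optional


class User(NamedTuple):
    user_id: int
    gender: str
    age: int
    occupation: int
    zip_code: str  # kept as text so "02460" does not turn into 2460


def normalize_zip(raw: str) -> Optional[str]:
    """Return the five-digit part of a ZIP or ZIP+4 string, or None if malformed."""
    base = raw.strip().split("-")[0]
    if len(base) == 5 and base.isascii() and base.isdigit():
        return base
    return None


def parse_user(line: str) -> User:
    """Split one '::'-separated record into typed fields."""
    user_id, gender, age, occupation, zip_code = line.strip().split("::")
    return User(int(user_id), gender, int(age), int(occupation), zip_code)


sample = ["1::F::1::10::48067", "4::M::45::7::02460"]
users = [parse_user(line) for line in sample]
zips = [normalize_zip(u.zip_code) for u in users]
```

When the file is read with pandas instead, passing `dtype={'Zip-code': str}` to `pd.read_csv` should have the same effect of keeping the leading zero.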

I want to determine each user's state from the Zip-code and then show the number of users per state on a map.

The code can be written like this:

import pandas as pd
import geopandas as gpd
import plotly.express as px
from uszipcode import SearchEngine

# Create a SearchEngine instance
search = SearchEngine()

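# Read the user dataset; keep Zip-code as text so that leading zeros
# (e.g. 02460) are not lost to integer parsing
data = pd.read_csv('./users.dat', sep='::', engine='python',
                   names=['UserID', 'Gender', 'Age', 'Occupation', 'Zip-code'],
                   dtype={'Zip-code': str})
data = data.dropna(subset=['Zip-code'])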

def get_state_name(zipcode):
    # Look up the Zip-code and return the two-letter state abbreviation,
    # or None when the code is not found
    result = search.by_zipcode(zipcode)
    if result is None:
        return None
    return result.state

data['STATE_ABBR'] = data['Zip-code'].apply(get_state_name)

# Count the number of users in each state
zip_counts = data['STATE_ABBR'].value_counts()
zip_counts_df = zip_counts.reset_index()  # convert the Series to a DataFrame
zip_counts_df.columns = ['STATE_ABBR', 'COUNT']  # rename the columns

# Read the shapefile of the US map
usa_map = gpd.read_file('./shapefile/USA_States.shp')

# Merge the per-state counts with the map data
# Merge on STATE_ABBR for abbreviations, or on STATE_NAME for full names
zip_geo = pd.merge(usa_map, zip_counts_df, on='STATE_ABBR')

# Draw the map
fig = px.choropleth(zip_geo,
                    locations='STATE_ABBR',
                    locationmode='USA-states',
                    color='COUNT',
                    scope='usa',
                    hover_data=['COUNT'],
                    color_continuous_scale='Reds',
                    range_color=(0, zip_geo['COUNT'].max()),
                    labels={'COUNT': 'User Count'})  # label the count column, not the state column
fig.update_layout(title_text='Movielens User Distribution by State')
fig.show()

In the code above, USA_States.shp can be downloaded from efrainmaps (https://www.efrainmaps.es/english-version/free-downloads/united-states/).

In the resulting map, hovering over a state displays the state (by its abbreviation) and the corresponding number of users.
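
Since many users share a Zip-code, the lookup step can be cached so each distinct code is resolved only once, and codes that resolve to nothing can be collected for inspection. The following is a minimal sketch of that idea, not the article's method. `_TOY_ZIP_TO_STATE` and `zip_to_state` are hypothetical stand-ins for the `search.by_zipcode(...)` call, so that the example runs without the uszipcode database.

```python
from collections import Counter
from functools import lru_cache
from typing import Optional

# Hypothetical toy table standing in for the uszipcode lookup (assumption:
# a real run would call search.by_zipcode(zip_code) here instead).
_TOY_ZIP_TO_STATE = {
    "48067": "MI",
    "70072": "LA",
    "55117": "MN",
    "02460": "MA",
    "55455": "MN",
}

lookup_calls = 0  # counts how often the underlying lookup actually runs


@lru_cache(maxsize=None)
def zip_to_state(zip_code: str) -> Optional[str]:
    """Resolve a Zip-code to a state abbreviation; repeated codes hit the cache."""
    global lookup_calls
    lookup_calls += 1
    return _TOY_ZIP_TO_STATE.get(zip_code)


zip_codes = ["48067", "70072", "55117", "02460", "55455", "55117", "00000"]
states = [zip_to_state(z) for z in zip_codes]

# Per-state user counts, skipping codes that could not be resolved
state_counts = Counter(s for s in states if s is not None)

# Codes with no match, worth checking before trusting the map
unmatched = [z for z, s in zip(zip_codes, states) if s is None]
```

Here the seven input codes trigger only six lookups, because 55117 appears twice. On the full dataset the saving would depend on how many users share a code.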

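If you would rather not display the state abbreviation, create a mapping from abbreviations to full state names, map each Zip-code to the full state name, and then draw the map: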

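# Create the mapping from state abbreviations to full names
# The dictionary covers the 50 states, the District of Columbia, six US
# territory codes (including UM) and the three military mail codes.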
state_name_dict = {
    "AL": "Alabama",
    "AK": "Alaska",
    "AZ": "Arizona",
    "AR": "Arkansas",
    "CA": "California",
    "CO": "Colorado",
    "CT": "Connecticut",
    "DE": "Delaware",
    "FL": "Florida",
    "GA": "Georgia",
    "HI": "Hawaii",
    "ID": "Idaho",
    "IL": "Illinois",
    "IN": "Indiana",
    "IA": "Iowa",
    "KS": "Kansas",
    "KY": "Kentucky",
    "LA": "Louisiana",
    "ME": "Maine",
    "MD": "Maryland",
    "MA": "Massachusetts",
    "MI": "Michigan",
    "MN": "Minnesota",
    "MS": "Mississippi",
    "MO": "Missouri",
    "MT": "Montana",
    "NE": "Nebraska",
    "NV": "Nevada",
    "NH": "New Hampshire",
    "NJ": "New Jersey",
    "NM": "New Mexico",
    "NY": "New York",
    "NC": "North Carolina",
    "ND": "North Dakota",
    "OH": "Ohio",
    "OK": "Oklahoma",
    "OR": "Oregon",
    "PA": "Pennsylvania",
    "RI": "Rhode Island",
    "SC": "South Carolina",
    "SD": "South Dakota",
    "TN": "Tennessee",
    "TX": "Texas",
    "UT": "Utah",
    "VT": "Vermont",
    "VA": "Virginia",
    "WA": "Washington",
    "WV": "West Virginia",
    "WI": "Wisconsin",
    "WY": "Wyoming",
    "DC": "District of Columbia",
    "AS": "American Samoa",
    "GU": "Guam",
    "MP": "Northern Mariana Islands",
    "PR": "Puerto Rico",
    "UM": "United States Minor Outlying Islands",
    "VI": "Virgin Islands",
    "AA": "Armed Forces Americas",
    "AE": "Armed Forces Europe",
    "AP": "Armed Forces Pacific"
}

def get_state_name(zipcode):
    # Look up the Zip-code and return the full state name,
    # or None when the code or its abbreviation is not found
    result = search.by_zipcode(zipcode)
    if result is None:
        return None
    return state_name_dict.get(result.state)

data['STATE_NAME'] = data['Zip-code'].apply(get_state_name)
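# The remaining code is the same as above, using STATE_NAME in place of
# STATE_ABBR for counting and merging. Note that with locationmode='USA-states'
# the locations argument of px.choropleth still expects two-letter abbreviations.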
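
Because `locationmode='USA-states'` matches two-letter abbreviations, one way to show full names is to keep both columns side by side: the abbreviation for locating the state and the full name for display. The sketch below is illustrative only. `state_names` is a small excerpt of the mapping above, repeated to keep the example self-contained, and `build_rows` is a hypothetical helper.

```python
from collections import Counter

# Excerpt of the abbreviation-to-name mapping, for illustration only
state_names = {
    "MI": "Michigan",
    "LA": "Louisiana",
    "MN": "Minnesota",
    "MA": "Massachusetts",
}


def build_rows(state_abbrs):
    """Count users per state and keep both the abbreviation and the full name."""
    counts = Counter(a for a in state_abbrs if a in state_names)
    return [
        {"STATE_ABBR": abbr, "STATE_NAME": state_names[abbr], "COUNT": n}
        for abbr, n in counts.most_common()
    ]


# None represents a Zip-code that could not be resolved to a state
rows = build_rows(["MI", "LA", "MN", "MA", "MN", "MN", None])
```

A list like `rows` can be turned into a DataFrame with `pd.DataFrame(rows)` and passed to `px.choropleth` with `locations='STATE_ABBR'` and `hover_name='STATE_NAME'`, so the hover box shows the full name while the map is still keyed by abbreviation.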