Source Code: A Practical Guide to Movie Data Analysis with Python

Original · IT小本本 · March 3, 2025, 22:28 · Beijing

This continues the previous article: A Practical Guide to Movie Data Analysis with Python

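1. First, copy the CSV content into a CSV file named movies.csv (the file name the script reads by default).

2. Next, create a .py file and paste in the source code.

3. Run the code to see the data analysis charts.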
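
Before running the script, it may help to confirm that the CSV file actually exposes the columns the script reads (`rating`, `genre`, `release_year`, `runtime`). The sketch below is one way to do that; the helper name `missing_columns` and the in-memory sample headers are illustrative assumptions, not part of the original script.

```python
import io

import pandas as pd

# Columns the analysis script accesses by name
REQUIRED_COLUMNS = {"rating", "genre", "release_year", "runtime"}


def missing_columns(csv_source, required=REQUIRED_COLUMNS):
    """Return the required column names that the CSV header lacks (hypothetical helper)."""
    # nrows=0 reads only the header row, so this stays cheap for large files
    header = pd.read_csv(csv_source, nrows=0)
    return sorted(set(required) - set(header.columns))


# A well-formed comma-separated header passes the check
good = io.StringIO("title,rating,genre,release_year,runtime\n")
print(missing_columns(good))  # → []

# A header pasted as a pipe table is read as one single column, so every name is missing
bad = io.StringIO("| title | rating | genre |\n")
print(missing_columns(bad))
```

For a real file, pass its path instead, for example `missing_columns("movies.csv")`.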

Source code:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

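# 1. Load the data
def load_data(file_path):
    """
    Load the movie dataset from a CSV file.
    """
    df = pd.read_csv(file_path)
    print("Shape:", df.shape)
    print("\nFirst 5 rows:")
    print(df.head())
    print("\nData summary:")
    df.info()  # info() prints directly and returns None, so it is not wrapped in print()
    return df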

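# 2. Clean the data
def clean_data(df):
    """
    Clean and preprocess the data.
    """
    # Drop duplicate rows; copy() so the assignments below modify an independent frame
    df = df.drop_duplicates().copy()

    # Handle missing values: fill missing ratings with the median,
    # then drop rows that have no release year or genre
    df['rating'] = df['rating'].fillna(df['rating'].median())
    df = df.dropna(subset=['release_year', 'genre']).copy()

    # Convert data types
    df['release_year'] = df['release_year'].astype(int)

    # Split the genre column into a list (a film may have several genres)
    df['genre'] = df['genre'].str.split(',')

    return df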

# 3. Analyze the data
def analyze_data(df):
    """
    Run the analysis and draw the charts.
    """
    # Set the plot style
    sns.set(style="whitegrid")
    plt.figure(figsize=(12, 6))
    
    # Analysis 1: genre distribution
    genre_counts = df.explode('genre')['genre'].value_counts().head(10)
    plt.subplot(2, 2, 1)
    genre_counts.plot(kind='bar', color='skyblue')
    plt.title('Top 10 Movie Genres')
    plt.xlabel('Genre')
    plt.ylabel('Count')

    # Analysis 2: rating distribution
    plt.subplot(2, 2, 2)
    sns.histplot(df['rating'], bins=20, kde=True, color='orange')
    plt.title('Rating Distribution')
    plt.xlabel('IMDB Rating')

    # Analysis 3: number of movies released per year
    yearly_counts = df.groupby('release_year').size()
    plt.subplot(2, 2, 3)
    yearly_counts.plot(color='green')
    plt.title('Movies Released by Year')
    plt.xlabel('Year')
    plt.ylabel('Number of Movies')

    # Analysis 4: relationship between runtime and rating
    plt.subplot(2, 2, 4)
    sns.scatterplot(x='runtime', y='rating', data=df, alpha=0.6)
    plt.title('Runtime vs Rating')
    plt.xlabel('Runtime (minutes)')
    plt.ylabel('Rating')

    plt.tight_layout()
    plt.show()

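    # Further analysis: correlation matrix of all numeric columns
    # ('number' also covers int32, which astype(int) can produce on some platforms)
    numeric_df = df.select_dtypes(include='number')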
    plt.figure(figsize=(10, 8))
    sns.heatmap(numeric_df.corr(), annot=True, cmap='coolwarm')
    plt.title('Correlation Matrix')
    plt.show()

# Main program
if __name__ == "__main__":
    # File path (replace with the actual path)
    file_path = "movies.csv"

    # Load the data
    movie_df = load_data(file_path)

    # Clean the data
    cleaned_df = clean_data(movie_df)

    # Analyze the data
    analyze_data(cleaned_df)

    # Print summary statistics
    print("\nSummary statistics:")
    print(cleaned_df.describe())
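
To make the cleaning and genre-counting steps easier to follow, here is a minimal sketch of the same operations on a tiny hand-made frame. The three sample rows are hypothetical and only reuse the column names from the script; the calls (`fillna`, `str.split`, `explode`, `value_counts`) are the ones the script itself relies on.

```python
import pandas as pd

# Hypothetical three-row sample using the script's column names
sample = pd.DataFrame({
    "title": ["A", "B", "C"],
    "rating": [8.0, None, 6.0],
    "genre": ["Horror,Animation", "Comedy", "Horror"],
    "release_year": [1998, 2014, 2011],
})

# Fill the missing rating with the median of the known ratings (7.0 here)
sample["rating"] = sample["rating"].fillna(sample["rating"].median())

# Turn the comma-separated genre string into a list, one entry per genre
sample["genre"] = sample["genre"].str.split(",")

# explode() gives every list element its own row, so a film with
# two genres contributes one count to each of them
exploded = sample.explode("genre")
genre_counts = exploded["genre"].value_counts()

print(sample["rating"].tolist())  # → [8.0, 7.0, 6.0]
print(genre_counts["Horror"])     # → 2
```

Because a multi-genre film is counted once per genre, the bar chart totals can exceed the number of films in the dataset.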

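CSV content (shown here as a table; when saved as a CSV file, fields that contain commas, such as genre and actors, need to be quoted):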

| title | rating | genre | release_year | runtime | votes | director | actors |
|----------------------------------------------|--------|-------------------------------|--------------|---------|--------|-----------------------|----------------------------------------------------------|
| Ne Zha: The Demon Boy Makes Havoc in the Sea | 9.8 | Animation | 2025 | 144 | | Jiaozi | |
| Product quite | 5.1 | Horror,Animation | 1998 | 154 | | Wesley Weaver | Sheila Blackburn, Christina Harris, Jacob Odonnell |
| Choose support stuff | 8 | Action | 1995 | 87 | 316631 | | Brian Vance, Karen Norris, Thomas West |
| Field itself growth | 8.2 | Thriller,Horror,Adventure | 2007 | 163 | 398833 | Daniel Kelly | Kenneth Jackson, Allen Campbell MD, Stacy Andersen |
| Task available president | 6.7 | Horror | 2011 | 73 | | | James Bishop, Rachel Williams, Cameron Wilson |
| Fly system event | 8 | Adventure | 1997 | 178 | 290428 | David Crawford | Randall Gonzalez, Larry Collins, Emily Sullivan |
| Evidence | 9.4 | Animation,Horror | 1994 | 65 | 406310 | Robert Lucas | Jeremiah Robinson, Megan Williams, Megan Herrera |
| Treat week | | Comedy | 2014 | 95 | 438314 | Mitchell Dickson | Hailey Richardson, Nancy Davis, Cynthia Luna MD |
| Step staff | 6 | Comedy,Drama | 2005 | 80 | 142530 | Rebecca Wilson | Patrick Thompson, Amy Hernandez, Christopher King |
| Much such | 7.8 | Romance,Action,Horror | 1992 | 163 | 141373 | | Kenneth Wang, April Avila, Adam Singleton |
| Wish water | 7.8 | Romance,Horror,Adventure | 2015 | 173 | 241570 | John Poole | Dr. Dennis Ryan, Vincent Valdez, John Rose |
| Two | 8.7 | Sci-Fi,Thriller,Documentary | 2013 | 88 | 73525 | Justin Turner | Andrew Coffey, Robin Jarvis, Daniel Murray |
| Someone song | | Comedy | 1999 | 140 | | Lisa Atkinson | James Brown, Cynthia Lopez, Jennifer Lopez |
| Culture quality | 6.9 | Adventure,Documentary,Sci-Fi | 1990 | 147 | 358412 | Michael Garrison | Robert Jenkins, Peter Combs, Charles Marsh DDS |
| Each listen and | 8.8 | Comedy,Sci-Fi | 2017 | 80 | 379584 | Michael Murphy | Rachel Reeves, David Matthews, Miss Dawn Hayes |
| Particular | 7.5 | Documentary | 1996 | 171 | 442277 | Rebecca Bryant DDS | Cody Cain, Dillon Powell, Kelsey Riley |
| Control lawyer | 5.8 | Documentary | 2023 | 177 | | Chad Brown | Susan Morales, Michael Mann, Brian Hunter |
| Performance yourself then | | Sci-Fi,Horror | 1993 | 90 | 497474 | | Christopher Knapp, Edward Chapman PhD, Steven Richardson |
| Clearly | 9.1 | Drama,Animation,Adventure | 2000 | 93 | 319029 | Cynthia Harrison | Rodney Patterson, Shawn Wells, David Hill |
| Risk town | 7.7 | Horror,Sci-Fi,Thriller | 2013 | 116 | 237708 | Suzanne Smith | James Williams, Francisco Miller, Scott Herman |
| Once structure | 6.7 | Documentary,Horror,Animation | 1990 | 69 | 31866 | Mr. Jonathan Stafford | Pamela Johnson, John Rodriguez, Misty Wells |
| Condition morning | 9.3 | Documentary | 2005 | 95 | 113321 | Robert Jennings | James Williams, Antonio Zuniga, Adam Stewart |
| Lawyer almost method | 9.4 | Adventure,Horror,Animation | 2016 | 84 | | Katherine Clark | Joshua Bernard, Jeffrey James, Cheryl Salinas |
| Central write | 7.1 | Action,Comedy,Animation | 2015 | 153 | 452764 | Joe Hernandez | Matthew Donaldson, Jennifer Kelley, Leslie Gomez |
| Apply window | 7.7 | Thriller | 2001 | 95 | 459482 | Logan Williams | Brooke Bruce, Danielle Dixon, Michael Burton |
| Trouble benefit another | 9.3 | Comedy,Action,Documentary | 1995 | 178 | 124930 | Christina Wood | Darren Jones, Brian Fischer, Paula Garcia |
| Represent career away | 6.1 | Action,Drama | 2014 | 60 | 140042 | Terri Melendez | Deanna Walker, Joseph Robinson, Mrs. Samantha Mccarthy |
| Product money | 6.4 | Documentary,Adventure,Drama | 2002 | 97 | 31662 | Paul Hale | Rachel Taylor, Lisa Hughes, Christopher Jordan |
| Standard campaign hot | 8.7 | Action | 2020 | 124 | 279463 | Evan Holmes | Madison Sanchez, Rachel Smith, Hannah Avery |
| Foreign care | 5.4 | Comedy | 1994 | 146 | 212694 | Alexander Morgan | Bill Doyle, Mary Garrison, Barbara Velazquez |
| Our of force | 6.2 | Action,Comedy,Thriller | 2023 | 100 | 108090 | Erica Mack | Cheryl Ray, Bobby Webster, Philip Mcdonald |
| Moment poor hour | 6.3 | Sci-Fi | 2019 | 100 | 491594 | | Tyler Smith, Crystal Grimes, Amanda Watson |
| Science suffer human | 10 | Comedy | 2003 | 124 | 490296 | Nicole Evans | Karen Cook, Albert Tate, Teresa Watkins |
| Possible mission | | Romance | 2008 | 80 | 285802 | Kyle Vasquez | Sonia Stanley, Dr. Olivia Sullivan, Tony Garcia |
| Factor difficult short | 9.9 | Documentary,Action,Sci-Fi | 1996 | 180 | 139656 | Terry Rogers | Felicia Dunn, Victor Spencer, Robert Mcdonald |
| Ready organization | 7.8 | Sci-Fi,Horror | 2011 | 86 | 448171 | Robert Green | Andrew Robinson, David Baker, Erik Jones |
| Right phone standard | 9.5 | Comedy | 2017 | 166 | 2858 | Kristin Montes | Thomas Martin, Amanda West, Patrick Travis |
| Start painting | 9.9 | Adventure | 2017 | 131 | | | Jesus Green, Robert Davis, Rebecca Davis |
| Whose member area | 9.5 | Action,Romance,Drama | 2017 | 76 | | Mr. Martin Garcia | Tiffany Williams, Karen Ramirez, Lauren Matthews |
| Teach however | 9.9 | Romance | 2003 | 147 | 54893 | Melvin Medina | Shannon Bell, Jeffrey Hoffman, Samantha Walton |
| Within response one book | 9.4 | Drama,Horror,Animation | 2001 | 172 | 95036 | Latoya Petersen | Christina Pearson, Shawn Hart, Joseph Moore |
| Nation south debate | 9.7 | Horror,Sci-Fi,Drama | 1996 | 108 | 451150 | Paul Clark | Darrell Neal, Patrick Durham, Nathan Freeman |
| Method town firm | 9.6 | Sci-Fi,Horror | 2004 | 63 | 209910 | John King | Renee Williams, James Hunter, Lindsey Buchanan |
| Produce movie | 9.8 | Romance,Documentary | 2015 | 146 | 282131 | Daniel Diaz | Ashley Lara, Dustin Pearson, John Franklin |
| Best across | | Drama,Sci-Fi | 1992 | 73 | 182239 | Debra Calderon | Deborah Hunter DVM, Peter Phillips, Donna Wright |
| Pass crime | 8 | Comedy,Thriller,Horror | 2006 | 65 | 1886 | Yolanda Baxter | Adam Hood, Edwin Henderson, Stephen Anderson |
| Move | 8.4 | Horror,Romance,Thriller | 1994 | 145 | 327712 | Roy Schwartz | Joann Fleming, Maria Simpson, John Mason |
| Where cause idea | 8.6 | Documentary,Sci-Fi | 2016 | 101 | 156008 | Wesley Turner | Dr. Carol Diaz, Daniel Santana, Tina White |
| Water concern | 8.3 | Drama,Sci-Fi | 2015 | 130 | 160317 | Nicole Martin | Corey Sanders, Rebecca Tran, Kari Mason |
| Degree you | 7.9 | Romance,Adventure,Documentary | 2018 | 146 | 249085 | Nicholas Lawson | Brian Robbins, Charles Schwartz, Shawn Ramos |
| Wonder firm pull | 8.7 | Comedy,Sci-Fi,Romance | 1995 | 164 | 124138 | Kelly Thomas | Holly Stark, Susan Bishop, Adam Perez |
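
The data above is displayed as a pipe-delimited table rather than comma-separated text, so pasting it unchanged into movies.csv would likely not load with the default comma separator of `pd.read_csv`. One possible conversion is sketched below; the helper name `pipe_table_to_csv` is an illustrative assumption, and the sample reuses three rows from the table.

```python
import csv
import io

import pandas as pd


def pipe_table_to_csv(table_text):
    """Convert a pipe-delimited table into CSV text (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out)  # quotes fields that contain commas automatically
    for line in table_text.strip().splitlines():
        line = line.strip()
        # Skip blank lines and the |----|----| separator row
        if not line or set(line) <= set("|-: "):
            continue
        # Remove exactly one outer pipe on each side, keeping empty edge cells
        if line.startswith("|"):
            line = line[1:]
        if line.endswith("|"):
            line = line[:-1]
        writer.writerow([cell.strip() for cell in line.split("|")])
    return out.getvalue()


sample = """
| title | rating | genre | release_year | runtime | votes | director | actors |
|---|---|---|---|---|---|---|---|
| Ne Zha: The Demon Boy Makes Havoc in the Sea | 9.8 | Animation | 2025 | 144 | | Jiaozi | |
| Product quite | 5.1 | Horror,Animation | 1998 | 154 | | Wesley Weaver | Sheila Blackburn, Christina Harris, Jacob Odonnell |
| Treat week | | Comedy | 2014 | 95 | 438314 | Mitchell Dickson | Hailey Richardson, Nancy Davis, Cynthia Luna MD |
"""

csv_text = pipe_table_to_csv(sample)
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)             # → (3, 8)
print(df.loc[1, "genre"])   # → Horror,Animation
```

To produce the file the script expects, the returned text could be written out with `open("movies.csv", "w", newline="", encoding="utf-8")`. Empty cells become missing values when the file is read back, which is what the cleaning step is written to handle.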

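Writing this took real effort, from the first idea to the finished piece. Please respect original work and do not plagiarize or reuse it without permission.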
