Using R for public opinion monitoring and visualization is, for me, manageable overall. Opinion monitoring usually means collecting text data (e.g., social media posts, news comments), running sentiment analysis on it, and finally presenting the results in charts. The steps look simple, but they are anything but.

Below is a complete example of how I use R for opinion monitoring and visualization. The workflow covers text sentiment analysis and time-trend visualization:
R
# Load required packages
library(tidyverse)   # data wrangling and visualization
library(tidytext)    # text analysis
library(lubridate)   # date handling
library(wordcloud2)  # word clouds
library(ggplot2)     # plotting (also attached by tidyverse)
library(plotly)      # interactive charts

# 1. Generate simulated opinion data (replace with real data in practice)
set.seed(123)
n <- 500  # sample size

# Build the simulated dataset
sentiment_data <- tibble(
  id = 1:n,
  # each post is 1-3 phrases sampled with replacement, joined by commas
  content = replicate(n, paste(sample(c("产品很好服务优秀",
                                        "体验很差客服态度恶劣",
                                        "性价比高会回购",
                                        "物流慢包装破损",
                                        "功能强大界面美观",
                                        "系统卡顿更新后更糟",
                                        "售后响应及时",
                                        "虚假宣传失望透顶"),
                                      sample(1:3, 1), replace = TRUE), collapse = ",")),
  date = sample(seq.Date(as.Date("2024-01-01"), as.Date("2024-06-30"), by = "day"), n, replace = TRUE),
  source = sample(c("微博", "微信公众号", "知乎", "小红书", "抖音"), n, replace = TRUE),
  author = paste0("用户", sample(1000:9999, n))
)
# 2. Sentiment analysis
# Custom sentiment dictionary (entries must be unique, otherwise the join below duplicates rows)
sentiment_dict <- tibble(
  word = c("很好", "优秀", "强大", "美观", "及时", "高", "满意", "回购",
           "差", "慢", "破损", "卡顿", "失望", "虚假", "糟糕", "恶劣"),
  sentiment = c(rep("正面", 8), rep("负面", 8))
)

# Sentiment analysis function: tokenize, match against the dictionary,
# then score each post as (positive hits - negative hits)
analyze_sentiment <- function(data) {
  data %>%
    unnest_tokens(word, content) %>%
    left_join(sentiment_dict, by = "word") %>%
    filter(!is.na(sentiment)) %>%
    count(id, sentiment) %>%
    pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
    mutate(sentiment_score = 正面 - 负面)
}
# Run the sentiment analysis and attach the scores back to the original posts
sentiment_results <- sentiment_data %>%
  analyze_sentiment() %>%
  right_join(sentiment_data, by = "id") %>%
  replace_na(list(正面 = 0, 负面 = 0, sentiment_score = 0)) %>%
  mutate(sentiment_label = case_when(
    sentiment_score > 0 ~ "正面",
    sentiment_score < 0 ~ "负面",
    TRUE ~ "中性"
  ))
# 3. Overview statistics
overall_stats <- sentiment_results %>%
  summarise(
    total_posts = n(),
    positive_rate = mean(sentiment_label == "正面") * 100,
    negative_rate = mean(sentiment_label == "负面") * 100,
    avg_sentiment = mean(sentiment_score)
  )
# 4. Visualizations
# 4.1 Sentiment distribution pie chart
sentiment_pie <- sentiment_results %>%
  count(sentiment_label) %>%
  plot_ly(labels = ~sentiment_label, values = ~n, type = 'pie') %>%
  layout(title = "舆情情感分布")
# 4.2 Sentiment trend (weekly)
sentiment_trend <- sentiment_results %>%
  mutate(week = floor_date(date, "week")) %>%
  group_by(week) %>%
  summarise(avg_score = mean(sentiment_score)) %>%
  ggplot(aes(x = week, y = avg_score)) +
  geom_line(color = "steelblue", linewidth = 1) +
  geom_point(color = "darkorange", size = 3) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  labs(title = "每周情感趋势变化", x = "日期", y = "情感得分均值") +
  theme_minimal()
# 4.3 Sentiment by platform
platform_sentiment <- sentiment_results %>%
  group_by(source) %>%
  summarise(positive_rate = mean(sentiment_label == "正面") * 100) %>%
  ggplot(aes(x = reorder(source, positive_rate), y = positive_rate, fill = source)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  labs(title = "各平台正面评价比例", x = "平台", y = "正面评价比例(%)") +
  theme_minimal()
# 4.4 Keyword word cloud
word_freq <- sentiment_results %>%
  unnest_tokens(word, content) %>%
  anti_join(stop_words, by = "word") %>%  # tidytext's stop_words is English-only; substitute a Chinese stop-word list for real data
  count(word) %>%
  filter(n > 5) %>%
  arrange(desc(n))

wordcloud <- wordcloud2(word_freq, size = 1, shape = 'circle')
# 5. Print the results
print(paste("总舆情数量:", overall_stats$total_posts))
print(paste("正面率:", round(overall_stats$positive_rate, 1), "%"))
print(paste("负面率:", round(overall_stats$negative_rate, 1), "%"))

# Show the charts
sentiment_pie
ggplotly(sentiment_trend)
platform_sentiment
wordcloud

# 6. Save key outputs (optional)
# ggsave("sentiment_trend.png", plot = sentiment_trend, width = 10, height = 6)
# htmlwidgets::saveWidget(wordcloud, "wordcloud.html")
Notes for real-world use:
1. Data acquisition: in a real deployment, replace the simulated-data step with one of the following (a minimal scraping sketch follows):
- platform APIs (e.g., Weibo, Twitter, Reddit)
- web scraping (the rvest package)
- database connections (RMySQL/RSQLite)
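For the rvest route, here is a minimal sketch; the URL and the ".comment-item" CSS selector are placeholders that must be adapted to the actual target page:
R
# Scrape comments from a (hypothetical) news page and shape them like sentiment_data
library(rvest)
page <- read_html("https://www.example.com/news/12345")
raw_comments <- page %>%
  html_elements(".comment-item") %>%  # site-specific selector for comment nodes
  html_text2()
scraped_data <- tibble(
  id = seq_along(raw_comments),
  content = raw_comments,
  date = Sys.Date(),        # real pages usually expose a post date worth parsing instead
  source = "网页评论",
  author = NA_character_
)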
2. Stronger sentiment analysis:
- use a more comprehensive lexicon (e.g., the BosonNLP sentiment dictionary)
- use machine-learning models (e.g., the text2vec package)
- handle negation (e.g., "不" + "好" should read as negative; see the bigram sketch below)
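A minimal bigram-based negation sketch, assuming a small hand-picked list of negation words; note that the default tokenizer may not split a Chinese negation from the word that follows it, so a dedicated segmenter such as jiebaR usually gives better splits:
R
# Flip the polarity of a sentiment word when it directly follows a negation word
negation_words <- c("不", "没", "没有", "无")
negated_terms <- sentiment_data %>%
  unnest_tokens(bigram, content, token = "ngrams", n = 2) %>%
  separate(bigram, into = c("word1", "word2"), sep = " ") %>%
  filter(word1 %in% negation_words) %>%
  inner_join(sentiment_dict, by = c("word2" = "word")) %>%
  mutate(sentiment = if_else(sentiment == "正面", "负面", "正面"))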
3. Extensions:
R
# Topic clustering example: build a document-term matrix and fit a 4-topic LDA model
library(topicmodels)
dtm <- sentiment_results %>%
  unnest_tokens(word, content) %>%
  count(id, word) %>%
  cast_dtm(id, word, n)
lda_model <- LDA(dtm, k = 4, control = list(seed = 1234))
topics <- tidy(lda_model, matrix = "beta")  # per-topic word probabilities
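To see what each topic is about, the highest-probability terms per topic can be pulled from that beta table, for example:
R
# Top 5 terms per topic, ranked by the per-topic word probability beta
top_terms <- topics %>%
  group_by(topic) %>%
  slice_max(beta, n = 5) %>%
  ungroup() %>%
  arrange(topic, desc(beta))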
4. Alerting:
R
# List negative posts, newest first, for manual follow-up
negative_alerts <- sentiment_results %>%
  filter(sentiment_label == "负面") %>%
  arrange(desc(date)) %>%
  select(date, source, content, sentiment_score)
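To turn this into an actual trigger, a simple threshold rule can flag unusually negative days; the 30% cutoff below is only an illustrative assumption:
R
# Flag days on which the share of negative posts exceeds 30% (threshold is illustrative)
daily_alerts <- sentiment_results %>%
  group_by(date) %>%
  summarise(posts = n(),
            negative_rate = mean(sentiment_label == "负面") * 100) %>%
  filter(negative_rate > 30) %>%
  arrange(desc(date))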
What the visualizations show:
- interactive sentiment trend line (hover to read exact values)
- dynamic word cloud (click-interactive)
- platform comparison bar chart (sentiment across channels at a glance)
- sentiment distribution pie chart (overall mood overview)
Before running the code, make sure all required packages are installed:
R
install.packages(c("tidyverse", "tidytext", "lubridate", "wordcloud2", "plotly",
                   "topicmodels"))  # topicmodels is only needed for the topic-clustering extension
Finally, a few deployment tips: schedule an automatic data pull (e.g., with the cronR package), build a Shiny app for live reports, and add e-mail alerts (e.g., with the sendmailR package) so the whole pipeline can run unattended.
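As a sketch of the scheduling piece, assuming the pipeline above is saved as a script at /path/to/sentiment_monitor.R (a hypothetical path) and you are on a Unix-like system, cronR can register it as a daily job:
R
library(cronR)
# Register the monitoring script as a daily cron job at 07:00 (Unix-like systems only)
cmd <- cron_rscript("/path/to/sentiment_monitor.R")
cron_add(cmd, frequency = "daily", at = "07:00",
         id = "sentiment_monitor", description = "daily opinion monitoring run")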