[R] Underline your idea with ggplot2

Preview:

Introduction: in the previous tutorials, we learned how to make a bar chart or histogram look better.

For example:

  1. How to select a graph = calibrate the geom part
  2. How to select variables = calibrate the aes part
  3. How to add a title = calibrate the labs part
  4. How to put the bars in a certain order = fct_infreq in the aes part
  5. How to change the colour and fill of the bars = fill in the aes part or in geom_bar, or a scale_fill option
  6. How to adjust transparency = alpha

Today we will learn how to add information to a plot, edit text elements such as the axis and legend text, and change the theme.

Add information to a plot with geom_text()

Example: add the count of each bar on a bar chart

R code:
ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.5)

Edit text elements and change the theme with theme()

Example: change the size and position of the axis text

R code:
ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 45, size = 10))

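Understand the guiding principles of data visualization

For example: balance, emphasis, movement, pattern, repetition, rhythm, and variety

Visualize two continuous variables with a scatter plot

Visualize two categorical variables with a bar chart, and learn new customization settings

Visualize one continuous variable together with one categorical variable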

Main Content

Add info in the plots:

First, let's look at how to add information to a plot. In R, we can use the geom_text() function to do this. For example, to display the count of each bar on a bar chart:

R code:
ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  # after_stat(count) replaces the ..count.. notation, deprecated since ggplot2 3.4.0
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.5)

  • ggplot(data = mpg, aes(x = class)): This sets up the basic plot using the mpg dataset and maps the class variable to the x-axis.

  • geom_bar(): This adds a bar layer to the plot, with one bar for each unique value of the class variable.

  • geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.5): This adds text labels to the plot. The stat = 'count' argument tells geom_text to count the observations in each class. The aes(label = after_stat(count)) mapping uses that computed count as the label of each bar. The vjust = -0.5 argument shifts the labels vertically so that they sit above the bars.

  • If vjust = 0.5 instead, each label is vertically centered on the top edge of its bar; values above 1 (for example vjust = 1.5) push the label down inside the bar.
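
To see what geom_text() actually receives, it can help to inspect the data that ggplot2 computes for the label layer. The following is a minimal sketch, assuming the built-in mpg dataset; the object names p and label_data are only illustrative.

```r
library(ggplot2)

p <- ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5)

# layer_data() returns the data frame ggplot2 computed for one layer;
# layer 2 is the geom_text() layer, so its label column holds the counts.
label_data <- layer_data(p, 2)

# Counts computed directly from the data, for comparison.
direct_counts <- table(mpg$class)

print(label_data[, c("x", "label")])
print(direct_counts)
```

The label column should contain one value per class, and together the labels should add up to the number of rows in mpg.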

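Next, let's discuss how to edit the text elements of a plot and change its theme. In R, we can use the theme() function to do this. For example, to change the size and position of the axis text:

R code: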
ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 45, size = 10))

Changing the text size and position on the x or y axis

R code:
# larger axis text
+ theme(axis.text.x = element_text(angle = 45, size = 10))

R code:
# smaller axis text
+ theme(axis.text.x = element_text(angle = 45, size = 7))

element_text() accepts the following arguments:

  • family : Specifies the font family used for the axis text. For example, family = "Arial" uses the Arial font, provided it is available on your system.

  • face : Specifies the font style of the axis text: "plain", "bold", "italic", or "bold.italic". For example, face = "bold" makes the axis text bold.

  • colour : Specifies the colour of the axis text. For example, colour = "red" makes the axis text red. (Tick marks are styled separately, through axis.ticks and element_line().)

  • size : Specifies the size of the axis text in points. For example, size = 12 makes the axis text 12 points in size.

  • angle : Specifies the angle at which the axis text is displayed. For example, angle = 45 rotates the axis text by 45 degrees counterclockwise.
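
The arguments listed above can be combined in a single element_text() call. This is a small sketch on the built-in mpg dataset; the specific values are arbitrary, and hjust = 1 is an extra argument (not discussed above) that right-aligns the rotated labels under their ticks. The family argument is left out because font availability depends on the system.

```r
library(ggplot2)

p <- ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  theme(axis.text.x = element_text(
    face = "bold",    # bold labels
    colour = "red",   # red text only; ticks are unaffected
    size = 12,        # 12 pt
    angle = 45,       # rotated 45 degrees counterclockwise
    hjust = 1         # right-align so labels end at their tick
  ))

print(p)
```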

Remove axis ticks and labels

You can remove axis ticks and labels by setting the corresponding theme() elements to element_blank() (or, less cleanly, by setting size = 0). For example:

R code:
library(ggplot2)

# Create a basic plot
p <- ggplot(data = mpg, aes(x = class)) +
  geom_bar() +
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.5)

# Remove x-axis ticks and labels
p + theme(axis.text.x = element_blank(),
          axis.ticks.x = element_blank())

# Remove y-axis ticks and labels
p + theme(axis.text.y = element_blank(),
          axis.ticks.y = element_blank())

Add the headcount to each bar in a graph that shows proportions

R code:
# Compact version. fct_infreq() comes from the forcats package;
# CUHKSZ_employment_survey_1 is the course dataset and must already be loaded.
ggplot(CUHKSZ_employment_survey_1,
       aes(fct_infreq(Occupation), y = after_stat(count / sum(count)), fill = Occupation)) +
  geom_bar() +
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = 1.5)

R code:
# Same plot with a title and axis labels; x = NULL removes the x-axis label
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_text(stat = 'count', aes(label = after_stat(count), y = after_stat(count / sum(count))), vjust = 1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = NULL, y = "Proportion")

Setting x = NULL in labs() removes the x-axis label entirely. You can also use x = "", which hides the text but still draws an empty label that reserves space under the axis.

R code:
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_text(stat = 'count', aes(label = after_stat(count), y = after_stat(count / sum(count))), vjust = 1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion")
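
Because the survey dataset is not bundled with R, here is a reproducible sketch of the same pattern on the built-in mpg dataset, with class standing in for Occupation (an assumption made purely for illustration). It needs the forcats package for fct_infreq().

```r
library(ggplot2)
library(forcats)  # provides fct_infreq()

p <- ggplot(mpg, aes(x = fct_infreq(class), fill = class)) +
  # bar height = share of all observations
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  # label = raw headcount, placed at the same height as the bar top
  geom_text(stat = "count",
            aes(label = after_stat(count), y = after_stat(count / sum(count))),
            vjust = 1.5) +
  labs(title = "Share of each car class in mpg", x = NULL, y = "Proportion")

# The bar heights are proportions, so they should add up to 1.
bar_data <- layer_data(p, 1)
print(sum(bar_data$y))
```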

If I want to underline that students are more likely to become "Professional and technician" or "Clerical personnel", I can use the same colour for those categories with scale_fill_manual(). Its general form is:

R code:
+ scale_fill_manual(values = c("color1", "color2", ...))

R code:
# Define custom colours; the names must match the levels of Occupation exactly
custom_colors <- c("Professional and technician" = "red", "Clerical personnel" = "red", "Other" = "grey")

# Create the plot with custom colours
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  geom_text(stat = 'count', aes(label = after_stat(count), y = after_stat(count / sum(count))), vjust = 1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion") +
  scale_fill_manual(values = custom_colors)
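
When a variable has many levels, naming every level in the colour vector gets tedious. One alternative, sketched here on the built-in mpg dataset, is to derive a two-level helper column and map fill to it. The column name emphasis and the choice of highlighted classes are hypothetical.

```r
library(ggplot2)

# Hypothetical choice of categories to emphasise.
highlight <- c("suv", "compact")

mpg_flagged <- mpg
mpg_flagged$emphasis <- ifelse(mpg_flagged$class %in% highlight,
                               "highlighted", "other")

p <- ggplot(mpg_flagged, aes(x = class, fill = emphasis)) +
  geom_bar() +
  # only two colours are needed, whatever the number of classes
  scale_fill_manual(values = c(highlighted = "red", other = "grey70"))

print(p)
```

The legend then shows only the two emphasis groups rather than one entry per category.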

If I want to underline that more than 10% of the students become "Professional and technician", "Clerical personnel" or "managerial personnel", those bars should have a different colour and I should add a horizontal line at the 10% level:

R code:
+ geom_hline(yintercept = 0.1)

R code:
# custom_colors is the named colour vector defined in the previous example
ggplot(CUHKSZ_employment_survey_1, aes(x = fct_infreq(Occupation), fill = Occupation)) +
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.6)) +
  geom_text(stat = 'count', aes(label = after_stat(count), y = after_stat(count / sum(count))), vjust = 1.5) +
  labs(title = "Occupation of CUHK Shenzhen students after graduation", x = "", y = "Proportion") +
  scale_fill_manual(values = custom_colors) +
  geom_hline(yintercept = 0.1)
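
If the colour should follow the 10% rule automatically instead of being assigned by hand, one option is to compute the proportions first and map fill to a logical column. This sketch uses the built-in mpg dataset as a stand-in; the data frame shares and its column names are illustrative assumptions.

```r
library(ggplot2)

# Pre-compute the share of each class so it can drive the fill colour.
shares <- as.data.frame(prop.table(table(mpg$class)))
names(shares) <- c("class", "proportion")
shares$above <- shares$proportion > 0.10

p <- ggplot(shares, aes(x = reorder(class, -proportion), y = proportion, fill = above)) +
  geom_col() +                                        # heights are already computed
  geom_hline(yintercept = 0.10, linetype = "dashed") + # the 10% reference line
  scale_fill_manual(values = c(`TRUE` = "red", `FALSE` = "grey70"), guide = "none") +
  labs(x = NULL, y = "Proportion")

print(shares)
```

geom_col() is used instead of geom_bar() because the bar heights come from the proportion column rather than from counting rows.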

Assess whether your data are normally distributed by overplotting a density curve on your histogram

Note that geom_density() draws a kernel density estimate of the data rather than a fitted Gaussian curve; a bell-shaped estimate suggests that the data are approximately normal.

R code:
ggplot(CUHKSZ_employment_survey_1, aes(x = Monthly_salary_19, y = after_stat(density))) +
  geom_histogram(binwidth = 500, fill = "blue", colour = "black", alpha = 0.5, boundary = 8000) +
  geom_density(color = "red") +
  labs(title = "Histogram of Monthly Salary with Density Curve Overlay", x = "Monthly Salary", y = "Density")

Use after_stat(density) here rather than the older ..density.. or stat(density) notation. With stat(density), ggplot2 3.4.0 and later print the following warning:

Warning message:
`stat(density)` was deprecated in ggplot2 3.4.0.
ℹ Please use `after_stat(density)` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
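
To overlay an actual Gaussian curve rather than a kernel density estimate, one possible approach is stat_function() with dnorm(), using the sample mean and standard deviation. The sketch below uses hwy from the built-in mpg dataset as a stand-in for the salary variable; the binwidth of 2 is an arbitrary choice.

```r
library(ggplot2)

# Parameters of the normal curve, estimated from the sample.
hwy_mean <- mean(mpg$hwy)
hwy_sd <- sd(mpg$hwy)

p <- ggplot(mpg, aes(x = hwy)) +
  # histogram on the density scale so it is comparable with dnorm()
  geom_histogram(aes(y = after_stat(density)), binwidth = 2,
                 fill = "blue", colour = "black", alpha = 0.5) +
  # theoretical normal density with the same mean and sd as the data
  stat_function(fun = dnorm, args = list(mean = hwy_mean, sd = hwy_sd),
                colour = "red")

print(p)
```

If the histogram departs clearly from the red curve (skew, several peaks), that is visual evidence against normality; the overlay is a diagnostic, not a formal test.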

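Underline the individuals who are overweight in the BMI histogram = change the colour of some bars in a histogram

Decompose the histogram into two layers using the subset() function

R code:
# boundary = 25 aligns the bins with the cut-off so no bar mixes both groups;
# >= 25 keeps individuals with a BMI of exactly 25, who were dropped by < 25 / > 25
ggplot(SEE_students_data_2, aes(x = BMI)) +
  geom_histogram(data = subset(SEE_students_data_2, BMI < 25), fill = "blue", alpha = 0.5, binwidth = 1, boundary = 25, color = "black") +
  geom_histogram(data = subset(SEE_students_data_2, BMI >= 25), fill = "red", alpha = 0.5, binwidth = 1, boundary = 25, color = "black")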
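
An alternative to stacking two subset() layers is to map fill to the threshold condition inside a single geom_histogram(). This is a sketch on the built-in mpg dataset, with hwy and a cut-off of 25 standing in for BMI; closed = "left" is added so that a value of exactly 25 falls in the bin that starts at 25.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = hwy, fill = hwy >= 25)) +
  # boundary = 25 puts a bin edge on the cut-off; closed = "left" makes
  # bins of the form [24, 25), [25, 26), so 25 itself lands on the red side
  geom_histogram(binwidth = 1, boundary = 25, closed = "left",
                 colour = "black", alpha = 0.5) +
  scale_fill_manual(values = c(`FALSE` = "blue", `TRUE` = "red"),
                    guide = "none")

print(p)
```

With this form the colour rule lives in one place, so changing the threshold only requires editing the condition and the boundary.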