Winter break course: learning resources for R and ggplot2

Learning Python

GitHub - CodementorIO/Python-Learning-Resources

R in Action: R basics

https://github.com/biotrainee/RiA2/blob/master/Ch01%20Introduction%20to%20R.R

PCA, UMAP, and t-SNE analysis

Chapter 3 The Seurat object | scRNAseq Analysis in R with Seurat
Most of today's workshop will follow the Seurat PBMC tutorial (reproduced in the next section). We'll load raw counts data, do some QC, and set up various useful information in a Seurat object.
https://swbioinf.github.io/scRNAseqInR_Doco/seuratobject.html

Introductory R for single-cell analysis

Advanced R, matching and reordering shortened | Introduction to R - ARCHIVED

Introduction to R - ARCHIVED
https://hbctraining.github.io/Intro-to-R/

Learning Objectives

  1. R syntax: Understand the different 'parts of speech'.
  2. Data types and structures in R: Describe the various data types and data structures.
  3. Data inspection and wrangling: Demonstrate the utilization of functions and indices to inspect and subset data from various data structures.
  4. Visualizing data: Demonstrate the use of the ggplot2 package to create plots for easy data visualization.

The best single-cell tutorial

3. Raw data processing --- Single-cell best practices
https://www.sc-best-practices.org/introduction/raw_data_processing.html

List of functions for data inspection

We already saw how the functions head() and str() can be useful to check the content and the structure of a data.frame. Here is a non-exhaustive list of functions to get a sense of the content/structure of data.

  • All data structures - content display:
    • str(): compact display of an object's structure and contents (similar to what the Environment pane shows)
    • class(): data type (e.g. character, numeric, etc.) of vectors and data structure of dataframes, matrices, and lists.
    • summary(): detailed display, including descriptive statistics, frequencies
    • head(): will print the beginning entries for the variable
    • tail(): will print the end entries for the variable
  • Vector and factor variables:
    • length(): returns the number of elements in the vector or factor
  • Dataframe and matrix variables:
    • dim(): returns dimensions of the dataset
    • nrow(): returns the number of rows in the dataset
    • ncol(): returns the number of columns in the dataset
    • rownames(): returns the row names in the dataset
    • colnames(): returns the column names in the dataset
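
As a quick illustration of the functions listed above, here is a minimal sketch that runs each of them on `mtcars`, a data frame that ships with base R. The choice of `mtcars` and the variable names are assumptions made for this example only; any data frame would work the same way.

```r
# mtcars is a built-in data frame: 32 cars (rows) by 11 numeric variables (columns)
df <- mtcars

# Content display, works on any data structure
str(df)               # compact overview: one line per column
class(df)             # data structure of the whole object: "data.frame"
class(df$mpg)         # data type of a single column: "numeric"
summary(df$mpg)       # descriptive statistics for one column
head(df, 3)           # first three rows
tail(df, 3)           # last three rows

# Vectors and factors
n_cars <- length(df$mpg)   # number of elements in the mpg vector

# Data frames and matrices
shape <- dim(df)           # c(rows, columns)
n_row <- nrow(df)
n_col <- ncol(df)
first_rows <- rownames(df)[1:3]
first_cols <- colnames(df)[1:3]

print(shape)
print(first_cols)
```

Storing the results in variables, rather than only printing them, makes it easy to reuse the dimensions later, for example when checking that a subset kept the expected number of rows.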

Data subsetting with base R: vectors and factors | Introduction to R - ARCHIVED

QC

Seurat: Quality control
https://nbisweden.github.io/workshop-scRNAseq/labs/compiled/seurat/seurat_01_qc.html#Calculate_QC

QC

Chapter 3 The Seurat object | scRNAseq Analysis in R with Seurat

Sample sex
When working with human or animal samples, you should ideally constrain your experiments to a single sex to avoid including sex bias in the conclusions. However, this may not always be possible. By looking at reads from chromosome Y (males) and XIST (X-inactive specific transcript) expression (mainly female), it is quite easy to determine the sex of each sample. It can also be a good way to detect sample mix-ups, if the sex in the sample metadata does not agree with the computational predictions.

To get chromosome information for all genes, you should ideally parse the information from the gtf file that you used in the mapping pipeline, as it has the exact same annotation version/gene naming. However, it may not always be available, as in this case where we have downloaded public data. Hence, we will use biomart to fetch chromosome information. As the biomart instances are quite often unresponsive, you can try the code below, but if it fails, we have the file with gene annotations on github here. Make sure you put it at the correct location for the path genes.file to work.

genes.file = "data/results/genes.table.csv"

if (!file.exists(genes.file)) {
    suppressMessages(require(biomaRt))

    # initialize connection to mart, may take some time if the sites are
    # unresponsive.
    mart <- useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl")

    # fetch chromosome info plus some other annotations
    genes.table <- try(biomaRt::getBM(attributes = c("ensembl_gene_id", "external_gene_name",
        "description", "gene_biotype", "chromosome_name", "start_position"), mart = mart,
        useCache = FALSE))

    # create the output directory, including any missing parent directories
    if (!dir.exists("data/results")) {
        dir.create("data/results", recursive = TRUE)
    }
    if (is.data.frame(genes.table)) {
        write.csv(genes.table, file = genes.file)
    }

    if (!file.exists(genes.file)) {
        download.file("https://raw.githubusercontent.com/NBISweden/workshop-scRNAseq/master/labs/misc/genes.table.csv",
            destfile = "data/results/genes.table.csv")
        genes.table = read.csv(genes.file)
    }

} else {
    genes.table = read.csv(genes.file)
}

genes.table <- genes.table[genes.table$external_gene_name %in% rownames(data.filt),
    ]
Now that we have the chromosome information, we can calculate, per cell, the proportion of reads that come from chromosome Y.

chrY.gene = genes.table$external_gene_name[genes.table$chromosome_name == "Y"]

data.filt$pct_chrY = colSums(data.filt@assays$RNA@counts[chrY.gene, ])/colSums(data.filt@assays$RNA@counts)
Then plot XIST expression vs chrY proportion. As you can see, the samples are clearly on either side, even if some cells do not have detection of either.

FeatureScatter(data.filt, feature1 = "XIST", feature2 = "pct_chrY")
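
To make the per-cell chrY proportion above concrete, here is a minimal base R sketch on a toy count matrix. The counts are made up, and the small gene list stands in for the `chrY.gene` vector that the tutorial derives from the annotation table; nothing here depends on Seurat or biomaRt.

```r
# Toy count matrix: rows are genes, columns are cells (all values are made up)
counts <- matrix(
  c( 5, 0, 2,
     0, 0, 1,
    10, 8, 7,
     5, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("UTY", "DDX3Y", "ACTB", "XIST"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical stand-in for the genes annotated to chromosome Y
chrY.gene <- c("UTY", "DDX3Y")

# Per cell: reads from chrY genes divided by all reads in that cell.
# drop = FALSE keeps a matrix even when only one chrY gene is present,
# so colSums() still works.
pct_chrY <- colSums(counts[chrY.gene, , drop = FALSE]) / colSums(counts)

print(pct_chrY)
# cell1: 5 / 20 = 0.25, cell2: 0 / 10 = 0, cell3: 3 / 10 = 0.3
```

In this toy data, cell2 has no chrY reads but does express XIST, which is the pattern the scatter plot above uses to separate female from male samples.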

Standard workflow, without Harmony

Single Cell RNA-Seq Analysis and Visualization Workshop

Single-cell workflow starting from raw FASTQ data

4 Data Preprocessing | ANALYSIS OF SINGLE CELL RNA-SEQ DATA

12 Batch Effects | ANALYSIS OF SINGLE CELL RNA-SEQ DATA
https://broadinstitute.github.io/2019_scWorkshop/batch-effects.html

ANALYSIS OF SINGLE CELL RNA-SEQ DATA
https://broadinstitute.github.io/2019_scWorkshop/index.html#course-overview

Plotting

Ch 3: Data visualization | Yet another 'R for Data Science' study guide
Notes and solutions to Garrett Grolemund and Hadley Wickham's 'R for Data Science'
https://brshallo.github.io/r4ds_solutions/03-data-visualization.html#aesthetic-mappings

ggplot2: the most complete gallery for borrowing plot ideas, including animated plots

https://exts.ggplot2.tidyverse.org/gallery/

The roots of ggplot2's grammar

ggplot2: Elegant Graphics for Data Analysis (3e) - 1 Introduction

ggplot2 is designed to work iteratively. You start with a layer that shows the raw data. Then you add layers of annotations and statistical summaries. This allows you to produce graphics using the same structured thinking that you would use to design an analysis. This reduces the distance between the plot in your head and the one on the page.

The grammar of graphics: The grammar of graphics is an answer to the question of what a statistical graphic is. ggplot2 (Wickham 2009) builds on Wilkinson's grammar by focussing on the primacy of layers and adapting it for use in R. In brief, the grammar tells us that a graphic maps the data to the aesthetic attributes (colour, shape, size) of geometric objects (points, lines, bars).

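Every plot is composed of the data and a mapping, which describes how the data's variables are mapped to aesthetic attributes. There are five mapping components:

  1. A layer is a collection of geometric elements and statistical transformations. Geometric elements (geoms for short) represent what you actually see in the plot: points, lines, polygons, and so on. Statistical transformations (stats for short) summarise the data: for example, binning and counting observations to create a histogram, or fitting a linear model.

  2. Scales map values in the data space to values in the aesthetic space. This includes the use of colour, shape, or size. Scales also draw the legend and axes, which make it possible to read the original data values from the plot (an inverse mapping).

  3. A coordinate system (coord) describes how data coordinates are mapped to the plane of the graphic. It also provides axes and gridlines to help read the plot. We normally use the Cartesian coordinate system, but others are available, including polar coordinates and map projections.

  4. A facet specifies how to break up and display subsets of data as small multiples. This is also known as conditioning or latticing/trellising.

  5. A theme controls the finer points of display, such as the font size and background colour. While the defaults in ggplot2 have been chosen with care, you may need to consult other references to create an attractive plot. A good starting place is Tufte's early works (Tufte 1990, 1997, 2001).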
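
The five components above can be seen side by side in a single plot specification. The following is a minimal sketch, assuming the ggplot2 package is installed and using its bundled `mpg` dataset; the particular scale, axis limits, facet variable, and theme are arbitrary choices for illustration, not recommendations from the book.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = class)) +  # data and aesthetic mapping
  # Layer 1: point geom, drawing the raw observations
  geom_point() +
  # Layer 2: a statistical summary, here a fitted linear model;
  # group = 1 fits one line per panel instead of one per class
  geom_smooth(aes(group = 1), method = "lm", formula = y ~ x,
              se = FALSE, colour = "black") +
  # Scale: how values of class are converted to colours (also draws the legend)
  scale_colour_viridis_d() +
  # Coord: Cartesian coordinates, zoomed to a chosen x range
  coord_cartesian(xlim = c(1, 7)) +
  # Facet: one small multiple per drive train
  facet_wrap(~drv) +
  # Theme: non-data details such as base font size and background
  theme_minimal(base_size = 12)

print(p)
```

Each `+` adds one component, so removing any single line (for example the `facet_wrap()` call) still leaves a valid plot; ggplot2 falls back to its defaults for whichever component is not specified.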

2.3 Key components

Every ggplot2 plot has three key components:

  1. Data,

  2. A set of aesthetic mappings between variables in the data and visual properties, and

  3. At least one layer which describes how to render each observation. Layers are usually created with a geom function.

Here's a simple example:

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()
This produces a scatterplot defined by:

  • Data: mpg.
  • Aesthetic mapping: engine size mapped to x position, fuel economy to y position.
  • Layer: points.

Pay attention to the structure of this function call: data and aesthetic mappings are supplied in ggplot(), then layers are added on with +. This is an important pattern, and as you learn more about ggplot2 you'll construct increasingly sophisticated plots by adding on more types of components.

Almost every plot maps a variable to x and y, so naming these aesthetics is tedious. For that reason, the first two unnamed arguments to aes() are mapped to x and y. This means that the following code is identical to the example above:

ggplot(mpg, aes(displ, hwy)) +
  geom_point()

We'll stick to that style throughout the book, so don't forget that the first two arguments to aes() are x and y. Note that we've put each command on a new line. We recommend doing this in your own code, so it's easy to scan a plot specification and see exactly what's there. In this chapter, we'll sometimes use just one line per plot, because it makes it easier to see the differences between plot variations.

2.4 Colour, size, shape and other aesthetic attributes

To add additional variables to a plot, we can use other aesthetics like colour, shape, and size (NB: while we use British spelling throughout this book, ggplot2 also accepts American spellings). These work in the same way as the x and y aesthetics, and are added into the call to aes():

  • aes(displ, hwy, colour = class)
  • aes(displ, hwy, shape = drv)
  • aes(displ, hwy, size = cyl)

ggplot2 takes care of the details of converting data (e.g., 'f', 'r', '4') into aesthetics (e.g., 'red', 'yellow', 'green') with a scale. There is one scale for each aesthetic mapping in a plot. The scale is also responsible for creating a guide, an axis or legend, that allows you to read the plot, converting aesthetic values back into data values. For now, we'll stick with the default scales provided by ggplot2. You'll learn how to override them in Chapter 11.

To learn more about those outlying variables in the previous scatterplot, we could map the class variable to colour:

ggplot(mpg, aes(displ, hwy, colour = class)) +
  geom_point()

R for Data Science

R for Data Science: Exercise Solutions

Learning Python

Intro to Python
https://ourcodingclub.github.io/tutorials/python-intro/
