A Tour of Cool Python Libraries: Third-Party Library Pandas (088)

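I. Usage in Detail

371. pandas.Series.sparse.density attribute

371-1. Syntax

371-2. Parameters

371-3. Function

371-4. Return value

371-5. Notes

371-6. Usage

371-6-1. Data preparation

371-6-2. Code example

371-6-3. Output

372. pandas.Series.sparse.fill_value attribute

372-1. Syntax

372-2. Parameters

372-3. Function

372-4. Return value

372-5. Notes

372-6. Usage

372-6-1. Data preparation

372-6-2. Code example

372-6-3. Output

373. pandas.Series.sparse.sp_values attribute

373-1. Syntax

373-2. Parameters

373-3. Function

373-4. Return value

373-5. Notes

373-6. Usage

373-6-1. Data preparation

373-6-2. Code example

373-6-3. Output

374. pandas.Series.sparse.from_coo method

374-1. Syntax

374-2. Parameters

374-3. Function

374-4. Return value

374-5. Notes

374-6. Usage

374-6-1. Data preparation

374-6-2. Code example

374-6-3. Output

375. pandas.Series.sparse.to_coo method

375-1. Syntax

375-2. Parameters

375-3. Function

375-4. Return value

375-5. Notes

375-6. Usage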

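I. Usage in Detail

371. pandas.Series.sparse.density attribute

371-1. Syntax

# 371. pandas.Series.sparse.density attribute
pandas.Series.sparse.density
The percent of non-fill_value points, as decimal.

371-2. Parameters

None

371-3. Function

Computes and returns the fraction of points in a sparse Series that differ from fill_value, that is, the points that are explicitly stored. When fill_value is NaN (the default for floating-point data), this equals the fraction of non-missing values.

371-4. Return value

Returns a float between 0 and 1:

0 means that no point in the sparse Series differs from fill_value.
1 means that every point in the sparse Series differs from fill_value.
Any other value is the ratio of non-fill_value points to the total number of points; for example, 0.25 means that 25% of the points differ from fill_value.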
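The relationship between density, the number of stored points, and the total length can be checked directly. The following is a minimal sketch; the sample data and the variable names are made up for illustration, and it uses fill_value=0 instead of NaN to show that density is defined relative to fill_value rather than to missing values.

```python
import pandas as pd

# Eight points, two of which differ from the fill value 0 (illustrative data).
arr = pd.arrays.SparseArray([0, 0, 1, 0, 2, 0, 0, 0], fill_value=0)
s = pd.Series(arr)

npoints = s.sparse.npoints   # number of explicitly stored (non-fill_value) points
density = s.sparse.density   # stored points divided by the total length

print(npoints, len(s), density)
```

Here two of the eight points are stored, so the density is 2 / 8.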

371-5. Notes

None

371-6. Usage

371-6-1. Data preparation

None

371-6-2. Code example

# 371. pandas.Series.sparse.density attribute
import pandas as pd
import numpy as np
# Create a sparse Series (float data, so fill_value defaults to NaN)
s = pd.Series(pd.arrays.SparseArray([1, np.nan, 2, np.nan, 3]))
# Get the density of non-fill_value (here: non-missing) points
density = s.sparse.density
print("Density of non-missing values:", density)

371-6-3. Output

# 371. pandas.Series.sparse.density attribute
# Density of non-missing values: 0.6

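372. pandas.Series.sparse.fill_value attribute

372-1. Syntax

# 372. pandas.Series.sparse.fill_value attribute
pandas.Series.sparse.fill_value
Elements in data that are fill_value are not stored.

For memory savings, this should be the most common value in the array.

372-2. Parameters

None

372-3. Function

Returns the value that stands in for every element of the sparse array that is not explicitly stored. For example, if fill_value is 0, all unstored elements are treated as 0. This is what makes the sparse structure memory-efficient: only the elements that differ from fill_value need to be stored.

372-4. Return value

Returns the fill value of the sparse array as a scalar. If none was specified at construction, the default depends on the dtype (for example, 0 for integer data and NaN for floating-point data). For example:

If you create a sparse array with fill_value=0, this attribute returns 0.
If you create a sparse array with fill_value=-1, this attribute returns -1.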
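To make the "most common value" advice concrete, the sketch below first prints the default fill_value for integer and floating-point data, then stores the same illustrative list twice with different fill values and compares how many points end up explicitly stored. The data and variable names are assumptions chosen for the demonstration.

```python
import numpy as np
import pandas as pd

# The default fill_value depends on the dtype of the data.
int_fill = pd.Series(pd.arrays.SparseArray([0, 1, 0])).sparse.fill_value
float_fill = pd.Series(pd.arrays.SparseArray([0.0, 1.0, np.nan])).sparse.fill_value
print(int_fill, float_fill)

# The same data stored with two different fill values (illustrative data).
data = [0, 0, 0, 0, 7, 0, 0, 9]
common = pd.Series(pd.arrays.SparseArray(data, fill_value=0))  # 0 is the most common value
rare = pd.Series(pd.arrays.SparseArray(data, fill_value=7))    # 7 occurs only once

# Fewer stored points means less memory used for the values.
print(common.sparse.npoints)
print(rare.sparse.npoints)
```

Choosing the most common value as fill_value leaves only two points to store, whereas a rare fill value forces seven of the eight points to be stored.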

372-5. Notes

None

372-6. Usage

372-6-1. Data preparation

None

372-6-2. Code example

# 372. pandas.Series.sparse.fill_value attribute
import pandas as pd
import numpy as np
# Create a sparse Series; for float data the default fill_value is NaN
s = pd.Series(pd.arrays.SparseArray([1, np.nan, 2, np.nan, 3]))
# Inspect the fill_value attribute
print(s.sparse.fill_value)
# Create a sparse Series with a custom fill_value
s_custom = pd.Series(pd.arrays.SparseArray([1, -1, 2, -1, 3], fill_value=-1))
# Inspect the custom fill_value
print(s_custom.sparse.fill_value)

372-6-3. Output

# 372. pandas.Series.sparse.fill_value attribute
# nan
# -1

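373. pandas.Series.sparse.sp_values attribute

373-1. Syntax

# 373. pandas.Series.sparse.sp_values attribute
pandas.Series.sparse.sp_values
An ndarray containing the non-fill_value values.

373-2. Parameters

None

373-3. Function

Extracts the data actually stored in the sparse array, that is, the values that differ from fill_value. Because a sparse array does not store elements equal to fill_value, sp_values contains only the explicitly stored, non-default elements.

373-4. Return value

Returns a NumPy array (ndarray) of the values actually stored in the sparse array. These are the values that differ from fill_value, the part of the sparse data that has to be stored and operated on.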
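One way to see what sp_values holds is to compare it with the dense form of the same Series. The sketch below is illustrative only: the data and the names stored, dense, and non_fill are assumptions, and the comparison relies on the array having been built directly from a list, so that no stored value equals fill_value.

```python
import numpy as np
import pandas as pd

s = pd.Series(pd.arrays.SparseArray([5, 0, 0, 8, 0], fill_value=0))

stored = s.sparse.sp_values   # NumPy array of the explicitly stored values
dense = s.sparse.to_dense()   # ordinary Series with the fill values materialised
non_fill = dense[dense != s.sparse.fill_value].to_numpy()

print(type(stored).__name__, stored)
print(np.array_equal(stored, non_fill))   # stored values match the non-fill entries
print(len(stored) == s.sparse.npoints)    # one stored value per stored point
```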

373-5. Notes

None

373-6. Usage

373-6-1. Data preparation

None

373-6-2. Code example

# 373. pandas.Series.sparse.sp_values attribute
import pandas as pd
# Create a sparse Series; for integer data the default fill_value is 0
s = pd.Series(pd.arrays.SparseArray([1, 0, 2, 0, 3]))
# Inspect the sp_values attribute
print(s.sparse.sp_values)
# Create a sparse Series with a custom fill_value
s_custom = pd.Series(pd.arrays.SparseArray([1, -1, 2, -1, 3], fill_value=-1))
# Inspect sp_values for the custom fill_value
print(s_custom.sparse.sp_values)

373-6-3. Output

# 373. pandas.Series.sparse.sp_values attribute
# [1 2 3]
# [1 2 3]

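374. pandas.Series.sparse.from_coo method

374-1. Syntax

# 374. pandas.Series.sparse.from_coo method
pandas.Series.sparse.from_coo(A, dense_index=False)
Create a Series with sparse values from a scipy.sparse.coo_matrix.

Parameters:
A: scipy.sparse.coo_matrix
dense_index: bool, default False
    If False (default), the index consists of only the coords of the non-null entries of the original coo_matrix. If True, the index consists of the full sorted (row, col) coordinates of the coo_matrix.

Returns:
s: Series
    A Series with sparse values.

374-2. Parameters

374-2-1. A (required): a two-dimensional sparse matrix in COO format, that is, a scipy.sparse.coo_matrix. A matrix in another scipy sparse format should be converted first, for example with its tocoo() method.

374-2-2. dense_index (optional, default False): a boolean. If False, the index of the returned Series contains only the (row, col) coordinates of the entries stored in the matrix. If True, the index contains the full sorted (row, col) coordinates of the coo_matrix.

374-3. Function

Converts a sparse matrix in COO format into a pandas.Series with sparse values. This is useful for working with sparse data in pandas, because the matrix becomes a Series that can be processed and analysed with the usual pandas tools.

374-4. Return value

Returns a Series with sparse values holding the data extracted from the sparse matrix A. Its index is a two-level MultiIndex of (row, col) coordinates.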
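The effect of dense_index is easiest to see by calling from_coo twice on the same matrix. The following is a small sketch, assuming scipy is installed; the matrix values and the names ss and ss_dense are illustrative. It prints the results rather than stating them, because the exact index produced with dense_index=True is best inspected on your own pandas version.

```python
import pandas as pd
from scipy import sparse

# A 3x4 matrix with three stored entries, given as (values, (rows, cols)).
A = sparse.coo_matrix(([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])), shape=(3, 4))

# Default: the index holds only the coordinates of the stored entries.
ss = pd.Series.sparse.from_coo(A)

# dense_index=True: the index is expanded to further (row, col) pairs.
ss_dense = pd.Series.sparse.from_coo(A, dense_index=True)

print(ss)
print(len(ss), len(ss_dense))
```

With the default setting the Series has exactly one entry per stored element of A, sorted by (row, col).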

374-5. Notes

None

374-6. Usage

374-6-1. Data preparation

None

374-6-2. Code example

# 374. pandas.Series.sparse.from_coo method
import pandas as pd
import scipy.sparse as sp
# Create a sparse COO matrix
A = sp.coo_matrix([[1, 0, 0], [0, 0, 2], [0, 3, 0]])
# Create a sparse pandas Series from the COO matrix
sparse_series = pd.Series.sparse.from_coo(A)
print(sparse_series)

374-6-3. Output

# 374. pandas.Series.sparse.from_coo method
# (the integer width, int32 or int64, depends on the platform and library versions)
# 0  0    1
# 1  2    2
# 2  1    3
# dtype: Sparse[int32, 0]

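375. pandas.Series.sparse.to_coo method

375-1. Syntax

# 375. pandas.Series.sparse.to_coo method
pandas.Series.sparse.to_coo(row_levels=(0,), column_levels=(1,), sort_labels=False)
Create a scipy.sparse.coo_matrix from a Series with MultiIndex.

Use row_levels and column_levels to determine the row and column coordinates respectively. row_levels and column_levels are the names (labels) or numbers of the levels. {row_levels, column_levels} must be a partition of the MultiIndex level names (or numbers).

Parameters:
row_levels: tuple/list
column_levels: tuple/list
sort_labels: bool, default False
    Sort the row and column labels before forming the sparse matrix. When row_levels and/or column_levels refer to a single level, set to True for a faster execution.

Returns:
y: scipy.sparse.coo_matrix
rows: list (row labels)
columns: list (column labels)

375-2. Parameters

375-2-1. row_levels (optional, default (0,)): a tuple or list naming the index levels of the Series that are mapped to the rows of the sparse matrix. The default (0,) uses the first index level as the row coordinate.

375-2-2. column_levels (optional, default (1,)): a tuple or list naming the index levels of the Series that are mapped to the columns of the sparse matrix. The default (1,) uses the second index level as the column coordinate.

375-2-3. sort_labels (optional, default False): a boolean. If True, the row and column labels are sorted before the sparse matrix is formed. According to the pandas documentation, setting it to True gives faster execution when row_levels and/or column_levels refer to a single level.

375-3. Function

Converts a sparse Series with a MultiIndex into a sparse matrix in COO format. This is useful when the data is needed in matrix form, particularly for linear algebra or for other libraries that expect sparse matrices (such as scipy.sparse).

375-4. Return value

Returns a tuple of three items (y, rows, columns): y is a scipy.sparse.coo_matrix in standard COO format, rows is the list of row labels, and columns is the list of column labels.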
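Because to_coo returns three objects, it is often convenient to unpack them and inspect the matrix in dense form. The following is a minimal sketch, assuming scipy is installed; the index labels, the data, and the variable names are made up for illustration. It checks only the lengths of the label lists, since the exact form of the labels can vary with sort_labels and the pandas version.

```python
import pandas as pd

# Two-level index: level 0 becomes the rows, level 1 becomes the columns.
index = pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "x")])
s = pd.Series(pd.arrays.SparseArray([1.0, 2.0, 3.0]), index=index)

# Unpack the matrix together with its row and column labels.
A, rows, columns = s.sparse.to_coo(row_levels=[0], column_levels=[1])

dense = A.toarray()   # materialise the COO matrix as a NumPy array
print(A.shape)
print(dense)
print(rows, columns)
```

The position ("b", "y") has no entry in the Series, so it appears as 0 in the dense array.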

375-5. Notes

None

375-6. Usage

375-6-1. Data preparation

None

375-6-2. Code example

# 375. pandas.Series.sparse.to_coo method
import pandas as pd
# Create a sparse Series with a MultiIndex
s = pd.Series([1, 2, 3],
              index=pd.MultiIndex.from_tuples([(0, 1), (1, 2), (2, 0)]),
              dtype="Sparse[int]")
# Convert the Series to COO format; the call returns (matrix, row labels, column labels)
result = s.sparse.to_coo(row_levels=(0,), column_levels=(1,))
# Print the returned tuple
print(result)

375-6-3. Output

# 375. pandas.Series.sparse.to_coo method
# (<3x3 sparse matrix of type '<class 'numpy.int32'>'
# 	with 3 stored elements in COOrdinate format>, [(0,), (1,), (2,)], [(1,), (2,), (0,)])