2. Annotating Text with Label Studio



Preface

Label Studio is a powerful open-source annotation platform that supports many data types, including video, images, audio, and text.

This article explains how annotators can use Label Studio to label text data.

The project is open source; the repository link is below.

GitHub repository


Basic usage of Label Studio

1. Create a project


2. Add local storage


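The path entered here is the LOCAL_FILES_DOCUMENT_ROOT path configured earlier, followed by a subfolder named Resume_Labeling (this folder must be created in advance).

After filling it in, click the Check Connection button to verify that the path is configured correctly.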
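
The path rule above can be checked before opening the UI. The following is a minimal sketch, assuming the document root is passed in explicitly; the helper name `prepare_storage_dir` is made up for illustration and is not part of Label Studio.

```python
from pathlib import Path


def prepare_storage_dir(document_root: str, subfolder: str = "Resume_Labeling") -> Path:
    """Create the storage subfolder under the document root and return its absolute path."""
    root = Path(document_root).resolve()
    target = (root / subfolder).resolve()
    # The storage folder is expected to sit below the document root.
    if root not in target.parents:
        raise ValueError(f"{target} is not inside {root}")
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    import tempfile

    # A temporary directory stands in for the real LOCAL_FILES_DOCUMENT_ROOT value.
    with tempfile.TemporaryDirectory() as tmp:
        print(prepare_storage_dir(tmp))
```

The returned absolute path is the value you would paste into the storage form before clicking Check Connection.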

3. Choose a labeling template

Projects=>Resume_Labeling=>Settings=>Labeling Interface=>Browse Templates

Select the custom template we just added.
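
The custom template itself is not shown in this article. As a rough idea of what such a template could contain, here is a sketch that builds a labeling config in Python. `View`, `HyperText`, `HyperTextLabels` and `Label` are Label Studio config tags, but the layout, the label list and the hotkey are assumptions, not the author's actual template.

```python
import xml.etree.ElementTree as ET


def build_config(labels):
    """Build a two-column config: source resume on the left, resume to label on the right."""
    view = ET.Element("View", style="display: flex;")

    left = ET.SubElement(view, "View", style="flex: 50%;")
    # Reference copy, loaded from the URL stored in the task's "source_resume" field.
    ET.SubElement(left, "HyperText", name="source", value="$source_resume", valueType="url")

    right = ET.SubElement(view, "View", style="flex: 50%;")
    group = ET.SubElement(right, "HyperTextLabels", name="label", toName="resume")
    for value, hotkey in labels:
        ET.SubElement(group, "Label", value=value, hotkey=hotkey)
    # The document that annotators actually label, from the task's "resume" field.
    ET.SubElement(right, "HyperText", name="resume", value="$resume", valueType="url")

    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical label list; only the Name label and its hotkey 4 appear in this article.
    print(build_config([("Name", "4")]))
```

The `$resume` and `$source_resume` placeholders refer to the keys of the same names in the import file's `data` object.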

4. Add data

Put the files to be labeled, together with an Import.json file, into the Resume_Labeling folder, then import Import.json from the UI.

Folder structure for the imported data

Click the Import button and select the Import.json file.



Contents of Import.json

[
    {
        "data": {
            "labeler": "task3@qq.com",
            "reviewver": "reviewver1@qq.com",
            "resume_id": "fsdgsd",
            "rules": "rules",
            "source_resume": "/mydata/local-files/?d=Resume_Labeling/Round1/Import/LabelStudio/source_resume.html",
            "resume": "/mydata/local-files/?d=Resume_Labeling/Round1/Import/LabelStudio/resume.html"
        }
    }
]
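
When there are many resumes, writing this file by hand gets tedious. The sketch below generates the same structure in Python; the function name `build_task` and its parameters are illustrative, and the key `reviewver` is kept spelled exactly as in the sample because the labeling template may refer to it.

```python
import json


def build_task(labeler, reviewer, resume_id, rules, folder):
    """Return one import task; `folder` is the URL prefix that holds both HTML files."""
    return {
        "data": {
            "labeler": labeler,
            "reviewver": reviewer,  # key spelled as in the sample file
            "resume_id": resume_id,
            "rules": rules,
            "source_resume": f"{folder}/source_resume.html",
            "resume": f"{folder}/resume.html",
        }
    }


if __name__ == "__main__":
    folder = "/mydata/local-files/?d=Resume_Labeling/Round1/Import/LabelStudio"
    tasks = [build_task("task3@qq.com", "reviewver1@qq.com", "fsdgsd", "rules", folder)]
    # Save this output as Import.json in the Resume_Labeling folder.
    print(json.dumps(tasks, ensure_ascii=False, indent=4))
```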

resume.html and source_resume.html

<html>
	<head>
		<style>
			.page[theme="beryl"] * {
							user-select: text;
							color: #333344;
							font-size: 16px;
							line-height: 1.6;
							overflow-wrap: break-word;
						}
						
						.page[theme="beryl"] a {
							text-decoration-color: #008117;
						}
						
						.page[theme="beryl"] {
							width: 794px;
							background: #ffffff;
							padding: 72px;
							margin-bottom: 32px;
							
							border-radius: 4px;
							box-shadow: 0px 4px 8px #d0d0d0;
						
							position: relative;
						}
						
						/* Header and footer styles */
						
						.page[theme="beryl"]>header,
						footer {
							position: absolute;
						}
						
						.page[theme="beryl"]>header {
							top: 28px;
							right: 72px;
						}
						
						.page[theme="beryl"]>footer {
							bottom: 28px;
							left: 397px;
							transform: translate(-50%, 0);
						}
						
						.page[theme="beryl"]>footer>div::before {
							content: "- ";
						}
						
						.page[theme="beryl"]>footer>div::after {
							content: " -";
						}
						
						/* Basic information styles */
						
						.page[theme="beryl"]>.head {
							width: 100%;
							display: inline;
							grid-template-columns: auto 1fr auto;
							grid-column-gap: 16px;
							margin-bottom: 32px;
						}
						
						.page[theme="beryl"]>.head>div>.name {
							font-size: 36px;
							font-weight: bold;
							margin-bottom: 32px;
							text-align: center;
						}
						
						
						
						.page[theme="beryl"]>.head>.information {
							display: grid;
							grid-template-columns: auto  auto auto;
							/* grid-column-gap: 20px; */
						}
						
						.page[theme="beryl"]>.head>.information>.label {
							display: flex;
							/* justify-content: flex-end; */
							margin-bottom: 8px;
						}
						
						.page[theme="beryl"]>.head>.information>.label>.title {
							font-size: 16px;
							font-weight: bold;
							color: #008117;
						}
						
						.page[theme="beryl"]>.head>.information>.label>.msg {
							font-size: 16px;
							margin-left: 8px;
							font-weight: bold;
						}
						.page[theme="beryl"]>.head>.information>.label>.icon {
							width: 18px;
							height: 22px;
							object-fit: contain;
							margin-right: 8px;
						}
						
						.page[theme="beryl"]>.head>.information>.label>.tag {
							margin-right: 8px;
							font-weight: bold;
						}
						
						.page[theme="beryl"]>.head>.portrait {
							/* 48mm x 33mm */
							height: 182px;
							width: 125px;
							object-fit: contain;
						}
						
						.page[theme="beryl"]>.head>.portrait:not([src]) {
							width: 0;
							opacity: 0;
						}
						
						/* Detailed information styles */
						
						.page[theme="beryl"]>.main {
							margin-bottom: 36px;
						}
						
						.page[theme="beryl"]>.main>.mainhead {
							display: flex;
							margin-bottom: 8px;
							background: #497089;
							padding-left: 16px;
							border-radius: 8px;
						}
						
						.page[theme="beryl"]>.main>.mainhead>.icon {
							display: none;
						}
						
						.page[theme="beryl"]>.main>.mainhead>.t1 {
							font-size: 24px;
							font-weight: bold;
							color: #ffffff;
						}
						
						.page[theme="beryl"]>.main>.subhead {
							display: inline;
							/* grid-template-columns: 1fr auto; */
						}
						
						.page[theme="beryl"]>.main>.subhead>.information {
							display: grid;
							grid-template-columns: auto;
							margin-bottom:20px;
						}
						
						.page[theme="beryl"]>.main>.subhead>.information>.label {
							display: flex;
						}
						
						.page[theme="beryl"]>.main>.subhead>.information>.label>.title {
							font-size: 16px;
							font-weight: bold;
							color: #008117;
						}
						
						.page[theme="beryl"]>.main>.subhead>.information>.label>.value {
							font-size: 16px;
							margin-left: 8px;
							font-weight: bold;
						}
						
						.page[theme="beryl"]>.main>.subhead>.t2 {
							font-size: 16px;
							font-weight: bold;
							color: #008117;
						}
						
						.page[theme="beryl"]>.main>.subhead>.time {
							font-weight: bold;
							color: #008117;
						}
						
						.page[theme="beryl"]>.main>.subhead>.note {
							font-weight: bold;
							color: #555555;
						}
						
						.page[theme="beryl"]>.main>ol,
						ul {
							padding-left: 20px;
						}
						
						.page[theme="beryl"]>.main>.contents {
							margin-bottom: 8px;
						}
						
						.page[theme="beryl"]>.main>.contents>div {
							font-size: 16px;
						}
						
						/* Print styles */
						
						@media print {
							.page[theme="beryl"] * {
								color: #000000;
							}
						
							.page[theme="beryl"] {
								border-radius: 0;
								box-shadow: none;
							}
						}
						
						* {
							padding: 0;
							margin: 0;
							user-select: none;
							box-sizing: border-box;
							color: #333344;
							print-color-adjust: exact;
							-webkit-print-color-adjust: exact;
						}
						
						body {
							background: #f0f0f0;
						}
						
						.themes {
							position: fixed;
							top: 16px;
							left: 16px;
						}
						
						.themes>div {
							margin: 16px;
							font-size: 22px;
							height: 48px;
							line-height: 48px;
							text-align: center;
							padding: 0 16px;
							border-radius: 24px;
						}
						
						.themes,
						.language,
						.toolbar {
							display: none;
						}
						
						.language {
							position: fixed;
							right: 16px;
							top: 16px;
						}
						
						.toolbar {
							position: fixed;
							right: 16px;
							bottom: 16px;
						}
						
						.language>div,
						.toolbar>div {
							width: 48px;
							height: 48px;
							border-radius: 50%;
							margin: 16px;
							display: flex;
							align-items: center;
							justify-content: center;
						}
						
						.language>div>img {
							width: 40px;
							height: 40px;
							object-fit: contain;
						}
						
						.toolbar>div>img {
							width: 32px;
							height: 32px;
							object-fit: contain;
						}
						
						.themes>.themes-title {
							font-size: 28px;
							color: #666666;
						}
						
						.language>div>img,
						.toolbar>div>img {
							filter: brightness(2)
						}
						
						.themes>.theme,
						.language>div,
						.toolbar>div {
							background: #f0f0f0;
							box-shadow: 2px 2px 4px #dadada, -2px -2px 4px #ffffff;
							color: #666666;
						}
						
						.themes>.theme:active,
						.language>div:active,
						.toolbar>div:active {
							filter: brightness(1.03);
						}
						
						.themes>.theme[selected="true"],
						.language>div[selected="true"] {
							box-shadow: inset 2px 2px 4px #dadada, inset -2px -2px 4px #ffffff;
						}
						
						.resume {
							/* 210mm x 297mm */
							width: 794px;
							/* height: 1123px; */
							margin: 32px auto;
						}
						
						.source {
							width: 100%;
							text-align: center;
							margin-bottom: 32px;
						}
						
						@media print {
							@page {
								margin: 0;
							}
						
							.no-print {
								display: none !important
							}
						
							.resume {
								margin: 0 auto;
							}
						}
		</style>
		<meta charset="UTF-8">
	</head>
	<body>

		<div id="resume" class="resume">
			<div class="page" theme="beryl" style="height: 900px;">
				<section class="head" name="basic_information">
					<div>
						<div class="name">Resume</div>
					</div>
					<div class="information">
						<div class="label">
							<h2 class="title">Name</h2>
							<div class="msg" name="name">xxx</div>
						</div>
						<div class="label">
							<h2 class="title">Gender</h2>
							<div class="msg" name="gender">Male</div>
						</div>
						<div class="label">
							<h2 class="title">Age</h2>
							<div class="msg" name="age">31</div>
						</div>
						<div class="label">
							<h2 class="title">Email</h2>
							<div class="msg" name="email">dddddd@qq.com</div>
						</div>
						<div class="label">
							<h2 class="title">Phone</h2>
							<div class="msg" name="phone">1111111</div>
						</div>
						<div class="label">
							<h2 class="title">Address</h2>
							<div class="msg" name="loc"></div>
						</div>
						<div class="label">
							<h2 class="title">Years of experience</h2>
							<div class="msg" name="work_year">2</div>
						</div>
					</div>
				</section>
				<section class="main pri-subdir" name="edu_exp">
					<div class="mainhead">
						<h1 class="t1">Education</h1>
					</div>
					<div class="subhead" name="edu_exp">
						<div class="information" name="edu_exp_1">
							<div class="label">
								<h2 class="title">School</h2>
								<div class="value" name="school">University of Massachusetts Boston</div>
							</div>
							<div class="label">
								<h2 class="title">Start date</h2>
								<div class="value" name="start_time">2015.10</div>
							</div>
							<div class="label">
								<h2 class="title">End date</h2>
								<div class="value" name="end_time">2019.12</div>
							</div>
						</div>
					</div>
					<div class="subhead" name="edu_exp">
						<div class="information" name="edu_exp_2">
							<div class="label">
								<h2 class="title">School</h2>
								<div class="value" name="school">Second school</div>
							</div>
							<div class="label">
								<h2 class="title">Start date</h2>
								<div class="value" name="start_time">2015.10</div>
							</div>
							<div class="label">
								<h2 class="title">End date</h2>
								<div class="value" name="end_time">2019.12</div>
							</div>
						</div>
					</div>
				</section>
				<section class="main no-subdir" name="english_ability">
					<div class="mainhead">
						<h1 class="t1">English proficiency</h1>
					</div>
					<div class="contents">
						<div>Good reading and writing skills</div>
						<div>Good listening and speaking skills</div>
					</div>
				</section>
				<section class="main no-subdir" name="certs">
					<div class="mainhead">
						<h1 class="t1">Certificates</h1>
					</div>
					<div class="contents">
						<div>Certificate 1</div>
						<div>Certificate 2</div>
					</div>
				</section>
				<footer>
					<div>1</div>
				</footer>
			</div>
			<div class="page" theme="beryl" style="height: 900px;">
				<section class="main no-subdir" name="skills">
					<div class="mainhead">
						<h1 class="t1">Professional skills</h1>
					</div>
					<div class="contents">
						<div>Wireshark- HTTP , DNS, TCP/IP, capture Ethernet data</div>
						<div>VM WorkStation</div>
					</div>
				</section>
				<section class="main no-subdir" name="my_desc">
					<div class="mainhead">
						<h1 class="t1">Self-assessment</h1>
					</div>
					<div class="contents">
						<div>Self-assessment content</div>
					</div>
				</section>
				<section class="main pri-subdir" name="job_exp">
					<div class="mainhead">
						<h1 class="t1">Work experience</h1>
					</div>
					<div class="subhead" name="job_exp">
						<div class="information" name="job_exp_1">
							<div class="label">
								<h2 class="title">Company</h2>
								<div class="value" name="company">Meituan</div>
							</div>
							<div class="label">
								<h2 class="title">Start date</h2>
								<div class="value" name="start_time">2023.10</div>
							</div>
							<div class="label">
								<h2 class="title">End date</h2>
								<div class="value" name="end_time">Present</div>
							</div>
							<div class="label">
								<h2 class="title">Position</h2>
								<div class="value" name="position">AI position</div>
							</div>
							<div class="contents"></div>
						</div>
					</div>
				</section>
				<footer>
					<div>2</div>
				</footer>
			</div>
			<div class="page" theme="beryl" style="height: 900px;">
				<section class="main pri-subdir" name="job_exp">
					<div class="subhead">
						<div class="information" name="job_exp_1">
							<div class="contents">
								<div>Content</div>
							</div>
						</div>
					</div>
					<div class="subhead" name="job_exp">
						<div class="information" name="job_exp_2">
							<div class="label">
								<h2 class="title">Company</h2>
								<div class="value" name="company">Company 1</div>
							</div>
							<div class="label">
								<h2 class="title">Start date</h2>
								<div class="value" name="start_time">2021.08</div>
							</div>
							<div class="label">
								<h2 class="title">End date</h2>
								<div class="value" name="end_time">2023.10</div>
							</div>
							<div class="label">
								<h2 class="title">Position</h2>
								<div class="value" name="position">Annotation reviewer</div>
							</div>
							<div class="contents">
								<div>Responsible for reviewing annotated video data</div>
							</div>
						</div>
					</div>
				</section>
				<section class="main pri-subdir" name="proj_exp">
					<div class="mainhead">
						<h1 class="t1">Project experience</h1>
					</div>
					<div class="subhead" name="proj_exp">
						<div class="information" name="proj_exp_1">
							<div class="label">
								<h2 class="title">Project name</h2>
								<div class="value" name="proj_name">Basic R&amp;D platform for AI model data annotation</div>
							</div>
							<div class="label">
								<h2 class="title">Start date</h2>
								<div class="value" name="start_time">2023.10</div>
							</div>
							<div class="label">
								<h2 class="title">End date</h2>
								<div class="value" name="end_time">Present</div>
							</div>
							<div class="label">
								<h2 class="title">Project responsibilities</h2>
							</div>
							<div class="contents">
								<div>Proficient in annotation tasks for AI model training and evaluation</div>
							</div>
							<div class="label">
								<h2 class="title">Project description</h2>
							</div>
							<div class="contents"></div>
						</div>
					</div>
				</section>
				<footer>
					<div>3</div>
				</footer>
			</div>
			<div class="page" theme="beryl" style="height: 900px;">
				<section class="main pri-subdir" name="proj_exp">
					<div class="subhead">
						<div class="information" name="proj_exp_1">
							<div class="contents">
								<div>Proficient in annotation tasks for AI model training and evaluation</div>
							</div>
						</div>
					</div>
				</section>
				<footer>
					<div>4</div>
				</footer>
			</div>
		</div>


	</body>
</html>
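
In this sample, every field that gets labeled sits in a `div` with class `msg` or `value` and a `name` attribute. To check a resume file before importing it, a small script can list those fields. This is only a sketch using Python's standard `html.parser`; the class `FieldExtractor` is a name made up for this example, and it assumes, as in the sample, that field divs contain plain text only.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect (name, text) pairs from <div class="msg|value" name="..."> elements."""

    def __init__(self):
        super().__init__()
        self.fields = []
        self._name = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") in ("msg", "value") and "name" in attrs:
            self._name, self._text = attrs["name"], []

    def handle_data(self, data):
        if self._name is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Field divs hold plain text only, so the next </div> closes the field.
        if tag == "div" and self._name is not None:
            self.fields.append((self._name, "".join(self._text).strip()))
            self._name = None


def extract_fields(html):
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields


sample = '<div class="label"><h2 class="title">School</h2><div class="value" name="school">Example University</div></div>'
print(extract_fields(sample))  # [('school', 'Example University')]
```

Empty fields, such as the `loc` div in the sample, come back with an empty string, which makes missing values easy to spot.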

5. Annotate

The interface has been adjusted slightly: the original resume is shown on the left for reference and comparison.

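1. To annotate, first select a label such as Name (you can also use its hotkey, shown at the top right of the label; for example, the hotkey for Name is 4).
2. Then select text in the labeling area; this completes the annotation.
3. To modify the text, select the annotated text; a text box appears at the top of the interface, where you can enter the corrected text.
4. When finished, click Submit (or Update).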

6. Add relations

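A resume often contains more than one education entry (or project or work entry). To tell them apart, the fields that belong to the same entry need to be grouped, which can be done by adding relations.
1. Select the start of the relation: a field within an education entry, such as the end date in the screenshot.
2. In the label's info panel, click the relation button (or press Alt+R).
3. Select the end of the relation: the school field of the same education entry, such as the school name in the screenshot.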
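
Once annotations are exported, the relations can be used to rebuild these groups. The sketch below is an assumption-laden example rather than a fixed recipe: it assumes each annotation's `result` list holds region items with an `id` and relation items shaped like `{"type": "relation", "from_id": ..., "to_id": ...}`, so check the field names against your own export. The function name `group_regions` and the ids in the example are invented for illustration.

```python
def group_regions(result):
    """Group region ids that are connected by relation items (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for item in result:
        if item.get("type") == "relation":
            a, b = find(item["from_id"]), find(item["to_id"])
            if a != b:
                parent[b] = a
        elif "id" in item:
            find(item["id"])  # an unlinked region forms its own group

    groups = {}
    for region_id in parent:
        groups.setdefault(find(region_id), []).append(region_id)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Hypothetical export: two school regions, one end-date region, one relation.
    result = [
        {"id": "school1", "type": "hypertextlabels"},
        {"id": "end1", "type": "hypertextlabels"},
        {"id": "school2", "type": "hypertextlabels"},
        {"from_id": "end1", "to_id": "school1", "type": "relation"},
    ]
    print(group_regions(result))  # [['end1', 'school1'], ['school2']]
```

Because every field of an entry is linked to that entry's school field, each resulting group corresponds to one education, project, or work entry.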


Summary

This article gave a brief introduction to using Label Studio from an annotator's point of view.
