3LC and Kaggle

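Kaggle competitions are a series of contests well suited to both machine learning beginners and practitioners, and most of them offer prize money. In my view they are more credible than the American undergraduate Mathematical Contest in Modeling. A good Kaggle ranking can go on a résumé, and the competitions are themselves a good way to put the machine learning theory you have learned into practice. Many well-known models were first proposed and applied in Kaggle-style competitions.
According to the official documentation of one such competition, the Cotton Weed Detection Challenge, 3LC serves as an online dashboard for observing and visualizing training. In my opinion the platform is not that significant in itself, but its presence is understandable: Kaggle competitions need prize money, so 3LC has become the sponsor of some of them. 3LC mainly sponsors YOLO-series competitions, so, amusingly, it has also released its own 3lc-ultralytics library, much like ModelScope copying OpenAI. As is well known, Ultralytics is the library that implements YOLOv8.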
Downloading from Kaggle
The official Kaggle command is very simple:
bash
kaggle competitions download -c the-3lc-cotton-weed-detection-challenge
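In my testing, however, this command did not download anything. I ended up using a crude workaround: click the download button on the competition page, copy the link, and then download it with the Python wget library. Sample code: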
python
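import wget

url = ""  # paste the link copied from the download button here
output_filename = "dataset.zip"  # placeholder name for the saved archive
# A relative name saves the file in the current working directory
wget.download(url, out=output_filename)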
Once the .zip archive has been downloaded, extract it with unzip filename.
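If the unzip command is not available (for example in a minimal container), the same step can be done from Python. The following is a minimal sketch using only the standard library; the function name `extract_archive` and the file names in the usage comment are placeholders I made up for illustration, not part of the competition's tooling.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract every member of a .zip archive into target_dir and return their names."""
    zip_path = Path(zip_path)
    target_dir = Path(target_dir)
    if not zipfile.is_zipfile(zip_path):
        # A failed or partial download may leave an error page or a truncated file
        raise ValueError(f"{zip_path} is not a valid zip archive")
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example usage (hypothetical file names):
# names = extract_archive("dataset.zip", "data")
# print(f"extracted {len(names)} entries")
```

Checking the file with `zipfile.is_zipfile` first gives a clearer error when the download did not actually succeed.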
Installing 3LC on Ubuntu
bash
pip install 3lc-ultralytics
That single command installs it. Next, register and log in at https://dashboard.3lc.ai/ to obtain an API key.
Then log in with
bash
3lc login <api-key>
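and start the local service with 3lc service. This opens a communication port on your local machine or server; connect that port to the dashboard and you are done, much like plugging in a data cable.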
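Before connecting the dashboard it can help to confirm that the local service is actually listening. Below is a minimal sketch using only the standard library; the helper name `is_port_open` is mine, and the port number in the usage comment is an assumption, so use whatever address 3lc service prints on startup.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


# Example usage (5015 is an assumed port; check the output of `3lc service`):
# print(is_port_open("127.0.0.1", 5015))
```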
Registering the local project (downloaded from Kaggle) with 3LC
python
import tlc
from pathlib import Path
# Define constants for 3LC registration
PROJECT_NAME = "kaggle_cotton_weed_detection"
DATASET_NAME = "cotton_weed_det3"
WORK_DIR = Path(".")
DATASET_YAML = WORK_DIR / "dataset.yaml"
print("=" * 70)
print("DATA REGISTRATION")
print("=" * 70)
# ============================================================================
# IDEMPOTENCY CHECK - Safe to run multiple times
# ============================================================================
try:
    # Check if tables already exist
    existing_train = tlc.Table.from_names(
        project_name=PROJECT_NAME,
        dataset_name=DATASET_NAME,
        table_name=f"{DATASET_NAME}-train1",
    )
    existing_val = tlc.Table.from_names(
        project_name=PROJECT_NAME,
        dataset_name=DATASET_NAME,
        table_name=f"{DATASET_NAME}-val1",
    )
    print("\n⚠️ Tables already exist!")
    print(f" Training: {len(existing_train)} samples")
    print(f" Validation: {len(existing_val)} samples")
    print("\n✅ Using existing tables (no duplicates created)")
    print(" This cell is safe to run multiple times!")
    # Set variables for compatibility
    train_table = existing_train
    val_table = existing_val
except Exception:
    # Tables don't exist, create them
    print("\n✅ No existing tables - creating new ones...")
    # Create training table
    print("\n Creating training table...")
    train_table = tlc.Table.from_yolo(
        dataset_yaml_file=str(DATASET_YAML),
        split="train",
        task="detect",
        dataset_name=DATASET_NAME,
        project_name=PROJECT_NAME,
        table_name=f"{DATASET_NAME}-train1",
    )
    # Create validation table
    print(" Creating validation table...")
    val_table = tlc.Table.from_yolo(
        dataset_yaml_file=str(DATASET_YAML),
        split="val",
        task="detect",
        dataset_name=DATASET_NAME,
        project_name=PROJECT_NAME,
        table_name=f"{DATASET_NAME}-val1",
    )
    # Display registration results
    print("\n✅ Tables created successfully!")
    print("=" * 70)
    print("\n Training Table:")
    print(f" Samples: {len(train_table)}")
    print(f" URL: {train_table.url}")
    print("\n Validation Table:")
    print(f" Samples: {len(val_table)}")
    print(f" URL: {val_table.url}")
print("\n" + "=" * 70)
print("✅ Phase 1 Complete: Dataset Registered with 3LC!")
print("=" * 70)
print("\n Next Steps:")
print(" (Optional) Explore tables in Dashboard: https://dashboard.3lc.ai")
Running this code registers the local project with the dashboard and prints the URLs of the training table and the validation table, which later scripts need.
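Since later scripts need the two table addresses, one option is to write them to a small file right after registration. This is a minimal sketch using only the standard library; the helper names and the `tables.json` file name are hypothetical, and the URLs are assumed to be passed in from `train_table.url` and `val_table.url` in the script above.

```python
import json
from pathlib import Path


def save_table_urls(train_url, val_url, path="tables.json"):
    """Write the two table URLs to a JSON file so other scripts can reuse them."""
    record = {"train": str(train_url), "val": str(val_url)}
    Path(path).write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record


def load_table_urls(path="tables.json"):
    """Read the URLs back; raises KeyError if either entry is missing."""
    record = json.loads(Path(path).read_text(encoding="utf-8"))
    return record["train"], record["val"]


# Example usage after registration (train_table and val_table come from the script above):
# save_table_urls(train_table.url, val_table.url)
# train_url, val_url = load_table_urls()
```

A later training script can then read the file instead of re-running the registration.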