> experiment-tracking-swanlab
Provides guidance for experiment tracking with SwanLab. Use when you need open-source run tracking, local or self-hosted dashboards, and lightweight media logging for ML workflows.
SwanLab: Open-Source Experiment Tracking
When to Use This Skill
Use SwanLab when you need to:
- Track ML experiments with metrics, configs, tags, and descriptions
- Visualize training with scalar charts and logged media
- Compare runs across seeds, checkpoints, and hyperparameters
- Work locally or self-hosted instead of depending on managed SaaS
- Integrate with PyTorch, Transformers, PyTorch Lightning, or Fastai
Deployment: Cloud, local, or self-hosted | Media: images, audio, text, GIFs, point clouds, molecules | Integrations: PyTorch, Transformers, PyTorch Lightning, Fastai
Installation
# Install SwanLab plus the media dependencies used in this skill
pip install "swanlab>=0.7.11" "pillow>=9.0.0" "soundfile>=0.12.0"
# Add local dashboard support for mode="local" and swanlab watch
pip install "swanlab[dashboard]>=0.7.11"
# Optional framework integrations
pip install transformers pytorch-lightning fastai
# Login for cloud or self-hosted usage
swanlab login
pillow and soundfile are the media dependencies used by the Image and Audio examples in this skill. swanlab[dashboard] adds the local dashboard dependency required by mode="local" and swanlab watch.
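Before running the examples, it can help to confirm that the optional dependencies actually resolved. This is a small importability check of our own, not part of SwanLab (note that pillow imports under the name PIL):

```python
import importlib

def check_deps(modules):
    """Map each module name to True if it can be imported, False otherwise."""
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status

# pillow imports as PIL; soundfile imports under its own name
print(check_deps(["swanlab", "PIL", "soundfile"]))
```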
Quick Start
Basic Experiment Tracking
import swanlab

run = swanlab.init(
    project="my-project",
    experiment_name="baseline",
    config={
        "learning_rate": 1e-3,
        "epochs": 10,
        "batch_size": 32,
        "model": "resnet18",
    },
)

for epoch in range(run.config.epochs):
    train_loss = train_epoch()  # your training step
    val_loss = validate()       # your evaluation step
    swanlab.log(
        {
            "train/loss": train_loss,
            "val/loss": val_loss,
            "epoch": epoch,
        }
    )

run.finish()
With PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
import swanlab

run = swanlab.init(
    project="pytorch-demo",
    experiment_name="mnist-mlp",
    config={
        "learning_rate": 1e-3,
        "batch_size": 64,
        "epochs": 10,
        "hidden_size": 128,
    },
)

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, run.config.hidden_size),
    nn.ReLU(),
    nn.Linear(run.config.hidden_size, 10),
)
optimizer = optim.Adam(model.parameters(), lr=run.config.learning_rate)
criterion = nn.CrossEntropyLoss()

for epoch in range(run.config.epochs):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        logits = model(data)
        loss = criterion(logits, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            swanlab.log(
                {
                    "train/loss": loss.item(),
                    "train/epoch": epoch,
                    "train/batch": batch_idx,
                }
            )

run.finish()
Core Concepts
1. Projects and Experiments
Project: a collection of related experiments
Experiment: a single execution of a training or evaluation workflow
import swanlab

run = swanlab.init(
    project="image-classification",
    experiment_name="resnet18-seed42",
    description="Baseline run on ImageNet subset",
    tags=["baseline", "resnet18"],
    config={
        "model": "resnet18",
        "seed": 42,
        "batch_size": 64,
        "learning_rate": 3e-4,
    },
)

print(run.id)
print(run.config.learning_rate)
2. Configuration Tracking
config = {
    "model": "resnet18",
    "seed": 42,
    "batch_size": 64,
    "learning_rate": 3e-4,
    "epochs": 20,
}
run = swanlab.init(project="my-project", config=config)

learning_rate = run.config.learning_rate
batch_size = run.config.batch_size
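The config can be any mapping, but nested dicts are easier to scan in the run's config table when flattened into slash-separated keys. A small helper of our own (not a SwanLab API) sketches the idea:

```python
def flatten_config(cfg, parent_key="", sep="/"):
    """Flatten nested config dicts into 'optimizer/lr'-style keys so
    each leaf value appears as a single entry in the config table."""
    flat = {}
    for key, value in cfg.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_config(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

nested = {"optimizer": {"name": "adam", "lr": 3e-4}, "batch_size": 64}
print(flatten_config(nested))
# {'optimizer/name': 'adam', 'optimizer/lr': 0.0003, 'batch_size': 64}
```

The flattened dict can then be passed straight to swanlab.init(config=...).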
3. Metric Logging
# Log scalars
swanlab.log({"loss": 0.42, "accuracy": 0.91})

# Log multiple metrics at once
swanlab.log(
    {
        "train/loss": train_loss,
        "train/accuracy": train_acc,
        "val/loss": val_loss,
        "val/accuracy": val_acc,
        "lr": current_lr,
        "epoch": epoch,
    }
)

# Log with a custom step
swanlab.log({"loss": loss}, step=global_step)
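Logging every iteration of a tight inner loop can flood the dashboard. One way to manage both the global step and the logging frequency is a small wrapper of our own (hypothetical helper, not part of swanlab):

```python
class ThrottledLogger:
    """Forward metrics to log_fn (e.g. swanlab.log) every `every` calls,
    attaching a monotonically increasing global step."""

    def __init__(self, log_fn, every=100):
        self.log_fn = log_fn
        self.every = every
        self.step = 0

    def log(self, metrics):
        self.step += 1
        if self.step % self.every == 0:
            self.log_fn(metrics, step=self.step)

# Usage: logger = ThrottledLogger(swanlab.log, every=100)
#        logger.log({"train/loss": loss.item()})  # inside the batch loop
```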
4. Media and Chart Logging
import numpy as np
import swanlab

# Image
image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
swanlab.log({"examples/image": swanlab.Image(image, caption="Augmented sample")})

# Audio
wave = np.sin(np.linspace(0, 8 * np.pi, 16000)).astype("float32")
swanlab.log({"examples/audio": swanlab.Audio(wave, sample_rate=16000)})

# Text
swanlab.log({"examples/text": swanlab.Text("Training notes for this run.")})

# GIF video
swanlab.log({"examples/video": swanlab.Video("predictions.gif", caption="Validation rollout")})

# Point cloud
points = np.random.rand(128, 3).astype("float32")
swanlab.log({"examples/point_cloud": swanlab.Object3D(points, caption="Point cloud sample")})

# Molecule
swanlab.log({"examples/molecule": swanlab.Molecule.from_smiles("CCO", caption="Ethanol")})

# Custom chart with swanlab.echarts
line = swanlab.echarts.Line()
line.add_xaxis(["epoch-1", "epoch-2", "epoch-3"])
line.add_yaxis("train/loss", [0.92, 0.61, 0.44])
line.set_global_opts(
    title_opts=swanlab.echarts.options.TitleOpts(title="Training Loss")
)
swanlab.log({"charts/loss_curve": line})
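A custom chart renders every point you pass to it, so for long metric histories it can be worth thinning the data before building the chart. A simple even-subsampling helper (our own, not a SwanLab API):

```python
def downsample(values, max_points=200):
    """Evenly subsample a sequence so a custom chart stays lightweight."""
    if len(values) <= max_points:
        return list(values)
    step = len(values) / max_points
    return [values[int(i * step)] for i in range(max_points)]

# e.g. line.add_yaxis("train/loss", downsample(loss_history))
```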
See references/visualization.md for more chart and media patterns.
5. Local and Self-Hosted Workflows
import os
import swanlab

# Self-hosted or cloud login
swanlab.login(
    api_key=os.environ["SWANLAB_API_KEY"],
    host="http://your-server:5092",
)

# Local-only logging
run = swanlab.init(
    project="offline-demo",
    mode="local",
    logdir="./swanlog",
)
swanlab.log({"loss": 0.35, "epoch": 1})
run.finish()
# View local logs
swanlab watch -l ./swanlog
# Sync local logs later
swanlab sync ./swanlog
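For code that runs both on connected workstations and air-gapped machines, the mode can be chosen from the environment. A sketch under our own convention (the env-var-driven switch is ours; the source already uses SWANLAB_API_KEY for login):

```python
import os

def swanlab_init_kwargs():
    """Log to the cloud/self-hosted server when SWANLAB_API_KEY is set,
    otherwise fall back to local-only logging under ./swanlog."""
    if os.environ.get("SWANLAB_API_KEY"):
        return {"mode": "cloud"}
    return {"mode": "local", "logdir": "./swanlog"}

# run = swanlab.init(project="offline-demo", **swanlab_init_kwargs())
```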
Integration Examples
HuggingFace Transformers
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=8,
    evaluation_strategy="epoch",
    logging_steps=50,
    report_to="swanlab",
    run_name="bert-finetune",
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
See references/integrations.md for callback-based setups and additional framework patterns.
PyTorch Lightning
import pytorch_lightning as pl
from swanlab.integration.pytorch_lightning import SwanLabLogger

swanlab_logger = SwanLabLogger(
    project="lightning-demo",
    experiment_name="mnist-classifier",
    config={"batch_size": 64, "max_epochs": 10},
)
trainer = pl.Trainer(
    logger=swanlab_logger,
    max_epochs=10,
    accelerator="auto",
)
trainer.fit(model, train_loader, val_loader)
Fastai
from fastai.vision.all import accuracy, resnet34, vision_learner
from swanlab.integration.fastai import SwanLabCallback

learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.fit(
    5,
    cbs=[
        SwanLabCallback(
            project="fastai-demo",
            experiment_name="pets-classification",
            config={"arch": "resnet34", "epochs": 5},
        )
    ],
)
See references/integrations.md for fuller framework examples.
Best Practices
1. Use Stable Metric Names
# Good: grouped metric namespaces
swanlab.log({
    "train/loss": train_loss,
    "train/accuracy": train_acc,
    "val/loss": val_loss,
    "val/accuracy": val_acc,
})
# Avoid mixing flat and grouped names for the same metric family
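One way to keep namespaces consistent is to build the grouped keys programmatically rather than typing prefixes by hand. A tiny helper of our own:

```python
def namespaced(prefix, metrics):
    """Prefix every metric key with a namespace, e.g. 'train/loss'."""
    return {f"{prefix}/{key}": value for key, value in metrics.items()}

# swanlab.log(namespaced("train", {"loss": 0.42, "accuracy": 0.91}))
# logs {"train/loss": 0.42, "train/accuracy": 0.91}
```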
2. Initialize Early and Capture Config Once
run = swanlab.init(
project="image-classification",
experiment_name="resnet18-baseline",
config={
"model": "resnet18",
"learning_rate": 3e-4,
"batch_size": 64,
"seed": 42,
},
)
3. Save Checkpoints Locally
import torch
import swanlab

checkpoint_path = "checkpoints/best.pth"
torch.save(model.state_dict(), checkpoint_path)
swanlab.log(
    {
        "best/val_accuracy": best_val_accuracy,
        "artifacts/checkpoint_path": swanlab.Text(checkpoint_path),
    }
)
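Deciding when to overwrite checkpoints/best.pth only needs a record of the best metric seen so far. A minimal tracker (a hypothetical helper, assuming a higher-is-better metric):

```python
class BestTracker:
    """Track the best value of a higher-is-better metric (e.g. val accuracy)."""

    def __init__(self):
        self.best = float("-inf")

    def update(self, value):
        """Record value and return True if it improves on the best so far."""
        if value > self.best:
            self.best = value
            return True
        return False

# tracker = BestTracker()
# if tracker.update(val_accuracy):
#     torch.save(model.state_dict(), checkpoint_path)
```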
4. Use Local Mode for Offline-First Workflows
run = swanlab.init(project="offline-demo", mode="local", logdir="./swanlog")
# ... training code ...
run.finish()
# Inspect later with: swanlab watch -l ./swanlog
5. Keep Advanced Patterns in References
- Use references/visualization.md for advanced chart and media patterns
- Use references/integrations.md for callback-based and framework-specific integration details
Resources
See Also
- references/integrations.md - Framework-specific examples
- references/visualization.md - Charts and media logging patterns