Less Selector
2025-10-30
This document introduces how to use the Less Selector within the DataFlex framework to dynamically select training data and thereby improve the effectiveness of Supervised Fine-Tuning (SFT). The method originates from LESS: Selecting Influential Data for Targeted Instruction Tuning (ICML 2024).
1. Method Overview
The core idea of the Less Selector is to use an influence function adapted to the Adam optimizer: the relevance between a training sample and the validation samples is measured by the similarity of their gradient directions, and training samples are selected dynamically during SFT to improve the model's generalization.
Mathematical Definition
$$
\mathrm{Inf}_{\mathrm{LESS}}(z, z') \triangleq \sum_{i=1}^{N} \bar{\eta}_i \, \cos\!\big(\nabla \ell(z'; \theta_i),\; \Gamma(z, \theta_i)\big)
$$
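Here z is a training sample, z' a validation sample, θ_i (i = 1…N) are model checkpoints saved during training, η̄_i is the average learning rate of the i-th training period, ∇ℓ(z'; θ_i) is the validation-loss gradient, and Γ(z, θ_i) is the Adam update direction induced by z. The NumPy sketch below only illustrates this formula; it is not DataFlex's internal API, and the function names and the moment estimates it takes as inputs are assumptions:
import numpy as np

def adam_direction(grad, m, v, beta1=0.9, beta2=0.999, eps=1e-8):
    # Gamma(z, theta_i): Adam update direction for a training sample, given its
    # gradient and the optimizer's first/second moment estimates (assumed inputs).
    m_hat = beta1 * m + (1 - beta1) * grad
    v_hat = beta2 * v + (1 - beta2) * grad ** 2
    return m_hat / (np.sqrt(v_hat) + eps)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def less_influence(train_grads, val_grads, avg_lrs, moments):
    # Sum over checkpoints i of eta_bar_i * cos(val_grad_i, Gamma(train, theta_i)).
    score = 0.0
    for g_train, g_val, lr, (m, v) in zip(train_grads, val_grads, avg_lrs, moments):
        score += lr * cosine(g_val, adam_direction(g_train, m, v))
    return score
In practice the per-sample gradients are first compressed with a random projection (the proj_dim parameter in Step 2) so that these vectors remain cheap to store and compare.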
2. Implementation Steps
Step 1: Environment Setup
git clone https://github.com/OpenDCAI/DataFlex.git
cd DataFlex
pip install -e .
pip install llamafactory
Step 2: Less Selector Configuration
Configuration file path:
DataFlex/src/dataflex/configs/components.yaml
Example configuration:
less:
  name: less
  params:
    cache_dir: ../dataflex_saves/less_output
    gradient_type: adam
    proj_dim: 4096
    seed: 123
    save_interval: 16
Parameter Description:
gradient_type: Type of gradient used for the influence computation; default adam.
proj_dim: Random projection dimension (e.g., 4096 or 8192), used to reduce computation cost. See LESS, "4.1 Step 2: Projecting the gradients".
cache_dir: Directory for storing intermediate cache results.
seed: Random seed for reproducibility.
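The effect of proj_dim can be illustrated with a small random-projection sketch. The dimensions, the Gaussian projection, and the variable names below are illustrative assumptions, not DataFlex's implementation, which follows the projection procedure described in LESS:
import numpy as np

# Illustrative sizes: in practice full_dim is the number of trainable (LoRA)
# parameters and proj_dim is 4096 or 8192 as configured above.
full_dim, proj_dim, seed = 16384, 1024, 123
rng = np.random.default_rng(seed)

# Random projection with entries ~ N(0, 1/proj_dim); by the Johnson-Lindenstrauss
# lemma it approximately preserves inner products, and hence cosine similarities.
projection = rng.normal(0.0, 1.0 / np.sqrt(proj_dim), size=(full_dim, proj_dim))

g_train = rng.normal(size=full_dim)                 # stand-in training gradient
g_val = g_train + 0.5 * rng.normal(size=full_dim)   # correlated validation gradient

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(g_train, g_val), cos(g_train @ projection, g_val @ projection))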
Step 3: Dynamic Training Configuration
Configuration file path:
DataFlex/examples/train_lora/selectors/less.yaml
Example configuration:
### model
model_name_or_path: meta-llama/Llama-3.1-8B
trust_remote_code: true
### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
lora_rank: 16
lora_alpha: 8
deepspeed: examples/deepspeed/ds_z3_config.json
### dataset
dataset: alpaca_en_demo
template: llama3
cutoff_len: 4096
overwrite_cache: true
preprocessing_num_workers: 16
dataloader_num_workers: 0
seed: 42
### output
output_dir: ../dataflex_saves/Llama-3.1-8B/less
logging_steps: 10
save_steps: 100
plot_loss: true
save_only_model: false
overwrite_output_dir: true
### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 1
learning_rate: 1.0e-4
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
### dynamic_train
train_type: dynamic_select
components_cfg_file: src/dataflex/configs/components.yaml
component_name: less
warmup_step: 10
update_step: 10
update_times: 2
eval_dataset: alpaca_zh_demo
Parameter Description:
model_name_or_path: Model name or path for supervised fine-tuning.
dataset: Training dataset.
output_dir: Output directory of dynamic fine-tuning (the LoRA adapter).
warmup_step: Number of warmup steps before the first sample selection.
update_step: Number of training steps between consecutive data selections.
update_times: Total number of dynamic data selection rounds.
eval_dataset: Validation dataset against which influence is measured.
Both dataset and eval_dataset can be chosen from DataFlex/data/dataset_info.json or provided as local JSON files in ShareGPT/Alpaca format. Note: the training set size significantly affects computation cost. Total steps = warmup_step + update_step × update_times (with the example configuration above, 10 + 10 × 2 = 30).
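For a local dataset, a minimal sketch of an Alpaca-format file is shown below. The file name and record contents are illustrative; following the LLaMA-Factory convention, the file also needs an entry in DataFlex/data/dataset_info.json before it can be referenced by name in dataset or eval_dataset:
import json

# One record in Alpaca format: instruction / input / output fields.
records = [
    {
        "instruction": "Summarize the following paragraph.",
        "input": "DataFlex selects training data dynamically during SFT.",
        "output": "DataFlex performs dynamic data selection while fine-tuning.",
    }
]
with open("my_alpaca_data.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)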
Step 4: Run Training
FORCE_TORCHRUN=1 DISABLE_VERSION_CHECK=1 dataflex-cli train examples/train_lora/selectors/less.yaml
FORCE_TORCHRUN=1 launches training through torchrun, and DISABLE_VERSION_CHECK=1 skips LLaMA-Factory's dependency version check. The training process automatically performs dynamic data selection and model updates.
Step 5: Model Merge and Export
Configuration file path:
DataFlex/examples/merge_lora/llama3_lora_sft.yaml
Example configuration:
model_name_or_path: meta-llama/Llama-3.1-8B
adapter_name_or_path: ../dataflex_saves/Llama-3.1-8B/less
template: llama3
trust_remote_code: true
export_dir: ../dataflex_saves/Llama-3.1-8B_lora_sft
export_size: 5
export_device: cpu # choices: [cpu, auto]
export_legacy_format: false
Parameter Description:
model_name_or_path: Model name or path used for training.
adapter_name_or_path: Output path of the LoRA adapter (the output_dir from the training step).
export_dir: Directory for saving the merged result of the fine-tuned base model and LoRA adapter.
Execute the export command:
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
The merged model will be saved in:
../dataflex_saves/Llama-3.1-8B_lora_sft
3. Model Evaluation
It is recommended to use the DataFlow Model QA Evaluation Pipeline for systematic evaluation of the generated model.
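For a quick local sanity check before running the full pipeline, the merged model in export_dir can be loaded and prompted directly. A minimal sketch with Hugging Face transformers (the prompt and generation settings are illustrative):
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "../dataflex_saves/Llama-3.1-8B_lora_sft"  # export_dir from Step 5

tokenizer = AutoTokenizer.from_pretrained(model_path)
# device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

inputs = tokenizer("Briefly explain supervised fine-tuning.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))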

