Fine-tuning LLMs often requires extensive resources, time, and memory, challenges that can hinder rapid experimentation and deployment. Unsloth AI streamlines this process by enabling fast, efficient fine-tuning of state-of-the-art models such as Qwen3-14B with minimal GPU memory, taking advantage of techniques such as 4-bit quantization and LoRA (Low-Rank Adaptation). In this tutorial, we walk through a practical implementation on Google Colab to fine-tune Qwen3-14B on a combination of reasoning and instruction-following datasets, combining Unsloth's FastLanguageModel utilities with trl.SFTTrainer. With this setup, users can achieve strong fine-tuning performance on consumer-grade hardware.
%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
    !pip install --no-deps unsloth
We install all the essential libraries required to fine-tune the Qwen3 model with Unsloth AI. The cell installs dependencies conditionally based on the environment, using a lightweight approach on Colab to ensure compatibility and reduce overhead. Key components such as bitsandbytes, trl, xformers, and unsloth_zoo are included to enable 4-bit quantized training and LoRA-based optimization.
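The environment check in the install cell can be isolated as a small helper. This is a minimal sketch: the function name `running_in_colab` is hypothetical and not part of Unsloth, and it assumes, as the install cell does, that Colab sets environment variables whose names contain `COLAB_`. Unlike the cell, it checks each variable name separately rather than the joined string.

```python
import os

def running_in_colab(environ=None):
    """Return True if any environment variable name contains 'COLAB_'."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if environ is None else environ
    return any("COLAB_" in key for key in env.keys())

# Example with explicit dictionaries instead of the real environment.
print(running_in_colab({"COLAB_GPU": "1", "HOME": "/root"}))
print(running_in_colab({"HOME": "/home/user"}))
```

Passing the environment as an argument keeps the check easy to exercise outside a notebook.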
from unsloth import FastLanguageModel
import torch
model, tokenizer = FastLanguageModel.from_pretrained(
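    model_name = "unsloth/Qwen3-14B",
    max_seq_length = 2048,    # context length in tokens
    load_in_4bit = True,      # 4-bit quantization to reduce memory use
    load_in_8bit = False,
    full_finetuning = False,  # train LoRA adapters instead of all weights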
)
We load the Qwen3-14B model using FastLanguageModel from the Unsloth library, which is optimized for efficient fine-tuning. The call initializes the model with a context length of 2048 tokens and loads it in 4-bit precision, considerably reducing memory usage. Full fine-tuning is disabled, which makes the setup suitable for lightweight, parameter-efficient techniques like LoRA.
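To get a feel for why 4-bit loading matters, the weight footprint can be estimated with simple arithmetic. This is a rough sketch, not a measurement: it assumes about 14 billion parameters (as the model name suggests) and counts only the weights, ignoring quantization constants, activations, optimizer state, and the KV cache. The helper `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / (1024 ** 3)

# Assumption: roughly 14 billion parameters, as the model name suggests.
NUM_PARAMS = 14e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f" 4-bit weights: ~{int4_gib:.1f} GiB")
print(f"reduction: {fp16_gib / int4_gib:.0f}x")
```

Under these assumptions the 4-bit weights take a quarter of the space of 16-bit weights, which is what brings a model of this size within reach of a single Colab GPU.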
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,  # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 32,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",  # memory-saving checkpointing mode
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)
We apply LoRA (Low-Rank Adaptation) to the Qwen3 model using FastLanguageModel.get_peft_model. It injects trainable adapters into specific transformer layers (such as q_proj, v_proj, etc.) with a rank of 32, enabling efficient fine-tuning while keeping most of the model weights frozen. Using "unsloth" gradient checkpointing further reduces memory usage, which makes the setup suitable for training large models on limited hardware.
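The savings from a rank-32 adapter can be illustrated by counting parameters. In standard LoRA, a frozen weight matrix of shape (d_out, d_in) is augmented with two small matrices, A of shape (r, d_in) and B of shape (d_out, r), and the update is scaled by lora_alpha / r. The sketch below uses a hypothetical 4096 x 4096 projection; the dimensions are illustrative and are not taken from the Qwen3-14B configuration.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

def lora_scaling(lora_alpha, r):
    """Standard LoRA scaling applied to the low-rank update (not the rsLoRA variant)."""
    return lora_alpha / r

# Hypothetical square projection; the size is illustrative only.
D_IN, D_OUT, RANK, ALPHA = 4096, 4096, 32, 32

full = D_IN * D_OUT
added = lora_param_count(D_IN, D_OUT, RANK)
print(f"full matrix: {full:,} parameters")
print(f"LoRA adapter: {added:,} parameters ({100 * added / full:.2f}% of the matrix)")
print(f"scaling factor: {lora_scaling(ALPHA, RANK)}")
```

With r and lora_alpha both set to 32, as in the tutorial, the scaling factor is 1.0, so the low-rank update is added to the frozen weights unscaled.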
from datasets import load_dataset
reasoning_dataset = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
non_reasoning_dataset = load_dataset("mlabonne/FineTome-100k", split="train")
We load two pre-curated datasets from the Hugging Face Hub using the datasets library. The reasoning_dataset contains chain-of-thought (CoT) problems from Unsloth's OpenMathReasoning-mini, designed to improve logical reasoning in the model. The non_reasoning_dataset pulls general instruction-following data from mlabonne's FineTome-100k, which helps the model learn broader conversational and task-oriented skills. Together, these datasets support a well-rounded fine-tuning objective.
def generate_conversation(examples):
    # examples is a batch: each column maps to a list of values
    problems = examples["problem"]
    solutions = examples["generated_solution"]
    conversations = []
    for problem, solution in zip(problems, solutions):
        # One two-turn conversation per (problem, solution) pair
        conversations.append([
            {"role": "user", "content": problem},
            {"role": "assistant", "content": solution},
        ])
    return {"conversations": conversations}
This function, generate_conversation, transforms raw question-answer pairs from the reasoning dataset into a chat-style format suitable for fine-tuning. For each problem and its corresponding generated solution, it builds a conversation in which the user asks the question and the assistant provides the answer. The output is a dictionary whose "conversations" entry is a list of such conversations, following the structure expected by chat-based language models and preparing the data for tokenization with a chat template.
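To see the structure this produces, the function can be run on a small hand-written batch. The toy problems and solutions below are invented for illustration; only the column names, "problem" and "generated_solution", come from the tutorial.

```python
def generate_conversation(examples):
    problems = examples["problem"]
    solutions = examples["generated_solution"]
    conversations = []
    for problem, solution in zip(problems, solutions):
        conversations.append([
            {"role": "user", "content": problem},
            {"role": "assistant", "content": solution},
        ])
    return {"conversations": conversations}

# Hypothetical toy batch in the column-oriented layout that a batched map passes in.
batch = {
    "problem": ["What is 2 + 3?", "What is 4 * 5?"],
    "generated_solution": ["2 + 3 = 5", "4 * 5 = 20"],
}
result = generate_conversation(batch)
print(len(result["conversations"]))
print(result["conversations"][0])
```

Each entry in the returned list is a two-message conversation, one user turn followed by one assistant turn.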
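# Build the "conversations" column with generate_conversation, then render each conversation to text.
reasoning_conversations = tokenizer.apply_chat_template(
    reasoning_dataset.map(generate_conversation, batched=True)["conversations"],
    tokenize=False,
)
from unsloth.chat_templates import standardize_sharegpt
# Normalize the ShareGPT-style records before applying the chat template.
dataset = standardize_sharegpt(non_reasoning_dataset)
non_reasoning_conversations = tokenizer.apply_chat_template(
    dataset["conversations"],
    tokenize=False,
)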
import pandas as pd
chat_percentage = 0.75
# Sample instruction conversations; the count is set relative to the reasoning set.
non_reasoning_subset = pd.Series(non_reasoning_conversations).sample(
    int(len(reasoning_conversations) * (1.0 - chat_percentage)),
    random_state=2407,
)
data = pd.concat([
    pd.Series(reasoning_conversations),
    pd.Series(non_reasoning_subset),
])
data.name = "text"
We prepare the fine-tuning dataset by converting the reasoning and instruction datasets into a consistent chat format and then combining them. First, tokenizer.apply_chat_template converts the structured conversations into strings ready for tokenization. The standardize_sharegpt function normalizes the instruction dataset into a compatible structure. Then the two sources are mixed: with chat_percentage = 0.75, the code samples a number of non-reasoning (instruction) conversations equal to 25% of the number of reasoning conversations and concatenates them with the reasoning data, so the combined set is roughly 80% reasoning and 20% instruction data rather than an exact 75-25 split. This mix ensures that the model is exposed to both logical reasoning and general instruction-following tasks, improving its versatility. The final combined data is stored as a single-column pandas Series named "text".
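The size of the sampled instruction subset follows directly from the arithmetic in the sampling call. The sketch below mirrors that arithmetic with a hypothetical reasoning-set size of 1000; the helper name `mix_sizes` is an illustration and is not part of the tutorial. Because the subset is sized relative to the reasoning set, the resulting shares differ slightly from the nominal percentage.

```python
def mix_sizes(num_reasoning, chat_percentage):
    """Mirror the sampling arithmetic: the instruction subset is sized relative to the reasoning set."""
    num_chat = int(num_reasoning * (1.0 - chat_percentage))
    total = num_reasoning + num_chat
    return num_chat, num_reasoning / total, num_chat / total

# Hypothetical dataset size, chosen only to make the numbers easy to read.
num_chat, reasoning_share, chat_share = mix_sizes(num_reasoning=1000, chat_percentage=0.75)
print(num_chat)
print(f"reasoning: {reasoning_share:.0%}, instruction: {chat_share:.0%}")
```

Adjusting chat_percentage changes the instruction share, but the reasoning conversations are always kept in full.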
from datasets import Dataset
combined_dataset = Dataset.from_pandas(pd.DataFrame(data))
combined_dataset = combined_dataset.shuffle(seed=3407)
from trl import SFTTrainer, SFTConfig
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=combined_dataset,
    eval_dataset=None,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=30,
        learning_rate=2e-4,
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        report_to="none",
    )
)
We take the prepared conversations, wrap them in a Hugging Face Dataset (ensuring the data is in a consistent format), and shuffle the dataset with a fixed seed for reproducibility. Then the fine-tuning trainer is initialized using TRL's SFTTrainer and SFTConfig. The trainer is configured to use the combined dataset (with the text column named "text") and defines training hyperparameters such as batch size, gradient accumulation, the number of warmup and training steps, the learning rate, optimizer settings, and a linear learning-rate schedule. This configuration is geared toward efficient fine-tuning while maintaining reproducibility and keeping logging minimal (with report_to="none").
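Two consequences of these hyperparameters can be worked out by hand: the effective batch size and the shape of the learning-rate curve. The sketch below assumes a single GPU and the common linear schedule, with linear warmup followed by linear decay to zero. Both helper functions are hypothetical and are written here for illustration rather than taken from TRL.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of sequences contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices

def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Values from the tutorial's SFTConfig; a single device is an assumption.
batch = effective_batch_size(per_device_batch=2, grad_accum_steps=4)
print(f"effective batch size: {batch}")
print(f"sequences seen in 30 steps: {batch * 30}")

for step in (0, 5, 30):
    print(step, linear_schedule_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=30))
```

With max_steps=30, the run is a short demonstration that touches only a small number of training sequences. A longer run would raise max_steps or train by epochs instead.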
trainer.train()
trainer.train() starts the fine-tuning process for the Qwen3-14B model using SFTTrainer. It trains the model on the mixed dataset of reasoning and instruction-following conversations, optimizing only the LoRA adapter parameters thanks to the underlying Unsloth setup. Training proceeds according to the configuration specified earlier (for example, max_steps=30, per_device_train_batch_size=2, learning_rate=2e-4), and progress is printed at every logging step. This final command launches the actual model adaptation on your custom data.
model.save_pretrained("qwen3-finetuned-colab")
tokenizer.save_pretrained("qwen3-finetuned-colab")
We save the fine-tuned model and tokenizer locally to the "qwen3-finetuned-colab" directory. By calling save_pretrained(), the adapted weights and tokenizer configuration can be reloaded later for inference or further training, either locally or after uploading them to the Hugging Face Hub.
In conclusion, with the help of Unsloth AI, fine-tuning massive LLMs like Qwen3-14B becomes feasible on limited resources, and is both efficient and accessible. This tutorial showed how to load a 4-bit quantized version of the model, apply structured chat templates, mix multiple datasets for better generalization, and train using TRL's SFTTrainer. Whether you are building custom assistants or specialized domain models, Unsloth's tools considerably lower the barrier to fine-tuning at scale. As open-source fine-tuning ecosystems evolve, Unsloth continues to lead the way in making LLM training faster, cheaper, and more practical for everyone.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has more than 2 million monthly views, illustrating its popularity among audiences.
