[Python] Why Does LoRA Show 1.4B Trainable Params Instead of 38M When Fine-Tuning Gemma 3 4B?

Discussion in 'Python' started by Stack, December 1, 2025.


    I found a code snippet for fine-tuning the Gemma-3 4B model on an OCR dataset that converts handwritten math formulas into LaTeX. The original author shared their results showing about **38 million trainable parameters** when using LoRA.

    I copied the code exactly — without modifying even a single line — and ran it on Google Colab using the Unsloth library for LoRA fine-tuning. However, during training, it reports **1.4 billion trainable parameters** instead of 38 million.

    I’m not sure why this huge difference is happening, especially since the code is identical to the original. Does anyone know what might be causing this mismatch?
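    For context, LoRA's trainable-parameter count can be estimated directly from the adapter rank and the shapes of the targeted weight matrices, since each adapted weight only contributes two small low-rank factors. The sketch below uses purely illustrative dimensions (they are assumptions, not the real Gemma-3 4B config) to show why a correctly applied LoRA setup lands in the tens of millions, far below 1.4 billion; a count near the full model size usually means the base weights were never frozen (e.g., the PEFT/LoRA wrapper was not actually applied).

    ```python
    def lora_params(d_in: int, d_out: int, r: int) -> int:
        # A weight W of shape (d_out, d_in) adapted by LoRA gains two
        # trainable factors: A of shape (r, d_in) and B of shape (d_out, r).
        # The frozen base weight contributes nothing to the trainable count.
        return r * (d_in + d_out)

    # Illustrative dimensions only -- NOT the verified Gemma-3 4B config.
    hidden = 2560      # assumed hidden size
    num_layers = 34    # assumed number of transformer layers
    r = 16             # a common LoRA rank

    # Suppose the q/k/v/o attention projections are targeted in every layer
    # and each is a square (hidden x hidden) matrix for simplicity.
    per_layer = 4 * lora_params(hidden, hidden, r)
    total = num_layers * per_layer
    print(f"{total:,} trainable LoRA params")  # on the order of millions
    ```

    Even doubling this estimate to cover MLP projections stays orders of magnitude below 1.4B, which is why a billion-scale trainable count points to the whole base model being unfrozen rather than a rank misconfiguration.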

