🚀 Big News from Google: Data-Saving Active Learning for Fine-Tuning LLMs with 10,000× Less Data!

Google's come up with a slick active learning process that slashes the need for labeled data when fine-tuning large language models—think of it as getting your model ready for tough tasks like content moderation without breaking the bank. 💰

🟢 Here’s how it rolls:

1. The starting model (LLM-0) gets a prompt and labels tons of data automatically.

2. Clustering spots where the model gets confused (aka those tricky, juicy learning moments).

3. Data selection: pick out the most informative and diverse examples from these clusters.

4. Expert labeling—only for the chosen few.

5. Iteration: retrain the model → select more tricky examples → label → repeat (see the sketch below).

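Here's a minimal sketch of that loop in Python. To be clear, this is my illustration, not Google's published code: `llm.predict_proba`, `llm.embed`, `fine_tune`, `ask_expert`, and the loaders are all hypothetical stand-ins, and the specific choices (KMeans for clustering, prediction entropy for "confusion") are just one plausible way to implement steps 2–3.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.cluster import KMeans

def select_informative(llm, pool, n_clusters=20, per_cluster=2):
    """Steps 1-3: auto-label the pool, find where the model is confused,
    and pick confusing-but-diverse examples across clusters."""
    probs = np.asarray(llm.predict_proba(pool))  # step 1: LLM-0 labels everything itself
    uncertainty = entropy(probs.T)               # high entropy = model is confused
    clusters = KMeans(n_clusters=n_clusters, n_init="auto").fit_predict(llm.embed(pool))

    chosen = []
    for c in range(n_clusters):                  # per-cluster picks keep the batch diverse
        idx = np.where(clusters == c)[0]
        ranked = idx[np.argsort(-uncertainty[idx])]
        chosen.extend(ranked[:per_cluster].tolist())
    return chosen

# Steps 4-5: experts label only the chosen few, then retrain and repeat.
llm, pool = load_model("llm-0"), load_unlabeled_pool()  # hypothetical loaders
for _ in range(5):
    picked = select_informative(llm, pool)
    batch = [pool[i] for i in picked]
    labels = ask_expert(batch)            # hypothetical human-labeling call
    llm = fine_tune(llm, batch, labels)   # hypothetical fine-tuning helper
```

Taking the most uncertain items per cluster, rather than globally, is what keeps the expert batch diverse instead of piling everything up near one decision boundary.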
🟢 Results?

- Cut down from 100,000 labeled examples to under 500 while keeping or boosting quality! 🔥

- Cohen’s Kappa metric improves by 55–65%.

- In big production models, we’re talking 3–4 orders of magnitude less data while maintaining or enhancing quality!

🟢 What's Cohen’s Kappa?

It's a metric that gauges agreement between two "judges" (like an expert and the model), adjusting for random chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement you'd expect by chance alone.

- ≤ 0.0 — agreement no better than chance (negative values mean worse than chance)

- 0.01–0.40 — slight to fair agreement

- 0.41–0.60 — moderate agreement

- 0.61–0.80 — substantial

- 0.81–1.00 — almost perfect agreement

For imbalanced tasks, Kappa provides a fairer assessment than plain accuracy.
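A quick toy demonstration of that point, using scikit-learn's `cohen_kappa_score` (the data here is made up for illustration): on a 90/10 imbalanced task, a lazy model that always predicts the majority class gets 90% accuracy but a kappa of exactly zero, while a model that actually catches the rare class scores high on both.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Imbalanced toy task: 90 "safe" items, 10 "violation" items (made-up data).
y_true = np.array([0] * 90 + [1] * 10)

# A lazy model that always predicts the majority class.
y_lazy = np.zeros(100, dtype=int)

# A model that actually catches the rare class, with two mistakes.
y_good = y_true.copy()
y_good[[5, 95]] = 1 - y_good[[5, 95]]    # flip two predictions as errors

print(accuracy_score(y_true, y_lazy))     # 0.90  -- looks great, but...
print(cohen_kappa_score(y_true, y_lazy))  # 0.0   -- no better than chance
print(accuracy_score(y_true, y_good))     # 0.98
print(cohen_kappa_score(y_true, y_good))  # ~0.89 -- genuinely good
```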

🟢 Why is this better than old-school methods?

- Targeted labeling: only the most informative examples get tagged.

- Scalability: works on datasets with billions of examples.

- Resource-saving: less time and money spent on labeling.

- Fast adaptation: perfect for domains with shifting rules (ads, moderation, security).

🟢 The takeaway:

With smart data selection, you can adapt LLMs thousands of times faster and cheaper than traditional methods! 💡✨