10,2 ΡΡΡ ΠΏΠΎΠ΄ΠΏΠΈΡΡΠΈΠΊΠΎΠ²
π¬ ΠΠΎΠ»Π΅Π·Π½ΡΠ΅ NLP ΠΈΠ½ΡΡΡΡΠΌΠ΅Π½ΡΡ: ΠΠΈΠ±Π»ΠΈΠΎΡΠ΅ΠΊΠ° fastText
fastText - ΡΡΠΎ Π±ΠΈΠ±Π»ΠΈΠΎΡΠ΅ΠΊΠ° Π΄Π»Ρ Π°Π½Π°Π»ΠΈΠ·Π° ΠΈ ΠΊΠ»Π°ΡΡΠΈΡΠΈΠΊΠ°ΡΠΈΠΈ ΡΠ΅ΠΊΡΡΠ°.
ΠΠΎΡ ΠΊΠ°ΠΊ Π·Π°Π³ΡΡΠ·ΠΈΡΡ ΠΈ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΠΏΡΠ΅Π΄Π²Π°ΡΠΈΡΠ΅Π»ΡΠ½ΠΎ ΠΎΠ±ΡΡΠ΅Π½Π½ΡΠ΅ ΠΌΠΎΠ΄Π΅Π»ΠΈ:
import fasttext
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id="facebook/fasttext-en-vectors", filename="model.bin")
model = fasttext.load_model(model_path)
model.words
['the', 'of', 'and', 'to', 'in', 'a', 'that', 'is', ...]
len(model.words)
145940
model['bread']
array([ 4.89417791e-01, 1.60882145e-01, -2.25947708e-01, -2.94273376e-01,
-1.04577184e-01, 1.17962055e-01, 1.34821936e-01, -2.41778508e-01, ...])
Π ΡΠ»Π΅Π΄ΡΡΡΠ΅ΠΌ ΠΏΡΠΈΠΌΠ΅ΡΡ ΠΌΡ Π±ΡΠ΄Π΅ΠΌ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΠΌΠ΅ΡΠΎΠ΄ Π±Π»ΠΈΠΆΠ°ΠΉΡΠΈΡ
ΡΠΎΡΠ΅Π΄Π΅ΠΉ:
import fasttext
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id="facebook/fasttext-en-nearest-neighbors", filename="model.bin")
model = fasttext.load_model(model_path)
model.get_nearest_neighbors("bread", k=5)
[(0.5641006231307983, 'butter'),
(0.48875734210014343, 'loaf'),
(0.4491206705570221, 'eat'),
(0.42444291710853577, 'food'),
(0.4229326844215393, 'cheese')]
ΠΠΎΡ ΠΊΠ°ΠΊ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΡΡΡ ΠΌΠΎΠ΄Π΅Π»Ρ Π΄Π»Ρ ΠΎΠΏΡΠ΅Π΄Π΅Π»Π΅Π½ΠΈΡ ΡΠ·ΡΠΊΠ° ΠΈΠ· Π²Π²Π΅Π΄Π΅Π½Π½ΠΎΠ³ΠΎ ΡΠ΅ΠΊΡΡΠ°:
import fasttext
from huggingface_hub import hf_hub_download
model_path = hf_hub_download(repo_id="facebook/fasttext-language-identification", filename="model.bin")
model = fasttext.load_model(model_path)
model.predict("Hello, world!")
(('__label__eng_Latn',), array([0.81148803]))
model.predict("Hello, world!", k=5)
(('__label__eng_Latn', '__label__vie_Latn', '__label__nld_Latn', '__label__pol_Latn', '__label__deu_Latn'),
array([0.61224753, 0.21323682, 0.09696738, 0.01359863, 0.01319415]))
βͺGithub
1 ΠΌΠΈΠ½ΡΡΠ°
6Β ΠΈΡΠ½ΡΒ 2023
196 ΡΠΈΡΠ°Π»ΠΈ