[LLM] Mistral 7B v0.2 Base Model Released


The Mistral 7B v0.2 Base Model has reportedly been released.

https://twitter.com/marvinvonhagen/status/1771609042542039421

Marvin von Hagen (@marvinvonhagen) on X


The notable changes in this version are as follows.

Mistral just announced at @SHACK15sf that they will release a new model today: Mistral 7B v0.2 Base Model

- 32k instead of 8k context window
- Rope Theta = 1e6
- No sliding window
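To get a feel for what "Rope Theta = 1e6" changes, here is a minimal sketch using the standard RoPE inverse-frequency formula. The head dimension of 128 and the comparison value of 1e4 (the common default, which I assume is what the earlier version used) are my assumptions, not something stated in the announcement.

```python
import math


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Inverse frequencies theta^(-2i/d) for i = 0 .. d/2 - 1 (standard RoPE)."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Wavelength, in tokens, of the slowest-rotating dimension pair."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


HEAD_DIM = 128  # assumption: 4096 hidden size / 32 attention heads

# Compare the assumed older base (1e4) with the announced one (1e6).
for theta in (1e4, 1e6):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:.0e}: longest wavelength ~ {wavelength:,.0f} tokens")
```

A larger base stretches the slowest rotations over many more positions, which is the usual motivation for raising it when extending the context window.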

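The context window has also grown from the previous 8k to 32k.

The defining feature of v0.1 was the addition of Sliding Window Attention (SWA), but v0.2 reportedly drops SWA.

I was curious why they added it in the first place, and now they have taken it right back out, lol.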
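For anyone unfamiliar with the difference, the sketch below contrasts a plain causal mask with a sliding-window mask. It only illustrates the idea and is not Mistral's implementation. Treating `window` as the number of tokens a position can see, itself included, is my own convention here.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """Full causal attention: position i may attend to every j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def sliding_window_mask(n: int, window: int) -> list[list[bool]]:
    """Causal attention limited to the last `window` positions.

    Assumed convention: `window` counts the current token as well.
    """
    return [[(j <= i) and (i - j < window) for j in range(n)] for i in range(n)]


def show(mask: list[list[bool]]) -> None:
    """Print a mask with 'x' for allowed and '.' for blocked positions."""
    for row in mask:
        print("".join("x" if allowed else "." for allowed in row))


print("full causal:")
show(causal_mask(6))
print("sliding window (window=3):")
show(sliding_window_mask(6, 3))
```

With the window removed, every token can attend directly to the whole preceding context instead of only its most recent neighbors.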

Model download link: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar

The model file is around 13 GB, which is not that large, so I went ahead and downloaded it.
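If you would rather script the download, a minimal sketch with the Python standard library could look like this. The helper names `download` and `extract` are made up for illustration, and I am not assuming anything about what the archive contains.

```python
import tarfile
import urllib.request
from pathlib import Path

MODEL_URL = "https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar"


def download(url: str, dest: Path) -> Path:
    """Download `url` to `dest`, skipping the transfer if `dest` already exists."""
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive: Path, out_dir: Path) -> list[str]:
    """Extract a tar archive into `out_dir` and return the member names."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        names = tar.getnames()
        # Refuse members that would land outside out_dir.
        for name in names:
            parts = Path(name)
            if parts.is_absolute() or ".." in parts.parts:
                raise ValueError(f"unsafe path in archive: {name}")
        tar.extractall(out_dir)
    return names
```

Calling `extract(download(MODEL_URL, Path("mistral-7B-v0.2.tar")), Path("mistral-7B-v0.2"))` would fetch and unpack everything in one go. The local file and directory names are arbitrary, and the transfer is roughly 13 GB, so make sure you have the disk space first.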

A version converted to the HF Transformers format has also been uploaded to Hugging Face.

https://huggingface.co/alpindale/Mistral-7B-v0.2-hf


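Now all that is left is to pour in some Korean data and build a Korean derivative model..! ^^

As it happens, Markr AI recently released KoCommercial Dataset, a large-scale Korean dataset, license-free,

so the data is already there..! It would be fun to give it a run, haha.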

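Large-scale Korean dataset: Markr AI - KoCommercial Dataset

Overview: Last month, Markr AI collected, built, and released a Korean fine-tuning dataset of 1.4 million samples. It is no exaggeration to say that securing a dataset is 80%, or even 90%, of fine-tuning an LLM. ...

didi-universe.tistory.com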

