Codart Studio

Phi-2: A Locally Deployable Small Language Model


Phi-2 is a small language model released by Microsoft, with only 2.7B parameters.

At the time of writing, it is probably the strongest LLM with fewer than 3B parameters.

Deployment is simple, and the model files total only about 5 GB.
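
The roughly 5 GB figure is about what the parameter count would suggest. Here is a back-of-the-envelope sketch, assuming the weights are stored as 16-bit floats (2 bytes per parameter). That storage format is an assumption, not something stated in this post, and the function name is hypothetical:

```python
def estimate_weight_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# 2.7B parameters at 2 bytes each (assumed 16-bit storage)
size_gb = estimate_weight_size_gb(2.7e9)
print(f"Estimated weight size: {size_gb:.1f} GB")
```

The same estimate gives a lower bound on the GPU memory needed just to hold the weights; activations and the generation cache come on top of that.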

Environment setup:

conda create --name phi2 python=3.9
conda activate phi2
pip install torch
pip install modelscope
pip install transformers
pip install accelerate
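
After running the commands above, a quick sanity check can confirm that the four packages are visible to the interpreter in the phi2 environment. This is a minimal sketch using only the standard library; the helper name is hypothetical:

```python
import importlib.util

# The packages installed in the setup step
REQUIRED = ["torch", "modelscope", "transformers", "accelerate"]


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

This only checks that the packages can be located. It does not verify versions or that CUDA is usable.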

Download the model repository:

git clone https://www.modelscope.cn/AI-ModelScope/phi-2.git
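
Model repositories like this one typically store large weight files through Git LFS. If Git LFS is not set up, git clone may leave small text pointer files in place of the real weights. The sketch below looks for such stubs. It assumes the repository was cloned into a directory named phi-2 and that weight files end in .safetensors or .bin; both are assumptions, and the function name is hypothetical:

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir, suffixes=(".safetensors", ".bin")):
    """List weight files that are still Git LFS pointer stubs."""
    pointers = []
    for path in sorted(Path(repo_dir).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            with path.open("rb") as fh:
                if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                    pointers.append(path)
    return pointers


# "phi-2" is the assumed clone directory; nothing is printed if it does not exist
for stub in find_lfs_pointers("phi-2"):
    print("LFS pointer, not real weights:", stub)
```

If any stubs are reported, installing Git LFS and fetching the large files again should replace them with the real weights.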

Example code:

import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer

# Create all tensors, including the model weights, on the GPU by default
torch.set_default_device("cuda")

# Load the model and tokenizer from ModelScope
model = AutoModelForCausalLM.from_pretrained(
    "AI-ModelScope/phi-2", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(
    "AI-ModelScope/phi-2", trust_remote_code=True)

# Code-format prompt: a function signature and docstring for the model to complete
inputs = tokenizer('''def print_prime(n):
    """
    Print all primes between 1 and n
    """''', return_tensors="pt", return_attention_mask=False)

# max_length counts the prompt tokens as well as the generated ones
outputs = model.generate(**inputs, max_length=200)
text = tokenizer.batch_decode(outputs)[0]
print(text)
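
The example hard-codes "cuda" as the default device, so it fails on a machine without a GPU. Here is a minimal sketch of a fallback; the helper name is hypothetical, and running Phi-2 on CPU has not been tested here and is likely to be much slower:

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no GPU can be used
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Using device:", device)
# In the example code, pass the result on: torch.set_default_device(device)
```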

Phi-2 supports three prompt formats: QA, chat, and code.

In QA mode, you can pass the prompt text directly:

inputs = tokenizer("Please tell me about China.",
                   return_tensors="pt", return_attention_mask=False)

Sample output:

Please tell me about China. I want to know what it's like.

I'm sorry, I don't know much about China. I've never been there.

That's okay. I'll tell you what I know. China is a big country in Asia.
It has a lot of people and a lot of different cultures.

Wow, that sounds really interesting. What kind of cultures?

Well, there are many different cultures in China. Some people speak Chinese,
some speak other languages like English or Spanish.
And there are many different foods, clothes, and traditions.

That's so cool! I want to learn more about China.

Me too! Maybe we can do some research together and find out more.

That's a great idea! Let's do it.

You can also write the prompt in the Instruct: <prompt>\nOutput: format, which encourages the model to give a concise answer:

inputs = tokenizer("Instruct: Please tell me about China.\nOutput:",
                   return_tensors="pt", return_attention_mask=False)

Sample output:

Instruct: Please tell me about China.
Output: China is a country located in East Asia. It is the world's most
populous country, with a population of over 1.4 billion people. China is
known for its rich history, ancient civilizations, and cultural heritage.
It is the world's second-largest economy and a major global power.
China has a diverse landscape, ranging from the Himalayas in the north
to the Gobi Desert in the west. The country is home to many famous
landmarks, such as the Great Wall of China, the Terracotta Army, and
the Forbidden City. Chinese cuisine is also renowned
worldwide, with dishes like Peking duck, dumplings, and hot pot.
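
Because the decoded output repeats the prompt, it can help to build the Instruct prompt and strip it from the result with small helpers. This is a minimal sketch in plain Python; both function names are hypothetical and not part of modelscope or transformers:

```python
def build_instruct_prompt(instruction: str) -> str:
    """Wrap a request in the Instruct/Output prompt format."""
    return f"Instruct: {instruction.strip()}\nOutput:"


def extract_answer(generated: str) -> str:
    """Return the text after the first "Output:" marker, if there is one."""
    _, marker, answer = generated.partition("Output:")
    return answer.strip() if marker else generated.strip()


prompt = build_instruct_prompt("Please tell me about China.")
print(repr(prompt))
```

The prompt string can be passed to the tokenizer as in the snippet above, and extract_answer can be applied to the decoded text.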

In chat format, the model may generate several dialogue turns. You can prompt it like this:

inputs = tokenizer("Alice: Please tell me about China.\nBob:",
                   return_tensors="pt", return_attention_mask=False)

Sample output:

Alice: Please tell me about China.
Bob: China is a country in Asia. It has a long history and a large population.
It is known for its ancient culture, its modern economy,
and its diverse landscapes.
Alice: Wow, that sounds amazing. What about India?
Bob: India is a country in Asia. It has a rich history and a large population.
It is known for its ancient civilization, its modern democracy,
and its diverse religions.
Alice: Wow, that sounds amazing too. How are they different?
Bob: Well, they are different in many ways. For example,
China has a one-party system, while India has a multi-party system.
China has a centralized government, while India has a federal government.
China has a large urban population, while India has a large rural population.
Alice: I see. How are they similar?
Bob: Well, they are similar in some ways too. For example, they both have ...
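
As the sample shows, the model keeps writing turns for both speakers. If only the first reply is wanted, one option is to cut the decoded text at the next user turn. This is a minimal sketch in plain Python; the helper names are hypothetical, and the speaker names Alice and Bob are taken from the prompt above:

```python
def build_chat_prompt(message: str, user: str = "Alice", bot: str = "Bob") -> str:
    """Format one user message and leave the bot's turn open."""
    return f"{user}: {message.strip()}\n{bot}:"


def first_reply(generated: str, prompt: str, user: str = "Alice") -> str:
    """Keep only the text between the prompt and the next user turn."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    reply, _, _ = generated.partition(f"\n{user}:")
    return reply.strip()


prompt = build_chat_prompt("Please tell me about China.")
sample = prompt + " China is a country in Asia.\nAlice: What about India?\nBob: India is too."
print(first_reply(sample, prompt))
```

This trims the text only after generation has finished; the model still spends time producing the extra turns.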

For the code format, see the example code above.
