文本转换(Transforming)

如何利用大语言模型(LLM)强大的文本转换能力,通过编程调用 API 接口,实现包括多语种翻译、拼写与语法纠正、语气调整及格式转换等多种功能。

1. 多语种翻译

模型不仅能进行基础的语言互译,还能识别语种、调整翻译语气,甚至构建一个通用的翻译工作流。

  • 基础翻译 :将一种语言翻译成另一种。

    python 复制代码
    prompt = f"""
    将以下中文翻译成西班牙语: \ 
    ```您好,我想订购一个搅拌机。```
    """
    response = get_completion(prompt)
    print(response)
    • **输出:**Hola, me gustaría ordenar una batidora.
  • 语种识别 :让模型判断一段文本的语言。

    python 复制代码
    prompt = f"""
    请告诉我以下文本是什么语种: 
    ```Combien coûte le lampadaire?```
    """
    response = get_completion(prompt)
    print(response)
    • 输出这是法语。
  • 多语种翻译 :一次性将文本翻译成多种语言。

    python 复制代码
    prompt = f"""
    请将以下文本分别翻译成中文、英文、法语和西班牙语: 
    ```I want to order a basketball.```
    """
    response = get_completion(prompt)
    print(response)
    • 输出
      • 中文:我想订购一个篮球。
      • 法语:Je veux commander un ballon de basket.
      • ...
  • 翻译+语气调整 :在翻译的同时,指定输出的语气风格。

    python 复制代码
    prompt = f"""
    请将以下文本翻译成中文,分别展示成正式与非正式两种语气: 
    ```Would you like to order a pillow?```
    """
    response = get_completion(prompt)
    print(response)
    • 输出
      • 正式语气:请问您需要订购枕头吗?
      • 非正式语气:你要不要订一个枕头?
  • 通用翻译器:结合语种识别和翻译,构建一个自动化的翻译流程。

python 复制代码
user_messages = [
  "La performance du système est plus lente que d'habitude.",  # System performance is slower than normal         
  "Mi monitor tiene píxeles que no se iluminan.",              # My monitor has pixels that are not lighting
  "Il mio mouse non funziona",                                 # My mouse is not working
  "Mój klawisz Ctrl jest zepsuty",                             # My keyboard has a broken control key
  "我的屏幕在闪烁"                                             # My screen is flashing
]

for issue in user_messages:
    prompt = f"告诉我以下文本是什么语种,直接输出语种,如法语,无需输出标点符号: ```{issue}```"
    lang = get_completion(prompt)
    print(f"原始消息 ({lang}): {issue}\n")

    prompt = f"""
    将以下消息分别翻译成英文和中文,并写成
    中文翻译:xxx
    英文翻译:yyy
    的格式:
    ```{issue}```
    """
    response = get_completion(prompt)
    print(response, "\n=========================================")

结果:

复制代码
原始消息 (法语): La performance du système est plus lente que d'habitude.

中文翻译:系统性能比平时慢。
英文翻译:The system performance is slower than usual. 
=========================================
原始消息 (西班牙语): Mi monitor tiene píxeles que no se iluminan.

中文翻译:我的显示器有一些像素点不亮。
英文翻译:My monitor has pixels that don't light up. 
=========================================
..............

2. 语气/风格调整

根据目标受众和场景,改变文本的写作风格。例如:将口语化的中文转换成正式的商务信函。

python 复制代码
prompt = f"""
将以下文本翻译成商务信函的格式: 
```小老弟,我小羊,上回你说咱部门要采购的显示器是多少寸来着?```
"""
response = get_completion(prompt)
print(response)

结果:

尊敬的XXX(收件人姓名):

您好!我是XXX(发件人姓名),在此向您咨询一个问题。上次我们交流时,您提到我们部门需要采购显示器,但我忘记了您所需的尺寸是多少英寸。希望您能够回复我,以便我们能够及时采购所需的设备。

谢谢您的帮助!

此致

敬礼

复制代码
XXX(发件人姓名)

3. 格式转换

将文本从一种数据格式转换为另一种,例如从 JSON 转换为 HTML。

python 复制代码
data_json = { "resturant employees" :[ 
    {"name":"Shyam", "email":"shyamjaiswal@gmail.com"},
    {"name":"Bob", "email":"bob32@gmail.com"},
    {"name":"Jai", "email":"jai87@gmail.com"}
]}
prompt = f"""
将以下Python字典从JSON转换为HTML表格,保留表格标题和列名:{data_json}
"""
response = get_completion(prompt)
print(response)

from IPython.display import display, Markdown, Latex, HTML, JSON
display(HTML(response))

结果:

4. 拼写及语法纠正

充当一个智能校对工具,自动发现并纠正文本中的错误。

  • 基础纠错 :循环处理一个句子列表,纠正其中的主谓不一致、同音异义词误用、拼写错误等。

    python 复制代码
    text = [ 
      "The girl with the black and white puppies have a ball.",  # The girl has a ball.
      "Yolanda has her notebook.", # ok
      "Its going to be a long day. Does the car need it's oil changed?",  # Homonyms
      "Their goes my freedom. There going to bring they're suitcases.",  # Homonyms
      "Your going to need you're notebook.",  # Homonyms
      "That medicine effects my ability to sleep. Have you heard of the butterfly affect?", # Homonyms
      "This phrase is to cherck chatGPT for speling abilitty"  # spelling
    ]
    
    for i in range(len(text)):
        prompt = f"""请校对并更正以下文本,注意纠正文本保持原始语种,无需输出原始文本。
        如果您没有发现任何错误,请说"未发现错误"。
        
        例如:
        输入:I are happy.
        输出:I am happy.
        ```{text[i]}```"""
        response = get_completion(prompt)
        print(i, response)

    结果:

    复制代码
    0 The girl with the black and white puppies has a ball.
    1 未发现错误。
    2 It's going to be a long day. Does the car need its oil changed?
    3 Their goes my freedom. They're going to bring their suitcases.
    4 You're going to need your notebook.
    5 That medicine affects my ability to sleep. Have you heard of the butterfly effect?
    6 This phrase is to check chatGPT for spelling abil。

5. 综合转换任务

将以上多种能力组合起来(文本翻译+拼写纠正+风格调整+格式转换),通过一个复杂的 Prompt 完成端到端的文本处理。

python 复制代码
text = f"""
Got this for my daughter for her birthday cuz she keeps taking \
mine from my room.  Yes, adults also like pandas too.  She takes \
it everywhere with her, and it's super soft and cute.  One of the \
ears is a bit lower than the other, and I don't think that was \
designed to be asymmetrical. It's a bit small for what I paid for it \
though. I think there might be other options that are bigger for \
the same price.  It arrived a day earlier than expected, so I got \
to play with it myself before I gave it to my daughter.
"""

prompt = f"""
针对以下三个反引号之间的英文评论文本,
首先进行拼写及语法纠错,
然后将其转化成中文,
再将其转化成优质淘宝评论的风格,从各种角度出发,分别说明产品的优点与缺点,并进行总结。
润色一下描述,使评论更具有吸引力。
输出结果格式为:
【优点】xxx
【缺点】xxx
【总结】xxx
注意,只需填写xxx部分,并分段输出。
将结果输出成Markdown格式。
```{text}```
"""
response = get_completion(prompt)
display(Markdown(response))