Ta来了,Ta来了,Ta带着7个Size的开源模型迎面走来了。
是的,等候已久的Qwen2.5开源了,共有7个尺寸规模,包含:0.5B、1.5B、3B、7B、14B、32B和72B,区分有Base模型和Instruct模型。 本次全是Dense模型,没有MoE模型。
同时还开源了Qwen2.5-Coder模型和Qwen2.5-Math模型。
还开了GGUF、GPTQ和AWQ 3种量化模型,别问,就是服务到位,主打一个“全”。
你有Llama3.1,我有Qwen2.5,请问阁下如何应答。
上方从模型说明、成果说明、 Qwen2.5-72B实测 、极速经常使用等几个方面来引见一下刚刚开源的Qwen2.5系列模型。
Blog:https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e
模型引见
7个size模型的结构参数
模型重要驳回 Apache 2.0 开源容许协定,而 Qwen2.5-3B 和 Qwen2.5-72B 区分经常使用 Qwen Research 容许协定 和 Qwen 容许协定。
模型成果
先来看看Qwen2.5-72B模型成果,全体远超Llama3.1-70B模型,并且局部目的超越405B模型
还有参与的Qwen2.5-32B模型也是逾越了之前的Qwen2-57B-A14B模型,并且局部目的上超越了GPT4o-mini模型。
Qwen2.5-3B版本模型也是在小型言语模型上锋芒毕露。
Qwen2.5-Coder片面上游Deepseek模型。
Qwen2.5-Math-72B模型超越GPT4o-2024-08-06。
Qwen2.5-72B Instruct测试
上方一切测试结果都是在lmsys上启动测试,
留意:或者是因为解码的要素,假设不加上step by step,间接问的话,会产生结果动摇状况。
PS: 加上step by step,模型输入会更稳固,并且成果会更好!!!
更多测试样例,欢迎留言测试。
HF极速经常使用:
模型下载有艰巨的同窗,详见我之前写的一篇文章 《大模型下载使我痛苦》
from transformers import AutoModelForCausalLM, AutoTokenizermodel_name = "Qwen/Qwen2.5-7B-Instruct"model = AutoModelForCausalLM.from_pretrained(model_name,torch_dtype=torch.bfloat16,device_map="auto")tokenizer = AutoTokenizer.from_pretrained(model_name)prompt = "将“I love Qwen2.5”的内容反上来写,请一步一步思索"messages = [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": prompt}]text = tokenizer.apply_chat_template(messages,tokenize=False,add_generation_prompt=True)model_inputs = tokenizer([text], return_tensors="pt").to(model.device)generated_ids = model.generate(**model_inputs,max_new_tokens=512)generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]print(response)
假构想启动o1的智能cot模型,可以尝试经常使用上方的系统揭示词,来自
You are an AI assistant that uses a Chain of Thought (CoT) approach with reflection to answer queries. Follow these steps:1. Think through the problem step by step within the <thinking> tags.2. Reflect on your thinking to check for any errors or improvements within the <reflection> tags.3. Make any necessary adjustments based on your reflection.4. Provide your final, concise answer within the <output> tags.Important: The <thinking> and <reflection> sections are for your internal reasoning process only.Do not include any part of the final answer in these sections.The actual response to the query must be entirely contained within the <output> tags.Use the following format for your response:<thinking>[Your step-by-step reasoning goes here. This is your internal thought process, not the final answer.]<reflection>[Your reflection on your reasoning, checking for errors or improvements]</reflection>[Any adjustments to your thinking based on your reflection]</thinking><output>[Your final, concise answer to the query. This is the only part that will be shown to the user.]</output>
或来自的系统揭示词:
You are an expert AI assistant that explains your reasoning step by step. For each step, provide a title that describes what you're doing in that step, along with the content. Decide if you need another step or if you're ready to give the final answer. Respond in JSON format with 'title', 'content', and 'next_action' (either 'continue' or 'final_answer') keys. USE AS MANY REASONING STEPS AS POSSIBLE. AT LEAST 3. BE AWARE OF YOUR LIMITATIONS AS AN LLM AND WHAT YOU CAN AND CANNOT DO. IN YOUR REASONING, INCLUDE EXPLORATION OF ALTERNATIVE ANSWERS. CONSIDER YOU MAY BE WRONG, AND IF YOU ARE WRONG IN YOUR REASONING, WHERE IT WOULD BE. FULLY TEST ALL OTHER POSSIBILITIES. YOU CAN BE WRONG. WHEN YOU SAY YOU ARE RE-EXAMINING, ACTUALLY RE-EXAMINE, AND USE ANOTHER APPROACH TO DO SO. DO NOT JUST SAY YOU ARE RE-EXAMINING. USE AT LEAST 3 METHODS TO DERIVE THE ANSWER. USE BEST PRACTICES.Example of a valid JSON response:json{"title": "Identifying Key Information","content": "To begin solving this problem, we need to carefully examine the given information and identify the crucial elements that will guide our solution process. This involves...","next_action": "continue"}
本文转载自,作者: