The article “Alibaba AI: How does the Qwen 2.5 language model work?” by Fabian Peters first appeared on Basic Thinking.
Shortly after DeepSeek, the Chinese online retailer Alibaba has also presented an AI that is supposed to be better than ChatGPT. But what exactly is Qwen 2.5, and how does the language model work?
Just a few days after the Chinese company DeepSeek threw the AI industry into turmoil, the next Chinese AI is making headlines: online retailer and Amazon competitor Alibaba has also presented a language model that is supposed to be better than ChatGPT, and better than DeepSeek as well.
Alibaba AI: What is Qwen 2.5?
Strictly speaking, Alibaba has not developed a completely new AI but a new version of its “Qwen” model. According to the company, the Qwen 2.5 language model outperforms DeepSeek, ChatGPT and Anthropic’s Claude almost across the board. Only Google’s chatbot Gemini can keep up.
Qwen 2.5 is a so-called multimodal language model: the AI can process and output different modalities such as text, images and video. Like DeepSeek, Alibaba’s AI follows an open source approach. The source code of the software is publicly available and can be viewed, modified and used by third parties.
Open source software can usually be used free of charge, provided certain license conditions are met. The open approach also makes it possible to reproduce the performance figures stated for Qwen 2.5 by running performance tests (benchmarks) independently.
If Alibaba had not published its AI as open source software, there would be considerably more doubt about its performance. The same applies to DeepSeek’s language model.
How does the language model work?
Alibaba offers its Qwen 2.5 AI model as a series. There are different versions with different numbers of parameters. According to the company, they range from three billion to seven billion to 72 billion parameters. These parameters are adjustable values that are learned from a model’s training data.
The parameters determine how an AI behaves. However, the number of parameters has no direct influence on the general quality of a language model. Rather, the number, quality and interconnection of the parameters suit different tasks. In other words: Alibaba has released three Qwen versions that cover different areas of application.
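To make the term “parameters” more concrete, here is a minimal Python sketch. It assumes a toy fully connected network, not Qwen’s actual architecture, and the helper name count_parameters and the layer sizes are purely illustrative. The point is only that every weight and every bias is one adjustable value, and that figures such as “seven billion” count values of this kind.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a toy fully connected network.

    layer_sizes is a hypothetical list such as [4, 8, 2]:
    4 inputs, one hidden layer with 8 units, and 2 outputs.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection between layers
        biases = n_out          # one bias per unit in the receiving layer
        total += weights + biases
    return total


# Two illustrative networks of different size (assumed values).
small = count_parameters([4, 8, 2])
large = count_parameters([4, 64, 64, 2])
print(small, large)
```

The larger network has far more parameters, but that alone says nothing about whether it is better suited to a given task.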
With Qwen 2.5-VL, for example, the company offers a model tailored primarily to the visual understanding of text, tables, diagrams and graphics. It is also said to be able to understand videos longer than an hour and answer questions about them.
The Qwen 2.5-1M model, in turn, is meant to stand out by processing particularly long context inputs and thus enabling long conversations. According to Alibaba, it can process up to a million tokens. Tokens are the smallest units of data that an AI model uses to process input and produce output.
For comparison: the standard version of ChatGPT 4o processes around 8,000 tokens of context. A larger token window can improve the performance of an AI model, but the practical use of very large token windows is controversial. Alibaba, meanwhile, calls the Qwen 2.5-VL-72B-Instruct version its “flagship”.
It is said to be the most competitive against the large language models from Google, Meta and OpenAI. Qwen 2.5-VL-72B-Instruct covers reading documents and diagrams, answering visual questions, mathematics, video understanding and visual outputs.
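The idea of tokens and a context window described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: toy_tokenize splits on words and punctuation, whereas real models such as Qwen use learned subword tokenizers, and fit_to_context is a hypothetical helper that simply drops the oldest tokens once the limit is exceeded.

```python
import re


def toy_tokenize(text):
    """Split text into word and punctuation tokens (stand-in for a real tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text)


def fit_to_context(tokens, context_limit):
    """Keep only the most recent tokens that fit into the context window."""
    if context_limit <= 0:
        return []
    return tokens[-context_limit:]


tokens = toy_tokenize("Qwen 2.5 can read long documents, Alibaba says.")
window = fit_to_context(tokens, 5)  # assumed limit of 5 tokens for illustration
print(len(tokens), window)
```

With a limit of a million tokens instead of five, far more of a long document or conversation stays visible to the model at once.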
Alibaba’s “Qwen” AI and DeepSeek in comparison
Alibaba has made its “flagship model” Qwen 2.5-VL-72B-Instruct accessible through its in-house chat platform. The models of the Qwen 2.5-VL series are available on the open source platform Hugging Face and in Alibaba’s open source community ModelScope. So far there is no smartphone app.
When training its AI models, Alibaba in turn relies on different approaches than some competitors. The company uses synthetic data, for example, which, in contrast to real data, is generated artificially, for instance by computer simulations. The main advantage is a certain cost efficiency.
Like DeepSeek, Qwen is also expected to put pressure on the established US providers, even if the hype is considerably smaller than that around its domestic competitor. The reason: DeepSeek caused turmoil not only because of its performance, but also because of its cost and energy efficiency.
The company’s subscription plans are, for example, ten to twenty times cheaper than those of the large US language models. For comparable performance, DeepSeek is also said to need just five percent of the energy that ChatGPT requires. However, some of these assessments of cost and performance efficiency rest on misunderstandings.
Alibaba, meanwhile, is likely causing far less of a stir with Qwen because the company has so far not provided any precise information on costs, development or energy efficiency. The AI model also supports considerably fewer languages. One of the main reasons, however, is probably that, unlike DeepSeek, Qwen is not yet available as an app.
From a tech industry perspective, Alibaba’s Qwen 2.5 is an advanced AI language model designed to understand and generate human language. It is built on a deep learning architecture called the transformer, which allows it to process large amounts of text data and generate human-like responses.
Qwen 2.5 combines natural language processing techniques such as token embeddings and attention mechanisms to capture the context of a given text and generate coherent, contextually relevant responses. Unlike older recurrent neural networks, transformer models process all tokens of an input in parallel.
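As a rough illustration of the attention mechanism mentioned above, the following Python sketch implements scaled dot-product attention for a single query vector. It is a textbook-style toy, not Qwen’s implementation: the vectors are made-up values, and real models apply this with learned projections across many heads and layers.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Similarity between the query and every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Illustrative toy vectors (assumed values, one key/value pair per token).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
print(weights, output)
```

Tokens whose keys resemble the query receive larger weights, which is how the model decides which parts of the context matter for the current position.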
Overall, Alibaba’s Qwen 2.5 is a current-generation language model that illustrates the progress of AI technology and its potential applications in various industries, including customer service and content generation.