      «It feels like a major development for AI research.»
      5.2.2025 | University of Applied Sciences and Arts Northwestern Switzerland, School of Computer Science


The launch of the Chinese AI chatbot Deepseek two weeks ago made headlines and shook the financial markets and the technology world. Reza Kakooee, AI researcher at the FHNW School of Computer Science and expert in reinforcement learning for AI models, explains the reasons for Deepseek’s success and what it means for the future of AI research. He calls for increased efforts by actors in Switzerland to catch up in AI.

      Reza Kakooee

      Reza Kakooee, Research Associate at the Institute for Data Science, FHNW School of Computer Science, Image: FHNW

Reza, Deepseek has created a lot of uproar. Major news outlets reported on it, Deepseek’s website went down, and stocks of companies that profit from the AI boom, such as chip maker Nvidia, plunged. What happened?

Deepseek is a chatbot launched by a Chinese company. It is based on a large language model that the same company had already released in December 2024. Two weeks ago, they released their new reasoning model to the public along with a scientific paper in which the team explains their methodology.

Is the great excitement around Deepseek justified, or is it short-lived hype?

It feels like a major development for AI research. There are several aspects of Deepseek that surprised the AI community, myself included. While DeepSeek's approach aligns with current AI research trends, they employed different techniques to train the model in a less resource-intensive manner, achieving performance comparable to that of its Western competitors.

What does Deepseek do differently from others?

What’s unique about it is that it uses established techniques from AI research but combines them in a more efficient way.

According to Deepseek’s paper, they managed to train their model with far fewer resources than their competitors. DeepSeek’s approach introduces a shift in the application of scaling laws, demonstrating that with new algorithms and optimized training, better performance can be achieved with less compute. Deepseek claims to have used only about 2,000 AI chips to train its reasoning model, although some claim they have around 50,000. Even that would still be much less expensive than the costs OpenAI incurred for its GPT-based models.

Deepseek offers its services at a significantly lower cost than its competitors. While OpenAI charges $60 per million output tokens for its o1 model, Deepseek's reasoning model costs only about $2, which is 30 times less. However, this is the current pricing, and OpenAI may lower its prices for future models. In general, the cost of AI, like any other technology, tends to decrease over time. OpenAI trained its earlier models at a higher cost, but maybe that is the price of being first to innovate.

      «AI models will be critical for future developments in business and daily life, and in Switzerland, we need to have our own AI models that are well-aligned with our cultural values.»
      Reza Kakooee

      How did Deepseek achieve this high efficiency?

The big AI models by OpenAI or Anthropic are usually trained in three stages. First, the model is pre-trained on a large dataset and then fine-tuned with good-quality data. In the next stage, ‘reinforcement learning’, humans rank the model’s responses so the model learns what good answers look like. But human labour is costly compared to compute, and this stage requires additional models, which adds to the training costs.

Deepseek had a surprisingly simple solution to skip this last step for their reasoning model. Instead of generating answers to questions like ‘What is it like to live on the moon?’ that must be assessed by humans, Deepseek had the model generate answers to questions that can be checked programmatically. For example, the AI had to write a certain piece of code. The validity of this code can be tested automatically, and the answers can be ranked according to whether the code is correct.
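
To make the idea of a programmatically checkable reward concrete, here is a minimal Python sketch. It is not DeepSeek’s published training code; the task (implement a function square), the function name rule_based_reward and the test cases are purely illustrative assumptions.

    # Minimal sketch of a rule-based reward: score a model's answer by running it
    # against automatic checks instead of asking humans to rank it.
    def rule_based_reward(generated_code: str, tests: list[tuple[int, int]]) -> float:
        """Return 1.0 if the generated function passes every test, else 0.0."""
        namespace: dict = {}
        try:
            exec(generated_code, namespace)      # run the model's answer
            func = namespace["square"]           # the answer must define square(x)
            return 1.0 if all(func(x) == y for x, y in tests) else 0.0
        except Exception:
            return 0.0                           # broken code earns no reward

    # A hypothetical model output and its automatically computed reward:
    answer = "def square(x):\n    return x * x"
    print(rule_based_reward(answer, [(2, 4), (3, 9), (-1, 1)]))  # -> 1.0

Because the reward is computed by a program rather than by human annotators, it can be evaluated millions of times at negligible cost, which is what makes this reinforcement-learning setup so much cheaper.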

      What does this mean for future developments in AI?

      Another surprise to me was that Deepseek released their model as open weights with a paper detailing their methods. So, everybody can see how the model was trained, how it works, and everyone can download it for their own use.
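
Since the weights are openly available, the model can in principle be loaded with standard open-source tooling. A minimal sketch using the Hugging Face transformers library is shown below; the exact checkpoint name and the hardware requirements are assumptions for illustration, not details from the interview.

    # Minimal sketch: loading one of the openly released DeepSeek checkpoints.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"   # assumed checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "What is it like to live on the moon?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))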

For these two reasons, Deepseek’s efficiency and easy availability, this could prove to be a pivotal moment in AI. We will need fewer resources to run better AI models, which will likely lead to wider adoption of AI.

It will, however, not mean that fewer resources will be spent on AI. The so-called Jevons paradox states that increasing efficiency can lead to even higher use of a resource, mainly because applications of AI that were until now too expensive become possible.

      Which new areas of AI application will open up now?

For smaller teams, it becomes interesting to adapt existing large language models to their own use by fine-tuning them on their own data. For example, we at the FHNW School of Computer Science can help companies build custom AI models by fine-tuning them on the company’s data so that they are more useful for their use cases and the Swiss market.
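
A minimal sketch of such a fine-tuning run, assuming the Hugging Face transformers, peft and datasets libraries, an assumed open checkpoint name and a placeholder dataset file ("company_data.jsonl" with one {"text": ...} record per line), could look like this; a real project would add data preparation, evaluation and hyper-parameter tuning.

    # Minimal sketch of parameter-efficient fine-tuning (LoRA) on in-house data.
    # Checkpoint name, file name and hyper-parameters are placeholder assumptions.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed open checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # LoRA trains only a small set of adapter weights, which keeps fine-tuning cheap.
    model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

    # One {"text": "..."} record per line; replace with the company's own data.
    dataset = load_dataset("json", data_files="company_data.jsonl")["train"]
    dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                          remove_columns=dataset.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model("finetuned-model")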

      How should Swiss companies and decision makers react to Deepseek?

Deepseek has shown that it is possible to catch up with the big US companies using fewer resources. But it raises a major question for Switzerland and Europe: why couldn’t a Swiss or European company come up with such an innovation, although they apparently have more resources? AI models will be critical for future developments in business and daily life, and in Switzerland, we need to have our own AI models that are well-aligned with our cultural values. This requires a dedicated team of AI talent working around the clock for at least the next 2–3 years to close the gap with the competition.


      Karin Weinmann

      Media Relations at the FHNW School of Computer Science

      Telephone

      +41 56 202 90 10

      E-mail

      karin.weinmann@fhnw.ch

      Address

FHNW University of Applied Sciences and Arts Northwestern Switzerland
School of Engineering and Environment
Klosterzelgstrasse 2
CH-5210 Windisch
