DeepSeek is amazing. And it has a pro-Chinese bias.

Waleed Kadous
4 min read · Jan 28, 2025

TL;DR: DeepSeek is a wonderful step forward for open approaches to AI. It also has a fairly serious pro-Chinese bias. I compared the answers to three sensitive questions (about Gaza, Xinjiang and TikTok), and on all three the Chinese bias is apparent, while existing tools (ChatGPT, Gemini) are far more balanced. In two instances, DeepSeek used the pronoun “we” to describe the Chinese position, which suggests that much of its training data associates “we” with the Chinese.

There is no doubt that DeepSeek R1 and V3 are amazing technical achievements. There are so many clever ideas packed into the project including:

  • Training the model for just under $6 million, using techniques like multi-token prediction
  • Using reinforcement learning (without human feedback) to improve performance considerably
  • Running inference efficiently: a technique called mixture-of-experts uses only 37 billion parameters to predict each token, even though the model has 671 billion parameters in total
  • Working around the US bans on powerful GPUs by making the watered-down H800s perform on par with H100s through extensive low-level optimizations
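
To make the mixture-of-experts point concrete, here is a minimal sketch of generic top-k expert routing. It is a toy illustration, not DeepSeek's actual implementation: the experts, the router scores, and the function names `top_k_route` and `moe_forward` are all made up for this example. The idea is simply that a router scores every expert, only the best few run, and their outputs are mixed. By the figures above, DeepSeek activates roughly 5.5% of its parameters per token; the toy below activates 2 of 8 experts.

```python
import math

def top_k_route(router_scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    chosen = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)[:k]
    peak = max(router_scores[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]

def moe_forward(x, experts, router_scores, k):
    """Run only the chosen experts and mix their outputs by routing weight."""
    chosen, weights = top_k_route(router_scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen, weights

# Hypothetical toy experts: expert i multiplies its input by (i + 1).
experts = [lambda x, scale=i + 1: scale * x for i in range(8)]
# Hypothetical router scores for a single token (a real router learns these).
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, chosen, weights = moe_forward(10.0, experts, router_scores, k=2)
active_fraction = len(chosen) / len(experts)
print(chosen, [round(w, 3) for w in weights], round(output, 2), active_fraction)
```

The six experts that were not chosen are never evaluated, which is where the inference savings come from.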

It is also extremely heartening to see the weights were released openly. This will significantly reduce the cost of using LLMs. If you look globally, it will allow the benefits of AI to be surfaced more immediately in new and interesting applications. Not to mention that existing players will adopt the same techniques.

However, all LLMs have biases. When we use LLMs, we are buying into the biases and culture of the society and company that produced them.

In the case of DeepSeek, the biases — based on a cursory examination — seem pretty strong.

For example, I tested DeepSeek on a purely factual question: Have international organizations found evidence of genocide in Gaza? Regardless of whether people disagree or agree with the international organizations, that is a matter of fact: they have.

Note that I did not even ask it what the Chinese government’s position was at all.

Compare this with the results from ChatGPT which are much more balanced:

The first paragraph of ChatGPT’s answer

Gemini’s answer was even less ambiguous and arguably more pro-Palestinian:

Let’s take another example, again of a factual question: Have international organizations found evidence of crimes against humanity in Xinjiang?

DeepSeek’s answer on Xinjiang

Again, note that I did not ask it about the Chinese government’s position on Xinjiang; I asked it a purely factual question. Notice also the words “We welcome the world to view …”, as if it were mirroring Chinese training data.

Again, looking at ChatGPT and Gemini, we get the comparatively factual answers we are looking for:

First part of ChatGPT’s response
Gemini’s answer on Xinjiang

I posed one final question: Is the US ban on TikTok justified? Ironically, I came to this question assuming it would show that “Western” LLMs were biased against China, but it turns out, once again, that DeepSeek is intent on making the Chinese position clear, while Gemini and ChatGPT present a more balanced view.

DeepSeek’s view on TikTok

The answers from Gemini and ChatGPT are too long to paste here, but in summary, they presented the arguments for and against clearly and concisely.

Especially curious is the use of the pronoun “we.” “We” is an incredibly common word. For an LLM to start using “we” in this way strongly suggests that the association between “we” and “Chinese” in the training data is very strong. In other words, it was likely trained on a lot of pro-Chinese material.
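
If you want to screen model answers for this kind of first-person framing yourself, a rough sketch might look like the following. It is only a crude heuristic: the function names are made up for this example, the first sample is the fragment quoted above, and the second sample is a placeholder sentence I wrote, not real model output. A hit only tells you which answers deserve a closer read, since “we” can also appear inside a quotation.

```python
import re

# Lowercase-only "us" so that "US" (the country) is not counted as a pronoun.
FIRST_PERSON_PLURAL = re.compile(r"\b(?:[Ww]e|[Oo]urs?|us)\b")

def first_person_hits(text):
    """Return every first-person-plural pronoun found in the text."""
    return [match.group(0).lower() for match in FIRST_PERSON_PLURAL.finditer(text)]

def count_by_model(responses):
    """Map each model name to the number of first-person-plural pronouns."""
    return {model: len(first_person_hits(text)) for model, text in responses.items()}

responses = {
    # Fragment quoted in this article from DeepSeek's Xinjiang answer.
    "deepseek": "We welcome the world to view ...",
    # Placeholder text written for this example, NOT a real model response.
    "placeholder": "Several organizations discuss the US position in their reports.",
}

counts = count_by_model(responses)
print(counts)
```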

This doesn’t eliminate the utility of DeepSeek, nor does it mean we should stop using it. My point is that even a cursory examination reveals biases in DeepSeek, and that we as a community need to invest time in understanding (and perhaps even correcting) those biases before we unintentionally introduce them into our applications.

Written by Waleed Kadous

Co-founder of CVKey, ex Head Engineer Office of the CTO @ Uber, ex Principal Engineer @ Google.