China’s DeepSeek is a new player in the artificial intelligence domain, which has so far been dominated by US-based companies such as OpenAI, Microsoft, Meta, and Google. Its latest model, R1, is touted to match its peers in reasoning capabilities while demanding far less hardware. Most importantly, it is offered as an open-source model.
DeepSeek is not the first to go open source with AI models, but it provides multiple options, including distilled models that are less resource-intensive. This makes it a viable option for those looking to run AI models on a personal computer. Here is all you need to know about open-source AI models and how to run them on PCs.
What are open-source LLMs
Open-source large language models (LLMs) are publicly available for use, modification, and distribution. Unlike proprietary models, they offer greater accessibility and adaptability. Their open nature allows for customisation to meet specific requirements.
Why run LLMs locally
Running large language models (LLMs) locally offers several advantages, including cost savings, data privacy, and faster response times.
Cloud-based AI services rely on an internet connection, which can introduce latency, and they raise data privacy concerns, as sensitive information is processed on external servers. Cloud services also operate on a pay-per-use or subscription model, which can become expensive over time, especially for high-frequency users.
Businesses and individuals running open-source models locally do not have to pay providers per request. Local models also keep data on the device and can deliver faster inference than cloud-based services. Local processing additionally allows full customisation of the model, whereas cloud-based models can typically be fine-tuned only to a limited extent.
Downsides of running LLMs locally
The high hardware requirements of full-size models make them impractical for most users, as powerful GPUs, large amounts of RAM, and ample storage are needed to run these models efficiently.
Local LLMs also consume significant power, generate heat, and lack the scalability of cloud-based solutions, which dynamically allocate resources based on demand. Additionally, models require frequent updates and maintenance, placing the burden on users to keep them optimised and secure.
Moreover, local models have limitations in integration and accessibility. Unlike cloud-hosted models that connect with APIs and web services, locally run LLMs require additional configuration for online functionality. They are also confined to a single device unless users set up remote access.
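To illustrate the remote-access point: with Ollama (one of the tools covered below), the bundled server normally listens only on the local machine, and a documented environment variable changes that. The bind address and port below are the commonly cited defaults, so treat this as a sketch to verify against your own install, and keep in mind the security implications of exposing the port on an untrusted network.

```
# Sketch: letting other devices on your network reach a local Ollama server.
# OLLAMA_HOST is a documented Ollama environment variable; 11434 is the
# commonly cited default port -- verify both for your version.
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```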
Running open-source AI models locally
Ollama is among the most widely used command-line tools for running LLMs on local machines with Windows, Linux, and macOS. Other alternatives, such as LM Studio, are also available and provide a more user-friendly interface for running LLMs locally.
Once a suitable platform is installed, users need to download an open-source AI model. For this guide, I will go with DeepSeek's models; several distilled versions are available, and users can choose one based on their computing resources.
Using LM Studio
- Visit the official LM Studio website and download the installation file corresponding to your operating system. The software is free, but may require you to sign up to download the application.
- Open LM Studio after installation.
- By default, the application may prompt you to download the Llama model; skip this step if you want to go with any other open-source AI model such as DeepSeek.
- Click on the magnifying glass icon to open the Discover tab, which allows users to browse and download models directly from the Hugging Face repository.
- Use the search bar in the Discover tab to find “DeepSeek R1.”
- Review the available options and select a quantised model, if available. Quantised models store weights at lower numerical precision, trading a little accuracy for a much smaller memory footprint, which helps on less powerful systems.
- Click the download button to retrieve the model.
- Once the download is complete, return to the main interface and click on the Chat button to start a session with the model.
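LM Studio also ships with a local server mode that exposes an OpenAI-compatible API, which is handy if you want to reach the model from scripts rather than the chat window. The sketch below assumes the server has been enabled in the app and is on its usual default port, 1234; the model identifier is a placeholder, so copy the exact name LM Studio displays for your downloaded model.

```
# Hedged sketch: querying LM Studio's local OpenAI-compatible server.
# Port 1234 is the usual default; the model name below is a placeholder.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-r1-distill-qwen-7b",
        "messages": [{"role": "user", "content": "Summarise what a distilled model is."}]
      }'
```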
Using Ollama
- Visit the official Ollama website and download the installation file for your operating system.
- Open the terminal or command prompt.
- To download the default DeepSeek R1 model, run: ollama pull deepseek-r1
- To download a distilled variant of the model (e.g., 1.5B, 7B, 14B) that requires fewer computing resources, specify the desired version. For example: ollama pull deepseek-r1:1.5b
- In a new terminal window or tab, start the Ollama server (if it is not already running in the background) by running: ollama serve
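Once the server is up, you can chat with the model from the terminal, or query the REST API the server exposes. The commands below assume the 1.5B distilled variant pulled earlier and Ollama's usual default port of 11434; adjust the model tag to whichever version you downloaded.

```
# Chat interactively with the downloaded model (type /bye to exit):
ollama run deepseek-r1:1.5b

# Or send a one-off prompt to the local REST API:
curl http://localhost:11434/api/generate \
  -d '{"model": "deepseek-r1:1.5b", "prompt": "Why is the sky blue?", "stream": false}'
```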
Hardware requirements
The exact hardware requirements for running these open-source AI models are not officially listed by any of the providers, but a configuration along the following lines should handle the full model (a rough way to estimate memory needs is sketched after the list):
- CPU: AMD EPYC 9115 (alternatives: AMD EPYC 9015, AMD EPYC 9354, Intel Xeon Platinum 8358P)
- RAM: 768GB DDR5 RDIMM (24x32GB) (for the full DeepSeek R1 model)
- Motherboard: Dual-socket server motherboard for 24 DDR5 RAM channels
- Storage: 1TB NVMe SSD
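As a sanity check on the RAM figure above, a common rule of thumb (an approximation, not vendor guidance) is that weight memory roughly equals the parameter count multiplied by the bytes stored per weight, plus overhead for activations and context. The full DeepSeek R1 has about 671 billion parameters, so at 8-bit precision the weights alone come to roughly 671GB, which is why something like 768GB of RAM is plausible; a 7B distilled model at 4-bit quantisation, by contrast, needs only a few gigabytes.

```
# Back-of-the-envelope sizing (a rule of thumb, not an official figure):
# weight memory in GB ~= parameters (billions) x bits per weight / 8
params_b=671        # approximate parameter count of the full DeepSeek R1
bits_per_weight=8   # 8-bit quantisation; use 4 for 4-bit
echo "~$((params_b * bits_per_weight / 8))GB of weights, plus activation/context overhead"
```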