Open Source Solutions for Building Specialised Language Models: An Overview
Open Source For You | April 2025
Specialised language models score over large language models in various ways. What's more, there is a range of open source solutions you can choose from to build a reliable model.
A large language model (LLM) typically has billions of parameters, whereas a small language model has significantly fewer, uses fewer resources, and is optimised for a specific domain. A specialised language model (SLM) can be small or large in size, but focuses on a specific field such as law or healthcare.
Creating a specialised language model using multiple LLM sources
The process of developing an SLM involves harnessing the strengths of multiple LLMs to filter data effectively. This requires several steps, which are outlined below.
Data collection: The first step is to gather a diverse set of data from various sources, including domain-specific databases, scientific journals, articles, and generic data repositories. The goal is to assemble a comprehensive dataset that encompasses both specialised and general knowledge.
Data preprocessing: Preprocessing cleans and organises the collected data by removing duplicates, irrelevant information, and noise. Techniques such as tokenisation, stemming, and lemmatisation are then used to standardise the text.
Data filtering: To create an effective SLM, it is crucial to separate domain-specific data from generic information. This can be achieved by leveraging multiple LLMs, each trained on a different dataset. Each model classifies a piece of text by its relevance and context, and the combined verdicts decide whether that text goes into the domain-specific set or the generic set.
Model training: Once the data is filtered, the next step is to train the SLM. This involves fine-tuning the selected LLMs on the domain-specific dataset. Techniques such as transfer learning and supervised learning are employed to enhance the model’s performance.
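The preprocessing and filtering steps above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the helper names (`normalise`, `tokenise`, `deduplicate`, `keyword_judge`, `filter_by_vote`) are hypothetical, the "judges" are simple keyword matchers standing in for real LLM classifiers, and majority voting is just one plausible way to combine several models' verdicts. The article does not prescribe a specific method. Stemming, lemmatisation, and the fine-tuning step itself are left out.

```python
import re
from typing import Callable

# A "judge" stands in for one LLM classifier: it maps a text to a label.
Judge = Callable[[str], str]


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def tokenise(text: str) -> list[str]:
    """Very simple tokeniser: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def deduplicate(docs: list[str]) -> list[str]:
    """Keep the first occurrence of each document, compared after normalising."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        key = normalise(doc)
        if key and key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def keyword_judge(keywords: set[str]) -> Judge:
    """Hypothetical stand-in for an LLM: votes 'domain' on any keyword hit."""
    def judge(text: str) -> str:
        return "domain" if keywords & set(tokenise(text)) else "generic"
    return judge


def filter_by_vote(
    docs: list[str], judges: list[Judge], min_votes: int
) -> tuple[list[str], list[str]]:
    """Split docs into (domain, generic) using the number of 'domain' votes."""
    domain: list[str] = []
    generic: list[str] = []
    for doc in docs:
        votes = sum(1 for judge in judges if judge(doc) == "domain")
        (domain if votes >= min_votes else generic).append(doc)
    return domain, generic


# Illustrative data only: a tiny legal-domain example.
raw_docs = [
    "The tenant breached the lease contract.",
    "the tenant  breached the lease contract.",  # duplicate once normalised
    "The court granted the plaintiff an injunction.",
    "Our team won the football match yesterday.",
]
judges = [
    keyword_judge({"contract", "lease", "court"}),
    keyword_judge({"plaintiff", "injunction", "tenant"}),
    keyword_judge({"court", "breached", "statute"}),
]

unique_docs = deduplicate(raw_docs)
domain_docs, generic_docs = filter_by_vote(unique_docs, judges, min_votes=2)
print(domain_docs)
print(generic_docs)
```

In a real pipeline each judge would wrap a call to a different model, and the documents that land in `domain_docs` would form the dataset used for fine-tuning. Requiring agreement from at least two judges is a design choice that trades some recall for fewer generic documents slipping into the domain set.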
This story is from the April 2025 issue of Open Source For You.
