Deploying Generative AI Models Efficiently

December 2025 | Open Source For You

Enterprise deployment of generative AI hinges on optimising both hardware and software. It is important to choose a right-sized language model and fine-tune it to your needs.

OpenAI's launch of GPT-3 in mid-2020 showcased a model with 175 billion parameters, a monumental breakthrough at that time. By the time GPT-4 arrived, parameter counts were reported to have surged into the trillions (OpenAI has not published an official figure), enabling sophisticated chat assistants, code generation, and creative applications, yet imposing unprecedented strain on compute infrastructure.

Organisations are leveraging open source genAI models such as Llama to streamline operations, improve customer interactions, and empower developers. Choosing a large language model (LLM) optimised for efficiency enables significant savings in inference hardware costs. Let's see how this can be done.

Choosing the right-sized LLM

Since the public launch of ChatGPT, generative AI adoption has skyrocketed, capturing the imagination of consumers and enterprises alike. Its unprecedented accessibility has empowered not just developers but also nontechnical users to embed AI into their everyday workflows.

Central to this evolution is a fundamental measure of progress: LLM parameters, the trainable weights that are adjusted during training and that determine model capability. In 2017, early generative AI models based on the Transformer architecture featured roughly 65 million such parameters.

This explosive growth, from tens of millions of parameters to trillions, has reinforced the belief that ‘bigger is better’, positioning trillion-parameter models as the benchmark for AI success. However, these massive models are typically optimised for broad, consumer-oriented applications rather than specialised needs.

For enterprises that demand domain-specific accuracy and efficiency, blindly pursuing larger parameter counts can be both costly and counterproductive. The real question isn’t how big the model is, but whether its scale is right-sized for the task at hand.
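To make the cost of scale concrete, the sketch below estimates how much memory the model weights alone would occupy at a few common numeric precisions. It is a back-of-the-envelope illustration, not a sizing tool: the function name `weight_memory_gb`, the precision table, and the 7-billion-parameter entry are assumptions added for illustration, and the estimate ignores the KV cache, activations, and runtime overhead, all of which add to real inference memory.

```python
# Rough estimate of the memory needed to hold model weights only.
# Assumption: each parameter is stored at a single uniform precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return approximate weight memory in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # 65M and 175B are the parameter counts mentioned in the article;
    # the 7B entry is a hypothetical mid-sized model for comparison.
    models = {"65M": 65e6, "7B (hypothetical)": 7e9, "175B": 175e9}
    for name, n in models.items():
        row = ", ".join(
            f"{p}: {weight_memory_gb(n, p):,.2f} GB" for p in BYTES_PER_PARAM
        )
        print(f"{name:>18} -> {row}")
```

Under these assumptions, a 175-billion-parameter model needs about 350 GB for its weights at 16-bit precision, while a 7-billion-parameter model needs about 14 GB, which is the kind of gap that drives inference hardware costs.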

Analysing LLMs through a technical lens
