Deploying Generative AI Models Efficiently

December 2025

Open Source For You

Enterprise deployment of generative AI hinges on optimising both hardware and software. It's important to choose a rightsized language model and fine-tune it to your needs.

OpenAI's launch of GPT-3 in mid-2020 showcased a model with 175 billion parameters, a monumental breakthrough at the time. By the time GPT-4 arrived, parameter counts had reportedly surged into the trillions, enabling sophisticated chat assistants, code generation, and creative applications, yet imposing unprecedented strain on compute infrastructure.

Organisations are leveraging open source genAI models such as Llama to streamline operations, improve customer interactions, and empower developers. Choosing a large language model (LLM) optimised for efficiency enables significant savings in inference hardware costs. Let's see how this can be done.

Choosing the rightsized LLM

Since the public launch of ChatGPT, generative AI adoption has skyrocketed, capturing the imagination of consumers and enterprises alike. Its unprecedented accessibility has empowered not just developers but also nontechnical users to embed AI into their everyday workflows.

Central to this evolution is a fundamental measure of progress: LLM parameters, the trainable weights adjusted during training that determine model capability. In 2017, early generative AI models based on the Transformer architecture featured roughly 65 million such parameters, a tiny fraction of the 175 billion seen by mid-2020.

This explosive growth has reinforced the belief that ‘bigger is better’, positioning trillion-parameter models as the benchmark for AI success. However, these massive models are typically optimised for broad, consumer-oriented applications rather than specialised needs.

For enterprises that demand domain-specific accuracy and efficiency, blindly pursuing larger parameter counts can be both costly and counterproductive. The real question isn’t how big the model is, but whether its scale is rightsized for the task at hand.
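To make the cost of scale concrete, here is a minimal back-of-the-envelope sketch in Python. It estimates only the memory needed to hold a model's weights, assuming a fixed number of bytes per parameter at each numeric precision. The function name `weight_memory_gb`, the precision table, and the 8-billion-parameter mid-sized model are illustrative assumptions, not figures from this article. The 65 million and 175 billion counts are the ones quoted above. Real deployments also need memory for activations, caches, and runtime overhead, so treat these numbers as a lower bound.

```python
# Rough lower bound on inference memory: weights only.
# Bytes per parameter at common precisions (illustrative assumption).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts: the first and last are quoted in the article;
# the 8B entry is a hypothetical mid-sized model for comparison.
models = {
    "65M-parameter model (2017)": 65e6,
    "hypothetical 8B-parameter model": 8e9,
    "175B-parameter model (2020)": 175e9,
}

for name, params in models.items():
    cells = ", ".join(
        f"{precision}: {weight_memory_gb(params, precision):,.2f} GB"
        for precision in BYTES_PER_PARAM
    )
    print(f"{name} -> {cells}")
```

Under these assumptions, the 175-billion-parameter model needs about 350 GB for 16-bit weights alone, which is more than any single accelerator typically offers, while the hypothetical 8-billion-parameter model needs about 16 GB. That gap is the practical argument for rightsizing.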

Analysing LLMs through a technical lens

This story is from the December 2025 edition of Open Source For You.
