
Cold-Pressed AI Juice - Is That Bottle Here? Is It Worth It?

DataQuest

October 2024

Compressing AI models has been both an adventure and a formidable next-inflection-point in the many curves of AI innovation. Can we look for options better than, and beyond, erstwhile approaches like pruning and SLMs?

- Pratima H


What happens when you compress AI's heavier parts with a new centrifugal blade? You save plenty of storage, memory, GPU stacks and compute gas-tanks, of course, besides boosting speed, cutting inference latency and expanding compatibility with small devices and edge networks. But how does this 'squeeze' affect accuracy, error compensation and ease of application? And what about all the other grinders attacking the same problem, like SLMs, GPTQ and QuIP? Recently, Yandex Research, IST Austria, NeuralMagic and KAUST announced that they have developed what they call 'two innovative compression methods for large language models'. They also claimed that, when combined, these methods reduce model size by up to 8 times while preserving 95 per cent of response quality. Compressed models like Llama 2 13B can run on 1 GPU instead of 4, they added. So how does it all work, and does it address the issues we mentioned earlier?
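To put the 'up to 8 times' figure in perspective, here is a rough back-of-the-envelope sketch of what such a reduction could mean for the weights of a 13-billion-parameter model. The 16-bit starting precision is our assumption, not a figure from the announcement, and the function name is purely illustrative. The estimate counts weights only, and ignores activations, caches and other runtime overheads.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params = 13e9        # parameter count of a 13B model such as Llama 2 13B
baseline_bits = 16   # ASSUMPTION: weights stored at 16 bits each before compression
reduction = 8        # the "up to 8 times" reduction quoted in the announcement

# An 8x reduction from 16 bits corresponds to about 2 bits per weight on average.
compressed_bits = baseline_bits / reduction

baseline_gb = weight_footprint_gb(params, baseline_bits)
compressed_gb = weight_footprint_gb(params, compressed_bits)

print(f"Baseline weights:   {baseline_gb:.2f} GB at {baseline_bits} bits per weight")
print(f"Compressed weights: {compressed_gb:.2f} GB at {compressed_bits:g} bits per weight")
```

Under these assumptions the weights shrink from roughly 26 GB to a little over 3 GB. How many GPUs that actually frees up depends on the memory of each card and on the runtime overheads left out here.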

We compress it all in this interview with Artem Babenko, Head of Yandex Research. He oversees scientific research at Yandex and the company's engagement with the international scientific community. He also supervises a team of approximately 30 researchers engaged in various areas of computer science. According to Artem, his main achievements are his scientific contributions in three key areas: neural networks for image search, high-dimensional vector compression, and fast search across massive databases containing billions of records. Who better than he to explain the ambition, ingredients and final taste of compression? Let's press those buttons.

Can you explain additive quantization and PV-tuning in simpler terms, for a layman? Is it similar to model pruning?

This story is from the October 2024 edition of DataQuest.

