Cold-Pressed AI Juice - Is That Bottle Here? Is It Worth It?
DataQuest | October 2024
Compressing AI models has been both an adventure and a formidable next inflection point on the many curves of AI innovation. Can we look for options that go beyond earlier approaches like pruning and SLMs?
What happens when you compress AI's heavier parts with a new centrifugal blade? You save a great deal of storage, memory, GPU stacks and compute gas-tanks, of course, besides boosting speed, cutting inference latency and extending compatibility to small devices and edge networks. But how does this 'squeeze' affect accuracy, error compensation and ease of application? And what about all the other grinders attacking the same problem, like SLMs, GPTQ and QuIP? Recently, Yandex Research, IST Austria, Neural Magic and KAUST announced that they have developed what they call 'two innovative compression methods for large language models'. They also claimed that, when combined, these methods reduce model size by up to 8 times while preserving 95 per cent of response quality. Compressed models like Llama 2 13B can run on 1 GPU instead of 4, they added. So how does it all work, and does it address the issues we mentioned earlier?
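To put the 'up to 8 times' figure in perspective, here is a rough back-of-the-envelope sketch of what it could mean for the weights of Llama 2 13B. It rests on assumptions that are ours, not the researchers': a 16-bit baseline, a rounded count of 13 billion parameters, and weights only (activations, caches and other runtime memory are ignored). The helper name `weights_size_gb` is purely illustrative.

```python
def weights_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 13e9       # Llama 2 13B, parameter count rounded
BASELINE_BITS = 16      # assumption: 16-bit weights before compression
COMPRESSION_RATIO = 8   # the "up to 8 times" figure quoted in the announcement

baseline_gb = weights_size_gb(NUM_PARAMS, BASELINE_BITS)
compressed_gb = baseline_gb / COMPRESSION_RATIO
effective_bits = BASELINE_BITS / COMPRESSION_RATIO

print(f"baseline weights:   {baseline_gb:.2f} GB")
print(f"compressed weights: {compressed_gb:.2f} GB")
print(f"effective bits per weight: {effective_bits:.0f}")
```

Under these assumptions the weights shrink from about 26 GB to about 3.25 GB, roughly 2 bits per weight. That makes the move from several GPUs to one look plausible on paper, though real memory use depends on much more than the weights.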
We compress it all in this interview with Artem Babenko, Head of Yandex Research. He oversees scientific research at Yandex and the company's engagement with the international scientific community. He also supervises a team of approximately 30 researchers engaged in various areas of computer science. According to Artem, his main achievements are his scientific contributions in three key areas: neural networks for image search, high-dimensional vector compression, and fast search across massive databases containing billions of records. Who better than he to explain the ambition, ingredients and final taste of compression? Let's press those buttons.
Can you explain additive quantization and PV-tuning in simpler terms, for a layman? Is it similar to model pruning?
This story is from the October 2024 issue of DataQuest.
