Cold-Pressed AI Juice - Is That Bottle Here? Is It Worth It?
DataQuest
|October 2024
Compressing AI models has been both an adventure and a formidable next-inflection-point in the many curves of AI innovation. Can we look for options better than, and beyond, erstwhile approaches like pruning and SLMs?
 
What happens when you compress AI's heavier parts with a new centrifugal blade? You save so much storage, memory, GPU stacks and compute gas-tanks, of course, besides boosting speed, cutting inference latency and expanding compatibility with small devices and edge networks. But how does this 'squeeze' affect accuracy, error compensation and ease of application? And what about all the other grinders that are attacking the same problem, like SLMs, GPTQ and QuIP? Recently, Yandex Research, IST Austria, NeuralMagic, and KAUST announced that they have developed what they call 'two innovative compression methods for large language models'. It was also claimed that, when combined, these methods allow for a reduction in model size by up to 8 times while preserving 95 per cent of response quality. Compressed models like Llama 2 13B can run on 1 GPU instead of 4, they added. So how does it all work, and does it address the issues we mentioned earlier?
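For a rough sense of what an 'up to 8 times' reduction could mean for a 13-billion-parameter model, here is a back-of-the-envelope sketch. It assumes the uncompressed weights are stored at 16 bits each (the announcement quoted above does not state the baseline precision), counts weights only, and ignores activations and other runtime memory. The function name `model_size_gb` is hypothetical, used purely for illustration.

```python
def model_size_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: 13 billion parameters stored at 16 bits each before compression.
params_13b = 13e9
baseline_bits = 16

fp16_gb = model_size_gb(params_13b, baseline_bits)

# The claimed "up to 8 times" reduction, applied to the weights only.
reduction_factor = 8
compressed_gb = fp16_gb / reduction_factor

# Under the 16-bit assumption, this is the average bit budget per parameter.
equivalent_bits = baseline_bits / reduction_factor

print(f"Baseline weights:   {fp16_gb:.2f} GB")
print(f"Compressed weights: {compressed_gb:.2f} GB")
print(f"Average bits/param: {equivalent_bits:.1f}")
```

Under these assumptions, the weights shrink from tens of gigabytes to a few, which is the kind of gap that separates a multi-GPU setup from a single card. Real memory use will be higher once activations and caches are counted.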
We compress it all in this interview with Artem Babenko, Head of Yandex Research. He oversees scientific research at Yandex and the company's engagement with the international scientific community. He also supervises a team of approximately 30 researchers engaged in various areas of computer science. According to Artem, his main achievements are his scientific contributions in three key areas: neural networks for image search, high-dimensional vector compression, and fast search across massive databases containing billions of records. Who better than he to explain the ambition, ingredients and final taste of compression? Let's press those buttons.
Can you explain additive quantization and PV-tuning in simpler terms, for a layman? Is it similar to model pruning?
This story is from the October 2024 edition of DataQuest.

