Large multimodal models: Another step towards AGI

September 2024 | PCQuest

Large Multimodal Models (LMMs) represent the next leap in AI, combining text, images, and audio into a single system that understands the world more like humans do. This advancement moves us closer to AI that can perform complex tasks across various domains, from healthcare to entertainment, and brings us a step nearer to Artificial General Intelligence.

- Amit Gupta

The excitement surrounding large language models (LLMs) is growing rapidly, and industries are exploring a wide range of use cases. As a transformative technology, LLMs are being closely watched for their potential to revolutionize and optimize everything from customer service to complex data analysis to advancing health care. Bill Gates recently wrote a blog post on how agents will be the next big thing in software. He further claimed that within the next five years, anyone who is online will be able to have a personal assistant powered by artificial intelligence.

While industries and the user community are still caught up in the euphoria around LLMs, the hi-tech industry has already started working on the evolution towards Large Multimodal Models (LMMs), a step towards extending the 'emergent' abilities of LLMs beyond text-only input and output.

Large Multimodal Models

We human beings are blessed with multiple sensory and cognitive capabilities, and our intelligence is a collective one derived from multiple sources. As we grow, we learn to use one or more of these 'modes of interaction' to engage with the world around us. The future of AI will likely follow the same path, integrating multiple data modalities at the input and/or output of AI models and leading to the development of LMMs. The input or output modes of interest could be text/language, images, video, audio, sensor data, actuator data, etc. Until recently, the focus was on unimodal models, which could process only one data mode (such as text, speech, or image) at a time.

By combining these different types of data, LMMs can achieve a more holistic understanding of the world, enabling them to perform complex tasks. For instance, an LMM could analyze a video, recognize objects, understand spoken language, and generate descriptive text, all in one seamless pass.

This story is from the September 2024 edition of PCQuest.
