Large multimodal models: Another step towards AGI
September 2024 | PCQuest
Large Multimodal Models (LMMs) represent the next leap in AI, combining text, images, and audio into a single system that understands the world more like humans do. This advancement moves us closer to AI that can perform complex tasks across domains from healthcare to entertainment, and brings us a step nearer to Artificial General Intelligence.
The excitement surrounding large language models (LLMs) is growing rapidly, with industries exploring a wide range of use cases. As a transformative technology, LLMs are being closely watched for their potential to revolutionize and optimize everything from customer service to complex data analysis to health care. Bill Gates recently wrote in a blog post that agents will be the next big thing in software. He further claimed that within the next five years, anyone who is online will be able to have a personal assistant powered by artificial intelligence.
While industry and the user community are still caught up in the euphoria around LLMs, the high-tech industry has already begun working on the evolution towards Large Multimodal Models (LMMs), a step towards extending the 'emergent' abilities of LLMs beyond text-only input and output.
Large Multimodal Models
Human beings have multiple sensory and cognitive capabilities, and our intelligence is a collective one, derived from multiple sources. As we grow, we learn to use one or more of these 'modes of interaction' to engage with the world around us. The future of AI will likely follow the same path, integrating multiple data modalities at the input and/or output of AI models, leading to the development of LMMs. The modalities of interest include text/language, images, video, audio, sensor data, and actuator data. Until recently, the focus was on unimodal models, which could process only one data mode (such as text, speech, or images) at a time.
By combining these different types of data, LMMs can achieve a more holistic understanding of the world, enabling them to perform complex tasks. For instance, an LMM could analyze a video, recognize objects, understand spoken language, and generate descriptive text, all in one seamless pass.
This story is from the September 2024 edition of PCQuest.

