Innovate More Rapidly
DataQuest
November 2019
How do you undertake a journey to the data cloud? Digital transformation is going on all around us. It is happening across all aspects of society. We are now learning how to integrate new technologies.
Change has accelerated in the past decade. Earlier, systems were deployed with the expectation that they would last forever. They were not designed to look at each other’s data, and were fairly limiting. Open source was a new idea in the early 2000s. People began to adopt Lucene, software I had written, even though there was no institutional backing or publicity. Open source emerged as a tool for development.
A Distributed Computing Platform
Nutch started in 2003. Around 2005, Google published a paper on how they build search engines and how they had automated things. We started reworking Nutch in 2004. The tale of debugging is much longer. In 2006, I joined Yahoo! and developed Hadoop there. Hadoop was named after my son’s toy elephant. It was a distributed computing platform, based on Google’s ideas.
A group of people believed that Hadoop could be taken much further. Together, they formed Cloudera. I joined Cloudera in 2009. Stepping back, my lesson from Hadoop is this: if you increase the scale and focus on flexibility, you permit people to store more data in raw form and experiment, so they can innovate more quickly. The waterfall method inhibited progress through data. Hadoop gave us a much more appropriate platform. Most of the past data was relational.
New sources of data are events, readings recorded from sensors, and so on, and they need a different class of tools. Companies can handle petabytes of data easily today. Software is also eating the world. In every industry, everywhere, the advances being made predominantly use software. Today, a company’s growth is fuelled more by data. The use of data is no longer isolated. It has emerged everywhere.
Challenges
This story is from the November 2019 edition of DataQuest.

