Google Cloud's BigQuery gets AI injection, Looker to follow
The tech giant's latest analytics and data management moves include the general availability of Gemini in BigQuery and the imminent availability of Gemini in Looker.
Google Cloud on Thursday made Gemini in BigQuery generally available, providing customers with generative AI capabilities in the same environment in which they carry out data engineering and exploration tasks as they prepare data for analysis.
In addition, the tech giant provided an update on the Gemini multi-modal large language model (LLM) in Looker, revealing that the feature is still in preview but expected to be generally available shortly.
Google Cloud revealed the generative AI-related news during Google Cloud Next Tokyo, a user conference held in Japan.
Doug Henschen, an analyst at Constellation Research, noted that after Google Cloud unveiled a spate of generative AI capabilities in preview in 2023, this year has marked the realization of many of those promised features. For example, in February, Google Cloud made AlloyDB AI generally available as well as Gemini models for BigQuery by integrating BigQuery with Vertex AI, a machine learning and AI platform that provides access to third-party LLMs.
Gemini in BigQuery continues the tech giant's push to deliver generative AI capabilities, as will the pending general availability of Gemini in Looker.
"Google Cloud and its rivals have all spent 2024 exposing and maturing GenAI capabilities across their clouds," Henschen said. "The Gemini integrations … featured in this announcement continue in that vein."
Beyond bringing generative AI directly to data, Google Cloud unveiled new capabilities aimed at making BigQuery a unified platform for AI and data management, such as support for Apache Iceberg for data formatting, Apache Spark for data processing and Apache Kafka for data streaming.
Like Henschen, David Menninger, an analyst at ISG's Ventana Research, noted that Google Cloud's latest moves represent progression from theory to practice as the vendor works to make analytics processes simpler and more efficient.
"Google made a series of announcements [in April] that taken as a whole were impressive in their breadth, but which were in preview at the time," he said. "Now these features are becoming generally available. Collectively they are focused on leveraging generative AI."
Gemini and data
Google Cloud first launched Gemini in December 2023. Like the GPT models from OpenAI, Gemini is an LLM that enables users to understand and generate text, code, images and videos, among other types of information.
When joined with data management and analytics platforms to understand an enterprise's proprietary data, LLMs let users interact with data using natural language.
The result is that non-technical workers who lack coding knowledge and data literacy training are now able to query and analyze data. In addition, data experts who previously spent large amounts of time coding are made more efficient.
Gerrit Kazmaier, vice president and general manager of data and analytics at Google Cloud, noted that since its inception, business intelligence has been constrained.
One limitation has been the technological expertise required to use complex platforms. Another has been the way analytics is consumed, with reports and dashboards taking time to develop and conveying only a limited amount of information.
Still another constraint is that data has to be modeled before it can be analyzed, a time-consuming process that restricts what data can be used to inform a decision.
By combining Gemini with BigQuery and Looker, joining generative AI and analytics in a single environment, Google Cloud aims to remove the constraints that have held BI back. The tech giant is attempting to free analysis from the limitations of dashboards and reports and make generative AI the interface through which users engage with data.
"It's a paradigm shift from activating data through dashboards to activating data through AI and agents -- a different way of consuming data," Kazmaier said. "We have to make it as easy to search and interrogate enterprise data as Google search makes it to search the public web [so users have] access to data through natural language and easy experiences."
To break through existing constraints, Gemini in Looker includes customized LLM agents that are expert in selecting data, performing analysis and summarizing results, according to Kazmaier.
In addition, the generative AI in Looker proactively mines data and learns from user activity to automatically deliver suggestions.
"We couldn't quite get it done in time for Tokyo. But it will be a quick follow," Kazmaier said of the general availability of Gemini in Looker.
While Gemini in Looker will bring generative AI to analytics, Gemini in BigQuery now brings generative AI to Google Cloud's platform for AI development with AI-powered data engineering, exploration, governance and security.
Specific features include code generation, a data canvas, partition and cluster recommendations, and code completion and explanation for SQL and Python.
Between Gemini in BigQuery and Gemini in Looker, perhaps the most intriguing features are the LLM agents, according to Menninger, who saw them in action when they were unveiled in preview during Google Cloud's Next 2024 user conference in April. In Looker, they help users discover and select data in addition to assisting analysis, and in BigQuery, they aid data preparation.
"I was most impressed by the Gemini data agents to assist with things like data preparation and exploration," Menninger said.
Unified platform for AI
While Gemini in BigQuery and Gemini in Looker bring generative AI to data, Google Cloud is also in the process of turning BigQuery from a data warehouse into a more expansive suite for developing AI models and applications.
Each year, Google Cloud conducts a survey of customers and publishes a Data and AI Trends Report. The 2024 version found that 96% of chief experience officers think generative AI is critical to their business. However, two-thirds of the data that could inform generative AI models, applications and decisions -- much of it unstructured data such as text and images -- is inaccessible and unused.
Google Cloud aims to enable access to that "dark" data by developing a multimodal data platform that enables access to all types of data, whether structured, unstructured, at rest or streaming, according to Kazmaier.
"It's not just about taking public large language models and using them as assistants," he said. "It's about activating enterprise data through AI. In order to get to incredible AI, you need to have incredible data."
Toward that end, Google Cloud now provides a fully managed version of Apache Iceberg in BigQuery with fully managed support for Apache Kafka and Spark now in preview. In addition, support for Delta Lake is now generally available.
Iceberg and Delta Lake are the two most popular storage formats for data lakehouses, which enable storage of traditional structured data with unstructured data such as text and images in the same location. By enabling access to all data types in BigQuery, Google Cloud is making unstructured data more easily accessible so that much of the data that currently goes unused can be activated to inform AI and analytics applications.
In addition, by making BigQuery a unified platform for all data types to enable developers to more easily build and train generative AI models, Google Cloud is distancing itself from competitors such as Microsoft and Amazon Web Services, according to Henschen.
"Google Cloud has differentiated through its unification moves around BigQuery, giving customers the ability to analyze structured, unstructured and open-format data seamlessly," he said. "Where competitors have lots of different database services, Google has done a better job of unifying around BigQuery as the center of data gravity, with key services integrated with this core platform."
Menninger likewise stated that the general availability of a series of features gives Google Cloud a slight advantage, at least temporarily, over competitors including AWS and Microsoft in the race to provide customers with meaningful generative AI capabilities.
"It's such a competitive market that it's difficult to stay ahead of competitors for long," he said. "All the data platform vendors and hyperscalers are working on ways to incorporate generative AI. But I would have to give a nod to Google for the breadth of its generative AI capabilities that are now generally available. They span nearly all aspects of the data and analytics lifecycle."
Next steps
Beyond infusing BigQuery and Looker with generative AI and moving to make BigQuery a unified platform for developing AI and analytics applications, Google Cloud unveiled numerous other analytics-related features on Thursday.
Among them are the general availability of real-time streaming data sharing in BigQuery's Analytics Hub and the general availability of a data migration program that includes credits to cover the cost of migrating data to BigQuery.
Looking ahead to Google Cloud's next set of new analytics and data management capabilities, Henschen said the tech giant could focus on unifying its data integration tools.
Google Cloud used to have three separate analytics platforms. In 2022, however, it consolidated those platforms under the Looker name. Data integration provides a similar opportunity for Google Cloud.
"I'd like to see a unified integration platform from Google," Henschen said. "What they have now is a collection of integration services that aren't unified into a single platform."
Menninger, meanwhile, suggested that not only Google Cloud but also many other AI and analytics vendors have room to improve their analytics and AI governance capabilities. Most have made strides in data governance. But as more enterprises treat data products as commodities and AI continues to gain momentum, they need to address analytics and AI governance.
"I still see holes in the market around governance of analytics and AI," Menninger said. "In my most recent … assessment of AI platforms, I was disappointed to see that most vendors, including Google, still have a long way to go to provide tooling to help enterprises fully trust and rely on their AI-based processes."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.