SQL Pipe Syntax Is Now Available in Cloud Logging and BigQuery


 Cloud Logging and BigQuery now support the ground-breaking SQL pipe syntax.


The industry standard language for database programming is now SQL. Data access is now truly available to anyone thanks to its well-known syntax and established community. But let's be honest, SQL isn't perfect. The syntax of SQL has a number of issues that make it more challenging to read and write:

  • Rigid structure: A query must follow a certain order (SELECT … FROM … WHERE … GROUP BY), and subqueries or other complex patterns are required to do anything otherwise.
  • Awkward inside-out data flow: a query's FROM clauses, which are part of subqueries or common table expressions (CTE), are the first step. Logic is then constructed outward from there.
  • Verbose, repeating syntax: Are you tired of seeing the same columns in SELECT, GROUP BY, and ORDER BY, as well as in every subquery?
 
These issues could make SQL more difficult for new users. Even for seasoned users, it takes more work to read or write SQL than it should. A more useful syntax would be advantageous to all.

Over time, a large number of alternative languages and APIs have been proposed, some of which have demonstrated significant potential in particular applications. Many of these, including Apache Beam and Python DataFrames, use piped data flow, which makes it easier to create arbitrary searches. Many people find this syntax easier to learn and more useful than SQL.
 
The goal of introducing SQL pipe syntax to Google Cloud is to make data analysis easier to use. As a result, it is thrilled to present pipe syntax, a revolutionary innovation that adds the elegance of piped data flow to SQL in BigQuery and Cloud Logging.

What is the syntax for pipes?

In conclusion, pipe syntax is an extension of standard SQL syntax that makes SQL more adaptable, succinct, and straightforward. It offers the same basic operators as standard SQL, with the same semantics and nearly the same syntax, even though it allows applying operators in any order and in any number of times.

How it works:

  • A query can be started with FROM.
  • Operators are written consecutively using the |> pipe sign.
  • Each operator consumes its input table and then produces an output table.
  • Most pipe operators employ standard SQL syntax, which includes LIMIT, ORDER BY, JOIN, WHERE, SELECT, and so on.
  • Even within a single query, standard and pipe syntax can be combined as needed.

Real-world effects at HSBC

The global banking giant HSBC has already embraced pipe syntax after testing an early version in BigQuery and witnessing impressive results. They noticed significant improvements in efficiency and code readability, especially when dealing with large JSON collections.

The advantages of using SQL pipe syntax

The adoption of pipe syntax has multiple advantages for SQL developers. Here are a few instances:

Easy to comprehend

Learning and embracing new languages can be challenging, particularly in large organizations where it is better for everyone to use the same tools and languages. Pipe syntax is not a new language; rather, it is a new feature of the SQL language. For users who are already familiar with SQL, learning pipe syntax is rather simple because it primarily employs the same syntax and many of the same operators.

For users who are unfamiliar with SQL, learning pipe syntax is easier at first. They still need to learn the operators and some semantics (such inner and outer joins), but they can use those operators to express their intended queries directly, avoiding some of the complexities and workarounds required when creating queries in standard SQL.

Easy to implement gradually and without the need for migrations

Everyone is aware that implementing a new language or system may be expensive, time-consuming, and error-prone. Since pipe syntax is a feature of GoogleSQL, you don't need to migrate anything before you can start utilizing it. The new syntax can be applied sparingly where it is helpful, and all existing queries continue to work. Any new SQL code is fully compatible with existing SQL code. For example, queries employing pipe syntax can call standard views built in standard syntax, and vice versa. Using pipe syntax in new SQL code does not render any existing SQL obsolete or unusable.

No effect on performance or cost

Pipe syntax works on popular platforms like BigQuery without the need for extra layers (such translation proxies), which could raise latency, cost, or reliability problems and make debugging or adjusting more difficult.

There is also no additional fee. The SQL query optimizer will still restructure the query to execute faster because pipe syntax queries still follow SQL's declarative semantics. In other words, queries performed in pipe syntax or standard syntax typically execute the same.

What uses is pipe syntax appropriate for?

Whether you're looking at data, creating dashboards, building data pipelines, or looking at logs, pipe syntax lets you create SQL queries that are simpler to comprehend, more efficient, and easier to maintain. Additionally, because pipe syntax supports most common SQL operators, you can utilize it whenever you write queries. Here are some apps to get you started:

Ad hoc analysis and query debugging

In order to find out what is in a table, you typically start data exploration by looking at its rows (starting with a FROM clause). Filters, aggregations, joins, sorting, and other operations are then applied. Pipe syntax makes this kind of study quite easy because you can start with a FROM clause and work your way up from there. Each step's current results can be viewed, a pipe operator can be added, and the query can be performed to see the modified results.

Using pipe syntax also helps with query debugging. A query prefix can be highlighted and run, displaying the intermediate result up to that point. Every query prefix up to a pipe symbol is also a valid query, which is a nice property of queries in pipe syntax.

Data engineering lifecycle

As data volume rises, data processing and transformation become more challenging and time-consuming. When there is a lot of data, it usually takes a lot of technical work to build, manage, and maintain a data pipeline. Pipe syntax's more approachable syntax and linear query pattern make data engineering easier. Bid farewell to the heavily nested queries and CTEs that frequently show up when normal SQL is utilized. This most recent version of GoogleSQL reimagines how you parse, extract, and transform data, making it easier to design and manage data pipelines.

Using SQL and LLMs in plain language

Research suggests that large language models (LLMs) may find it challenging to understand or generate SQL for the same reasons that humans may find it challenging to read and write. In contrast, pipe syntax breaks down queries into discrete stages that closely resemble the desired logical data flow. By employing pipe syntax, the LLM may more easily specify a desired data flow and make the resulting queries simpler and easier for humans to interpret. Additionally, this greatly facilitates human validation of the generated queries.

Pipe syntax also makes better code helpers and auto-completion possible because it makes it much easier to understand what's going and what's possible. Furthermore, it permits recommendations for local changes to a particular pipe operator as opposed to global changes to a query as a whole. additional intelligent AI-generated code suggestions and additional operators based on natural language in a query are great methods to boost user productivity.

Explore pipe syntax's possibilities immediately

For fifty years, SQL has been the global language of data due to its effectiveness. SQL shines in several areas when it comes to expressing queries as declarative combinations of relational operators.

That does not, however, mean that SQL cannot be enhanced. Pipe syntax advances SQL into the future by fixing its main usability problems and creating new opportunities for extending and interacting with SQL. This has nothing to do with replacing SQL or developing a new language. Despite being a more expressive, adaptable, and user-friendly form of SQL, SQL with pipe syntax is still SQL.

Post a Comment

0 Comments