Release 1.36.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.36.0.

This release comes 3 months after 1.35.0, contains contributions from 30 contributors, and resolves 125 issues.

Among other new features, it’s worth highlighting the addition of 30 new SQL functions in various libraries such as BigQuery and Spark, many improvements hardening TABLESAMPLE, and also the following features:

  • [CALCITE-129] Support recursive WITH queries
  • [CALCITE-6022] Support CREATE TABLE ... LIKE DDL in server module
  • [CALCITE-5962] Support parse Spark-style syntax LEFT ANTI JOIN in Babel parser
  • [CALCITE-5184] Support LIMIT start, ALL in MySQL conformance, equivalent to OFFSET start
  • [CALCITE-5889] Add a RelRule that converts Minus into UNION ALL..GROUP BY...WHERE

In addition to new features, it’s also worth highlighting the integration of the SQL Logic Test suite.

See the release notes; download the release.

Release 1.35.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.35.0.

This release comes 4 months after 1.34.0, contains contributions from 36 contributors, and resolves 140 issues.

Among other new features, it adds more than 40 new SQL functions in various libraries such as BigQuery and Spark.

It is worth highlighting the following improvements:

  • Some improvements in calcite core.
    • [CALCITE-5703] Reduce amount of generated runtime code
    • [CALCITE-5479] FamilyOperandTypeChecker is not readily composable in sequences
    • [CALCITE-5425] Should not pushdown Filter through Aggregate without group keys
    • [CALCITE-5506] RelToSqlConverter should retain the aggregation logic when Project without RexInputRef on the Aggregate
  • Some improvements in simplifying an expression.
    • [CALCITE-5769] Optimizing CAST(e AS t) IS NOT NULL to e IS NOT NULL
    • [CALCITE-5780] Simplify 1 > x OR 1 <= x OR x IS NULL to TRUE
    • [CALCITE-5798] Improve simplification of (x < y) IS NOT TRUE when x and y are not nullable
    • [CALCITE-5759] SEARCH(1, Sarg[IS NOT NULL]) should be simplified to TRUE
    • [CALCITE-5639] RexSimplify should remove IS NOT NULL check when LIKE comparison is present
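
The simplification in [CALCITE-5780] follows from SQL's three-valued logic. The following is a minimal, self-contained sketch (it does not use Calcite's RexSimplify; the class and method names are hypothetical) that models UNKNOWN as a Java null and checks that 1 > x OR 1 <= x OR x IS NULL evaluates to TRUE for every x, including NULL.

```java
/** Illustrative sketch of SQL three-valued logic; null stands for UNKNOWN. */
final class ThreeValuedLogicDemo {
  /** SQL OR: TRUE if either side is TRUE, else UNKNOWN if either side is UNKNOWN, else FALSE. */
  static Boolean or(Boolean a, Boolean b) {
    if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
      return Boolean.TRUE;
    }
    if (a == null || b == null) {
      return null;
    }
    return Boolean.FALSE;
  }

  /** SQL comparison 1 > x, which is UNKNOWN when x is NULL. */
  static Boolean oneGreaterThan(Integer x) {
    return x == null ? null : Boolean.valueOf(1 > x);
  }

  /** SQL comparison 1 <= x, which is UNKNOWN when x is NULL. */
  static Boolean oneLessOrEqual(Integer x) {
    return x == null ? null : Boolean.valueOf(1 <= x);
  }

  /** Evaluates 1 > x OR 1 <= x OR x IS NULL. */
  static Boolean evaluate(Integer x) {
    Boolean isNull = Boolean.valueOf(x == null); // IS NULL is never UNKNOWN
    return or(or(oneGreaterThan(x), oneLessOrEqual(x)), isNull);
  }

  public static void main(String[] args) {
    for (Integer x : new Integer[] {null, 0, 1, 2}) {
      System.out.println("x = " + x + " -> " + evaluate(x));
    }
  }
}
```

Without the x IS NULL term, the expression would be UNKNOWN for a NULL x, which is why the third disjunct is needed before the whole condition can be replaced by TRUE.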

See the release notes; download the release.

Release 1.34.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.34.0.

This release comes 1 month after 1.33.0, contains contributions from 18 contributors, and resolves 34 issues.

It’s worth highlighting the introduction of QUALIFY clause ([CALCITE-5268]), which facilitates filtering the results of window functions. Among other improvements and fixes, it adds roughly 15 new functions in BigQuery library for handling dates, times, and timestamps, and provides a fix ([CALCITE-5522]) for a small breaking change in DATE_TRUNC function ([CALCITE-5447]), which was introduced accidentally in 1.33.0.

See the release notes; download the release.

Release 1.33.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.33.0.

This release comes five months after 1.32.0, contains contributions from 33 contributors, and resolves 107 issues.

Among others, it is worth highlighting the following improvements:

  • Many improvements to the BigQuery dialect as part of [CALCITE-5180]
    • [CALCITE-5269] Implement BigQuery TIME_TRUNC and TIMESTAMP_TRUNC functions
    • [CALCITE-5360] Implement TIMESTAMP_ADD function (compatible with BigQuery)
    • [CALCITE-5389] Add STARTS_WITH and ENDS_WITH functions (for BIG_QUERY compatibility)
    • [CALCITE-5404] Implement BigQuery’s POW() and TRUNC() math functions
    • [CALCITE-5423] Implement TIMESTAMP_DIFF function (compatible with BigQuery)
    • [CALCITE-5430] Implement IFNULL() for BigQuery dialect
    • [CALCITE-5432] Implement BigQuery TIME_ADD/TIME_DIFF
    • [CALCITE-5436] Implement DATE_SUB, TIME_SUB, TIMESTAMP_SUB (compatible w/ BigQuery)
    • [CALCITE-5447] Add DATE_TRUNC for BigQuery
  • [CALCITE-5105] Add MEASURE type and AGGREGATE aggregate function
  • [CALCITE-5155] Custom time frames
  • [CALCITE-5280] Implement geometry aggregate functions
  • [CALCITE-5314] Prune empty parts of a query by exploiting stats/metadata

See the release notes; download the release.

Release 1.32.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.32.0.

This release fixes CVE-2022-39135, an XML External Entity (XEE) vulnerability that allows a SQL query to read the contents of files via the SQL functions EXISTS_NODE, EXTRACT_XML, XML_TRANSFORM or EXTRACT_VALUE.

Coming 1 month after 1.31.0 with 19 issues fixed by 17 contributors, this release also replaces the ESRI spatial engine with JTS and proj4j, adds 65 spatial SQL functions including ST_Centroid, ST_Covers and ST_GeomFromGeoJSON, adds the CHAR SQL function, and improves the return type of the ARRAY and MULTISET functions.

See the release notes; download the release.

Release 1.31.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.31.0.

This release comes four months after 1.30.0, contains contributions from 28 contributors, and resolves 81 issues.

Among others, it is worth highlighting the following improvements:

  • [CALCITE-4865] Allow table functions to be polymorphic
  • [CALCITE-5107] Support SQL hint for Filter, SetOp, Sort, Window, Values
  • [CALCITE-35] Support parsing parenthesized joins
  • [CALCITE-3890] Derive IS NOT NULL filter for the inputs of inner join
  • [CALCITE-5085] Firebolt dialect implementation

See the release notes; download the release.

Release 1.30.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.30.0.

This release comes over two months after 1.29.0, contains contributions from 29 authors, and resolves 36 issues.

See the release notes; download the release.

Release 1.28.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.28.0.

This release comes four months after 1.27.0, contains contributions from 38 authors, and resolves 76 issues. New features include the UNIQUE sub-query predicate, the MODE aggregate function, PERCENTILE_CONT and PERCENTILE_DISC inverse distribution functions, an Exasol dialect for the JDBC adapter, and improvements to materialized view recognition.

This release contains some breaking changes (described below) due to the replacement of ImmutableBeans with Immutables. Two APIs are deprecated and will be removed in release 1.29.

Breaking changes to ImmutableBeans

In 1.28, Calcite converted the recently introduced configuration system from an internal system based on ImmutableBeans to instead use the Immutables annotation processor. This library brings a large number of additional features that should make value-type classes in Calcite easier to build and leverage. It also reduces reliance on dynamic proxies, which should improve performance and reduce memory footprint. Lastly, this change increases compatibility with ahead-of-time compilation technologies such as GraalVM. As part of this change, a number of minor changes have been made and key methods and classes have been deprecated. The change was designed to minimize disruption to existing consumers of Calcite but the following minor changes needed to be made:

  • The RelRule.Config.EMPTY field is now deprecated. To create a new configuration subclass, you can either use your preferred interface-implementation based construction or you can leverage Immutables. To do the latter, configure your project to use the Immutables annotation processor and annotate your subclass with the @Value.Immutable annotation.
  • Where RelRule.Config subclasses were nested 2+ classes deep, the interfaces have been marked deprecated and are superseded by new, uniquely named interfaces. The original Configs extend the new uniquely named interfaces. Subclassing these works as before and the existing rule signatures accept any previously implemented Config implementations. However, this is a breaking change if a user stored an instance of the DEFAULT object using the Config class name (as the DEFAULT instance now only implements the uniquely named interface).
  • The RelRule.Config.as() method should only be used for safe downcasts. Before, it could do arbitrary casts. The exception is that arbitrary as() will continue to work when using the deprecated RelRule.Config.EMPTY field. In most cases, this should be a non-breaking change. However, all Calcite-defined DEFAULT rule config instances use Immutables. As such, if one had previously subclassed a RelRule.Config subclass and then used the DEFAULT instance from that subclass, the as() call will no longer work to coerce the DEFAULT instance into an arbitrary subclass. In essence, outside the EMPTY use, as() is now only safe to do if a Java cast is also safe.
  • ExchangeRemoveConstantKeysRule.Config and ValuesReduceRule.Config now declare concrete bounds for their matchHandler configuration. This is a breaking change if one did not use the Rule as a bounding variable.
  • Collections used in Immutables value classes will be converted to Immutable collection types even if the passed in parameter is mutable (such as an ArrayList). As such, consumers of those configuration properties cannot mutate the returned collections.
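
The last point can be illustrated with a small, self-contained sketch. It does not use Immutables or any Calcite class: DemoConfig is a hypothetical hand-written value class that copies its collection argument the way a generated value class might, so treat it only as an approximation of the behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ImmutableConfigDemo {
  /** Hypothetical value class (not a Calcite or Immutables type). */
  static final class DemoConfig {
    private final List<String> names;

    DemoConfig(List<String> names) {
      // Copy the argument, then wrap it so callers cannot mutate it,
      // even if the list passed in was a mutable ArrayList.
      this.names = Collections.unmodifiableList(new ArrayList<>(names));
    }

    List<String> names() {
      return names;
    }
  }

  /** Returns whether the list accepts an add; any probe element is removed again. */
  static boolean isMutable(List<String> list) {
    try {
      list.add("probe");
      list.remove(list.size() - 1);
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
    DemoConfig config = new DemoConfig(source);
    source.add("c"); // does not affect the copy held by the config
    System.out.println("config names: " + config.names());
    System.out.println("returned list mutable? " + isMutable(config.names()));
  }
}
```

Code that used to modify a collection obtained from a config would need to copy it first, for example with new ArrayList<>(config.names()).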

See the release notes; download the release.

Release 1.27.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.27.0.

This release comes eight months after 1.26.0. It includes more than 150 resolved issues, comprising a few new features, three minor breaking changes, many bug-fixes and small improvements, as well as code quality enhancements and better test coverage.

See the release notes; download the release.

Calcite Online Meetup January 2021

On January 20, we are organising an online meetup for Apache Calcite.

The main purpose is to bring the community together allowing newcomers and senior members to interact and exchange ideas on various topics.

During the occasion we will have a few presentations covering introductory Calcite concepts, recent & ongoing work on streams, spatial query implementation, and integration of Calcite in Hazelcast, followed by an open discussion and a virtual key signing party.

For more details check the agenda on meetup.

Release 1.26.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.26.0.

Warning: Calcite 1.26.0 has severe issues with RexNode simplification caused by the SEARCH operator (wrong data from query optimization, as in CALCITE-4325 and CALCITE-4352, and NullPointerException), so use 1.26.0 for development only, and beware that Calcite 1.26.0 might corrupt your data.

This release comes about two months after 1.25.0 and includes more than 70 resolved issues, comprising a lot of new features and bug-fixes.

See the release notes; download the release.

Release 1.25.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.25.0.

This release comes about one month after 1.24.0 and removes methods which were deprecated in the previous version.

See the release notes; download the release.

Release 1.24.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.24.0.

This release comes about two months after 1.23.0. It includes more than 80 resolved issues, comprising a lot of new features as well as performance improvements and bug-fixes.

See the release notes; download the release.

Release 1.23.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.23.0.

This release comes two months after 1.22.0. It includes more than 100 resolved issues, comprising a lot of new features as well as performance improvements and bug-fixes. For some complex queries, planning can be 50x or more faster than in previous versions with the built-in default rule set. It is also worth highlighting that Calcite now:

  • Supports top down trait request and trait enforcement without abstract converter (CALCITE-3896)
  • Improves VolcanoPlanner performance by removing rule match and subset importance (CALCITE-3753)
  • Improves VolcanoPlanner performance when abstract converter is enabled (CALCITE-2970)
  • Supports ClickHouse dialect (CALCITE-2157)
  • Supports SESSION and HOP table functions (CALCITE-3780, CALCITE-3737)

See the release notes; download the release.

Release 1.22.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.22.0.

This release comes five months after 1.21.0. It includes more than 250 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes.

We have also fixed some important bugs:

  • The metadata cache is fixed for rare cases in which RelSets are merging (CALCITE-2018)
  • The GROUP_ID now returns correct results (CALCITE-1824)
  • CORRELATE row count estimation has been fixed; previously it was always 1 (CALCITE-3711)
  • The modulus precision inference of DECIMALs has been fixed (CALCITE-3435)

See the release notes; download the release.

Release 1.21.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.21.0.

This release comes two months after 1.20.0. It includes more than 100 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes.

See the release notes; download the release.

Release 1.20.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.20.0.

This release comes three months after 1.19.0. It includes more than 130 resolved issues, comprising a few new features as well as general improvements and bug-fixes. It includes support for anti-joins, recursive queries, new functions, a new adapter, and many more bug fixes and improvements.

See the release notes; download the release.

Release 1.18.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.18.0.

With over 200 commits from 36 contributors, this is the largest Calcite release ever. To the SQL dialect, we added JSON functions, linear regression functions, and the WITHIN GROUP clause for aggregate functions; there is a new utility to recommend lattices based on past queries, and improvements to expression simplification, the SQL advisor, and the Elasticsearch and Apache Geode adapters.

See the release notes; download the release.

Release 1.17.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.17.0.

This release comes four months after 1.16.0. It includes more than 90 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes.

See the release notes; download the release.

Release 1.16.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.16.0.

This release comes three months after 1.15.0. It includes more than 80 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes to Calcite core.

See the release notes; download the release.

Release 1.15.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.15.0. In this release, three months after 1.14.0, 50 issues are fixed by 22 contributors. Among more modest improvements and bug-fixes, here are some features of note:

  • [CALCITE-707] adds DDL commands to Calcite for the first time, including CREATE and DROP commands for schemas, tables, foreign tables, views, and materialized views. We know that DDL syntax is a matter of taste, so we added the extensions to a new “server” module, leaving the “core” parser unchanged;
  • [CALCITE-2061] allows dynamic parameters in the LIMIT and OFFSET clauses;
  • [CALCITE-1913] refactors the JDBC adapter to make it easier to plug in a new SQL dialect;
  • [CALCITE-1616] adds a data profiler, an algorithm that efficiently analyzes large data sets with many columns, estimating the number of distinct values in columns and groups of columns, and finding functional dependencies. The improved statistics are used by the algorithm that designs summary tables for a lattice.

Calcite now supports JDK 10 and Guava 23.0. (It continues to run on JDK 7, 8 and 9, and on versions of Guava as early as 14.0.1. The default version of Guava remains 19.0, the latest version compatible with JDK 7 and the Cassandra adapter’s dependencies.)

This is the last release that will support JDK 7.

See the release notes; download the release.

Release 1.14.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.14.0.

This release comes three months after 1.13.0. It includes 68 resolved issues with many improvements and bug fixes. This release brings some big new features. The GEOMETRY data type was added along with 35 associated functions as the start of support for Simple Feature Access. There are also two new adapters.

The first is the Elasticsearch 5 adapter, which now exists in parallel with the previous Elasticsearch 2 adapter. The second is an OS adapter, which exposes operating system metrics as relational tables. ThetaSketch and HyperUnique support has also been added to the Druid adapter. Several minor improvements are included as well, such as improved MATCH_RECOGNIZE support, quantified comparison predicates, and ARRAY and MULTISET support for UDFs.

See the release notes; download the release.

Release 1.13.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.13.0.

This release comes three months after 1.12.0. It includes more than 75 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes.

First, Calcite has been upgraded to use Avatica 1.10.0, which was recently released.

Moreover, Calcite core includes improvements which aim at making it more powerful, stable and robust. In addition to numerous bug-fixes, we have implemented a new materialized view rewriting algorithm and new metadata providers which should prove useful for data processing systems relying on Calcite.

In this release, we have also completed the work to support the MATCH_RECOGNIZE clause used in complex-event processing (CEP).

In addition, more progress has been made for the different adapters. For instance, the Druid adapter now relies on Druid 0.10.0 and it can generate more efficient plans where most of the computation can be pushed to Druid, e.g., using extraction functions.

See the release notes; download the release.

New Avatica Repository

The Apache Calcite PMC is pleased to announce further growth of its sub-project, Avatica.

Avatica has been slowly growing inside of Calcite for many years (dating back to Optiq-0.4.x!). The team has taken the next step to hoist the Avatica code out of the Calcite repository into its own. The team felt like this was the next logical step given the maturity of the project.

The previous “/avatica” directory in the Calcite repository has been removed, so further contributions should be submitted against the new repository. The de-facto repository can be found at the ASF’s Git hosting, with a mirrored copy also available on GitHub at apache/calcite-avatica.

Release 1.12.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.12.0.

In 2½ months, 29 contributors have resolved 95 issues. Here are some of the highlights.

Calcite now supports JDK 9 and Guava 21.0. (It continues to run on JDK 7 and 8, and on versions of Guava as early as 14.0.1. The default version of Guava remains 19.0, due to the Cassandra adapter’s dependencies, and the fact that Guava 21.0 requires JDK 8 or later.)

There are two new adapters:

  • The File adapter can read files of various formats (such as CSV, JSON, zipped files, and HTML) over various protocols (including file and HTTP). If reading HTML files, it can extract data from nested <TABLE> elements.
  • The Pig adapter provides a SQL interface to Apache Pig.

And there are continuing improvements in performance and stability of the Druid adapter. (The Druid project now embeds Calcite to provide SQL support, and there has been cross-fertilization between the projects.)

To err is human, as the saying goes. If you mis-type the name of a schema, table or column in a SQL statement, Calcite now helps you correct it. The error message indicates whether it was the schema, table or column that was not found; if the mistake was just due to an upper- or lower-case letter, it suggests the correct name.
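
As a rough illustration of the idea (not Calcite's validator code; the class, the method names and the message wording below are all hypothetical), a case-insensitive lookup is enough to turn a "not found" error into a suggestion:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class NameSuggestionDemo {
  /** Returns a known name that differs from the requested one only by letter case, if any. */
  static Optional<String> suggest(String requested, List<String> knownNames) {
    if (knownNames.contains(requested)) {
      return Optional.empty(); // exact match, nothing to suggest
    }
    return knownNames.stream()
        .filter(name -> name.equalsIgnoreCase(requested))
        .findFirst();
  }

  /** Builds an illustrative error message; the wording is an assumption, not Calcite's. */
  static String notFoundMessage(String kind, String requested, List<String> knownNames) {
    String base = kind + " '" + requested + "' not found";
    return suggest(requested, knownNames)
        .map(name -> base + "; did you mean '" + name + "'?")
        .orElse(base);
  }

  public static void main(String[] args) {
    List<String> tables = Arrays.asList("EMP", "DEPT");
    System.out.println(notFoundMessage("Table", "emp", tables));
    System.out.println(notFoundMessage("Table", "bonus", tables));
  }
}
```

The kind argument ("Schema", "Table" or "Column") mirrors the point that the message says which kind of object was missing.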

New SQL syntax and functions:

  • HOP, TUMBLE and SESSION functions in the GROUP BY clause allow you to aggregate over window types (especially useful for streaming queries);
  • Experimental support for the MATCH_RECOGNIZE clause for Complex-Event Processing (CEP);
  • New YEAR, MONTH, WEEK, DAYOFYEAR, DAYOFMONTH, DAYOFWEEK, HOUR, MINUTE, SECOND, DATABASE, IFNULL, and USER functions to comply with the ODBC/JDBC standard. Also, EXTRACT now allows the corresponding time-unit arguments.

See the release notes; download the release.

Release 1.11.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.11.0.

Nearly three months after the previous release, there is a long list of improvements and bug-fixes, many of them making planner rules smarter. The following are some of the more important ones.

Several adapters have improvements:

  • The JDBC adapter can now push down DML (INSERT, UPDATE, DELETE), windowed aggregates (OVER), IS NULL and IS NOT NULL operators.
  • The Cassandra adapter now supports authentication.
  • Several key bug-fixes in the Druid adapter.

For correlated and uncorrelated sub-queries, we generate more efficient plans (for example, in some correlated queries we no longer require a sub-query to generate the values of the correlating variable), can now handle multiple correlations, and have also fixed a few correctness bugs.

New SQL syntax:

  • CROSS APPLY and OUTER APPLY;
  • MINUS as a synonym for EXCEPT;
  • an AS JSON option for the EXPLAIN command;
  • compound identifiers in the target list of INSERT, allowing you to insert into individual fields of record-valued columns (or column families if you are using the Apache Phoenix adapter).

A variety of new and extended built-in functions: CONVERT, LTRIM, RTRIM, 3-parameter LOCATE and POSITION, RAND, RAND_INTEGER, and SUBSTRING applied to binary types.

There are minor but potentially breaking API changes in [CALCITE-1519] (interface SubqueryConverter becomes SubQueryConverter and some similar changes in the case of classes and methods) and [CALCITE-1530] (rename Visitor to Shuttle, and create a new class Visitor<R>). See the cases for more details.

See the release notes; download the release.

Release 1.9.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.9.0.

This release includes extensions and fixes for the Druid adapter. New features were added, such as the capability to recognize and translate Timeseries and TopN Druid queries. Moreover, this release contains multiple bug fixes over the initial implementation of the adapter. It is worth mentioning that most of these fixes were contributed by Druid developers, which demonstrates the good reception of the adapter by that community.

We have added new SQL features too, e.g., support for LATERAL TABLE. There are multiple interesting extensions to the planner rules that should contribute to obtain better plans, such as avoiding doing the same join twice in the presence of COUNT DISTINCT, or being able to simplify the expressions in the plan further. In addition, we implemented a rule to convert predicates on EXTRACT function calls into date ranges. The rule is not specific to Druid; however, in principle, it will be useful to identify filter conditions on the time dimension of Druid data sources.
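
To make the EXTRACT rewrite concrete, here is a minimal sketch of the underlying equivalence using only java.time. It is not the Calcite rule itself; the class and method names are hypothetical. A predicate such as EXTRACT(YEAR FROM d) = 2016 selects the same dates as the half-open range 2016-01-01 <= d < 2017-01-01, and a range is something a time-partitioned store can use to prune data.

```java
import java.time.LocalDate;

final class ExtractToRangeDemo {
  /** Half-open date range [startInclusive, endExclusive). */
  static final class DateRange {
    final LocalDate startInclusive;
    final LocalDate endExclusive;

    DateRange(LocalDate startInclusive, LocalDate endExclusive) {
      this.startInclusive = startInclusive;
      this.endExclusive = endExclusive;
    }

    boolean contains(LocalDate date) {
      return !date.isBefore(startInclusive) && date.isBefore(endExclusive);
    }

    @Override public String toString() {
      return "[" + startInclusive + ", " + endExclusive + ")";
    }
  }

  /** Range equivalent to the predicate EXTRACT(YEAR FROM d) = year. */
  static DateRange yearEquals(int year) {
    return new DateRange(LocalDate.of(year, 1, 1), LocalDate.of(year + 1, 1, 1));
  }

  public static void main(String[] args) {
    DateRange range = yearEquals(2016);
    System.out.println("EXTRACT(YEAR FROM d) = 2016 is equivalent to d in " + range);
    System.out.println(range.contains(LocalDate.of(2016, 2, 29)));
    System.out.println(range.contains(LocalDate.of(2017, 1, 1)));
  }
}
```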

Finally, the release includes more than thirty bug-fixes, minor enhancements and internal changes to planner rules and APIs.

See the release notes; download the release.

Release 1.8.0

The Apache Calcite PMC is pleased to announce Apache Calcite release 1.8.0.

This release adds adapters for Elasticsearch and Druid. It is also now easier to make a JDBC connection based upon a single adapter.

There are several new SQL features: UNNEST with multiple arguments, MAP arguments and with a JOIN; a DESCRIBE statement; and a TRANSLATE function like the one in Oracle and PostgreSQL.

We also added support for SELECT without FROM (equivalent to the VALUES clause, and widely used in MySQL and PostgreSQL), and added a conformance parameter to allow you to selectively enable this and other SQL features.

And, as usual, there are a couple of dozen bug-fixes and enhancements to planner rules and APIs.

See the release notes; download the release.

Cassandra Adapter

A new Apache Calcite adapter allows you to access Apache Cassandra via industry-standard SQL.

You can map a Cassandra keyspace into Calcite as a schema, Cassandra CQL tables as tables, and execute SQL queries on them, which Calcite converts into CQL. Cassandra can define and maintain materialized views but the adapter goes further: it can transparently rewrite a query to use a materialized view even if the view is not mentioned in the query.

Read more about the adapter here.

The Cassandra adapter is available as part of Apache Calcite version 1.7.0, which has just been released. Calcite also has adapters for CSV and JSON files, JDBC data sources, MongoDB, Spark and Splunk.

Release 1.7.0

Apache Calcite 1.7.0 is the first release since Avatica became an independent project. Calcite now depends on Avatica in the same way as it does other libraries, via a Maven dependency. To see Avatica-related changes, see the release notes for Avatica 1.7.1.

We have added an adapter for Apache Cassandra. You can map a Cassandra keyspace into Calcite as a schema, Cassandra CQL tables as tables, and execute SQL queries on them, which Calcite converts into CQL. Cassandra can define and maintain materialized views but the adapter goes further: it can transparently rewrite a query to use a materialized view even if the view is not mentioned in the query.

This release adds an Oracle-compatibility mode. If you add fun=oracle to your JDBC connect string, you get all of the standard operators and functions plus Oracle-specific functions DECODE, NVL, LTRIM, RTRIM, GREATEST and LEAST. We look forward to adding more functions, and compatibility modes for other databases, in future releases.

We’ve replaced our use of JUL (java.util.logging) with SLF4J. SLF4J provides an API which Calcite can use independent of the logging implementation. This ultimately provides additional flexibility to users, allowing them to configure Calcite’s logging within their own chosen logging framework. This work was done in [CALCITE-669].

For users who have previously configured JUL in Calcite, there are some differences, because some of the JUL logging levels do not exist in SLF4J: specifically FINE, FINER, and FINEST. To deal with this, FINE was mapped to SLF4J’s DEBUG level, while FINER and FINEST were mapped to SLF4J’s TRACE.
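
The mapping can be written down as a small lookup. This sketch uses only java.util.logging.Level from the JDK and returns SLF4J level names as plain strings, so it needs no SLF4J dependency; the class and method are hypothetical, and only the three levels named above are covered.

```java
import java.util.logging.Level;

final class JulToSlf4jLevelDemo {
  /**
   * Returns the SLF4J level name for the JUL levels that have no direct
   * SLF4J counterpart. Other JUL levels are outside the scope of this sketch.
   */
  static String toSlf4jLevel(Level julLevel) {
    if (julLevel.equals(Level.FINE)) {
      return "DEBUG";
    }
    if (julLevel.equals(Level.FINER) || julLevel.equals(Level.FINEST)) {
      return "TRACE";
    }
    throw new IllegalArgumentException("Not covered by this sketch: " + julLevel);
  }

  public static void main(String[] args) {
    for (Level level : new Level[] {Level.FINE, Level.FINER, Level.FINEST}) {
      System.out.println(level + " -> " + toSlf4jLevel(level));
    }
  }
}
```

In practice this means that a logger previously set to FINER or FINEST would be configured at TRACE in the chosen logging backend.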

See the release notes; download the release.

Streaming SQL in Samza

Julian Hyde gave a talk at the Apache Samza meetup in Mountain View, CA.

His talk asked the questions:

  • What is SamzaSQL, and what might I use it for?
  • Does this mean that Samza is turning into a database?
  • What is a query optimizer, and what can it do for my streaming queries?

The talk is available in [slides] and [video].

Calcite appoints Josh Elser to PMC

The Apache Calcite project management committee (PMC) today announced the appointment of Josh Elser to the committee.

Josh has only been a committer for a few months, but has become a prominent member of the Calcite project, and has taken leadership in several areas, not least in discussing the future of Avatica.

Release 1.6.0

As usual in this release, there are new SQL features, improvements to planning rules and Avatica, and lots of bug fixes. We’ll spotlight a couple of features that make it easier to handle complex queries.

[CALCITE-816] allows you to represent sub-queries (EXISTS, IN and scalar) as RexSubQuery, a kind of expression in the relational algebra. Until now, the sql-to-rel converter was burdened with expanding sub-queries, and people creating relational algebra directly (or via RelBuilder) could only create ‘flat’ relational expressions. Now we have planner rules to expand and de-correlate sub-queries.

Metadata is the fuel that powers query planning. It includes traditional query-planning statistics such as cost and row-count estimates, but also information such as which columns form unique keys, and what predicates are known to apply to a relational expression’s output rows. From the predicates we can deduce which columns are constant, and following [CALCITE-1023] we can now remove constant columns from GROUP BY keys.

Metadata is often computed recursively, and it is hard to safely and efficiently calculate metadata on a graph of RelNodes that is large, frequently cyclic, and constantly changing. [CALCITE-794] introduces a context to each metadata call. That context can detect cyclic metadata calls and produce a safe answer to the metadata request. It will also allow us to add finer-grained caching and further tune the metadata layer.
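
The general technique of passing a context through recursive calls to detect cycles can be sketched as follows. This is not Calcite's metadata API: Node, Context, rowCount and the fallback value are hypothetical stand-ins, and the "safe answer" here is simply a contribution of zero rows from the cyclic branch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class CyclicMetadataDemo {
  /** Hypothetical graph node whose row count is its own rows plus those of its inputs. */
  static final class Node {
    final double ownRows;
    final List<Node> inputs = new ArrayList<>();

    Node(double ownRows) {
      this.ownRows = ownRows;
    }
  }

  /** Per-request context that remembers which nodes are currently being evaluated. */
  static final class Context {
    final Set<Node> active =
        Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
  }

  /** Assumed fallback used when a cycle is detected. */
  static final double CYCLE_FALLBACK = 0d;

  static double rowCount(Node node, Context context) {
    if (!context.active.add(node)) {
      // The node is already on the call stack, so this is a cyclic call.
      return CYCLE_FALLBACK;
    }
    try {
      double total = node.ownRows;
      for (Node input : node.inputs) {
        total += rowCount(input, context);
      }
      return total;
    } finally {
      context.active.remove(node); // leave the context clean for sibling calls
    }
  }

  public static void main(String[] args) {
    Node a = new Node(10);
    Node b = new Node(5);
    a.inputs.add(b);
    b.inputs.add(a); // deliberately create a cycle
    System.out.println(rowCount(a, new Context()));
  }
}
```

Without the context, the call on the cyclic graph above would recurse until the stack overflowed; with it, the recursion stops the second time a node is entered.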

See the release notes; download the release.

Release 1.5.0

This is our first release as a top-level Apache project! Thanks to everyone who has contributed to it.

In addition to a large number of bug fixes and minor enhancements, this release includes major improvements to Avatica, planner rules, and RelBuilder.

Further, we built Piglet, a subset of the classic Hadoop language Pig. Pig is particularly interesting because it makes heavy use of nested multi-sets. You can follow this example to implement your own query language, and immediately take advantage of Calcite’s back-ends and optimizer rules.

See the release notes; download the release.

Calcite Graduates

On October 21st, 2015 the board of the Apache Software Foundation voted to establish Calcite as a top-level Apache project.

Describing itself as “the foundation for your next high-performance database”, Calcite is a framework for building data management systems. Calcite includes a comprehensive implementation of relational algebra and an extensible cost-based query optimizer. It also includes an optional SQL parser and JDBC driver.

Calcite joined Apache as an incubator project in May, 2014. To graduate from the incubator, projects have to prove that they can create high quality releases, form a diverse community, and operate as a meritocracy.

Calcite’s committers have delivered eight releases during incubation (roughly one every two months) including the milestone 1.0 release in January, 2015.

The project has become a key component in many high-performance databases, including the Apache Drill, Apache Hive, Apache Kylin and Apache Phoenix open source projects, and several commercial products.

Also, in collaboration with Apache Samza and Apache Storm, Calcite is developing streaming extensions to standard SQL.

The Calcite community met at a hangout on October 27th, 2015, and celebrated with a graduation cake.

XLDB 2015 best lightning talk

Julian Hyde’s talk Apache Calcite: One planner fits all won Best Lightning Talk at the XLDB-2015 conference (with Eric Tschetter’s talk “Sketchy Approximations”).

XLDB is an annual conference that brings together experts from science, industry and academia to find practical solutions to problems involving extremely large data sets.

As a result of winning Best Lightning Talk, Julian will get a 30 minute keynote speaking slot at XLDB-2016.

The talk is available in slides and video.

Algebra builder

Calcite’s foundation is a comprehensive implementation of relational algebra (together with transformation rules, cost model, and metadata) but to create algebra expressions you had to master a complex API.

We’re solving this problem by introducing an algebra builder, a single class with all the methods you need to build any relational expression.

For example,

// Supplied by your application; its default schema must contain the EMP table.
final FrameworkConfig config;
final RelBuilder builder = RelBuilder.create(config);
final RelNode node = builder
  .scan("EMP")
  .aggregate(builder.groupKey("DEPTNO"),
      builder.count(false, "C"),
      builder.sum(false, "S", builder.field("SAL")))
  .filter(
      builder.call(SqlStdOperatorTable.GREATER_THAN,
          builder.field("C"),
          builder.literal(10)))
  .build();
System.out.println(RelOptUtil.toString(node));

creates the algebra

LogicalFilter(condition=[>($1, 10)])
  LogicalAggregate(group=[{7}], C=[COUNT()], S=[SUM($5)])
    LogicalTableScan(table=[[scott, EMP]])

which is equivalent to the SQL

SELECT deptno, count(*) AS c, sum(sal) AS s
FROM emp
GROUP BY deptno
HAVING count(*) > 10

The algebra builder documentation describes the full API and has lots of examples.

We’re still working on the algebra builder, but plan to release it with Calcite 1.4 (see [CALCITE-748]).

The algebra builder will make some existing tasks easier (such as writing planner rules), but will also enable new things, such as writing applications directly on top of Calcite, or implementing non-SQL query languages. These applications and languages will be able to take advantage of Calcite’s existing back-ends (including Hive-on-Tez, Drill, MongoDB, Splunk, Spark, JDBC data sources) and extensive set of query-optimization rules.
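
To give a feel for why a single fluent class can cover so many cases, here is a minimal, self-contained sketch of a stack-based builder. It is a toy model for illustration only and not Calcite's RelBuilder: the class MiniRelBuilder, its Node type, and the string-based conditions are all hypothetical. The one idea it borrows is that each method pops its inputs from a stack and pushes the node it creates, so a chain of calls builds the tree bottom-up.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy stack-based plan builder. Hypothetical; NOT Calcite's RelBuilder. */
class MiniRelBuilder {
  /** A plan node: a textual description plus its input nodes. */
  static final class Node {
    final String description;
    final List<Node> inputs;

    Node(String description, Node... inputs) {
      this.description = description;
      this.inputs = Arrays.asList(inputs);
    }

    /** Renders the tree, indenting each input two spaces deeper. */
    String explain() {
      StringBuilder sb = new StringBuilder();
      explain(sb, 0);
      return sb.toString();
    }

    private void explain(StringBuilder sb, int depth) {
      for (int i = 0; i < depth; i++) {
        sb.append("  ");
      }
      sb.append(description).append('\n');
      for (Node input : inputs) {
        input.explain(sb, depth + 1);
      }
    }
  }

  /** Intermediate expressions; the top of the stack is the latest one. */
  private final Deque<Node> stack = new ArrayDeque<>();

  /** Leaf: pushes a table scan. */
  MiniRelBuilder scan(String table) {
    stack.push(new Node("Scan(" + table + ")"));
    return this;
  }

  /** Unary: pops one input and pushes a filter over it. */
  MiniRelBuilder filter(String condition) {
    stack.push(new Node("Filter(" + condition + ")", stack.pop()));
    return this;
  }

  /** Unary: pops one input and pushes an aggregate over it. */
  MiniRelBuilder aggregate(String groupKey, String... aggCalls) {
    String description =
        "Aggregate(group=" + groupKey + ", " + String.join(", ", aggCalls) + ")";
    stack.push(new Node(description, stack.pop()));
    return this;
  }

  /** Binary: pops two inputs (right first) and pushes a join of them. */
  MiniRelBuilder join(String condition) {
    Node right = stack.pop();
    Node left = stack.pop();
    stack.push(new Node("Join(" + condition + ")", left, right));
    return this;
  }

  /** Pops and returns the finished tree. */
  Node build() {
    return stack.pop();
  }

  public static void main(String[] args) {
    Node plan = new MiniRelBuilder()
        .scan("EMP")
        .aggregate("DEPTNO", "C=COUNT()", "S=SUM(SAL)")
        .filter("C > 10")
        .build();
    System.out.print(plan.explain());
  }
}
```

Running main prints a three-level tree with the filter at the root, the aggregate beneath it, and the scan as the leaf, mirroring the shape of the plan shown earlier. Because binary operators such as join simply pop two entries, the same fluent style extends to multi-input expressions without any extra plumbing.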

If you have questions or comments, please post to the mailing list.

Calcite adds 5 committers

The Calcite project management committee today added five new committers for their work on Calcite. Welcome all!

  • Aman Sinha
  • Jesús Camacho-Rodríguez
  • Jinfeng Ni
  • John Pullokkaran
  • Nick Dimiduk

Release 1.2.0 Incubating

A short release, less than a month after 1.1.

There have been many changes to Avatica, hugely improving its coverage of the JDBC API and overall robustness. A new provider, JdbcMeta, allows you to expose an existing JDBC driver for remote access.

[CALCITE-606] improves how the planner propagates traits such as collation and distribution among relational expressions.

[CALCITE-613] and [CALCITE-307] improve implicit and explicit conversions in SQL.

See the release notes; download the release.

Release 1.1.0 Incubating

This Calcite release makes it possible to exploit physical properties of relational expressions to produce more efficient plans. It introduces collation and distribution as traits, the Exchange relational operator, and several new forms of metadata.

We add experimental support for streaming SQL.

This release drops support for JDK 1.6; Calcite now requires 1.7 or later.

We have introduced static create methods for many sub-classes of RelNode. We strongly suggest that you use these rather than calling constructors directly.

See the release notes; download the release.

Release 1.0.0 Incubating

Calcite’s first major release.

Since the previous release we have re-organized the code into the org.apache.calcite namespace. To make migration of your code easier, we have described the mapping from old to new class names as an attachment to [CALCITE-296].

The release adds SQL support for GROUPING SETS, EXTEND, UPSERT and sequences; a remote JDBC driver; improvements to the planner engine and built-in planner rules; and improvements to the algorithms that implement the relational algebra, including an interpreter that can evaluate queries without compilation. It also fixes about 30 bugs.

See the release notes; download the release.

Release 0.9.2 Incubating

A fairly minor release, and the last release before we rename all of the packages and lots of classes, in what we expect to call 1.0. If you have an existing application, it’s worth upgrading to this first, before you move on to 1.0.

See the release notes; download the release.

Calcite Twitter

The official @ApacheCalcite Twitter account pushes announcements about Calcite. If you give a talk about Calcite, let us know and we'll tweet it out and add it to the news section of the website.