This release comes 5 months after 1.37.0,
contains contributions from 39 contributors, and resolves 165 issues.
Highlights include the
AS MEASURE
clause to define measures and use them in
simple queries,
ASOF join,
the
EXCLUDE
clause in window aggregates, Postgres-compatible implementations of the
TO_DATE, TO_TIMESTAMP
and
TO_CHAR
functions, and the extension of the type system to allow
types with negative scale.
This release comes 5 months after 1.36.0,
contains contributions from 46 contributors, and resolves 138 issues. It’s worth highlighting the
introduction of an adapter for Apache Arrow ([CALCITE-2040]) and
a StarRocks dialect ([CALCITE-6257]).
The release also added support for lambda expressions in SQL ([CALCITE-3679]) and
‘Must-filter’ columns ([CALCITE-6219]).
Table function calls can now be used in FROM without the TABLE() wrapper ([CALCITE-6254]).
Furthermore, there is support for the optional FORMAT clause of the CAST operator from SQL:2016 ([CALCITE-6254])
and more than 15 new SQL functions in various libraries such as BigQuery, PostgreSQL and Spark.
This release comes 3 months after 1.35.0,
contains contributions from 30 contributors, and resolves 125 issues.
Among other new features, it’s worth highlighting the addition of 30 new SQL functions in various libraries such as BigQuery and Spark, many improvements hardening TABLESAMPLE, and also the following features:
This release comes 1 month after 1.33.0,
contains contributions from 18 contributors, and resolves 34 issues.
It’s worth highlighting the introduction of the QUALIFY clause ([CALCITE-5268]),
which facilitates filtering the results of window functions. Among other improvements and fixes, it
adds roughly 15 new functions in the BigQuery library for handling dates, times, and timestamps, and
provides a fix ([CALCITE-5522])
for a small breaking change in the DATE_TRUNC function ([CALCITE-5447]), which was
introduced accidentally in 1.33.0.
This release
fixes CVE-2022-39135,
an XML External Entity (XEE) vulnerability that allows a SQL query to
read the contents of files via the SQL functions EXISTS_NODE,
EXTRACT_XML, XML_TRANSFORM or EXTRACT_VALUE.
In 1.28, Calcite converted the recently introduced
configuration system
from an internal system based on
ImmutableBeans
to instead use the Immutables
annotation processor. This library brings a large number of additional
features that should make value-type classes in Calcite easier to
build and leverage. It also reduces reliance on dynamic proxies, which
should improve performance and reduce memory footprint. Lastly, this
change increases compatibility with ahead-of-time compilation
technologies such as GraalVM. As part of
this change, a number of minor changes have been made and key methods
and classes have been deprecated. The change was designed to minimize
disruption to existing consumers of Calcite but the following minor
changes needed to be made:
The
RelRule.Config.EMPTY
field is now deprecated. To create a new configuration subclass, you
can either use your preferred interface-implementation based
construction or you can leverage Immutables. To do the latter,
configure your project
to use the Immutables annotation processor and annotate your
subclass with the
@Value.Immutable
annotation.
Where RelRule.Config subclasses were nested 2+ classes deep, the
interfaces have been marked deprecated and are superseded by new,
uniquely named interfaces. The original Configs extend the new
uniquely named interfaces. Subclassing these works as before and the
existing rule signatures accept any previously implemented Config
implementations. However, this is a breaking change if a user stored
an instance of the DEFAULT object using the Config class name (as
the DEFAULT instance now only implements the uniquely named
interface).
The RelRule.Config.as() method should only be used for safe
downcasts. Before, it could do arbitrary casts. The exception is
that arbitrary as() will continue to work when using the
deprecated RelRule.Config.EMPTY field. In most cases, this should
be a non-breaking change. However, all Calcite-defined DEFAULT
rule config instances use Immutables. As such, if one had previously
subclassed a RelRule.Config subclass and then used the DEFAULT
instance from that subclass, the as() call will no longer work to
coerce the DEFAULT instance into an arbitrary subclass. In essence,
outside the EMPTY use, as() is now only safe to do if a Java
cast is also safe.
ExchangeRemoveConstantKeysRule.Config and
ValuesReduceRule.Config now declare concrete bounds for their
matchHandler configuration. This is a breaking change if one did not
use the Rule as a bounding variable.
Collections used in Immutables value classes will be converted to
immutable collection types even if the passed-in parameter is
mutable (such as an ArrayList). As such, consumers of those
configuration properties cannot mutate the returned collections.
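The last two behaviours can be sketched in plain Java. This is only an illustration under stated assumptions: SketchConfig, CustomConfig and ImmutableSketchConfig are hypothetical stand-ins, not Calcite's RelRule.Config or any class generated by Immutables. The sketch shows an as() that succeeds exactly when a Java cast would, and a value class that copies a mutable list into an unmodifiable one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a rule configuration; not Calcite's RelRule.Config. */
interface SketchConfig {
  /** Safe downcast only: succeeds exactly when a plain Java cast would. */
  default <T> T as(Class<T> type) {
    return type.cast(this);
  }
}

/** Hypothetical user-defined sub-interface that the default instance does not implement. */
interface CustomConfig extends SketchConfig {
  List<String> names();
}

/** Hand-written immutable value class, similar in spirit to a generated one. */
final class ImmutableSketchConfig implements SketchConfig {
  private final List<String> names;

  ImmutableSketchConfig(List<String> names) {
    // Defensive copy: the stored list is unmodifiable even if the argument was mutable.
    this.names = List.copyOf(names);
  }

  List<String> names() {
    return names;
  }
}

final class ConfigSketch {
  /** Returns true if config.as(type) succeeds, false if it throws ClassCastException. */
  static boolean canCoerce(SketchConfig config, Class<?> type) {
    try {
      config.as(type);
      return true;
    } catch (ClassCastException e) {
      return false;
    }
  }

  /** Returns true if the list rejects mutation (a mutable list would gain an element). */
  static boolean isReadOnly(List<String> list) {
    try {
      list.add("x");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    List<String> mutable = new ArrayList<>(List.of("a", "b"));
    ImmutableSketchConfig config = new ImmutableSketchConfig(mutable);
    System.out.println(canCoerce(config, SketchConfig.class)); // a Java cast would work
    System.out.println(canCoerce(config, CustomConfig.class)); // a Java cast would fail
    System.out.println(isReadOnly(config.names()));
  }
}
```

Code that needs to change a collection obtained from a configuration would copy it first, for example with new ArrayList<>(config.names()).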
This release comes eight months after 1.26.0.
It includes more than 150 resolved
issues, comprising a few new features, three minor breaking changes, many bug-fixes and small
improvements, as well as code quality enhancements and better test coverage.
Among others, it is worth highlighting the following:
On January 20, we are organising an online meetup for Apache
Calcite.
The main purpose is to bring the community together allowing newcomers and senior members to interact and exchange ideas
on various topics.
During the occasion we will have a few presentations covering introductory Calcite concepts, recent & ongoing work on
streams, spatial query implementation, and integration of Calcite in Hazelcast, followed by an open discussion and
a virtual key signing party.
Warning: Calcite 1.26.0 has severe issues with RexNode simplification caused by the SEARCH operator
(wrong data from query optimization, as in CALCITE-4325 and
CALCITE-4352, and NullPointerException),
so use 1.26.0 for development only, and beware that Calcite 1.26.0 might corrupt your data.
This release comes about two months after 1.25.0 and includes more than 70 resolved
issues, comprising a lot of new features and bug-fixes. Among others, it is worth highlighting the following.
This release comes about one month after 1.24.0 and removes methods
which were deprecated in the previous version. In addition, notable improvements in
this release are:
This release comes about two months after 1.23.0. It includes more than 80 resolved
issues, comprising a lot of new features as well as performance improvements
and bug-fixes. Among others, it is worth highlighting the following.
This release comes two months after 1.22.0. It includes more than 100 resolved
issues, comprising a lot of new features as well as performance improvements
and bug-fixes. For some complex queries, planning with the built-in default rule set can be 50x
faster, or more, than in previous versions. It is also worth
highlighting that Calcite now:
Supports top down trait request and trait enforcement without abstract converter
(CALCITE-3896)
Improves VolcanoPlanner performance by removing rule match and subset importance
(CALCITE-3753)
Improves VolcanoPlanner performance when abstract converter is enabled
(CALCITE-2970)
This release comes five months after 1.21.0. It includes more than 250 resolved issues, comprising a large number of new features as well as general improvements and bug-fixes. Among others, it is worth highlighting the following.
Supports SQL hints for different kinds of relational expressions (CALCITE-482)
This release comes two months after 1.20.0. It includes more than 100 resolved
issues, comprising a large number of new features as well as general improvements
and bug-fixes.
It is worth highlighting that Calcite now:
supports implicit type coercion in various contexts
(CALCITE-2302);
allows transformations of Pig Latin scripts into algebraic plans
(CALCITE-3122);
provides an implementation for the main features of MATCH_RECOGNIZE in the
Enumerable convention
(CALCITE-1935);
This release comes three months after 1.19.0. It includes more than 130 resolved issues, comprising a few new features as well as general improvements and bug-fixes.
It includes support for anti-joins, recursive queries, new functions, a new adapter, and many more bug fixes and improvements.
This release comes three months after 1.18.0. It includes more than 80 resolved issues, comprising a few new features as well as general improvements and bug-fixes. Among others, there have been significant improvements in JSON query support.
This release comes four months after 1.16.0. It includes more than 90 resolved
issues, comprising a large number of new features as well as general improvements
and bug-fixes. Among others:
This release comes three months after 1.15.0. It includes more than 80 resolved
issues, comprising a large number of new features as well as general improvements
and bug-fixes to Calcite core. Among others:
Calcite has been upgraded to use
Avatica 1.11.0,
which was recently released.
The Apache Calcite PMC
is pleased to announce
Apache Calcite release 1.15.0.
In this release, three months after 1.14.0, 50 issues are fixed by 22
contributors. Among more modest improvements and bug-fixes, here are
some features of note:
[CALCITE-707]
adds DDL commands to Calcite for the first time, including CREATE and DROP
commands for schemas, tables, foreign tables, views, and materialized views.
We know that DDL syntax is a matter of taste, so we added the extensions to a
new “server” module, leaving the “core” parser unchanged;
[CALCITE-2061]
allows dynamic parameters in the LIMIT and OFFSET clauses;
[CALCITE-1913]
refactors the JDBC adapter to make it easier to plug in a new SQL dialect;
[CALCITE-1616]
adds a data profiler, an algorithm that efficiently analyzes large data sets
with many columns, estimating the number of distinct values in columns and
groups of columns, and finding functional dependencies. The improved
statistics are used by the algorithm that designs summary tables for a
lattice.
Calcite now supports JDK 10 and Guava 23.0. (It continues to run on
JDK 7, 8 and 9, and on versions of Guava as early as 14.0.1. The default
version of Guava remains 19.0, the latest version compatible with JDK 7
and the Cassandra adapter’s dependencies.)
This release comes three months after 1.13.0. It includes 68 resolved issues with many improvements and bug fixes.
This release brings some big new features.
The GEOMETRY data type was added along with 35 associated functions as the start of support for Simple Feature Access.
There are also two new adapters.
First, there is the Elasticsearch 5 adapter, which now exists in parallel with the previous Elasticsearch 2 adapter.
Additionally, there is now an OS adapter which exposes operating system metrics as relational tables.
ThetaSketch and HyperUnique support has also been added to the Druid adapter.
Several minor improvements have been added as well, including improved MATCH_RECOGNIZE support, quantified comparison predicates, and ARRAY and MULTISET support for UDFs.
This release comes three months after 1.12.0. It includes more than 75 resolved issues, comprising
a large number of new features as well as general improvements and bug-fixes.
First, Calcite has been upgraded to use
Avatica 1.10.0,
which was recently released.
Moreover, Calcite core includes improvements which aim at making it more powerful, stable and robust.
In addition to numerous bug-fixes, we have implemented a
new materialized view rewriting algorithm
and new metadata providers which
should prove useful for data processing systems relying on Calcite.
In addition, more progress has been made for the different adapters.
For instance, the Druid adapter now relies on
Druid 0.10.0 and
it can generate more efficient plans where most of the computation can be pushed to Druid,
e.g., using extraction functions.
The Apache Calcite PMC
is pleased to announce further growth of its sub-project, Avatica.
Avatica has been slowly growing inside of Calcite for many years (dating back
to Optiq-0.4.x!). The team has taken the next step to hoist the Avatica code
out of the Calcite repository into its own. The team felt like this was the
next logical step given the maturity of the project.
The previous “/avatica” directory in the Calcite repository has been removed, so
further contributions should be submitted against the new repository. The de-facto
repository can be found at the ASF’s Git hosting,
with a mirrored copy also available on GitHub at apache/calcite-avatica.
Calcite now supports JDK 9 and Guava 21.0. (It continues to run on
JDK 7 and 8, and on versions of Guava as early as 14.0.1. The default
version of Guava remains 19.0, due to the Cassandra adapter’s
dependencies, and the fact that Guava 21.0 requires JDK 8 or later.)
There are two new adapters:
The File adapter
can read files of various formats (such as CSV, JSON, zipped files,
and HTML) over various protocols (including file and HTTP). If
reading HTML files, it can extract data from nested <TABLE>
elements.
And there are continuing improvements in performance and stability of
the Druid adapter. (The Druid project now
embeds Calcite to provide SQL support,
and there has been cross-fertilization between the projects.)
To err is human, as the saying goes. If you mis-type the name of a
schema, table or column in a SQL statement, Calcite now
helps you correct it.
The error message indicates whether it was the schema,
table or column that was not found; if the mistake was just due to an
upper- or lower-case letter, it suggests the correct name.
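The case-only suggestion can be illustrated with a small sketch. It is not Calcite's validator code: the class, method names and message wording below are hypothetical, and the sketch only covers the situation described above, where the typed name differs from an existing name solely by letter case.

```java
import java.util.List;
import java.util.Optional;

final class NameSuggestionSketch {
  /** Returns a candidate that differs from the typed name only by case, if there is one. */
  static Optional<String> suggest(String typed, List<String> candidates) {
    for (String candidate : candidates) {
      if (!candidate.equals(typed) && candidate.equalsIgnoreCase(typed)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  /** Builds an illustrative message; the wording is hypothetical, not Calcite's. */
  static String notFoundMessage(String kind, String typed, List<String> candidates) {
    return suggest(typed, candidates)
        .map(s -> kind + " '" + typed + "' not found; did you mean '" + s + "'?")
        .orElse(kind + " '" + typed + "' not found");
  }

  public static void main(String[] args) {
    List<String> tables = List.of("EMP", "DEPT");
    System.out.println(notFoundMessage("Table", "emp", tables));
    System.out.println(notFoundMessage("Table", "bonus", tables));
  }
}
```

The kind argument ("Schema", "Table" or "Column") stands for the part of the identifier that could not be resolved.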
New SQL syntax and functions:
HOP, TUMBLE and SESSION functions in the GROUP BY clause
allow you to aggregate over window types (especially useful for
streaming queries);
Experimental support for the MATCH_RECOGNIZE clause for
Complex-Event Processing (CEP);
New YEAR, MONTH, WEEK, DAYOFYEAR, DAYOFMONTH, DAYOFWEEK,
HOUR, MINUTE, SECOND, DATABASE, IFNULL, and USER
functions to comply with the ODBC/JDBC standard. Also, EXTRACT now
allows the corresponding time-unit arguments.
Nearly three months after the previous release, there is a
long list of improvements and bug-fixes,
many of them making planner rules smarter. The following are some of
the more important ones.
Several adapters have improvements:
The JDBC adapter can now push down DML (INSERT, UPDATE, DELETE),
windowed aggregates (OVER), IS NULL and IS NOT NULL operators.
The Cassandra adapter now supports authentication.
Several key bug-fixes in the Druid adapter.
For correlated and uncorrelated sub-queries, we generate more
efficient plans (for example, in some correlated queries we no longer
require a sub-query to generate the values of the correlating
variable), can now handle multiple correlations, and have also fixed a
few correctness bugs.
New SQL syntax:
CROSS APPLY and OUTER APPLY;
MINUS as a synonym for EXCEPT;
an AS JSON option for the EXPLAIN command;
compound identifiers in the target list of INSERT, allowing you to
insert into individual fields of record-valued columns (or column
families if you are using the Apache Phoenix adapter).
A variety of new and extended built-in functions: CONVERT, LTRIM,
RTRIM, 3-parameter LOCATE and POSITION, RAND, RAND_INTEGER,
and SUBSTRING applied to binary types.
There are minor but potentially breaking API changes in
[CALCITE-1519]
(interface SubqueryConverter becomes SubQueryConverter and some
similar changes in the case of classes and methods) and
[CALCITE-1530]
(rename Shuttle to Visitor, and create a new class Visitor<R>).
See the cases for more details.
This release comes shortly after 1.9.0. It includes mainly bug fixes for the core and
Druid adapter. For the latter, we fixed an
important issue that
prevented us from consistently handling time dimensions in different time zones.
This release includes extensions and fixes for the Druid adapter. New features were
added, such as the capability to
recognize and translate Timeseries and TopN Druid queries.
Moreover, this release contains multiple bug fixes over the initial implementation of the
adapter. It is worth mentioning that most of these fixes were contributed by Druid developers,
which demonstrates the good reception of the adapter by that community.
We also added support for
SELECT without FROM
(equivalent to the VALUES clause, and widely used in MySQL and PostgreSQL),
and added a
conformance
parameter to allow you to selectively enable this and other SQL features.
And, as usual, there are a couple of dozen bug-fixes and enhancements to
planner rules and APIs.
A new Apache Calcite adapter allows you to access
Apache Cassandra via industry-standard SQL.
You can map a Cassandra keyspace into Calcite as a schema, Cassandra
CQL tables as tables, and execute SQL queries on them, which Calcite
converts into CQL.
Cassandra can define and maintain materialized views but the adapter
goes further: it can transparently rewrite a query to use a
materialized view even if the view is not mentioned in the query.
The Cassandra adapter is available as part of
Apache Calcite version 1.7.0,
which has just been released. Calcite also has
adapters
for CSV and JSON files, JDBC data sources, MongoDB, Spark and Splunk.
We have added
an adapter for
Apache Cassandra.
You can map a Cassandra keyspace into Calcite as a schema, Cassandra
CQL tables as tables, and execute SQL queries on them, which Calcite
converts into CQL.
Cassandra can define and maintain materialized views but the adapter
goes further: it can transparently rewrite a query to use a
materialized view even if the view is not mentioned in the query.
This release adds an
Oracle-compatibility mode.
If you add fun=oracle to your JDBC connect string, you get all of
the standard operators and functions plus Oracle-specific functions
DECODE, NVL, LTRIM, RTRIM, GREATEST and LEAST. We look
forward to adding more functions, and compatibility modes for other
databases, in future releases.
We’ve replaced our use of JUL (java.util.logging)
with SLF4J. SLF4J provides an API which Calcite can use
independent of the logging implementation. This ultimately provides additional
flexibility to users, allowing them to configure Calcite’s logging within their
own chosen logging framework. This work was done in
[CALCITE-669].
For users experienced with configuring JUL in Calcite previously, there are some
differences as some of the JUL logging levels do not exist in SLF4J: FINE,
FINER, and FINEST, specifically. To deal with this, FINE was mapped
to SLF4J’s DEBUG level, while FINER and FINEST were mapped to SLF4J’s TRACE.
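The level mapping described above can be written down as a small lookup. This is a sketch, not the code from CALCITE-669: the class and method names are hypothetical, SLF4J levels are represented by their names as plain strings so that the example needs no SLF4J dependency, and only the three levels named in the text are mapped.

```java
import java.util.Optional;
import java.util.logging.Level;

final class JulToSlf4jSketch {
  /**
   * Returns the SLF4J level name for the JUL levels that have no direct
   * SLF4J equivalent, or empty for any other level (not covered here).
   */
  static Optional<String> slf4jLevelName(Level julLevel) {
    if (julLevel.equals(Level.FINE)) {
      return Optional.of("DEBUG");
    }
    if (julLevel.equals(Level.FINER) || julLevel.equals(Level.FINEST)) {
      return Optional.of("TRACE");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    for (Level level : new Level[] {Level.FINE, Level.FINER, Level.FINEST}) {
      System.out.println(level + " -> " + slf4jLevelName(level).orElse("(unmapped)"));
    }
  }
}
```

In practice this means a logger that used to be set to FINER or FINEST in JUL would be set to TRACE in the chosen SLF4J backend, and one set to FINE would be set to DEBUG.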
The Apache Calcite project management committee (PMC) today announced the
appointment of Josh Elser
to the committee.
Josh has only been a committer for a few months, but has become a prominent
member of the Calcite project, and has taken leadership in several areas,
not least in discussing the future of Avatica.
As usual in this release, there are new SQL features, improvements to
planning rules and Avatica, and lots of bug fixes. We’ll spotlight a
couple of features that make it easier to handle complex queries.
[CALCITE-816]
allows you to represent sub-queries (EXISTS, IN and scalar) as
RexSubQuery,
a kind of expression in the relational algebra. Until
now, the sql-to-rel converter was burdened with expanding sub-queries,
and people creating relational algebra directly (or via
RelBuilder)
could only create ‘flat’ relational expressions. Now we have planner
rules to expand and de-correlate sub-queries.
Metadata is the fuel that powers query planning. It includes
traditional query-planning statistics such as cost and row-count
estimates, but also information such as which columns form unique
keys, and what predicates are known to apply to a relational
expression’s output rows. From the predicates we can deduce which
columns are constant, and following
[CALCITE-1023]
we can now remove constant columns from GROUP BY keys.
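The idea of dropping constant columns from GROUP BY keys can be sketched as follows. This is not the planner rule itself: the class and method names are hypothetical, and columns are plain strings rather than field ordinals. The sketch keeps at least one key, because a query with no GROUP BY keys is a global aggregate that returns one row even when the input is empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ConstantGroupKeySketch {
  /**
   * Removes group keys that are known to be constant (for example because of
   * a predicate such as "country = 'US'"), but never removes all of them.
   */
  static List<String> reduceGroupKeys(List<String> groupKeys, Set<String> constantColumns) {
    List<String> reduced = new ArrayList<>();
    for (String key : groupKeys) {
      if (!constantColumns.contains(key)) {
        reduced.add(key);
      }
    }
    if (reduced.isEmpty() && !groupKeys.isEmpty()) {
      // Keep one key so that GROUP BY does not turn into a global aggregate.
      reduced.add(groupKeys.get(0));
    }
    return reduced;
  }

  public static void main(String[] args) {
    System.out.println(reduceGroupKeys(List.of("deptno", "country"), Set.of("country")));
    System.out.println(reduceGroupKeys(List.of("country"), Set.of("country")));
  }
}
```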
Metadata is often computed recursively, and it is hard to safely and
efficiently calculate metadata on a graph of RelNodes that is large,
frequently cyclic, and constantly changing.
[CALCITE-794]
introduces a context to each metadata call. That context can detect
cyclic metadata calls and produce a safe answer to the metadata
request. It will also allow us to add finer-grained caching and
further tune the metadata layer.
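A minimal sketch of the idea, assuming a toy graph and a toy metadata function: the Node and Context classes below are hypothetical and unrelated to Calcite's RelNode or its metadata classes. The context records which nodes are currently being evaluated; when a call re-enters a node that is already in progress, it returns a fallback value (0 here, an arbitrary choice for the sketch) instead of recursing forever.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MetadataCycleSketch {
  /** Hypothetical graph node; its inputs may form cycles. */
  static final class Node {
    final String name;
    final double ownRows;
    final List<Node> inputs = new ArrayList<>();

    Node(String name, double ownRows) {
      this.name = name;
      this.ownRows = ownRows;
    }
  }

  /** Per-request context that remembers which nodes are being evaluated. */
  static final class Context {
    private final Set<Node> active = new HashSet<>();
  }

  /** Toy metadata: a node's own rows plus the rows of its inputs. */
  static double rowCount(Node node, Context context) {
    if (!context.active.add(node)) {
      // Cyclic call detected: return a safe fallback rather than recursing.
      return 0d;
    }
    try {
      double total = node.ownRows;
      for (Node input : node.inputs) {
        total += rowCount(input, context);
      }
      return total;
    } finally {
      context.active.remove(node);
    }
  }

  public static void main(String[] args) {
    Node a = new Node("a", 10);
    Node b = new Node("b", 5);
    a.inputs.add(b);
    b.inputs.add(a); // cycle: a -> b -> a
    System.out.println(rowCount(a, new Context()));
  }
}
```

Because every call passes through the same context object, it is also a natural place to hang a cache of already computed values, which is the kind of finer-grained caching the text mentions.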
This is our first release as a top-level Apache project! Thanks to everyone who has contributed to it.
In addition to a large number of bug fixes and minor enhancements, this release includes major improvements to Avatica, planner rules, and RelBuilder.
Further, we built Piglet, a subset of the classic Hadoop language Pig. Pig is particularly interesting because it makes heavy use of nested multi-sets. You can follow this example to implement your own query language, and immediately take advantage of Calcite’s back-ends and optimizer rules.
On October 21st, 2015 the board of the
Apache Software Foundation
voted to establish Calcite as a top-level Apache project.
Describing itself as “the foundation for your next high-performance
database”, Calcite is a
framework for building data management systems.
Calcite includes a comprehensive implementation of relational algebra
and an extensible cost-based query optimizer. It also includes an
optional SQL parser and JDBC driver.
Calcite joined Apache as an incubator project in May, 2014. To
graduate from the incubator, projects have to prove that they can
create high quality releases, form a diverse community, and operate as
a meritocracy.
Calcite’s committers have delivered eight releases during incubation
(roughly one every two months) including the
milestone 1.0 release in January, 2015.
The project has become a key component in many high-performance
databases, including the
Apache Drill,
Apache Hive,
Apache Kylin and
Apache Phoenix open source projects,
and several commercial products.
In addition to a large number of bug fixes and minor enhancements,
this release includes improvements to
lattices and
materialized views,
and adds a
builder API
so that you can easily create relational algebra expressions.
Julian Hyde’s talk Apache Calcite: One planner fits all won
Best Lightning Talk
at the XLDB-2015 conference (with Eric Tschetter’s talk “Sketchy
Approximations”).
XLDB is an annual conference that brings together experts from
science, industry and academia to find practical solutions to problems
involving extremely large data sets.
As a result of winning Best Lightning Talk, Julian will get a 30
minute keynote speaking slot at XLDB-2016.
Calcite’s foundation is a comprehensive implementation of relational
algebra (together with transformation rules, cost model, and metadata)
but to create algebra expressions you had to master a complex API.
We’re solving this problem by introducing an
algebra builder,
a single class with all the methods you need to build any relational
expression.
We’re still working on the algebra builder, but plan to release it
with Calcite 1.4 (see
[CALCITE-748]).
The algebra builder will make some existing tasks easier (such as
writing planner rules), but will also enable new things, such as
writing applications directly on top of Calcite, or implementing
non-SQL query languages. These applications and languages will be able
to take advantage of Calcite’s existing back-ends (including
Hive-on-Tez, Drill, MongoDB, Splunk, Spark, JDBC data sources) and
extensive set of query-optimization rules.
If you have questions or comments, please post to the
mailing list.
There have been many changes to Avatica, hugely improving its coverage of the
JDBC API and overall robustness. A new provider, JdbcMeta, allows
you to remote an existing JDBC driver.
[CALCITE-606]
improves how the planner propagates traits such as collation and
distribution among relational expressions.
This Calcite release makes it possible to exploit physical properties
of relational expressions to produce more efficient plans, introducing
collation and distribution as traits, Exchange relational operator,
and several new forms of metadata.
We add experimental support for streaming SQL.
This release drops support for JDK 1.6; Calcite now requires 1.7 or
later.
We have introduced static create methods for many sub-classes of
RelNode. We strongly suggest that you use these rather than
calling constructors directly.
Since the previous release we have re-organized the code into the org.apache.calcite
namespace. To make migration of your code easier, we have described the
mapping from old to new class names
as an attachment to
[CALCITE-296].
The release adds SQL support for GROUPING SETS, EXTEND, UPSERT and sequences;
a remote JDBC driver;
improvements to the planner engine and built-in planner rules;
improvements to the algorithms that implement the relational algebra,
including an interpreter that can evaluate queries without compilation;
and fixes about 30 bugs.
A fairly minor release, and the last release before we rename all of the
packages and lots of classes, in what we expect to call 1.0. If you
have an existing application, it’s worth upgrading to this first,
before you move on to 1.0.
The official @ApacheCalcite
Twitter account pushes announcements about Calcite. If you give a talk about
Calcite, let us know and we'll tweet it out and add it to the news section
of the website.