Morel release 0.7.0
I am pleased to announce Morel release 0.7.0, just one month after release 0.6.0.
This release has actually been under development for a long time.
Ordered and unordered collections and queries,
which are the centerpiece of this release, required major changes to
the type inference algorithm, not to mention a new
data type (bag), query step (unorder), and expression (ordinal).
The type inference changes have been under development for six months
(during which time there were two other Morel releases), and were so
extensive that we got function overloading practically for free.
There are other changes to query syntax: sorting on expressions,
atomic yield steps, and set operators in pipelines.
Morel aims to be a solid implementation of Standard ML and a good
general-purpose programming language, in addition to being a
revolutionary query language, which means gradually completing our
implementation of Standard ML’s Basis Library. In this release we
have completed the String and Char structures.
Let’s explore the key features. For complete details, see the official release notes.
1. Ordered and unordered collections and queries
The biggest change in 0.7.0 is the introduction of
ordered and unordered collections and queries.
Previously, every query was over a list type, whose elements were
ordered and in which duplicates were allowed.
But saying that every collection and query is over a list type
is a white lie. Consider this query:
from e in scott.emps
where e.sal > 1000.0
yield e.ename;
The collection scott.emps maps to the EMP table in the scott
database, and Morel’s goal is to push as much of the processing as
possible to where the data resides. In this case, Morel can generate
the SQL query
SELECT ENAME
FROM SCOTT.EMP
WHERE SAL > 1000.0;
SQL makes no guarantees about the order of results. If you execute
the query twice, a DBMS is free to return the results in a different
order each time. So Morel is being dishonest if it says that the result
is a list.
Could we redefine list so that its iteration order is undefined?
Yes, but then we would be short-changing queries such as
from i in ["a", "b"],
j in [1, 2, 3]
yield (i, j);
> val it = [("a",1),("a",2),("a",3),("b",1),("b",2),("b",3)]
> : (string * int) list
which do have a defined order.
The fact is – even though the relational model tells us it ain’t so
– some data sets are ordered, and some are unordered. Adding distinct
bag and list types, relational operators that can work on both,
and relational operators to convert between them, was the way to go.
The features that we implemented are described in the article “Ordered and unordered data”.
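As a rough sketch of how the two collection types interact, the unorder step turns a list into a bag, and an order step turns a bag back into a list. The result types noted in the comments are what the description above leads one to expect, not captured output.

```sml
(* A query over a list literal is ordered, so the result is a list. *)
from i in [3, 1, 2]
  where i > 1;

(* "unorder" discards the ordering; the result should be an "int bag". *)
from i in [3, 1, 2]
  unorder;

(* "order" imposes an ordering again; the result should be an "int list". *)
from i in [3, 1, 2]
  unorder
  order i;
```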
2. Function overloading
In Standard ML, and in Morel until recently, a name could only have
one binding. Functions are values, and therefore inhabit the same
namespace as regular values. If I declare x to be an int value
val x = 42;
and then later try to declare x to be a function
val x = fn y => y + 1;
then the previous declaration of x is no longer accessible.
val z = x - 2;
> 0.0-0.0 Error: Cannot deduce type: conflict: fn(int, int) vs int
> raised at: 0.0-0.0
To create overloaded functions, we need to declare that an identifier
is special; we do this using the new over keyword:
over f;
> over f
Now we can define several instances of f:
val inst f = fn (x : int, y : int) => x + y;
> val f = fn : int * int -> int
val inst f = fn list => length list;
> val f = fn : 'a list -> int
val inst f = fn SOME x => x ^ "!" | NONE => ":(";
> val f = fn : string option -> string
All instances must be functions, because the overloads are resolved based on the type of the first argument.
Calls to f will be resolved based on the types of the arguments:
(* Call the "int * int -> int" overload. *)
f (7, 8);
> val it = 15 : int
(* Call the "'a list -> int" overload. *)
f ["a", "b", "c"];
> val it = 3 : int
f [1, 2, 3, 4];
> val it = 4 : int
f [];
> val it = 0 : int
(* Call the "string option -> string" overload. *)
f (SOME "happy");
> val it = "happy!" : string
f NONE;
> val it = ":(" : string
(* No overloads match "int option" or "int * int * int" arguments. *)
f (SOME 42);
> 0.0-0.0 Error: Cannot deduce type: no valid overloads
> raised at: 0.0-0.0
f (1, 2, 3);
> 0.0-0.0 Error: Cannot deduce type: no valid overloads
> raised at: 0.0-0.0
3. Sorting on expressions
There are only a few places in Morel syntax where you do not use an
expression, and the order step used to be one of them. Previously,
order was followed by a list of “order items”, each an expression
optionally followed by desc. The items were separated by commas, and
the list could not be empty.
The commas were a problem. In the expression
foo (from i in [1, 2, 3] order i desc, j);
it is not clear whether j is a second argument for the call to the
function foo or the second item in the order clause.
Another problem was the fact that the order clause could not be
empty. The ordered/unordered collections feature introduced an
unorder step to convert a list to a bag, and we needed the opposite
of that, a trivial sort whose key has the same value for every element.
The answer was to make the argument to order an expression.
A composite sort specification is now a tuple, still separated by
commas, but now enclosed in parentheses. If a sort key is descending,
you now wrap it in the Descending data type by preceding it with
DESC. Thus:
(* Old syntax *)
from e in scott.emps
order e.job, e.sal desc;
(* New syntax *)
from e in scott.emps
order (e.job, DESC e.sal);
You can now sort by any data type, including tuples, records,
sum-types such as Option and Descending, lists, bags, and any
combination thereof.
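The same rules should apply to in-memory collections. The following sketch uses made-up list literals rather than the scott database. The row order noted for the first query follows from the sort key; no output is shown for the second query because the relative position of NONE and SOME depends on how Morel orders the option type.

```sml
(* Sort by i descending, then by j ascending.
   Rows with i = 2 should come first, and within each value
   of i, "a" should come before "b". *)
from i in [1, 2],
    j in ["b", "a"]
  order (DESC i, j);

(* Sort by a value of a sum type. *)
from x in [SOME 2, NONE, SOME 1]
  order x;
```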
To achieve the trivial sort, you can sort by any constant value, such
as the integer 0 or the Option constructor NONE, but
conventionally you would sort by the empty tuple ():
from e in scott.emps
yield e.ename
order ();
> val it =
> ["SMITH","ALLEN","WARD","JONES","MARTIN","BLAKE","CLARK",
> "SCOTT","KING","TURNER","ADAMS","JAMES","FORD","MILLER"]
> : string list
The key thing is that the result is a list. The elements are in
arbitrary order (because any order is consistent with the empty sort
key) but in converting the collection to a list the arbitrary order
has become frozen and repeatable.
4. Atomic yield steps
At any step in a Morel query, there are generally several named fields
you can use to reference parts of the current row. For example, the
where step in the following query refers to both fields, i and j.
from i in [1, 2, 3],
j in [4, 5, 6]
where i + j > 7;
> i j
> - -
> 2 6
> 3 5
> 3 6
>
> val it : {i:int, j:int} list
But there is one circumstance where a step does not produce any named
fields: a yield whose expression is not a record, what we call an
“atomic yield”. Here is an example:
from i in [1, 2, 3],
j in [4, 5, 6]
yield i + j;
That query is valid, but suppose we wished to sort or filter the
results. If we added an order or where step it would have no way
to refer to the current row. We allowed atomic yields because we
needed queries with non-record elements, but we made a rule that the
atomic yield had to be the last step.
That restriction was becoming more of a burden, and the final straw
was ordered/unordered queries, which often end in order or unorder.
So we decided to fix the problem.
We added a new expression, current, that refers to the current element.
(It is only available in query steps, but you can use it inside a
sub-expression or sub-query.) If the value is atomic, current is that
value; if there are named fields, current is a record consisting of
those fields. (In the where example above, which has fields i and j,
current would be equivalent to {i, j}.)
If a yield is atomic but the expression has a clear name, as in
yield i or yield e.deptno, you can also use that name. (The
expression is still considered atomic, and the result of the query
will be a collection of that type, not a collection of records.)
Here are some examples of current in action.
from i in [1, 2, 3],
j in [4, 5, 6]
yield i + j
order DESC current;
> val it = [9,8,8,7,7,7,6,6,5] : int list
from maker in ["ford", "ferrari"],
color in ["red", "green"]
order current.color;
> color maker
> ----- -------
> green ford
> green ferrari
> red ford
> red ferrari
>
> val it : {color:string, maker:string} list
from i in [1, 2, 3, 4]
yield 4 * (i mod 2) + (i div 2)
order current;
> val it = [1,2,4,5] : int list
from e in scott.emps
yield e.deptno
distinct
order current;
> val it = [10,20,30] : int list
from e in scott.emps
yield e.deptno
distinct
order deptno;
> val it = [10,20,30] : int list
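The examples above all sort on current. Filtering should work the same way, since current is described as available in any query step. The sketch below adds a where step after the atomic yield. Given the result of the unfiltered query shown earlier, [9,8,8,7,7,7,6,6,5], the filtered query should keep only 9, 8 and 8.

```sml
(* Filter, then sort, on the atomic value produced by the yield step. *)
from i in [1, 2, 3],
    j in [4, 5, 6]
  yield i + j
  where current > 7
  order DESC current;
```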
5. Set operators in pipelines
The set operators (union, intersect and except) were previously
available via functions but now have dedicated steps in the query
pipeline. The steps have slightly different semantics for ordered and
unordered collections, and have an optional distinct keyword to
eliminate duplicates.
For example, here is a query that finds all employees in departments 10 and 20, but excludes those who are managers or clerks:
from e in scott.emps
where e.deptno = 10
union (from e in scott.emps where e.deptno = 20)
except (from e in scott.emps where e.job = "MANAGER"),
(from e in scott.emps where e.job = "CLERK");
If you have ever wondered about the semantics of intersect and
except with duplicates, wonder no more! The article
“INTERSECT ALL, EXCEPT ALL, and the arithmetic of fractions”
explains everything using a fun example.
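A small way to experiment with these steps is to run them over list literals. This is a sketch only: it assumes that a list literal is accepted as the argument of a set step in the same way as the parenthesized queries in the employees example, and that the distinct keyword directly follows the operator name. No output is shown, because the treatment of duplicates and ordering is exactly what the linked article covers.

```sml
(* Elements that appear in both collections, duplicates retained. *)
from i in [1, 2, 2, 3]
  intersect [2, 2, 4];

(* The same, with duplicates eliminated. *)
from i in [1, 2, 2, 3]
  intersect distinct [2, 2, 4];

(* Steps can be chained, as in the employees example. *)
from i in [1, 2, 3]
  union [3, 4]
  except [1];
```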
6. String and Char structures
Morel now includes complete String and Char structures following the
Standard ML Basis Library specification.
This gives you comprehensive text manipulation capabilities:
String.size "hello world";
> val it = 11 : int
String.substring ("hello world", 6, 5);
> val it = "world" : string
String.tokens (fn c => c = #" ") "hello world morel";
> val it = ["hello","world","morel"] : string list
Char.isAlpha #"a";
> val it = true : bool
Char.toUpper #"a";
> val it = #"A" : char
String.map Char.toUpper "hello";
> val it = "HELLO" : string
These structures provide everything you need for serious text processing, from basic operations like substring extraction to advanced features like tokenization and character classification.
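A few more functions from the same two structures, as a sketch. The values in the comments are what the Standard ML Basis Library specification defines for these calls, and Morel should match them if its implementation follows the specification as stated.

```sml
(* "fields" keeps empty fields; "tokens" skips them. *)
String.fields (fn c => c = #",") "a,,b";    (* ["a","","b"] *)
String.tokens (fn c => c = #",") "a,,b";    (* ["a","b"] *)

(* Join strings with a separator. *)
String.concatWith ", " ["a", "b", "c"];     (* "a, b, c" *)

(* Test whether the first string is a prefix of the second. *)
String.isPrefix "he" "hello";               (* true *)

(* Replace each character with a string. *)
String.translate
  (fn c => if c = #" " then "_" else String.str c)
  "hello world";                            (* "hello_world" *)

(* Character classification and conversion. *)
Char.isDigit #"7";                          (* true *)
Char.ord #"A";                              (* 65 *)
```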
7. Breaking changes
This release includes some breaking changes to be aware of.
Database schema updates
The scott sample database now uses pluralized names for its
collections: the emps value maps to the EMP table, and depts to the
DEPT table.
(* Old *)
from e in scott.emp
join d in scott.dept on e.deptno = d.deptno;
(* New *)
from e in scott.emps
join d in scott.depts on e.deptno = d.deptno;
This change aligns with the modern programming convention that collections have plural names.
Type-based orderings
The previous order syntax no longer works.
You should convert a following desc to a preceding DESC:
(* Old syntax *)
from e in scott.emps
order e.sal desc;
(* New syntax *)
from e in scott.emps
order DESC e.sal;
and put parentheses around composite orderings:
(* Old syntax *)
from e in scott.emps
order e.job, e.sal desc;
(* New syntax *)
from e in scott.emps
order (e.job, DESC e.sal);
Conclusion
Release 0.7.0 represents a major evolution in Morel’s capabilities. Extensions to the query language, type system, and standard library make Morel a good solution for a wide range of data processing tasks, from simple queries to complex data transformations.
As always, you can get started with Morel by visiting GitHub. For more background, read about its goals and basic language, and find a full definition of the language in the query reference and the language reference.
If you have comments, please reply on Bluesky @julianhyde.bsky.social or Twitter:
I'm pleased to announce release 0.7 of @morel_lang! This is a huge release, adding support for ordered/unordered data, set operators, and revised order syntax. A major rework of Morel's type inference algorithm delivered function overloading. https://t.co/hERffT3Kxn
— Julian Hyde (@julianhyde) June 9, 2025