feat: Add Transformer class and bin_expand function #30

Merged: 3 commits into main from pr_binexpand, Feb 28, 2023

Conversation

@oscarbenjamin (Owner) commented Feb 28, 2023

There are several things mixed up here:

  • Add a bin_expand method to Expr.
  • Make a new class Transformer as an alternative to Evaluator
  • Small refactor of Evaluator to have an eval_atom method and rename call to eval_operation.
  • Add a protosym.core.exceptions module to define exception types.
  • Add an XFAIL test for printing f(x) in simplecas.

A precursor to adding lambdify with LLVM is having a bin_expand method that can turn an associative operation into a sequence of binary operations, e.g. (x + y + z) -> ((x + y) + z). This is because LLVM IR only defines binary operations. Actually it is currently difficult in protosym to create a flattened Add like (x + y + z) because no flatten function is provided and the natural operations like + build nested binary expressions rather than flattened ones. An expression converted from SymPy, for example, will have flattened associative operations though:

In [1]: from protosym.simplecas import *

In [2]: x + y + x
Out[2]: ((x + y) + x)

In [3]: from sympy.abc import a, b, c

In [4]: Expr.from_sympy(a + b + c)
Out[4]: (a + b + c)

In [5]: type(_)
Out[5]: protosym.simplecas.Expr

Likewise when parsing is added I would want parse('x + y + z') to give a flattened Add.

When writing the bin_expand function I realised that it is quite awkward to do with Evaluator, because Evaluator requires defining rules for all atoms and heads. In the case of bin_expand we only want to modify Add and Mul and leave all other expressions unchanged, but we don't want to have to list rules for all of the expression types that we don't want to change. For this I added a Transformer class as a subclass of Evaluator. A Transformer is for converting a TreeExpr into another TreeExpr and allows any atoms or heads without rules to pass through unmodified. See the tests and doctests for examples comparing this with Evaluator.
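To make this concrete, here is a rough standalone sketch of the idea, using a hypothetical tuple-based tree representation rather than protosym's actual TreeExpr/Transformer API: rules are given only for the heads to be rewritten and everything else passes through unchanged.

# Hypothetical sketch only: expressions are atoms (strings) or tuples
# (head, arg1, arg2, ...); this is not protosym's real API.

class SimpleTransformer:
    def __init__(self):
        self.op_rules = {}  # head -> function of already-transformed args

    def add_op(self, head, func):
        self.op_rules[head] = func

    def __call__(self, expr):
        if not isinstance(expr, tuple):
            return expr  # atoms without rules pass through unmodified
        head, *args = expr
        args = [self(a) for a in args]
        if head in self.op_rules:
            return self.op_rules[head](*args)
        return (head, *args)  # heads without rules are rebuilt unchanged

def binarize(head):
    """Rule turning an n-ary operation into nested binary ones."""
    def rule(*args):
        result = args[0]
        for arg in args[1:]:
            result = (head, result, arg)
        return result
    return rule

bin_expand = SimpleTransformer()
bin_expand.add_op("Add", binarize("Add"))
bin_expand.add_op("Mul", binarize("Mul"))

# bin_expand(("Add", "x", "y", "z")) -> ("Add", ("Add", "x", "y"), "z")
# bin_expand(("f", "x"))             -> ("f", "x")   (no rule needed)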

I also discovered that in simplecas this fails and added an XFAIL test for it:

In [5]: from protosym.simplecas import *

In [6]: f = Function('f')

In [7]: print(f(x))
---------------------------------------------------------------------------
KeyError: TreeAtom(Function('f'))

This is somewhat similar to the bin_expand/Evaluator issue. Basically there are printing rules for f as an atom:

In [9]: print(f)
f

However when f is a head, as in f(x), the evaluator expects a rule for each head, like a rule for sin(...) and a rule for cos(...). There needs to be a way to give a default rule for the case where there is no rule for the given head.

I think that it is always good practice to have an exceptions module that downstream code can import from and look at to see all of the possible exceptions, so I've added that. Currently we have:

$ git grep 'raise '
noxfile.py:    raise SystemExit(dedent(message)) from None
src/protosym/core/evaluate.py:            raise NoEvaluationRuleError(msg)
src/protosym/simplecas.py:        raise ExpressifyError
src/protosym/simplecas.py:            raise TypeError("First argument to Expr should be TreeExpr")
src/protosym/simplecas.py:    raise NotImplementedError("Cannot convert " + type(expr).__name__)

Perhaps the other exceptions raised should be changed to use something other than TypeError and NotImplementedError. I think it is generally good not to reuse exception types that are already used in other places, because then someone trying to catch exceptions can catch the wrong one. Ideally an exception class should identify a clear scope in the code that the exception can come from, so that it can be caught precisely wherever it is handled.
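As a rough sketch of what such a module can look like (NoEvaluationRuleError and ExpressifyError are the names already used above; the common base class is just an illustration of the pattern, not necessarily what is in the PR):

# Hypothetical sketch of a protosym.core.exceptions style module.

class ProtoSymError(Exception):
    """Base class for all protosym exceptions (illustrative)."""

class NoEvaluationRuleError(ProtoSymError):
    """Raised when an Evaluator has no rule for a given atom or head."""

class ExpressifyError(ProtoSymError):
    """Raised when an object cannot be converted to an Expr."""

Downstream code can then catch ProtoSymError to handle anything raised deliberately by the library, while more specific handlers catch the individual subclasses.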

@oscarbenjamin oscarbenjamin added the enhancement New feature or request label Feb 28, 2023
@oscarbenjamin (Owner, Author)
CC @brocksam (not sure if you get automatically notified otherwise).

I will make a separate PR to add lambdify without bin_expand for now since it works fine that way with most protosym expressions anyway.

@brocksam (Collaborator) left a comment

This all seems logical. I understand the need for Transformer. Will be interested to see if any extra machinery is needed to get lambdify to work because it'd be nice to not have to introduce further Evaluator-like classes without good reason (my feeling is that Evaluator should be able to cover both interpretation and printing as complexity increases).

Transformer has got me thinking about simplification and rewriting. I'm now pondering whether it'd be possible to use (a slightly modified) Transformer to handle the shallow simplification rewrites that we discussed yesterday, e.g. -1*(-1*sin(x)) -> sin(x), or whether some form of pattern matching would still be required.

@oscarbenjamin (Owner, Author)

> However when f is a head, as in f(x), the evaluator expects a rule for each head, like a rule for sin(...) and a rule for cos(...). There needs to be a way to give a default rule for the case where there is no rule for the given head.

I think what is needed is something like an Evaluator.add_op_generic method that is used as a fallback.
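Sketched against the same hypothetical tuple-based trees as above (add_op_generic is just a proposed name here, not an existing method), the fallback would be consulted only when no specific rule exists for a head:

# Hypothetical sketch: an evaluator with a generic fallback rule.

class SimpleEvaluator:
    def __init__(self):
        self.atom_rules = {}    # atom -> value
        self.op_rules = {}      # head -> function of evaluated args
        self.op_generic = None  # fallback: function of (head, evaluated args)

    def add_atom(self, atom, value):
        self.atom_rules[atom] = value

    def add_op(self, head, func):
        self.op_rules[head] = func

    def add_op_generic(self, func):
        self.op_generic = func

    def __call__(self, expr):
        if not isinstance(expr, tuple):
            return self.atom_rules[expr]
        head, *args = expr
        args = [self(a) for a in args]
        if head in self.op_rules:
            return self.op_rules[head](*args)
        if self.op_generic is not None:
            return self.op_generic(head, args)
        raise KeyError(head)  # current behaviour: no rule and no fallback

# A string printer where unknown heads fall back to "head(args...)":
printer = SimpleEvaluator()
printer.add_atom("x", "x")
printer.add_op_generic(lambda head, args: head + "(" + ", ".join(args) + ")")
# printer(("f", "x")) -> "f(x)"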

Most Evaluator instances probably would not want a fallback, but there is another thing that would be wanted, which is partial evaluation, e.g.:

In [5]: N(x + sqrt(2))
Out[5]: x + 1.4142135623731

Here we can evalf part of the expression but then generate a symbolic result that includes the part that was evalfed. That could be done with Evaluator[T], but there would need to be a way to convert a T back into a TreeExpr, so the Evaluator would need a function for doing that. There would need to be separate methods for evaluating completely to T and for evaluating to a TreeExpr representation that is ideally just a TreeAtom[T], in the way that SymPy's evalf does.
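A rough sketch of the partial-evaluation idea over the same hypothetical tuple trees, with plain floats standing in for the evaluation type T:

# Hypothetical sketch: evaluate the numeric parts of a tree and leave the
# rest symbolic, wrapping evaluated results back in as float atoms.

import math

FLOAT_RULES = {
    "Add": lambda a, b: a + b,
    "Mul": lambda a, b: a * b,
    "sqrt": math.sqrt,
}

def partial_evalf(expr):
    if not isinstance(expr, tuple):
        return expr  # atoms: floats are values, strings stay symbolic
    head, *args = expr
    args = [partial_evalf(a) for a in args]
    if head in FLOAT_RULES and all(isinstance(a, float) for a in args):
        return FLOAT_RULES[head](*args)  # fully numeric: evaluate to T
    return (head, *args)                 # otherwise keep a symbolic shell

# partial_evalf(("Add", "x", ("sqrt", 2.0))) -> ("Add", "x", 1.4142135623730951)

In a real Evaluator[T] the wrapping step would convert the T back into a TreeAtom[T] so that the overall result is still a TreeExpr.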

Ideally it would also handle things like evaluating a subset of args to an associative/commutative operator and recursing through operations that are undefined. SymPy's evalf has a number of deficiencies like this:

In [10]: f = Function('f')

In [11]: N(f(sqrt(2)))
Out[11]: f(sqrt(2))

In [12]: N(exp(sqrt(2)*x))
Out[12]: exp(sqrt(2)*x)

In [13]: N(sqrt(2)*x)
Out[13]: 1.4142135623731*x

@oscarbenjamin (Owner, Author)

Thanks for the review!

I'll merge this now and I think I'll fix the f(x) printing bug first before lambdify.

@oscarbenjamin oscarbenjamin merged commit 01bf821 into main Feb 28, 2023
@oscarbenjamin oscarbenjamin deleted the pr_binexpand branch February 28, 2023 13:44
@oscarbenjamin (Owner, Author)

> This all seems logical. I understand the need for Transformer. Will be interested to see if any extra machinery is needed to get lambdify to work because it'd be nice to not have to introduce further Evaluator-like classes without good reason (my feeling is that Evaluator should be able to cover both interpretation and printing as complexity increases).

Yes, I've been wondering this as well. I think I always wanted a clean separation between "evaluation" and "transformation" but as discussed in my last comment it is not always possible to cleanly separate them e.g. because of partial evalf. I think that ideally we would do away with these classes altogether and just have a "collection of rules" type. Then Evaluator just becomes an evaluate function that takes a collection of rules (likewise for Transformer).
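As a sketch of that direction (names and structure purely illustrative, again over hypothetical tuple trees), the rules become plain data and evaluation is a free function over them:

# Hypothetical sketch: rules as a plain data structure with free functions.

from dataclasses import dataclass, field

@dataclass
class Rules:
    atoms: dict = field(default_factory=dict)  # atom -> value
    ops: dict = field(default_factory=dict)    # head -> function
    fallback: object = None                    # optional generic rule

def evaluate(expr, rules):
    if not isinstance(expr, tuple):
        return rules.atoms[expr]
    head, *args = expr
    args = [evaluate(a, rules) for a in args]
    if head in rules.ops:
        return rules.ops[head](*args)
    if rules.fallback is not None:
        return rules.fallback(head, args)
    raise KeyError(head)

A transform function could take the same Rules but rebuild unmatched nodes instead of raising, which is essentially the Evaluator/Transformer distinction expressed as two functions rather than two classes.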

@oscarbenjamin (Owner, Author) commented Feb 28, 2023

> Transformer has got me thinking about simplification and rewriting. I'm now pondering whether it'd be possible to use (a slightly modified) Transformer to handle the shallow simplification rewrites that we discussed yesterday, e.g. -1*(-1*sin(x)) -> sin(x), or whether some form of pattern matching would still be required.

I think Transformer can do that. It just needs a rule for Mul that special-cases Integer. The potentially tricky part is doing it efficiently when there are many args, and making it so that you don't need to loop over the args several times looking for Integer, Rational, Float etc. one after another. In SymPy (and SymEngine and many other systems) Add and Mul apply the following canonicalisation steps that handle this:

  1. Flatten nested Adds and Muls: Add(Add(x, y), z) -> Add(x, y, z).
  2. Remove identities like 0 and 1.
  3. Collect all explicit numbers, combine and bring to the front: Add(1, x, 2.5) -> Add(1+2.5, x).
  4. Separate all coefficients in terms i.e. collect together 2*x and 3*x and combine the numeric coefficients to (2+3)*x. Likewise in a Mul collect x**2 and x**3 to make x**(2+3).
  5. Place all terms into a canonical ordering.
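A rough sketch of steps 1-3 for an Add-like operator, over the same hypothetical tuple trees (steps 4 and 5 would need coefficient extraction and a term ordering on top of this):

# Hypothetical sketch of flattening, identity removal and number folding
# for an associative/commutative head such as "Add" with identity 0.

def canonicalise_add(args, identity=0):
    flat = []
    for arg in args:
        if isinstance(arg, tuple) and arg[0] == "Add":
            flat.extend(arg[1:])            # 1. flatten one level of nesting
        else:
            flat.append(arg)
    number = identity
    rest = []
    for arg in flat:
        if isinstance(arg, (int, float)):
            number = number + arg           # 3. combine explicit numbers
        else:
            rest.append(arg)
    terms = rest if number == identity else [number] + rest  # 2. drop identity
    if len(terms) == 1:
        return terms[0]
    return ("Add", *terms)

# canonicalise_add([("Add", "x", 1), 2.5, 0]) -> ("Add", 3.5, "x")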

Bringing all numbers together at the front of an Add/Mul makes it very quick to check whether there is a numeric coefficient because it is always at args[0]. Of course, relying on that breaks the handling of unevaluated expressions though:

In [14]: Mul(x, 2, evaluate=False).as_coeff_Mul()
Out[14]: (1, x*2)

In [15]: Mul(x, 2).as_coeff_Mul()
Out[15]: (2, x)

Probably many of these rules can be made generic in some sense as rules for:

  • flattening in associative operators.
  • collecting in associative operators.
  • combining explicit constants in associative/commutative operators.
  • collecting like terms in associative/commutative operators.
  • reordering in commutative operators.

The same rules could then ideally be used for Union, Intersection, And, Or etc. There are many associative/commutative operators with similar notions. I think in the Wolfram Language (WL) some of these are handled using function attributes (Flat, Orderless and OneIdentity) rather than pattern matching:
https://reference.wolfram.com/language/guide/Attributes.html
I expect that Mathematica itself still has special-case implementations for common operations like Add, Mul and Pow under the hood though.

Insofar as possible I would like to preserve the idea of not having unnecessary automatic simplification, but we should keep in mind that there are probably good reasons that very few CAS systems take that path.
