iTranslated by AI

The content below is an AI-generated translation. This is an experimental feature, and may contain errors. View original article
👻

Playing with Visible Forall in GHC 9.10

に公開

English version: Playing with Visible Forall in GHC 9.10 - Mizuki's Blog

Today, GHC 9.10.1 was released. One of the new features is "visible forall" (the RequiredTypeArguments extension). In this article, I'd like to dive in and play around with it right away.

Official documentation for this feature can be found here:

Basic: The id Function

The simplest example is likely an id function that explicitly takes a type. A standard id function and an id function using visible forall can be written as follows:

{-# LANGUAGE RequiredTypeArguments #-}

-- From the User's Guide:

-- Standard id function
id :: forall a. a -> a
id x = x

-- id function using visible forall
id_vdq :: forall a -> a -> a
id_vdq a x = x

Let's run it in GHCi:

ghci> :set +t
ghci> id 42  -- Let the type be inferred
42
it :: Num a => a
ghci> id @Int 42  -- Explicitly provide the type
42
it :: Int
ghci> id_vdq _ 42  -- Let the type be inferred
42
it :: Num w => w
ghci> id_vdq Int 42  -- Explicitly provide the type (note that @ is not used!)
42
it :: Int

In other words, by declaring a function with forall ->, you can pass types without using @.

Note that if the same notation is used for both terms and types, the term interpretation takes precedence. Let's try passing the [Int] type:

ghci> id_vdq [Int] [42]
<interactive>:37:8: error: [GHC-83865]
    • Expected a type, but ‘[Int]’ has kind ‘[*]’
    • In the type ‘[Int]’
      In the expression: id_vdq [Int] [42]
      In an equation for ‘it’: it = id_vdq [Int] [42]

It resulted in an error because [Int] was interpreted as a "type-level list containing Int" rather than a "type of a list of Ints." There are two ways to work around this.

First, one is to use the type expression available with the ExplicitNamespaces extension.

ghci> :set -XExplicitNamespaces 
ghci> id_vdq (type [Int]) [42]
[42]
it :: [Int]

Another method is to avoid using the same notation for terms and types. For list and tuple types, aliases such as List and Tuple2 are provided in the Prelude.Experimental module.

ghci> :m + Prelude.Experimental
ghci> id_vdq (List Int) [42]
[42]
it :: [Int]

Binary Operators

Binary operators can also accept types. Let's consider the following function:

{-# LANGUAGE RequiredTypeArguments #-}

as :: forall a. a -> forall a' -> a ~ a' => a'
as x _ = x

When used as an infix operator, this function plays a role similar to a type annotation :::

ghci> :set +t
ghci> 42 `as` Integer
42
it :: Integer
ghci> 42 `as` Rational
42 % 1
it :: Rational
ghci> 42 `as` Double
42.0
it :: Double

While this by itself might not seem very useful, we can also easily create functions that specify "part of a type."

{-# LANGUAGE RequiredTypeArguments #-}

as' :: forall f a. f a -> forall a' -> a ~ a' => f a'
as' x _ = x
ghci> :m + Data.Functor.Identity
ghci> Identity 42 `as'` Int
Identity 42
it :: Identity Int
ghci> Identity 42 `as'` Rational
Identity (42 % 1)
it :: Identity Rational

Of course, specifying part of a type was already possible in conventional Haskell using the PartialTypeSignatures extension.

Type Classes

It would be useful if type class methods could accept types. For example, it would be much easier to write sizeOf Int instead of sizeOf (undefined :: Int). Is such a definition possible?

-- Hypothetical code
class NewStorable a where
  sizeOf :: forall a -> Int

Unfortunately, this does not work. The a from the type class and the a from the forall are treated as separate entities.

-- Actual interpretation
class NewStorable a where
  sizeOf :: forall a' -> Int

-- From an external perspective, the type becomes sizeOf :: forall a. NewStorable a => forall a' -> Int

The standard approach is to create a wrapper.

{-# LANGUAGE AllowAmbiguousTypes #-}

class NewStorable a where
  sizeOf_ :: Int

-- Wrapper
sizeOf :: forall a -> NewStorable a => Int
sizeOf a = sizeOf_ @a

Another method is to use a trick involving ~.

class NewStorable a where
  sizeOf :: forall a' -> a ~ a' => Int

However, using an extra => might cause subtle differences in the intermediate code. Consider the following code:

{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE RequiredTypeArguments #-}
import Debug.Trace
import Data.Proxy

newtype Tagged t a = MkTagged { unTagged :: a }

class Foo a where
  someValueAmb :: Int
  someValueTagged :: Tagged a Int
  someValueProxy :: Proxy a -> Int
  someValueVis :: forall a' -> a ~ a' => Int

instance Foo Float where
  someValueAmb = trace "some heavy computation 1" 42
  someValueTagged = MkTagged (trace "some heavy computation 2" 42)
  someValueProxy _ = trace "some heavy computation 3" 42
  someValueVis _ = trace "some heavy computation 4" 42

main :: IO ()
main = do
  print (someValueAmb @Float)
  print (someValueAmb @Float)
  print (unTagged (someValueTagged :: Tagged Float Int))
  print (unTagged (someValueTagged :: Tagged Float Int))
  print (someValueProxy (Proxy @Float))
  print (someValueProxy (Proxy @Float))
  print (someValueVis Float)
  print (someValueVis Float)

Suppose someValue involves a time-consuming computation. Here, we call trace instead of the actual calculation so we can see how many times it is evaluated.

In main, we call each someValue twice. How many times will the right-hand side of someValue be evaluated?

First, if optimization is enabled, each will be evaluated once.

$ ghc-9.10 -O1 Test.hs
$ ./Test
some heavy computation 1
42
42
some heavy computation 2
42
42
some heavy computation 3
42
42
some heavy computation 4
42
42

On the other hand, what if optimization is turned off?

$ ghc-9.10 -O0 Test.hs
$ ./Test
some heavy computation 1
42
42
some heavy computation 2
42
42
some heavy computation 3
42
some heavy computation 3
42
some heavy computation 4
42
some heavy computation 4
42

While someValueAmb and someValueTagged were only calculated once, someValueProxy and someValueVis were calculated twice. This likely reflects the difference in whether their underlying representations are the value Int or a function _ -> Int.

In such a simple program, turning on optimization eliminates the difference, but in more complex and intricate programs, optimization might not be sufficient, and differences could arise. It might be worth keeping in the back of your mind if efficiency is your top priority.

Theorem Proving

Type-level programming is common in Haskell. For example, the concatenation ++ for type-level lists can be defined as follows:

type (++) :: [k] -> [k] -> [k]
type family (++) xs ys
type instance '[] ++ ys = ys
type instance (x ': xs) ++ ys = x ': (xs ++ ys)

For type-level list concatenation, the associative law xs ++ (ys ++ zs) = (xs ++ ys) ++ zs holds, but the GHC type checker is not aware of it. To inform the GHC type checker about non-trivial equations, we perform theorem proving. In other words, we define a function of the following form:

import Data.Type.Equality ((:~:))

appendIsAssociative :: ... -> xs ++ (ys ++ zs) :~: (xs ++ ys) ++ zs

To prove the associative law, we can use structural induction on xs. Specifically, the case where xs = '[] is trivial, and the case where xs = x : xss can be proven as follows:

(x : xss) ++ (ys ++ zs)
  = x : (xss ++ (ys ++ zs))  (by definition of ++)
  = x : ((xss ++ ys) ++ zs)  (by inductive hypothesis)
  = (x : (xss ++ ys)) ++ zs   (by definition of ++)
  = ((x : xss) ++ ys) ++ zs   (by definition of ++)

Let's implement this. Since the proof requires case analysis on xs, we will define the following type:

data ProxyList xs where
  PNil :: ProxyList '[]
  PCons :: Proxy x -> ProxyList xs -> ProxyList (x ': xs)

Since case analysis is not necessary for ys and zs, Proxy-like arguments are sufficient. The type of the function representing the proof becomes:

appendIsAssociative :: ProxyList xs -> proxy2 ys -> proxy3 zs -> xs ++ (ys ++ zs) :~: (xs ++ ys) ++ zs

Writing the body of the proof according to the equational transformation:

appendIsAssociative PNil _ _ = Refl
appendIsAssociative (PCons (_ :: _ x) (xss :: _ xss)) (ys :: _ ys) (zs :: _ zs) = let
    pf1 :: (x : xss) ++ (ys ++ zs) :~: x : (xss ++ (ys ++ zs))
    pf1 = Refl
    pf2 :: x : (xss ++ (ys ++ zs)) :~: x : ((xss ++ ys) ++ zs)
    pf2 = apply Refl (appendIsAssociative xss ys zs)
    pf3 :: x : ((xss ++ ys) ++ zs) :~: (x : (xss ++ ys)) ++ zs
    pf3 = Refl
    pf4 :: (x : (xss ++ ys)) ++ zs :~: ((x : xss) ++ ys) ++ zs
    pf4 = Refl
  in pf1 `trans` pf2 `trans` pf3 `trans` pf4

Now the proof is complete, but the approach above writes each intermediate expression twice, which isn't very elegant. Can we simplify the code while maintaining the readability of the proof?

Ideally, it would be great if we could directly turn the previous derivation into code:

(x : xss) ++ (ys ++ zs)
  = x : (xss ++ (ys ++ zs))  (by definition of ++)
  = x : ((xss ++ ys) ++ zs)  (by inductive hypothesis)
  = (x : (xss ++ ys)) ++ zs   (by definition of ++)
  = ((x : xss) ++ ys) ++ zs   (by definition of ++)

Not too long ago, we might have utilized singleton types for this kind of task. But now we have RequiredTypeArguments. Let's use RequiredTypeArguments to implement this notation.

The basic flow is to construct the operators === and by so that:

⟨proof of a = b⟩ === ⟨c⟩ `by` ⟨proof of b = c⟩

has the type a :~: c. We use forall -> so that === can accept c as an argument. That is, the types of === and by will be:

(===) :: a :~: b -> forall c -> ???
by :: ??? -> b :~: c -> a :~: c

The ??? part needs to carry information about a, b, and c. Here, we will use (a :~: b, Proxy c).

In summary, we are now able to write the proof as follows:

{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE RequiredTypeArguments #-}
{-# LANGUAGE PartialTypeSignatures #-}
{-# OPTIONS_GHC -Wno-partial-type-signatures #-}
import Data.Type.Equality
import Data.Proxy
import Prelude hiding ((++))

type (++) :: [k] -> [k] -> [k]
type family (++) xs ys
type instance '[] ++ ys = ys
type instance (x ': xs) ++ ys = x ': (xs ++ ys)

infixl 1 ===, `by`

(===) :: a :~: b -> forall c -> (a :~: b, Proxy c)
(===) x _ = (x, Proxy)

by :: (a :~: b, Proxy c) -> b :~: c -> a :~: c
by (Refl, _) Refl = Refl

beginProof :: forall a -> a :~: a
beginProof _ = Refl

appendIsAssociative :: ProxyList xs -> proxy2 ys -> proxy3 zs -> xs ++ (ys ++ zs) :~: (xs ++ ys) ++ zs
appendIsAssociative PNil _ _ = Refl
appendIsAssociative (PCons (_ :: _ x) (xss_ :: _ xss)) (ys_ :: _ ys) (zs_ :: _ zs)
  = beginProof ((x : xss) ++ (ys ++ zs))
           === x : (xss ++ (ys ++ zs)) `by` Refl
           === x : ((xss ++ ys) ++ zs) `by` apply Refl (appendIsAssociative xss_ ys_ zs_)
           === (x : (xss ++ ys)) ++ zs `by` Refl
           === ((x : xss) ++ ys) ++ zs `by` Refl

data ProxyList xs where ...

There are a few points to note:

  • Using the type-level ++ as an argument to forall -> is easily confused with the term-level ++. While we could use the type keyword, it increases the amount of boilerplate, so here we have hidden the ++ from Prelude instead.
  • Previously, we used the same variable names for both terms and types, such as xss :: _ xss. Since this causes issues in this context, we have changed the term-level variable names.

Now the proof looks cool enough, but can we improve it even further? For example, when "using the inductive hypothesis," could we simply write appendIsAssociative ... instead of having to use apply Refl?

We can. Here is an example implementation.

{-# LANGUAGE GHC2021 #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE RequiredTypeArguments #-}
import Data.Type.Equality
import Data.Proxy
import Prelude hiding ((++))

type (++) :: [k] -> [k] -> [k]
type family (++) xs ys
type instance '[] ++ ys = ys
type instance (x ': xs) ++ ys = x ': (xs ++ ys)

infixl 1 ===, `by`

(===) :: a :~: b -> forall c -> b ~ c => a :~: c
(===) Refl _ = Refl

by :: (s ~ t => prop) -> s :~: t -> prop
by proof Refl = proof

beginProof :: forall a -> a :~: a
beginProof _ = Refl

appendIsAssociative :: forall xs -> ProxyListI xs => forall ys -> forall zs -> xs ++ (ys ++ zs) :~: (xs ++ ys) ++ zs
appendIsAssociative xs ys zs = case proxyList' @xs of
  PNil' -> Refl
  PCons' @x @xss ->
    beginProof ((x : xss) ++ (ys ++ zs))
           === x : (xss ++ (ys ++ zs))
           === x : ((xss ++ ys) ++ zs) `by` appendIsAssociative xss ys zs
           === (x : (xss ++ ys)) ++ zs
           === ((x : xss) ++ ys) ++ zs

data ProxyList' xs where
  PNil' :: ProxyList' '[]
  PCons' :: forall x xs. ProxyListI xs => ProxyList' (x ': xs)

class ProxyListI xs where
  proxyList' :: ProxyList' xs

instance ProxyListI '[] where
  proxyList' = PNil'

instance ProxyListI xs => ProxyListI (x ': xs) where
  proxyList' = PCons' @x @xs

Conclusion

I would like to thank everyone involved in the GHC 9.10 release, especially the GHC team at Serokell working on dependent types. Thank you!

Discussion