Indexed optics are occasionally very useful. They generalize `mapWithKey`-like operations found for various containers.

`iover (imapped % _2) :: (i -> b -> c) -> Map i (a, b) -> Map i (a, c)`

Indexed lenses are constructed with the `ilens` combinator.

`ilens :: (s -> (i, a)) -> (s -> b -> t) -> IxLens i s t a b`

It is implicit that the getter and the (indexed) setter parts have to satisfy the usual lens laws.

However there are problematic combinators, e.g. `indices`:

```
indices :: (Is k A_Traversal, is `HasSingleIndex` i)
        => (i -> Bool) -> Optic k is s t a a -> IxTraversal i s t a a
```

An example usage is

```
>>> toListOf (itraversed %& indices even) "foobar"
"foa"
```

If we **combine** `ilens` and `indices` we get a nasty thing:

```
\p -> indices p (ilens (\a -> (a,a)) (\_ b -> b))
    :: (i -> Bool) -> IxTraversal i i i i i
```

That is (almost) the type of `unsafeFiltered`, which has a warning sign:

> Note: This is not a legal `Traversal`, unless you are very careful not to invalidate the predicate on the target.

However, neither `indices` nor `ilens` has a warning attached.

*There should be an additional indexed lens law(s)*.

My proposal is to require that indices and values are independent, which for an indexed lens can be checked by the following equation:

*Whatever you put in, you cannot change the index.*

`fst (iview l s) ≡ fst (iview l (iover l f s))`

where `l :: IxLens i s t a b`, `s :: s`, `f :: i -> a -> b`.

This law is generalisable to other optic kinds. For traversals replace `fst` and `iview` with `map fst` and `itoList`. For setters it is harder to specify, but the idea is the same.
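To make the law concrete, here is a small self-contained sketch. The `IxLens` representation below is hypothetical (a plain getter/setter pair, not the actual `optics` types), but it is enough to show that the lawful lens satisfies the equation while the `ilens (\a -> (a,a)) (\_ b -> b)` lens from above violates it:

```haskell
-- Hypothetical, simplified indexed lens: a getter returning (index, value)
-- and a setter, exactly the two arguments of `ilens`.
data IxLens i s t a b = IxLens
    { iview :: s -> (i, a)
    , iset  :: s -> b -> t
    }

iover :: IxLens i s s a a -> (i -> a -> a) -> s -> s
iover l f s = let (i, a) = iview l s in iset l s (f i a)

-- The proposed law: modifying the value must not change the index.
lawHolds :: Eq i => IxLens i s s a a -> (i -> a -> a) -> s -> Bool
lawHolds l f s = fst (iview l s) == fst (iview l (iover l f s))

-- Lawful: the index is the first component, the value the second.
lawful :: IxLens Int (Int, Char) (Int, Char) Char Char
lawful = IxLens (\(i, a) -> (i, a)) (\(i, _) b -> (i, b))

-- The `unsafeFiltered`-style lens: index equals value, so writing
-- a new value changes the index.
unlawful :: IxLens Char Char Char Char Char
unlawful = IxLens (\a -> (a, a)) (\_ b -> b)
```

Running `lawHolds` on both confirms the intuition: the first lens keeps its index fixed, the second does not.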

Similarly we can talk about indexed prisms or even isomorphisms. The independence requirement would mean that the index has to be *boring* (i.e. isomorphic to `()`), thus there isn't any additional power.

However, sometimes violating laws might be justified (e.g. when quotient types would make the program correct, but we don't have them in Haskell).

This new law doesn't prohibit having duplicate indices in a traversal.

This observation also extends to `TraversableWithIndex`. As far as I can tell, all instances satisfy the above requirement (of indices being independent of values). Should we make that (IMHO natural) assumption explicit?

I originally posted these as a Twitter thread. Its point is to illustrate that constant factors matter, not only the asymptotic complexity class (though quadratic is quite bad quite soon).

I have been playing with the `discrimination` package. It offers linear-time grouping and sorting. The `Data.List.nub` vs `Data.Discrimination.nub` chart is fun to look at. (Look at the x-axis: size of input.)

Don't use `Data.List.nub`.

There is `ordNub` (in many libraries, e.g. in `Cabal`). And indeed, it is a good alternative. `Data.Discrimination.nub` is still faster when n is large enough (around a hundred thousand for the `Word64` I use in these benchmarks).
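For reference, a common `ordNub` sketch (the variant found in many libraries; it keeps a `Set` of seen elements, giving O(n log n) instead of `Data.List.nub`'s O(n²), at the cost of an `Ord` constraint):

```haskell
import qualified Data.Set as Set

-- Keep the first occurrence of each element, tracking seen ones in a Set.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _    []     = []
    go seen (x:xs)
        | x `Set.member` seen = go seen xs
        | otherwise           = x : go (Set.insert x seen) xs
```

Unlike sort-then-dedup, this preserves the order of first occurrences, matching `Data.List.nub`'s semantics.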

`hashNub :: (Eq a, Hashable a) => [a] -> [a]` performs well too:

All four variants on the same graph. Even for small n, `nub` is bad. `ordNub` and `Discrimination.nub` are about the same. `hashNub` is fastest.

(I have to find something better to play with than `Word64`.)

UUID is a slightly more interesting type.

```
data UUID = UUID !Word64 !Word64
  deriving (Eq, Ord, Generic, Show, NFData, Hashable, Grouping, Sorting)
```

Same pattern: `hashNub` is the fastest; `ordNub` becomes slower than `Discrimination.nub` when there are enough elements.

I was asked about zooming in for small n.

Because it is hard to see, we can try loglog plot. Everything looks linear there, but we can see crossover points better.

But what about sorting, you may ask.

It turns out that `Data.List.sort` is quite good, at least if your lists are less than a million in length. Comparison with `Data.Discrimination.sort` (I sort UUIDs, for more fun):

Making a vector from a list, sorting it (using `vector-algorithms`) and converting back to a list seems to be a good option too (we only copy pointers, and copying a million pointers is ... cheap).

Something weird happens on GHC-9.0 though. `discrimination` has the same performance, yet the `vector`-based one degrades.

Yet, @ekmett thinks there is still plenty of opportunity to make discrimination faster. In the meantime, I'll add (a variant of) these benchmarks to the repository.

The term *coindexed optics* is sometimes brought up. But what are they? One interpretation is *optics with error reporting*, i.e. optics which can tell why e.g. a `Prism` didn't match^{1}. Over time I have come to dislike that interpretation. It doesn't feel right.

Recently I ran into the documentation of `witherable`. There is `Wither`, which is like a lens, but not quite. I think that is closer to what *coindexed optics* could be. (However, there are plenty of arrows to flip, and you may flip others.)

This blog post is a literate Haskell file, so we start with language extensions

```
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeOperators #-}
```

The import list is shorter; `Data.SOP` is the only module (from `sop-core`) which is not in the `base` library.

```
import Control.Applicative (liftA2)
import Data.Kind (Type)
import Data.Bifunctor (Bifunctor (first))
import Data.Char (toUpper)
import Data.Coerce (coerce)
import Data.SOP (NP (..), NS (..), I (..))
import qualified Data.Maybe as L (mapMaybe)
```

I will use a variant of the profunctor encoding of optics. The plan is to show

- ordinary, unindexed optics;
- then indexed optics;
- and finally coindexed optics.

The profunctor encoding of optics is relatively simple. However, instead of ordinary profunctors, we will use a variant with an additional *type-level list* argument. This is similar to `indexed-profunctors`, though there the type list is curried. Currying works well for indexed optics, but complicates the coindexed story.

`type Optic p (is :: [Type]) (js :: [Type]) s t a b = p is a b -> p js s t`

To not make this post unnecessarily long, we will use only a subset of the profunctor hierarchy: `Profunctor` and `Mapping` for now. `Profunctor` is used to encode isomorphisms (isos), and `Mapping` is used to encode setters.

```
class Profunctor (p :: [Type] -> Type -> Type -> Type) where
    dimap :: (a -> b) -> (c -> d) -> p is b c -> p is a d

class Profunctor p => Mapping p where
    roam :: ((a -> b) -> s -> t) -> p is a b -> p is s t
```

A go-to example of a setter is `mapped`. With `mapped` we can set, or map over, elements in a `Functor`.

```
mapped :: (Mapping p, Functor f)
       => Optic p is is (f a) (f b) a b
mapped = roam fmap
```

To implement the `over` operation we need a concrete profunctor. If we used ordinary profunctors, the function arrow would do. In this setup we need a newtype to adjust the kind:

```
newtype FunArrow (is :: [Type]) a b =
FunArrow { runFunArrow :: a -> b }
```

Instance implementations are straightforward:

```
instance Profunctor FunArrow where
    dimap f g (FunArrow k) = FunArrow (g . k . f)

instance Mapping FunArrow where
    roam f (FunArrow k) = FunArrow (f k)
```

Using `FunArrow` we can implement `over`. `over` uses a setter to map *over* focused elements in a bigger structure. Note, we allow any index type-lists. We (or rather `FunArrow`) simply ignore them.

```
over :: Optic FunArrow is js s t a b
     -> (a -> b)
     -> (s -> t)
over o f = runFunArrow (o (FunArrow f))
```

Some examples, to show that what we have written so far works. `over mapped` is a complicated way to say `fmap`:

```
-- "FOOBAR"
example01 :: String
example01 = over mapped toUpper "foobar"
```

Optics compose. `over (mapped . mapped)` maps over two composed functors:

```
-- ["FOOBAR","XYZZY"]
example02 :: [String]
example02 = over (mapped . mapped) toUpper ["foobar", "xyzzy"]
```

This was a brief refresher of profunctor optics. It is "standard", except that we added an additional type-level list argument to the profunctors.

We will next use that type-level list to implement indexed optics.

Indexed optics let us set, map, traverse etc. using an additional index. The operation we want to generalize with optics is provided by a few classes, like `FunctorWithIndex`:

```
class Functor f => FunctorWithIndex i f | f -> i where
    imap :: (i -> a -> b) -> f a -> f b
```

The `imap` operation is sometimes called `mapWithKey` (for example in the `containers` library).

Ordinary lists are indexed with integers. `Map k v` is indexed with `k`. We will use lists in the examples, so let us define an instance:

```
instance FunctorWithIndex Int [] where
    imap f = zipWith f [0..]
```
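As a quick sanity check of that instance, here is `imap` for lists written as a standalone function (outside the class, so the sketch is self-contained) and used on a string:

```haskell
import Data.Char (toUpper)

-- imap for lists, written directly: pair each element with its position.
imapList :: (Int -> a -> b) -> [a] -> [b]
imapList f = zipWith f [0..]

-- Upper-case the characters at even positions.
demo :: String
demo = imapList (\i c -> if even i then toUpper c else c) "foobar"
```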

Next, we need to make that available in the optics framework. New functionality means a new profunctor type class. Note how the indexed combinator conses the index onto the list.

```
class Mapping p => IMapping p where
    iroam :: ((i -> a -> b) -> s -> t) -> p (i ': is) a b -> p is s t
```

Using `iroam` and `imap` we can define `imapped`, which is an example of an *indexed setter*.

```
imapped :: (FunctorWithIndex i f, IMapping p)
        => p (i ': is) a b -> p is (f a) (f b)
imapped = iroam imap
```

Here we should note that `FunArrow` can be given an instance of `IMapping`. We simply ignore the index argument.

```
instance IMapping FunArrow where
    iroam f (FunArrow k) = FunArrow (f (\_ -> k))
```

That allows us to use `imapped` instead of `mapped`. (Both `optics` and `lens` libraries have tricks to make that efficient.)

```
-- ["FOOBAR","XYZZY"]
example03 :: [String]
example03 = over (mapped . imapped) toUpper ["foobar", "xyzzy"]
```

To actually use indices, we need a new concrete profunctor. The `IxFunArrow` takes a heterogeneous list of indices, `NP I` (an n-ary product), in addition to the element, as the argument of the arrow.

```
newtype IxFunArrow is a b =
IxFunArrow { runIxFunArrow :: (NP I is, a) -> b }
```

The `IxFunArrow` instances are similar to the `FunArrow` ones; they involve just a bit of additional plumbing.

```
instance Profunctor IxFunArrow where
    dimap f g (IxFunArrow k) = IxFunArrow (g . k . fmap f)

instance Mapping IxFunArrow where
    roam f (IxFunArrow k) = IxFunArrow (\(is, s) -> f (\a -> k (is, a)) s)
```

The `IMapping` instance is the most interesting. As the argument provides an additional index `i`, it is consed onto the list of existing indices.

```
instance IMapping IxFunArrow where
    iroam f (IxFunArrow k) = IxFunArrow $
        \(is, s) -> f (\i a -> k (I i :* is, a)) s
```

As I already mentioned, `indexed-profunctors` uses a curried variant, so the index list is implicitly encoded in uncurried form `i1 -> i2 -> ...`. That is clever, but hides the point.

Next, the indexed `over`. The general variant takes an optic with any index list.

```
gen_iover :: Optic IxFunArrow is '[] s t a b
          -> ((NP I is, a) -> b)
          -> s -> t
gen_iover o f s = runIxFunArrow (o (IxFunArrow f)) (Nil, s)
```

Usually we use the single-index variant, `iover`:

```
iover :: Optic IxFunArrow '[i] '[] s t a b
      -> (i -> a -> b)
      -> s -> t
iover o f = gen_iover o (\(I i :* Nil, a) -> f i a)
```

We can also define the double-index variant, `iover2` (and so on).

```
iover2 :: Optic IxFunArrow '[i,j] '[] s t a b
       -> (i -> j -> a -> b)
       -> (s -> t)
iover2 o f = gen_iover o (\(I i :* I j :* Nil, a) -> f i j a)
```

Let's see what we can do with indexed setters. For example, we can upper-case every odd-indexed character in a string:

```
-- "fOoBaR"
example04 :: String
example04 = iover imapped (\i a -> if odd i then toUpper a else a) "foobar"
```

In the nested case, we have access to all the indices:

```
-- ["fOoBaR","XyZzY","uNoRdErEd-cOnTaInErS"]
example05 :: [String]
example05 = iover2
    (imapped . imapped)
    (\i j a -> if odd (i + j) then toUpper a else a)
    ["foobar", "xyzzy", "unordered-containers"]
```

We don’t need to index at each step, e.g. we can index only at the top level:

```
-- ["foobar","XYZZY","unordered-containers"]
example06 :: [String]
example06 = iover
    (imapped . mapped)
    (\i a -> if odd i then toUpper a else a)
    ["foobar", "xyzzy", "unordered-containers"]
```

Indexed optics are occasionally very useful. We can provide extra information in indices, which would otherwise not fit into optical frameworks.

The indexed optics from the previous sections can be flipped to be coindexed ones. As I mentioned in the introduction, I got the idea from looking at the `witherable` package.

`witherable` provides (among many things) a useful type class, here in a simplified form:

```
class Functor f => Filterable f where
    mapMaybe :: (a -> Maybe b) -> f a -> f b
```

It is however too simple. (Hah!) The connection to indexed optics is easier to see using an `Either` variant:

```
class Functor f => FunctorWithCoindex j f | f -> j where
    jmap :: (a -> Either j b) -> f a -> f b
```

We'll also need a `Traversable` variant (`Witherable` in `witherable`):

```
class (Traversable f, FunctorWithCoindex j f)
    => TraversableWithCoindex j f | f -> j
  where
    jtraverse :: Applicative m => (a -> m (Either j b)) -> f a -> m (f b)
```

Instances for lists are not complicated. The *coindex* of a list is the unit `()`.

```
instance FunctorWithCoindex () [] where
    jmap f = L.mapMaybe (either (const Nothing) Just . f)

instance TraversableWithCoindex () [] where
    jtraverse _ []     = pure []
    jtraverse f (x:xs) = liftA2 g (f x) (jtraverse f xs) where
        g (Left ()) ys = ys
        g (Right y) ys = y : ys
```
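To see the list instance in action outside the optics machinery, here is `jmap` for lists as a standalone sketch; the `Either () b` result plays the role of `Maybe b`, with `Left ()` meaning "drop this element":

```haskell
import qualified Data.Maybe as L (mapMaybe)

-- jmap specialised to lists: Left () drops the element, Right keeps it.
jmapList :: (a -> Either () b) -> [a] -> [b]
jmapList f = L.mapMaybe (either (const Nothing) Just . f)

-- Keep even numbers (doubled), drop odd ones.
demo :: [Int]
demo = jmapList (\n -> if even n then Right (n * 2) else Left ()) [1..6]
```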

With a "boring" coindex, like the unit, we can recover `mapMaybe`:

```
mapMaybe' :: FunctorWithCoindex () f => (a -> Maybe b) -> f a -> f b
mapMaybe' f = jmap (maybe (Left ()) Right . f)
```

With the `TraversableWithCoindex` class, doing the same tricks as previously with indexed optics, we get coindexed optics. Easy.

I didn't manage to get `JMapping` (a coindexed mapping) to work, so I'll use `JTraversing`. We abuse the index list for coindices.

```
class Profunctor p => Traversing p where
    wander :: (forall f. Applicative f => (a -> f b) -> s -> f t)
           -> p js a b -> p js s t

class Traversing p => JTraversing p where
    jwander :: (forall f. Applicative f => (a -> f (Either j b)) -> s -> f t)
            -> p (j : js) a b -> p js s t
```

Using `JTraversing` we can define our first coindexed optic.

```
traversed :: (Traversable f, Traversing p) => p js a b -> p js (f a) (f b)
traversed = wander traverse

jtraversed :: (TraversableWithCoindex j f, JTraversing p)
           => p (j : js) a b -> p js (f a) (f b)
jtraversed = jwander jtraverse
```

To make use of it we once again need a concrete profunctor.

```
newtype CoixFunArrow js a b = CoixFunArrow
    { runCoixFunArrow :: a -> Either (NS I js) b }

instance Profunctor CoixFunArrow where
    dimap f g (CoixFunArrow p) = CoixFunArrow (fmap g . p . f)

instance Traversing CoixFunArrow where
    wander f (CoixFunArrow p) = CoixFunArrow $ f p

instance JTraversing CoixFunArrow where
    jwander f (CoixFunArrow p) = CoixFunArrow $ f (plumb . p) where
        plumb :: Either (NS I (j : js)) b -> Either (NS I js) (Either j b)
        plumb (Right x)        = Right (Right x)
        plumb (Left (Z (I y))) = Right (Left y)
        plumb (Left (S z))     = Left z
```

Interestingly, the `Traversing CoixFunArrow` instance looks like `Mapping FunArrow`, and it seems to be impossible to write `Mapping CoixFunArrow`.

Anyway, next we define a coindexed `over`, which I unimaginatively call `jover`. Like in the previous section, I start with a generic version first.

```
gen_jover :: Optic CoixFunArrow is '[] s t a b
          -> (a -> Either (NS I is) b) -> s -> t
gen_jover o f s = either nsAbsurd id
                $ runCoixFunArrow (o (CoixFunArrow f)) s

jover :: Optic CoixFunArrow '[i] '[] s t a b
      -> (a -> Either i b) -> s -> t
jover o f = gen_jover o (first (Z . I) . f)

jover2 :: Optic CoixFunArrow '[i,j] '[] s t a b
       -> (a -> Either (Either i j) b)
       -> s -> t
jover2 o f = gen_jover o (first plumb . f) where
    plumb (Left i)  = Z (I i)
    plumb (Right j) = S (Z (I j))
```

And now the most fun: the coindexed optics examples. First we can recover the `mapMaybe` behavior:

```
-- ["foobar"]
example07 :: [String]
example07 = jover jtraversed
    (\s -> if length s > 5 then Right s else Left ())
    ["foobar", "xyzzy"]
```

And because we have separate coindices in the type-level list, we can filter on different levels of the structure! If we find the character `'y'`, we skip the whole word; otherwise we skip all vowels.

```
-- ["fbr","nrdrd-cntnrs"]
example08 :: [String]
example08 = jover2 (jtraversed . jtraversed)
    predicate
    ["foobar", "xyzzy", "unordered-containers"]
  where
    predicate 'y' = Left (Right ()) -- skip word
    predicate c | isVowel c = Left (Left ()) -- skip character
    predicate c = Right c

isVowel :: Char -> Bool
isVowel c = elem c ['a','o','u','i','e']
```

Note, the coindex doesn’t need to mean filtering. For example, consider the following type:

```
newtype JList j a = JList { unJList :: [Either j a] }
  deriving (Functor, Foldable, Traversable, Show)
```

It's not `Filterable`, but we can write a `FunctorWithCoindex` instance for it:

```
instance FunctorWithCoindex j (JList j) where
    jmap f (JList xs) = JList (map (>>= f) xs)

instance TraversableWithCoindex j (JList j) where
    jtraverse f (JList xs) = fmap JList (go xs) where
        go []             = pure []
        go (Left j : ys)  = fmap (Left j :) (go ys)
        go (Right x : ys) = liftA2 (:) (f x) (go ys)
```

Using `JList` we can do different things. In this example we return why elements didn't match, but that information is embedded inside the structure itself. We "filter" long strings:

```
jlist :: [a] -> JList j a
jlist = JList . map Right

ex_jlist_a :: JList Int String
ex_jlist_a = jlist ["foobar", "xyzzy", "unordered-containers"]

-- JList {unJList = [Left 6,Right "xyzzy",Left 20]}
example09 :: JList Int String
example09 = jover jtraversed
    (\s -> let l = length s in if l > 5 then Left l else Right s)
    ex_jlist_a
```

Similarly we can filter, or rather "change structure", on different levels, and these levels can have different coindices:

```
ex_jlist_b :: JList Int (JList Bool Char)
ex_jlist_b = fmap jlist ex_jlist_a

example88b :: JList Int (JList Bool Char)
example88b = jover2
    (jtraversed . jtraversed)
    predicate
    ex_jlist_b
  where
    predicate 'x' = Left (Right 0)
    predicate 'y' = Left (Right 1)
    predicate 'z' = Left (Right 2)
    predicate c | isVowel c = Left (Left (c == 'o'))
    predicate c = Right c

{-
[ Right [Right 'f',Left True,Left True,Right 'b',Left False,Right 'r']
, Left 0
, Right [Left False,Right 'n',Left True,Right 'r',Right 'd',Left False, ...
]
-}
example88b' :: [Either Int [Either Bool Char]]
example88b' = coerce example88b
```

The `"xyzzy"` is filtered immediately; we see `Left 0` as the reason. We can also see how vowels are filtered, and `'o'`s are marked specifically with `Left True`.

Having coindices reside inside the structure makes composition just work. That is what makes this different from "error reporting optics". And using the coindices approach we can compose filters; the `Wither` from `witherable` doesn't seem to compose with itself.

An obvious follow-up question is whether we can have both indices and coindices. Why not? The concrete profunctor would look like:

```
newtype DuplexFunArrow is js a b = DuplexFunArrow
    { runDuplexFunArrow :: (NP I is, a) -> Either (NS I js) b }
```

Intuitively, the structure traversals would provide additional information in indices, and we’ll be able to alter it by optionally returning coindices.

Would that be useful? I have no idea.

```
nsAbsurd :: NS I '[] -> a
nsAbsurd x = case x of {}
```

https://blog.fp-tower.com/2020-01-27-introducing-error-reporting-in-optics/ looks like an example of that. In Scala.↩︎

This post is my musings about a type system proposed by Conor McBride in I Got Plenty o' Nuttin' and refined by Robert Atkey in Syntax and Semantics of Quantitative Type Theory: *Quantitative Type Theory* or QTT for short. Idris 2 is based on QTT, so at the end there is some code too.

But let me start by recalling *Simply Typed Lambda Calculus with Products and Coproducts* (that is a mouthful!). As the name already says, there are three binary connectives: functions, products and coproducts (or sums).

But in Martin-Löf Type Theory (not quantitative) we have only two: pi and sigma types.

We can recover ordinary functions and ordinary pairs in a straightforward way.

However, to get coproducts we need something extra. One general way is to add *finite sets*. We get *falsehood* (empty finite set), *truth* (singleton finite set), booleans (binary finite sets), and so on. But these three are enough.

With booleans we can define a sum as a dependent pair over a boolean tag, which is very reasonable. That is how we represent sum types on physical machines: a tag, and a payload whose type depends on the tag.
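This "tag plus tag-dependent payload" idea can be sketched in Haskell, using a type-level `Bool` and a singleton to stand in for the dependent pair (all names here are illustrative, not from any library):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

-- Type-level if-then-else: the payload type depends on the tag.
type family If (t :: Bool) a b where
    If 'True  a b = a
    If 'False a b = b

-- Runtime evidence of the type-level tag.
data SBool (t :: Bool) where
    STrue  :: SBool 'True
    SFalse :: SBool 'False

-- A coproduct as a "dependent pair": a tag and a tag-dependent payload.
data Sum a b where
    MkSum :: SBool t -> If t a b -> Sum a b

-- Matching on the tag tells us the type of the payload.
toEither :: Sum a b -> Either a b
toEither (MkSum STrue  x) = Left x
toEither (MkSum SFalse y) = Right y
```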

A natural question is what we get if we instead take a pi type over the booleans. Unfortunately, nothing new:

We get another way to describe pairs.

To summarise:

Let us next try to see how this works out with linear types.

In the intuitionistic linear calculus (ILC, also logic, ILL) we have four binary connectives: linear functions, times, plus and with.

Often a unary bang is also added, to allow writing non-linear programs as well.

Making a linear *dependent* theory is hard, and Quantitative Type Theory is a promising attempt. It seems to work.

And it is not complicated. In fact, as far as we are concerned, it still has just two connectives, but slightly modified to include a *multiplicity* annotation. The multiplicity can be 1, i.e. linear, single use; or ω, unrestricted use; but also 0, which is irrelevant usage. I find this quite elegant.

The rules are then set up so that we don't run into problems with "types using our linear variables", as types are checked in an irrelevant context.

As in the non-linear setting, we can recover two "simple" connectives immediately:

Other multiplicities allow us to create some "new" connectives, where the multiplicity-0 function space is the irrelevant quantification.

You should guess that next we will recover plus and with using booleans. However, booleans in a linear calculus are conceptually hard, when you start to think about the resource interpretation of linear calculus. However, *booleans are not resources*, they are information. About introduction Atkey says

> ... no resources are required to construct the constants true and false

but the elimination is a bit more involved. The important bit is, however, that we have if-then-else which behaves reasonably.

Then plus can be defined as a linear sigma over the booleans, and with as a linear pi over the booleans. That behaves as we want. When we match on a plus value we learn the tag and the payload. As the tag has multiplicity 1, we can match on it once to learn the type of the payload. (Note: I should really write the typing rules, and derivations, but I'm quite confident it works this way. Highly likely I'm wrong :)

The with-connective, &, is mind-juggling. It's a product (in fact, *the* product in the CT sense). We can extract either part, but we have to decide which; we cannot do both.

A value of the with type is (encoded as) a function, and we can only call it once, so we cannot convert it into a tensor pair, nor the inverse.

So the times and the with are both product-like, but different.

We redraw the table from the previous section; there is no more inelegant duplication:

It's often said that you cannot write `diag :: a -> (a, a)` in a linear calculus.

This is true if we assume that tuples are tensors. That is a natural assumption, as the tensor is what we use to curry linear arrows. However, the categorical product is the with. I argue that "the correct" type of `diag` is

```
diag : a -> a & a
diag x = x :&: x
```

And in fact, the adjointness is the same as in the CT-of-STLC (Idris 2 agrees with me).

If we could just normalise the notation, then we'd use

But that would be sooo... confusing.

There is plenty of room for more confusion, it gets better.

Products and coproducts usually have units, and so is the case in the linear calculus too.

Spot a source of possible confusion.

We know that because with is the product, its unit is the terminal object, ⊤. And now we have to be careful.

**Definition:** T is a terminal object if for every object X in category C there exists a unique morphism X → T.

Indeed, the ⊤ in linear logic is such an object. It acts like a dumpster: if we don't like some (resource) thing, we can map it to ⊤. If we already have a ⊤ (there is no way to get rid of it), we can tensor it with something else we don't like and map the resulting "pair" to another ⊤. A bottomless trash can!

In category theory we avoid speaking about objects directly (it is the point-free extreme). If we need to speak about an object X, we rather talk about constant morphisms into X from the unit object. This works because, e.g. in the Sets category, that unit object is the unit of the product, which is used to internalize arrows, i.e. define function objects (and binary functions, currying, etc).

In linear logic, the "friend" of the linear arrow is, however, the tensor, and its unit is not the terminal object.

So we rather have constant morphisms out of the tensor unit, which is again confusing, as you can confuse that unit for the initial object, which it isn't. To help avoid that I used a subscript.

The takeaway is that ⊤ and the tensor unit in linear logic are different objects, and you have to be very careful so that ordinary lambda calculus (or e.g. Haskell) intuition doesn't confuse you.

I wish there were a category where the linear stuff is separate. In Sets the two units coincide. Vector spaces are close, but they have their own source of confusion (there the direct sum is both a product and a coproduct).

All of the above is nicely encodable in Idris 2. If you want to play with linear logic concepts, I'd say that Idris 2 is the best playground at the moment.

```
module QTT
-----------------------------------------------------------------------
-- Pi and Sigma
-----------------------------------------------------------------------
Pi1 : (a : Type) -> (a -> Type) -> Type
Pi1 a b = (1 x : a) -> b x
data Sigma1 : (a : Type) -> (a -> Type) -> Type where
    Sig1 : (1 x : a) -> (1 y : b x) -> Sigma1 a b
-----------------------------------------------------------------------
-- Lollipop
-----------------------------------------------------------------------
Lollipop : Type -> Type -> Type
Lollipop a b = Pi1 a \_ => b
-- handy alias
infixr 0 -@
(-@) : Type -> Type -> Type
(-@) = Lollipop
-- for constructor, just write \x => expr
-- Lollipop elimination, $
lollipopElim : Lollipop a b -@ a -@ b
lollipopElim f x = f x
-----------------------------------------------------------------------
-- Times
-----------------------------------------------------------------------
Times : Type -> Type -> Type
Times a b = Sigma1 a \_ => b
-- Times introduction
times : a -@ b -@ Times a b
times x y = Sig1 x y

-- Times elimination
timesElim : Times a b -@ (a -@ b -@ c) -@ c
timesElim (Sig1 x y) k = k x y
-----------------------------------------------------------------------
-- With
-----------------------------------------------------------------------
With : Type -> Type -> Type
With a b = Pi1 Bool \t => if t then a else b
-- With elimination 1
fst : With a b -@ a
fst w = w True
-- With elimination 2
snd : With a b -@ b
snd w = w False
-- There isn't really a way to write a function for with introduction,
-- let me rather write diag.
diag : a -@ With a a
diag x True = x
diag x False = x
-- Also note, that even if With would be a built-in, it should
-- be non-strict (and a function is).
-- We may use the same resource in both halfs differently,
-- and the resource cannot be used until user have selected the half.
-----------------------------------------------------------------------
-- Plus
-----------------------------------------------------------------------
Plus : Type -> Type -> Type
Plus a b = Sigma1 Bool \t => if t then a else b
-- Plus introduction 1
inl : a -@ Plus a b
inl x = Sig1 True x

-- Plus introduction 2
inr : b -@ Plus a b
inr y = Sig1 False y

-- Plus elimination, either... with a with twist.
-- Give me two functions, I'll use one of them, not both.
plusElim : With (a -@ c) (b -@ c) -@ Plus a b -@ c
plusElim f (Sig1 True x) = f True x
plusElim f (Sig1 False y) = f False y
-----------------------------------------------------------------------
-- Extras
-----------------------------------------------------------------------
-- plusElim is reversible.
-- Plus -| Diag
plusElimRev : (Plus a b -@ c) -@ With (a -@ c) (b -@ c)
plusElimRev f True = \x => f (inl x)
plusElimRev f False = \y => f (inr y)

-- Diag -| With
adjunctFwd : (c -@ With a b) -@ With (c -@ a) (c -@ b)
adjunctFwd f True = \z => f z True
adjunctFwd f False = \z => f z False

adjunctBwd : With (c -@ a) (c -@ b) -@ (c -@ With a b)
adjunctBwd f c True = f True c
adjunctBwd f c False = f False c
-----------------------------------------------------------------------
-- Hard exercise
-----------------------------------------------------------------------
-- What would be a good way to implement Top, i.e. the unit of With.
--
-- fwd : a -@ With Top a
-- bwd : With Top a -@ a
--
-- I have ideas, I'm not sure I like them.
```

```
{-# OPTIONS --cubical --safe #-}
module TrueFalse where
open import Cubical.Foundations.Everything
open import Cubical.Data.Bool
-- The following is basic stuff
true≡true : true ≡ true
true≡true = refl
-- in Cubical Agda _≡_ isn't a primitive, but a (constant) path
true≡true₂ : Path Bool true true
true≡true₂ = refl
-- but we can also have dependent paths.
eq-Bool : Path Type₀ Bool Bool
eq-Bool = refl
-- i.e. paths whose endpoints are values of the types
-- at the endpoints of another path (a path of types)
-- (and "sadly", we have to eta-expand the path)
true≡true₃ : PathP (λ i → eq-Bool i) true true
true≡true₃ = refl
-- Additionally, Cubical Agda allows to say that
-- Booleans are equal via the not-isomorphism
-- (this is notEq in cubical library)
eq-not-Bool : Bool ≡ Bool
eq-not-Bool = isoToPath (iso not not notnot notnot)
-- And then, one can write what may look like non-sense
true≡false : PathP (λ i → eq-not-Bool i) true false
true≡false = toPathP refl
-- "Unfortunately" Agda makes that all a bit too explicit,
-- otherwise you could see just
--
-- true≡false : PathP ... true false
-- true≡false = ... refl
--
-- and get very confused.
-- Devil is in the details (of dependent paths).
```

While a big part of `Cabal` is about interpreting the `your-package.cabal` file, an important part of both it and `cabal-install` is **filepaths**. After all, `cabal-install` is a build tool.

Currently (as of `Cabal-3.4`) the type used for all filepath needs is the infamous

`type FilePath = String`

One can say that all paths in the codebase are *dynamically typed*. It is very hard to say whether paths are absolute or relative, and if relative to what.

A solution would be to use the `path` or `paths` library. I like `paths` better, because it is set up to talk about relative paths to arbitrary roots, not only *absolute paths*.

Still, neither is good enough. Why do I say so? Because `Cabal` and `cabal-install` have to deal with *three kinds of paths*:

- Abstract paths
- Paths on the host system
- Paths on the target system

It is that simple, but `path` is very concretely the second kind, and `paths` is somewhere in between the first and second, but doesn't let you differentiate them.

Abstract paths are the ones written in the `your-package.cabal` file. For example `hs-source-dirs: src/`. It is not a Unix path. It is not a Windows path. It is in fact something which should be interpretable as either, and also as a path inside a tarball archive. In fact it currently has to be a common denominator of them, which means that backslashes `\`, i.e. Windows filepaths, aren't portable, but I suspect they work *if you build on Windows*.

Just *thinking* about types uncovers a possible bug.

If we had a

```
-- | An abstract path.
data APath root = ...
```

Then we could enforce the format, for example prohibiting some (i.e. all known) special characters.
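A sketch of what such validation could look like. Everything here is hypothetical — the representation, the `mkAPath` smart constructor and the rejected character set are illustrative choices, not actual `Cabal` API:

```haskell
-- Hypothetical abstract path: slash-separated components relative
-- to some (phantom) root.
newtype APath root = APath [String]
  deriving (Eq, Show)

-- Smart constructor: split on '/', reject empty components and a few
-- known-problematic characters (backslash, colon, NUL).
mkAPath :: String -> Maybe (APath root)
mkAPath s
    | null s                 = Nothing
    | any badComponent parts = Nothing
    | otherwise              = Just (APath parts)
  where
    parts = splitOn '/' s
    badComponent c = null c || any (`elem` "\\:\0") c

-- Simple splitter (stdlib-only; Data.List.Split is not in base).
splitOn :: Char -> String -> [String]
splitOn d = foldr step [[]]
  where
    step c acc@(cur:rest)
        | c == d    = [] : acc
        | otherwise = (c : cur) : rest
    step _ []       = [[]]  -- unreachable: accumulator is never empty
```

With this, `hs-source-dirs: src/` style values parse, while backslash-separated Windows-looking paths are rejected at construction time.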

*Note:* abstract paths are relative. There might be some *abstract* root, for example `PackageRoot`, but its interpretation still depends on context.

The representation of `APath` is not important. It should, however, be some kind of *text*.

These are the concrete paths on your disk.

```
-- | A path on host (build) system
data HPath root = ...
```

The `HPath` can have different roots as well, for example `CWD` (for the current working directory), `HomeDir` or `Absolute`. Maybe even talking about `HPath PackageRoot` is meaningful. My gut feeling says that we should rather provide an operation to resolve an `APath PackageRoot` into an `HPath Absolute`, given an `HPath Absolute` of the package root.
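
A minimal sketch of what this could look like. All names here (`Root`, `APath`, `HPath`, `resolve`) are hypothetical illustrations of the idea, not actual `Cabal` API:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

-- Hypothetical roots, used as phantom type indices.
data Root = PackageRoot | CWD | Absolute

-- An abstract path, as written in your-package.cabal:
-- always POSIX-looking, relative to some abstract root.
newtype APath (root :: Root) = APath String
    deriving Show

-- A concrete path on the host (build) system,
-- in the system's native representation.
newtype HPath (root :: Root) = HPath FilePath
    deriving (Eq, Show)

-- Resolve an abstract package-relative path against the
-- absolute package root, yielding an absolute host path.
resolve :: HPath 'Absolute -> APath 'PackageRoot -> HPath 'Absolute
resolve (HPath root) (APath rel) = HPath (root ++ "/" ++ rel)
```

On a real Windows host `resolve` would also have to translate separators; the point of the phantom `root` index is that mixing up the three kinds of paths becomes a type error.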

Also `directory` operations, i.e. IO operations like `listDirectory`, are only meaningful for `HPath`s. These are concrete paths.

`HPath`s have to be represented in the system's native way. It can still be `FilePath` in the first iteration, but e.g. absolute paths on Windows may start with `\\?\` and use backslashes as directory separators (c.f. `APath`, which will probably look like a POSIX path everywhere).

The third kind of paths are paths on the *target system*. While cross-compilation support in `Cabal` is barely existent, having a dedicated type for target-system paths should help improve it.

One example is the `YourPackage_Paths` module. Currently it contains hardcoded paths to e.g. the `data-files` directory of the *installed* package, i.e. somewhere on the *target* system.

While having hardcoded absolute paths in `YourPackage_Paths` is a bad idea nowadays, and the `data-files` discovery should be based on some relative (relocatable, using abstract `APath`s maybe?) scheme, having a

```
-- | A path on the target (run) system
data TPath root = ...
```

will at least show where we use (absolute) target system paths. Ideally we won't have them anywhere, if that is possible. But identifying where we have them *now* will help to get rid of them.

Another example is running a (custom) `./Setup`, or tests, or benchmarks. I hope that we can engineer the code in a way that executables built for the target system won't be callable directly, but will need to go through a runner wrapper (which we have, but I don't know much about it). Even in the *host = target* (common) case, where the wrapper is trivial.

*Note:* whether a `TPath` is a Windows path or a POSIX path will depend on run-time information, so the conversion functions will need that bit of information. You wouldn't be able to purely `convert :: APath -> TPath`; we will need to pass extra context.
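
A sketch of that extra context (the `TargetStyle`, `toTarget` and wrapper types below are made up for illustration): since the target's path convention is a run-time value, the conversion takes it as an argument.

```haskell
-- Hypothetical: which path convention the *target* system uses is
-- only known at run time (e.g. from toolchain settings).
data TargetStyle = TargetPosix | TargetWindows

-- Bare wrappers standing in for the abstract and target path types.
newtype APath = APath String deriving (Eq, Show)
newtype TPath = TPath String deriving (Eq, Show)

-- Not 'convert :: APath -> TPath'; the TargetStyle context is required.
toTarget :: TargetStyle -> APath -> TPath
toTarget TargetPosix   (APath p) = TPath p
toTarget TargetWindows (APath p) = TPath (map slash p)
  where
    slash '/' = '\\'
    slash c   = c
```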

Here again, better types should help guide the design process.

These are my current thoughts about how paths will look in some future version of `Cabal`. Instead of one `FilePath` (or `Path`) there will be three: `APath`, `HPath` and `TPath`^{1}.

As I write this down, it seems so obvious that this is how paths have to be classified. Has anyone done something like this before? Please tell me, so I can learn from your experiences.

Names are subject to change, maybe `SymPath` (for symbolic), `HostPath` and `TargetPath`.↩︎

Quoting the Wikipedia article: in number theory, the *integer square root* (`intSqrt`) of a positive integer n is the greatest integer less than or equal to the square root of n.

How to compute it in Haskell? The Wikipedia article mentions Newton’s method, but doesn’t discuss how to make the initial guess.

In `base-4.8` (GHC-7.10) we got the `countLeadingZeros` function, which can be used to get a good initial guess.

Recall that finite machine integers look like

```
n = 0b0......01.....
      ^^^^^^^         -- @countLeadingZeros n@ bits
             ^^^^^^^  -- @b = finiteBitSize n - countLeadingZeros n@ bits
```

We have an efficient way to get the *"significant bits"* count `b = finiteBitSize n - countLeadingZeros n`, which can be used to approximate the number: `2^(b-1) <= n < 2^b`.

It is also easy to approximate the square root of numbers of the form `2^b`: it is roughly `2^(b/2)`, i.e. a single shift of the exponent.

We can use this approximation as the initial guess, and write a simple implementation of `intSqrt`:

```
module IntSqrt where

import Data.Bits

intSqrt :: Int -> Int
intSqrt 0 = 0
intSqrt 1 = 1
intSqrt n = case compare n 0 of
    LT -> 0 -- whatever :)
    EQ -> 0
    GT -> iter guess -- only single iteration
  where
    iter :: Int -> Int
    iter 0 = 0
    iter x = shiftR (x + n `div` x) 1 -- shifting is dividing

    guess :: Int
    guess = shiftL 1 (shiftR (finiteBitSize n - countLeadingZeros n) 1)
```

Note, I do only a single iteration^{1}. Is it enough? My need is to calculate square roots of small numbers. We can test quite a large range exhaustively. Let's define a correctness predicate:

```
correct :: Int -> Int -> Bool
correct n x = sq x <= n && n < sq (x + 1) where sq y = y * y
```

Out of a hundred numbers

```
correct100 = length
    [ (n,x) | n <- [ 0..99 ], let x = intSqrt n, correct n x ]
```

the computed `intSqrt` is correct for 89! Which are the incorrect ones?

```
incorrect100 =
    [ ( 8, 3)
    , (24, 5)
    , (32, 6), (33, 6), (34, 6), (35, 6)
    , (48, 7)
    , (80, 9)
    , (96,10), (97,10), (98,10), (99,10)
    ]
```

The numbers which are just below a perfect square (8, 24, 35, 48, 80, 99, …) are overestimated.

If we take a bigger range, say 0…99999, then with a single iteration 23860 numbers are correct; with two iterations, 96659.
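
For comparison, here is a sketch (not the post's code) that runs the same Newton step until it stops improving, which yields the exact integer square root for every input. The guess uses the *ceiling* of `b/2` so it starts at or above the true root, and the standard stopping rule (return `x` once the next iterate is no smaller) then guarantees correctness:

```haskell
import Data.Bits (countLeadingZeros, finiteBitSize, shiftL, shiftR)

-- Newton's method run to a fixed point. Starting from a guess >= the
-- true root, the iterates decrease monotonically until they reach
-- floor (sqrt n), so we stop when the next iterate is no smaller.
intSqrtExact :: Int -> Int
intSqrtExact n
    | n <= 0    = 0
    | otherwise = go guess
  where
    b     = finiteBitSize n - countLeadingZeros n
    guess = shiftL 1 (shiftR (b + 1) 1)  -- 2 ^ ceil (b / 2) >= sqrt n

    go x
        | x' >= x   = x
        | otherwise = go x'
      where
        x' = shiftR (x + n `div` x) 1
```

The trade-off is an unpredictable (though logarithmic) number of iterations instead of exactly one.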

For my use case (mangling the `size` of `QuickCheck` generators) this is good enough; small deviations are perfectly acceptable. Bit fiddling FTW!

Like the infamous fast inverse square root algorithm, which also uses only a single iteration, because the initial guess is very good.↩︎

I spent this Sunday writing two small patches to `cabal-fmt`.

`cabal-fmt` reasonably assumes that the file it is formatting is a cabal package definition file, so it parses it as such. That is needed to correctly pretty-print the fields, as some syntax, for example the leading comma, requires a somewhat recent `cabal-version: 2.2` (see the Package Description Format Specification History for details).

However, there are other files using the same markup format, for example `cabal.project` files or the `cabal.haskell-ci` configuration files used by the haskell-ci tool. Wouldn't it be nice if `cabal-fmt` could format these as well? In `cabal-fmt-0.1.4` you can pass the `-n` or `--no-cabal-file` flag to prevent `cabal-fmt` from parsing these files as cabal package files.

The downside is that the latest known cabal specification will be used. That shouldn't break `cabal.haskell-ci` files, but it might break `cabal.project` files if you are not careful. (Their parsing code is somewhat antique.)

An example of reformatting the `cabal.project` of this blog:

```
--- a/cabal.project
+++ b/cabal.project
@@ -1,9 +1,7 @@
 index-state: 2020-05-10T17:53:22Z
 with-compiler: ghc-8.6.5
 
-packages: "."
-  pkg/gists-runnable.cabal
+packages:
+  "."
+  pkg/gists-runnable.cabal
 
-constraints:
-  hakyll +previewServer
+constraints: hakyll +previewServer
```

So satisfying.

Another addition is *fragments*. They are best illustrated by an example. Imagine you have a multi-package project, and you use `haskell-ci` to generate your `.travis.yml`. Each `.cabal` package file must have the same

```
...
tested-with: GHC ==8.4.4 || ==8.6.5 || ==8.8.3 || ==8.10.1
library
...
```

Then you find out that GHC 8.8.4 and GHC 8.10.2 were recently released, and you want to update your CI configuration. Editing multiple files with the same change: busy work.

With `cabal-fmt-0.1.4` you can create a fragment file, let's call it `tested-with.fragment`:

`tested-with: GHC ==8.4.4 || ==8.6.5 || ==8.8.4 || ==8.10.2`

And then edit your package files with a *cabal-fmt pragma* (the fragment is probably in the root directory of the project, but `.cabal` files are inside a directory per package):

```
 ...
+-- cabal-fmt: fragment ../tested-with.fragment
 tested-with: GHC ==8.4.4 || ==8.6.5 || ==8.8.3 || ==8.10.1
 library
 ...
```

Then, the next time you run

`cabal-fmt --inplace */*.cabal`

you'll see the diff

```
 ...
 -- cabal-fmt: fragment ../tested-with.fragment
-tested-with: GHC ==8.4.4 || ==8.6.5 || ==8.8.3 || ==8.10.1
+tested-with: GHC ==8.4.4 || ==8.6.5 || ==8.8.4 || ==8.10.2
 library
 ...
```

for all libraries. *Handy*!

Some design comments:

- A fragment is only a single field or a single section (e.g. a common stanza), never multiple single fields. (Easier to implement; least surprising behavior: a pragma is attached to a single field or section.)
- The field name or section header in the `.cabal` file and in the fragment have to match. (To avoid mistakes.)
- Substitution is not recursive. (Guaranteed termination.)
- Other pragmas in fragments are not executed. Neither are comments in fragments preserved. (Not sure whether that would be valuable.)

Finally, you can use `cabal-fmt --no-cabal-file` to format fragment files too, even though they are also reformatted when spliced.

`cabal-fmt-0.1.4` is a small release. I made `--no-cabal-file` to scratch my own itch, and fragments partly to highlight that not every feature has to exist in `Cabal` itself; some are very fine for preprocessors. I do think that fragments could be very useful in bigger projects. Let me know!

I was lately thinking about fixed points, more or less.

A new version of `data-fix` was released recently, and also a corresponding version of `recursion-schemes`.

Also I wrote a Fix-ing regular expressions post, about adding fixed points to regular expressions.

This post is another exploration: *fixed points of indexed functors*. This is not a novel idea at all, but I'm positively surprised that it works out quite nicely in modern GHC Haskell. I define an `IxFix` type and illustrate it with three examples.

*Note:* the `HFix` in the `multirec` package is the same as `IxFix` in this post. I always forget about the existence of `multirec`.

In the following, the "modern GHC Haskell" is quite conservative, only eight extensions:

```
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
```

And this literate Haskell script is warning free, with `-Wall`

`{-# OPTIONS_GHC -Wall #-}`

On this trip

`module IxFix where`

we need a handful of imports

```
-- Type should be added to Prelude
import Data.Kind (Type)
-- Few newtypes
import Data.Functor.Identity (Identity (..))
import Data.Functor.Compose (Compose (..))
import Data.Functor.Const (Const (..))
-- dependently typed programming!
import Data.Fin (Fin)
import Data.Type.Nat
import Data.Vec.Lazy (Vec (..))
-- magic
import Data.Coerce (coerce)
```

Before we go further, let me remind you about ordinary fixed points, as defined in `data-fix`

package.

```
newtype Fix f = Fix { unFix :: f (Fix f) }

foldFix :: Functor f => (f a -> a) -> Fix f -> a
foldFix f = go where go = f . fmap go . unFix
```

Using `Fix` we can define recursive types using non-recursive *base functors*, e.g. for a list we'd have

`data ListF a rec = NilF | ConsF a rec`

We then use `foldFix` (or `cata` and other recursion schemes in `recursion-schemes`) to decouple "how we recurse" from "what we do at each step". I won't try to convince you why this separation of concerns might be useful.
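
As a tiny self-contained illustration of that decoupling (a sketch, not from the original post): summing a list represented as `Fix (ListF Int)`, where the algebra alone says what happens at each step and `foldFix` alone decides how to recurse.

```haskell
{-# LANGUAGE DeriveFunctor #-}

newtype Fix f = Fix { unFix :: f (Fix f) }

foldFix :: Functor f => (f a -> a) -> Fix f -> a
foldFix f = go where go = f . fmap go . unFix

-- the non-recursive base functor of lists
data ListF a rec = NilF | ConsF a rec deriving Functor

fromList :: [a] -> Fix (ListF a)
fromList = foldr (\x xs -> Fix (ConsF x xs)) (Fix NilF)

-- the algebra: "what we do at each step"
sumAlg :: ListF Int Int -> Int
sumAlg NilF        = 0
sumAlg (ConsF x s) = x + s
```

`foldFix sumAlg (fromList [1,2,3])` evaluates to `6`; swapping in a different algebra reuses the same recursion.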

Instead I continue directly to the topic: defining indexed fixed points. Why do we need them? Because `Fix` is not powerful enough to allow working with `Vec` or polymorphically recursive types.

```
-- hello dependently typed world.
data Vec (n :: Nat) (a :: Type) where
VNil :: Vec 'Z a
(:::) :: a -> Vec n a -> Vec ('S n) a
```

Before talking about fixed points, we need to figure out what indexed functors are. Recall that a normal functor is a thing of kind `Type -> Type`:

```
class Functor f where
    fmap :: (a -> b) -> (f a -> f b)
```

The indexed version is the one with `Type` replaced by `k -> Type`, for some index `k` (it is still a functor, but in a different category). We want morphisms to work for all indices, and to preserve them. Thus we define a commonly used type alias^{1}

```
-- natural, or parametric, transformation
type f ~> g = forall (j :: k). f j -> g j
```

Using it we can define a `Functor` variant^{2}: it looks almost the same.

```
class IxFunctor (f :: (k -> Type) -> (k -> Type)) where
ixmap :: (a ~> b) -> (f a ~> f b)
```

With `IxFunctor` in our toolbox, we can define `IxFix`; note how the definition is again almost the same as for the unindexed `Fix` and `foldFix`:

```
newtype IxFix f i = IxFix { unIxFix :: f (IxFix f) i }

foldIxFix :: IxFunctor f => (f g ~> g) -> IxFix f ~> g
foldIxFix alg = alg . ixmap (foldIxFix alg) . unIxFix
```

Does this work? I hope the following examples will convince you that `IxFix` is usable (at least in theory).

The go-to example of recursion schemes is folding a list; the go-to example of dependent types is the length-indexed list, often called `Vec`. I combine these traditions by defining `Vec` as an indexed fixed point:

```
data VecF (a :: Type) rec (n :: Nat) where
NilF :: VecF a rec 'Z
ConsF :: a -> rec n -> VecF a rec ('S n)
```

`VecF` is an `IxFunctor`:

```
instance IxFunctor (VecF a) where
    ixmap _ NilF         = NilF
    ixmap f (ConsF x xs) = ConsF x (f xs)
```

And we can define `Vec` as the fixed point of `VecF`, with constructors:

```
type Vec' a n = IxFix (VecF a) n

nil :: Vec' a 'Z
nil = IxFix NilF

cons :: a -> Vec' a n -> Vec' a ('S n)
cons x xs = IxFix (ConsF x xs)
```

Can we actually use it? Of course! Let's define concatenation^{3}, `Vec' a n -> Vec' a m -> Vec' a (Plus n m)`. We cannot use `foldIxFix` directly, as `Plus n m` is not the same index as `n`, so we need to define an auxiliary `newtype` to plumb the indices. Another way to think about these kinds of `newtype`s is that they work around the lack of type-level anonymous functions in today's Haskell.

```
newtype Appended m a n =
    Append { getAppended :: Vec' a m -> Vec' a (Plus n m) }
```

```
append :: forall a n m. Vec' a n -> Vec' a m -> Vec' a (Plus n m)
append xs ys = getAppended (foldIxFix alg xs) ys where
    alg :: VecF a (Appended m a) j -> Appended m a j
    alg NilF          = Append id
    alg (ConsF x rec) = Append $ \zs -> cons x (getAppended rec zs)
```
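
When the result index does not change, no plumbing `newtype` is needed; `Const` is enough. For example, forgetting the length. This is a sketch that restates the post's definitions so it stands alone:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
import Data.Functor.Const (Const (..))

data Nat = Z | S Nat

type f ~> g = forall j. f j -> g j

class IxFunctor f where
    ixmap :: (a ~> b) -> (f a ~> f b)

newtype IxFix f i = IxFix { unIxFix :: f (IxFix f) i }

foldIxFix :: IxFunctor f => (f g ~> g) -> IxFix f ~> g
foldIxFix alg = alg . ixmap (foldIxFix alg) . unIxFix

data VecF a rec n where
    NilF  :: VecF a rec 'Z
    ConsF :: a -> rec n -> VecF a rec ('S n)

instance IxFunctor (VecF a) where
    ixmap _ NilF         = NilF
    ixmap f (ConsF x xs) = ConsF x (f xs)

type Vec' a n = IxFix (VecF a) n

-- 'Const [a]' ignores the index, so the algebra type-checks at every n.
toList :: forall a n. Vec' a n -> [a]
toList = getConst . foldIxFix alg
  where
    alg :: VecF a (Const [a]) j -> Const [a] j
    alg NilF                 = Const []
    alg (ConsF x (Const xs)) = Const (x : xs)
```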

We can also define a refold function, which doesn't mention `IxFix` at all.

```
ixrefold :: IxFunctor f => (f b ~> b) -> (a ~> f a) -> a ~> b
ixrefold f g = f . ixmap (ixrefold f g) . g
```

And then, using `ixrefold`, we can define concatenation for the `Vec` from the `vec` package, which isn't defined using `IxFix`. Here we need auxiliary `newtype`s as well.

```
newtype Swapped f a b =
    Swap { getSwapped :: f b a }

newtype Appended2 m a n =
    Append2 { getAppended2 :: Vec m a -> Vec (Plus n m) a }

append2 :: forall a n m. Vec n a -> Vec m a -> Vec (Plus n m) a
append2 xs ys = getAppended2 (ixrefold f g (Swap xs)) ys where
    -- same as alg in 'append'
    f :: VecF a (Appended2 m a) j -> Appended2 m a j
    f NilF          = Append2 id
    f (ConsF z rec) = Append2 $ \zs -> z ::: getAppended2 rec zs

    -- 'project'
    g :: Swapped Vec a j -> VecF a (Swapped Vec a) j
    g (Swap VNil)       = NilF
    g (Swap (z ::: zs)) = ConsF z (Swap zs)
```

You may note that one can implement `append` by induction over the length; that's how `vec` implements it. Theoretically it is not right, and the `IxFix` formulation highlights it:

```
append3 :: forall a n m. SNatI n
        => Vec' a n -> Vec' a m -> Vec' a (Plus n m)
append3 xs ys = getAppended3 (induction caseZ caseS) xs where
    caseZ :: Appended3 m a 'Z
    caseZ = Append3 (\_ -> ys)

    caseS :: Appended3 m a p -> Appended3 m a ('S p)
    caseS rec = Append3 $ \(IxFix (ConsF z zs)) ->
        cons z (getAppended3 rec zs)

-- Note: this is different than Appended!
newtype Appended3 m a n =
    Append3 { getAppended3 :: Vec' a n -> Vec' a (Plus n m) }
```

Here we *pattern match* on an `IxFix` value. If we want to treat it as a least fixed point, the only valid elimination is to use `foldIxFix`!

However, induction over the length is the right approach if `Vec` is defined as a data or type family:
```
type family VecFam (a :: Type) (n :: Nat) :: Type where
VecFam a 'Z = ()
VecFam a ('S n) = (a, VecFam a n)
```

Whether you want a data family, a type family or a GADT depends on the application. (Even in Agda or Coq.) The family variant doesn't intrinsically know its length, which is sometimes a blessing, sometimes a curse. For what it's worth, the `vec` package provides both variants, with almost the same module interface.

`IxFix` can also be used to define polymorphically recursive types like

```
data Nested a = a :<: (Nested [a]) | Epsilon
infixr 5 :<:

nested :: Nested Int
nested = 1 :<: [2,3,4] :<: [[5,6],[7],[8,9]] :<: Epsilon
```

A length function defined over this datatype will be polymorphically recursive, as the type of the argument changes from `Nested a` to `Nested [a]` in the recursive call:

```
-- >>> nestedLength nested
-- 3
nestedLength :: Nested a -> Int
nestedLength Epsilon    = 0
nestedLength (_ :<: xs) = 1 + nestedLength xs
```

We cannot represent `Nested` as a `Fix` of some functor, and we cannot use `recursion-schemes` either. However, we can redefine `Nested` as an indexed fixed point.

An important observation is that we (often or always?) use polymorphic recursion as a solution to the lack of indexed types. My favorite example is de Bruijn indices for well-scoped terms. Compare

```
data Expr1 a
= Var1 a
| App1 (Expr1 a) (Expr1 a)
| Abs1 (Expr1 (Maybe a))
```

and

```
data Expr2 a n
= Free2 a -- split free and bound variables
| Bound2 (Fin n)
| App2 (Expr2 a n) (Expr2 a n)
| Abs2 (Expr2 a ('S n)) -- extend bound context by one
```

Which one is *simpler* is a really good discussion, but for another time.

In the `Nested` example the single type argument is also used for two purposes: the type of a base element (`Int`) and the container type (which starts as `Identity` and grows by an extra list layer at each level).

One approach is to just use a `Nat` index and have a type family^{4}

```
type family Container (n :: Nat) :: Type -> Type where
Container 'Z = Identity
Container ('S n) = Compose [] (Container n)
```

or

```
data NestedF a rec f
    = f a :<<: rec (Compose [] f)
    | EpsilonF

instance IxFunctor (NestedF a) where
    ixmap _ EpsilonF    = EpsilonF
    ixmap f (x :<<: xs) = x :<<: f xs
```

We can convert from `Nested a` to `IxFix (NestedF a) Identity` and back. We use `coerce` to help with the `newtype` plumbing.

```
convert :: Nested a -> IxFix (NestedF a) Identity
convert = aux . coerce where
    aux :: Nested (f a) -> IxFix (NestedF a) f
    aux Epsilon    = IxFix EpsilonF
    aux (x :<: xs) = IxFix (x :<<: aux (coerce xs))

-- back left as an exercise
```

And then we can write `nestedLength` as a fold.

```
-- >>> nestedLength2 (convert nested)
-- 3
nestedLength2 :: IxFix (NestedF a) f -> Int
nestedLength2 = getConst . foldIxFix alg where
    alg :: NestedF a (Const Int) ~> Const Int
    alg EpsilonF         = Const 0
    alg (_ :<<: Const n) = Const (n + 1)
```

In the introduction I mentioned an ordinary list, which is a fixed point: `List = λ a. μ r. 1 + a × r`, where I use `μ` to represent least fixed points. Note that we first introduce a type parameter with `λ`, and then make a fixed point with `μ`.

We can define an ordinary list using `IxFix`, by taking a fixed point of a `Type -> Type` thing, i.e. first `μ`, and then `λ`: `List = μ f. λ a. 1 + a × f a`.

```
data ListF1 rec a = NilF1 | ConsF1 a (rec a)

type List1 = IxFix ListF1

fromList1 :: [a] -> List1 a
fromList1 []     = IxFix NilF1
fromList1 (x:xs) = IxFix (ConsF1 x (fromList1 xs))
```

Compare to Agda code:

```
-- parameter
data List (A : Set) : Set where
  nil  : List A
  cons : A -> List A -> List A

-- index
data List : Set -> Set where
  nil  : (A : Set) -> List A
  cons : (A : Set) -> A -> List A -> List A
```

These types are subtly different. See https://stackoverflow.com/questions/24600256/difference-between-type-parameters-and-indices.

This gives a hint why `Agda` people define `Vec (A : Set) : Nat -> Set`, i.e. with the length as the last argument: because you have to do it that way. And Haskellers (usually) define `Vec (n :: Nat) (a :: Type)`, because then `Vec` can be given `Functor` etc. instances. In other words, the machinery in both languages forces an order of type arguments.

Finally, we can write a parametric version of `List` using `IxFix` too. We just use a dummy, boring index.

```
data ListF2 a rec (unused :: ()) = NilF2 | ConsF2 a (rec unused)

type List2 a = IxFix (ListF2 a) '()

fromList2 :: [a] -> List2 a
fromList2 []     = IxFix NilF2
fromList2 (x:xs) = IxFix (ConsF2 x (fromList2 xs))
```

`IxFix` is more general than `Fix`, but if you don't need the extra power, maybe you shouldn't use it.

Do we need something even more powerful than `IxFix`? I don't think so. If we need more (dependent) indices, we can pack them all into a single index by tupling (or Σ-ing) them.

We have seen `IxFix`, the fixed point of an indexed functor. I honestly do not think that you should start looking in your code base for places to use it. I suspect it is more useful as a thinking and experimentation tool. It is an interesting gadget.

I’m sorry that the tilde `~>` and dash `->` arrows look so similar.↩︎

Note that `FFunctor` in https://hackage.haskell.org/package/hkd-0.1/docs/Data-HKD.html (which is defined with different names in other packages as well) is of a different kind. `IxFunctor` in https://hackage.haskell.org/package/indexed-0.1.3/docs/Data-Functor-Indexed.html is again different. Sorry for the proliferation of various functors. And for the confusing terminology. Dominic Orchard et al. use the terms graded (`k -> Type`, this post) and parameterised (`k -> k -> Type`, the `indexed` package) in https://arxiv.org/abs/2001.10274v2. There is no monad-name for the `hkd`-package variant, as that cannot be made into a monad-like thing.↩︎

You may wonder why the function name is `append`, but the operation is concatenation? This is similar to having `plus` for addition.↩︎

Here one starts to wish that GHC had unsaturated type families, so we wouldn’t need to use newtypes...↩︎

There are various practices of authoring patches or commits in version control systems. If you are, like me, annoyed by *fix typo* fix-up commits in the pull or merge requests you get at work or in an open source project, or if you simply get too many contributions (which is a good place to be, tell me how to get there), then `git-badc0de` is the tool for you.

The problem is clearly that it is too easy to make new commits. The solution is to make creating commits harder. Git developers make the `git` interface saner with each release, and there are various tools (like GitHub web editing) which make writing *fix typo* commits child's play.

`git-badc0de` (GitHub: phadej/git-badc0de) takes an out-of-the-box approach. Instead of trying to encourage (or force) humans to put more effort into each commit, it makes their machines do the work.

`git-badc0de` takes the `HEAD` commit and creates an altered copy, such that the commit hash starts with some prefix (by default `badc0de`). Obviously, I use `git-badc0de` in the development process of `git-badc0de` itself. Check the tail of `git log`:

```
badc0dea Add CONTRIBUTING.md
badc0ded Make prefix configurable.
badc0de4 Add message to do git reset
badc0de6 Comment out some debug output
badc0de5 Initial commit
```

It's up to the project owners to decide how long a prefix they want to have. Seven base16 characters (i.e. 28 bits out of 160) is doable on modern multi-core hardware in a minute, with good luck in less^{1}. These seconds are important. They are an opportunity to reflect, maybe even notice a typo in the commit message. Modern machines are so fast, and even some compilers too^{2}, that we don't pause and think about what we have just done.

*Git is a content-addressable file system* is how a chapter in the Pro Git book on Git objects starts. Very nice model, very easy to tinker with. You can try `git cat-file -p HEAD` in your current Git project to see the `HEAD` commit object data. In `git-badc0de` one commit looks like:

```
tree 91aaad77e68aa7bf94219a5b9cea97f26e2cce2b
parent badc0dea0106987c4edfb1d169b5a43d95845569
author Oleg Grenrus <oleg.grenrus@iki.fi> 1596481157 +0300
committer Oleg Grenrus <oleg.grenrus@iki.fi> 1596481157 +0300

Rewrite README.md

PoW: HRYsAAAAAAF
```

The Git commit hash is a hash of a header and these contents. The header for a commit object looks like

`commit <content length as ASCII number><NUL>`
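
To make the header concrete, here is a sketch (not code from `git-badc0de`) building exactly the bytes Git hashes for an object. The SHA-1 step itself would come from a package such as `cryptohash-sha1`, so it is left out; note also that the length must be the *byte* length of the content:

```haskell
-- Git hashes "<type> <byte length>\NUL<content>"; a commit's hash is
-- the SHA-1 of these bytes. String is used here for brevity; real
-- code would use ByteString and count bytes, not Chars.
gitObjectBytes :: String -> String -> String
gitObjectBytes typ content =
    typ ++ " " ++ show (length content) ++ "\NUL" ++ content
```

Appending a `PoW: ...` salt line to the commit content changes both the length field and the resulting digest, which is what the search loop exploits.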

`git-badc0de` takes the most recent commit data and, by adding some `PoW: DECAFC0FFEE` *salt* to the end, tries to find a variant which makes the commit hash have the correct prefix. It takes 11 characters to encode 64 bits in base64. Why base64? No particular reason. Based on this StackOverflow answer we could put the salts into commit headers, to hide them from `git log`. Something to do in the future.

When a valid salt is found, `git-badc0de` writes the new commit object to the Git object store. At this point nothing is changed; there is only a new dangling object inside the `.git` directory. You can reset your branch to point to the new commit with `git reset`, and `git-badc0de` invites you to do so.

Yes, I'm dead serious (*No*). But I had fun implementing `git-badc0de`. I was surprised that getting seven characters "right" in a hash is such an easy job. It causes nice artifacts in the GitHub web interface.

The top commit shown on the project main page is always `badc0de`

...

... and in fact all commits seem to have the same hash (prefix)...

Note how the command line `git log` is smart enough to show enough characters to make the prefixes unambiguous. It is deliberate; check some of your smaller projects, where `git log --oneline` probably prints seven-character abbreviations. In GHC (the Haskell compiler) `git log --oneline` prints ten characters for me (GitHub still shows just seven, so I assume it is hardcoded).

We can also use `git-badc0de` to produce commits with ordered hashes! The downside is that you have to decide the maximum commit count at the start. Yet it should be enough for about any project. See the ordered branch, isn't that cool!?

How is `git-badc0de` implemented? I have to confess: I started with a Python prototype. Python comes with all the pieces needed, though I essentially only needed `hashlib`.

The Haskell implementation has eleven dependencies at the moment of writing. Five of them are bundled with the compiler; the other six are not. Even for some basic tasks you have to go package shopping:

- `async` to parallelize computations
- `base16-bytestring`
- `cryptohash-sha1`
- `primitive` to write some low-level code
- `utf8-string` to convert between the UTF-8 `ByteString` representation and `String`
- `zlib` to compress git objects, as they are stored compressed

My first Haskell implementation was noticeably faster than the Python3 version. I suspect that is because Haskell is simply better at gluing bytes together.

The motivation to use Haskell had two parts:

- I just use Haskell for everything. (Except for prototyping silly ideas). This is the most important reason.
- Haskell is good for writing parallel programs. This is a bonus.

To my surprise, my first Haskell parallelization attempt didn't work at all. The idea is to spawn multiple workers, each trying different salts, and then race them until one worker finds a valid salt. Adding more workers should not slow down the overall program, minus maybe some small managerial overhead.

The overhead turned out to be quite large. Parallelism in Haskell works well when you deal with Haskell "native" data. The `git-badc0de` use case is, however, gluing bytes (`ByteString`s) together and calling out to a C implementation of the SHA-1 algorithm.

The nasty detail of, I guess, any higher-level language is that the foreign function interface has its pitfalls. I ran into a `foreign import unsafe` issue. You may read about `foreign import unsafe` in the excellent GHC User's Guide.

GHC guarantees that garbage collection will never occur during an `unsafe` call, ...

With many threads generating some amount of garbage, but also calling `unsafe` foreign functions in a tight loop, problems arose. Surprisingly, both `cryptohash-sha1` and `bytestring` use plenty of `unsafe` calls (`cryptonite` does too).

My solution was to redo the loop: less garbage generation and fewer foreign calls.

`cryptohash-sha1` (and `cryptonite`) import `_init`, `_update` and `_finalize` C functions. The hashing context is allocated and the plumbing done in Haskell. However, we can set things up in such a way that we pass a single continuous block of memory to be hashed. This functionality is missing from the library, so I copied `cbits/` from `cryptohash-sha1` and added a small C function, to do the C plumbing in C:

```
void
hs_cryptohash_sha1(const uint8_t *data, size_t len, uint8_t *out)
{
    struct sha1_ctx ctx;
    hs_cryptohash_sha1_init(&ctx);
    hs_cryptohash_sha1_update(&ctx, data, len);
    hs_cryptohash_sha1_finalize(&ctx, out);
}
```

This way we can have one `safe` foreign call to calculate the hash. We have to make sure that the pointers point at pinned memory, i.e. memory which the garbage collector won't move.

Next, the same problem exists in the `bytestring` library. I was naive to think that, as the byte data I work with is so small, concatenating it (and thus `memcpy`ing) wouldn't be noticeable; hashing should dominate. Usually it isn't a problem, but as the copying was done on each loop iteration and `memcpy` is a `foreign import unsafe` in the `bytestring` library, it also contributed to the slowdown. That was my hypothesis.

Figuring out how to do better with `bytestring` seemed difficult, so I opted for a different solution: write some C-in-Haskell. Now each worker creates its own *mutable* template, which is updated with a new salt on each loop iteration. The salt length is fixed, so we don't need to change the commit object header. As a bonus, the loop became essentially non-allocating (I didn't check though).

After that change, `git-badc0de` started to use all the cores, and not just spin in GC locks. The runtime system statistics are nice to look at:

```
                                 Tot time (elapsed)  Avg pause  Max pause
Gen  0       0 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
Gen  1       1 colls,     0 par    0.000s   0.000s     0.0004s    0.0004s
...
Productivity 100.0% of total user, 99.9% of total elapsed
```

No time is spent in garbage collection. Productivity is the share of time used to do actual work rather than collecting garbage. Disclaimer: it seems that waiting for GC locks is not counted towards GC time, but as there was only a single collection, that doesn't matter.

I could optimize further: as the salt is at the end of the content, it is silly to rehash the whole commit object every time. Yet `git-badc0de` is a silly project to begin with, and I am satisfied with its current state.

The lesson here is that the foreign function interface (FFI) is not easy; you have to think and test.

"Luckily" I had learned about the `unsafe` issue recently in `postgresql-libpq`, so I was able to suspect it as the cause of my problems. In this case, `unsafe` doesn't mean "I know what I'm doing" (as e.g. with `unsafePerformIO`), but rather the opposite.

Also, I don't think that we (= the Haskell ecosystem) have good tooling to benchmark how code behaves in highly parallel environments. I *hope* that `Data.ByteString.Builder`, for example, doesn't use any `unsafe` foreign calls. The ecosystem relies on that module for constructing JSON (in `aeson`) and HTML (in both `blaze-markup` and `lucid`). Something for someone to test, maybe fix and document.

Does this mean that Haskell is crap, and the promise of easy parallel and concurrent programming is a lie, and we should all use Rust instead?

Well, no. In this isolated example Rust would probably shine. There are, however, other parts than the hashing loop even in this simple program, and they have to be written as well. There Haskell feels a lot like Python, in the sense that I can just write concise code which works.

Python was quite nice in the very early prototyping stage, as it happened to have all the needed functionality available in the REPL. The "early prototyping stage" lasted for maybe 10 or 15 minutes; that was enough to verify that the basic idea might work. With Haskell, you would need to restart the REPL to add a new library, losing all the context, which would have killed the flow. For some other "problem" I might start to prototype directly in Haskell. I have no experience with how nice a REPL experience Rust has.

If the `git-badc0de` project were to develop further, I would rewrite the hashing loop in C, instead of writing C-in-Haskell. Maybe. Or in Rust, if that were easier to set up. (GHC knows how to invoke a C compiler.)

Haskell is a great glue language, among many other great properties it has. Don't believe anyone who tells you otherwise.

There is prior art; this is not a novel idea. Just search the internet for *git commit prefix miner*, e.g.

- Rust version: https://github.com/gunnihinn/git-commit-mine
- Bash: https://github.com/franckverrot/git-mine-commit

or *git commit vanity hash*

- Python and OpenCL: https://github.com/tochev/git-vanity
- Ruby: https://github.com/mattbaker/git-vanity-sha

None of them is meta-used, i.e. uses itself on its own commits, so I cannot be sure that they work :)

Eight base16 characters (i.e. 4 bytes) took one and a half hours of CPU time, or "just" 5.5 minutes of wall clock time. I ran that experiment only a few times. Take a look at the deadc0de branch.↩︎

If your programming language of choice is a compiled one.↩︎