Modelling data relationships with C# types

A C# example implementation of Ghosts of Departed Proofs. This article continues where Modelling data relationships with F# types left off. It ports the F# example code to C#. If you don't read F# source code, you may instead want to read Implementing rod-cutting to get a sense of the problem being addressed. I'm going to assume that you've read enough of the previous articles to get a sense of the example, but in short, this article examines if it's possible to use the type system to model data relationships. Specifically, we have methods that operate on a collection and a number. The precondition for calling these methods is that the number is a valid (one-based) index into the collection. While you would typically implement such a precondition with a Guard Clause and communicate it via documentation, you can also use the Ghosts of Departed Proofs technique to instead leverage the type system. Please see the previous article for an overview. That said, I'll repeat one point here: The purpose of these articles is to showcase a technique, using a simple example to make it, I hope, sufficiently clear what's going on. All this machinery is hardly warranted for an example as simple as this. All of this is a demonstration, not a recommendation. Size proofs # As in the previous article, we may start by defining what a 'size proof' looks like. In C#, it may idiomatically be a class with an internal constructor. public sealed class Size {     public int Value { get; }     internal Size(int value)     {         Value = value;     }     // Also override ToString, Equals, and GetHashCode... } Since the constructor is internal it means that client code can't create Size instances, and thereby client code can't decide a concrete type for the phantom type T. Issuing size proofs # How may client code create Size objects? It may ask a PriceList object to issue a proof: public sealed class PriceList {     public IReadOnlyCollection Prices { get; }     internal PriceList(IReadOnlyCollection prices)     {         Prices = prices;     }     public Size? TryCreateSize(int candidate)     {         if (0  0)     {         sizes.Add(cuts[size.Value - 1].Size);         size = new Size(size.Value - cuts[size.Value - 1].Size.Value);     }     return sizes; } This is another instance method on PriceList, which is where T is defined. Proof-based revenue API # Finally, we may also implement a method to calculate the revenue from a given sequence of cuts. public int CalculateRevenue(IReadOnlyCollection cuts) {     var arr = Prices.ToArray();     return cuts.Sum(c => arr[c.Value - 1]); } Not surprisingly, I hope, CalculateRevenue is another instance method on PriceList. The cuts will typically come from a call to Solve, but it's entirely possible for client code to create an ad-hoc collection of Size objects by repeatedly calling TryCreateSize. Running client code # How does client code use this API? It calls an Accept method with an implementation of this interface: public interface IPriceListVisitor {     TResult Visit(PriceList priceList); } Why 'visitor'? This doesn't quite look like a Visitor, and yet, it still does. Imagine, for a moment, that we could enumerate all types that T could inhabit. TResult Visit(PriceList priceList); TResult Visit(PriceList priceList); TResult Visit(PriceList priceList); // ⋮ TResult Visit(PriceList priceList); Clearly we can't do that, since T is infinite, but if we could, the interface would look like a Visitor. I find the situation sufficiently similar to name the interface with the Visitor suffix. Now we only need a class with an Accept method. public sealed class RodCutter(IReadOnlyCollection prices) {     public TResult Accept(IPriceListVisitor visitor)     {         return visitor.Visit(new PriceList(prices));     } } Client code may create a RodCutter object, as well as one or more classes that implement IPriceListVisitor, and in this way interact with the library API. Let's see some examples. We'll start with the original CLRS example, written as an xUnit.net test. [Fact] public void ClrsExample() {     var sut = new RodCutter([1, 5, 8, 9, 10, 17, 17, 20, 24, 30]);     var actual = sut.Accept(new CutRodVisitor(10));     var expected = new[] {         ( 1,  1),         ( 5,  2),         ( 8,  3),         (10,  2),         (13,  2),         (17,  6),         (18,  1),         (22,  2),         (25,  3),         (30, 10)     };     Assert.Equal(expected, actual); }

Feb 3, 2025 - 20:36
 0
Modelling data relationships with C# types

A C# example implementation of Ghosts of Departed Proofs.

This article continues where Modelling data relationships with F# types left off. It ports the F# example code to C#. If you don't read F# source code, you may instead want to read Implementing rod-cutting to get a sense of the problem being addressed.

I'm going to assume that you've read enough of the previous articles to get a sense of the example, but in short, this article examines if it's possible to use the type system to model data relationships. Specifically, we have methods that operate on a collection and a number. The precondition for calling these methods is that the number is a valid (one-based) index into the collection.

While you would typically implement such a precondition with a Guard Clause and communicate it via documentation, you can also use the Ghosts of Departed Proofs technique to instead leverage the type system. Please see the previous article for an overview.

That said, I'll repeat one point here: The purpose of these articles is to showcase a technique, using a simple example to make it, I hope, sufficiently clear what's going on. All this machinery is hardly warranted for an example as simple as this. All of this is a demonstration, not a recommendation.

Size proofs #

As in the previous article, we may start by defining what a 'size proof' looks like. In C#, it may idiomatically be a class with an internal constructor.

public sealed class Size<T>
{
    public int Value { get; }
 
    internal Size(int value)
    {
        Value = value;
    }
 
    // Also override ToString, Equals, and GetHashCode...
}

Since the constructor is internal it means that client code can't create Size<T> instances, and thereby client code can't decide a concrete type for the phantom type T.

Issuing size proofs #

How may client code create Size<T> objects? It may ask a PriceList<T> object to issue a proof:

public sealed class PriceList<T>
{
    public IReadOnlyCollection<int> Prices { get; }
 
    internal PriceList(IReadOnlyCollection<intprices)
    {
        Prices = prices;
    }
 
    public Size<T>? TryCreateSize(int candidate)
    {
        if (0 < candidate && candidate <= Prices.Count)
            return new Size<T>(candidate);
        else
            return null;
    }
 
    // More members go here...

If the requested candidate integer represents a valid (one-indexed) position in the PriceList<T> object, the return value is a Size<T> object that contains the candidate. If, on the other hand, the candidate isn't in the valid range, no object is returned.

Since both PriceList<T> and Size<T> classes are immutable, once a 'size proof' has been issued, it remains valid. As I've previously argued, immutability makes encapsulation simpler.

This kind of API does, however, look like it's turtles all the way down. After all, the PriceList<T> constructor is also internal. Now the question becomes: How does client code create PriceList<T> objects?

The short answer is that it doesn't. Instead, it'll be given an object by the library API. You'll see how that works later, but first, let's review what such an API enables us to express.

Proof-based Cut API #

As described in Encapsulating rod-cutting, returning a collection of 'cut' objects better communicates postconditions than returning a tuple of two arrays, as the original algorithm suggested. In other words, we're going to need a type for that.

public sealed record Cut<T>(int RevenueSize<TSize);

In this case we can get by with a simple record type. Since one of the properties is of the type Size<T>, client code can't create Cut<T> instances, just like it can't create Size<T> or PriceList<T> objects. This is what we want, because a Cut<T> object encapsulates a proof that it's valid, related to the original collection of prices.

We can now define the Cut method as an instance method on PriceList<T>. Notice how all the T type arguments line up. As input, the Cut method only accepts Size<T> proofs issued by a compatible price list. This is enforced at compile time, not at run time.

public IReadOnlyCollection<Cut<T>> Cut(Size<Tn)
{
    var p = Prices.Prepend(0).ToArray();
    var r = new int[n.Value + 1];
    var s = new int[n.Value + 1];
    r[0] = 0;
    for (int j = 1; j <= n.Value; j++)
    {
        var q = int.MinValue;
        for (int i = 1; i <= ji++)
        {
            var candidate = p[i] + r[j - i];
            if (q < candidate)
            {
                q = candidate;
                s[j] = i;
            }
        }
        r[j] = q;
    }
 
    var cuts = new List<Cut<T>>();
    for (int i = 1; i <= n.Value; i++)
    {
        var revenue = r[i];
        var size = new Size<T>(s[i]);
        cuts.Add(new Cut<T>(revenuesize));
    }
    return cuts;
}

For good measure, I'm showing the entire implementation, but you only need to pay attention to the method signature. The point is that n is constrained by the type system to be in a valid range.

Proof-based Solve API #

The same technique can be applied to the Solve method. Just align the Ts.

public IReadOnlyCollection<Size<T>> Solve(Size<Tn)
{
    var cuts = Cut(n).ToArray();
    var sizes = new List<Size<T>>();
    var size = n;
    while (size.Value > 0)
    {
        sizes.Add(cuts[size.Value - 1].Size);
        size = new Size<T>(size.Value - cuts[size.Value - 1].Size.Value);
    }
    return sizes;
}

This is another instance method on PriceList<T>, which is where T is defined.

Proof-based revenue API #

Finally, we may also implement a method to calculate the revenue from a given sequence of cuts.

public int CalculateRevenue(IReadOnlyCollection<Size<T>> cuts)
{
    var arr = Prices.ToArray();
    return cuts.Sum(c => arr[c.Value - 1]);
}

Not surprisingly, I hope, CalculateRevenue is another instance method on PriceList<T>. The cuts will typically come from a call to Solve, but it's entirely possible for client code to create an ad-hoc collection of Size<T> objects by repeatedly calling TryCreateSize.

Running client code #

How does client code use this API? It calls an Accept method with an implementation of this interface:

public interface IPriceListVisitor<TResult>
{
    TResult Visit<T>(PriceList<TpriceList);
}

Why 'visitor'? This doesn't quite look like a Visitor, and yet, it still does.

Imagine, for a moment, that we could enumerate all types that T could inhabit.

TResult Visit(PriceList<Type1priceList);
TResult Visit(PriceList<Type2priceList);
TResult Visit(PriceList<Type3priceList);
// ⋮
TResult Visit(PriceList<TypeNpriceList);

Clearly we can't do that, since T is infinite, but if we could, the interface would look like a Visitor.

I find the situation sufficiently similar to name the interface with the Visitor suffix. Now we only need a class with an Accept method.

public sealed class RodCutter(IReadOnlyCollection<intprices)
{
    public TResult Accept<TResult>(IPriceListVisitor<TResultvisitor)
    {
        return visitor.Visit(new PriceList<object>(prices));
    }
}

Client code may create a RodCutter object, as well as one or more classes that implement IPriceListVisitor<TResult>, and in this way interact with the library API.

Let's see some examples. We'll start with the original CLRS example, written as an xUnit.net test.

[Fact]
public void ClrsExample()
{
    var sut = new RodCutter([1, 5, 8, 9, 10, 17, 17, 20, 24, 30]);
 
    var actual = sut.Accept(new CutRodVisitor(10));
 
    var expected = new[] {
        ( 1,  1),
        ( 5,  2),
        ( 8,  3),
        (10,  2),
        (13,  2),
        (17,  6),
        (18,  1),
        (22,  2),
        (25,  3),
        (30, 10)
    };
    Assert.Equal(expectedactual);
}

CutRodVisitor is a nested class that implements the IPriceListVisitor<TResult> interface:

private class CutRodVisitor(int i) :
    IPriceListVisitor<IReadOnlyCollection<(intint)>>
{
    public IReadOnlyCollection<(intint)> Visit<T>(PriceList<TpriceList)
    {
        var n = priceList.TryCreateSize(i);
        if (n is null)
            return [];
        else
        {
            var cuts = priceList.Cut(n);
            return cuts.Select(c => (c.Revenue, c.Size.Value)).ToArray();
        }
    }
}

The CutRodVisitor class returns a collection of tuples. Why doesn't it just return cuts directly?

It can't, because it wouldn't type-check. Think about it for a moment. When you implement the interface, you need to pick a type for TResult. You can't, however, declare it to implement IPriceListVisitor<Cut<T>> (where T would be the T from Visit<T>), because at the class level, you don't know what T is. Neither does the compiler.

Your Visit<T> implementation must work for any T.

Preventing misalignment #

Finally, here's a demonstration of how the phantom type prevents confusing or mixing up two (or more) different price lists. Consider this rather artificial example:

[Fact]
public void NestTwoSolutions()
{
    var sut = new RodCutter([1, 2, 2]);
    var inner = new RodCutter([1]);
 
    (intint)? actual = sut.Accept(new NestedRevenueVisitor(inner));
 
    Assert.Equal((3, 1), actual);
}

This unit test creates two price arrays and calls Accept on one of them (the 'outer' one, you may say), while passing the inner one to the Visitor, which at first glance just looks like this:

private class NestedRevenueVisitor(RodCutter inner) :
    IPriceListVisitor<(intint)?>
{
    public (intint)? Visit<T>(PriceList<TpriceList)
    {
        return inner.Accept(new InnerRevenueVisitor<T>(priceList));
    }
 
    // Inner visitor goes here...
}

Notice that it only delegates to yet another Visitor, passing the 'outer' priceList as a constructor parameter to the next Visitor. The purpose of this is to bring two PriceList<T> objects in scope at the same time. This will enable us to examine what happens if we make a programming mistake.

First, however, here's the proper, working implementation without mistakes:

private class InnerRevenueVisitor<T>(PriceList<TpriceList1) : IPriceListVisitor<(intint)?>
{
    public (intint)? Visit<T1>(PriceList<T1priceList2)
    {
        var n1 = priceList1.TryCreateSize(3);
        if (n1 is null)
            return null;
        else
        {
            var cuts1 = priceList1.Solve(n1);
            var revenue1 = priceList1.CalculateRevenue(cuts1);
 
            var n2 = priceList2.TryCreateSize(1);
            if (n2 is null)
                return null;
            else
            {
                var cuts2 = priceList2.Solve(n2);
                var revenue2 = priceList2.CalculateRevenue(cuts2);
 
                return (revenue1revenue2);
            }
        }
    }
}

Notice how both priceList1 and priceList2 are now both in scope. So far, they're not mixed up, so the Visit implementation queries first one and then another for the optimal revenue. If all works well (which it does), it returns a tuple with the two revenues.

What happens if I make a mistake? What if, for example, I write priceList2.Solve(n1)? It shouldn't be possible to use n1, which was issued by pricelist1, with priceList2. And indeed this isn't possible. With that mistake, the code doesn't compile. The compiler error is:

Argument 1: cannot convert from 'Ploeh.Samples.RodCutting.Size' to 'Ploeh.Samples.RodCutting.Size'

When you look at the types, that makes sense. After all, there's no guarantee that T is equal to T1.

You'll run into similar problems if you mix up the two 'contexts' in other ways. The code doesn't compile. Which is what you want.

Conclusion #

This article demonstrates how to use the Ghosts of Departed Proofs technique in C#. In some ways, I find that it comes across as more idiomatic in C# than in F#. I think this is because rank-2 polymorphism is only possible in F# when using its object-oriented features. Since F# is a functional-first programming language, it seems a little out of place there, whereas it looks more at home in C#.

Perhaps I should have designed the F# code to make use of objects to the same degree as I've done here in C#.

I think I actually like how the C# API turned out, although having to define and implement a class every time you need to supply a Visitor may feel a bit cumbersome. Even so, developer experience shouldn't be exclusively about saving a few keystrokes. After all, typing isn't a bottleneck.


This blog is totally free, but if you like it, please consider supporting it.