C++ Boost::Range vs .NET LINQ vs Scala

In my day-job most of my development is in C++ so I’m always comparing ideas from other languages against what’s available in C++. I’ve been experimenting with boost::range, a C++ library that makes it easier to compose algorithms in a functional style. I’m comparing it to .NET LINQ and to for-comprehensions in Scala.

My example: List of orders, each order has a discount and list of order-items. We want the total cost of all Bread items with discounts applied.

Scala for-comprehension

def orders: List[Order] = List(
    new Order( 25, List( 
        new OrderItem("Eggs", 12, 0.50), 
        new OrderItem("Bread", 4, 1.25) ) ),
    new Order( 50, List( 
        new OrderItem("Bread", 2, 1.25), 
        new OrderItem("Milk", 2, 0.40) ) )
)

test("Orders for-comprehension") {
    val totals = for ( 
        o <- orders;
        i <- o.orderItems
        if i.product == "Bread" )
      yield i.amount * i.price * (1-o.discountPercent/100)
    val sum = totals.sum

    assert( sum == 5.00, s"Sum: ${sum}" )
}
  
test("Orders flatMap") {
    val totals = 
      orders.flatMap( o => 
        o.orderItems.filter( i=> i.product == "Bread" ).
        map( i => i.amount * i.price * (1-o.discountPercent/100) ) )
    val sum = totals.sum
        
    assert( sum == 5.00, s"Sum: ${sum}" )
}

This took about 30 minutes to come up with an example and code it. It was pretty obvious how to approach given even a little Scala experience: we want the orders and order-items, filter on product, map to cost, and sum. Scala as a functional language pushes you towards for-comprehensions or map/filter functions.

.NET LINQ query

static IList orders = new List {
	new Order( 25, new List { 
		new OrderItem( "Eggs", 12, 0.50 ), 
		new OrderItem( "Bread", 4, 1.25) } ),
	new Order( 50, new List { 
		new OrderItem( "Bread", 2, 1.25 ), 
		new OrderItem( "Milk", 2, 0.40 )} )
};

[TestMethod]
public void Orders_Linq_Query()
{
	var costs =
		from order in orders
		from item in order.orderItems
		where item.product == "Bread"
		select item.amount * item.price * (1 - order.discountPercent / 100);
	var sum = costs.Sum();

	Assert.AreEqual(5.0, sum);
}

[TestMethod]
public void Orders_Linq_Methods()
{
	var sum = orders.SelectMany( order => order.orderItems.Select( item => Tuple.Create( order, item ) ) ).
		Where( t => t.Item2.product == "Bread").
		Select( t => t.Item2.amount * t.Item2.price * (1-t.Item1.discountPercent/100)).
		Sum();

	Assert.AreEqual(5.0, sum);
}

This version took about 15 minutes. It’s also pretty easy to see how to do it in LINQ. Method names are different than in Scala but the operations are basically the same. LINQ has quickly become the standard way to approach this type of data manipulation in C# so it’s the obvious approach. The support from Visual Studio and the C# compiler is also great, with popups showing expected parameters, the type system stopping many errors and when you make a mistake often getting a single error message giving a strong hint how to correct it. This makes it very quick to write, and lets you stay thinking about the result you want.

Aside: This exercise showed one of the biggest reasons for using Scala for-comprehension or .NET LINQ query syntax which I’d never quite understood before. Mostly I’ve used chained methods as you can do more, not just map/filter but take(n) to get just the first few items or sum() to get totals etc. Too often when I’ve started out using query syntax I’ve hit a wall and switched to methods.  But where they shine is in using joins and groups. Both Scala for-comprehensions and .NET LINQ query syntax avoid the complicated nesting needed when using flatMap/SelectMany to flatten or join collections and make it much easier to see you’re writing the correct query.

C++ with Boost Range

vector orders{
    { 25, {{ "Eggs", 12, 0.50 }, { "Bread", 4, 1.25 } } },
    { 50, {{ "Bread", 2, 1.25 }, { "Milk", 2, 0.40 } } }
};

TEST_METHOD( Orders_ImperativeLoops ) {
    double sum = 0.0;
    for ( const auto& order : orders ) {
        for ( const auto& item : order.orderItems ) {
            if ( item.product == "Bread" )
                sum += item.amount * item.price * (1.0 - order.discountPercent / 100.0);
        }
    }

    Assert::AreEqual( 5.0, sum );
}

TEST_METHOD( Orders_LoopAndAccumulate ) {
    double sum = 0.0;
    for ( const auto& order : orders ) {
        sum += accumulate( 
            order.orderItems 
            | filtered( []( const OrderItem& item ){
                return item.product == "Bread"; } )
            | transformed( function([&order]( const OrderItem& item ){ 
                return item.amount * item.price * (1.0 - order.discountPercent / 100.0); } ) ),
            0.0
        );
    }

    Assert::AreEqual( 5.0, sum );
}

TEST_METHOD( Orders_AccumulateTwice ) {

    struct SumOrderCost : public std::unary_function {
        SumOrderCost( string product ) : product( product )  {}
        double operator()( const Order& order ) const {
            return accumulate(
                order.orderItems
                    | filtered( [=]( const OrderItem& item ){
                        return item.product == product; } )
                    | transformed( function( []( const OrderItem& item ){
                        return item.amount * item.price; } ) ),
                0.0 )
                * (1.0 - order.discountPercent / 100.0);
        }
    private:
        string product;
    };

    const double sum = accumulate(
        orders | transformed( SumOrderCost( "Bread" ) ),
        0.0);
    Assert::AreEqual( 5.0, sum );
}

This took several hours because I played with different approaches as there’s no single obvious way to do it in C++.

Despite the STL being available for more than 15 years it’s not been the standard approach for most C++ programmers. This is partly due to C++ not having lambda functions until recently which made the STL harder to use. And partly due to C++ programmers commonly thinking at a very low level so they tend to focus on the implementation detail and hand-roll a loop instead of using a library function. This is changing, C++11 and a new emphasis on how the language is presented is changing what tools C++ programmers have at the top of their mental toolbox.

There are also issues with differences in tools. The strong C++ type system prevents many errors.  But the C++ compilers are slower.  I’m sure there are reasons for this, but after hearing Herb Sutter saying at Going Native 2013 that the C# language spec was just as long as the C++ language spec, it’s disappointing when the C++ compiler takes 10 times as long as C# to recompile after an edit, and any delays break a developer’s concentration, get them out of flow and reduce productivity a little.  Also whereas the C#  compiler gives few succinct errors for a syntax error, the C++ compiler gives a page full of long messages, giving a lot of detail but not giving a clear idea of what to change to fix it.  So the developer spends time to read and understand the message, then thinks how that relates to the code, then thinks of ways to change the code to fix it.  Again this extra time breaks developers out of flow and they are not thinking about the original problem, they are only thinking about implementation details.

But now to the biggest reason why I had more trouble with C++ Boost Range for this example: no equivalent of flatMap/SelectMany. For a map or a filter it is easy to see the equivalent:

  • map == Select == transformed
  • filter == Where == filtered.

Boost Range does a good job of allowing many simple operations. But for flatMap/SelectMany there’s nothing currently available so you have to hunt for alternatives. This stops you thinking at a high-level about what you want, and forces you down to a much lower-level considering how to do it. This context switching from one layer of abstraction to a lower layer slows the programmer down.

Boost Range makes it much simpler to compose the existing C++ STL algorithms. It starts with the same approach suggested by Bjarne Stroustrup in "The C++ Programming Language" of giving a simpler interface to common algorithms by passing the container instead of begin/end iterators. But adding Range Adapters takes it much closer to the fluid composability available in functional languages. This approach will not be appropriate in all environments where C++ is used, and the gaps from the standard functional feature set can cause delays finding workarounds, but it shows the approach holds promise.

I’ve only scratched the surface so far. There are other C++ functional libraries already out there to investigate, and plenty more bits of Boost Range to play with. But functional-style C++ has already become one of my standard tools.

Links:

No Comments

Leave a Reply

Your email address will not be published. Required fields are marked *