Trying F#
I’ve previously mentioned a passing interest in F# because it was sufficiently different from languages I knew and had many concepts that sounded immediately useful. It took me almost 2 years to take the plunge and implement a non-trivial app in F#. It took 3 tries to get something that felt usable and idiomatic. This is the first part of my journal of lessons learned.
Starting out
I didn’t buy any books or take any classes, but I did read every bit of literature online before writing anything at all. I had been following the F# weekly blog for some time and devoured every piece of content there. Honestly, the biggest kick to get started was just straight up reading the MSDN documentation for the language. Some of it felt like going through the toolbox of an alien spaceship, where I had no idea what was going on but it still looked really cool. F# is a very large language in contrast with the last language I picked up. It felt like most of the documentation ink was directed towards the corner cases for C# interop. I tried to focus on what I saw as the core of the language: the collection functions.
The first task I took on as a learning project was to port an existing C++14 project. I planned to keep about 80% of the original functionality and add 20% more functionality than the original. What I was really looking to find out was whether the F# version of the program would be quicker to create, and then easier to extend and maintain, than if I added that 20% of functionality to the original program. The program’s functionality was well defined and self-contained. It receives streaming scientific data from a key-value store and then writes results back to that same store. It needs to maintain some internal state to perform analysis on values that are streamed in and have a sub-second response time. This worked well enough in C++, but the solution needed to grow in ways that would require changing major components and code organization. Conceptually, most of the business logic was simple enough to be checked by a quick spreadsheet, but the C++ code used many layers of abstractions to hide the implementation details and provide a core computational model that was closer to the domain. These layers of abstractions were costly to maintain, but did a good job of keeping the code clean. Another draw of a functional-first language was better support for accessible high-level abstractions of data flow, storage and representations. The C++ code base is tiny, about 35k lines in total, with only about 13k in the core application library. It has 80% code coverage and a separate integration test harness written in Python. The goal was to end up with an order of magnitude less code with higher test coverage and less verbose abstractions.
From C++ and Python
Across the board I’ve been doing more Python and less C++. Python’s been consistently easier/faster to get projects from start to shipped and the ecosystem provides fantastic support for my domain. My ongoing C++ work has been taking many cues from my favorite bits of Python: lots of maps, lambdas, anemic objects, and fewer explicit type specifications. The C++ code has been entirely high-level application logic, so much of the control and speed offered by that platform has been wasted while I jump through hoops for transformations that Python includes out of the box. The only thing I’ve come to dislike about Python is the gap between the number of errors caught by a linter and basic unit tests vs. errors completely prevented by static types, which is part of the reason why the strong F# types looked promising in the first place.
Naive Conversion
At the start I still didn’t quite understand how to architect a medium-sized program within F#, so I copied my OO-like solution. I had types with mostly mutable data members, member functions to mutate them, and some events to connect them. I thought this was good since I was setting up class compositions without inheritance in a manner similar to my functional-style C++. It wasn’t obvious how to even structure the project layout, so I was adding and moving files around just to get it to compile. Because F# compiles files in a fixed order and code can only reference definitions that appear earlier, the OO thinking of classes relying on each other quickly turned into layers defined by files. Once I figured this out I sometimes wanted to make it even more restrictive with folders where modules on the same level couldn’t see each other, but it was easy enough to just not include types and functions within modules where they aren’t needed. The clean separation continued with F#’s great first-class support for actors/mailboxes. The previous solution used Qt signals to communicate work between objects on different threads, and F# made this pattern much cleaner in terms of setup and management. F# didn’t force me to make major changes to the program’s organization, and the changes I did need to make were easy and necessary.
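To give a feel for the actor/mailbox pattern, here is a minimal sketch using `MailboxProcessor` from the F# core library. It is not the project’s code: the `Message` type, the `startCounter` function and the sample keys are hypothetical names chosen for illustration.

```fsharp
// Hypothetical message type; these names are illustrative only.
type Message =
    | Sample of key: string * value: float
    | GetCount of AsyncReplyChannel<int>

// The agent owns its state (a running count) and handles one message
// at a time, so no locks are needed around that state.
let startCounter () =
    MailboxProcessor<Message>.Start(fun inbox ->
        let rec loop count =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Sample _ -> return! loop (count + 1)
                | GetCount reply ->
                    reply.Reply count
                    return! loop count
            }
        loop 0)

let agent = startCounter ()
agent.Post(Sample("temperature", 21.5))
agent.Post(Sample("pressure", 101.3))

// Messages are processed in the order they were posted,
// so both samples are counted before the reply is sent.
let seen = agent.PostAndReply(fun reply -> GetCount reply)
```

Any thread can call `Post`, which plays roughly the role that emitting a signal to an object on another thread played in the Qt version.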
The code style of the previous solution translated cleanly because I had already separated my data from the transformations, so most of the work was just figuring out the syntax and looking up the best functions to use. I quickly fell in love with pattern matching and discriminated unions, which were great replacements for the switching and routing logic I had with my previous data model. I picked up the syntax and the core functions on collections quickly, but it didn’t change my thinking on how to solve problems with this new set of tools at my disposal. Most of the time it was just shorthand for what I had previously done with 5-10 line snippets, but it did eventually add up to much less code. I got so wrapped up with the new features that it wasn’t long before I lost sight of what the real strengths of the language were. I didn’t feel impressed with F# as a whole because I wasn’t doing anything different with it. Just porting OO code to a new syntax doesn’t provide much additional flexibility, although it was significantly shorter. I’d compare it to playing with Duplos instead of Legos; the blocks were bigger and easier to snap together, but I had to do just as much work if I wanted the code to implement an algorithm from C++ exactly. F# was pliable enough to write mostly-OO code, but it was more difficult to write OO-shaped code in F# than it was in C++. The barriers that F# syntax put up around things like mutation and type-centric methods made my solution obviously inelegant. The experience of a few very complicated methods for things that were simpler in C++ was a very big hint that I was doing it wrong.
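As an illustration of how a discriminated union plus pattern matching can replace switch-style routing, here is a small sketch. The `Reading` type and its cases are invented for the example and are not taken from the original data model.

```fsharp
// Hypothetical data model: each case carries only the data it needs.
type Reading =
    | Temperature of celsius: float
    | Pressure of kilopascals: float
    | Missing

// The compiler warns if a case is left unhandled, which a
// switch over an integer tag would not do.
let describe reading =
    match reading with
    | Temperature c when c > 100.0 -> "temperature out of range"
    | Temperature c -> sprintf "temperature %.1f C" c
    | Pressure p -> sprintf "pressure %.1f kPa" p
    | Missing -> "no reading"

let example = describe (Temperature 21.5)   // "temperature 21.5 C"
```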
I did start to use F#-specific features that had no direct equivalent in C++. I had read enough on the language to have a long list of features that I thought would be really useful and I planned my solution heavily around trying to leverage them as much as possible. I was most interested in Units of Measure, which sounded great for the application’s scientific domain. The rest of the architecture heavily utilized options, events, and actors. With the OO architecture these didn’t all fit together as I had hoped. I could see the potential of the features and it started to change how much of the control logic was structured, but it was constrained in strange ways, especially with the type inference. I often found that I was writing very complicated transformation functions within functions and still needed explicit annotations in order to keep the type system happy. This wasn’t any better than the hoops for organizing containers in C++ in order to perform higher level operations across them. The specific features of the language weren’t that useful without the right context, and creating/finding that context was harder than I anticipated.
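For readers who haven’t seen Units of Measure, a minimal sketch of the idea follows. The units `m` and `s` are declared here just for the example; they are not the units used in the project.

```fsharp
// Declare two units; they exist only at compile time.
[<Measure>] type m
[<Measure>] type s

let distance = 10.0<m>
let time = 4.0<s>

// The compiler infers float<m/s> for the result.
let speed = distance / time

// Mixing units is a compile-time error, so this line would not build:
// let nonsense = distance + time
```

The units are erased during compilation, so the values are ordinary floats at run time.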
I thought I had some experience with ’thinking functionally’, but I hadn’t done it in a statically typed functional-first language. I had used list comprehensions extensively in Python, and I loved how compact and flexible they could be when working with complex data structures. Within F#, I was often at a loss for how to iterate a subset of a collection without creating multiple copies. Also, I found it annoying to specify and convert whether the collection was a List, Array, Map or Seq, if all I wanted to do within the function was iterate it. These additional type constraints didn’t help me at first when I wasn’t comfortable choosing the tradeoffs of the specific containers. As for other functional constructs, the options for function composition didn’t play well with how I was using objects to expose only minimal interfaces to data. It took quite a bit of refactoring to see that exposing and passing raw collections was far more flexible and powerful because it provided a common set of tools to use. I had once followed a similar strategy in Python, but it was error prone because I had to make many assumptions about the collection’s structure that could only be validated by complex tests. Once I figured out that the type removed the need for most assumptions like this, I started to really fall in love with the strong static type inference. I heavily relied on it to figure out the desired structure of my data. Instead of starting with a layout and then working out the transformations to get the necessary information, I started writing the business logic and then worked backwards to store the data in a way to easily enable those calculations! The requirement of structure for types and collections, while jarring at first, was a huge strength of the F# paradigm.
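One way around the List/Array/Seq friction described above is to annotate the parameter as `seq<'T>` when the function only iterates. The sketch below assumes a made-up function, `meanOfPositives`, purely to show the idea.

```fsharp
// Accepting seq<float> lets callers pass a list, an array or a set
// without converting first. The Seq functions are lazy, so the filter
// step does not build an intermediate collection.
let meanOfPositives (values: seq<float>) =
    values
    |> Seq.filter (fun v -> v > 0.0)
    |> Seq.average   // throws if no positive values remain

let fromList = meanOfPositives [ 1.0; -2.0; 3.0 ]    // 2.0
let fromArray = meanOfPositives [| 4.0; -1.0 |]      // 4.0
let fromSet = meanOfPositives (set [ 2.0; 6.0 ])     // 4.0
```

The tradeoff is that inside the function only sequence operations are available, so anything needing indexing or pattern matching on list structure still has to pick a concrete collection.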
I did quickly find that F# is full of compromises to provide OO and FP tools. F# has some great features, but it doesn’t cover up the implementation as well as Python does. I had to figure out the differences between ref and mutable the hard way because I had tried to use them interchangeably. I got caught a few times with object and array mutations leaking across actors, parse functions returning a pair with bool instead of an Option, and hitting C# exceptions with delegate interop. Most of this was because my first program setup relied heavily on internal mutation and other OO patterns. I often looked for more idiomatic ways to do things, but I sometimes fell back on what I knew well. For function scoping I thought I was being functional because I split my pure functions from my member functions that required the record definition. And to follow this pattern of separating functionality from data I avoided adding methods to records that made assumptions about properties that I couldn’t encode in the type system like ordering, length or frequency. The classes I created each held a single piece of state and provided accessors that called the pure functions to update that state internally. In OO, this kind of setup avoids tight coupling by making callers independent of the internal representation. I went even further and tried not to have my pure functions rely on the struct definitions, so I had many layers of routing and wrapping to translate between the object, the struct, and the pure function with value arguments. I thought it was a good design compromise at the time, but the separation of purity, data structures and mutability wasn’t well enforced by anything within the language. I was really hoping for more help from the type system for things like generic contracts, but instead I had to learn about the proper use of the inline keyword.
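The parse functions mentioned above are the .NET `TryParse` methods, which F# exposes as returning a `(bool, value)` pair. A small wrapper, sketched here under the hypothetical name `tryParseFloat`, turns that pair into an Option:

```fsharp
open System
open System.Globalization

// Double.TryParse has an out parameter; when it is omitted, F# returns
// it as the second element of a tuple alongside the success flag.
let tryParseFloat (text: string) =
    match Double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, value -> Some value
    | false, _ -> None

let good = tryParseFloat "3.5"   // Some 3.5
let bad = tryParseFloat "abc"    // None
```

The invariant culture is passed explicitly so that "3.5" parses the same way regardless of the machine’s regional settings.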
The OO-style organization eventually got very burdensome when I started adding more classes and complex relationships. I was trying to encode as much as I could within the type system, but it fell short when characterizing relationships among collections of values and contracts between types. I couldn’t find a good way to compose and connect the classes together without adding multiple routing layers to access the data. To make moving data between dependent classes easier I used events within the actor. It took quite a bit of effort to get this set up correctly, and again, I knew I had to be doing it wrong because the end result wasn’t that much cleaner than if I had just done a clean rewrite in C++. Once I got to this point I realized how much F# projects were second-class citizens in the Visual Studio ecosystem. The project and included code stubs were minimal, most of the code analysis features were C# specific, and I was writing some very obvious utility functions to help with options and function composition. Even with community extensions the tooling wasn’t there for automatic refactoring and debugging. Automatic formatting would routinely fail on code that compiled cleanly and there wasn’t a great way to set breakpoints within nested functions.
My F# code was far shorter than the equivalent C++ code, but uglier. The C++ program required so many layers that, while each one was very clean by itself, the real complexity was seeing through the layers of abstraction to see which objects were really involved and how tedious it was to cross those boundaries. Visual Studio did a great job of making that easier to manage with the ability to easily jump and peek between files, whereas I feel the only parts of the IDE I used in F# were the hover-over type hints and squiggly line errors. I only have minimal C# experience, but it looked like Visual Studio and other commercial extensions gave it enough code generation and refactoring tools to take that routing pain away while keeping everything clean and separated. I started wondering if F# was as good a fit as I’d hoped for a project that I thought matched its strengths and features.
Results
I didn’t finish the project in this state. I got the most complex component working before losing steam. As a first try, I enjoyed doing something new, but I knew I could do better. In summary:
- It wasn’t difficult to just start building things with F#
- The necessary syntax and keywords are pretty minimal
- You can find C#/.NET peeking through the covers
- A rewrite/port offers some interesting perspective
- Fresh syntax doesn’t immediately expose the architectural implications
- An example influences the conceptual problem more than the implementation
- Good design choices transcend language feature differences
To be continued…