GM produced a self-driving car prototype…in 1958. There’s a short documentary about it. Required wires in the road rather than machine learning.
Presumably their PR machine said, "They’ll be commercially viable by 1959."
By editor
Gary Marcus addresses the nonsense in the popular press about Google’s LaMDA AI system being sentient.
He leads with a great quote, from Abeba Birhane, that sums up the whole thing.
we have arrived at peak AI hype accompanied by minimal critical thinking
By editor
Imperative vs functional programming. It’s a debate that goes back to the birth of high level languages—Fortran vs Lisp.
In later years, it was retreaded as object-oriented vs functional programming (OOP vs FP)—OOP having become the (massively) dominant software development paradigm.
And, I’m a fully paid-up member. I embraced Object Pascal via Delphi 1 in 1995 and have been on the train ever since. I now do a lot of development in C# and teach best practices.
But, just between you and me, I’ve never been truly happy with OOP. I understand the technology fully, but it’s never felt elegant to me. My adoption of Object Pascal had nothing to do with object-orientation. I was seduced by Borland’s state-of-the-art tooling—an IDE that was years ahead of its time. Object Pascal just came along for the ride.
On balance, I’ve found OOP to provide more pain than benefits. Take the three pillars of OOP:
Encapsulation is great. I’m fully on board with it. But, a module system that allows me to group/isolate my code under namespaces, and hide private code, achieves that. Most modern languages—OOP or FP—deliver on encapsulation.
The benefits of inheritance are massively oversold. I have the scars. I’ve battled the fragile base class problem too many times—grappling with someone’s ill-conceived OO hierarchy that looked compelling in UML.
As for polymorphism, well, again, I’ve no problem with this, but it doesn’t need OOP. Polymorphism can be achieved with lightweight interfaces.
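To make that concrete, here is a minimal sketch in Python (just a convenient language for short examples; HasArea, Circle and Square are invented names). typing.Protocol gives a lightweight, structural interface with no base class and no inheritance hierarchy:

```python
from typing import Protocol


class HasArea(Protocol):
    """A lightweight interface: anything with an area() method conforms."""
    def area(self) -> float: ...


class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


def total_area(shapes: list[HasArea]) -> float:
    # Polymorphic call with no inheritance hierarchy in sight.
    return sum(shape.area() for shape in shapes)


print(total_area([Circle(1.0), Square(2.0)]))  # 7.14159
```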
I’ve also failed to see many design benefits of using OOP. Mapping objects to the real world is an incredibly leaky abstraction. Once you get beyond the high-level design it’s positively unhelpful. It also doesn’t fit very well with TDD. Designing objects to map to the real world isn’t the same as creating testable classes.
The close coupling of data and behavior also feels unnatural. Maybe it’s my background as a data scientist, but I see code and data as separate things. My code is a pipeline through which data flows and is transformed.
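A toy sketch of what I mean, in Python (the Reading record and the temperature conversion are invented for illustration): the data is a plain, immutable value, and the behaviour lives in small, pure functions the data flows through.

```python
from dataclasses import dataclass


@dataclass(frozen=True)          # data: a plain, immutable record
class Reading:
    sensor: str
    fahrenheit: float


def remove_bad(readings):
    """Pure function: returns a new list, never mutates its input."""
    return [r for r in readings if r.fahrenheit > -100]


def to_celsius(readings):
    """Pure transformation of each reading."""
    return [(r.sensor, (r.fahrenheit - 32) * 5 / 9) for r in readings]


# The pipeline: raw data in one end, transformed data out the other.
raw = [Reading("a", 68.0), Reading("b", -999.0), Reading("c", 86.0)]
print(to_celsius(remove_bad(raw)))   # [('a', 20.0), ('c', 30.0)]
```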
OOP has, however, been wildly successful. It just seems that this success is a consequence of the significant education effort, impressive tooling and modelling techniques that have long been part of the OOP ecosystem. There’s a massive industry that supports, and is supported by, OOP.
FP isn’t new—Lisp dates from 1958. However, there’s been renewed interest in it in recent years. Much of this is down to growth in "parallel" environments, such as multi-core processors and distributed clusters.
Functional languages tend to be more naturally parallelizable. They encourage the use of immutable data structures which reduce the side-effects that make code hard to run on multiple processors.
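A small Python sketch of the idea (expensive_transform is a stand-in for real work): because the function is pure and nothing is mutated, the calls can be farmed out to separate processes with no locks and no shared state.

```python
from concurrent.futures import ProcessPoolExecutor


def expensive_transform(x: int) -> int:
    # Pure: the result depends only on the input, so parallel calls
    # cannot interfere with each other.
    return x * x


if __name__ == "__main__":
    data = range(10)
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(expensive_transform, data)))
    # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```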
Apache Spark, the current darling of the big data world, is written in the functional language Scala. There’s even a (Haskell-based) functional language for programming FPGAs (CλaSH).
Many of the major OOP languages are also adopting functional features. .NET has LINQ—and most of my C# code is now LINQ with object-oriented plumbing. Java 8 introduced Lambdas. Idiomatic JavaScript is increasingly functional…notwithstanding the introduction of classes in ES6. Swift is often talked about as a functional language. Two prominent modern languages, Rust and Go, avoid classes altogether.
FP is also a natural fit for data science work. R, a popular language amongst data scientists, is functional (as is Excel). Functional languages translate well to interactive, REPL (or playground) environments, making it easy to experiment with code/analysis.
A blog article expressing one guy’s opinion. Well, there’s a novelty. And, the flurry of interest in FP might be no more than fashion. Has anyone done any research?
When looking into this, I found a presentation given at Utah Valley University that pointed to some interesting experiments.
In one study a team at Yale asked teams to code solutions to a problem using a range of programming languages, including
Criteria used to evaluate the solutions were
Haskell, a functional language, was the clear winner. Given the possible variation in the skills of the teams, the study authors then had a graduate student learn Haskell for a week before attempting to code the solution. While not as effective as the experienced Haskell developers’ solution, the student’s submission came in second.
Now, this study was conducted in 1994, and development has moved on a long way since then. So…
Fast forward to 2014 and researchers at the University of California, Davis studied the following question:
What is the effect of programming languages on software quality?
To do this, they took a dataset from GitHub. This dataset covered
The projects were real-world products—such as Linux, MySQL, bitcoin, etc.
They concluded that
The emphasis on the first point is mine.
Obviously, at the end of the day, use whatever makes you most productive. All experienced developers come with a history (baggage?) that makes them more efficient with certain paradigms, languages, environments, frameworks and technologies—regardless of the objective merits of those technologies.
However, if you are an OOP developer who’s never given FP a serious look (i.e. used it to develop a real-world application), I recommend giving it a try. It’s no longer an academic curiosity. React, the most popular front-end library for web development, encourages FP.
We’ll benefit from continued research into the effectiveness of different programming languages. Having data is so much more useful than a barrage of strong opinions (of which I’m as guilty as the next dev).
By editor
Clustering is one of the most powerful and widely used of the machine learning techniques. It’s very seductive. Throw some data into the algorithm and let it discover hitherto unknown relationships and patterns.
k-means is the most popular of all the clustering algorithms. It’s easy to understand—and therefore implement—so it’s available in almost all analysis suites. It’s also fast. What’s not to like?
When people are first exposed to machine learning k-means clustering is one of the techniques that creates immediate excitement. They "get it" pretty quickly and start to wonder what it might show when they get back to the office and run it on their own data.
Let’s apply k-means to the following two-dimensional dataset.
If we ask the algorithm to identify four clusters we get
No surprise there. That’s good. The clusters are as clear as day. If the k-means algorithm suggested anything else we’d be unimpressed.
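If you want to play along at home, here is roughly how this looks in Python with scikit-learn. The original dataset isn’t reproduced here, so make_blobs stands in with four well-separated, similar-sized, roughly spherical groups.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Four well-separated, similar-sized, roughly spherical clusters.
X, _ = make_blobs(n_samples=400, centers=4, cluster_std=0.6, random_state=42)

labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)
print(np.bincount(labels))   # four clusters of roughly 100 points each
```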
However, the effectiveness of k-means rests on a number of (usually implicit) assumptions about your dataset. These assumptions match our intuition about what a cluster is—which makes them all the more dangerous. There are traps for the unwary.
Two assumptions made by k-means are that the clusters are spherical and that they are of a similar size.
Imagine manually identifying clusters on a scatter plot. You’d take your pen and circle distinct groups. That’s similar to how k-means operates. It identifies spherical clusters.
The assumption about similar-sized clusters is less intuitive. We’d have no problem manually identifying small, isolated, distinct clusters in a dataset. However, the optimization approach used by k-means—effectively minimizing the distance between all the points in each cluster—can lead it astray.
k-means lacks any judgement. When its simple rules fail it has no ability to reflect on the trade-offs.
Let’s see examples of k-means breaking spectacularly when we deviate from these assumptions.
Examine the following scatter plot.
Two clusters, right? Easy. One small ring surrounded by a larger ring. Clear separation between them.
However, only one ring is a spherical cluster—the inner one. If you drew a circle around the outer cluster/ring it would have to encompass the inner one. How will k-means handle this violation of one of its core assumptions?
Let’s see.
We know there are two clusters so we’ll help out by telling it that’s how many we’d like identified. Here’s what it finds.
Oh dear.
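Here is a rough reconstruction in Python, with scikit-learn’s make_circles standing in for the rings. The agreement score it prints is close to a coin toss, because k-means simply slices the plane in half and cuts straight through both rings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_circles

# Two concentric rings; the outer one is anything but a spherical cluster.
X, truth = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=42)

labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)

# Accuracy against the true ring membership (allowing for label swap).
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with the true rings: {agreement:.2f}")   # close to 0.5
```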
However, we can help the algorithm out. If we understand our domain—and we do in this simple case—we can transform the data into a form that adheres to the aforementioned assumptions.
As we are dealing with circles, if we transform our Cartesian (x vs y) coordinates to polar (arc vs radius) coordinates we end up with two distinct rectangular clusters. They have the same arc range, but are completely partitioned by their radii.
Running k-means on the transformed dataset gives us the following two clusters—displayed using the original Cartesian coordinates.
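Sketched in Python, continuing the make_circles stand-in from above: convert each point to a radius and an angle, then cluster on those. One practical note I’ve added: k-means is distance-based, so the angle is rescaled here to keep the two features on comparable scales.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_circles

X, truth = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=42)

# Cartesian (x, y) -> polar (radius, angle). The angle is rescaled to a
# 0-1 range so neither feature dominates the distance calculation.
radius = np.hypot(X[:, 0], X[:, 1])
angle = (np.arctan2(X[:, 1], X[:, 0]) + np.pi) / (2 * np.pi)
polar = np.column_stack([radius, angle])

# In polar space the rings become two bands, completely separated by radius.
labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(polar)

agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with the true rings: {agreement:.2f}")   # close to 1.0
```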
Perfect. The job of the data scientist is often to set the ball up so that the techniques can hit the back of the net.
Consider the following dataset.
Again there are two obvious clusters. One small, tightly grouped cluster and another, larger, more dispersed cluster. These are spatially grouped so no problem on that front.
Let’s use k-means to identify our two clusters.
Hmmm. Not good. What happened here?
k-means tries to produce "tight" clusters. In attempting to minimize the intra-cluster distances between the points in the large cluster it’s "overdone" things and produced two clusters that have similar intra-cluster distances. However, it’s clear that this is a terrible solution for our dataset.
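Here is a rough reconstruction of this failure mode in Python (the centres, spreads and sizes are invented; the post’s actual dataset isn’t available). A small, tight blob sits beside a large, dispersed one, and k-means drags the boundary into the big cluster so that the tight one steals its nearest edge.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# 50 points in a tight blob, 500 points in a large, dispersed blob.
X, truth = make_blobs(n_samples=[50, 500],
                      centers=[[0.0, 0.0], [6.0, 0.0]],
                      cluster_std=[0.3, 2.5],
                      random_state=42)

labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(X)

# A slice of the dispersed cluster ends up grouped with the tight one.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with the true grouping: {agreement:.2f}")   # noticeably below 1.0
```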
Unfortunately we can’t treat machine learning as a black box into which we shovel coal and expect diamonds at the other end. We need to understand the implicit and explicit assumptions in the tools we use and consider how they will impact our results.
k-means clustering is powerful. But it’s blind. And, occasionally, it can make spectacular mistakes. The same is true for all other machine learning methods. Use with caution.
There’s no substitute for being intimately familiar with your data. That’s why the best data science tends to be performed by those who have, or have access to, domain expertise.
By editor
Psychologist Bertram Forer gave 39 of his students a personality test. Each was given a personalised profile based on their answers.
Except they were all given the same profile…taken from an astrology book.
When asked to assess how well it described them, on a 0-5 scale, the students reported an average of 4.3.
Wikipedia describes the Forer effect as
…a common psychological phenomenon whereby individuals give high accuracy ratings to descriptions of their personality that supposedly are tailored specifically to them, yet which are in fact vague and general enough to apply to a wide range of people. This effect can provide a partial explanation for the widespread acceptance of some paranormal beliefs and practices, such as astrology, fortune telling, aura reading, and some types of personality tests.