Recent blog entries for oubiwann

28 Jul 2014 (updated 29 Jul 2014 at 04:13 UTC) »

The Future of Programming - Adopting The Functional Paradigm?

Series Links

Survivors' Breakfast

The previous post covered some thoughts on the future-looking programming themes present at OSCON 2014.

Following that wonderful conference, long-time Open Source advocate, Pythonista, and instructor Steve Holden, was kind enough to host his third annual "OSCON Survivors' Breakfast" with tens of esteemed attendees, speakers, and organizers enjoying great company and conversation, relaxing together after the flurry of conference activity, planning a leisurely day in Portland, and -- most immediately -- having some much-needed breakfast.

The view from the 23rd floor was quite an eyeful, and the conversation ranged across equally panoramic topics. Sitting with Alex Martelli, Anna Ravenscroft, and Katie Miller, the conversation inevitably turned to thoughts programmatical. One thread of the discussion was so compelling that it helped crystallize this series of blog posts. That was kicked off with Katie's question:

Why [have some large companies] not embraced functional programming to the extent that other large ones have?

Multiple points of discussion spawned from this, some of which still continue. The rest of this post explores these. 


Large Companies?

What constitutes a large company? We settled on discussing Fortune 500 companies, which, by definition are:
  • U.S. Companies
  • Ranked by gross revenue (after adjustments for excise taxes).

Afterwards, I looked up the 2013 top 25 tech companies in the Fortune 500. I've listed them below; in parentheses is the Fortune 500 ranking. After the dash are the functional programming languages used on various company projects -- these are listed only if I have talked to someone who has worked on a project (or interviewed for a job that used the language), or if I have read an article by an employee who has stated that they use the listed language(s) [1].
  1. Apple (6) - Swift, Clojure, Scala
  2. AT&T (11) - Haskell
  3. HP (15) - F#, Scala
  4. Verizon Communications (16) - Scala
  5. IBM (20) - Scala
  6. Microsoft (35) - F#, F*
  7. Comcast (46) - Scala
  8. Amazon (49) - Haskell, Scala, Erlang
  9. Dell (51) - Erlang, Scala
  10. Intel (54) - Haskell, SML, PLT Scheme
  11. Google (55) - Haskell [2]
  12. Cisco (60) - Scala
  13. Ingram Micro (76) - ?
  14. Oracle (80) - Scala
  15. Avnet (117) - ?
  16. Tech Data (119) - ?
  17. Emerson Electric (123) - ?
  18. Xerox (131) - Scala
  19. EMC (133) - Scala
  20. Arrow Electronics (141) - ?
  21. Century Link (150) - ?
  22. Computer Sciences Corp. (176) - ?
  23. eBay (196) - Scala 
  24. TI (218) - ?
  25. Western Digital (222) - ?

The companies which have committed to projects guessed to be of significant business value written in FP languages include: Apple, HP, and eBay. Possibly also Oracle and Intel. So, a rough estimate of between 3 to 5 of the top 25 U.S. tech companies have made a significant investment in FP.

Why not Google?

The next two sections offer summaries of some views on this.


Ideal Use Case?

Is an FP language suitable for large organisations? Are smaller companies better served by them? During breakfast, It was postulated that dealing with such things as immutable data, handling I/O in pure FP languages, and creating/using higher order functions is easier for small startups due to the shorter amount of time required to hire or train a critical mass of skilled programmers.

It is certainly true that it will take larger organisations longer to train its personnel simply due to sheer numbers and, even with enough trainers, logistics. But this argument can be made for any corporate level of instruction; in my book, this cancels out on both sides and is not an argument unique to hard topics, even less, specifically pertinent to FP adoption.


Brain Fit?

I've heard this one a bit: "Some people just don't think in FP terms." They need loops and iteration, not higher order functions and recursion. Joel Spolsky makes reference to this in his article The Guerrilla Guide to Interviewing. In particular, he says that "For some reason most people seem to be born without the part of the brain that understands pointers." This has been applied to topics in FP as well as C.

To be fair, Joel's comment was probably made with a bit of lightness and not meant to be a statement on the nature of mind or a theory of cognition. The context of the article is a very practical one: hiring. When trying to identify whether a programmer would be an asset for your team, you're not living in the space of cognitive theory, rather you inhabit the realm of quick approximations, gut instincts, and fast, economical decisions.

Regardless, I find this perspective -- Type Physicalism [3] -- fairly objectionable. This is because I see it as a kind of intellectual "racism." Early social sciences utilized this form of reasoning to justify all sorts of discriminatory thinking in the name of "science", reinforcing a rigid mentality of "us" vs. "them." In my own experience, I've seen this sort of approach used to shutdown exploration, to enforce elitism, and dismiss ideas that threaten the authority of the status quo.

Rather than seeing the problem of comprehending FP as a physical limitation of the individual, I see instructional failure as the obstacle to overcome. If we start with the proposition that certain brains are deficient, we are essentially abandoning education. It is the responsibility of the instructor to engage creatively with each student's learning style. When adhering to the idea that certain brains are limited, one discards creative engagement; one doesn't even consider working with the students and their learning styles. This is a view that, however implicitly, can be used to shun diversity and dismiss potential.

I believe the essence of what Joel was shooting for can be approached in a much kinder fashion (adapted for an FP discussion):

None of us was born knowing GOTO statements, global state, mutable data, or for loops. There are many programmers alive, though, whose first contact with programming involved one or more of these. That's their "home town", as it were; their programmatic birth place. Having utilized -- as well as taught -- imperative, OOP, and functional styles of programming, I do not feel that one is intrinsically any harder than another. However, they are sometimes so vastly different from each other in style or syntax or semantics that once a student has solidified around the concepts of a particular paradigm, it can be a challenge retraining to work easily in another.


Why the Objections?

If both "ideal use case" and "brain fit" are given as arguments against adopting FP (or any other new paradigm) in large organisations, and neither are considered logically or philosophically valid, what's at the root of the resistance?

It is not uncommon for changes in an industry or field of study to be met with resistance. The bigger or more different the change from the status quo, very often is proportional to the amount of resistance. I suspect that this is really what we're seeing when companies take a stance against FP. There are very often valid business concerns: "we've made an investment in OOP" or "it will cost too much to train/hire/migrate to FP." 

I would remind those company leaders, though, that new sources of revenue, that product innovation and changes in market adoption do not often come from maintaining or enforcing the current state. Instead, that is an identifying characteristic of companies whose relevance is fading.

Even if your company has market dominance or is a monopoly, there is still a good incentive for exploring alternative paradigms. At the very least, one can uncover inefficiencies and apply new knowledge to remove duplication of efforts, increase margins, etc.


Careers

As a manager, I have found that about half of the senior engineers up for promotion have very little to no interest in taking on different (new to them) programmatic paradigms. They consider current burdens sufficient (or too much) and would rather spend what little free time they have available to them in improving existing systems.

Senior engineers who have a more academic or research bent (or are easily bored) are much more likely to embrace this sort of change. Interestingly, senior engineers who have little to no competitive drive will more readily pick up something new if the need arises. This may be due to such things as not perceiving accumulated knowledge as territory to defend, for example.

Younger engineers with less experience (and less of an investment made in a particular school of thought) are much more willing to take on new challenges. I believe there are many reasons for this, one of which may include an interest in becoming more professionally competitive with their peers.

Junior or senior, I have found that programmers who are currently looking to find new employment are nearly invariably not only willing to take on the challenge of learning different paradigms, but are usually going about that proactively and engaging in self-study.

I want to work with programmers who can take on any problem space in any paradigm and find creative solutions, contributing as valued members of a team. This is certainly an ideal set of characteristics, but one that I have seen in the wilds of the workplace on multiple occasions. It has nothing to do with FP or OOP paradigms, but rather with the people themselves.

Even if a company is locked into well-established processes and views on programming, they may find it in their best interests to provide a more open-minded approach with their employees who would enjoy that. Their retention rates could very well increase dramatically.


Do We Need To?

Philosophy and hiring strategies aside, do we -- as programmers, software projects, or organizations that support programming -- need to take on the burden of learning or adopting functional programming? Quite possibly not.

If Google's plans around Go involve building a new operating system (in the spirit of 1970s C and UNIX), the systems programmers may find pure functions too cumbersome to work with. FP may be too burdensome a fit for that type of work.

If one is not tied to a historical analogy with UNIX, as Mozilla is not with Rust, doing something like creating a new browser engine (or running a remote services company) may be a good fit for FP, especially if one has data showing reduced error counts when using type systems.

As we shall see illustrated in the next post, the usual advice continues to apply: the decision of which paradigm to employ for any given project should be dictated by the best fit and not ideological inflexibility. The bearing this has on programming is innovation: it is the early adopters who have the best chance of leading us into the future.

Up next: Retrospective on Programming Paradigms
Previously: Themes at OSCON 2014


Footnotes

Syndicated 2014-07-28 13:00:00 (Updated 2014-07-29 03:53:21) from Duncan McGreggor

27 Jul 2014 (updated 28 Jul 2014 at 06:16 UTC) »

The Future of Programming - Themes at OSCON 2014

Series Links


A Qualitative OSCON Debrief

As you might have noticed from the OSCON Twitter-storm this year, the conference was a blast. Even if you weren't physically present, given the 17 tracks, you can imagine that the presentations -- and subsequent conversations -- were deeply varied.

This was the second OSCON I'd attended; the first was was in 2008 as a guest of Michael Bernstein, a friend who was speaking there. OSCON 2008 was a zoo - I'm not sure of the actual body count, but I've heard that attendees + vendors + miscellaneous topped 12,000 people over the course of the week (I would love to hear if someone has hard data on that -- googling didn't reveal much). OSCON 2008 was dominated by Big Data, Hadoop, endless buzzword bingo, and business posturing by all sorts. The most interesting bits of that conference were the outlines that formed around the conversations people weren't having. In fact, over the following 6 months, that's what I spent my spare time pondering: what people didn't say at OSCON.

This year's conference seemed like a completely different animal. It felt like easily 1/2 to 1/3rd the number of attendees in 2008. Where that one had all the anonymizing feel of rush-hour in a major metropolitan hub, OSCON 2014 had a distinctly small-town vibe to it -- I was completely charmed. Conversations (overheard as well as participated in) were not littered with examples from the latest bizspeak, but rather focused on essence. The interactions were not continually distracted, but rather steadily focused, allowing people to form, express, and dispute complete thoughts with their peers.


Conversations

So what were people talking about this year? Here are some of the topics I heard covered during lunches, in hallways, and at podiums; at pubs, in restaurants and at parks [1]:
  • What communities are thriving?
  • Which [projects, organisations, companies, etc.] are treating their people right?
  • What successful processes are being followed at [project, organisation, etc.]?
  • Who is hiring and why should someone want to work there?
  • Where can I go to learn X? Who is teaching X? Who shares the most about X?
  • Which [projects, organisations] support X?
  • Why don't more [people, projects, organisations] care about [possible future X]?
  • Why don't more [people, projects, organisations] spend more time investigating the history of X for "lessons learned"?
  • There was so much more X in computing during the 60s and 70s -- what happened? [2]
  • Why are we reinventing X?
  • When is X going to be invented, and who's going to do it?
  • Everything is changing! I can't keep up anymore.
  • I want to keep up, but how?
  • Why can't we stop making so many X?
  • Nobody cares about Y anymore; we're all doing X now.
  • Full stack developers!
  • Haskell!
  • Fault-tolerant systems!

After lots of reflection, here's how I classified most of the conversations I heard:
  • Developing communities,
  • Developing careers and/or personal/professional qualities, and
  • Developing software, 

along lines such as:
  • Effective maintenance, maturity, and health,
  • Focusing on the "art",  eventual mastery, and investments of time,
  • Tempering bare pragmatism with something resembling science or academic excellence,
  • Learning the new to bolster the old,
  • Inspiring innovation from a place of contemplation and analysis,
  • Mining the past for great ideas, and
  • Figuring out how to better share and spread the adoption of good ideas.


Themes

Generalized to such a degree, this could have been pretty much any congregation of interested, engaged minds since the dawn of civilization. So what does it look like if we don't normalize quite so much? Weighing these with what may well be my own bias (and the bias of like-minded peers), I submit to your review these themes:

  • A very strong interest in programming (thinking and creating) vs. integration (assessing and consuming).
  • An express desire to become better at abstraction (higher-order functions, composition, and types) to better deal with growing systems complexities.
  • An interest in building even more complicated systems.
  • A fear of reimplementing past mistakes or of letting dust gather on past intellectual achievements.

As you might have guessed, these number very highly among the reasons why the conference was such an unexpected pleasure for me. But it should also not come as a surprise that these themes are present:

  • We have had several years of companies such as Google and Amazon (AWS) building and deploying some of the most sophisticated examples of logic-made-manifest in human history. This has created perceived value in our industry and many wish to emulate it. Similarly, we have single purpose distributed systems being purchased for nearly 20 billion USD -- a different kind of complexity, with a different kind of perceived reward.
  • In the 70s and 80s, OOP adoption brought with it the ability to create large software systems in ways that people had not dared dream or were impractical to realize. Today's growing adoption of the Functional paradigm is giving early signs of allowing us to better integrate complex systems with more predictability and fewer errors.
  • Case studies of improvements in productivity or the capacity to handle highly complex or previously intractable problems with better abstractions, has ignited the passions of many. Not wanting to limit their scope of knowledge or sources of inspiration, people are not simply limiting themselves to the exploration of such things as Category Theory -- they are opening the vaults of computer science with such projects as Papers We Love.

There's a brave new world in the making. It's a world for programmers and thinkers, for philosophers and makers. There's a lot to learn, but it's really not so different from older worlds: the same passions drive us, the same idealism burns brightly. And it's nice to see that these themes arise not only in small, highly specialized venues such as university doctoral programs and StrangeLoop (or LambdaJam), but also in larger intersections of the industry like OSCON (or more general-audience ones like Meetups).

Up next: Adopting the Functional Paradigm?
PreviouslyAn Overview


Footnotes

[1] It goes without saying that any one attendee couldn't possibly be exposed to enough conversations to form a perfectly accurate sense of the total distribution of conversation topics. No claim to the contrary is being made here :-)

here in the section titled "Why did all these ideas happen during this particular time period?"


Syndicated 2014-07-27 21:25:00 (Updated 2014-07-28 05:25:57) from Duncan McGreggor

27 Jul 2014 (updated 28 Jul 2014 at 06:16 UTC) »

The Future of Programming - An Overview

Art by Philip Straub
There's a new series of blog posts coming, inspired by on-going conversations with peers, continuous inspection of the development landscape, habitual navel-gazing, and participation at the catalytic OSCON 2014. As you might have inferred, these will be on the topic of "The Future of Programming."

Not to be confused with Bret Victor's excellent talk last year at DBX, these posts will be less about individual technologies or developer user experience, and more about historic trends and viewing the present (and near future) through such a lense.

In this mini-series, the aim is to present posts on following topics:

I did a similar set of posts, conceived in late 2008 and published in 2009 on the future of cloud computing entitled After the Cloud. It was a very successful series and the cloud industry seems to be heading towards some of the predictions made in it -- ZeroVM and Docker are an incremental step towards the future of distributed processes/functions outlined in To Atomic Computation and Beyond

In that post, though, are two quotes from industry greats. These provide an excellent context for this series as well, hinting at an overriding theme:
  • Alan Kay, 1998: A crucial key to growing large systems is effective communications between components.
  • Joe Armstrong, 2004: To effectively model and solve problems in a distributed manner, we need concurrency... this is made easier when we isolate processes and do not share data.

In the decade since these statements were made, we have seen individuals, projects, and companies take that vision to heart -- and succeeding as a result. But as an industry, we continue to struggle with the definition of our art; we still are tormented by change -- both from within and externally -- and do not seem to adapt to it well.

These posts will peer into such places ... in the hope that such inspection might guide us better through the tangled forest of our present into the unimagined forest of our future.

Syndicated 2014-07-27 18:02:00 (Updated 2014-07-28 05:17:59) from Duncan McGreggor

3 Jul 2014 (updated 18 Jul 2014 at 06:16 UTC) »

Uncovering Inherent Structures in Organizations

Vladimir Levenshtein
This post should have a subtitle: "Using Team Analysis and Levenshtein Distance to Reveal said Structure." It's the first part of that subtitle that is the secret, though: being able to correctly analyze and classify individual teams. Without that, using something clever like Levenshtein distance isn't going to be very much help.

But that's coming in towards the end of the story. Let's start at the beginning.

What You're Going to See

This post is a bit long. Here are the sections I've divided it into:

  • What You're Going to See
  • Premise
  • Introducing ACME
  • Categorizing Teams
  • Category Example
  • Calculating the Levenshtein Distance of Teams
  • Sorting and Interpretation
  • Conclusion

However, you don't need to read the whole thing to obtain the main benefits. You can get the Cliff Notes version by reading the Premise, Categorizing Teams, Interpretation, and the Conclusion.

Premise

Companies grow. Teams expand. If you're well-placed in your industry and providing in-demand services or products, this is happening to you. Individuals and small teams tend to deal with this sort of change pretty well. At an organizational level, however, this sort of change tends to have an impact that can bring a group down, or rocket it up to the next level.

Of the many issues faced by growing companies (or rapidly growing orgs within large companies), the structuring one can be most problematic: "Our old structures, though comfortable, won't scale well with all these new teams and all the new hires joining our existing teams. How do we reorganize? Where do we put folks? Are there natural lines along which we can provide better management (and vision!) structure?"

The answer, of course, is "yes" -- but! It requires careful analysis and a deep understanding every team in your org.

The remainder of this post will set up a scenario and then figure out how to do a re-org. I use a software engineering org as an example, but that's just because I have a long and intimate knowledge of them and understand the ways in which one can classify such teams. These same methods could be applied a Sales group, Marketing groups, etc., as long as you know the characteristics that define the teams of which these orgs are comprised.



Introducing ACME

ACME Corporation is the leading producer of some of the most innovative products of the 20th century. The CTO had previously tasked you, the VP of Software Development to bring this product line into the digital age -- and you did! Your great ideas for the updated suite are the new hottness that everyone is clamouring for. Subsequently, the growth of your teams has been fast, and dare we say, exponential.

More details on the scenario: your Software Development Group has several teams of engineers, all working on different products or services, each of which supports ACME Corporation in different ways. In the past 2 years, you've built up your org by an order of magnitude in size. You've started promoting and hiring more managers and directors to help organize these teams into sensible encapsulating structures. These larger groups, once identified, would comprise the whole Development Group.

Ideally, the new groups would represent some aspect of the company, software development, engineering, and product vision -- in other words, some sensible clustering of teams doing related work. How would you group the teams in the most natural way?

Simply dividing along language or platform lines may seem like the obvious answer, but is it the best choice? There are some questions that can help guide you in figuring this out:
  • How do these teams interact with other parts of the company? 
  • Who are the stakeholders in feature development? 
  • Which sorts of customers does each team primarily serve?
There are many more questions you could ask (some are implicit in the analysis data linked below), but this should give a taste.

ACME Software Development has grown the following teams, some of which focus on products, some on infrastructure, some on services, etc.:
  • Digital Anvil Product Team
  • Giant Rubber Band App Team
  • Digital Iron Carrot Team
  • Jet Propelled Unicycle Service Team
  • Jet Propelled Pogo Stick Service Team
  • Ultimatum Dispatcher API Team
  • Virtual Rocket Powered Roller Skates Team
  • Operations (release management, deployments, production maintenance)
  • QA (testing infrastructure, CI/CD)
  • Community Team (documentation, examples, community engagement, meetups, etc.)

Early SW Dev team hacking the ENIAC

Categorizing Teams

Each of those teams started with 2-4 devs hacking on small skunkworks projects. They've now blossomed to the extent that each team has significant sub-teams working on new features and prototyping for the product they support. These large teams now need to be characterized using a method that will allow them to be easily compared. We need the ability to see how closely related one team is to another, across many different variables. (In the scheme outlined below, we end up examining 50 bits of information for each team.)

Keep in mind that each category should be chosen such that it would make sense for teams categorized similarly to be grouped together. A counter example might be "Team Size"; you don't necessarily want all large teams together in one group, and all small teams in a different group. As such, "Team Size" is probably a poor category choice.

Here are the categories which we will use for the ACME Software Development Group:
  • Language
  • Syntax
  • Platform
  • Implementation Focus
  • Supported OS
  • Deployment Type
  • Product?
  • Service?
  • License Type
  • Industry Segment
  • Stakeholders
  • Customer Type
  • Corporate Priority
Each category may be either single-valued or multi-valued. For instance, the categories ending in question marks will be booleans. In contrast, multiple languages might be used by the same team, so the "Language" category will sometimes have several entries.

Category Example

(Things are going to get a bit more technical at this point; for those who care more about the outcomes than the methods used, feel free to skip to the section at the end: Sorting and Interpretation.)

In all cases, we will encode these values as binary digits -- this allows us to very easily compare teams using Levenshtein distance, since the total of all characteristics we are filtering on can be represented as a string value. An example should illustrate this well.

(The Levenshtein distance between two words is the minimum number of single-character edits -- such as insertions, deletions or substitutions -- required to change one word into the other. It is named after Vladimir Levenshtein, who defined this "distance" in 1965 when exploring the possibility of correcting deletions, insertions, and reversals in binary codes.)

Let's say the Software Development Group supports the following languages, with each one assigned a binary value:
  • LFE - #b0000000001
  • Erlang - #b0000000010
  • Elixir - #b0000000100
  • Ruby - #b0000001000
  • Python - #b0000010000
  • Hy - #b0000100000
  • Clojure - #b0001000000
  • Java - #b0010000000
  • JavaScript - #b0100000000
  • CoffeeScript - #b1000000000
A team that used LFE, Hy, and Clojure would obtain its "Language" category value by XOR'ing the three supported languages, and would thus be #b0001100001. In LFE, that could be done by entering the following code the REPL:

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!