Page 2 of 2

Programmer Anarchy, a Post-Agile Model

Just came across a software development organizational theory called Programmer Anarchy, or Developer Anarchy, in a podcast by Software Engineering Daily. Fred George and Antonio Terreno developed the theory back in 2010 as sort of an “Extreme Agile” guerilla-style model to trim fat from a software development organization.

The theory seems to have a Marxist flavor, in that it seeks to eliminate the roles of the capitalist and bourgeois in favor of maximum productivity. Just as a proletarian revolution creates a new kind of state that empowers workers, Programmer Anarchy creates a new kind of organization that empowers software developers, perhaps at the expense of workers of other functions in the organization. Parallel to the political analogy, this model probably works best for a company at its infant stage.

At its heart, Programmer Anarchy might be closer to the idea of “creative destruction,” developed into an economic idea by Joseph Schumpeter in Capitalism, Socialism, and Democracy. The term “anarchy” has a very specific meaning in politics, although the specificity and intensity of the word has somewhat diluted with its frequent use in popular culture. The branding of the theory is sort of a hyperbole and a marketing gimmick; perhaps “Programmer Liberation” might be better in terms of sensitivity, but then again, the word “liberation” has its own controversies.

Individual angle: career development

Compared to traditional organizational structures, this model seems to foster a more holistic environment for individual career development. It promotes continuous learning by doing, and provides a high degree of freedom for doing so, provided that production goals are met.

This rings true for many modern professions, particularly in fields with rapidly changing landscape, such as media and technology. In the case of technology,  there is the constant race to the top for funding and market share. In the case of media, technology is responsible for shaking out a lot of media jobs, in the transition from print to digital.

“When your skill becomes commodity, and people put spreadsheets together and tell you what rates to charge,” George said. “This is when you lose your jobs to the offshore firms.”

Which is roughly how large corporations view their employees: a figure on a spreadsheet to be “streamlined.” That said, only high-performing and highly committed individuals can benefit from the Programmer Anarchy model. In this model, virtually all support roles (quality assurance, project management, business analyst) are permanently eliminated. Only those with high-level skills in software development will remain.

Organizational angle: efficiency

The traditional way is to keep people who are specialized at something to do only that thing, and absolutely nothing else. There is more friction against getting things done, and there are hand-off costs between specialists, because people have to spend time communicating to each other. The more people you have, this process repeats the more, and more man-hours are spent. Programmer Anarchy speeds up the workflow by removing the hand-off costs.

It dissolves the bureaucracy that sometimes prevents things from getting done. In a more traditional setting, a non-developer internal client might rely on a developer to resolve a problem, and no developer would dare to do a thing until a “user story” has been entered into a Jira ticket. This means you need to talk to at least three “managers” before everyone would even know what’s going on.

Developments

Terreno, one of the authors of the original paper for the theory, reported mixed results in 2012, two years into the Programmer Anarchy experiment. George is still promoting the model pretty hard, and I’d be interested to see where it goes.

Autopsy: Hierarchical Clustering

What do normal people do when they want to rank a bunch of cities according to some features? Import the data on an Excel sheet, calculate a composite score across the features, and sort according to the composite score. What do crazy people do?

Why, a clustering, of course.

I had about 20 features of 100 cities, and I wanted to put them into groups according to similarities across the 20 features. I loaded the data into R and did a hierarchical clustering analysis. It works like this. The algorithm loops through all the observations, defines some sort of dissimilarity measure between each pair of observations, and spits out a tree diagram with a fancy name: a dendrogram!

Rplot01

Looks pretty intimidating. I felt very sophisticated. But I wanted the names of the cities at the end of the tree, and R just wouldn’t listen. What to do? After a few hours:

cluster

Hierarchical clustering is a type of “unsupervised learning,” which is what you do when you have no idea what kind of pattern or grouping you are looking for. It’s also a “bottom-up” agglomerative clustering. It starts by comparing individual observations, and works its way up until every observation is covered.

In terms of a tree diagram, it starts from the “leaves” level and works its way up the branches, combining branches up to the trunk of the “tree.” The algorithm usually measures dissimilarity in terms of “Euclidean distance,” which is a fancy way of measuring the distance between two sets of points in an abstract space.

OK. So I found that New York is really similar to Los Angeles and Chicago. All this work was basically useless, but it was kind of fun.

Louder, Faster, Harder, Stronger: The Loudness War

If you’re in the music industry or if you nerd out on music, you would have heard of the Loudness War.

The concept is fairly intuitive: the louder the song, the more likely people will hear it, especially when they aren’t paying active attention. Being loud helps a song get noticed when it’s on the radio during your cab ride or at the grocery store. If your song is louder than your rival’s, it gets the listener’s attention, and people are more likely to remember it the next time they are streaming or buying music.

The tradeoff is that when you make a song louder, you sacrifice sound quality. In the process of boosting the overall loudness of a song, all the sounds are made louder, including the softer sounds. The contrast between loud and soft sounds is lost, along with the nuanced interactions between different instruments and the vocals. Basically, it sounds more crappy and boring.

Why do record companies commit this atrocity? Because it sells. Music analytics company Next Big Sound analyzed the audio features of 32,310 musical tracks by 751 artists, and found that loudness is positively correlated with sales, other factors held constant.

loudnessSales

In this contour plot, the x-axis represents loudness, and the y-axis represents sales. The weird multi-layer shape in the plot is the “contour,” which tells you where the various tracks stand in their loudness-sales relationships. The color represents the density: the blue bit indicates that a lot of tracks are clustered in that area on the graph.

Most of the songs in the sample tend to be on the loud side, as the center of the blue bit sits near the right end of the x-axis. Most songs really don’t sell that much, so a large cluster of the tracks sits low on the y-axis. The tracks that turn out to be mega hits are at the upper right corner of the plot, and they are all on the loud side.

What does this tell us? Loudness is correlated with higher sales. The relationship isn’t linear though; it looks more like a roughly log-linear (exponential) relationship.

As a consumer, I’ve adapted. I’ll just listen to the shitty loud songs at the club. At home, I’ll put on some good old classical on a curated sound system.

U.S. Mass Shooting Fatalities Since 2014

shootings

And therefore never send to know for whom the bell tolls;
It tolls for thee.

– John Donne, No Man Is An Island

This chart depicts the number of fatalities in mass shootings in the U.S. from 2014 to 2016. You can see clearly that the toll from the latest shooting in Orlando, Florida far outnumbers even the sum of a number of past shootings.

Frank Bruni sums it up pretty well. This isn’t an attack on a minority subset of a population, but an attack on the “bedrock” of our society: the very idea of democracy, acceptance, and diversity.

How many of these incidents are still going to happen before something is done? Sadly, maybe quite a few. “And to actively do nothing is a decision as well.”

You can’t really say “I’m glad this didn’t/doesn’t happen where I live.” Just because it didn’t, doesn’t mean it couldn’t.

First they came for the Socialists, and I did not speak out—
Because I was not a Socialist.

Then they came for the Trade Unionists, and I did not speak out—
Because I was not a Trade Unionist.

Then they came for the Jews, and I did not speak out—
Because I was not a Jew.

Then they came for me—and there was no one left to speak for me.
– Martin Niemöller

This is a work in progress, and the code is on GitHub. The data source is Gun Violence Archive.

Python Random Forest Classification

Yesterday, I learned to use random forest classification in Python at a workshop hosted by NYC Women in Machine Learning & Data Science, facilitated by data scientists from OnDeck.

Here is the solution file from the instructor.

Justin Law, one of the data scientists from OnDeck, said that this analysis is a very simple one compared to the ones he deals with at work on a daily basis. Still, it took more than three hours just to read through and understand.

Machine learning is a wonderful black box you can do a lot of damage with, even if you have no idea why things are happening in there.

Here’s another render of random forest classification. Once you’ve figured out the steps and the motions, hopefully you’ll understand some of the theories down the road.

Python Prank

Yesterday, I was chatting on Slack with fellow RC members about object-oriented programing and the Python language, when Paul Gowder brought up a prank he had written. It’s supposed to create a security hole, and suppress errors, so it becomes impossible to find bugs.

I started analyzing it line by line, with the help of Paul, Leo Torres and Sean Martin.

Line 3: class foo(str) declares foo to be a subclass of str, which means foo will do everything str does, plus anything else you add to it.

Line 4: Here, we add the __call__ functionality to class foo(str).

Line 6: We are calling exec on self, which interprets the string as executable python code.

Line 7: The except Exception part would suppress any errors we might get.

Line 10: Adding str = foo replaces the standard implementation of str with our new foo. This line is important for the code to be malware, because everything created with str() will actually be a foo(), which means that now your strings created with str() are callable. If they’re called, they’re executed as Python code.

Line 18: If we do evil(), we end up running exec 'print "EVIL"', which is interpreted as the python code print "EVIL", which then just prints EVIL.

In a nutshell, anything that gets converted to a string with str() is turned into a function that you can call. One little typo, entering the name of a string variable rather than a function, and you’ve just executed whatever random code is contained in the string.

The stack trace you get won’t give you any obvious indication that what you called was a string. It’ll just throw errors related to whatever it is that you put in the string. Or, if there’s something that will actually run in the string, then it’ll just execute, and God only knows what happens then.

If our application is reading a value for username = input(), and the user inputs not a username but malicious code, this code ends up being run. Also, the except Exception part would suppress any errors we get from calling nonsense that way.

It kind of felt like music composition.

Newer posts »

© 2018

Theme by Anders NorenUp ↑