Author: Elaine (page 1 of 2)

Paradigm Shift: Functional and Object-Oriented

This is a TL;DR of an hour-long presentation (video and slide deck) by Anjana Vakil on programming paradigms, in light of the war between functional and object-oriented fans. I think it’s safe to say that a paradigm shift is underway — at least in public discourse, if not in practice — regarding which one is to be the dominant trend in software development.

What Are Paradigms?

They are models that enable and define the act of programming.

All models are wrong; some models are useful. —George E. P. Box

Is the model illuminating and useful?

One example of evolving models is the history of how humans have viewed the universe:

  • Earth-centric
  • Sun-centric
  • Newton: the universe was like a massive clock built by a creating god and set into motion. The force of universal gravitation makes every pair of bodies in the universe attract each other.
  • Einstein: the universe is static and eternal; space is neither expanding nor contracting.
  • Newer views: the universe is expanding, etc.
Each paradigm supports a set of concepts that makes it the best for a certain kind of problem. —Peter Van Roy

What can a paradigm teach us?

  • Imperative: explicit, focus on and understand implementation
  • Declarative: abstract, understand domain
  • Object-oriented: encapsulate, communicate
  • Functional: specialize, transform data
If the advancement of the general art of programming requires the continuing invention and elaboration of paradigms, advancement of the individual programmer requires that they expand their repertory of paradigms. —Robert W. Floyd

The Paradigms

We, as a species, figured out how to electrify rocks and make them do what we want. —Anjana Vakil

1. Imperative

Examples: C

Characteristics: micromanaged and explicit, must be done in a specific way in every detail

Mantra:

  • Follow my commands
  • In the order I gave them
  • Remember state
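As a hedged illustration (my own, not from the talk), the imperative mantra might look like this in JavaScript:

```javascript
// Imperative style: explicit commands, executed in order, with remembered state.
let total = 0;                  // remember state
const prices = [5, 10, 25];
for (let i = 0; i < prices.length; i++) {
  total = total + prices[i];    // follow my commands, in the order I gave them
}
console.log(total);             // 40
```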

2. Object-Oriented

Examples: Java

Advocate: Alan Kay

State and data:

  • Share data = no
  • Mutable state = yes

Characteristics: organic, cellular, ephemeral, mutable

Mantra:

  • Keep your state to yourself
  • Receive my messages
  • Calling methods = objects sending messages to each other
  • Respond as you see fit

3. Functional

Examples: Lisp, Scheme

Advocate: Alonzo Church

State and data:

  • Share data = yes
  • Mutable state = no

Characteristics: mathematical, physical, material, immutable

Mantra:

  • Mutable state is dangerous
  • Pure functions are safe
  • Data goes in, data comes out

4. Declarative

Examples: SQL, Prolog

Characteristics: logic puzzle

Mantra:

  • These are the facts
  • This is what I want
  • I don’t care how you do it
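SQL and Prolog are the canonical examples, but as a hedged illustration (mine, with made-up names), JavaScript's array methods have a declarative flavor:

```javascript
// A declarative flavor in JavaScript: state what you want, not how to loop.
const people = [
  { name: 'Ada', city: 'NYC' },
  { name: 'Bob', city: 'LA' }
];
const newYorkers = people
  .filter((p) => p.city === 'NYC')  // these are the facts; this is what I want
  .map((p) => p.name);              // I don't care how you iterate
console.log(newYorkers);            // [ 'Ada' ]
```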

Functional vs Object-Oriented

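A minimal sketch of the contrast, using my own illustrative example rather than anything from the talk:

```javascript
// Object-oriented: state is kept private inside the object; you send it messages.
class Counter {
  #count = 0;                        // keep your state to yourself
  increment() { this.#count += 1; }  // receive my messages; respond as you see fit
  value() { return this.#count; }
}
const counter = new Counter();
counter.increment();
counter.increment();

// Functional: no mutation; a pure function takes data in and returns new data.
const increment = (n) => n + 1;      // data goes in, data comes out
const result = increment(increment(0));

console.log(counter.value(), result); // 2 2
```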

The Tweet, The Whole Tweet and Nothing But the Tweet So Help Me Twitter

I used the Twitter Search API to collect tweet content for a project and kept getting truncated (incomplete) tweets. I asked for help and Twitter answered.

Problem

If you are doing analysis with text data (e.g. sentiment analysis), the completeness of data matters. For example:

BigCo CEO fired...

has a dramatically different meaning than:

BigCo CEO fired a gun in McMansion and put a hole in the ceiling

The documentation does not directly address truncated tweets, and reading documentation to get unstuck is like reading a medical dictionary during a heart attack. I scoured the Internet, then posted on the Twitter Developers forum, and an actual Twitter staff member responded.

The GET Call

First, I’ll explain the process of the Twitter API call. In Node, two main libraries exist for consuming the Twitter API: one aptly named Twitter, and the other named twit. They work essentially the same way; I chose Twitter because it was the first one I found. This tutorial is a good introduction to the library.

I’ll skip the part on how to install Node modules and how to get access tokens from Twitter Application Management as they are not central to the problem.

Once you have the tokens, store them in a config.js file in the same directory as the project:

Load the tokens and initiate a new Twitter client:
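A sketch of that setup, assuming the `twitter` npm package and the config.js file described above (not runnable without both):

```javascript
// Load the placeholder credentials and create a Twitter client.
const Twitter = require('twitter');
const config = require('./config');
const client = new Twitter(config);
```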

Each API call contains a set of parameters, for example: the search term, the number of tweets returned, and the geolocation the tweet originated from. A sample parameter that searches for “10 recent English-language tweets originating from New York City containing the hashtag #DonaldTrump” would look like this:
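A sketch of those parameters; `q`, `count`, `lang`, `result_type`, and `geocode` are standard Twitter Search API fields, but the NYC coordinates and radius here are illustrative assumptions, and `truncated: false` is reproduced because the post discusses it:

```javascript
// Search parameters: 10 recent English-language tweets near New York City
// containing the hashtag #DonaldTrump.
const params = {
  q: '#DonaldTrump',
  count: 10,
  lang: 'en',
  result_type: 'recent',
  geocode: '40.7128,-74.0060,5mi', // illustrative NYC center + radius
  truncated: false
};
```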

I thought truncated: false in the parameters meant the tweets wouldn’t come back truncated, but they arrived truncated even with this setting. Using this code to make the GET request:
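A sketch of the call against the search endpoint; `client` and `params` come from the setup described earlier, and wrapping the call in a function (my choice, not the original’s) keeps the handler reusable:

```javascript
// GET search/tweets and hand the array of tweet objects to a callback.
function searchTweets(client, params, handle) {
  client.get('search/tweets', params, function (error, data, response) {
    if (error) throw error;
    handle(data.statuses); // each element of statuses is one tweet object
  });
}
```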

This is a sample of the data object returned by the GET call, containing a single tweet object. The text property holds the actual tweet content:

@RepMaxineWaters on GOP colleagues: They cannot credibly come before the American public and defend #DonaldTrump. They’re a…

They’re a… what? This isn’t a severe case of misinformation, as the negative tone is clear in the first sentence. Nevertheless, I wanted to know the definitive way of retrieving a complete tweet.

As seen in this answer from Twitter, the key is to add tweet_mode: extended and retweeted_status: { truncated: false } to the parameters:
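A sketch of the corrected parameters per Twitter’s answer, carrying over the search fields described earlier:

```javascript
// Same search, with the two additions that return complete tweet text.
const params = {
  q: '#DonaldTrump',
  count: 10,
  lang: 'en',
  result_type: 'recent',
  tweet_mode: 'extended',
  retweeted_status: { truncated: false }
};
```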

A GET call with these additional parameters returns a different data object, which contains a full_text property; note that the data object in the previous GET call does not. If the data object is an original tweet, it would look like this:
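An abbreviated, hypothetical sketch of an original (non-retweet) tweet in extended mode, with most real fields omitted and the example text borrowed from earlier in the post:

```javascript
// An original tweet: full_text is complete, and there is no retweeted_status.
const originalTweet = {
  truncated: false,
  full_text: 'BigCo CEO fired a gun in McMansion and put a hole in the ceiling'
};
console.log(originalTweet.full_text);
```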

If the data object is a retweet, its own full_text property is truncated, and it contains a retweeted_status property holding the original tweet it is citing. Note that the data object of an original tweet does not contain a retweeted_status property.
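An abbreviated, hypothetical sketch of a retweet; the handle @SomeUser is made up, and the example text is borrowed from earlier in the post:

```javascript
// A retweet: its own full_text is cut off, while the complete text lives in
// retweeted_status.full_text.
const retweet = {
  truncated: true,
  full_text: 'RT @SomeUser: BigCo CEO fired a gun in McMansion and put a hole i…',
  retweeted_status: {
    truncated: false,
    full_text: 'BigCo CEO fired a gun in McMansion and put a hole in the ceiling'
  }
};
console.log(retweet.retweeted_status.full_text);
```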

If you do want the text of the tweet cited by a retweet, you can read the retweeted_status.full_text property. Nested that deep, the property may be collapsed when the whole object is logged to the console, but it does exist in the object, and I have tested it.

Conclusion

In any given Twitter API search call made with tweet_mode: extended:

  1. Original tweets always have complete full_text properties.
  2. Retweets always have truncated full_text properties but complete retweeted_status.full_text properties.

From a developer’s perspective, the fact that a tweet should ever be truncated is inconvenient. However, this is an acceptable outcome, and I’m grateful to Twitter for responding quickly. The full script for the API call is available here, and the relevant discussion is available here.

Unpacking Values in Python and JavaScript

While reading some TensorFlow code in a Stanford tutorial, I noticed a type of multiple-variable assignment I wasn’t familiar with:

x_train, x_test, y_train, y_test = cross_validation.train_test_split(
 iris.data, iris.target, test_size=0.2, random_state=42)

It turns out to be a technique called “unpacking,” typically done via tuples in Python and arrays in JavaScript. In JavaScript the technique is called “destructuring,” introduced as a new feature in ES6.

Python Examples

Input:

def return4values():
    return [1,2,3,4]

ONE, TWO, *THREE = return4values()
print("One:{}".format(ONE))
print("Two:{}".format(TWO))
print("Three:{}".format(THREE))

Output:

>> One:1
>> Two:2
>> Three:[3, 4]

In the assignment, the variable preceded by an asterisk gets the “remaining” values not picked up by the other variables:

*ONE, TWO, THREE = return4values()
print("One:{}".format(ONE))
print("Two:{}".format(TWO))
print("Three:{}".format(THREE))

Output:

>> One:[1, 2]
>> Two:3
>> Three:4

JavaScript Examples

Sourced from MDN:

var a, b, rest;
[a, b] = [10, 20];
console.log(a); // 10
console.log(b); // 20

[a, b, ...rest] = [10, 20, 30, 40, 50];
console.log(a); // 10
console.log(b); // 20
console.log(rest); // [30, 40, 50]

({ a, b } = { a: 10, b: 20 });
console.log(a); // 10
console.log(b); // 20

// Stage 3 proposal
({a, b, ...rest} = {a: 10, b: 20, c: 30, d: 40});
console.log(a); // 10
console.log(b); // 20
console.log(rest); //{c: 30, d: 40}

Related

In the last line of scikit-learn’s cross_validation module, a similar technique from Python’s itertools is used:

return list(chain.from_iterable((safe_indexing(a, train),
                                     safe_indexing(a, test)) for a in arrays))

The Web Can Live

Work with the system we have or build the system we want?

Mike Hearn, a former Google employee and Bitcoin developer, proposed to kill the Web and build a new platform for developing and delivering applications, arguing that the unmanageable complexity of the Web and its security flaws warrant its death. The piece pretty much reads like a marketing manifesto for a product that doesn’t even exist yet. I’m not convinced by the message, but it reminds me of the subway problem in New York.

The Web is like the New York subway system. The Web was born in 1989, and the subway in 1904. When they were conceived, they were not expected to perform at today’s scale.

The original City Hall subway station in New York City. (Untapped Cities)

As New York’s population grew, the subway’s capacity was added incrementally, whenever needs arose. Routes were added, new tracks and stations were built, and old trains were replaced by new ones. Incrementally. All trains depend on an old signaling system that has never been thoroughly updated. Increasing loads put pressure on the system to increase the supply of rides, which it has failed to do, or at least is perceived to have failed.

The Web was designed to display, interlink, share, and browse documents. It was not designed to serve up sophisticated applications to enable business transactions and personal activities. The Web became popular when users discovered they could conduct business and personal activities with an efficiency an order of magnitude better than the way they had been conducting them.

Tim Berners-Lee, inventor of the World Wide Web. (CERN)

To fix the subway, you must disrupt people’s lives in order to make meaningful changes; there is no alternative. The trains and roadways are already saturated. To overhaul the subway signal system, it might not be possible to selectively halt several lines and leave other lines open for service. There will come a time when there has to be a large scale outage to test the signal system. Because failure has potentially grave consequences, the scale and magnitude of the testing has to be considerable.

To fix the Web, you don’t need to kill it. Just offer an alternative and see if it proves to be a worthy replacement. Besides, the Web is not broken at all. It works, and it gives businesses and consumers what they want. The main problem with the Web is that it is simple for users and complex for the developers making the things users want. Developing for the Web is not impossible, just difficult. In that sense, the system is not sufficiently faulty to warrant an imminent and complete overhaul. Users aren’t complaining; developers are.

An obstacle to introducing a new Web applications platform will be the politics. Bitcoin is technically viable and popular, but its rise was derailed when power became concentrated, according to Hearn. A core issue for any attempt at a new Web app platform would probably be handling the power structure and negotiating government regulations.

Open source products like Python and JavaScript have enjoyed enormous success in providing programming tools for free, as have React and Ruby on Rails among Web development frameworks. I don’t see why there couldn’t be an open source Web development platform that offers a new set of protocols as an alternative to HTML, HTTP, and so on; I could even imagine this innovation at the hardware level. But you really have to prove the efficacy of the alternative before advocating the death of the incumbent. This isn’t a presidential election.

The Importance of Community

From a talk by Georgia candidate for Governor Stacey Abrams on the importance of speaking up.

A charming mix of TED talks, “lightning talk” presentations, business networking, and political rally, Lesbians Who Tech (LWT) is unlike other technology conferences I have had the privilege to attend.

The amount of joy makes it more like a party, even though it contains all the ingredients of a conference. The sense of community and camaraderie is abundant, sometimes to a sickening (as in sugar) level, not to mention the marketing overdose, but it is necessary to provide (and fund) a safe space where the invisible are acknowledged and heard. It’s almost as if we are ignored so routinely, in such a matter-of-fact manner, that we can’t stop celebrating when given the chance to do so.

Even at the risk of tokenism and political correctness fatigue, it is generally favorable to encourage, support, and participate in the celebration of diversity, not just because it is decent but also because it is likely profitable.

Regardless of the ultimate motive behind the decision to embrace diversity, at least the train is going somewhere: candidates are welcomed into the recruiting process, entrepreneurs get pitch opportunities, and businesses gain access to an atypical talent pool while enhancing their image as allies of minorities.

LWT is a kind of psychological compensation and a renewed call to action for a spirited subset of a sexual minority that has historically been ostracized and only recently gained global traction, as well as national legal rights in America. These fragile rights are increasingly in flux under the current administration. Given this climate, it is unsafe to stay silent. Absolute objectivity is not always desirable, even for journalists.

Kate Kendall of National Center for Lesbian Rights quoted Elie Wiesel in her presentation on the importance of resistance:

We must always take sides. Neutrality helps the oppressor, never the victim. Silence encourages the tormentor, never the tormented.

Whinefulness, Perseveration, and the Infinite Loop

Why do people whine so much? Presumably, people whine about things they cannot control: a phenomenon, an action they must perform, or a situation they must endure, all of an unpleasant and undesirable nature.

To me, “whining” is slightly different from “complaining,” in that whining carries an added shade of despair and futility. A complaint is actionable; a whine is useless and best ignored. Complaining can be effective; whining is impotent.

“Whinefulness” could be classified as an emotion, perhaps not as essential as joy, sadness, or anger, but definitely on par with resentment. In fact, whinefulness is the product of resentment.

I strive to stay away from whinefulness, both others’ and my own. It is functionless: it produces and changes nothing, transcends and transposes nothing, and therefore does not deserve my time. Repeated whining about the same issue or event without due action is an example of perseveration, a pathological behavior.

Perseveration is the repetition of a particular response (such as a word, phrase, or gesture) despite the absence or cessation of a stimulus. It is usually caused by a brain injury or other organic disorder.

This functionless behavior can be loosely described as an infinite loop, in computer science terms. Perseveration is different from Kierkegaard’s concept of repetition, which has a much broader temporal and psychological scope.

I have often found myself creating and entering infinite loops, but I have become increasingly aware of the early signs both in myself and in others. Effort must be made to prevent the creation of a loop, and boundaries must be drawn to avoid being brought into loops created by others. Upon entry, it takes more energy to exit the loop than the energy taken to create it. Since loop creation is already a waste of energy, entering the loop would result in a multiplied waste of energy.

It’s hard to believe human emotions can be reduced as such, but sometimes they can and should be.

Data Visualization Workflow

Extracted from an OpenVis presentation by Mike Bostock.

1. Prototypes should emphasize speed over polish.
Identify the intent of the prototype.
What hypothesis are you testing?
It needn’t look good, or even have labels.
Make just enough to evaluate the idea.
Then decide whether to go straight or turn.

2. Transition from exploring to refining near deadline.
Choose tools that facilitate transition to final product.

3. Clean as you go.
Be ruthless about deleting code.
You are a chef and the git repo is your kitchen.
You try twenty recipes before deciding on one.
Do you really want to clean up all the mess at the end?

4. Make your process reproducible.
Make a build-system that provides machine-readable documentation.
Accelerate the use of parts from previous projects.

5. Try bad ideas deliberately.
You can’t evaluate a visualization absent the data.
Don’t get too attached to your current favorite.
Don’t get stuck at local maximum; go down to go up.

Cloud Sentinel

Almost all new software businesses these days are helpers. They offer software to help you use other software or help you make other software. Compared to 20 years ago, it’s much harder to build something as fundamentally seismic to humanity as a Google or a Facebook. On the other hand, it’s much easier to make something that makes a selected group of people happy, thanks to open source culture. The wealth of resources and education available for free means that with a little talent, you can create little earthquakes at a very low cost. Software monitoring is one of these rivers in the ocean of software business.

Software monitoring is to software what the dashboard is to your car. Your car’s dashboard tells you whether you have enough gas or something is wrong with the engine. Software monitoring tools tell you if something is wrong with your software and the things you use to run it. Monitoring has existed for as long as software has, but offering a set of integrated tools as a comprehensive service is a relatively new practice. In the past, developers would often patch together several tools and keep them in-house. More recently, some companies have begun to offer a full suite of monitoring tools, integrating them with cloud computing services and topping off the package with advanced analytics features. They offer this as a subscription, removing the need to host and maintain anything locally, and they have managed to make it look sexy.

The main perk is efficiency: it frees up local resources to focus on gaining insights from data. Nifty visualizations and statistical techniques let you feel like you are doing a data scientist’s work rather than a system administrator’s. “Data scientist” is definitely sexier than “sysadmin,” which is why companies market monitoring from a data science angle. What used to be a dreaded chore becomes appealing, and this side effect is not to be overlooked. In an industry with a reputation for installing ping-pong tables in the office, making things fun is a staple allure of the software business.

The old model of monitoring is like having a horse carriage. You buy some horses (monitoring tools), a stable (servers), and a full staff to maintain the health of the horses. The new model is more like having a self-driving Iron Man suit with a supercharged assistant like Jarvis. He is more than a helper; he is a sentinel: an artificial intelligence that can auto-adjust your power mode according to flight conditions, verbally alert you to engine problems while cracking jokes, and make you green juice in the morning.

A fairly popular general model of software monitoring is Gartner’s Application Performance Management, which is an apt description, but I’ve already started yawning. Words, names, and imagery matter, a lot (sometimes even, or especially, punctuation). Data scientists used to be called statisticians or analysts. When someone thought of calling the job by a different name, granted that the field became associated with artificial intelligence, the same job evolved into the sexiest job of the century. There has to be a more imaginative name for software monitoring, the way “data science” was for statistics. Like Sentinel. Software Sentinel. Cloud Sentinel. Cloud Computing. Data Science. Cloud Sentinel. It’s got a nice ring to it, doesn’t it? A bit strange at first, but remember, cloud computing used to be a headscratcher not long ago, and now it’s a beloved buzzword in Techspeak.

 

Explaining Cloud Computing With Game of Thrones

The other day, I was trying to explain to my parents what cloud computing is. They are intelligent folks who never had to incorporate computers into their careers, so they are the kind of people who need help filling out online forms or signing up for social media accounts. You can’t expect them to understand any vocabulary of Techspeak, a professional dialect that technologists hold dearly in their hearts, for good reasons. I like to explain esoteric concepts to people who don’t care about them and see if I can make them care. This is a practice of the Feynman technique, which not only reinforces the concepts in my own mind, but also reinforces the reasons I should care about them in the first place.

Before telling the story, I want to talk about semantics, which is the issue at the heart of explaining technical concepts to the uninitiated. I am using “Techspeak” as a neutral, descriptive term, in a matter-of-fact manner, without any negative connotations. In every field, there is a set of vocabulary that defines the crucial concepts and functions of the field. When your doctor uses medical terms in Latin to describe your health conditions, you don’t mock them by saying they are using “Doctorspeak.” The use of those terms is simply an accepted convention, partly because they have been used for centuries, and partly because Latin can say so much in fewer words than plain English.

In the business world, the running joke is on “Corporatespeak,” describing empty rhetoric that people use for dramatic effect, with little or no actual content in the communication. In other words, sometimes when people speak, the meaning is not in the words but in the fact that they are saying something, and the situation they are saying it in. Nevertheless, Corporatespeak in and of itself is just a dialect available for effective communication within a context.

In the world of computer science and technology, the vocabulary used is crucial to the operations at hand. I would argue that the significance of this language is similar to the way doctors use Latin: based on long-standing conventions and the utility of conciseness. Technologists trained in computer science share a common cryptic (to outsiders) professional vocabulary the same way doctors trained in medical schools share a common vocabulary, and that is the de facto language they use to communicate with each other.

Alright, back to explaining cloud computing to my parents. Let’s just start by thinking about having a business and running it. You need computers, obviously, even if you’re running a deli selling sandwiches and beer. You need computers to calculate and record transactions. For keeping track of inventory. To pay your staff. For a deli, maybe you just need one computer at the cash register for the transaction bit, and another computer for the rest. That’s not a big deal, because it’s easy to get and maintain two machines, and to teach your staff how to use and take care of them.

Now, let’s think of a more complex business: a filmmaking company that provides special effects by combining actual filmed footage with computer-rendered images, in order to create spectacular and imaginative scenes. Star Wars, Captain America, Cloud Atlas, Avatar type of stuff.

You’ll need storage space for all the image files. You’ll need servers to dish them out to you when you need them. You’ll need applications to process the images. All of this is written in computer code, transformed into electronic signals, stored in massive physical structures. This is where the hard part is: maintaining these monstrous physical structures and the code required to create the magic.

Not just the physical space you need to rent or buy to house them, or the electrical bills you run up for keeping them on, but also additional staff who know the ins and outs of the thing and how to fix it when something goes wrong. The bigger your company is, the bigger the monster is, and the more steps there are between different points. More things can go wrong.

It’s useful to think of this computer system as a physical monster: like one of the dragons in Game of Thrones, which their owner, Daenerys, uses as fighting machines in battle.

As your business grows, you will likely have to deal with these monsters. The essence of cloud computing is that you find someone who owns them. You pay her for whatever tricks she makes the monsters perform for you: breathing fire to destroy a hundred ships, for example. You don’t have to rent a big house to keep the dragons, or feed them huge amounts of food. Daenerys does that.

So there you have it. Cloud computing is like having Daenerys as an ally. She will release her dragons to help you win battles, for a modest fee.

Programmer Anarchy, a Post-Agile Model

Just came across a software development organizational theory called Programmer Anarchy, or Developer Anarchy, in a podcast by Software Engineering Daily. Fred George and Antonio Terreno developed the theory back in 2010 as sort of an “Extreme Agile” guerilla-style model to trim fat from a software development organization.

The theory seems to have a Marxist flavor, in that it seeks to eliminate the roles of the capitalist and bourgeois in favor of maximum productivity. Just as a proletarian revolution creates a new kind of state that empowers workers, Programmer Anarchy creates a new kind of organization that empowers software developers, perhaps at the expense of workers of other functions in the organization. Parallel to the political analogy, this model probably works best for a company at its infant stage.

At its heart, Programmer Anarchy might be closer to the idea of “creative destruction,” developed into an economic concept by Joseph Schumpeter in Capitalism, Socialism, and Democracy. The term “anarchy” has a very specific meaning in politics, although the specificity and intensity of the word have been diluted by its frequent use in popular culture. The branding of the theory is a bit of a hyperbole and a marketing gimmick; perhaps “Programmer Liberation” might be more sensitive, but then again, the word “liberation” has controversies of its own.

Individual angle: career development

Compared to traditional organizational structures, this model seems to foster a more holistic environment for individual career development. It promotes continuous learning by doing, and provides a high degree of freedom for doing so, provided that production goals are met.

This rings true for many modern professions, particularly in fields with rapidly changing landscapes, such as media and technology. In technology, there is the constant race to the top for funding and market share. In media, technology is responsible for shaking out a lot of jobs in the transition from print to digital.

“When your skill becomes commodity, and people put spreadsheets together and tell you what rates to charge,” George said, “this is when you lose your jobs to the offshore firms.”

Which is roughly how large corporations view their employees: a figure on a spreadsheet to be “streamlined.” That said, only high-performing and highly committed individuals can benefit from the Programmer Anarchy model. In this model, virtually all support roles (quality assurance, project management, business analyst) are permanently eliminated. Only those with high-level skills in software development will remain.

Organizational angle: efficiency

The traditional way is to have people who specialize in something do only that thing, and absolutely nothing else. This creates friction against getting things done, and there are hand-off costs between specialists, because people have to spend time communicating with each other. The more people you have, the more this process repeats and the more man-hours are spent. Programmer Anarchy speeds up the workflow by removing the hand-off costs.

It dissolves the bureaucracy that sometimes prevents things from getting done. In a more traditional setting, a non-developer internal client might rely on a developer to resolve a problem, and no developer would dare to do a thing until a “user story” had been entered into a Jira ticket. This means you need to talk to at least three “managers” before anyone even knows what’s going on.

Developments

Terreno, one of the authors of the original paper on the theory, reported mixed results in 2012, two years into the Programmer Anarchy experiment. George is still promoting the model pretty hard, and I’d be interested to see where it goes.


© 2018
