# Piping is Method Chaining

This article is originally published at http://www.win-vector.com/blog

What `R` users now call piping, popularized by Stefan Milton Bache and Hadley Wickham, is inline function application (this is notationally similar to, but distinct from, the powerful interprocess communication and concurrency tool introduced to Unix by Douglas McIlroy in 1973). In object-oriented languages this sort of notation for function application has been called “method chaining” since the days of `Smalltalk` (~1972). Let’s take a look at method chaining in `Python`, in terms of pipe notation.

Let’s work an example using `Python`’s `Pandas` package (and classes).

```
import pandas as pd

data = [['alpha', 'a', 1, 0],
        ['beta', 'b', 2, 10],
        ['gamma', 'b', 3, 10]]
df = pd.DataFrame(data,
                  columns=['name', 'group', 'value', 'cost'])
print(df)
```

```
    name group  value  cost
0  alpha     a      1     0
1   beta     b      2    10
2  gamma     b      3    10
```

Method chaining is when methods return a reference to their host-object (or reference to a replacement for their host-object). This lets us call a sequence of methods one after the other as we show below.
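To make the mechanism concrete before turning to `Pandas`, here is a minimal sketch in plain `Python`. The `Accumulator` class and its `add()` and `scale()` methods are hypothetical names invented for illustration; the only point is that each method ends with `return self`.

```python
class Accumulator:
    """Hypothetical toy class: every method returns the host object."""

    def __init__(self, value=0):
        self.value = value

    def add(self, x):
        self.value += x
        return self  # returning the host object is what enables chaining

    def scale(self, k):
        self.value *= k
        return self


# Each call returns the same Accumulator, so the next method can follow directly.
total = Accumulator(1).add(2).scale(10).value
print(total)
```

The `Pandas` expression that follows chains in the same way, though methods such as `groupby()` return a replacement object rather than the host itself.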

`print(df.groupby("group").agg({"value":["max", "min"], "cost":["mean"]}))`

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

This may not be considered legible (especially as it was combined with `print()` function notation), so we use a common notation convention and insert a line-break before each method dispatch “`.`”. The parentheses surrounding the whole expression are a common `Python` convention to facilitate multi-line expressions.

```
(
    df
    .groupby("group")
    .agg({"value": ["max", "min"], "cost": ["mean"]})
    .pipe(print)
)
```

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

Or, to emphasize the similarity to pipes, we can use another convention (one that contravenes the PEP8 style guide): end each line with `.\`, which is the method dispatch “`.`” symbol plus a line continuation mark.

```
df .\
    groupby("group") .\
    agg({"value": ["max", "min"], "cost": ["mean"]}) .\
    pipe(print)
```

```
      value     cost
        max min mean
group
a         1   1    0
b         3   2   10
```

The above is just as with the `Bizarro Pipe` in `R`: the pipe is available as a convention over the existing language syntax. In `Python` (for method-chaining-enabled classes and methods) the glyph “`.`” is in fact already a method application operator or pipe (as is the glyph “`.\EOL`”, where `EOL` denotes the line-break or end of line). With method-chaining conventions the “`.`” already is “a pipe”, organizing method application from left to right without the need for illegible nesting. In `R` the glyph “`->.;`” is a function application operator or pipe (which we called the `Bizarro Pipe`; the `Bizarro Pipe` is a first-rate pipe, faster than other pipes, and interferes less with debugging than other pipes).
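The contrast between nesting and left-to-right application can be sketched with ordinary functions. The helper `apply_left_to_right()` below is a hypothetical name, not part of any package mentioned here; it simply feeds a value through a list of functions in the order they are written.

```python
from functools import reduce


def apply_left_to_right(value, *funcs):
    """Hypothetical helper: pass value through funcs in the order written."""
    return reduce(lambda acc, f: f(acc), funcs, value)


words = ["gamma", "alpha", "beta"]

# Nested form: must be read from the inside out.
nested = str.upper(", ".join(sorted(words)))

# Piped form: the same three steps, read left to right.
piped = apply_left_to_right(words, sorted, ", ".join, str.upper)

print(nested)
print(piped)
```

Both expressions compute the same string; only the reading order of the steps differs.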

Both languages have had this application capability for a very long time. We are using `Pandas` and `pipe()` as our example, but any package whose methods return a reference to the object being worked on (or a reference to a replacement object) can be treated as a pipe-able object. If the class further implements one function re-director method (such as `pipe()`) then a lot more becomes practical. Here is another example showing how additional named and unnamed arguments can be handled.

```
def add_delta_to_column(df, colname, delta):
    # Note: this alters the passed-in data frame in place, then returns it.
    df[colname] = df[colname] + delta
    return df
```

```
df .\
    pipe(add_delta_to_column, "cost", 5) .\
    groupby("group") .\
    agg({"value": ["max", "min"], "cost": ["mean"]}) .\
    pipe(print, "DEBUG1", sep=" | ")
```

```
      value     cost
        max min mean
group
a         1   1    5
b         3   2   15 | DEBUG1
```
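To show how little a class needs in order to support this style, here is a minimal sketch of a function re-director method. The `Pipeable` class and the `add_delta()` function are hypothetical, and this is not how `Pandas` implements `pipe()`. One deliberate difference: this sketch re-wraps each result so the chain can always continue, whereas `Pandas`’ `pipe()` returns whatever the function returns.

```python
class Pipeable:
    """Hypothetical wrapper class; not part of Pandas."""

    def __init__(self, value):
        self.value = value

    def pipe(self, func, *args, **kwargs):
        # The wrapped value becomes the first argument; extra positional and
        # named arguments are forwarded unchanged. The result is re-wrapped.
        return Pipeable(func(self.value, *args, **kwargs))


def add_delta(x, delta):
    return x + delta


result = (
    Pipeable(3)
    .pipe(add_delta, 5)      # add_delta(3, 5)
    .pipe(pow, 2)            # pow(8, 2)
    .pipe(round, ndigits=1)  # named arguments pass through too
    .value
)
print(result)
```

With one such method, any ordinary function that takes the object as its first argument can take part in the chain without being added to the class.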

So, depending on your point of view: “piping is a poor person’s method chaining” or “method chaining is a poor person’s piping” (taken from the usual quote comparing objects and closures).

If one wants to go further, there are a number of `Python` packages adding significant additional piping capabilities (through notation, operator overloading, or other methods).

- sklearn.pipeline
- Stack overflow notes 1
- Stack overflow notes 2
- dplython
- sspipe
- Tidyverse pipes in Pandas
- chainlearn

And that is piping versus method chaining.
