(david-mcneil.com :blog), clojure-conj day 1 notes

clojure-conj day 1 notes

2010 October 24 7:48pm

My rough notes of the messages I heard during day 1 (October 22, 2010) of clojure-conj.

Fogus - Roots of Clojure

slides

A list (some of it vaguely chronological) of the language environment in which Clojure was created:

1994 - Rich praised C++ strong typing
Perl arose as a language that tied the web together
Paul Graham sang the praises of Lisp
Java was born. It took developers halfway to Lisp, but ended up on the wrong half.
Ruby, Python, and JavaScript opened the minds of many to dynamic languages
The Kingdom-of-Nouns is dominant, although alternative views are present in languages like Dylan, CLIPS, and Lisp
Rich’s “Are we there yet?” talk challenged the predominance of OO thinking. It referenced Alfred North Whitehead’s “Process and Reality”.
Prolog and Datalog
Test Driven Development - Should everything be test-driven? Test Driven Dentistry?
Haskell has the important idea of laziness
MonetDB (“an open source column-oriented database… designed to provide high performance on complex queries against large databases, e.g. combining tables with hundreds of columns and multi-million rows”)
Databases offer multi-version concurrency control, snapshot isolation, and transactions
ML - idea of isolating the point of change
Erlang - shares many ideas with Clojure except Clojure is more open because it separates data from the functions that operate on it
Rich has read “all” of the papers from the 1970s and 1980s

Luke VanderHart - Zippers

Good talk that made me appreciate the beauty of zippers.

Zipper overview:

Generic tree walking/editing
Purely functional, fast, and elegant
tree of data + focus on specific location = zipper
edits happen at the point of focus of the zipper. It is not necessary to re-create the tree to that point. It is like an “inside out glove”

Zipper implementation:

Zippers are implemented as a thin wrapper around existing Clojure structures.
Creating a zipper on a tree does not copy the tree into a new data structure. Instead zipper functions are applied to the existing tree.
Code demonstrates effective use of metadata, first class functions, and destructuring
The “rest of the tree” (everything besides the node that is focused on) is stored in a “path”
Metadata stores zipper functions (children, branch?, make-node)
Zipper data structure is very easy to explore (not a byzantine structure)
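
A minimal sketch of these points using the clojure.zip namespace that ships with Clojure. The sample tree and the var names are my own, not from Luke's talk. It moves the focus to a nested node, edits only that node, and then peeks at the metadata where the zipper functions live.

```clojure
(require '[clojure.zip :as zip])

;; An ordinary nested vector; the zipper wraps it without copying it.
(def tree [1 [2 3] 4])

(def z (zip/vector-zip tree))

;; Move the focus: down to 1, right to [2 3], down to 2, right to 3.
(def at-3 (-> z zip/down zip/right zip/down zip/right))

;; Edit at the point of focus, then zip back up to get the new tree.
(def edited (-> at-3 (zip/edit * 10) zip/root))

;; The zipper functions (branch?, children, make-node) are stored as metadata.
(def zipper-fns (meta z))

(println (zip/node at-3)) ; the focused node
(println edited)          ; the edited copy
(println tree)            ; the original tree is untouched
```

Printing `at-3` at the REPL shows the plain vector-and-map structure behind a location, which is a good way to see how simple the representation is.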

Zippers in use:

Luke used them to create a word processor in which each version of the document was stored in a zipper. This made it easy to get to the past state of the document at any point in time.

Christophe Grand - DSLs != macros

slides

Talk was packed with API design tips based on real-world experience.

Christophe is the creator of Enlive, Moustache, and Parsley.

Writing DSLs is hard
Try to use macros
Now you have two problems!

Is it a DSL or an API? In a sense every API is a DSL. It is a continuum, but the key attributes of a DSL are:

succinct notation for data or logic
usually declarative

For his products:

users complained about macros: their behavior was too surprising and they were not composable
users don’t want a DSL, they want data values
values + functional-core = DSL success (values: allow users to define their own data values)

So:

rely on Clojure for lexical scope (avoid creating new scope types via macros) and control flow
use closures and higher order functions. Don’t worry if it is ugly. Icing can be added later
no macros - macros are like premature optimization. They provide too much rope to hang yourself with.
start with data
give domain-specific semantics to Clojure’s datatypes (e.g. vectors, maps, etc.). But don’t try to give domain-specific meanings to lists or symbols.

e.g. - Enlive - he removed ½ of the macros and the result was a cleaner, faster system.

Two legitimate cases for macros. But even for these avoid them:

control flow - instead use closures and delays
binding - decouple binding via capturing [I don’t understand this point, but Christophe mentioned he might blog more details about this later.]
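
To make the control flow point concrete, here is a small sketch of my own (not from Christophe's slides) where a plain function plus a delay does a job that might otherwise tempt you into writing a macro: the fallback expression is only evaluated when it is needed. The names `lookup-or` and `expensive-default` are hypothetical.

```clojure
;; Counts how many times the expensive fallback actually runs.
(def evaluations (atom 0))

(defn expensive-default []
  (swap! evaluations inc)
  :computed)

(defn lookup-or
  "Returns the value at k in m, or forces `default` when k is missing.
  `default` may be a delay (evaluated lazily) or a plain value."
  [m k default]
  (if (contains? m k)
    (get m k)
    (force default)))

;; Key present: the delay is never realized.
(def hit (lookup-or {:a 1} :a (delay (expensive-default))))

;; Key missing: the delay is realized exactly once.
(def miss (lookup-or {:a 1} :b (delay (expensive-default))))
```

Because `lookup-or` is an ordinary function it can be passed to map, partial, comp and friends, which is exactly the composability that users said the macros lacked.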

Christophe provided an example of a DSL for building regexes

In the example, the function “regex” is like eval for the DSL
Helper functions in the DSL are like macros in the DSL
Make the regex function idempotent so it can be called on input and if necessary the input will be “eval’d” to a regex. This is analogous to how the core sequence functions convert their args into sequences.
Small tip - when the implementation of the DSL is optimizing the input data structure be sure to check vars not symbols to recognize the DSL’s functions. This avoids colliding with other functions of the same name.
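
I did not capture Christophe's actual code, so the following is only a sketch of the shape of the idea under my own assumptions: strings are literals, vectors are sequences, and `zero-or-more` and `either` are hypothetical helpers. The `regex` function plays the role of eval and is idempotent because a compiled Pattern passes through unchanged.

```clojure
(import 'java.util.regex.Pattern)

(defn regex
  "\"Evals\" a regex spec into a java.util.regex.Pattern.
  Idempotent: an already compiled Pattern is returned as-is."
  [spec]
  (cond
    (instance? Pattern spec) spec
    (string? spec) (Pattern/compile (Pattern/quote spec))
    (vector? spec) (Pattern/compile
                     (apply str (map #(str "(?:" (regex %) ")") spec)))
    :else (throw (IllegalArgumentException.
                   (str "Not a regex spec: " (pr-str spec))))))

;; Helpers accept any spec because they call `regex` on their input,
;; much as the core sequence functions call seq on their arguments.
(defn zero-or-more [spec]
  (Pattern/compile (str "(?:" (regex spec) ")*")))

(defn either [a b]
  (Pattern/compile (str "(?:" (regex a) ")|(?:" (regex b) ")")))

(def ab-then-cs (regex ["ab" (zero-or-more "c")]))

(println (re-matches ab-then-cs "abccc"))
```

Since specs are plain values, users can build them with any Clojure code they like and only call `regex` at the edge.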

Tom Faulhaber - Flow

Flow is the opposite of feeling frustrated and passive. Instead things feel easy and natural.

Lisp promotes flow:

homoiconic - computer as a canvas, manipulate the language itself
symbolic - “
REPL - machine becomes a partner

Functional programming promotes flow:

immutability
composition
horizontal abstractions - [I think this refers to things like sequences which are pervasive in Clojure code.]

These allow you to build new abstractions with familiar mechanisms. It is easy to pick up tools from the Clojure toolbox and start manipulating data.

The focus is on dataflow, not flow-charts.

Good examples of flow in Clojure embody flow in both their implementation and their use. This means:

consistent granularity of data
consistent, comfortable toolset

Code examples that embody flow:

fill-queue: treat asynchronous events like a seq
enlive: read the source to see mind-blowing code
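
This is not fill-queue itself, just a rough sketch of the same idea under my own assumptions: events arriving from another thread are exposed as a lazy seq, so the usual sequence functions apply. The `queue-seq` function and the `::done` sentinel are hypothetical names.

```clojure
(import 'java.util.concurrent.LinkedBlockingQueue)

(defn queue-seq
  "Lazy seq of items taken from q; ends when the ::done sentinel arrives."
  [^LinkedBlockingQueue q]
  (lazy-seq
    (let [item (.take q)]
      (when-not (= item ::done)
        (cons item (queue-seq q))))))

(def events (LinkedBlockingQueue.))

;; A producer thread stands in for the asynchronous event source.
(doto (Thread. (fn []
                 (doseq [e [:click :scroll :keypress]]
                   (.put events e))
                 (.put events ::done)))
  .start)

;; The consumer just uses ordinary sequence functions.
(def seen (mapv name (queue-seq events)))
```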

Sean Devlin - Protocols

Examples of using protocols to get homogeneous behavior on heterogeneous types.

treating dates, strings, longs, etc. as “dates”
making variants of map, etc. that return the same type of object they accept. So calling map on a String produces a String.
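
A minimal sketch of the first example, assuming a made-up protocol called `ToMillis` (Sean's real code surely differs). Each type gets its own conversion, and callers see one uniform function.

```clojure
(import 'java.util.Date)

(defprotocol ToMillis
  (to-millis [x] "Coerces x to milliseconds since the epoch."))

(extend-protocol ToMillis
  Long
  (to-millis [n] n)

  Date
  (to-millis [d] (.getTime d))

  String
  ;; Assumes ISO-8601 instants such as "1970-01-01T00:00:00Z".
  (to-millis [s] (.toEpochMilli (java.time.Instant/parse s))))

;; Heterogeneous inputs, homogeneous behavior.
(def all-millis
  (map to-millis [0 (Date. 0) "1970-01-01T00:00:00Z"]))
```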

When to use protocols over multimethods:

for performance
to get different behavior in different namespaces

Chouser - Finger Trees

slides

Great talk.

A Clojure implementation of ideas from an academic paper that presents them in Haskell code. Finger trees are another persistent collection type that in some cases is more suitable than the built-in Clojure types. In particular they are customizable and allow the creation of different kinds of data structures.

For example, you can build a double-list with finger trees:

a list where you can use first, rest, pop, and peek on both ends without losing the type of the structure
get the count
lookup items in O(log n)
split/replace/remove/insert in O(log n)

These are built by defining the “meter” to use for the finger-tree. A single meter function is allowed but it can effectively be the composition of several functions. The meter defines how to split the tree and how to access indexed items in the tree.

The finger tree implementation in Clojure is not finished yet. Still needs to handle metadata and equality.

technomancy - Leiningen

slides

Additional lein features:

checkouts - like multi-module projects, but ad-hoc and opt-in. You can tell lein to use some files in a local “checkouts” sub-directory to satisfy a dependency.
shell wrappers - [for starting a project-less swank?]
test selectors - allow conditional test execution - e.g. for tagging long-running integration tests
tasks - just functions. Use eval-in-project to get JVM isolation between user tasks and lein itself
hooks - use Robert Hooke to customize tasks that are outside of your control (otherwise just use functions to customize them)
plug-ins - jars that define tasks or include hooks
lein int - an interactive shell for lein tasks

Community building:

identify low-hanging fruit to make it easy for people to contribute
provide a developer guide
accept patches promptly
make commit rights easy to get

Laurent Petit - Counterclockwise

They now have 80% of paredit.

Rich Hickey - New Clojure Features

Focus of new features is performance, but the results also offer better semantics (!). They are breaking changes. A key to the performance improvements is to produce code that HotSpot will optimize.

Unified primitives and boxed numbers

both act like primitives unless you use bigint operands
bigints are contagious
local numbers are represented as either longs or doubles (not shorts or ints)
+, -, … are non-promoting operators
+', -', … are promoting operators <- don’t use these, they are just here to keep people from whining, instead use contagion (as an aside, ' is now a valid character in identifiers)
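
A quick sketch of how these rules look at the REPL, assuming the behavior as it later shipped in Clojure 1.3 and newer, which may differ in detail from what Rich showed. The var names are mine.

```clojure
;; Non-promoting: overflow is an error rather than a silent promotion.
(def overflow-result
  (try
    (+ Long/MAX_VALUE 1)
    (catch ArithmeticException _ :overflow)))

;; Promoting variant: the result quietly becomes a BigInt.
(def promoted (+' Long/MAX_VALUE 1))

;; Contagion: one BigInt operand makes the whole result a BigInt.
(def contagious (+ Long/MAX_VALUE 1N))

(println overflow-result promoted contagious)
```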

Align map key equality with number equality

the problem is that if a map key is Long(42) a lookup on Int(42) will fail even though Long(42) is numerically equal to Int(42)
this has been a nagging problem
the roots of the problem are in Java. This is how map equality is defined and there is no way to fix the root problem
the solution is: when maps are accessed via Java they will provide the expected (bad) semantics, but when they are accessed via Clojure functions they will make map equality match numerical equality
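
A small sketch of the difference, assuming Clojure 1.3 or later. To show the Java side I use a plain java.util.HashMap as a stand-in, which is my own choice of illustration; the var names are made up.

```clojure
(def long-key (Long/valueOf 42))
(def int-key (Integer/valueOf (int 42)))

;; Java's view: a Long and an Integer are never equal, whatever their value.
(def java-equal? (.equals long-key int-key))

;; A Java map keyed by a Long therefore misses on an Integer lookup.
(def java-map (doto (java.util.HashMap.) (.put long-key :found)))
(def java-lookup (.get java-map int-key))

;; A Clojure map accessed through Clojure functions follows numeric equality.
(def clj-map {long-key :found})
(def clj-lookup (get clj-map int-key))

(println java-equal? java-lookup clj-lookup)
```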

Bindings & threads

current problem is that bindings only apply to the thread that creates them. So if a thread pool is used somewhere down the call stack (e.g. with pmap) then the bindings are lost.
vars have root values
bindings create a mapping from that var (they don’t change the root binding) to a “box”. Using set! the value in the box can be changed. Every time a new binding is executed, a new box is created.
bindings assume a single-threaded model
the key to making bindings work across threads is to make the “boxes” thread safe
the solution is: the thread that established the binding can use set! to change the value, all other threads get read-only access to the binding. So sub-threads (i.e. threads enlisted in part of an overall “task”) can point to the same binding map as their “parent” thread. NOTE: They are not really child threads because typically they are threads from a pool.
this is intimately related to the “scopes” problem in which we would like to create resource scopes (e.g. to close a file, etc) around lazy sequences, in which case the resource needs to be closed in a location far from the lexical scope where the resource is used.
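
A sketch of the visible effect, assuming the behavior that later shipped in Clojure 1.3: futures (and so pmap) see the bindings of the thread that created them, while a raw java.lang.Thread still sees only the root value. The var `*user*` is a made-up example.

```clojure
(def ^:dynamic *user* :root)

;; A future runs on a pool thread but still sees the caller's binding.
(def seen-by-future
  (binding [*user* :alice]
    @(future *user*)))

;; A hand-made Thread gets no conveyance and falls back to the root value.
(def seen-by-raw-thread
  (binding [*user* :alice]
    (let [result (promise)]
      (doto (Thread. (fn [] (deliver result *user*))) .start)
      @result)))

(println seen-by-future seen-by-raw-thread)
```

Wrapping the thread's function with bound-fn is the usual way to carry bindings over when you manage threads yourself.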

Pay for what you use & dynamic variables

the current “def” is too powerful because it always creates vars which can be dynamically bound. This is “unfair” in that even if this dynamicity is not needed, the user still incurs a performance cost every time the variable is accessed.
most defs are functions that are not going to be rebound
every var access incurs some costs to determine if the variable has been dynamically bound
solution: require the developer to explicitly identify defs that need to be dynamic (stylistically - the names of these variables and functions should include earmuffs, e.g. *foo*)
Clojure will “presume stability”
but, to accommodate the development-time use case of redefining functions regularly: at the start of each function invocation a single check will be made against a var-universe-counter to make sure that no vars have been redefined (any time a var is changed the var-universe-counter will be incremented). Long-running functions should be invoked with #' to trigger a check of the var-universe-counter.
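
A small sketch of the explicit opt-in, written against the syntax that later shipped in Clojure 1.3 (`^:dynamic` metadata), which may differ in detail from what was shown in the talk. The var names are made up.

```clojure
;; Opted in to dynamic binding; earmuffs signal this to the reader.
(def ^:dynamic *log-level* :info)

;; An ordinary def: presumed stable, cannot be dynamically rebound.
(def default-level :info)

(def rebound
  (binding [*log-level* :debug]
    *log-level*))

;; Trying to bind a non-dynamic var fails at runtime.
(def rebinding-plain-def
  (try
    (binding [default-level :debug]
      default-level)
    (catch IllegalStateException _ :not-dynamic)))

(println rebound rebinding-plain-def)
```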

Functions & primitives

Currently pulling code out to a function causes the args and return values to be boxed.
solution: add type hints to the args and return values to indicate primitives. The return value type hint is placed just before the arg vector to accommodate multiple arities.
future work will handle making higher order functions (e.g. map, reduce) preserve primitive performance.
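
A sketch of the hint placement, assuming the syntax that later shipped in Clojure 1.3 (only long and double are supported as primitive hints there). The function names are hypothetical.

```clojure
;; The return hint sits just before the arg vector; args are hinted individually.
(defn add-longs ^long [^long a ^long b]
  (+ a b))

;; With several arities, each arity carries its own return hint.
(defn area
  (^double [^double side] (* side side))
  (^double [^double w ^double h] (* w h)))

(println (add-longs 2 3) (area 3.0) (area 2.0 3.0))
```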
