2010 October 24 7:48pm
My rough notes of the messages I heard during day 1 (October 22, 2010) of clojure-conj.
Fogus - Roots of Clojure
slides
A list (some of it vaguely chronological) of the language environment in which Clojure was created:
- 1994 - Rich praised C++ strong typing
- Perl arose as a language that tied the web together
- Paul Graham sang the praises of Lisp
- Java was born. Took developers half way to Lisp, but ended up on the wrong half.
- Ruby, Python, and JavaScript opened the minds of many to dynamic languages
- The Kingdom-of-Nouns is dominant, although alternative views are present in languages like Dylan, CLIPS, and Lisp
- Rich’s “Are we there yet?” talk challenged the predominance of OO thinking. It referenced Alfred North Whitehead’s “Process and Reality”.
- Prolog and Datalog
- Test Driven Development - Should everything be test-driven? Test Driven Dentistry?
- Haskell has the important idea of laziness
- MonetDB (“an open source column-oriented database… designed to provide high performance on complex queries against large databases, e.g. combining tables with hundreds of columns and multi-million rows”)
- Databases offer
** Multi-version concurrency control
** Snapshot isolation
** Transactions
- ML - idea of isolating the point of change
- Erlang - shares many ideas with Clojure except Clojure is more open because it separates data from the functions that operate on it
- Rich has read “all” of the papers from the 1970’s and 1980’s
Luke VanderHart - Zippers
Good talk that made me appreciate the beauty of zippers.
Zipper overview:
- Generic tree walking/editing
- Purely functional, fast, and elegant
- tree of data + focus on specific location = zipper
- edits happen at the point of focus of the zipper. It is not necessary to re-create the tree to that point. It is like an “inside out glove”
Zipper implementation:
- Zippers are implemented as a thin wrapper around existing Clojure structures.
- Creating a zipper on a tree does not copy the tree into a new data structure. Instead zipper functions are applied to the existing tree.
- Code demonstrates effective use of metadata, first class functions, and destructuring
- The “rest of the tree” (beside the node that is focused on) is stored in a “path”
- Metadata stores zipper functions (children, branch?, make-node)
- Zipper data structure is very easy to explore (not a byzantine structure)
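A minimal sketch of the points above, using clojure.zip (which ships with Clojure). The sample tree and the names tree, root-loc, loc, edited, and zipper-fn-keys are made up for illustration; my notes do not record the examples Luke actually showed.

```clojure
(require '[clojure.zip :as z])

;; Sample data, invented for this sketch.
(def tree [1 [2 3] 4])

;; vector-zip wraps the existing vector; it does not copy it.
(def root-loc (z/vector-zip tree))

;; Move the focus: down to 1, right to [2 3], down to 2.
(def loc (-> root-loc z/down z/right z/down))

;; Edit at the focus, then zip back up to get a new tree.
(def edited (-> loc (z/edit * 10) z/root))

;; The zipper functions travel as metadata on the location.
(def zipper-fn-keys (set (keys (meta root-loc))))
```

Here (z/node loc) is 2, edited is [1 [20 3] 4], and tree itself is unchanged. Printing root-loc at the REPL shows that a location is just a small vector of the node plus its path.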
Zippers in use:
- Luke used them to create a word processor in which each version of the document was stored in a zipper. This made it easy to get to the past state of the document at any point in time.
Christophe Grand - DSLs != macros
slides
Talk was packed with API design tips based on real-world experience.
Christophe is the creator of Enlive, Moustache, and Parsely.
- Writing DSLs is hard
- Try to use macros
- Now you have two problems!
Is it a DSL or an API? In a sense every API is a DSL. It is a continuum, but the key attributes of a DSL are:
- succinct notation for data or logic
- usually declarative
For his products:
- users complained about macros
** behavior was too surprising
** they were not composable
- users don’t want a DSL, they want data values
- values + functional-core = DSL success
** values - allow users to define their data values
So:
- rely on Clojure for lexical scope (avoid creating new scope types via macros) and control flow
- use closures and higher order functions. Don’t worry if it is ugly. Icing can be added later
- no macros - macros are like premature optimization. They provide too much rope to hang yourself with.
- start with data
- give domain-specific semantics to Clojure’s datatypes (e.g. vectors, maps, etc.). But don’t try and give domain-specific meanings to lists or symbols.
e.g. - Enlive - he removed ½ of the macros and the result was a cleaner, faster system.
Two legitimate cases for macros. But even for these avoid them:
- control flow - instead use closures and delays
- binding - decouple binding via capturing [I don’t understand this point, but Christophe mentioned he might blog more details about this later.]
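One way to read the control-flow point: a plain function can get macro-like deferred evaluation if the caller wraps the deferred work in a delay (or a closure). This is my own sketch, not code from the talk; the names calls, expensive-default, and when-missing are hypothetical.

```clojure
;; Counts how often the fallback actually runs.
(def calls (atom 0))

(defn expensive-default []
  (swap! calls inc)
  :default)

(defn when-missing
  "Returns x unless it is nil; otherwise forces the fallback.
  fallback may be a delay or a plain value (force handles both)."
  [x fallback]
  (if (nil? x) (force fallback) x))

;; The delay is never forced here, so expensive-default does not run.
(def present (when-missing :value (delay (expensive-default))))

;; Here the delay is forced exactly once.
(def absent (when-missing nil (delay (expensive-default))))
```

Because when-missing is an ordinary function, it can be passed to map, partially applied, and composed, which a macro version could not.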
Christophe provided an example of a DSL for building regexes
- In the example, the function “regex” is like eval for the DSL
- Helper functions in the DSL are like macros in the DSL
- Make the regex function idempotent so it can be called on input and if necessary the input will be “eval’d” to a regex. This is analogous to how the core sequence functions convert their args into sequences.
- Small tip - when the implementation of the DSL is optimizing the input data structure be sure to check vars not symbols to recognize the DSL’s functions. This avoids colliding with other functions of the same name.
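A rough sketch of what such a data-driven regex DSL could look like. This is not Christophe's code: the conventions (strings are literal text, vectors are sequences, sets are alternatives) and the names group and many are assumptions made up for illustration. Only the idea of an idempotent regex function acting as the DSL's eval comes from the talk.

```clojure
(require '[clojure.string :as string])
(import 'java.util.regex.Pattern)

(declare regex)

(defn- group
  "Wraps the pattern for spec in a non-capturing group."
  [spec]
  (str "(?:" (regex spec) ")"))

(defn regex
  "The DSL's 'eval': turns a data description into a Pattern.
  Patterns pass through unchanged, so calling it twice is harmless."
  [spec]
  (cond
    (instance? Pattern spec) spec
    ;; A string is literal text.
    (string? spec) (Pattern/compile (Pattern/quote spec))
    ;; A vector is a sequence of sub-patterns.
    (vector? spec) (Pattern/compile (string/join (map group spec)))
    ;; A set is a choice between sub-patterns.
    (set? spec) (Pattern/compile (string/join "|" (map group spec)))
    :else (throw (IllegalArgumentException.
                  (str "Unsupported spec: " (pr-str spec))))))

(defn many
  "Helper that plays the role of a 'macro' in the DSL:
  zero or more repetitions of spec."
  [spec]
  (Pattern/compile (str (group spec) "*")))
```

For example, (re-matches (regex [(many "ab") #{"c" "d"}]) "ababd") matches the whole string, while (re-matches (regex "a.b") "axb") returns nil because the string is treated as literal text.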
Tom Faulhaber - Flow
Flow is the opposite of feeling frustrated and passive. Instead things feel easy and natural.
Lisp promotes flow:
- homoiconic - computer as a canvas, manipulate the language itself
- symbolic - “
- REPL - machine becomes a partner
Functional programming promotes flow:
- immutability
- composition
- horizontal abstractions - [I think this refers to things like sequences which are pervasive in Clojure code.]
These allow you to build new abstractions with familiar mechanisms. It is easy to pick up tools from the Clojure toolbox and start manipulating data.
The focus is on dataflow, not flow-charts.
Good examples of flow in Clojure embody flow in both their implementation and their use. This means:
- consistent granularity of data
- consistent, comfortable toolset
Code examples that embody flow:
- fill-queue
** treat asynchronous events like a seq
- enlive
** read the source to see mind blowing code
Sean Devlin - Protocols
Examples of using protocols to get homogeneous behavior on heterogeneous types.
- treating dates, strings, longs, etc. as “dates”
- making variants of map, etc. that return the same type of object they accept. So calling map on a String produces a String.
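A small sketch of the first example, assuming a hypothetical protocol AsDate with one function to-date (these names and the yyyy-MM-dd string format are mine, not Sean's):

```clojure
(import '(java.util Date) '(java.text SimpleDateFormat))

(defprotocol AsDate
  (to-date [x] "Coerces x to a java.util.Date."))

(extend-protocol AsDate
  Date
  (to-date [d] d)
  Long
  ;; Interpreted as milliseconds since the epoch.
  (to-date [millis] (Date. (long millis)))
  String
  ;; Assumes ISO-style dates such as 2010-10-22.
  (to-date [^String s] (.parse (SimpleDateFormat. "yyyy-MM-dd") s)))

(defn before?
  "Compares any two things that can be treated as dates."
  [a b]
  (.before ^Date (to-date a) ^Date (to-date b)))
```

With this in place, (before? 0 "2010-10-22") and (before? "2010-10-22" (Date.)) both work even though the argument types differ.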
When to use Protocols over multi-methods:
- for performance
- to get different behavior in different namespaces
Chouser - Finger Trees
slides
Great talk.
Implementation of ideas expressed in an academic paper in Haskell code. Finger trees are another persistent collection type that in some cases are more suitable than the built in Clojure types. In particular they are customizable and allow the creation of different kinds of data structures.
For example, you can build a double-list with finger trees:
- a list where you can use (first, rest, pop, peek) at both ends without losing the type of the structure
- get the count
- lookup items in O(log n)
- split/replace/remove/insert in O(log n)
These are built by defining the “meter” to use for the finger-tree. A single meter function is allowed but it can effectively be the composition of several functions. The meter defines how to split the tree and how to access indexed items in the tree.
The finger tree implementation in Clojure is not finished yet. Still needs to handle metadata and equality.
technomancy - Leiningen
slides
Additional lein features:
- checkouts - like multi-module projects, but ad-hoc and opt-in. You can tell lein to use some files in a local “checkouts” sub-directory to satisfy a dependency.
- shell wrappers - [for starting a project-less swank?]
- test selectors - allow conditional test execution - e.g. for tagging long-running integration tests
- tasks - just functions. Use eval-in-project to get JVM isolation between user tasks and lein itself
- hooks - use Robert Hooke to customize tasks that are outside of your control (otherwise just use functions to customize it)
- plug-ins - jars that define tasks or include hooks
- lein int - an interactive shell for lein tasks
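As an illustration of test selectors, here is a project.clj fragment. It is only a sketch: the project name, version, and selector names are hypothetical, and the exact setup may differ between Leiningen versions.

```clojure
;; project.clj (fragment); names and versions are placeholders.
(defproject example-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.2.0"]]
  ;; Each selector is a predicate over a test var's metadata.
  :test-selectors {:default     (complement :integration)
                   :integration :integration
                   :all         (constantly true)})
```

A test tagged with {:integration true} metadata would then be skipped by a plain "lein test" and picked up by "lein test :integration".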
Community building:
- identify low-hanging fruit to make it easy for people to contribute
- provide a developer guide
- accept patches promptly
- make commit rights easy to get
Laurent Petit - Counterclockwise
They now have 80% of paredit.
Rich Hickey - New Clojure Features
Focus of new features is performance, but the results also offer better semantics (!). They are breaking changes. A key to the performance improvements is to produce code that Hotspot will optimize.
Unified primitives and boxed numbers
- both act like primitives unless you use bigint operands
- bigints are contagious
- local numbers are represented as either longs or doubles (not shorts or ints)
- +, -, … are non-promoting operators
- +', -', … are promoting operators <- don’t use these, they are just here to keep people from whining, instead use contagion (as an aside, ' is now a valid character in identifiers)
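A short sketch of how these rules look at the REPL, based on the behavior as I understand it in released Clojure (1.3 onward); details may differ from what was shown in the talk. The var names are mine.

```clojure
;; Plain + does not promote: overflow throws instead of silently widening.
(def overflow-result
  (try
    (+ Long/MAX_VALUE 1)
    (catch ArithmeticException _ :overflow)))

;; Contagion: one bigint operand makes the whole result a bigint.
(def contagious (+ Long/MAX_VALUE 1N))

;; The promoting variant exists, but the talk discouraged relying on it.
(def promoted (+' Long/MAX_VALUE 1))
```

Here overflow-result is :overflow, while contagious and promoted are both the bigint 9223372036854775808N.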
Align map key equality with number equality
- the problem is that if a map key is Long(42) a lookup on Int(42) will fail even though Long(42) is numerically equal to Int(42)
- this has been a nagging problem
- the roots of the problem are in Java. This is how map equality is defined and there is no way to fix the root problem
- the solution is: when maps are accessed via Java they will provide the expected (bad) Java semantics, but when they are accessed via Clojure functions they will make map key equality match numerical equality
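A sketch of the difference, again based on released Clojure (1.3 onward) as I understand it. A plain java.util.HashMap stands in for the Java side; the var names are made up.

```clojure
;; An Integer and a Long holding the same number.
(def an-int (Integer/valueOf "42"))
(def a-long (Long/valueOf "42"))

;; Java's equals treats them as different...
(def java-equal? (.equals a-long an-int))

;; ...while Clojure's = compares them numerically.
(def clojure-equal? (= a-long an-int))

;; Lookup through Clojure functions follows numeric equality.
(def clojure-lookup (get {a-long :found} an-int))

;; Lookup in a plain java.util.HashMap follows Java's equals.
(def java-lookup (.get (java.util.HashMap. {a-long :found}) an-int))
```

Here clojure-lookup is :found, while java-lookup is nil because the Integer key does not equal the Long key by Java's rules.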
Bindings & threads
- current problem is that bindings only apply to the thread that creates them. So if a thread pool is used somewhere down the call stack (e.g. with pmap) then the bindings are lost.
- vars have root values
- bindings create a mapping from that var (they don’t change the root binding) to a “box”. Using set! the value in the box can be changed. Every time a new binding is executed, a new box is created.
- bindings assume a single-threaded model
- the key to making bindings work across threads is to make the “boxes” thread safe
- the solution is: the thread that established the binding can use set! to change the value, all other threads get read-only access to the binding. So sub-threads (i.e. threads enlisted in part of an overall “task”) can point to the same binding map as their “parent” thread. NOTE: They are not really child threads because typically they are threads from a pool.
- this is intimately related to the “scopes” problem in which we would like to create resource scopes (e.g. to close a file, etc) around lazy sequences, in which case the resource needs to be closed in a location far from the lexical scope where the resource is used.
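A minimal sketch of bindings being visible on a pool thread, based on how future behaves in released Clojure (1.3 onward). The var *request-id* is hypothetical.

```clojure
(def ^:dynamic *request-id* :root)

;; The future body runs on a pool thread, yet sees the caller's binding.
(def seen-in-future
  (binding [*request-id* :abc]
    @(future *request-id*)))

;; Outside any binding, every thread sees the root value.
(def seen-at-root @(future *request-id*))
```

Here seen-in-future is :abc and seen-at-root is :root. The pool thread gets read access to the binding; changing it with set! remains the privilege of the thread that established it.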
Pay for what you use & dynamic variables
- the current “def” is too powerful because it always creates vars which can be dynamically bound. This is “unfair” in that even if this dynamicity is not needed, the user still incurs a performance cost every time the variable is accessed.
- most defs are functions that are not going to be rebound
- every var access incurs some costs to determine if the variable has been dynamically bound
- solution: require the developer to explicitly identify defs that need to be dynamic (stylistically - the names of these variables and functions should include earmuffs, e.g. *foo*)
- Clojure will “presume stability”
- but, to accommodate the development-time use-case of redefining functions regularly: at the start of each function invocation a single check will be made against a var-universe-counter to make sure that no vars have been redefined (any time a var is changed the var-universe-counter will be incremented). Long-running functions should be invoked with #' to trigger a check of the var-universe-counter.
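A sketch of the explicit opt-in, using the ^:dynamic metadata syntax from released Clojure (1.3 onward), which may not be exactly what was shown in the talk. The var names are hypothetical, and the var-universe-counter is not illustrated here.

```clojure
;; Opt in to dynamic binding explicitly; earmuffs are the naming convention.
(def ^:dynamic *log-level* :info)

;; An ordinary def is presumed stable and cannot be rebound with binding.
(def max-retries 3)

(def rebound
  (binding [*log-level* :debug] *log-level*))

(def rebinding-plain-var
  (try
    (binding [max-retries 5] max-retries)
    (catch IllegalStateException _ :not-dynamic)))
```

Here rebound is :debug, and rebinding-plain-var is :not-dynamic because binding refuses to touch a var that was not declared dynamic.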
Functions & primitives
- Currently pulling code out to a function causes the args and return values to be boxed.
- solution: add type hints to the args and return values to indicate primitives. The return value type hint is placed just before the arg vector to accommodate multiple arities.
- future work will handle making higher order functions (e.g. map, reduce) preserve primitive performance.
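A sketch of the hint placement, using the syntax from released Clojure (1.3 onward). The function names add-longs and scale are made up.

```clojure
;; The return hint sits before the arg vector, so each arity can have its own.
(defn add-longs
  (^long [^long a] a)
  (^long [^long a ^long b] (+ a b)))

;; Primitive types can be mixed: a double and a long in, a double out.
(defn scale
  ^double [^double x ^long n]
  (* x n))
```

Only long and double are supported as primitive hints here, which matches the point above that local numbers are represented as either longs or doubles.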