Example:
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(analyzer=lambda x: x.split(), input='filename')
cv.fit(['hello world']).vocabulary_
The same happens with input="file". I'm not sure whether this should be fixed or just documented; I don't like changing the behavior of the vectorizers yet again...
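If it ends up being documentation only, one possible workaround is to leave input at its default and have the callable open the file itself. A minimal sketch, assuming the documents are passed as paths to UTF-8 text files; analyze_file is a hypothetical helper, not part of scikit-learn:

```python
import os
import tempfile

from sklearn.feature_extraction.text import CountVectorizer


def analyze_file(path):
    # Hypothetical helper: read the file ourselves, then split on whitespace.
    with open(path, encoding="utf-8") as f:
        return f.read().split()


with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway document so the example is self-contained.
    path = os.path.join(tmp, "doc.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello world")

    # input is left at its default ('content'), so the path string is
    # handed to the analyzer unchanged and the analyzer does the reading.
    cv = CountVectorizer(analyzer=analyze_file)
    vocab = cv.fit([path]).vocabulary_

print(sorted(vocab))
```

This sidesteps the question of how input interacts with a callable analyzer, at the cost of putting the file handling (and encoding choice) inside the analyzer.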