Stefan Krawczyk

Stefan Krawczyk · 2025-08-21T13:04:01.034Z

Great abstractions die hard. Yes, Apache Hamilton (& Apache Burr) are still very relevant and continue to be discovered, apropos post below. Note we're working hard on an Apache release, but because there is a lot of surface area to satisfy release requirements things are taking longer than anticipated (we're limited by contributor bandwidth) .. stay tuned!

San Francisco, California, United States


7K followers 500+ connections


Salesforce

Y Combinator

About

With over 10 years of experience in building and leading data & ML related systems and…

Articles by Stefan

February Updates
Feb 20, 2025
TL;DR: #Hamilton highlights: crossed 2000 github stars, released multithreading based DAG parallelism, RichProgressBar…
3 Comments

Last week of 2024 / first week of 2025
Jan 2, 2025
TL;DR: #Hamilton + #Burr 2024 stats: 35M+ telemetry events (10x), 100K+ unique IPs (10x) from 1000+ companies, 1M+…
3 Comments

Week of December 9th
Dec 13, 2024
TL;DR: #Hamilton release highlights: Better TypedDict support and modular subdag example Office Hours & Meet ups for…

Week of December 2nd
Dec 5, 2024
TL;DR: #Hamilton release highlights: Async Datadog Integration, Polars & Pandas with_columns support. #Burr release…

Week of November 18th
Nov 22, 2024
TL;DR: #Hamilton release highlights: SDK configurability #Burr release highlights: parallelism UI modifications, video…

Week of November 11th
Nov 15, 2024
TL;DR: #Hamilton release highlights: async support for @pipe + various small fixes #Burr release highlights:…

Week of November 4th
Nov 8, 2024
TL;DR: #Hamilton release highlights: @with_columns decorator for Pandas by Jernej Frank & module overrides for async…

Week of October 28th
Oct 31, 2024
TL;DR: #Hamilton release highlights: in-memory cache store. #Burr release highlights: release candidate for a first…

Week of October 21st
Oct 24, 2024
TL;DR: #Hamilton release highlights: some minor fixes and docs updates from five different OS contributors! Also…

Week of October 14th
Oct 17, 2024
TL;DR: Announcing Shreya Shankar as an advisor. #Hamilton release highlights: tweaks to pipe_input, new…
3 Comments

Activity

7K followers

Stefan Krawczyk reposted this

Aron Kale

2w
Agent Script and the new Agentforce Builder helped Datasite increase deflection from the low 60s to 82%. Take it from Datasite admin Grant Roberson: “That’s where we are with Script. It really takes away the frustration because you know, without a shadow of a doubt, it’s going to run 100% of the time every time... It’s working so well that there’s no reason to stay in the legacy builder anymore.” The tools are ready. The outcomes are real. Read more: https://lnkd.in/gjQE_X9u

Proving the Power of Script: How Datasite Agents Achieved 82% Deflection and 4.8/5 CSAT
1 Comment

Stefan Krawczyk reposted this

Hugo Bowne-Anderson

4mo
Building reliable AI-powered software is difficult and we're still figuring it out as a discipline. Stefan Krawczyk & I have had the great pleasure not only to teach what we know, drawing on decades of work in data science & ML, but also to learn so much from the builders in our course and community. If you're interested in joining our final cohort in Q1, 2026, use the code `HAPPYHOLIDAYS` by Dec 31 for 40% off: https://lnkd.in/g-PvGsJG aaaaaand happy holidays, everyone! 🥳

Stefan Krawczyk reposted this

Jose ‘𝒥𝒪’ Reyes

4mo
After completing an AI evals course a few weeks ago, I learned a lot but didn't have time to write about it until now. Taking the course by Hamel Husain and Shreya Shankar significantly reshaped how I think about LLM evaluations. Although it covered many concepts, the most impactful lesson for me was learning how to approach failure analysis properly, starting with collecting traces, then moving through open coding and axial coding. Since this topic goes far deeper than a single article, I’ve started a series exploring how to build evals for a theoretical bike shop chatbot. Building reliable chatbots made easy by using Apache Burr (thanks Stefan Krawczyk). https://lnkd.in/gmesD7-f

#11 - A Developer’s Guide to Failure Analysis and Evaluation for Single-turn and Multi-turn Chatbots

Stefan Krawczyk reposted this

Bryan Bischof

4mo
I really dislike “ai evals are unit tests” and this is partly why!

Eddie Landesberg

4mo
After my last post, I had a thought-provoking discussion in the comments with Dave Spiegel about the role of unit tests in AI evaluation. It clarified a critical distinction: When do we treat evaluation as an Engineering problem vs. a Science problem?

Traditional software testing is actually solving a causal problem (avoiding counterfactual deployments that would tank the product), but we don't treat it like science. Why? Because the causal mechanism is obvious. We don’t need to run a Randomized Control Trial (RCT) to prove that a 500 Internal Server Error or a NullPointerException hurts retention. The link between "The site crashed" and "Users left" is so well-understood that proving it with statistics would be purely performative rigor. So we treat it as an engineering constraint: Pass/Fail.

In AI, we have the same "Obvious" failures. If your LLM outputs broken JSON or hallucinates a URL that yields a 404, the causal link to user dissatisfaction is clear. You should absolutely treat these like unit tests. Block the deployment.

The danger comes when we treat "Subtle" qualities like "Obvious" ones. Is a helpful 500-word response better than a concise 50-word response? Is a "friendly" tone better than a "neutral" one? Here, the causal mechanism driving revenue and retention is opaque. If you try to "Unit Test" for helpfulness, you end up hard-coding proxies (e.g., "Answer must be >200 words") that often hurt the very user value you’re trying to improve.

The heuristic I'll be using going forward:
1. Constraints (The Floor): Flaws where the causal link to failure is obvious (JSON, Toxicity, Crashes). Treat these as Unit Tests.
2. Objectives (The Ceiling): Qualities where the causal link to value is learned (Helpfulness, Reasoning, Style). Treat these as Causal Estimators.

You need Unit Tests to define the Admissible Set of models (the ones that aren't broken). But you need Causal Evaluation to rank the Optimal Set (the ones that drive value).

Don't mistake the floor for the ceiling. #AI #LLMOps #DataScience #CausalInference #Engineering
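
One way to picture the floor/ceiling split in the post above is the minimal sketch below. It is an illustration under assumptions, not the post author's method: the function names, the required `"answer"` key, and the `estimated_value` scores are all hypothetical. The floor is a pass/fail check (here, only "is the output valid JSON with the expected key"), and the ranking step simply consumes whatever value estimate a separate evaluation produced.

```python
import json


def passes_constraints(output: str) -> bool:
    """Floor: a pass/fail check for an 'obvious' failure (broken JSON)."""
    try:
        payload = json.loads(output)
    except json.JSONDecodeError:
        return False
    # Hypothetical schema: a JSON object that carries an "answer" field.
    return isinstance(payload, dict) and "answer" in payload


def admissible(candidates: dict) -> list:
    """Keep only models whose sampled outputs all pass the floor."""
    return [
        name
        for name, outputs in candidates.items()
        if all(passes_constraints(o) for o in outputs)
    ]


def rank(models: list, estimated_value: dict) -> list:
    """Ceiling: order admissible models by an externally estimated value."""
    return sorted(models, key=lambda m: estimated_value[m], reverse=True)


# Hypothetical sampled outputs and value estimates, for illustration only.
samples = {
    "model_a": ['{"answer": "yes"}', '{"answer": "no"}'],
    "model_b": ['{"answer": "yes"}', "not json at all"],
    "model_c": ['{"answer": "maybe"}'],
}
value = {"model_a": 0.4, "model_b": 0.9, "model_c": 0.7}
shortlist = rank(admissible(samples), value)
```

Note that `model_b` is dropped despite its high value estimate: the floor is applied before any ranking happens.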

Stefan Krawczyk shared this

5mo
ICYMI: (1) the thinking that went into #ApacheHamilton and #ApacheBurr has helped us shape some new capabilities coming out at Salesforce. Basically production agents are a mixture of workflows with LLM call loops - the "Agent Graph". If this topic tickles your fancy, would love to chat -- I'll drop a link to the blog post in the comments -- don't know what Apache Hamilton or Apache Burr are? Links in the comments. (2) The next cohort for my Maven course with Hugo Bowne-Anderson starts next week. If you're struggling to get to production with your LLM workflows / Agents, and/or want a primer over the broad range of topics that you need to get right to get them to production -- then sign up with my promo code for 25% off! This will be good until the course starts! Link in the comments.

1 Comment

Stefan Krawczyk shared this

6mo
👋 It's been a while. Two updates this week: (1) I'm excited to be at my first "Coachella of B2B/C Software", aka #Dreamforce25, this week and for everyone to get a preview of some of the work Elijah ben Izzy and I have been driving on as part of #Agentforce Salesforce. (2) The first #ApacheHamilton (incubating) release also went out. We have a host of new features, e.g. `@unpack_fields`, all user contributed! This helps us to get the ball rolling to be able to continue to push out updates as an Apache project. You can install it via `pip install apache-hamilton==1.89.0` or `pip install sf-hamilton==1.89.0` -- we'll be dual publishing for the foreseeable future. For release notes join the users@hamilton.apache.org mailing list or slack :)

5 Comments

Stefan Krawczyk reposted this

Hugo Bowne-Anderson

6mo
The 6th Annual MLOps World | GenAI Summit kicks off this week in Austin. For those who can make it, it’s an incredible conference and one I’m proud to have helped shape as part of the program committee. I’ve worked closely with David Scharbach and his team for years. They’re true community builders, bringing together practitioners pushing the boundaries of AI from the model layer to the application layer. Fun fact: our Building LLM-Powered Applications course with Stefan Krawczyk actually started as a workshop here before evolving into the full 4-week course it is today. If you’re attending, don’t miss these sessions from the AI Builder trenches about what's working (and what isn't!). Use the code `friendsofhugo` for 50% off.

David Scharbach

6mo
Since the most valuable insights come from honest conversations, the goal here is to help you gain real traction with your AI initiatives. We'll be discussing a series of honest lessons about what’s working (and what isn’t). Not a Vegas vendor show, but real practical knowledge sharing on scaled enterprise projects. The 6th Annual MLOps World | GenAI Summit agenda is selected by committee. I'm very thankful for all the enterprise leaders here participating in our roundtable lightning talks + round table discussions. These sessions aren't recorded, the focus is on the personal conversations, and the core takeaways we can all share. Here's to everyone democratizing the learning lessons for our community!

Big thank you to speakers and roundtable participants:
Vaibhav Misra, Director & Distinguished Engineer, Capital One - RAG Architecture
Nitin Kumar, Director Data Science, Marriott International - A modular framework for building agentic workforces
Devdas Gupta, Sr. Manager Software Development & Engineering Lead, Charles Schwab - AI Powered Development Productivity
Milan Rana, Software Engineer Advisor, FedEx - DevSecPrivacyAIops
Dippu Singh, Leader For Emerging Data & Analytics, Fujitsu Frontech North America - Explainable AI
Naveen Reddy Kasturi, Staff Machine Learning Engineer, Realtor.com - Agent powered code migration
Ravi Shankar, Manager, Data Science, DICK'S Sporting Goods - Streamlining ML Collaborations
Prasanth Nandanuru, Managing Director, Wells Fargo - ROI of Gen AI Frontier models vs traditional models
Balaji Varadarajan, Lead AI Engineer - Digital Personalization, Target - Building Sustainable GenAI Systems

If you think this is interesting, you're welcome to join us! -->> 🔗 https://lnkd.in/eH2t_g2M 📍Oct 8-9th, City of Austin, Renaissance Austin Hotel

Stefan Krawczyk shared this

8mo
Great abstractions die hard. Yes, Apache Hamilton (& Apache Burr) are still very relevant and continue to be discovered, apropos post below. Note we're working hard on an Apache release, but because there is a lot of surface area to satisfy release requirements things are taking longer than anticipated (we're limited by contributor bandwidth) .. stay tuned!

🎧 Eric Riddoch

8mo
Have you heard of Hamilton by DAGWorks Inc.? I wasn't a believer in *Apache* (congratulations! 🎉) Hamilton until I realized how much of a beast SageMaker Pipelines, Airflow, Databricks Jobs, and Kubeflow pipelines are. Airflow is not for iteration. SageMaker Pipelines is not for iteration. I'd argue Kubeflow Pipelines and Databricks Jobs are not ideal for iteration.

===

These tools are for when you already have a polished, ready-to-go ML pipeline, and now you're ready to deploy it in prod on a schedule or trigger.

===

Some DS has spent a long time iterating towards it in a Jupyter notebook. Then an MLE comes along and helps port that notebook into a script, builds some docker images and pushes them, adds some alert logic, blah blah blah--until it's all ready to go. When the MLE is finished, the DS doesn't even recognize their own pipeline anymore. If a problem were to arise in the pipeline in production, it's a big lift for the DS to reproduce what went wrong in their IDE of choice--at best VS Code / PyCharm with a mix of notebooks and scripts--at worst a notebook.

With Hamilton, you organize your code into a DAG, similar to Metaflow or ZenML. The DS can do this in their notebook. They can then hand that DAG to an MLE, who can selectively run portions of that DAG in different Airflow-SageMaker-Kubeflow steps. E.g.
run the "preprocessing" part on a CPU machine
run the "model train" part on a more expensive GPU machine
run the "evaluation" part on a CPU machine
...

But it's all the same code. The MLE didn't need to change anything to get the DS code pipeline-ified. This also means the MLE could just share a template with the DS that can consume a Hamilton DAG--and now the DS is much more able to self serve. This allows the DS to use Hamilton to iterate and one of the high-learning-curve pipeline tools to deploy.

===

There are definitely unanswered questions I have like: if a failure happens in prod, how can the DS get those artifacts into their Hamilton DAG and iterate using those locally (equivalent of Metaflow's "resume" command). But if I were using SageMaker Pipelines on a team, I think I'd want DS to use something like Hamilton.

===

ZenML and Metaflow and probably Flyte are tools that I think *do* belong in the iteration phase. They are meant for DS to be able to use them directly. I'm not (yet) sold on using Hamilton PLUS one of these-- because if the value prop of both is easier iteration, testing, and handoffs, why learn 2 DAG frameworks?

===

A last wrench to add to all this is, Runhouse and Modal exist. These are tools that are more lightweight than a full on DAG orchestrator. If you use Airflow or Prefect or Dagster or any other orchestrator, you can trigger fancy, heterogeneous-compute-y ML tasks by kicking them off as Kubetorch jobs. Both these and Hamilton lack scheduling. So--would you use Hamilton + Modal + an orchestrator? There are a lot of possible combinations. I'm going to think about this some more.
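
The "same code, run in portions" idea from the post above can be sketched in plain Python. This is a toy illustration and deliberately not Hamilton's API: the `execute` helper and every node name are hypothetical. The assumption it borrows is the declarative convention that each function is a node and its parameter names identify the upstream nodes it depends on.

```python
import inspect


# Each function is a node; its parameter names are its upstream dependencies.
def raw_rows():
    return [1.0, 2.0, 3.0, 4.0]


def preprocessed(raw_rows):
    peak = max(raw_rows)
    return [x / peak for x in raw_rows]


def trained_model(preprocessed):
    # Stand-in for training: the "model" is just the mean of the features.
    return sum(preprocessed) / len(preprocessed)


def evaluation(trained_model, preprocessed):
    return max(abs(x - trained_model) for x in preprocessed)


NODES = [raw_rows, preprocessed, trained_model, evaluation]


def execute(functions, targets, inputs=None):
    """Compute only the nodes needed for `targets`.

    `inputs` lets a caller inject precomputed upstream values, which is how
    one pipeline step could pick up artifacts produced by another step.
    Returns the requested values and the list of nodes that actually ran.
    """
    nodes = {fn.__name__: fn for fn in functions}
    results = dict(inputs or {})
    ran = []

    def resolve(name):
        if name not in results:
            fn = nodes[name]
            kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
            results[name] = fn(**kwargs)
            ran.append(name)
        return results[name]

    return {t: resolve(t) for t in targets}, ran


# Step 1 (say, a CPU machine): run only the preprocessing portion.
step1, ran1 = execute(NODES, ["preprocessed"])

# Step 2 (say, a GPU machine): reuse step 1's artifact, run only training.
step2, ran2 = execute(NODES, ["trained_model"], inputs=step1)
```

The functions themselves never change between the notebook and the orchestrated steps; only the requested targets and injected inputs differ.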

2 Comments

Stefan Krawczyk reposted this

Hamel Husain

9mo
Great advice from Bryan Bischof on "What Evals Framework should you use" He's a guest speaker in our evals course! https://lnkd.in/g_C_kXdU

8 Comments

Stefan Krawczyk reacted on this

Erran Berger

15h
In 2009 I joined LinkedIn as a senior software engineer. Today I become its CTO, Engineering - an opportunity I don’t take lightly. I couldn’t have imagined then what this moment would look like. The transformation underway now goes beyond software; it’s reshaping the fundamental nature of work itself. And yet here I am, at what I believe is the most significant inflection point in the software industry in a generation. The companies that navigate it well will be defined by how clearly they see the shift and how deliberately they move. Technology underpins both more than in any prior moment. The last few years have had a startup-era intensity I haven’t felt since LinkedIn’s early days: every decision matters, the margin for error is thin, and the stakes are real... but I suspect years from now, this will be the work I’m most proud of. Day one. Let’s build.
236 Comments

Stefan Krawczyk reacted on this

Aman Srivastava

5d
Hot take: I am the reason #ClaudeCode has token limits. My prompts, verbatim: - "Commit and push these changes." - "Pull the logs from gcloud and verify yourself." - "Create a branch, run the tests, fix whatever breaks." Then I go make protein shake. Then I come back, see the context limit warning, and think - Anthropic really needs to fix their infrastructure. Sir, you sent an entire deployment pipeline as a casual afternoon prompt. The limit is not the problem. 🥲

Stefan Krawczyk liked this

Capella Kerst, Ph.D.

4d
Huge thanks to TechCrunch and Isabelle for featuring us on the Build Mode podcast! It was such an honor to sit down and discuss the journey of geCKo Materials. 🦎✨ Being in "Build Mode" is a constant reality for us as we scale our bio-inspired dry adhesive technology from the lab to global industries like semiconductors and logistics. Grateful for the platform to share how we’re rethinking the way the world (and space!) sticks together. Check out the episode to hear more about our mission and where we're headed next! 🚀 #TechCrunch #BuildMode #Startups #DeepTech #geCKoMaterials #Innovation

TechCrunch

5d
"It works!" This surprise breakthrough led to geCKo Materials launching, and eventually landing customers like Apple, Ford, and General Motors with its especially adhesive product. Listen to the full episode of Build mode to hear how founder and CEO, Capella Kerst, went from Stanford PhD to CEO: http://spr.ly/6043B6fgDh

A Sticky Evolution From Major Lab Breakthrough to VC-Backed Startup │ Build Mode Podcast
20 Comments

Stefan Krawczyk liked this

Tytus Cytowski

6d
🚨🦄 We represented 🇭🇷 Croatia-based Farseer in its $7.2M Series A with AYMO Ventures. Farseer is revolutionizing how we interact with Excel-like products. 👏🙌🤝 Thank you Luka Mijatović and Matija Nakic for the trust. Special shout out also to Silvije Radišić, Nikola Livaković and Mauro Viskovic for the very lively Series A negotiations. Our team consisted of me, Eresi Tracy Uche, Kunal Kolhe, Fabiana Morales Centurion and Heidi Fan. More here on the transaction https://lnkd.in/egs47QUn

Farseer Secures €6.07M Series A to Scale FP&A Platform Across Europe and North America

Experience & Education

Salesforce


Publications

Hamilton: a modular open source declarative paradigm for high level modeling of dataflows

1st International Workshop on Composable Data Management Systems, CDMS@VLDB 2022, Sydney, Australia, September 9, 2022
https://cdmsworkshop.github.io/2022/Proceedings/ShortPapers/Paper6_StefanKrawczyk.pdf

Hamilton: enabling software engineering best practices for data transformations via generalized dataflow graphs

1st International Workshop on Data Ecosystems, co-located with the 48th International Conference on Very Large Databases (VLDB 2022), September 5, 2022
https://ceur-ws.org/Vol-3306/paper5.pdf

Citation-based bootstrapping for large-scale author disambiguation

Journal of the American Society for Information Science and Technology, February 14, 2012
Work that I did with the NLP group.
Abstract: We present a new, two-stage, self-supervised algorithm for author disambiguation in large bibliographic databases. In the first “bootstrap” stage, a collection of high-precision features is used to bootstrap a training set with positive and negative examples of coreferring authors. A supervised feature-based classifier is then trained on the bootstrap clusters and used to cluster the authors in a larger unlabeled dataset. Our self-supervised approach shares the advantages of unsupervised approaches (no need for expensive hand labels) as well as supervised approaches (a rich set of features that can be discriminatively trained). The algorithm disambiguates 54,000,000 author instances in Thomson Reuters' Web of Knowledge with B3 F1 of .807. We analyze parameters and features, particularly those from citation networks, which have not been deeply investigated in author disambiguation. The most important citation feature is self-citation, which can be approximated without expensive extraction of the full network. For the supervised stage, the minor improvement due to other citation features (increasing F1 from .748 to .767) suggests they may not be worth the trouble of extracting from databases that don't already have them. A lean feature set without expensive abstract and title features performs 130 times faster with about equal F1.

Probabilistic Ontology Trees for Belief Tracking in Dialog Systems

Proceedings of SIGDIAL 2010: the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Sep 2010
Wrote the manual system they used here.

Investigating SMS Text Normalization using Statistical Machine Translation

CS224N, Stanford University, Stanford, CA, 2009
Class project on using statistical machine translation to convert SMS messages in 'text speak' back into normal English. We've had many requests for our data set and have had our project cited a couple of times!

Grid resource allocation: allocation mechanisms and utilisation patterns

Proceedings of the sixth Australasian workshop on Grid computing and e-research - Volume 82, 2008
Conference paper based on my honours thesis.

Patents

Belief tracking and action selection in spoken dialog systems

US 8676583
An action is performed in a spoken dialog system in response to a user's spoken utterance. A policy which maps belief states of user intent to actions is retrieved or created. A belief state is determined based on the spoken utterance, and an action is selected based on the determined belief state and the policy. The action is performed, and in one embodiment, involves requesting clarification of the spoken utterance from the user. Creating a policy may involve simulating user inputs and spoken dialog system interactions, and modifying policy parameters iteratively until a policy threshold is satisfied. In one embodiment, a belief state is determined by converting the spoken utterance into text, assigning the text to one or more dialog slots associated with nodes in a probabilistic ontology tree (POT), and determining a joint probability based on probability distribution tables in the POT and on the dialog slot assignments.

TEAM MEMBER RECOMMENDATION SYSTEM

US

Courses

Machine Learning
CS229

Natural Language Processing
CS224N

Natural Language Understanding
CS224U

Speech Recognition and Synthesis
CS224S

Projects

Algorithms Tech Branding

Aug 2016 - Present
A self organized group managing https://multithreaded.stitchfix.com/algorithms/blog/ and branding for the Algorithms organization.

Nextdoor Feature Config

Jun 2014

Designed and implemented a library for rolling out new features at Nextdoor.

This utilized our open source ZooKeeper library (https://github.com/Nextdoor/ndserviceregistry) to handle storing and accessing feature configurations.

R3

Jun 2014

In the theme of grassroots innovation I initiated & led an effort to give everyone at Nextdoor the time to work on anything they wanted. Giving people the time to do anything can be scary, so to align & orient everyone, the team came up with some steps that people could refer to: "Reflect, Reinvent & Refine", and hence the name R3.

Structured Application Logging

Mar 2014

1) Wrote a Python log handler that converted application logs to structured JSON objects for easier ingestion and consumption downstream.
2) Implemented ingestion and consumption of structured logs using Apache Flume, linking to Elasticsearch and S3.
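
As a rough illustration of the first item, here is a minimal sketch of structured JSON logging with Python's standard `logging` module. It is not the original Nextdoor code: the class name, the field names, and the choice to do the conversion in a `Formatter` attached to a stock `StreamHandler` (rather than a custom handler class) are all assumptions made for the example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback text when the record carries an exception.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream):
    """Create a logger that writes JSON lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep the root logger from printing duplicates
    return logger


# Usage: write to an in-memory buffer and parse the line back.
buffer = io.StringIO()
log = build_logger("structured_demo", buffer)
log.info("user %s signed in", "alice")
record = json.loads(buffer.getvalue())
```

Because every line is a self-describing JSON object, a downstream collector can index on fields such as `level` or `logger` without regex-parsing free-form text.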
Nextdoor.com Holiday Lights Map

Nov 2013
During the holiday season, Nextdoor members are able to add themselves to a map of their neighborhood indicating the homes with holiday lights and upload festive photos and messages.

LinkedIn Idea Bank

Aug 2012 - Present
The LinkedIn Idea Bank is a way for LinkedIn employees to reach the entire LinkedIn organization with their ideas, find like-minded people, iterate on their ideas, and allow people to track their progress.

It sports a Quora-meets-Pinterest type interface and runs on Play 2.0 with MongoDB on the backend.

[in]cubator

Mar 2012 - Present
Worked on a program that gives employees up to 90 days to spend polishing their hacks.

Specifically, I helped form the committee and philosophy, and drove building the internal website that now drives grassroots innovation at LinkedIn.

Trunkstats

Jan 2012 - Jul 2012

For hackday I wrote a tool to keep metrics on the status of trunk here at LinkedIn. This tool was a simple website using the Play framework and Google Charts to show trunk health and check-in trends. It was used weekly for engineering status reports by the VP of engineering. It fell out of use 6 months later when the tools team finally caught up and wrote their own more integrated tool.

[in]sightful

Apr 2011 - May 2011

For hackday I mashed together the feedback that comes into the site with the user's profile data, and stuck that into Lucene. I used LinkedIn's add-ons Bobo & Zoie to add real-time indexing and faceted search. So rather than doing text search in an email client, people could do a full text search with Lucene and also slice and dice by profile data.

Stanford Masters Admissions Committee 2010

Jan 2010 - Mar 2010

Was one of the student representatives on the Stanford CS Master's Admissions Committee.
SMS Text Normalization

Apr 2009 - Jun 2009
A system for converting textspeak (language used in SMS communication) to proper English using statistical machine translation. Presented in the PhD Poster Session of Stanford Computer Forum's annual affiliates meeting in April 2010. Poster available at http://forum.stanford.edu/events/posterslides/SMSTextNormalizationusingStatisticalMachineTranslation.pdf.


Languages

English
Native or bilingual proficiency

Polish
Professional working proficiency

Japanese
Elementary proficiency

Spanish
Elementary proficiency

Organizations

VUW Handball Club

President, Webmaster, VUWSA Club Sports Council Representative

Jan 2005 - Dec 2006

Was a founding member and helped run the club. Was president for one year.
VUWSA Blues Panel

Elected Member

Mar 2006 - Nov 2006

This position involved reviewing applications, debating and determining VUW sports awards for student athletes.
VUWSA Sports Council

Elected Member

Feb 2006 - Nov 2006

Dealt with sports related topics concerning the VUWSA.

