When I worked at Meta in 2017, I was on a team of 17 people. 15 of the 17 were on H1b visas. I was one of two Americans on the team
Just for core growth data engineering, that’s $1.5m in visa fees under the new rules.
If you’re an American looking to land a big tech role, now
AI Engineering has levels to it:
– Level 1: Using AI
Start by mastering the fundamentals:
-- Prompt engineering (zero-shot, few-shot, chain-of-thought)
-- Calling APIs (OpenAI, Anthropic, Cohere, Hugging Face)
-- Understanding tokens, context windows, and parameters
Getting into #dataengineering is actually pretty easy
- learn SQL
- learn Python
- learn Snowflake/BigQuery/DataBricks
- learn data modeling
- learn data pipelines with Airflow
If you learn these 5 things, you’ll be interview-ready for a junior position for sure
SQL has levels to it:
- level 1
SELECT, FROM, WHERE, GROUP BY, HAVING, LIMIT
Master these basic keywords and you’ll be well on your way to mastering SQL.
- level 2
Mastering JOINs:
Most common JOINs: INNER and LEFT
Less common JOINs: FULL OUTER
Joins you should avoid
I created a public Github repo with all the resources, books, companies, and social media accounts you should be following to stay current on data engineering topics.
I'm accepting PRs so we can crowdsource this effort!
github.com/DataEngineer-i…#dataengineering
You only need to read four books to truly get what’s going on in ML and data engineering:
- Fundamentals of Data Engineering by Joe Reis
- Designing Data Intensive Applications by Martin Kleppmann
- AI engineering by Chip Huyen
- Designing Machine Learning Systems by Chip
SQL has levels to it:
- level 1
SELECT, FROM, WHERE, GROUP BY, HAVING, LIMIT
Master these basic keywords and you’ll be well on your way to mastering SQL.
- level 2
Mastering JOINs:
Most common JOINs: INNER and LEFT
Less common JOINs: FULL OUTER
Joins you should avoid
If I had to start learning #dataengineering all over again, I’d follow this plan, mostly in order:
- Learn SQL
— Aggregations with GROUP BY
— Joins (INNER, LEFT, FULL OUTER)
— Window functions
— Common table expressions
- Learn about data modeling
— read about data
I worked 2 years each at Meta, Airbnb and Netflix. Their engineering stacks are different and cultures have pros and cons.
- Meta
Stack I used: Hive, Spark, HDFS, Dataswarm, Unidash, Deltoid
Pros:
Tons of motivated people willing to help you
Great social events to make