Murillo Henrique Pessoa de Lima
From Brazil (GMT-03:00)
$35/hr or $70,000/yr

Active over a week ago


Member since Dec 2025


Data Engineer
Available for hire
Years of experience
6+ years
Available for
Full-time, Part-time, Contract

I have been a Data Engineer for 6 years, with 5 years contributing remotely to international teams, demonstrating a strong ability to design and implement scalable data solutions. I have achieved significant results in developing and optimizing ETL/ELT pipelines using modern tools like DBT and Apache Airflow. My expertise covers end-to-end data platform components, including data ingestion, transformation, data quality checks, and CI/CD implementation. I am looking for a Data Engineering role where I can leverage my expertise in Python, SQL, AWS, GCP, and Terraform to build robust, scalable, and efficient cloud-native data architectures. What sets me apart is my proven track record in architecting and executing end-to-end cloud data pipelines and my ability to work cross-functionally with data scientists, analysts, and BI developers to define and address complex data requirements.

Employment History

Data Engineer (L3) at Cherre, 2025 - Present
- Orchestrated complex data ingestion workflows in a fast-paced startup by designing and deploying 30+ ETL/ELT pipelines with Python, Apache Airflow, and DBT, integrating disparate sources (SFTP, APIs, and internal systems) into a unified, reliable data layer for downstream analytics.
- Optimized high-volume SQL queries and transformation logic in BigQuery by implementing advanced partitioning and clustering strategies, reducing pipeline execution time by over 60% and significantly lowering monthly compute costs.
- Established a data quality and observability framework using Datafold and custom Python scripts alongside the DBT workflow, improving data reliability and proactively detecting anomalies before they impacted production datasets.
- Standardized development workflows by implementing Git-based version control and dependency management with Poetry, facilitating smoother collaboration among distributed engineers and reducing environment setup time and errors.
- Improved pipeline efficiency by configuring source freshness checks on DBT transformation models to trigger runs only when source data has changed, eliminating unnecessary compute costs from redundant executions.
Data Engineer Specialist at Bayer Crop Science 2022 - 2025
- Accelerated the software delivery lifecycle by building automated CI/CD workflows with GitHub Actions and Terraform (IaC), cutting ingestion pipeline deployment time from days to hours and eliminating manual deployment errors.
- Designed and maintained 30+ scalable end-to-end data pipelines using DBT and Dataform, transforming raw data from diverse sources (APIs, web scraping, databases) into actionable insights in BigQuery for cross-functional data science and analytics teams.
- Developed and deployed high-performance RESTful APIs using FastAPI on GCP Cloud Run, creating a secure, scalable, low-latency interface for external systems to push data into the data warehouse.
- Optimized data ingestion pipelines for web-sourced files by implementing incremental logic and file bookmarking, eliminating redundant downloads and significantly reducing pipeline execution time while ensuring data freshness.
- Revamped high-latency GraphQL endpoints in a Python Flask application, optimizing query performance to cut data retrieval time from 10 seconds to under 1 second and substantially improving the experience for data consumers.
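The file-bookmarking idea from the fourth bullet reduces to remembering which files were already ingested so reruns skip them. A minimal sketch, with the bookmark file name and listing format as assumptions rather than the actual Bayer implementation:

```python
import json
from pathlib import Path

# Illustrative bookmark store: a JSON list of file names already ingested.
BOOKMARK_PATH = Path("ingested_files.json")

def load_bookmarks(path: Path = BOOKMARK_PATH) -> set[str]:
    """Read the set of already-ingested file names (empty on first run)."""
    if path.exists():
        return set(json.loads(path.read_text()))
    return set()

def select_new_files(available: list[str], seen: set[str]) -> list[str]:
    """Return only files not ingested before, so nothing is downloaded twice."""
    return [name for name in available if name not in seen]

def save_bookmarks(seen: set[str], path: Path = BOOKMARK_PATH) -> None:
    """Persist the bookmark set after a successful run."""
    path.write_text(json.dumps(sorted(seen)))
```

A pipeline run would list the remote directory, feed the names through `select_new_files`, download the remainder, then update the bookmarks only after the load succeeds, keeping reruns idempotent.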
Senior Data Engineer at Bayer Crop Science 2021 - 2022
- Architected a large-scale cloud migration strategy, moving mission-critical on-premises infrastructure (SAP HANA, Oracle) to AWS, enabling the client to leverage scalable cloud-native services and reducing operational costs.
- Transformed legacy data logic by migrating 50+ complex SQL views into PySpark jobs in AWS Glue, enhancing processing of large datasets and adopting a partition strategy that improved query performance by 20%.
- Implemented a secure Data Mesh architecture by configuring AWS Lake Formation for granular permission management, ensuring strict data governance and compliance across producer and consumer domains.
- Developed a modular Python package built on the AWS SDK to standardize data catalog interactions and metadata handling, adopted by the data engineering team to improve code reusability and streamline development across PySpark pipelines.
- Streamlined team collaboration and release management by designing a structured code repository and CI/CD pipelines with Terraform, increasing deployment frequency and reducing merge conflicts for the engineering team.
Data Engineer at Traive 2021 - 2021
Developed 10+ ETL pipelines from multiple sources (databases, web scraping, flat files, etc.). Built GitLab CI/CD pipelines using the Python AWS CDK to streamline deployment and cloud infrastructure management.
Data Analyst at Bayer Crop Science 2020 - 2021
Automated 20+ ETL tasks with Python and R scripts for analytics and engineering reporting. Developed a web application for data collection on AWS, using React for the front end and Python for the API backend.

Education

DBT Fundamentals at dbt, 2025
Snowflake Hands-On Essentials: Data Warehousing at Snowflake, 2025
Databricks Certified Data Engineer - Associate at Databricks, 2024
Astronomer Certification for Apache Airflow Fundamentals at Astronomer, 2024
AWS Certified Developer - Associate at AWS, 2022
AWS Certified Solutions Architect - Associate at AWS, 2022
Udacity Data Engineering Nanodegree at Udacity, 2021
AWS Certified Cloud Practitioner at AWS, 2021
Bachelor's Degree in Computer Engineering at Federal University of Uberlândia 2015 - 2020