
Data Engineer LATAM (Python/PySpark/AWS Glue/Amazon Athena/SQL/Apache Airflow)

  • Remote (São Paulo, Brazil)
  • Software Development

Job description

Let’s be direct: We’re looking for a technical powerhouse. If you’re the developer who:

  • Is the clear technical leader on your team

  • Consistently solves problems others can’t crack

  • Ships complex features in half the time it takes others

  • Writes code so clean it could be published as a tutorial

  • Takes pride in elevating the entire codebase

Then we want to talk to you.
This isn’t a role for everyone, and that’s by design.
We’re seeking developers who know they’re exceptional and have the track record to prove it.

What you’ll do

  • Build, optimize, and scale data pipelines and infrastructure using Python, TypeScript, Apache Airflow, PySpark, AWS Glue, and Snowflake.

  • Design, operationalize, and monitor ingestion and transformation workflows: DAGs, alerting, retries, SLAs, lineage, and cost controls (see the sketch after this list).

  • Collaborate with platform and AI/ML teams to automate ingestion, validation, and real-time compute workflows; work toward a feature store.

  • Integrate pipeline health and metrics into engineering dashboards for full visibility and observability.

  • Model data and implement efficient, scalable transformations in Snowflake and PostgreSQL.

  • Build reusable frameworks and connectors to standardize internal data publishing and consumption.
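
For illustration only, here is a minimal sketch of the kind of orchestration work described above: an Airflow DAG with retries and an SLA configured. The DAG name, task names, and placeholder logic are hypothetical examples assuming Airflow 2.4+, not details of this team's actual codebase.

    # Hypothetical sketch of a daily ingestion DAG with retries and an SLA.
    from datetime import datetime, timedelta

    from airflow.decorators import dag, task

    default_args = {
        "retries": 3,                       # retry transient failures
        "retry_delay": timedelta(minutes=5),
        "sla": timedelta(hours=1),          # flag tasks that run long
    }

    @dag(
        schedule="@daily",
        start_date=datetime(2024, 1, 1),
        catchup=False,
        default_args=default_args,
        tags=["ingestion"],
    )
    def daily_ingest():
        @task
        def extract() -> list[dict]:
            # Placeholder: pull raw records from a source system.
            return [{"id": 1, "value": 42}]

        @task
        def transform(records: list[dict]) -> list[dict]:
            # Placeholder: validate and reshape records.
            return [r for r in records if r["value"] is not None]

        @task
        def load(records: list[dict]) -> None:
            # Placeholder: write to the warehouse (e.g., Snowflake).
            print(f"loaded {len(records)} records")

        load(transform(extract()))

    daily_ingest()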

Job requirements

Required qualifications

  • 4+ years of production data engineering experience.

  • Deep, hands-on experience with Apache Airflow, AWS Glue, PySpark, and Python-based data pipelines.

  • Strong SQL skills and experience operating PostgreSQL in live environments.

  • Solid understanding of cloud-native data workflows (AWS preferred) and pipeline observability (metrics, logging, tracing, alerting).

  • Proven experience owning pipelines end-to-end: design, implementation, testing, deployment, monitoring, and iteration.

Preferred qualifications

  • Experience with Snowflake performance tuning (warehouses, partitions, clustering, query profiling) and cost optimization.

  • Real-time or near-real-time processing experience (e.g., streaming ingestion, incremental models, CDC).

  • Hands-on experience with a backend TypeScript framework (e.g., NestJS) is a strong plus.

  • Experience with data quality frameworks, contract testing, or schema management (e.g., Great Expectations, dbt tests, OpenAPI/Protobuf/Avro).

  • Background in building internal developer platforms or data platform components (connectors, SDKs, CI/CD for data).

Additional information

  • This is a fully remote position.

  • Compensation will be in USD.

  • Work hours are aligned with Eastern Time (9 AM to 6 PM ET) or Pacific Time.
