Responsibilities
- Design, build, and maintain scalable data pipelines and ETL/ELT workflows for large volumes of structured and semi-structured data
- Develop and optimize data models, tables, and transformations for analytics and reporting
- Work with large datasets using SQL, PySpark, and modern data platforms to ensure efficient processing
- Build and manage data workflows using orchestration tools like Apache Airflow
- Develop automation scripts using Python and Shell scripting for pipeline execution and monitoring
- Monitor, troubleshoot, and optimize pipelines for performance, scalability, and reliability
- Collaborate with data scientists, analysts, and business stakeholders to enable data-driven insights
- Ensure adherence to data engineering best practices including data quality, documentation, and governance
Qualifications
Must have:
- Strong proficiency in SQL and Python
- Experience with distributed data processing frameworks like PySpark
- Hands-on experience with Databricks
- Experience in building and managing workflows using Apache Airflow
- Knowledge of ETL/ELT processes and scalable data pipeline architecture
- Experience with Shell scripting for automation
- Strong understanding of data modeling concepts
- Ability to work with large-scale data systems
Good to Have:
- Experience with cloud platforms like AWS, GCP, or Azure
- Familiarity with data lakehouse architecture
- Understanding of data governance and modern data platform practices
- Experience working with global stakeholders in cross-functional environments
Eligibility:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or related field
- 5+ years of experience in data engineering or data platform development
- Experience working in large-scale data environments