As a Cloud Engineer on our team, you will work with large-scale manufacturing data coming from our globally distributed plants. You will focus on building efficient, scalable, data-driven applications.
The data sets produced by these applications – whether data streams or data at rest – need to be highly available, reliable, consistent and quality-assured so that they can serve as input to a wide range of other use cases and downstream applications.
We run these applications on Azure Databricks. In addition to building applications, you will contribute to scaling the platform, including topics such as automation and observability.
Finally, you are expected to interact with customers and other technical teams, e.g. for requirements clarification and definition of data models.
Primary responsibilities:
· Engaging in design discussions of data pipelines in Azure.
· Creating designs for data pipelines and conceptualizing data architecture for large-scale projects in Azure.
· Providing architectural proposals and estimations for the application, and technical leadership to the team.
· Defining data models for data pipelines in Azure.
· Coordinating and collaborating with central teams on tasks and standards.
· Developing data integration workflows in Azure.
· Developing PySpark applications for processing streaming data.
· Integrating the end-to-end Azure Databricks pipeline to take data from source systems to target systems, ensuring the quality and consistency of data.
· Writing Python scripts to automate manual activities.
· Defining data quality and validation checks.
· Configuring data processing and transformation.
· Writing unit test cases for data pipelines.
· Implementing data quality and validation checks.
· Tuning pipeline configurations for optimal performance.
· Participating in peer reviews and PR reviews of code written by team members.

Qualifications
· Bachelor’s degree in Computer Science, Computer Engineering, a relevant technical field, or equivalent; Master’s degree preferred.
· 6+ years’ experience in data engineering, ETL tools and working with large data sets.
· Proven experience with cloud platforms, particularly Azure Databricks.
· Minimum 6 years of experience in the design, development and integration of applications using various technologies and frameworks.
· Minimum 5 years of working experience with distributed clusters.

Additional Information
Key Competencies:
· At least 2-3 years of Azure Databricks cloud experience in data engineering.
· Experience with Delta tables, ADLS, DBFS and ADF.
· At least 6 years of experience in large-scale Python software development (other object-oriented languages are also acceptable).
· Deep understanding of distributed systems for data storage and processing (e.g. Kafka, PySpark, Azure Cloud).
· Experience with cloud-based SQL databases: Azure SQL Editor.
· Excellent software engineering skills (i.e., data structures, algorithms, software design).
· Excellent problem-solving, investigative and troubleshooting skills.
· Experience with CI/CD tools such as Azure DevOps.
· Ability to work independently.

Soft Skills:
· Good communication skills.
· Ability to coach and guide junior data engineers.
· Working proficiency in English as the business language.