Ved Prakash
Staff Data Engineer
Almere Stad
"The real test is not whether you avoid this failure, because you won't. It's whether you let it harden or shame you into inaction, or whether you learn from it; whether you choose to persevere."
Barack Obama

Summary

A dynamic and experienced data engineer certified in Google Cloud and Snowflake, I bring a wealth of expertise in developing high-performance, scalable, and secure data solutions. With 7+ years of Python and PySpark development experience, I have a deep understanding of data architecture, cloud infrastructure, and data warehousing. My strong skills in Apache Iceberg enable me to process and manage data at scale.

I have proven expertise in designing, developing, and deploying enterprise-level data solutions on the cloud. My proficiency in building data-intensive applications in Snowflake, coupled with my expertise in dimensional designing of table structures, has enabled me to deliver innovative data solutions that drive business growth and productivity.

As a Python developer, I have built API connectors for data pipelines and implemented data models using dbt. With a strong background in data warehousing, I am capable of performance administration, cube development, and data integration for databases.

With exceptional communication, cross-team collaboration, and problem-solving skills, I am dedicated to delivering high-quality results that exceed business expectations. I am also committed to continuous learning, keeping up with the latest technologies to deliver innovative and effective solutions.

Overview

13 years of professional experience
4 years of post-secondary education

Work History

Gitlab
Almere Stad

Staff Data Engineer
06.2022 - Current

Job overview

  • Designed and implemented scalable and reliable data pipelines using Google PubSub, PySpark, and Apache Iceberg, resulting in a 70% improvement in data processing time.
  • Managed cloud computing platforms GCP and Snowflake, resulting in a 30% reduction in infrastructure and Snowflake costs.
  • Leading and mentoring junior data engineers, providing technical guidance and support.
  • Creating technical documentation for data solutions, including design documents, architecture diagrams, and user manuals.
  • Responsible for the implementation and onboarding of the data observability tool Monte Carlo.
  • Working closely with cross-functional teams, including data analysts, data scientists, and business stakeholders, to ensure data solutions meet business requirements.

Gitlab
Almere Stad

Senior Data Engineer
02.2021 - 06.2022

Job overview

  • Principal developer on Meltano taps, building connectors to data sources such as Adaptive, Xactly, Edcast, and Zendesk.
  • Responsible for deploying Meltano in a GKE setup that scales dynamically.
  • Enabled monitoring of Airflow by integrating it into GitLab's main infrastructure monitoring system.
  • Hosted and managed Airflow in GKE using Terraform and Helm packages, which improved data pipeline testing turnaround by 80%.
  • Built a scalable data pipeline responsible for heavy data pulls from a SaaS data platform.
  • Developed and maintained dbt workflows to automate ETL processes, data transformations, and data quality checks.
  • Key player in building Sisense dashboards.
  • Expert in using Fivetran and Stitch as data connectors.

Infosys Limited

Senior Data Engineer
08.2016 - 02.2021

Job overview

  • Worked directly with the Product Owner and Engineering Head on planning and product-wide technology changes, moving to cloud-based infrastructure to meet product goals.
  • Partnered with the architecture team to develop a holistic technology roadmap, accounting for engineering priorities and overall business goals for migrating the data lake to Snowflake (DWaaS).
  • Managed a product team of 10 following the Scrum/Kanban model.
  • Implemented a Snowflake RBAC security framework mapped to the organization's security policy.
  • Designed and developed ELT (Talend) data pipelines for the cloud SaaS database Snowflake (AWS & Azure).
  • Built a migration tool to move data from on-premise to the public cloud (AWS) using Python ETL tools (Apache Airflow and Pandas).
  • Developed CI/CD to deploy Lambda functions using SAM and the AWS CLI, reducing upgrade timelines by 70%.
  • Set up a fully automated mechanism for running Talend on Docker for overnight batch processing.
  • Built a POC of the Kibana (ELK) stack to report on daily batch-load processing.
  • Set up a Pentaho High Availability Data Integration server for the production environment on Community Edition.
  • Experienced in developing data lakes and data marts using Pentaho/Talend for the cloud data warehouse Snowflake.
  • Responsible for resolving issues related to data sourcing, data quality, ETL development, system performance, quality assurance, and user acceptance.
  • Performed data profiling on several client data sets to verify that their data fit the analytical product and yielded meaningful insights.
  • Prepared a Bill of Materials template for product storage, compute, and cloud services, reducing RFP turnaround time by 30%.

Infosys Limited

Data Engineer
07.2014 - 07.2016

Job overview

  • Designed a STAR-schema data lake for assortment reporting, defining dimension and fact tables and key attributes.
  • Designed and developed ETL (Pentaho) to integrate planogram data with point-of-sale data.
  • Built an API in Python (Flask) to call ETL jobs in the back end.
  • Worked in a Scrum team of 8 and collaborated proactively with the Product Owner on story grooming.
  • Designed and developed a MySQL data model to store incoming data from the DataMart.
  • Implemented CI/CD for MySQL changes using Liquibase and Bamboo.
  • Built a framework in Python (Pandas) to scramble customer PII for Dev/Test environments.
  • Performed clustered installation, design, backup, recovery, and security for MySQL databases.
  • Containerized the MySQL database with Docker, reducing overall server commissioning time by 90%.

Infosys Limited

Senior Systems Engineer
02.2010 - 07.2013

Job overview

  • Designed and migrated the legacy Control-M scheduler to SKYBOT.
  • Developed a batch auto-recovery job leveraging shell and SQL, reducing recovery time by 30%.
  • SME for RCA of production incidents; performed routine tests on databases (Oracle, Kognitio MPP) and provided extended support for all ETL applications (Pentaho).

Education

VTU

Bachelor of Engineering in Computer Science
01.2005 - 01.2009

Skills

    GCP: GKE, Cloud SQL, Dataproc, Dataflow, Pub/Sub


Accomplishments

  • Saved £200k in license costs by building the entire ETL pipeline in Pentaho Community Edition.
  • Implemented Docker container orchestration for MySQL across the whole application stack.
  • Developed a multi-tenancy sampled test environment for the EDWH, saving the cost of a Kognitio appliance.
  • Implemented the Rundeck scheduler and migrated 300 jobs.
  • Developed a DB refresh tool, saving 80 days of DBA effort per year.

Professional Accreditations

  • Google Certified Data Engineer
  • Snowflake Certified Developer
  • Snowflake Certified Advanced Architect
  • Infosys Outstanding Performer for years 2016, 2017, 2018 & 2019
  • AIMIA Engineering Star Player of the Year 2018
  • Infosys Most Valuable Award for excellent performance, 2015