This deal unfortunately ended on 22 August 2023.
47°
Published on 19 August 2023

Selection of discounted Data Engineering courses on Udemy (digital)

€9.99 (was €19.99, -50%)
Shared by GelouaS (member since 2016)

About this deal


Hello,

I'm going through a career change, and I use Udemy quite a lot. There are courses by one instructor that I find very thorough (sometimes too thorough, some run to 50+ hours) on Data Engineering, ranging from Python to the cloud providers (AWS, Azure, GCP) and the essentials of the job: Spark with Python, Spark with Scala, SQL, Databricks, etc.

Unless I'm mistaken, Udemy lowered the prices of all its courses recently; on top of that, the instructor has put some of his courses on sale at €9.99 each. Given the amount of material covered, it's really good value.

The promotion is time-limited, and there are 3 days left as I post this deal.

Keep in mind that the instructor is Indian, so the English accent takes some getting used to! Udemy's subtitles are auto-generated, so don't rely on them too much.

Here is the list of discounted courses:

Python & Data Engineering essentials

Python for Beginners: Learn Python with Hands-on Projects: 75 hours of content (a short illustrative sketch follows the topic list)
  • In-depth coverage of Python collections such as list, tuple, dict, set and basics of file I/O
  • Exception Handling, Unit Testing, Object Oriented Concepts using Python
  • Develop Application for File Format Conversion using Python Pandas and improve performance using Multiprocessing
  • Overview of Software Development Life Cycle
  • Build Application to send emails using Python libraries such as Sendgrid and PyMongo
  • Web Scraping using Python libraries such as BeautifulSoup and Scrapy
  • Build Application to store scraped data in Mongodb using Python libraries such as Scrapy and PyMongo
  • Develop Web Application using Python Flask
  • Setup CI/CD Pipeline for Python Flask Application using GitHub Actions
  • In-depth coverage of Git such as branches, pull requests, GitHub Actions
  • Develop Application to serve REST APIs using Python Flask, SQLAlchemy, etc
  • Performance Tuning of SQL Queries used by Python Flask Applications
  • Troubleshooting and Debugging of Python Applications
  • Developing Python Applications using IDEs such as Visual Studio Code
  • Reviewing REST APIs using Postman
  • Using Generative AI Tools such as ChatGPT for Python Application Development
  • Basics of Python Programming - Conditionals, Loops, Data types, String Manipulation, Date Manipulation, User Defined Functions, etc
  • Processing JSON Data and REST Payloads using Python
  • Database Programming using Python and Postgresql
  • Build Executable Command using Python
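
One of the bullets above covers file format conversion with pandas; purely as an illustration of that kind of task (this is not the instructor's code, and the file names are made up), a CSV-to-Parquet conversion can be as short as:

    # Hypothetical example: convert a CSV file to Parquet with pandas (needs pyarrow installed).
    import pandas as pd

    def csv_to_parquet(src_path: str, dst_path: str) -> None:
        df = pd.read_csv(src_path)            # load the CSV into a DataFrame
        df.to_parquet(dst_path, index=False)  # write it back out as Parquet

    if __name__ == "__main__":
        csv_to_parquet("orders.csv", "orders.parquet")  # placeholder file names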


Data Engineering Essentials using SQL, Python, and PySpark: 56 hours of content (see the sketch after the topic list)
  • Setup Environment to learn SQL and Python essentials for Data Engineering
  • Database Essentials for Data Engineering using Postgres such as creating tables, indexes, running SQL Queries, using important pre-defined functions, etc.
  • Data Engineering Programming Essentials using Python such as basic programming constructs, collections, Pandas, Database Programming, etc.
  • Data Engineering using Spark Dataframe APIs (PySpark) using Databricks. Learn all important Spark Data Frame APIs such as select, filter, groupBy, orderBy, etc.
  • Data Engineering using Spark SQL (PySpark and Spark SQL). Learn how to write high quality Spark SQL queries using SELECT, WHERE, GROUP BY, ORDER BY, etc.
  • Relevance of Spark Metastore and integration of Dataframes and Spark SQL
  • Ability to build Data Engineering Pipelines using Spark leveraging Python as Programming Language
  • Use of different file formats such as Parquet, JSON, CSV etc in building Data Engineering Pipelines
  • Setup Hadoop and Spark Cluster on GCP using Dataproc
  • Understanding Complete Spark Application Development Life Cycle to build Spark Applications using Pyspark. Review the applications using Spark UI.
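
To give an idea of the DataFrame APIs listed above (select, filter, groupBy, orderBy), here is a minimal PySpark sketch; the column names and input path are invented for illustration:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-demo").getOrCreate()

    orders = spark.read.json("/data/orders")  # placeholder input path

    daily_revenue = (
        orders
        .select("order_date", "order_status", "order_amount")
        .filter(F.col("order_status") == "COMPLETE")    # keep only completed orders
        .groupBy("order_date")
        .agg(F.sum("order_amount").alias("revenue"))    # aggregate per day
        .orderBy(F.col("revenue").desc())
    )
    daily_revenue.show()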


Sqoop, Hive and Impala for Data Analysts (Formerly CCA 159): 20.5 hours of content
  • Overview of Big Data ecosystem such as Hadoop HDFS, YARN, Map Reduce, Sqoop, Hive, etc
  • Overview of HDFS Commands such as put or copyFromLocal, get or copyToLocal, cat, etc along with concepts such as block size, replication factor, etc
  • Managing Tables in Hive Metastore using DDL Commands
  • Load or Insert data into Hive Metastore Tables using commands such as LOAD and INSERT
  • Overview of Functions in Hive to manipulate strings, dates, etc
  • Writing Basic Hive QL Queries using WHERE, JOIN, GROUP BY, etc
  • Analytical or Windowing Functions in Hive
  • Overview of Impala and understanding similarities and differences between Hive and Impala
  • Getting Started with Sqoop by reviewing official documentation and also exploring commands such as Sqoop eval
  • Importing Data from RDBMS tables into HDFS using Sqoop Import
  • Importing Data from RDBMS tables into Hive tables using Sqoop Import
  • Exporting Data from Hive or HDFS to RDBMS tables using Sqoop Export
  • Incremental Imports using Sqoop Import into HDFS or Hive Tables


Data Engineering using Kafka and Spark Structured Streaming: 9.5 hours of content (see the sketch after the topic list)
  • Setting up self support lab with Hadoop (HDFS and YARN), Hive, Spark, and Kafka
  • Overview of Kafka to build streaming pipelines
  • Data Ingestion to Kafka topics using Kafka Connect using File Source
  • Data Ingestion to HDFS using Kafka Connect using HDFS 3 Connector Plugin
  • Overview of Spark Structured Streaming to process data as part of Streaming Pipelines
  • Incremental Data Processing using Spark Structured Streaming using File Source and File Target
  • Integration of Kafka and Spark Structured Streaming - Reading Data from Kafka Topics
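
As a rough sketch of the last bullet (reading a Kafka topic with Spark Structured Streaming), the pattern looks roughly like this; the broker, topic and paths are placeholders, and the spark-sql-kafka package must be on the classpath:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-stream-demo").getOrCreate()

    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
        .option("subscribe", "web_logs")                       # placeholder topic
        .load()
    )

    # Kafka delivers binary key/value columns; cast the value to a string before using it.
    lines = raw.select(F.col("value").cast("string").alias("log_line"))

    query = (
        lines.writeStream
        .format("parquet")
        .option("path", "/data/streaming/web_logs")            # placeholder target folder
        .option("checkpointLocation", "/data/checkpoints/web_logs")
        .start()
    )
    query.awaitTermination()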


Master Apache Spark using Spark SQL and PySpark 3: 32 hours of content
  • Setup the Single Node Hadoop and Spark using Docker locally or on AWS Cloud9
  • Review ITVersity Labs (exclusively for ITVersity Lab Customers)
  • All the HDFS Commands that are relevant to validate files and folders in HDFS.
  • Quick recap of Python which is relevant to learn Spark
  • Ability to use Spark SQL to solve the problems using SQL style syntax.
  • Pyspark Dataframe APIs to solve the problems using Dataframe style APIs.
  • Relevance of Spark Metastore to convert Dataframes into Temporary Views so that one can process data in Dataframes using Spark SQL.
  • Apache Spark Application Development Life Cycle
  • Apache Spark Application Execution Life Cycle and Spark UI
  • Setup SSH Proxy to access Spark Application logs
  • Deployment Modes of Spark Applications (Cluster and Client)
  • Passing Application Properties Files and External Dependencies while running Spark Applications


Spark SQL and Spark 3 using Scala Hands-On with Labs: 24 hours of content
  • All the HDFS Commands that are relevant to validate files and folders in HDFS.
  • Enough Scala to work on Data Engineering Projects using Scala as Programming Language
  • Spark Dataframe APIs to solve the problems using Dataframe style APIs.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark Dataframe APIs
  • Inner as well as outer joins using Spark Data Frame APIs
  • Ability to use Spark SQL to solve the problems using SQL style syntax.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark SQL
  • Inner as well as outer joins using Spark SQL
  • Basic DDL to create and manage tables using Spark SQL
  • Basic DML or CRUD Operations using Spark SQL
  • Create and Manage Partitioned Tables using Spark SQL
  • Manipulating Data using Spark SQL Functions
  • Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL




Azure

Master Data Engineering using Azure Data Analytics: 12.5 hours of content
  • Data Engineering leveraging Services under Azure Data Analytics such as Azure Storage, Data Factory, Azure SQL, Synapse, Databricks, etc.
  • Setup Development Environment using Visual Studio Code on Windows
  • Building Data Lake using Azure Storage (Blob and ADLS)
  • Build Data Warehouse using Azure Synapse
  • Implement ETL Logic using ADF Data Flow with Azure Storage as Source and Target
  • In Depth Coverage of Orchestration using ADF Pipeline
  • Overview of Azure SQL and Azure Synapse Serverless and Dedicated Pool Features
  • Implement ETL Logic using ADF Data Flow with Azure SQL as Source and Azure Synapse as Target
  • Using Data Copy to copy data between different sources and targets
  • Performance Tuning Scenarios of ADF Data Flow and Pipelines
  • Build Big Data Solutions using Azure Databricks
  • Overview of Spark SQL and Pyspark Data Frame APIs
  • Build ELT Pipelines using Databricks Jobs and Workflows
  • Orchestrate Databricks Notebooks using ADF Pipelines



GCP

Master Data Engineering using GCP Data Analytics: 19.5 hours of content (see the sketch after the topic list)
  • Data Engineering leveraging Services under GCP Data Analytics
  • Setup Development Environment using Visual Studio Code on Windows
  • Building Data Lake using GCS
  • Process Data in the Data Lake using Python and Pandas
  • Build Data Warehouse using Google BigQuery
  • Loading Data into Google BigQuery tables using Python and Pandas
  • Setup Development Environment using Visual Studio Code on Google Dataproc with Remote Connection
  • Big Data Processing or Data Engineering using Google Dataproc
  • Run Spark SQL based applications as Dataproc Jobs using Commands
  • Build Spark SQL based ELT Data Pipelines using Google Dataproc Workflow Templates
  • Run or Instantiate ELT Data Pipelines or Dataproc Workflow Template using gcloud dataproc commands
  • Big Data Processing or Data Engineering using Databricks on GCP
  • Integration of GCS and Databricks on GCP
  • Build and Run Spark based ELT Data Pipelines using Databricks Workflows on GCP
  • Integration of Spark on Dataproc with Google BigQuery
  • Build and Run Spark based ELT Pipeline using Google Dataproc Workflow Template with BigQuery Integration
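
For the "Loading Data into Google BigQuery tables using Python and Pandas" part, a minimal sketch with the google-cloud-bigquery client could look like the following; the project, dataset and table names are invented, and the course may do it differently:

    import pandas as pd
    from google.cloud import bigquery

    client = bigquery.Client(project="my-demo-project")  # placeholder project id

    df = pd.DataFrame({"order_id": [1, 2], "amount": [19.99, 9.99]})

    # Load the DataFrame into dataset.table (placeholder names; needs pyarrow) and wait for the job.
    job = client.load_table_from_dataframe(df, "demo_dataset.orders")
    job.result()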



AWS

Data Engineering using AWS Data Analytics: 25.5 hours of content

  • Data Engineering leveraging Services under AWS Data Analytics
  • AWS Essentials such as s3, IAM, EC2, etc
  • Understanding AWS s3 for cloud based storage
  • Understanding details related to virtual machines on AWS known as EC2
  • Managing AWS IAM users, groups, roles and policies for RBAC (Role Based Access Control)
  • Managing Tables using AWS Glue Catalog
  • Engineering Batch Data Pipelines using AWS Glue Jobs
  • Orchestrating Batch Data Pipelines using AWS Glue Workflows
  • Running Queries using AWS Athena - serverless query engine service
  • Using AWS Elastic Map Reduce (EMR) Clusters for building Data Pipelines
  • Using AWS Elastic Map Reduce (EMR) Clusters for reports and dashboards
  • Data Ingestion using AWS Lambda Functions
  • Scheduling using AWS Events Bridge
  • Engineering Streaming Pipelines using AWS Kinesis
  • Streaming Web Server logs using AWS Kinesis Firehose
  • Overview of data processing using AWS Athena
  • Running AWS Athena queries or commands using CLI
  • Running AWS Athena queries using Python boto3
  • Creating AWS Redshift Cluster, Create tables and perform CRUD Operations
  • Copy data from s3 to AWS Redshift Tables
  • Understanding Distribution Styles and creating tables using Distkeys
  • Running queries on external RDBMS Tables using AWS Redshift Federated Queries
  • Running queries on Glue or Athena Catalog tables using AWS Redshift Spectrum


Mastering Amazon Redshift and Serverless for Data Engineers: 16 hours of content (see the sketch after the topic list)
  • Getting Started with Amazon Redshift using AWS Web Console
  • Copy Data from s3 into AWS Redshift Tables using Redshift Queries or Commands
  • Develop Applications using Redshift Cluster using Python as Programming Language
  • Copy Data from s3 into AWS Redshift Tables using Python as Programming Language
  • Create Tables using Databases setup on AWS Redshift Database Server using Distribution Keys and Sort Keys
  • Run AWS Redshift Federated Queries connecting to traditional RDBMS Databases such as Postgres
  • Perform ETL using AWS Redshift Federated Queries using Redshift Capacity
  • Integration of AWS Redshift and AWS Glue Catalog to run queries using Redshift Spectrum
  • Run AWS Redshift Spectrum Queries using Glue Catalog Tables on Datalake setup using AWS s3
  • Getting Started with Amazon Redshift Serverless by creating Workgroup and Namespace
  • Integration of AWS EMR Cluster with Amazon Redshift using Serverless Workgroup
  • Develop and Deploy Spark Application on AWS EMR Cluster where the processed data will be loaded into Amazon Redshift Serverless Workgroup
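
As a hedged illustration of the "Copy Data from s3 into AWS Redshift Tables using Python" bullets, one common pattern is to issue a COPY statement from Python; the endpoint, credentials, bucket and IAM role below are placeholders, and the course may use a different driver:

    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.xxxxxxxx.eu-west-1.redshift.amazonaws.com",  # placeholder endpoint
        port=5439,
        dbname="dev",
        user="awsuser",
        password="...",  # placeholder
    )

    copy_sql = """
        COPY public.orders
        FROM 's3://my-demo-bucket/orders/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
        FORMAT AS PARQUET;
    """

    with conn, conn.cursor() as cur:
        cur.execute(copy_sql)  # Redshift pulls the files directly from s3
    conn.close()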


Master AWS Lambda Functions for Data Engineers using Python: 13 hours of content (see the sketch after the topic list)
  • Setup required tools on Windows to develop the code for ETL Data Pipelines using Python and AWS Services
  • Setup Project or Development Environment to develop applications using Python and AWS Services
  • Getting Started with AWS by creating account in AWS and also configure AWS CLI as well as Review Data Sets used for the project
  • Develop Core Logic to Ingest Data from source to AWS s3 using Python boto3
  • Getting Started with AWS Lambda Functions using Python 3 Run-time Environment
  • Refactor the application, build zip file to deploy as AWS Lambda Function
  • Create AWS Lambda Function using Zip file and Validate
  • Troubleshoot issues related to AWS Lambda Functions using AWS Cloudwatch
  • Build custom docker image for the application and push to AWS ECR
  • Create AWS Lambda Function using the custom docker image in AWS ECR
  • Develop Applications using AWS Lambda Functions by adding Python Modules as Layers
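
A minimal Lambda handler in the spirit of the ingestion bullets above might look like this; the bucket and key are placeholders, not taken from the course:

    import json
    import boto3

    s3 = boto3.client("s3")  # created outside the handler so it is reused across invocations

    def lambda_handler(event, context):
        # Store the incoming event as a JSON object in s3 (placeholder bucket and key).
        payload = json.dumps(event).encode("utf-8")
        s3.put_object(
            Bucket="my-ingest-bucket",
            Key="raw/events/event.json",
            Body=payload,
        )
        return {"statusCode": 200, "body": "stored"}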


Mastering AWS Elastic Map Reduce (EMR) for Data Engineers: 11.5 hours of content
  • Creating Clusters using AWS Elastic Map Reduce Web Console
  • Setup Remote Application Development using AWS Elastic Map Reduce (EMR) and Visual Studio Code
  • Develop and Validate Simple Spark Application using Visual Studio Code and AWS Elastic Map Reduce (EMR)
  • Deploy Spark Application as Step to AWS Elastic Map Reduce (EMR)
  • Manage AWS Elastic Map Reduce (EMR) based Pipelines using Boto3 and Python
  • Build End to End AWS Elastic Map Reduce (EMR) based Pipelines using AWS Step Functions
  • Develop Applications using Spark SQL on AWS EMR Cluster
  • Build State Machine or Pipeline using AWS Step Functions using Spark SQL Script on AWS EMR Cluster
  • Understand how to pass parameters to Spark SQL Scripts deployed on EMR



Databricks

Databricks Certified Associate Developer - Apache Spark 2022: 14.5 hours of content
  • Databricks Certified Associate Developer for Apache Spark exam details
  • Setting up Databricks Platform for practice and also to prepare for Databricks Certified Associate Developer for Apache Spark Exam
  • Selecting, renaming and manipulating columns using Spark Data Frame APIs
  • Filtering, dropping, sorting, and aggregating rows using Spark Data Frame APIs
  • Joining, reading, writing and partitioning DataFrames using Spark Data Frame APIs
  • Working with UDFs and Spark SQL functions using Spark Data Frame APIs
  • Spark Architecture and Adaptive Query Execution (AQE)


Data Engineering using Databricks on AWS and Azure: 18.5 hours of content (see the sketch after the topic list)
  • Data Engineering leveraging Databricks features
  • Databricks CLI to manage files, Data Engineering jobs and clusters for Data Engineering Pipelines
  • Deploying Data Engineering applications developed using PySpark on job clusters
  • Deploying Data Engineering applications developed using PySpark using Notebooks on job clusters
  • Perform CRUD Operations leveraging Delta Lake using Spark SQL for Data Engineering Applications or Pipelines
  • Perform CRUD Operations leveraging Delta Lake using Pyspark for Data Engineering Applications or Pipelines
  • Setting up development environment to develop Data Engineering applications using Databricks
  • Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters
  • Incremental File Processing using Spark Structured Streaming leveraging Databricks Auto Loader cloudFiles
  • Overview of Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
  • Differences between Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
  • Differences between traditional Spark Structured Streaming and leveraging Databricks Auto Loader cloudFiles for incremental file processing.
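
The Auto Loader (cloudFiles) pattern referred to above boils down to a streaming read in a Databricks notebook; this is only a sketch, with placeholder paths and table name, and spark being the SparkSession Databricks provides in a notebook:

    df = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")                    # format of the incoming files
        .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders_schema")
        .load("/mnt/landing/orders")                             # placeholder landing folder
    )

    (
        df.writeStream
        .option("checkpointLocation", "/mnt/checkpoints/orders")
        .trigger(availableNow=True)                              # process the new files, then stop
        .toTable("bronze.orders")                                # placeholder target table
    )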


Databricks Essentials for Spark Developers (Azure and AWS): 4 hours of content
  • Using Community Edition of Databricks to explore the platform
  • Signing up for Full Trial using Azure Databricks
  • Signing up for Full Trial using Databricks on AWS
  • Develop and Deploy Notebooks using Scala, Python as well as SQL using Databricks Platform
  • Understand the difference between interactive and job clusters
  • Formal Development and Deployment Life Cycle
  • Run jobs by attaching application as jar along with libraries
  • Overview of Cluster Pools
  • Installing and using databricks-cli

Mastering Databricks SQL Warehouse and Spark SQL: 14 hours of content (see the sketch after the topic list)
  • Setup Databricks SQL Warehouse Environment using Azure Databricks for hands-on Practice
  • Getting Started with Databricks SQL for Data Analysis or Data Engineering
  • Features of Databricks SQL Warehouse - Clusters, Query Editor, Visualizations and Dashboards, etc
  • Overview of building reports and dashboards using Databricks SQL
  • Creating Databases and Tables using Databricks SQL or Spark SQL
  • Writing Basic Queries using Databricks SQL or Spark SQL
  • DML to load data into Databricks SQL or Spark SQL Tables
  • Advanced Operations such as Ranking and Aggregations using Databricks SQL or Spark SQL
  • Processing Semi-Structured Data using Databricks SQL or Spark SQL
  • In-depth Coverage about Delta Tables including all possible DML Operations such as Insert, Update, Delete, Merge, etc
  • End to End Life Cycle of Data Analysis of Data in Files using Databricks (Uploading File to Databricks to Reports and Dashboards)
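
The Delta Table DML mentioned in the last bullets (insert, update, delete, merge) can all be expressed in Spark SQL; as a purely illustrative sketch run from a notebook, with invented table and column names:

    spark.sql("""
        MERGE INTO silver.customers AS tgt
        USING updates_view AS src
        ON tgt.customer_id = src.customer_id
        WHEN MATCHED THEN UPDATE SET tgt.email = src.email
        WHEN NOT MATCHED THEN INSERT (customer_id, email) VALUES (src.customer_id, src.email)
    """)
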
More details at Udemy
Additional information
Edited by moderators, 19 August 2023

13 comments

  1. Batmanu
    Udemy lowers its course prices several times a month.
    Nothing in this world is free, but the real promos on Udemy are when courses are free... If you see a course you're interested in at several dozen euros, wait a few weeks.
    €10 courses on Udemy come around far more often than pigs fly.
    GelouaS (author)
    I've been buying courses on Udemy regularly for over a year now. They used to be around a hundred euros each, and several times a month there were promotions bringing them down to €14.99 to €19.99.

    Recently, most courses have been priced at €19.99, with promotions lowering them to €14.99, but those promotions are less frequent.

    Here it's the instructor himself who has put his courses on sale at €9.99, i.e. less than Udemy's automatic promotions, and they will never be free.

    Getting quality courses for free on Udemy is very rare; the many free-course deals are usually junk put together by some Mr. Expert-in-everything, very basic and shallow courses cobbled from YouTube videos.

    Edit: I remember very well taking a course here from an "AI with Python expert": it was a copy-paste from Google Translate, and at times she even dictated translated Python keywords; the person was just reading text aloud. Special mention for "pour i dans", which are the keywords of a Python loop and should read "for i in", i being a variable.

    Courses at this price only happen when the instructor decides to, and it's the first time it has happened. (edited)
  2. SHORT_CIRCUIT
    +1 for explaining the context, and good luck with the career change!
  3. Lemerou
    What are you retraining as exactly? Can you give more details?
    I'm interested.
    GelouaS (author)
    Data Engineer

    Basically, I spent around €6,500 on a Data Analyst bootcamp (at the Wild Code School), and now I'm doing a Data Engineer apprenticeship at the same school.

    In both cases I didn't learn much at the school; I mostly trained myself on Udemy. The school's curriculum has little to do with the job market: the Data Analyst track is one-third machine learning, even though it's hardly ever required for that role, and the Data Engineer track only skims the skills of the job.

    I only went with the school because it awards a diploma at the end, and because I wanted to do an apprenticeship to have a first professional experience to show.

    It's still possible to retrain without a school, but it's much harder: without at least an internship, companies are reluctant to hire you, all the more so since Data Analyst bootcamps are popping up everywhere (there's an internship at the end of the bootcamp).

    For a Data Engineer, what counts most (more than a diploma) are the personal projects you build, and above all the cloud certifications, which are a real plus:
    - AWS Cloud Practitioner
    - AWS Data Analyst Associate
    - Azure Cloud Fundamentals
    - Azure Data Engineer
    etc.

    For my part, I went through a "resignation for retraining" process, which lets you receive unemployment benefits after resigning if a committee validates your project as "real, motivated and serious" (so you have to put a file together). There is still a risk, though: even if the committee approves the file, Pôle emploi can refuse the benefits for obscure reasons. It has happened to some people, and you only find out a few weeks AFTER leaving your company. By that point you've usually already paid the school and you're simply in deep trouble.

    For the apprenticeship part, if you're over 26 you'll be paid minimum wage, but if your unemployment benefit is higher than that, Pôle emploi pays you a monthly top-up so that you earn MORE than your benefit (basically to encourage you to get back to work), so I have an apprenticeship at €2,300/month!

    There you go, I don't know if this helps, but a retraining project has to be prepared carefully. I had anticipated absolutely everything, and in the end, in a year and a half, I will have done a bootcamp plus an apprenticeship with a diploma at the end (equivalent to four years of higher education). But I work a lot in my free time to make up for the school's shortcomings and to pass certifications that will showcase my skills, because the diploma itself is "Concepteur Développeur d'applications" (application designer and developer), which has nothing to do with Data Engineering.