Deepayan Sarkar

Available Now · French Work Authorisation

Deepayan
Sarkar.

Data Scientist & Analytics Professional

Data is not just my profession; it's how I think. I have 4+ years of experience delivering data pipelines and analytics at scale at Accenture, and I am finalising an MSc in Data Analytics for Business at KEDGE Business School. Open to internship and CDI opportunities across the EU.

Top 3 Impact Highlights

10M+

Records Processed Daily

35%

Processing Time Reduction

<1 hr

Research Time (was 4–8 wks)

Experience

Where I've Built
Things.

  • Engineered data pipelines processing 10M+ records daily and developed Power BI dashboards tracking 50+ KPIs. ETL optimisation and data visualisation work reduced processing time by 35% and improved operational efficiency by 20%.
  • Designed and maintained cloud infrastructure on Google Cloud Platform (BigQuery, Dataflow, Cloud Storage) with 99.9% uptime. Conducted A/B testing and statistical analysis on datasets with 1M+ rows to optimise product features, improving customer experience by 30%.
  • Collaborated with 15+ cross-functional stakeholders to deliver 12+ analytical projects, automated reporting processes to save 40+ hours monthly, and mentored junior analysts on data analysis best practices and SQL optimisation techniques.
Python · SQL · Power BI · BigQuery · Dataflow · GCP · A/B Testing · ETL
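
The page does not say which statistical method was used for the A/B testing mentioned above. As an illustrative sketch only, a two-sided two-proportion z-test is one common way to compare conversion rates between a control and a variant. The function name and the sample counts are hypothetical and are not figures from the Accenture work.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a / n_a: conversions and sample size for the control group.
    conv_b / n_b: conversions and sample size for the variant group.
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 10.0% vs 11.0% conversion on 10,000 users each.
z, p = two_proportion_z_test(1000, 10000, 1100, 10000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, but sample size and test duration should be fixed before the experiment starts.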

Impact Highlights

10M+
Records/day
35%
Faster processing
99.9%
Uptime
40+ hrs
Saved/month

Projects

Selected
Work.

01
NLP · Hackathon

Multi-Label Skincare Product Classifier

L'Oréal Hackathon · KEDGE Business School

  • Developed and deployed a multi-label text classification model using LinearSVC and One-vs-Rest classification to classify 6,240 products across 33 categories, achieving a weighted F1 Score of 0.67, in line with industry benchmarks.
  • Engineered an NLP pipeline with TF-IDF vectorisation (word and character n-grams) and tuned per-class decision thresholds to improve classification performance.
0.67
F1 Score
6,240
Products
33
Categories
LinearSVC · One-vs-Rest · TF-IDF · scikit-learn · Python · NLP
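
The per-class threshold tuning mentioned above can be sketched roughly as follows. This is a minimal, library-free illustration and not the project's actual code. It assumes each class already has a real-valued decision score per sample (for example from a One-vs-Rest linear classifier) and picks, per class, the candidate threshold with the best F1 on held-out data. All function names, scores, and labels below are hypothetical.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for one class, given 0/1 labels and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, y_true, candidates):
    """Return the candidate threshold with the highest F1 for one class."""
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        score = f1_score_binary(y_true, preds)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t


def tune_per_class(score_matrix, label_matrix, candidates):
    """Tune one threshold per class (column) of a samples x classes matrix."""
    n_classes = len(score_matrix[0])
    thresholds = []
    for c in range(n_classes):
        col_scores = [row[c] for row in score_matrix]
        col_labels = [row[c] for row in label_matrix]
        thresholds.append(best_threshold(col_scores, col_labels, candidates))
    return thresholds


# Hypothetical validation scores and labels: 6 products, 2 categories.
val_scores = [[-1.2, 0.9], [-0.4, 0.2], [-0.3, -0.8],
              [0.1, 0.7], [0.6, 0.3], [1.0, -0.1]]
val_labels = [[0, 1], [1, 0], [1, 0], [0, 1], [1, 0], [1, 0]]

thresholds = tune_per_class(val_scores, val_labels, [-0.5, 0.0, 0.5])
print(thresholds)
```

Thresholds should be tuned on a validation split rather than the test set, otherwise the reported F1 will be optimistic.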
02
ML · Clustering

Spotify Music Recommendation System

Unsupervised Learning · KEDGE Business School

  • Built an unsupervised music recommendation system by applying K-Means clustering to song-level audio features to uncover latent user taste patterns. Evaluated cluster quality using the Silhouette, Calinski-Harabasz, and Davies-Bouldin indices.
  • Performed feature engineering and preprocessing to enhance clustering stability and designed a similarity-based recommendation approach to enable personalised and cold-start recommendations.
K-Means
Algorithm
3
Eval Indices
Cold-Start
Enabled
K-Means · Python · scikit-learn · Feature Engineering · Silhouette Index
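
As a rough illustration of the clustering-plus-similarity idea described above, the sketch below runs a plain K-Means (Lloyd's algorithm) on toy feature vectors and then recommends the nearest songs within the cluster closest to a new, unseen song. That is one way to handle a cold-start item that has audio features but no listening history. The two features, the seeding strategy, and the use of Euclidean distance are assumptions made for the example; the project itself lists scikit-learn, and the evaluation indices are not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


def recommend(query, points, labels, centroids, top_n=2):
    """Return indices of the top_n nearest songs in the query's cluster."""
    cluster = min(range(len(centroids)),
                  key=lambda c: squared_distance(query, centroids[c]))
    candidates = [i for i, lab in enumerate(labels) if lab == cluster]
    candidates.sort(key=lambda i: squared_distance(query, points[i]))
    return candidates[:top_n]


# Hypothetical (energy, acousticness) features, already scaled to [0, 1].
songs = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2),
         (0.2, 0.8), (0.85, 0.15), (0.15, 0.85)]

centroids, labels = kmeans(songs, k=2)
new_song = (0.82, 0.18)  # no listening history, only audio features
print(labels, recommend(new_song, songs, labels, centroids))
```

In practice the seeding would be randomised (for example k-means++) and the number of clusters chosen with the help of indices such as the Silhouette score.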
03
GenAI · Hackathon

AI Persona Bots for Marketing Research

BNP Paribas & CGI Hackathon · KEDGE Business School

  • Built AI-simulated customer personas using Azure AI Foundry and GPT-4o to accelerate credit product launches, processing 2,438 survey responses across 8 distinct customer segments with 88.6% relevance, 97.8% coherence, and 100% fluency scores.
  • Engineered end-to-end pipeline with persona generation, NLP-based sentiment analysis, and automated insight synthesis, reducing marketing research time from 4–8 weeks to under 1 hour whilst maintaining high-quality customer simulation accuracy.
88.6%
Relevance
97.8%
Coherence
100%
Fluency
<1 hr
vs 4–8 wks
Azure AI Foundry · GPT-4o · NLP · Sentiment Analysis · Python
04
Mobile App · SLM

Offline Vision Assistant for the Visually Impaired

Personal Project · Flutter / Supervised ML

  • Built a fully offline Android accessibility app using Flutter that leverages on-device supervised machine learning to identify objects and scenes in real time, converting visual information into spoken audio feedback for visually impaired users.
  • Integrated camera, flutter_tts, and speech_to_text packages to enable a hands-free, voice-driven experience — allowing users to ask questions and receive instant audio descriptions without an internet connection.
  • Implemented a Provider-based state architecture and CI/CD workflow via GitHub Actions to automate APK builds, ensuring reliable releases and a production-ready deployment pipeline.
100%
Offline
Real-time
Inference
Android
Platform
Flutter · Dart · TensorFlow Lite · Camera API · Text-to-Speech · Speech-to-Text · Gemma 3 1B · Small Language Model · SDGs
05
Data Visualisation

China Import/Export Transport Analysis

Tableau Public

  • Built an interactive Tableau dashboard exploring China's import/export transport patterns — analysing trade volumes, shipping modes, and commodity flows across global corridors.
  • Designed multi-layered filters and drill-down views enabling dynamic exploration of trade data by year, commodity type, and transport mode (sea, air, rail, road), surfacing actionable insights for supply chain analysis.
  • Applied calculated fields and LOD expressions to derive year-over-year growth rates and market share breakdowns, visualising shifts in China's top trading partners and strategic export corridors.
4
Transport Modes
YoY
Growth Trends
Tableau
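
The year-over-year growth and market share calculations mentioned above were built as Tableau calculated fields. The same arithmetic can be sketched in Python for clarity. This is only an illustration of the formulas, with hypothetical function names and made-up trade values, not the dashboard's actual expressions or data.

```python
def yoy_growth(values_by_year):
    """Year-over-year growth: (current - previous) / previous.

    Only consecutive years with a non-zero previous value are reported.
    """
    years = sorted(values_by_year)
    return {
        year: (values_by_year[year] - values_by_year[prev]) / values_by_year[prev]
        for prev, year in zip(years, years[1:])
        if year - prev == 1 and values_by_year[prev] != 0
    }


def market_share(values_by_partner):
    """Each partner's share of the total trade value."""
    total = sum(values_by_partner.values())
    return {partner: value / total for partner, value in values_by_partner.items()}


# Hypothetical export values (arbitrary units), not real trade data.
exports = {2021: 200.0, 2022: 250.0, 2023: 225.0}
partners = {"Partner A": 50.0, "Partner B": 30.0, "Partner C": 20.0}

growth = yoy_growth(exports)
shares = market_share(partners)
print(growth)
print(shares)
```

Here 2022 grows by 25% over 2021 and 2023 falls by 10% against 2022, while the three partners hold 50%, 30%, and 20% of the total.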

Skills

Technical
Arsenal.

Programming & Machine Learning

Python · pandas · NumPy · Matplotlib · scikit-learn · XGBoost · LightGBM · PyTorch · TensorFlow · SQL · Statistical Modelling · Supervised Learning · Unsupervised Learning · Natural Language Processing (NLP) · Deep Learning · Feature Engineering · Model Evaluation & Cross-Validation · MLOps

Data Tools

Power BI · Tableau · Google Analytics · Jupyter · Git · Apache Spark · Airflow

Cloud & Databases

Google Cloud Platform · BigQuery · Dataflow · Cloud Storage · Azure AI Foundry · SQL Server

Core Competencies

Recommendation Systems · Forecasting · Customer Analytics · A/B Testing · ETL Pipelines · Model Interpretability

Education

Academic
Foundation.

Current

Master of Science in Data Analytics for Business

KEDGE Business School

Master 2nd Year

Sep 2025 – Present · Bordeaux, France

Bachelor of Technology in Electronics and Communication Engineering

University of Engineering & Management

BTech

Jul 2017 – May 2021 · Kolkata, India

Certifications

☁️

Google Cloud Certified: Associate Cloud Engineer

Google Cloud

2023

🤖

Microsoft Azure AI Fundamentals — AI-900

Microsoft

2026

🗄️

SQL for Data Science

Coursera · UC Davis

2022

📋

Professional Scrum Master I

Scrum.org

2026

Contact

Let's Build
Something.

I'm actively looking for internships and CDI opportunities across the EU. Whether you have a role, a project, or just want to talk data — I'd love to hear from you.

Available Now

Based in Paris, France with French Work Authorisation. Open to internship and CDI roles in data science, machine learning, and analytics across the EU.

Data Science Intern · ML Engineering · Analytics · NLP Research

Send a message

Or email directly: deepayans77@gmail.com