Dilbar Isakova

EDUCATION

Master Thesis

Apr 2025 - Present

Inria/LISN

Fourth Semester in the Big Data Management and Analytics Master's program - BDMA Erasmus Mundus

Master Thesis Topic:

"Intelligent Ambient Data Visualization on Non-Planar Displays: A Multi-Modal Machine Learning Approach for Spatial Audio-Visual Analytics"

Master of Science - Computer Science & Artificial Intelligence

Sep 2024 - Present

CentraleSupélec - Université Paris-Saclay

Third Semester in the Big Data Management and Analytics Master's program - BDMA Erasmus Mundus

Key Coursework:

Massive Graph Management and Analytics, Advanced Machine Learning, Visual Analytics, Decision Modelling, Deep Learning, Research Project, Law and Intellectual Property, French A2

Master of Technology - Big Data Management

Feb 2024 - Aug 2024

Universitat Politècnica de Catalunya

Second Semester in the Big Data Management and Analytics Master's program - BDMA Erasmus Mundus

Key Coursework:

Big Data Management, Machine Learning, Viability of Business Projects, Semantic Data Management, Big Data Seminar, Ethics in Big Data, Spanish A1

Master of Science - Business Intelligence Fundamentals

Sep 2023 - Jan 2024

Université libre de Bruxelles

First Semester in the Big Data Management and Analytics Master's program - BDMA Erasmus Mundus

Key Coursework:

Data Warehouses, Database System Architectures, Management of Business and Data Science Workflows, Advanced Databases, Data Mining, French A1

Bachelor of Information Systems

Aug 2018 - Jun 2022

KIMEP University

Grade: Cum Laude, GPA: 4.14/4.33 (Top 5%)

Major & Achievements:

Computer Information Technologies
Thesis Topic: Forex Trading Using Machine Learning Algorithms
Scholarship for Citizens of Central Asian Countries, 2018-2022

ERASMUS ICM Program with Scholarship

Jan 2021 - Jun 2021

Uppsala University

Grade: Pass with Distinction

Completed Courses:

Business Intelligence Applications, Big Data Management and Analysis, Business Analytics, Digital Business

Global Korean Scholarship - 3D Modeling

Jun 2019 - Aug 2019

Kyungwoon University

Grade: Pass with Distinction

Achievements & Courses:

Airfoil Project awarded 2nd place for research and project development
Courses: Calculus, Advanced Programming, Information Systems and Networking, Physics, Drone Control

EXPERIENCES

🔬

Apr 2025 - Present

Research Intern - Data Science and Visualization

Inria/LISN

Gif-sur-Yvette, France

Responsibilities & Achievements:

Developed a 3D interactive prototype of a non-planar (spherical) ambient display using Three.js and WebGL, featuring particle-based waveform rendering, spatial sound localization, and dynamic meeting state visualization, achieving 60+ FPS with real-time processing of 100+ audio frames per second.
Engineered a dual-microphone ESP32 IoT system for real-time meeting room analytics, implementing FFT-based audio processing and synchronized stereo capture, with low-latency data streaming via WebSocket to support spatial audio visualization.
Implementing an ML pipeline that combines a Spatial Audio Transformer (SAT) architecture with stereo-aware attention mechanisms for multi-task learning: speaker counting (3-class), meeting classification (5-class), and continuous engagement prediction.
Creating a data augmentation framework that generates 50+ hours of synthetic training data from limited recordings through room impulse response simulation, voice conversion, and meeting scenario synthesis, reducing data collection time by 80%.
Optimizing the SAT deep learning models for edge deployment on an ESP32 microcontroller through knowledge distillation and 8-bit quantization, achieving <50ms inference time while maintaining 90%+ accuracy for real-time IoT implementation.

Technologies:

Python, Three.js, WebGL, PyTorch, ESP32, Machine Learning, Audio Processing, WebSocket

🇰🇷

Dec 2022 - Aug 2023

Data Science Intern

KOTRA (Korea Trade-Investment Promotion Agency)

Tashkent, Uzbekistan

Responsibilities & Achievements:

Architected a predictive analytics pipeline in Python (Pandas, Scikit-Learn, NumPy) to identify high-potential Uzbek buyers for Korean exporters, integrating CRM and sales data, market intelligence from industry reports, and headquarters datasets, improving B2B lead generation accuracy by 35%.
Engineered 9+ interactive Tableau dashboards visualizing export and event data, enabling data-driven planning of regional Korean-Uzbek trade conferences and exhibitions with improved attendee targeting, engagement metrics, and post-event ROI measurement.
Optimized end-to-end ML workflows, including feature engineering, hyperparameter tuning, and model validation, using logistic regression and ensemble methods (Random Forest, Gradient Boosting) for buyer classification tasks.

Technologies:

Python, Pandas, Scikit-Learn, NumPy, Tableau, Machine Learning, Data Analytics, Data Visualization

🎓

Jan 2022 - Sep 2022

Research Assistant

KIMEP University

Almaty, Kazakhstan

Responsibilities & Achievements:

Orchestrated a data pipeline for sociological research projects on Asian politics, including North Korean elite networks and regional political structures, covering data entry, cleaning, and figure production for 10k+ datasets in R (dplyr, ggplot2) and Excel for statistical modeling and visualization.
Applied advanced statistical analysis, natural language processing (NLP), and social network analysis (SNA) to study political discourse and elite relationships in East Asian societies using Python and R.
Managed data collection and used SQL queries to extract and manipulate relevant subsets for comparative political studies across Asian regions.

Technologies:

R, Python, SQL, Excel, ggplot2, NLP, Social Network Analysis, Statistical Analysis

📱

Jan 2022 - Apr 2022

iOS Developer Intern

ICE MEDIA & Tech Group

Almaty, Kazakhstan

Responsibilities & Achievements:

Learned the basics of functional programming
Successfully completed SwiftUI framework integration
Refined project management skills by developing mobile application designs and user flows
Developed a weather app from scratch (code is available on GitHub)
Applied modern iOS development practices and design patterns
Collaborated with cross-functional teams to deliver mobile solutions

Technologies:

Swift, SwiftUI, iOS Development, Xcode, Git, GitHub

🔬

Sep 2021 - Dec 2021

Data Science Intern

Data Science Lab

Almaty, Kazakhstan

Responsibilities & Achievements:

Acquired basic knowledge in Machine Learning fundamentals and methodologies
Learned how to collect, clean, and preprocess large datasets including handling missing values, normalizing data, and transforming data for analysis using Python with libraries such as pandas and NumPy
Developed skills in exploratory data analysis and data visualization techniques
Gained hands-on experience with data preprocessing workflows and best practices
Applied statistical methods for data validation and quality assessment

Technologies:

Python, Pandas, NumPy, Machine Learning, Data Analysis, Data Visualization

💻

Jan 2020 - Sep 2020

Information Technology Assistant

KIMEP University

Almaty, Kazakhstan

Responsibilities & Achievements:

Supervised and monitored the work of administrative staff in the Center for Entrepreneurship and Innovation
Prepared the official report on an IT and educational project for the Bang College of Business
Delivered IT, social media, and educational projects for KIMEP University
Provided technical support for educational initiatives and digital transformation projects
Collaborated with academic staff to improve IT infrastructure and processes

Technologies:

Microsoft Office, System Administration, Technical Support, Documentation

PROJECTS

COMPETITION PROJECT

Chess Puzzle Difficulty Prediction

Apr 2025 – Present

♟️

Built an ML pipeline to predict chess puzzle difficulty ratings using 4.5M+ training instances. Architected a custom PyTorch Transformer with specialized feature embeddings and designed a hybrid tree+neural model combining LightGBM with deep neural networks.

RESEARCH PROJECT

Variational Autoencoder for Facial Verification

Sep 2024 – Feb 2025

👤

Developed a face verification system using a variational autoencoder (VAE) and a standard autoencoder (AE) as feature extractors, achieving 86.65% accuracy on the LFW benchmark dataset. Engineered deep learning architectures, including ResNet18 with ArcFace loss, processing 494k+ facial images.

RESEARCH PROJECT

3D Interactive Ambient Display System

Apr 2025 – Present

🌐

Developed a 3D interactive prototype using Three.js and WebGL with particle-based waveform rendering and spatial sound localization. Engineered a dual-microphone ESP32 IoT system for real-time meeting room analytics with an ML pipeline.

ACADEMIC PROJECT

AmiGo Social App

Jan 2024 – Jun 2024

📱

Engineered an ETL pipeline using PySpark and managed data with Delta Lake. Developed personalized recommendation algorithms with Apache Spark and implemented real-time data stream processing with Apache Kafka.

RESEARCH PROJECT

Predictive Analytics for B2B Lead Generation

Feb 2023 – Sep 2023

📊

Architected a predictive analytics pipeline in Python to identify high-potential Uzbek buyers for Korean exporters, improving B2B lead generation accuracy by 35%. Engineered 9+ interactive Tableau dashboards for event analytics.

RESEARCH PROJECT

Political Discourse Analysis System

Jan 2022 – Sep 2022

🏛️

Orchestrated a data pipeline for sociological research on Asian politics, processing 10k+ datasets using R and Python. Applied NLP and social network analysis to study political discourse and elite relationships in East Asian societies.

MASTER COURSE PROJECT

Benchmarking the MySQL DBMS on TPC-DS Benchmark

Sep 2023 – Dec 2023

This project demonstrates the implementation of TPC-DS benchmarking on a MySQL database using Python scripts to automate the process. The primary goal is to evaluate the performance and scalability of the MySQL database under different conditions, as defined by the TPC-DS benchmark.

MACHINE LEARNING PROJECT

Multi-Modal Sentiment Analysis

Mar 2024 – Jun 2024

🎭

Developed a multi-modal sentiment analysis system combining text, audio, and visual features using transformer architectures. Implemented BERT for text processing, CNN for image analysis, and LSTM for audio features, achieving 94% accuracy on multimodal datasets.

DEEP LEARNING PROJECT

Graph Neural Networks for Drug Discovery

Oct 2024 – Jan 2025

🧬

Implemented Graph Convolutional Networks (GCN) and GraphSAGE for molecular property prediction in drug discovery. Built molecular graph representations and achieved state-of-the-art performance on BACE, BBBP, and Tox21 benchmark datasets using PyTorch Geometric.

NLP PROJECT

Automated Code Review Assistant

Nov 2023 – Feb 2024

🤖

Created an AI-powered code review assistant using fine-tuned CodeBERT and GPT-3.5 models. Implemented automated bug detection, code quality assessment, and suggestion generation, reducing manual review time by 60% across 500+ repositories.

COMPUTER VISION PROJECT

Real-Time Object Tracking System

Aug 2023 – Nov 2023

👁️

Developed a real-time multi-object tracking system using YOLOv8 for detection and DeepSORT for tracking. Implemented Kalman filtering and the Hungarian algorithm for object association, achieving 95% tracking accuracy at 30 FPS on surveillance footage.

TIME SERIES PROJECT

Financial Market Prediction Platform

Jun 2022 – Aug 2022

📈

Built a comprehensive financial prediction platform using LSTM, ARIMA, and Prophet models for cryptocurrency and stock price forecasting. Integrated technical indicators, sentiment analysis from news data, and real-time trading signals with 78% accuracy.

Knowledge discovery for data streaming requires online feature selection to reduce the complexity of real-world datasets and significantly improve the learning process. This paper presents a comprehensive survey of feature selection (FS) algorithms for both static and dynamic environments, providing a detailed taxonomy that categorizes these methods based on search strategy, evaluation process, and feature structure.

SKILLS

Programming Languages

Data Science & AI

Big Data Technologies

Databases

Data Visualization

Web Technologies

Research & Analysis

Specialized Tools

PUBLICATIONS

Streaming Feature Selection