A Master's student with a robust background in Machine Learning and a passion for Security who enjoys writing code to solve problems. Equipped with strong analytical thinking, problem-solving skills, and attention to detail, and eager to work on real-world problems.

// About

I hold a BSc degree in Computer Science from Aristotle University of Thessaloniki (AUTh), and I am currently finishing my MSc degree in Data and Web Science. I am honored to be a Google DeepMind Scholar, with my master's studies fully funded by the scholarship.
As a Data Scientist, I use Python for data analysis and Machine Learning with Pandas, NumPy, and Scikit-learn. I'm also skilled in TensorFlow, PyTorch, R, SQL, and big data technologies like Apache Spark.
When I'm not crunching data, I'm solving sudoku puzzles, doodling my latest inspirations, or enjoying a good book.

// Areas of Interest

Machine Learning

I'm passionate about transforming problems so ML models can uncover hidden patterns and solutions. I admire ML's versatility across fields and industries. I'm also committed to ensuring the fairness and explainability of ML models, aiming for transparent, ethical, and equitable systems for all users.

Large Language Models

I am interested in large language models (LLMs) for their advancements in language generation and comprehension. I seek to explore how LLMs can be utilized to solve complex problems across various domains.

Natural Language Processing

I am focused on Natural Language Processing (NLP) due to its critical role in understanding and interpreting complex language data. My aim is to leverage NLP techniques to analyze text and improve conversational AI, enhancing interactions between humans and machines.

Cybersecurity

I am passionate about building solutions with security in mind. I am committed to protecting sensitive data and ensuring its confidentiality and integrity. Additionally, I focus on using ML techniques to detect, analyze, and mitigate cyber threats.

// Studies

Master's in Data and Web Science

Aristotle University of Thessaloniki

I am currently pursuing my master's in Data and Web Science at AUTh, for which I was awarded the Google DeepMind scholarship. The courses I attended during my master's are:

  • Managing and Mining Complex Networks
  • Technologies for Big Data Analytics
  • Statistical Data Analysis
  • Social Network Analysis
  • Decentralized Technologies
  • Knowledge Graphs and Ontology Engineering
  • Advanced Topics in Databases
  • Advanced Topics in Machine Learning

// Projects

Fine-Tuning LLMs for Political Speech Classification

Fine-tuned several models (e.g. Llama, GPT-3.5) to test whether fine-tuning improves an LLM's predictive ability. Fine-tuning improved performance, with the fine-tuned gpt-3.5-turbo-0613 achieving an F1 score of 0.7. The experiments ran on the HPC infrastructure of AUTh.

Fine-Tuning, QLoRA, LoRA, Slurm, LLMs, NLP

Synthetic Data Generation for Medical Applications

Employed a WGAN and a WGAN-GP to generate and evaluate synthetic data for medical datasets (KneeXrayOA, chestXray). Assessed the performance of the GANs in terms of image quality, speed, and usability. Our WGAN-GP achieved an Inception Score (IS) of 1.92 and a Kernel Inception Distance (KID) of 0.288 on the KneeXrayOA dataset.

Generative Adversarial Networks (GANs), Medical Imaging, Synthetic Data

Microblogging Fake Account Detection based on ML and Graph Methods

Extracted features from the dataset and applied feature selection techniques such as mRMR, PCA, and Spearman's correlation to create a refined dataset for training machine learning models. Ultimately, developed a LightGBM model that achieved an F1 score of 0.997, outperforming state-of-the-art classifiers in the literature.

Machine Learning, APIs, Feature Selection, Feature Extraction

Software Cost Analysis

Analyzed software project costs and identified key factors, such as application size, personnel skills in tools, and logical complexity, that significantly influence effort requirements. Through statistical modeling and hypothesis testing, we developed a predictive forward model that explained 76.93% of the variance in effort (the target).

R, Statistical Data Analysis, ANOVA, Levene's Test

Secure Help

We worked on Secure Help, a Django and React-based refugee management system, focusing on identifying and mitigating vulnerabilities. Our approach involved thorough analysis using OWASP guidelines to pinpoint potential security issues. Additionally, we conducted comprehensive threat modeling, assessing business assets, technical risks, and defining security requirements. Our work resulted in a detailed plan to test and strengthen the system's defenses.

Django, React, OWASP, Threat Modelling, RMF

Search Engine for the Greek Parliament Proceedings

We built a Python-based search engine that uses an inverted index for efficient text searches. We enhanced text analysis and search accuracy with NLP techniques. Additionally, we applied LSA to identify key themes and used the Apriori algorithm to generate association rules from speech data, providing insights for political analysis.

Python Flask, Front/Back-end development, Inverted Index, LSA
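
To illustrate the inverted-index idea behind the search engine, here is a minimal sketch. It is not the project's actual code: the function names (`tokenize`, `build_inverted_index`, `search`) and the toy English speeches are hypothetical, and the index is assumed to fit in memory with simple AND semantics for multi-word queries.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical toy speeches, not taken from the actual proceedings.
speeches = {
    1: "The budget debate focused on education funding",
    2: "Education reform and health funding were discussed",
    3: "The health committee presented its annual report",
}
index = build_inverted_index(speeches)
print(search(index, "education funding"))  # ids of speeches with both terms
```

Looking up a term touches only its posting set instead of scanning every speech, which is what makes the structure efficient for text search.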

Algorithmic Triangle Counting: Exact, Approximate, and Streaming Methods

Implemented and compared several triangle counting algorithms, both exact and approximate, with the exact algorithms serving as benchmarks. For approximate counting, DOULION performed best at a sampling probability of 0.75, with minimal error, and TRIEST-improved consistently achieved lower absolute errors than TRIEST-base across all graph sizes.

Graph Algorithms, Data Structures
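
As a rough illustration of the DOULION idea from the triangle counting project, the sketch below keeps each edge independently with probability p, counts triangles exactly on the sampled graph, and rescales the count by 1/p³. It is a simplified sketch rather than the project's implementation: the function names, the integer node labels, and the small complete graph used as input are assumptions made for the example.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count using neighbor-set intersections."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                # Count each triangle once, with its nodes ordered u < v < w.
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


def doulion_estimate(edges, p, seed=None):
    """Sample each edge with probability p and rescale the count by 1/p^3."""
    rng = random.Random(seed)
    sampled = [edge for edge in edges if rng.random() < p]
    return count_triangles(sampled) / p**3


# Hypothetical input: the complete graph on 6 nodes, which has 20 triangles.
edges = list(combinations(range(6), 2))
print(count_triangles(edges))
print(doulion_estimate(edges, p=0.75, seed=42))
```

A triangle survives sampling only if all three of its edges are kept, which happens with probability p³, so dividing by p³ gives an unbiased estimate; lower values of p make the run cheaper but increase the variance.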

Spatial Database for Urban Infrastructure and Community Activities

Designed a spatial database for modelling city interactions between businesses and citizens, using PostgreSQL with PostGIS, and QGIS for visualization. The process included creating an ER diagram, converting it to a relational schema, populating the database with data, and developing (spatial) queries to test the capabilities of the database.

ER diagram, (Spatial) SQL, PostgreSQL, PostGIS

Graph-based Music Recommendation Systems Evaluation

Implemented the CORLP (Complex Representation-based Link Prediction) method as part of an evaluation for music recommendation systems on a subset of the Spotify Million Playlist Dataset.

Link Prediction, Recommendation System

Amazon Books Rating Knowledge Graph

Developed an ontology based on the Amazon Books Rating dataset and used RML to map the dataset into a Knowledge Graph. The graph was then loaded into GraphDB and queried with SPARQL.

Semantic Web, Knowledge Graphs, Ontology Engineering, GraphDB, SPARQL

// Contact

You can connect with me here: