Auditing and Safeguarding Large Language Models

This project focuses on developing comprehensive methods for auditing and safeguarding Large Language Models (LLMs) to ensure their safe and responsible deployment in real-world applications. Key components: SafeNudge, a real-time safeguarding method designed to protect LLMs against red-teaming attacks and harmful prompt injections. SafeNudge provides tunable safety-performance trade-offs, allowing organizations to customize protection levels based on their specific use cases and risk tolerance. The paper is currently under submission; an early preprint is available on arXiv and here: SafeNudge: Real-Time Safeguarding for Large Language Models. ...

June 2025 · João Fonseca

Machine Learning Explainability Frameworks

This project focuses on developing frameworks for explaining and interpreting machine learning model predictions. Key components: ShaRP (Shapley for Rankings and Preferences), a framework based on Shapley values that explains how features contribute to different aspects of ranked outcomes. This approach is particularly useful for understanding how features influence ranking decisions in recommendation systems, search results, and other ranked outputs. See the paper for more details: ShaRP: Explaining Rankings with Shapley Values. ...
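To make the idea of Shapley-based feature contributions concrete, here is a minimal sketch in plain Python. It is not ShaRP's API or implementation: the toy candidates, the linear scoring function, the mean-value baseline for "absent" features, and all function names are assumptions chosen for illustration. The quantity explained here is an item's score; a rank-based quantity could be swapped in by changing `value_fn`.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Classic Shapley weight for a coalition of this size
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi


# Hypothetical toy data: three candidates described by two features
items = [[0.9, 0.2], [0.5, 0.5], [0.1, 0.8]]
weights = [0.7, 0.3]  # assumed linear scoring function used by the ranker
baseline = [sum(column) / len(items) for column in zip(*items)]


def score(x):
    return sum(w * v for w, v in zip(weights, x))


def make_value_fn(x):
    """Features outside the coalition are replaced by the dataset mean (an assumption)."""
    def value_fn(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
        return score(mixed)
    return value_fn


# Contributions of each feature to the score of the first candidate
phi = shapley_values(make_value_fn(items[0]), n_features=2)
```

The contributions sum to the difference between the candidate's score and the baseline score, which is the efficiency property that makes Shapley values convenient for attributing an outcome to features. Exact enumeration is exponential in the number of features, so it is only practical for small examples like this one.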

October 2024 · João Fonseca

Multi-agent Algorithmic Recourse Over Time

This project studies the importance of time in the reliability of algorithmic recourse. We highlight the lack of reliability in recourse recommendations over several competitive settings, potentially setting misguided expectations that could result in detrimental outcomes. These findings emphasize the importance of meticulous consideration when AI systems offer guidance in dynamic environments. Our paper, "Setting the Right Expectations: Algorithmic Recourse Over Time", won the Best AI Track Paper award at EAAMO'23! ...

July 2023 · João Fonseca

ML-Research - An Open Source Library for Machine Learning Research

ML-Research is an open source library for machine learning research. It contains the software implementation of most algorithms used or developed in my research. Specifically, it contains scikit-learn compatible implementations for Active Learning, Oversampling, and Datasets, along with various utilities to assist in experiment design and results reporting. Other techniques, such as self-supervised and semi-supervised learning, are currently under development; they are being implemented in PyTorch and are intended to be scikit-learn compatible. ...

March 2022 · João Fonseca

MapIntel - Interactive Visual Analytics Platform for Competitive Intelligence

This research project aims to develop a Competitive Intelligence platform using Natural Language Processing and a range of visualization techniques. We employ text preprocessing and embedding techniques to encode a large corpus of text, as well as Self-Organizing Maps, an unsupervised neural network that supports multiple machine learning tasks and the visualization of high-dimensional data. This project is funded by “Fundação para a Ciência e Tecnologia” (Portugal) and is being developed at NOVA IMS. ...

March 2021 · João Fonseca

Winning Project - DSSG Summit 2020 Challenge - Keyboard Layout Optimization for ALS Patients Competition

This competition was organized by NILG.AI, together with the DSSG Summit 2020. Its aim was to optimize a keyboard layout to minimize the workload of an ALS patient using it. The challenge was inspired by Anthony Carbajal, a full-time daily life hacker who aims to find innovative ways to improve his own life and the lives of other ALS patients, and with whom NILG.AI worked to develop a first version of this solution. Team members: João Fonseca, David Silva. Relevant links: Competition’s homepage, Competition’s GitHub Repository, Project’s GitHub Repository, Competition Winner Certificate, DSSG Summit 2020 - Winning Community Challenge Submission.
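The optimization problem described above can be illustrated with a small sketch. This is not the competition's workload metric or the winning solution, neither of which is specified here. It assumes a hypothetical proxy for workload (the total grid distance travelled between consecutive characters of a sample text) and a simple swap-based hill climb; the alphabet, sample text, and grid width are all made up for illustration.

```python
import random


def layout_cost(layout, text, columns):
    """Total Manhattan distance travelled between consecutive keys (assumed workload proxy)."""
    position = {ch: divmod(index, columns) for index, ch in enumerate(layout)}
    cost = 0
    for previous, current in zip(text, text[1:]):
        (row_a, col_a), (row_b, col_b) = position[previous], position[current]
        cost += abs(row_a - row_b) + abs(col_a - col_b)
    return cost


def optimize_layout(layout, text, columns, iterations=2000, seed=0):
    """Hill climbing: swap two random keys and keep the swap only if the cost drops."""
    rng = random.Random(seed)
    best = list(layout)
    best_cost = layout_cost(best, text, columns)
    for _ in range(iterations):
        i, j = rng.sample(range(len(best)), 2)
        best[i], best[j] = best[j], best[i]
        cost = layout_cost(best, text, columns)
        if cost < best_cost:
            best_cost = cost
        else:
            best[i], best[j] = best[j], best[i]  # revert a non-improving swap
    return "".join(best), best_cost


# Hypothetical inputs: 27 keys arranged on a grid with 9 columns
alphabet = "abcdefghijklmnopqrstuvwxyz "
sample_text = "the quick brown fox jumps over the lazy dog"

initial_cost = layout_cost(alphabet, sample_text, columns=9)
optimized, optimized_cost = optimize_layout(alphabet, sample_text, columns=9)
```

Because a swap is kept only when it lowers the cost, the optimized layout is never worse than the starting one under this proxy. A realistic objective would need a representative text corpus and a cost model reflecting the patient's actual input device.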

October 2020 · João Fonseca

Amphan - Analyzing Experiences of Extreme Weather Events using Online Data

Cyclone Amphan made landfall in South Asia on May 20, 2020. It was the most damaging storm in the history of the Indian Ocean, rendering hundreds of thousands of people homeless, ravaging agricultural lands and causing billions of dollars in damage. How were people affected by the storm? What were the responses of individuals, governments, corporations and NGOs? How was it covered by local, national and international media, as opposed to individuals’ accounts? Who has created the dominant narratives of Cyclone Amphan, and whose voices go unheard? We aim to use online data – such as Twitter posts, news headlines and research publications – to analyze people’s experiences of Cyclone Amphan. ...

September 2020 · João Fonseca

IPSTERS - IPSentinel Terrestrial Enhanced Recognition System

This project explored several machine learning (ML) techniques covering different stages of a Land Use/Land Cover (LULC) classification pipeline. These techniques aimed to minimise problems typically found when working with this kind of data, namely in data ingestion, feature selection, data filtering and classification. The results shown here are a joint effort. Manvel Khudinyan developed all active learning experiments, which I converted into a command-line interface to facilitate their use by the remote sensing specialists at Direção Geral do Território. ...

May 2020 · João Fonseca

Harnessing Big Data to Inform Tourism Destination Management Organizations

Reports: Online report, Tourist flows in Portugal (August 2017), Master thesis. GitHub repositories: Telecom data analysis, Airbnb data analysis, Social Media Crawler. Abstract: In the last few years, Portugal has witnessed rapid growth in tourism, which has had positive effects in many respects, especially economic ones. However, it also leads to a number of challenges, all of them difficult to quantify: tourist congestion, loss of city identity, degradation of heritage, etc. It is important to ensure that the foundations and tools required to understand and efficiently manage tourism flows exist, at both the city level and the country level. ...

January 2019 · João Fonseca