Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
Published:
This project studied the potential of big data to inform destination management organizations. Three sources of big data were discussed: telecom, social media and Airbnb data. This was done through the demonstration and analysis of a set of visualizations and tools, together with a discussion of applications and recommendations for challenges identified in the market.
Published:
This project explored several machine learning (ML) techniques covering different stages of a Land Use/Land Cover (LULC) classification pipeline. These techniques aimed to minimise problems typically found when working with this kind of data, at the stages of data ingestion, feature selection, data filtering and classification. This work was a joint effort between Manvel Khudinyan and me.
Published:
Cyclone Amphan made landfall in South Asia on May 20, 2020. It was the most damaging storm in the history of the Indian Ocean, rendering hundreds of thousands of people homeless, ravaging agricultural lands and causing billions of dollars in damage. How were people affected by the storm? What were the responses of individuals, governments, corporations and NGOs? How was it covered by local, national and international media, as opposed to individuals’ accounts? Who has created the dominant narratives of Cyclone Amphan, and whose voices go unheard? We aim to use online data – such as Twitter posts, news headlines and research publications – to analyze people’s experiences of Cyclone Amphan.
Published:
The competition’s aim was to optimize a keyboard layout to minimize the workload of an ALS patient using it. The challenge was inspired by Anthony Carbajal, a full-time daily life hacker who aims to find innovative ways to improve his own life and the lives of other ALS patients, and with whom NILG.AI worked to develop a first version of this solution.
Published:
This project aims to develop a Competitive Intelligence platform through Natural Language Processing and various visualization techniques. We employ text preprocessing and embedding techniques to encode a large corpus of text, as well as Self-Organizing Maps, an unsupervised neural network that facilitates multiple machine learning tasks and the visualization of high-dimensional data.
Published:
ML-Research contains the software implementation of most algorithms used or developed in my research. Specifically, it contains scikit-learn compatible implementations for Active Learning, Oversampling, Datasets and various utilities to assist in experiment design and results reporting. Other techniques, such as self-supervised learning and semi-supervised learning, are currently under development; they are being implemented in PyTorch and are intended to be scikit-learn compatible.
Published in Remote Sensing, 2019
In this paper, we address the imbalanced learning problem, a common and difficult conundrum in remote sensing that affects the quality of classification results, by proposing Geometric-SMOTE, a novel oversampling method.
Recommended citation: Douzas, G., Bacao, F., Fonseca, J., & Khudinyan, M. (2019). Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing, 11(24), 3040. https://doi.org/10.3390/rs11243040
Published in IJCAI 2021 Workshop on AI for Social Good, 2021
In this paper, we contribute two novel methodologies that leverage Twitter discourse to characterize narratives and identify unmet needs in response to Cyclone Amphan, which affected 18 million people in May 2020.
Recommended citation: Crayton, A., Fonseca, J., Mehra, K., Ng, M., Ross, J., Sandoval-Castañeda, M., & von Gnecht, R. (2021). Narratives and Needs: Analyzing Experiences of Cyclone Amphan Using Twitter Discourse. In IJCAI 2021 Workshop on AI for Social Good. https://crcs.seas.harvard.edu/publications/narratives-and-needs-analyzing-experiences-cyclone-amphan-using-twitter-discourse
Published in Information, 2021
In this paper, we address the imbalanced learning problem by combining K-means clustering with the Synthetic Minority Oversampling TEchnique (SMOTE) into an improved oversampling algorithm, K-Means SMOTE. K-Means SMOTE improves the quality of newly created artificial data by addressing not only the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance.
Recommended citation: Fonseca, J., Douzas, G., & Bacao, F. (2021). Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures. Information, 12(7), 266. https://doi.org/10.3390/info12070266
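As an illustration of the oversampling idea mentioned in the entry above, the following is a minimal sketch of the basic SMOTE interpolation step only: a synthetic point is placed at a random position on the segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the K-Means SMOTE implementation from the paper, and it leaves out the clustering stage entirely. The function name `smote_sample` and its parameters are hypothetical, chosen for this example, and the sketch assumes at least two minority samples given as numeric tuples.

```python
import math
import random


def smote_sample(minority, n_samples, k=5, seed=0):
    """Sketch of SMOTE-style interpolation (hypothetical helper, not a library API).

    minority  : list of numeric tuples, the minority-class samples (at least two)
    n_samples : number of synthetic points to generate
    k         : number of nearest neighbours to choose from
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_samples):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the point itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # random position on the segment between base and neighbour
        gap = rng.random()
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbour)))
    return synthetic


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_sample(points, n_samples=3, k=2):
        print(point)
```

Because every synthetic point lies on a segment between two existing minority samples, it stays inside the region those samples span. Cluster-based variants restrict which samples may be paired, which is one way to limit the generation of noisy points.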
Published in Remote Sensing, 2021
In this paper, we introduce a new component to the typical Active Learning (AL) framework, the data generator, a source of artificial data that reduces the amount of user-labeled data required in AL.
Recommended citation: Fonseca, J., Douzas, G., & Bacao, F. (2021). Increasing the Effectiveness of Active Learning: Introducing Artificial Data Generation in Active Learning for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619. https://doi.org/10.3390/rs13132619
Published in arXiv, 2022
In this paper, we identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in the data augmentation literature.
Recommended citation: Fonseca, J., & Bacao, F. (2022). Research Trends and Applications of Data Augmentation Algorithms. arXiv preprint arXiv:2207.08817. https://arxiv.org/abs/2207.08817
Published in International Journal of Intelligent Systems, 2023
This paper proposes a new Active Learning (AL) framework, which relies on the effective use of artificial data.
Recommended citation: Fonseca, J., & Bacao, F. (2023). Improving Active Learning Performance through the Use of Data Augmentation. International Journal of Intelligent Systems, 2023. https://doi.org/10.1155/2023/7941878
Under submission, 2023
In this paper, we propose Geometric SMOTE for Nominal and Continuous features (G-SMOTENC), based on a combination of G-SMOTE and SMOTENC.
Recommended citation: Fonseca, J., & Bacao, F. (2023). Geometric SMOTE for Imbalanced Datasets with Nominal and Continuous Features. Under Submission.
Under submission, 2023
In this paper, we analyse tabular and latent space synthetic data generation algorithms.
Recommended citation: Fonseca, J., & Bacao, F. (2023). Tabular and Latent Space Synthetic Data Generation: A Literature Review. Under Submission.
Undergraduate course, Universidade NOVA de Lisboa, NOVA School of Business and Economics, 2018
Course taught to Economics and Management undergraduate students.
Masters course, Universidade NOVA de Lisboa, NOVA Information Management School, 2019
Course taught to Information Management masters students.
Masters course, Universidade NOVA de Lisboa, NOVA School of Business and Economics, 2020
Course taught to Management masters students.
Masters course, Universidade NOVA de Lisboa, NOVA Information Management School, 2021
Academic years 2020/2021 to 2022/2023
Masters course, Universidade NOVA de Lisboa, NOVA Information Management School, 2022
Academic years 2020/2021 and 2021/2022