List of posts under category: Data wrangling
-
May 26, 2024
In this notebook we will explore running an LLM locally, as well as how the model can use retrieval-augmented generation (RAG) to mitigate hallucination and gaps in its training data.
-
May 20, 2024
An end-to-end recommendation system project, covering everything from data collection to web application deployment. The aim is to create personalized recommendations based on users' preferences and viewing habits. The system can also consider multiple users simultaneously for group-based suggestions.
-
May 8, 2024
In Part 3 of these notebooks we improve on our previous approaches to web scraping the popular anime database/community site MyAnimeList, discussing the limitations of those approaches and implementing solutions to them.
-
Apr 14, 2024
In this notebook we scrape MyAnimeList, a popular anime database and community site, aiming to collect enough raw data on the anime titles available on the website for further processing and learning purposes.
-
Apr 12, 2024
In this notebook we scrape MyAnimeList, a popular anime database and community site, aiming to collect enough raw data on the anime titles available on the website for further processing and learning purposes.
-
Apr 5, 2024
In this notebook we will explore four dirty datasets sourced from the internet that contain incorrectly recorded data, and clean them using the pandas package in Python.
-
Apr 4, 2024
In this notebook we will explore four dirty datasets sourced from the internet that have been structured poorly, and clean them using the pandas package in Python.
-
Apr 1, 2024
In this notebook we explore several HDB resale price datasets covering Jan 1990 to Mar 2024, analysing the data to answer a few questions that homebuyers have commonly asked in recent times.
-
Mar 19, 2024
This notebook explores a dataset containing the top 50 bestselling books on Amazon from the years 2010 to 2020 inclusive. Data was scraped from Amazon webpages and additional information was obtained from Google Books API.
-
Jan 1, 2024
In this notebook we look at changes in surface temperature around the world from 1961 to 2020, using a dataset from the Food and Agriculture Organization of the United Nations. The temperature changes are measured relative to a baseline climatology for the reference period 1951-1980. We then attempt to forecast temperature changes using two popular techniques.
-
Jan 1, 2024
In this notebook we train classification models to identify the activities and subjects from a smartphone sensor dataset.
-
Jan 1, 2024
In this notebook we explore and analyse chat data of a controversial chat group where users discussed issues relating to the COVID-19 pandemic.
-
Jan 1, 2024
In this notebook we build and deploy a Flask web application to detect and recognise handwritten words in an image.
-
Jan 1, 2024
In this notebook we analyse and discuss COVID-19 data from around the world, looking at how the pandemic affected different places differently and how to interpret some statistics commonly quoted on mainstream and social media.