Posts listed under tag: matplotlib
-
May 30, 2024
In this notebook we work on a Kaggle challenge on flood detection, where the goal is to predict the probability of a region flooding based on various factors.
-
May 20, 2024
End-to-end recommendation system project, covering everything from data collection to web application deployment. The aim is to create personalized recommendations based on users' preferences and viewing habits. The system can also consider multiple users simultaneously for group-based suggestions.
-
May 3, 2024
In this notebook we will be exploring recommendation systems using 3 different approaches, applying them to data we have scraped from a popular anime database/community website.
-
Apr 28, 2024
In this notebook we will explore the datasets scraped by our webscraping scripts. Exploration in this notebook will be guided by some key questions within each section.
-
Apr 1, 2024
In this notebook we explore a few HDB Resale Prices datasets from Jan 1990 to Mar 2024, analysing the data to answer a few common questions homebuyers have had in recent times.
-
Mar 19, 2024
This notebook explores a dataset containing the top 50 bestselling books on Amazon from the years 2010 to 2020 inclusive. Data was scraped from Amazon webpages and additional information was obtained from Google Books API.
-
Mar 15, 2024
In this notebook we will explore a synthetic bank customer churn dataset used in a Kaggle community prediction competition, treating it like a real-world problem and avoiding any performance-boosting tricks that are specific to this competition dataset (i.e. exploiting data leakage that arises from the synthetic nature of the data).
-
Mar 13, 2024
In this notebook we take a look at a [Kaggle Playground Series](https://www.kaggle.com/competitions/playground-series-s4e2) competition where users submit their predictions for a multi-class classification problem on the sample's weight class.
-
Feb 20, 2024
In this notebook we will be exploring the IMDB dataset available on Kaggle, containing 50,000 reviews categorised as either positive or negative. A text classification model is then fine-tuned from DistilBERT and evaluated.
-
Feb 2, 2024
Capstone project for Google's Advanced Data Analytics Course on Coursera, simulating a scenario where the HR department of a large consulting firm is looking for insights from our data analysis and predictions on employee churn data.
-
Jan 1, 2024
In this notebook we look at changes in surface temperature of areas around the world from 1961 to 2020 using a dataset from the Food and Agriculture Organization of the United Nations. The changes in temperature are measured against a baseline climatology for the reference period 1951-1980. Thereafter, we will attempt to forecast temperature changes using two popular techniques.
-
Jan 1, 2024
In this notebook we will be exploring rainfall patterns in Singapore, showing the seasonal patterns of rainfall and how some areas of the island receive more rainfall than others. Models will also be built and tested to forecast monthly rainfall on the island.
-
Jan 1, 2024
In this notebook we train classification models to identify the activities and subjects from a smartphone sensor dataset.
-
Jan 1, 2024
In this notebook we will be using an autoencoder for novelty detection on the fraud dataset used in a previous notebook. Novelty detection refers to the identification of new or unknown signals that were not available to a machine learning system during training. In this case, it means training a model only on normal (non-fraudulent) transaction data, so that the resulting model is nevertheless able to recognise fraudulent transactions.
-
Jan 1, 2024
In this notebook we train classification models to identify the activities and subjects from a smartphone sensor dataset.
-
Jan 1, 2024
In this notebook we explore and analyse chat data of a controversial chat group where users discussed issues relating to the COVID-19 pandemic.
-
Jan 1, 2024
In this notebook we will be analysing smartphone sensor data collected from an experiment, examining the information that can be retrieved from it and studying the extent to which the data can be used to identify the user.
-
Jan 1, 2024
In this notebook we build and deploy a Flask web application to detect and recognise handwritten words from an image.
-
Jan 1, 2024
This notebook explores a dataset of credit card transactions over a span of two days, analysing the data and tackling the extremely imbalanced classification problem of fraud detection.
-
Jan 1, 2024
In this notebook we will be analysing and discussing COVID-19-related data from all around the world, looking at how the pandemic has affected different places differently and how to understand some statistics commonly quoted on mainstream and social media.