An engineer by profession, a bibliophile by heart!

A step-by-step guide to building a Python-based Movie Recommender System using Cosine Similarity

Have you ever imagined that a simple formula you studied in high school could help recommend a movie based on one you already like?

Well, here we are, using the Cosine Similarity (the dot product for normalized vectors) to build a Movie Recommender System!
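As a minimal sketch of the formula mentioned above, cosine similarity is the dot product of two vectors divided by the product of their lengths. The movie vectors and the helper name `cosine_similarity` below are hypothetical, chosen only for illustration; they are not taken from the article.

```python
import numpy as np


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Dot product divided by the product of the vector lengths.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical feature vectors (for example, genre flags) for three movies.
movie_a = [1, 1, 0, 0]
movie_b = [1, 1, 1, 0]
movie_c = [0, 0, 0, 1]

sim_ab = cosine_similarity(movie_a, movie_b)  # shares two features with movie_a
sim_ac = cosine_similarity(movie_a, movie_c)  # shares no features with movie_a

print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

A recommender built this way would rank the other movies by their similarity to the one the user liked and suggest the highest-scoring ones.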

What are Recommender Systems?

Recommender systems are an important class of machine learning algorithms that offer “relevant” suggestions to users. YouTube, Amazon, and Netflix all rely on recommendation systems, where the system recommends the next video or product based on your past activity (Content-based Filtering) or based on activities and preferences…


Implementing Machine Learning Algorithms to classify emails

Email Classification is a Machine Learning problem that falls under the category of Supervised Learning.

This mini-project of Email Classification is inspired by J.K. Rowling’s publishing of a book under a pen-name. Udacity’s “Introduction to Machine Learning” provides a comprehensive study of the algorithms and the project.

A couple of years ago, Rowling wrote a book, “The Cuckoo’s Calling,” under the name Robert Galbraith. The book received some good reviews, but no one paid much attention to it — until an anonymous tipster on Twitter said it was J.K. Rowling. The London Sunday Times enlisted two experts to compare the…


The Reality of Revolutions unfolded by the Animals of Manor Farm

“Animal Farm” is a political satire by George Orwell, first published in 1945. Written as a short novel of 112 pages, the novella revolves around the story of a farm where the animals begin a revolution to free themselves from human control.

Through the use of animal characters and subtle meanings embedded within the story, Orwell talks about revolutions, how they are started, and how much benefit they actually bring.

George Orwell

George Orwell was the pen name of Eric Arthur Blair, a critic, essayist, and journalist best known for “Animal Farm” and the dystopian novel “1984”.


Using the K-Means Algorithm for Vector Quantization of a Raccoon Grayscale Image

Vector Quantization

Vector Quantization is a lossy data compression technique. It allows the modeling of the probability density function by the distribution of the prototype vectors. There is some modification of data that renders the compression lossy.

Vector Quantization works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.

(Definition from Wikipedia)
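The definition above can be sketched with scikit-learn's `KMeans`: each pixel intensity is replaced by the centroid of the cluster it falls into, so the image ends up with only a handful of gray levels. This is an illustrative sketch, not the article's own code; the synthetic 32x32 image and the choice of 4 levels are assumptions standing in for a real picture.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 32x32 grayscale "image" (an assumption; any 2-D array of
# intensities could be used in its place).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)

n_levels = 4  # number of prototype gray levels (hypothetical choice)

# Treat every pixel as a one-dimensional vector.
pixels = image.reshape(-1, 1).astype(float)

kmeans = KMeans(n_clusters=n_levels, n_init=10, random_state=0).fit(pixels)

# Replace each pixel with the centroid of the cluster it was assigned to.
quantized = kmeans.cluster_centers_[kmeans.labels_].reshape(image.shape)

print("distinct gray levels before:", len(np.unique(image)))
print("distinct gray levels after: ", len(np.unique(quantized)))
```

The compression is lossy because the original intensities cannot be recovered from the centroids; only the cluster label per pixel and the small table of centroids need to be stored.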

K-Means Algorithm

K-Means is a clustering algorithm that groups data points into the number of clusters you want to identify in your data…


Using Python to implement the various SVC kernels on the Iris Dataset

In this article, we will go through the SVC algorithm in the Sklearn library and experiment with the different kernels on the Iris Dataset.

Support Vector Classifier

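Support Vector Classifier (SVC) is a supervised machine learning model originally formulated for two-group classification problems; Sklearn's implementation extends it to multi-class data such as Iris by combining several two-class classifiers. After being given a set of labeled training data for each category, an SVC model is able to categorize new test data.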
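As a rough sketch of the kind of experiment described here, the loop below fits scikit-learn's `SVC` with each of its four built-in kernels on the Iris data and records the test accuracy. The 70/30 split and the `random_state` value are assumptions made for reproducibility, not settings taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Default hyperparameters; only the kernel changes between runs.
    model = SVC(kernel=kernel).fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)

for kernel, accuracy in scores.items():
    print(f"{kernel:>8}: {accuracy:.3f}")
```

Accuracy will vary with the split and with hyperparameters such as `C` and `gamma`, so the numbers are best read as a comparison between kernels under identical conditions.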


Implementing the DBSCAN Algorithm to find Core Samples

DBSCAN, short for Density-Based Spatial Clustering of Applications with Noise, is a density-based clustering algorithm. Clusters are formed based on the density parameters.

Density, in terms of DBSCAN, means the number of points that are located in a given area. The closer the points are to each other, the greater the density will be.

The DBSCAN algorithm takes two parameters: ε (epsilon), the radius of the neighborhood around each point, and minPts, the minimum number of points that must lie within that radius for a point to count as a core point.
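A small sketch of these two parameters using scikit-learn's `DBSCAN`, where they are called `eps` and `min_samples` (in scikit-learn, `min_samples` counts the point itself). The nine points below are made up for illustration: two tight groups of four and one isolated point.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical data: two dense groups and a single far-away point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
    [10.0, 0.0],
])

# eps is the neighborhood radius; min_samples plays the role of minPts.
db = DBSCAN(eps=0.3, min_samples=4).fit(points)

core_indices = db.core_sample_indices_  # indices of the core samples
labels = db.labels_                     # cluster id per point, -1 means noise

print("core sample indices:", core_indices)
print("labels:", labels)
```

Every point in the two groups has four points (itself included) within the radius, so all eight are core samples, while the isolated point is labeled as noise.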

In the diagram below, taken from Wikipedia, the minimum number of points is set to 4 (minPts = 4).


Implementing Machine Learning Classification Algorithms to Recognize Handwritten Digits

Handwritten Digit Recognition is an interesting machine learning problem in which we have to identify the handwritten digits through various classification algorithms. There are a number of ways and algorithms to recognize handwritten digits, including Deep Learning/CNN, SVM, Gaussian Naive Bayes, KNN, Decision Trees, Random Forests, etc.

In this article, we will apply a variety of machine learning algorithms from the Sklearn library to our dataset to classify the digits into their categories.

Let us first look at the dataset:

Downloading the Dataset

We will use Sklearn’s load_digits dataset, which is a collection of 8x8 images (64 features) of digits. …
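A brief sketch of loading the dataset and checking its shape, followed by one example classifier. The choice of k-nearest neighbors, the 75/25 split, and the `random_state` value are assumptions for illustration; the article may use different algorithms and settings.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()

# Each sample is an 8x8 image; `data` holds the same pixels flattened to 64 features.
print("images:", digits.images.shape)
print("data:  ", digits.data.shape)

# Assumed split: 75% training, 25% testing.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# One example classifier (KNN with its default of 5 neighbors).
knn = KNeighborsClassifier().fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
print(f"KNN test accuracy: {accuracy:.3f}")
```

Swapping `KNeighborsClassifier` for another Sklearn estimator leaves the rest of the code unchanged, which makes it easy to compare several algorithms on the same split.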


A step by step guide to using PCA’s Eigenfaces & SVM for Facial Recognition

In this article, we will learn to use Principal Component Analysis and Support Vector Machines for building a facial recognition model.

First, let us understand what PCA and SVM are:

Principal Component Analysis:

Principal Component Analysis (PCA) is a machine learning algorithm that is widely used in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data’s variation as possible.
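To illustrate the dimensionality reduction described above, the sketch below projects 10-dimensional data onto its first two principal components with scikit-learn's `PCA`. The data is synthetic and deliberately constructed to vary along only two hidden directions, which is an assumption made so the effect is easy to see; it is not the face dataset.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in 10 dimensions that really vary along
# only 2 latent directions, plus a small amount of noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

# Keep only the first two principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of the total variance preserved by the two components.
explained = pca.explained_variance_ratio_.sum()

print("original shape:", X.shape)
print("reduced shape: ", X_reduced.shape)
print(f"variance preserved: {explained:.4f}")
```

On real data the preserved variance is usually lower, and `explained_variance_ratio_` is a common guide for choosing how many components to keep.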


Using Python to dive into the biggest corporate fraud in American History and derive insights

The Enron fraud is a big, messy and totally fascinating story about corporate malfeasance of nearly every imaginable type.

In this article, we will use Python to analyze the dataset, find patterns and clues through data exploration, and build a regression model that predicts the bonus of a person at Enron based on the salary they receive.

But first, we need to know a bit about the biggest corporate fraud in American history!

The Enron Case

Enron Corporation was an American energy, commodities, and services company based in Houston, Texas. …


The 5 Technical Indicators to Predict the Market

In this article, we will go through the 5 basic (yet powerful) technical indicators to understand the bullish and bearish market trends.

We will start by understanding the stock market prediction and then dive into a few indicators in order to understand bullish and bearish trends!

Stock market prediction is the act of trying to determine the future value of company stock or other financial instruments traded on an exchange. The successful prediction of a stock’s future price could yield a significant profit. …

Mahnoor Javed
